# Case Study - The Current

* The Current is an alternative radio station.
* We will pull information about its playlist.

# Step 0 - Inspect the following page

* Song title
* Artist
* Play time
* Day, date, period (am/pm)

https://www.thecurrent.org/playlist/2014-10-14/01

In [1]:
# Import modules here
from composablesoup import find, find_all, get_text, has_attr
from composable.sequence import slice, head
from composable.strict import map, filter
from composable.string import replace, split
from composable import pipeable
from composable import from_toolz as tlz

In [2]:
# Read in the page here
import requests
from bs4 import BeautifulSoup
s = requests.Session()
r = s.get('https://www.thecurrent.org/playlist/2014-10-14/01')
music_page = BeautifulSoup(r.content, "html.parser")

# Step 1 - Pull off the period of the day (am/pm)

Pull out the "am"/"pm"

1. Inspect the element
2. Identify the html tag and class
3. Search the soup
    1. There should be one item returned
4. Use soup/string methods to pull out the info

In [3]:
strip = pipeable(lambda s: s.strip())
(music_page
 >> find('span', attrs = {'class':'hour-header'})
 >> get_text
 >> strip
 >> split(' ')
 >> tlz.get(1)
)

'am'
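
The pipe above does the same work as ordinary string methods applied one after another. The sketch below walks through those steps on a hypothetical header string; `'1:00 am'` is an assumed example of what the `hour-header` span might contain, not text copied from the page.

```python
# Hypothetical header text; the real contents of the span may differ.
header_text = '\n  1:00 am  \n'

stripped = header_text.strip()   # drop surrounding whitespace: '1:00 am'
parts = stripped.split(' ')      # split on single spaces: ['1:00', 'am']
period = parts[1]                # second piece, the role tlz.get(1) plays in the pipe

print(period)
```

Seeing the intermediate values this way can help when a pipe returns something unexpected, since each stage can be printed on its own.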

In [4]:
get_period = pipeable( lambda soup: soup
 >> find('span', attrs = {'class':'hour-header'})
 >> get_text
 >> strip
 >> split(' ')
 >> tlz.get(1)
)

In [5]:
music_page >> get_period

'am'
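
To see why `music_page >> get_period` works, it may help to look at a toy version of the idea. The class below is a minimal sketch, not the actual implementation in `composable`; the name `Pipeable` and the sample functions are made up for illustration. It relies on Python's reflected right-shift hook, `__rrshift__`, which runs when the left operand does not know how to handle `>>`.

```python
class Pipeable:
    """Hypothetical stand-in: makes `value >> func` evaluate `func(value)`."""

    def __init__(self, func):
        self.func = func

    def __call__(self, value):
        # Still usable as a normal function.
        return self.func(value)

    def __rrshift__(self, value):
        # Called for `value >> self` when `value` cannot handle `>>` itself.
        return self.func(value)


strip = Pipeable(lambda s: s.strip())
upper = Pipeable(lambda s: s.upper())

# Packaging a pipe in a function, in the same shape as get_period.
clean = Pipeable(lambda s: s >> strip >> upper)

result = '  am  ' >> clean
print(result)
```

Because `>>` is evaluated left to right, each stage receives the result of the previous one, so the pipe reads in the order the steps happen.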

# Step 2 - Pull off the DJ

Use a similar process to pull off the name of the DJ.


In [6]:
remove_dj = pipeable(lambda s: s.replace('DJ: ',''))
(music_page
 >> find('h5', attrs = {'class':'currentDj'})
 >> get_text
 >> remove_dj
)

'Jade'

In [7]:
get_dj = pipeable(lambda soup: soup
 >> find('h5', attrs = {'class':'currentDj'})
 >> get_text
 >> remove_dj
)

In [8]:
music_page >> get_dj

'Jade'
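
One thing to keep in mind: `str.replace` removes every occurrence of `'DJ: '`, wherever it appears in the text. If only a leading label should be dropped, a prefix check is a slightly safer variant. The helper name `remove_prefix` below is hypothetical, and `'DJ: Jade'` is an assumed example of the tag text.

```python
def remove_prefix(text, prefix):
    """Return text without a leading prefix; leave it unchanged otherwise."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


print(remove_prefix('DJ: Jade', 'DJ: '))   # label present
print(remove_prefix('Jade', 'DJ: '))       # label absent, text returned as-is
```

On Python 3.9 and later, the built-in `str.removeprefix` does the same job.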

# Step 3 - Pull out the day of the week

* Use the same process on the date picker link, keeping only the text before the first comma.

In [15]:
(
    music_page
    >> find('a', attrs={'class':'start-picker'})
    >> get_text
    >> split(',')
    >> tlz.get(0)
)

'Tuesday'

In [16]:
get_day = pipeable(lambda soup: soup
    >> find('a', attrs={'class':'start-picker'})
    >> get_text
    >> split(',')
    >> tlz.get(0)
)

In [17]:
music_page >> get_day

'Tuesday'
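
As a sanity check, the scraped day can be compared with the day implied by the date in the page URL. This is a sketch that assumes the URL always ends in `/playlist/<YYYY-MM-DD>/<hour>`, as the one requested earlier does; the variable names are illustrative.

```python
from datetime import date

url = 'https://www.thecurrent.org/playlist/2014-10-14/01'

# The date is the second-to-last path segment of the URL.
date_part = url.split('/')[-2]
year, month, day = (int(piece) for piece in date_part.split('-'))

# date.weekday() counts from Monday = 0, so the list starts on Monday.
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']
day_from_url = day_names[date(year, month, day).weekday()]

print(day_from_url)
```

If this value and the result of `music_page >> get_day` disagree, the selector is probably picking up the wrong element.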

# Step 4 - Pull off the title of each song

1. Inspect the element
2. Identify the html tag and class
3. Use `find_all` to make a list of all relevant tags
4. Pull off an example case
5. Write a function to pull out the title
6. Write a single pipe to convert the original soup into a list of titles. 
7. Verify you have the right number of titles.
8. Package the pipe in a function named `get_title`

In [10]:
(music_page
    >> find_all('h5', attrs={'class':'title'})
    >> map(get_text)
)

['Kill The Fun',
 'Artifact #1',
 'Gooey',
 'The Puritan',
 "You're All I've Got Tonight",
 'No One Is Lost',
 'Moon When the Cherries Turn Black',
 'Colorado',
 'Never, Never Gonna Give You Up',
 'Trainwreck 1979',
 'My Girls',
 'BOYTROUBLE feat. Lizzo',
 'White Lies',
 'Beware the Dog',
 'Knock Me Down']

In [12]:
get_title = pipeable(lambda soup: soup
    >> find_all('h5', attrs={'class':'title'})
    >> map(get_text)
)

In [13]:
music_page >> get_title

['Kill The Fun',
 'Artifact #1',
 'Gooey',
 'The Puritan',
 "You're All I've Got Tonight",
 'No One Is Lost',
 'Moon When the Cherries Turn Black',
 'Colorado',
 'Never, Never Gonna Give You Up',
 'Trainwreck 1979',
 'My Girls',
 'BOYTROUBLE feat. Lizzo',
 'White Lies',
 'Beware the Dog',
 'Knock Me Down']
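
The verification in step 7 can be as simple as a couple of assertions. The sketch below reuses the titles printed above; `expected_count` is an assumption standing in for the number of songs counted by eye on the page.

```python
# Titles copied from the output above.
titles = ['Kill The Fun', 'Artifact #1', 'Gooey', 'The Puritan',
          "You're All I've Got Tonight", 'No One Is Lost',
          'Moon When the Cherries Turn Black', 'Colorado',
          'Never, Never Gonna Give You Up', 'Trainwreck 1979',
          'My Girls', 'BOYTROUBLE feat. Lizzo', 'White Lies',
          'Beware the Dog', 'Knock Me Down']

expected_count = 15  # assumption: songs counted by eye on the page

# The scraped list should match the number of songs seen on the page.
assert len(titles) == expected_count, f'expected {expected_count}, got {len(titles)}'

# An empty or whitespace-only title usually means the wrong tag matched.
blank_titles = [title for title in titles if not title.strip()]
print(len(titles), blank_titles)
```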

# Step 5 - Pull off the name of the artist

1. Inspect the element
2. Identify the html tag and class
3. Use `find_all` to make a list of all relevant tags
4. Pull off an example case
5. Write a function to pull out the artist
6. Write a single pipe to convert the original soup into a list of artists. 
7. Verify you have the right number of artists.
8. Package the pipe in a function named `get_artist`


In [18]:
(
    music_page
    >> find_all('h5', attrs={'class':'artist'})
    >> map(get_text)
)

['Haley Bonar',
 'Conor Oberst',
 'Glass Animals',
 'Blur',
 'The Cars',
 'Stars',
 'The Pines',
 'Chastity Brown',
 'Barry White',
 'Death From Above 1979',
 'Animal Collective',
 'Prince and 3RDEYEGIRL',
 'Max Frost',
 'The Griswolds',
 'Red Hot Chili Peppers']

In [19]:
get_artist = pipeable(lambda soup: soup
    >> find_all('h5', attrs={'class':'artist'})
    >> map(get_text)
)

In [20]:
music_page >> get_artist

['Haley Bonar',
 'Conor Oberst',
 'Glass Animals',
 'Blur',
 'The Cars',
 'Stars',
 'The Pines',
 'Chastity Brown',
 'Barry White',
 'Death From Above 1979',
 'Animal Collective',
 'Prince and 3RDEYEGIRL',
 'Max Frost',
 'The Griswolds',
 'Red Hot Chili Peppers']
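
Once titles and artists are both available, they can be paired up row by row. This is one possible way to combine the pieces, not a step prescribed by the exercise; it uses the first three entries of each list from the outputs above, and the dictionary keys are illustrative names.

```python
# First three entries of each list, copied from the outputs above.
titles = ['Kill The Fun', 'Artifact #1', 'Gooey']
artists = ['Haley Bonar', 'Conor Oberst', 'Glass Animals']

# zip silently truncates to the shorter list, so check the lengths first.
assert len(titles) == len(artists), 'titles and artists are out of step'

# Page-level values (day, period, DJ) are the same for every song on the page.
page_info = {'day': 'Tuesday', 'period': 'am', 'dj': 'Jade'}

songs = [
    {'title': title, 'artist': artist, **page_info}
    for title, artist in zip(titles, artists)
]

for song in songs:
    print(song)
```

Checking the lengths before zipping matters because a missing artist tag would otherwise shift every later pairing without raising an error.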