# Case Study - The Current

* The Current is an alternative radio station
* We will pull information about the playlist.

# Step 0 - Inspect the following page

* Song title
* Artist
* Play time
* Day, date, period (am/pm)

http://www.thecurrent.org/playlist/2014-01-01/01

In [1]:
# Import modules here
import requests
from bs4 import BeautifulSoup

In [2]:
# Read in the page here
s = requests.Session()
r = s.get('https://www.thecurrent.org/playlist/2014-01-01/01')
current = BeautifulSoup(r.content, "html.parser")
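
A slightly more defensive way to read the page, as a sketch only. The helper name `fetch_soup` and the 10-second timeout are assumptions, not part of the original notebook; `raise_for_status` makes a failed request stop here instead of producing an empty soup later.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, session=None, timeout=10):
    """Download a page and parse it (hypothetical helper, not from the notebook)."""
    if session is None:
        session = requests.Session()
    response = session.get(url, timeout=timeout)
    # Raise an HTTPError for 4xx/5xx responses rather than parsing an error page.
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")
```

Usage would look like `current = fetch_soup('https://www.thecurrent.org/playlist/2014-01-01/01')`.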

# Step 1 - Pull off the period of the day (am/pm)

Pull out the "am"/"pm"

1. Inspect the element
2. Identify the html tag and class
3. Search the soup
    1. There should be one item returned
4. Use soup/string methods to pull out the info
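
The steps above can be tried offline on a small snippet first. The markup below is hypothetical (the real header text may differ); it only illustrates searching on the full class string, checking for a single match, and splitting the text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = '<div><span class="hour-header open">1:00 am</span></div>'
soup = BeautifulSoup(html, "html.parser")

# Step 3: search the soup; a class string with a space matches the exact attribute value.
matches = soup.find_all("span", class_="hour-header open")
n_matches = len(matches)  # step 3.1: expect exactly one item

# Step 4: string methods pull out the period.
period = matches[0].text.split()[1]
```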

In [3]:
current.find("span", class_="hour-header open").text.split()[1]

'am'

# Step 2 - Pull off DJ

Use a similar process to pull off the DJ.


In [4]:
current.find("h5", class_="currentDj").text.split()[1]

'Jade'

In [5]:
# Note: this is the DJ currently on air, not the DJ for the playlist shown on the page

# Step 3 - Pull out the day of the week

* Pull out the day of the week

In [6]:
current.find("a", class_="start-picker").text.split(",")[0]

'Wednesday'
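
As a cross-check, the date in the URL can be converted to a day name with the standard library. This is a sketch that assumes the second-to-last path segment of the URL is always an ISO date.

```python
from datetime import date

url = "https://www.thecurrent.org/playlist/2014-01-01/01"

# Assumption: the URL ends with .../<YYYY-MM-DD>/<hour>
date_part = url.rstrip("/").split("/")[-2]
day_name = date.fromisoformat(date_part).strftime("%A")
```

If `day_name` disagrees with the value scraped from the page, the selector is probably picking up the wrong element.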

# Title of each song

1. Inspect the element
2. Identify the html tag and class
3. Use `find_all` to make a list of all relevant tags
4. Pull off an example case
5. Write a function to pull out the title
6. Write a single pipe to convert the original soup into a list of titles. 
7. Verify you have the right number of titles.
8. Package the pipe in a function named `get_title`

Tag is h5, class is "title"

In [7]:
current.find_all("h5", class_="title")

[<h5 class="title">Holy Roller</h5>,
 <h5 class="title">Kingdom of Rust</h5>,
 <h5 class="title">Black Dog</h5>,
 <h5 class="title">Turn It Around</h5>,
 <h5 class="title">Flavor of the Month</h5>,
 <h5 class="title">Potential Wife</h5>,
 <h5 class="title">24 Hours</h5>,
 <h5 class="title">Who's Gonna Shoe Your Pretty Little Feet?</h5>,
 <h5 class="title">Marigold</h5>,
 <h5 class="title">High Road</h5>,
 <h5 class="title">The Vampyre Of Time and Memory</h5>,
 <h5 class="title">Valerie Plame</h5>,
 <h5 class="title">Morning Song</h5>,
 <h5 class="title">(You Will) Set The World On Fire</h5>,
 <h5 class="title">Sixteen Saltines</h5>,
 <h5 class="title">Wave of Mutilation</h5>]

In [8]:
ex_title = current.find_all("h5", class_="title")[0]
ex_title

<h5 class="title">Holy Roller</h5>

In [9]:
ex_title.text

'Holy Roller'

In [10]:
[tag.text for tag in current.find_all("h5", class_="title")]

['Holy Roller',
 'Kingdom of Rust',
 'Black Dog',
 'Turn It Around',
 'Flavor of the Month',
 'Potential Wife',
 '24 Hours',
 "Who's Gonna Shoe Your Pretty Little Feet?",
 'Marigold',
 'High Road',
 'The Vampyre Of Time and Memory',
 'Valerie Plame',
 'Morning Song',
 '(You Will) Set The World On Fire',
 'Sixteen Saltines',
 'Wave of Mutilation']

In [11]:
def get_titles(soup: BeautifulSoup) -> list:
    return [tag.text for tag in soup.find_all("h5", class_="title")]
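
If a page pads its tags with whitespace, `.text` keeps that padding. A hedged variant uses `get_text(strip=True)`; the markup below is hypothetical and is not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with padding inside the tags.
html = """
<h5 class="title">
    Holy Roller
</h5>
<h5 class="title">  Kingdom of Rust </h5>
"""
soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("h5", class_="title")

raw = [tag.text for tag in tags]                    # keeps newlines and spaces
clean = [tag.get_text(strip=True) for tag in tags]  # strips surrounding whitespace
```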

In [12]:
get_titles(current)

['Holy Roller',
 'Kingdom of Rust',
 'Black Dog',
 'Turn It Around',
 'Flavor of the Month',
 'Potential Wife',
 '24 Hours',
 "Who's Gonna Shoe Your Pretty Little Feet?",
 'Marigold',
 'High Road',
 'The Vampyre Of Time and Memory',
 'Valerie Plame',
 'Morning Song',
 '(You Will) Set The World On Fire',
 'Sixteen Saltines',
 'Wave of Mutilation']

In [13]:
# Note: these imports shadow the built-in map, filter, and slice for the rest of the notebook
from composablesoup import find, find_all, get_text, has_attr
from composable.sequence import slice, head
from composable.strict import map, filter
from composable.string import replace
from composable import from_toolz as tlz

In [14]:
titles = (current
    >> find_all("h5", class_="title")
    >> map(get_text)
)
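
For intuition about what `>>` is doing, here is a minimal sketch of a pipe operator in plain Python. `Pipeable` and `pipe_map` are hypothetical names for illustration; this is not how `composable` is actually implemented, only one way the operator could be overloaded.

```python
class Pipeable:
    """Wrap a one-argument function so that `value >> step` calls `step(value)`."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, left):
        # Python falls back to this for `left >> self` when `left` does not handle `>>`.
        return self.func(left)


def pipe_map(func):
    """Build a pipe step that applies `func` to every item and returns a list."""
    return Pipeable(lambda items: [func(item) for item in items])


shouted = ["holy roller", "black dog"] >> pipe_map(str.upper)
count = ["holy roller", "black dog"] >> pipe_map(str.upper) >> Pipeable(len)
```

Because `>>` is left-associative, each step receives the result of everything to its left, which is why the pipe reads top to bottom.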

In [15]:
len(get_titles(current)) == len(titles) == 16

True

In [16]:
def get_title(soup):
    return (soup
            >> find_all("h5", class_="title")
            >> map(get_text)
            )

get_title(current)

['Holy Roller',
 'Kingdom of Rust',
 'Black Dog',
 'Turn It Around',
 'Flavor of the Month',
 'Potential Wife',
 '24 Hours',
 "Who's Gonna Shoe Your Pretty Little Feet?",
 'Marigold',
 'High Road',
 'The Vampyre Of Time and Memory',
 'Valerie Plame',
 'Morning Song',
 '(You Will) Set The World On Fire',
 'Sixteen Saltines',
 'Wave of Mutilation']

# Pull off the name of the artist

1. Inspect the element
2. Identify the html tag and class
3. Use `find_all` to make a list of all relevant tags
4. Pull off an example case
5. Write a function to pull out the artist
6. Write a single pipe to convert the original soup into a list of artists. 
7. Verify you have the right number of artists.
8. Package the pipe in a function named `get_artist`


Tag is h5, class is "artist"
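
Titles and artists differ only in the class name, so the shared steps could be captured in one helper. This is a sketch: `texts_for` is a hypothetical name, and the sample markup is a cut-down stand-in for the real page.

```python
from bs4 import BeautifulSoup


def texts_for(soup, tag_name, css_class):
    """Return the text of every `tag_name` tag with the given class (hypothetical helper)."""
    return [tag.text for tag in soup.find_all(tag_name, class_=css_class)]


# Cut-down stand-in for the playlist markup.
html = """
<div><h5 class="title">Holy Roller</h5><h5 class="artist">Thao and The Get Down Stay Down</h5></div>
<div><h5 class="title">Kingdom of Rust</h5><h5 class="artist">Doves</h5></div>
"""
sample = BeautifulSoup(html, "html.parser")

sample_titles = texts_for(sample, "h5", "title")
sample_artists = texts_for(sample, "h5", "artist")
```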

In [17]:
current.find_all("h5", class_="artist")

[<h5 class="artist">Thao and The Get Down Stay Down</h5>,
 <h5 class="artist">Doves</h5>,
 <h5 class="artist">Frankie Lee</h5>,
 <h5 class="artist">Lucius</h5>,
 <h5 class="artist">The Posies</h5>,
 <h5 class="artist">Strange Names</h5>,
 <h5 class="artist">Sky Ferreira</h5>,
 <h5 class="artist">Billie Joe and Norah</h5>,
 <h5 class="artist">J. Roddy Walston and The Business</h5>,
 <h5 class="artist">Cults</h5>,
 <h5 class="artist">Queens of the Stone Age</h5>,
 <h5 class="artist">The Decemberists</h5>,
 <h5 class="artist">The Avett Brothers</h5>,
 <h5 class="artist">David Bowie</h5>,
 <h5 class="artist">Jack White</h5>,
 <h5 class="artist">Pixies</h5>]

In [18]:
ex_artist = current.find_all("h5", class_="artist")[0]
ex_artist

<h5 class="artist">Thao and The Get Down Stay Down</h5>

In [19]:
ex_artist.text

'Thao and The Get Down Stay Down'

In [20]:
ex_artist >> get_text()

'Thao and The Get Down Stay Down'

In [21]:
[tag.text for tag in current.find_all("h5", class_="artist")]

['Thao and The Get Down Stay Down',
 'Doves',
 'Frankie Lee',
 'Lucius',
 'The Posies',
 'Strange Names',
 'Sky Ferreira',
 'Billie Joe and Norah',
 'J. Roddy Walston and The Business',
 'Cults',
 'Queens of the Stone Age',
 'The Decemberists',
 'The Avett Brothers',
 'David Bowie',
 'Jack White',
 'Pixies']

In [22]:
def get_artists(soup: BeautifulSoup) -> list:
    return [tag.text for tag in soup.find_all("h5", class_="artist")]

In [23]:
get_artists(current)

['Thao and The Get Down Stay Down',
 'Doves',
 'Frankie Lee',
 'Lucius',
 'The Posies',
 'Strange Names',
 'Sky Ferreira',
 'Billie Joe and Norah',
 'J. Roddy Walston and The Business',
 'Cults',
 'Queens of the Stone Age',
 'The Decemberists',
 'The Avett Brothers',
 'David Bowie',
 'Jack White',
 'Pixies']

In [24]:
artists = (current
    >> find_all("h5", class_="artist")
    >> map(get_text)
)

In [25]:
len(get_artists(current)) == len(artists) == 16

True

In [26]:
def get_artist(soup):  # not a good name: implies only one artist is returned
    return (soup
            >> find_all("h5", class_="artist")
            >> map(get_text)
            )

In [27]:
get_artist(current)

['Thao and The Get Down Stay Down',
 'Doves',
 'Frankie Lee',
 'Lucius',
 'The Posies',
 'Strange Names',
 'Sky Ferreira',
 'Billie Joe and Norah',
 'J. Roddy Walston and The Business',
 'Cults',
 'Queens of the Stone Age',
 'The Decemberists',
 'The Avett Brothers',
 'David Bowie',
 'Jack White',
 'Pixies']
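
With both lists in hand, one possible next step is to pair them up. This sketch assumes the titles and artists come back in the same page order, which the outputs above suggest; `pair_up` is a hypothetical helper and the lists are shortened copies of the results.

```python
def pair_up(titles, artists):
    """Combine parallel lists into one record per song (hypothetical helper)."""
    if len(titles) != len(artists):
        # A mismatch means at least one selector missed or over-matched.
        raise ValueError("titles and artists have different lengths")
    return [{"title": t, "artist": a} for t, a in zip(titles, artists)]


# First three entries from the outputs above.
titles = ["Holy Roller", "Kingdom of Rust", "Black Dog"]
artists = ["Thao and The Get Down Stay Down", "Doves", "Frankie Lee"]

playlist = pair_up(titles, artists)
```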