# Functions Practice Lab

### Introduction

In this lesson, let's continue to practice using functions both to automate operations and to make our codebase more flexible.  We'll do so by working with our Spotify data.

### Loading our Data

Let's get started by practicing how to write a function.  Write a function called `get_songs` that has a return value of `['song 1', 'song 2']`.

In [1]:
def get_songs():
    return ['song 1', 'song 2']

Then we'll test that we set our function up correctly.

In [2]:
get_songs()

# ['song 1', 'song 2']

['song 1', 'song 2']

Now let's write a function called `scrape_songs` that pulls the list of most-streamed Spotify songs from Wikipedia and returns a list of dictionaries representing the top streaming songs.  We can write the function based on the following code.

In [3]:
import pandas as pd

In [4]:
# url = 'https://en.wikipedia.org/wiki/List_of_most-streamed_songs_on_Spotify'
# songs_table = pd.read_html(url)[0]
# top_songs = songs_table.to_dict('records')

In [5]:
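def scrape_songs():
    url = 'https://en.wikipedia.org/wiki/List_of_most-streamed_songs_on_Spotify'
    # read_html returns a list of tables; the first one holds the songs
    songs_table = pd.read_html(url)[0]
    # convert each row to a dictionary and drop the final row of the table
    top_songs = songs_table.to_dict('records')[:-1]
    return top_songs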

Then let's test out the code.  We can assign the result to the variable `top_songs`.

In [6]:
top_songs = scrape_songs()

In [9]:
top_songs[:2]

[{'Rank': '1',
  'Song': '"Shape Of You"',
  'Streams(Billions)': '3.161',
  'Artist(s)': 'Ed Sheeran',
  'Date published': '6 January 2017',
  'Ref.': '[7]'},
 {'Rank': '2',
  'Song': '"Blinding Lights"',
  'Streams(Billions)': '3.010',
  'Artist(s)': 'The Weeknd',
  'Date published': '29 November 2019',
  'Ref.': '[8]'}]

In [10]:
top_songs[-1]

{'Rank': '100',
 'Song': '"I Fall Apart"',
 'Streams(Billions)': '1.359',
 'Artist(s)': 'Post Malone',
 'Date published': '9 December 2016',
 'Ref.': '[106]'}

> Notice that it takes a little while to run the above code.  This is because each time we call the function, it scrapes data from the Wikipedia page.
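
One way to avoid paying that cost repeatedly is to scrape once and reuse the stored result. The sketch below illustrates the idea under some assumptions: `fake_scrape_songs` is a hypothetical stand-in for `scrape_songs` so the example runs without network access, and `get_top_songs` and `_songs_cache` are illustrative names that are not part of the lesson.

```python
scrape_calls = 0  # counts how often the slow step actually runs


def fake_scrape_songs():
    """Hypothetical stand-in for scrape_songs(); no network access needed."""
    global scrape_calls
    scrape_calls += 1
    return [
        {'Rank': '1', 'Song': '"Shape Of You"', 'Streams(Billions)': '3.161'},
        {'Rank': '2', 'Song': '"Blinding Lights"', 'Streams(Billions)': '3.010'},
    ]


_songs_cache = None  # hypothetical module-level storage for the scraped list


def get_top_songs():
    """Scrape on the first call only, then return the stored list."""
    global _songs_cache
    if _songs_cache is None:
        _songs_cache = fake_scrape_songs()
    return _songs_cache


first = get_top_songs()
second = get_top_songs()
print(scrape_calls)     # 1
print(first is second)  # True
```

Assigning the result of `scrape_songs()` to a variable once, as we did with `top_songs`, achieves the same effect in a notebook.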

### Querying our Data

Now let's write some functions to query our data.  We can write a function called `stream_number_for(song)` that takes in a single song dictionary and returns the corresponding number of streams, in billions, as a float.

In [17]:
def stream_number_for(song):
    return float(song['Streams(Billions)'])

Then we can test this out below.

In [18]:
first_song = top_songs[0]

second_song = top_songs[1]

In [19]:
stream_number_for(first_song)
# 3.161

3.161

In [20]:
stream_number_for(second_song)
# 3.01

3.01

Next write a function called `stream_numbers_for(songs)` that takes in a list of songs and returns the corresponding stream numbers.

> Try to use the `stream_number_for` function above in solving this.

In [21]:
def stream_numbers_for(songs):
    return [stream_number_for(song) for song in songs]

In [22]:
top_five_songs = top_songs[:5]

In [23]:
stream_numbers_for(top_five_songs)

# [3.161, 3.01, 2.588, 2.443, 2.383]

[3.161, 3.01, 2.588, 2.443, 2.383]
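
The values above are in billions of streams. If whole stream counts are more convenient, one possible approach is to scale and round each value. This is only a sketch: `stream_count_for` is a hypothetical helper that is not part of the lab, and `sample_songs` is a hand-copied subset of the scraped records.

```python
def stream_number_for(song):
    # Same function as in the lab: streams in billions, as a float
    return float(song['Streams(Billions)'])


def stream_count_for(song):
    # Hypothetical helper: scale billions to a whole number of streams.
    # round() returns an int and absorbs tiny floating point errors.
    return round(stream_number_for(song) * 1_000_000_000)


# Hand-copied sample records, so the example runs without scraping
sample_songs = [
    {'Song': '"Shape Of You"', 'Streams(Billions)': '3.161'},
    {'Song': '"Blinding Lights"', 'Streams(Billions)': '3.010'},
]

stream_counts = [stream_count_for(song) for song in sample_songs]
print(stream_counts)  # [3161000000, 3010000000]
```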

Next let's use functions to filter our data.  Write a function called `songs_streamed_more_than(songs, number)` that returns a list of the names of songs whose stream count, in billions, is greater than `number`.

In [29]:
def songs_streamed_more_than(songs, number):
    return [song['Song'] for song in songs if stream_number_for(song) > number]

In [33]:
songs_streamed_more_than(top_songs, 2.1)

# ['"Shape Of You"', '"Blinding Lights"', '"Dance Monkey"', '"Rockstar"', '"Someone You Loved"', '"One Dance"', '"Sunflower"', '"Closer"', '"Señorita"', '"Believer"']

['"Shape Of You"', '"Blinding Lights"', '"Dance Monkey"', '"Rockstar"', '"Someone You Loved"', '"One Dance"', '"Sunflower"', '"Closer"', '"Señorita"', '"Believer"']


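Next, write a function called `song_years_for_songs_streamed_more_than(songs, number)` that returns the publication year of each song streamed more than the specified number of times (in billions).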

In [35]:
def song_years_for_songs_streamed_more_than(songs, number):
    return [int(song['Date published'].split()[-1]) for song in songs if stream_number_for(song) > number]

In [36]:
song_years_for_songs_streamed_more_than(top_songs, 2.1)
# [2017, 2019, 2019, 2017, 2018, 2016, 2018, 2016, 2019, 2017]

[2017, 2019, 2019, 2017, 2018, 2016, 2018, 2016, 2019, 2017]
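
To see how the year is pulled out of each date, it can help to break the expression into steps. The sketch below assumes the dates follow the `'day month year'` layout shown in the scraped data; `year_of` is a hypothetical helper name used only for illustration.

```python
date_published = '29 November 2019'

# split() with no arguments breaks the string on whitespace
parts = date_published.split()
print(parts)  # ['29', 'November', '2019']

# the year is the last piece, and int() converts it from a string
year = int(parts[-1])
print(year)  # 2019


def year_of(song):
    # Hypothetical helper wrapping the same two steps for one song dictionary
    return int(song['Date published'].split()[-1])


sample_song = {'Song': '"Shape Of You"', 'Date published': '6 January 2017'}
print(year_of(sample_song))  # 2017
```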

So we can see that each of the top songs was published relatively recently.  Next, write a function called `values_of(songs, key_name)` that, given a list of songs and a key, returns just the corresponding value for each song.

In [37]:
def values_of(songs, key_name):
    return [song[key_name] for song in songs]

In [38]:
values_of(top_songs, 'Song')[:5]

# ['"Shape Of You"', '"Blinding Lights"', '"Dance Monkey"', '"Rockstar"', '"Someone You Loved"']

['"Shape Of You"',
 '"Blinding Lights"',
 '"Dance Monkey"',
 '"Rockstar"',
 '"Someone You Loved"']

In [39]:
values_of(top_songs, 'Date published')[:5]

# ['6 January 2017',
#  '29 November 2019',
#  '10 May 2019',
#  '15 September 2017',
#  '8 November 2018']

['6 January 2017',
 '29 November 2019',
 '10 May 2019',
 '15 September 2017',
 '8 November 2018']
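
Because the key is an argument, the same function works for any column in the data. Here is a small sketch that runs without scraping; `sample_songs` is a hand-copied subset of the records shown earlier, not the full dataset.

```python
def values_of(songs, key_name):
    # Same function as in the lab: collect one value per song
    return [song[key_name] for song in songs]


# Hand-copied sample records, so the example runs without scraping
sample_songs = [
    {'Rank': '1', 'Song': '"Shape Of You"', 'Artist(s)': 'Ed Sheeran'},
    {'Rank': '2', 'Song': '"Blinding Lights"', 'Artist(s)': 'The Weeknd'},
]

# The caller decides which key to pull out at call time
artists = values_of(sample_songs, 'Artist(s)')
ranks = values_of(sample_songs, 'Rank')

print(artists)  # ['Ed Sheeran', 'The Weeknd']
print(ranks)    # ['1', '2']
```

Note that a key missing from any song dictionary would raise a `KeyError`, so this approach relies on every record sharing the same keys.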

### Summary

In this lesson we saw how we can use functions both to automate procedures and to make our code more flexible.  For example, it now takes only a single call to our `scrape_songs` function to gather our data from Wikipedia and transform it into a list of dictionaries.  And with our `values_of` function, we decide at call time which value from the data we wish to extract.