## Practice: Spotify's API @ ESADE

Exercise book created by [Dr. Irene Unceta](https://linkedin.com/in/ireneunceta/)

Solution: [Kartik Thopalli](https://www.linkedin.com/in/kartik-thopalli/)

This notebook contains a set of exercises that will guide you through the different steps of this practice exercise. Devise solutions that are code-based, i.e. not hard-coded or manually computed. You can monitor your progress by running the tests for each exercise. This will allow you to spot your mistakes and correct them.

<div class="alert alert-success">The aim of this exercise is to create and save a dataset containing information about every song in a given playlist by requesting data from Spotify's API.</div>

## Getting client credentials

Spotify's API uses OAuth as its authentication scheme. Hence, before you start making requests, you need to get your client credentials for the Spotify API.

To do so, you need to have a Spotify account (free or paid). If you don't have one yet, please create a free account before moving on. Once you do, head over to Spotify for Developers, open your [Dashboard](https://developer.spotify.com/dashboard/) and log in with your account. 

Click on “CREATE AN APP”, choose a name and description for your project and work your way through the checkboxes. Don't worry about the actual name and description. The only thing we are interested in is getting the credentials. Set the redirect URI to `http://localhost:8080`.

<div class="alert alert-info">Create two new variables, <b>client_id</b> and <b>client_secret</b>, that store your Client ID and Client secret, respectively. Both are available under the app's settings.</div>

In [30]:
import os
from dotenv import load_dotenv
load_dotenv()

client_id  = os.getenv('client_id')
client_secret = os.getenv('client_secret')

These are the keys that you will use throughout the remainder of the assignment.

Run the following cell to check that you created them properly. If you get no error when running the cell it means that everything is OK. You can run this cell as many times as you want as long as you **don't modify it**.

<div class="alert alert-danger"> Make sure you pass this test successfully before moving on to the remaining exercises.</div>

In [31]:
import unittest
"""
This line imports the unittest module, which is a testing library in Python that 
allows you to create tests for your Python code.
"""
tc = unittest.TestCase()
"""
Here, you create an instance of the TestCase class from the unittest module. 
This class represents an individual unit of testing and can be used to group together 
related test functions.
"""
tc.assertTrue('client_id' in locals())
tc.assertTrue('client_secret' in locals())
"""
In these lines, you're using the assertTrue method of your TestCase instance to check that client_id and 
client_secret are defined in the local scope. 
The locals() function returns a dictionary of the current namespace.
"""
tc.assertIsInstance(client_id, str)      # check the variable itself, not the literal 'client_id'
tc.assertIsInstance(client_secret, str)  # same for the secret
"""
Here, you're using the assertIsInstance method to check that client_id and client_secret are both instances 
of the str (string) class.
"""
tc.assertTrue(client_id.isalnum())
tc.assertTrue(client_secret.isalnum())
"""
In these lines, you're using the assertTrue method to check that client_id and client_secret 
are alphanumeric (i.e., they only contain letters and/or numbers) using the str.isalnum method.
"""
print("All good")

All good


Great! We are good to go. The next step is to get an access token.

## Getting an access token

In order to access the various endpoints of the Spotify API, we need to pass an access token. To get one, we need to send a ```POST``` request with our client credentials. This request will create a token resource in the server, which responds with the token. We can build this ```POST``` request using the ```requests``` library.

<div class="alert alert-info">Run the following cell to build your POST request</div>

In [32]:
import requests

# URL for token resource
auth_url = 'https://accounts.spotify.com/api/token'

# request body
params = {'grant_type': 'client_credentials',
          'client_id': client_id,
          'client_secret': client_secret}

# POST the request; the credentials are sent form-encoded in the request body
auth_response = requests.post(auth_url, data=params).json()

Take a look at the response you obtained. The access_token you just retrieved will expire after one hour.

<div class="alert alert-info">Retrieve your token from <b>auth_response</b> and save it in a new variable called <b>access_token</b>. </div>

<div class="alert alert-warning">Make sure you define the <i>access_token</i> variable such that it will be updated each time your code is run from scratch, i.e. make sure it hasn't expired by the time your code is graded.</div>
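One way to reason about expiry is to record when the token was obtained and compare that against the `expires_in` value. The following is a minimal sketch, assuming a response dictionary with the same keys as `auth_response`. The helper names `token_expiry` and `is_expired` and the 60-second safety margin are illustrative choices, not part of Spotify's API.

```python
import time

def token_expiry(auth_response, obtained_at):
    """Return the Unix timestamp at which the token stops being valid."""
    return obtained_at + auth_response['expires_in']

def is_expired(expiry, now=None, margin=60):
    """True if the token has expired or will expire within `margin` seconds."""
    if now is None:
        now = time.time()
    return now >= expiry - margin

# Sample response with the same keys as the real one (the token is a placeholder)
sample_response = {'access_token': 'PLACEHOLDER',
                   'token_type': 'Bearer',
                   'expires_in': 3600}

expiry = token_expiry(sample_response, obtained_at=1_000)
print(is_expired(expiry, now=1_100))   # shortly after the token was issued
print(is_expired(expiry, now=4_590))   # inside the safety margin
```

In the notebook itself, requesting a fresh token at the top of every run achieves the same goal without any bookkeeping.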

In [33]:
access_token = auth_response['access_token']
token_type=auth_response['token_type']
display(auth_response)

{'access_token': 'BQA_bkKl5VBileEH72dzPs0PT7j0ye_x59dgVKXBKs35RDIjrzO5cISFtSCFOJc3-krr4n0ixIRC0O8YLko5p6x5DwMzAeIwa8EsloKYu_h79VB_7I4',
 'token_type': 'Bearer',
 'expires_in': 3600}

As above, the access token will be used throughout the remainder of the assignment.

Run the following cell to check that you created it properly. If you get no error when running the cell it means that everything is OK. You can run this cell as many times as you want as long as you **don't modify it**.

<div class="alert alert-danger"> Make sure you pass this test successfully before moving on to the remaining exercises.</div>

In [34]:
import unittest

tc = unittest.TestCase()
tc.assertTrue('access_token' in locals())
tc.assertIsInstance(access_token, str)

In [35]:
headers = {'Authorization': '{token_type} {token}'.format(token_type=token_type,token=access_token)}

This token is your golden ticket to access Spotify's API. A copy of this string is now stored in the server, so that every time you make a request to the API the server will check that the token you provide matches the one it has in store.

## Poking around

Spotify's API provides numerous endpoints to access things like album listings, artist information, playlists, even Spotify-generated audio analysis of individual tracks, which include their time signature or measurements such as their “danceability” or "loudness". You can take a look at all the information available by reading the [Docs](https://developer.spotify.com/documentation/web-api/reference/). In this assignment you will use several of these endpoints.

In order to get a feel of how the API works, we will begin by making a ```GET``` request to the ```audio-features``` endpoint to extract data for a specific track. In particular, let's retrieve all the information for Radiohead's *Creep* song. 

The first thing you need is to identify the appropriate URL or path to direct your request to. The URLs for all Spotify API endpoints follow the same structure: each one is the concatenation ```base_url + endpoint```, where ```base_url``` is the base URL for the API. Sometimes, you will also need to provide some additional information as part of the URL. In the case of ```audio-features```, however, the ```base_url``` and the ```endpoint``` name are enough.
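The concatenation described above can be sketched with a small helper that also copes with stray slashes. The function name `endpoint_url` and the `PLAYLIST_ID` placeholder are hypothetical and used only for illustration.

```python
def endpoint_url(base_url, *parts):
    """Join the base URL and the path segments with single slashes."""
    segments = [base_url.rstrip('/')] + [str(p).strip('/') for p in parts]
    return '/'.join(segments)

base_url = 'https://api.spotify.com/v1/'

# An endpoint that needs nothing but its name
print(endpoint_url(base_url, 'audio-features'))

# An endpoint that embeds extra information (an id) in the path
print(endpoint_url(base_url, 'playlists', 'PLAYLIST_ID', 'tracks'))
```

Plain string concatenation works just as well as long as the base URL ends with a slash and the endpoint name does not start with one.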

The ```base_url``` is defined below:

In [36]:
base_url = 'https://api.spotify.com/v1/'

<div class="alert alert-info">Define the url for the audio-features endpoint by following the instructions above. Store it in a variable called <b>audio_features_endpoint</b>.</div>

In [37]:
audio_features_endpoint = 'audio-features'
url = base_url+audio_features_endpoint

Next, we need to fill in the request parameters. If you check the documentation you'll see that the ```audio-features``` endpoint takes a single query parameter, ```ids```, which holds the Spotify id (or a comma-separated list of ids) of the tracks to look up.

As per the query parameters, the final thing you need to extract data about Radiohead's Creep song is to locate its ```id```. This is its unique identifier. Spotify has unique ids for tracks, for artists, for albums, for playlists, etc.

You can get the ```id``` for any song by going to Spotify, looking for the song, clicking the “…” by the song name while holding the Ctrl key, then “Share” and then “Copy Spotify URI”.

<div class="alert alert-warning">Note that this procedure also works for retrieving ids for artists, albums or any other resource type. Alternatively, you can also retrieve the id for a song or a playlist by clicking "..." by the name, then "Share" and then "Copy Link". This will return a full link to the resource. The id corresponds to the string right after the last slash sign / and before the ? sign.</div>

This URI should be a string that includes something like **spotify:track:**, followed by an alphanumeric sequence. This sequence is the ID you are looking for.
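Both ways of locating an id can also be expressed in code. The sketch below is only an illustration: `extract_spotify_id` is a hypothetical helper, and it assumes the two formats described above (a `spotify:<type>:<id>` URI or a share link whose id follows the last slash and precedes the `?`).

```python
from urllib.parse import urlparse

def extract_spotify_id(reference):
    """Return the id contained in a Spotify URI or in a share link."""
    if reference.startswith('spotify:'):
        # URI form: spotify:<type>:<id>
        return reference.split(':')[-1]
    # Link form: the id is the last path segment; the query string is ignored
    path = urlparse(reference).path
    return path.rstrip('/').split('/')[-1]

print(extract_spotify_id('spotify:track:70LcF31zb1H0PyJoS1Sx1r'))
print(extract_spotify_id(
    'https://open.spotify.com/playlist/4NVeFUEHBybfh3ITNG1b8n?si=js9BKt5aTOiCWMm_Cx4Vvg'))
```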

<div class="alert alert-info">Create a new variable called <b>track_id</b> that stores the ID for Radiohead's song Creep.</div>

In [38]:
track_id='70LcF31zb1H0PyJoS1Sx1r'

Now that you have the id, let's format the body of the request. You can provide the body in dictionary form. The keys of the dictionary should correspond to the different query parameters in the documentation.

<div class="alert alert-info">Create a dictionary called <b>params</b> that stores the body of your request. Make sure you format it in the right way.</div>

In [39]:
params = {'ids':track_id}

Now that everything is ready, you can send the actual GET request to retrieve the data.

<div class="alert alert-info">Write the code to make your get request using the requests library. When doing so, remember to pass the <i>url</i>, the <i>headers</i> and the <i>params</i> dictionary as arguments to the <i>get</i> function. Convert the response to <i>json</i> format and store it in a new variable called <b>creep</b>.</div>

In [40]:
# import requests
creep_response = requests.get(url,params=params,headers=headers)
creep = creep_response.json()


## Getting data from a playlist

In the following exercise you will build a dataset containing data about different songs. You can either use a playlist of your own, or use the one we have created for this purpose.

You can find our playlist in the following [link](https://open.spotify.com/playlist/4NVeFUEHBybfh3ITNG1b8n?si=js9BKt5aTOiCWMm_Cx4Vvg). 

<div class="alert alert-info">Create a variable called <b>playlist_id</b> that stores the id of your playlist of choice.</div>

<div class="alert alert-warning">Your <i>playlist_id</i> should contain only alphanumeric characters. This means that characters such as ? $ % & / ! should not be included.</div>

In [41]:
playlist_id='4NVeFUEHBybfh3ITNG1b8n'

Next, you are going to extract information about the different tracks included in this playlist. Hence, let's make sure that the variable is properly created and that it refers to an actual id.

As above, run the cell below. If you get no error when running the cell it means that you did it right. Otherwise, revise your code to ensure you get no error. You can run this cell as many times as you want, just **remember not to modify it**.

<div class="alert alert-danger"> Make sure you pass this test successfully before moving on to the remaining exercises.</div>

In [42]:
tc = unittest.TestCase()
tc.assertTrue('playlist_id' in locals())
tc.assertIsInstance(playlist_id, str)
tc.assertTrue(playlist_id.isalnum())
tc.assertTrue('spotify:track:' not in playlist_id)

The next step will be making a request to get full details of the tracks included in your chosen playlist. Remember that you can take a look at all the information available at the different endpoints in Spotify's API by reading the [Docs](https://developer.spotify.com/documentation/web-api/reference/). Locate the right endpoint for your query and read the Docs to find out how to build your request. 

<div class="alert alert-info"><b>Exercise 1 </b>Write the code to retrieve all the items from your chosen playlist from the <i>tracks</i> endpoint. When making your request don't use any of the optional arguments. Direct your request to the right endpoint instead. Store the <i>raw</i> response in a new variable called <i>playlist_response</i>.
</div>

<div class="alert alert-warning">To complete this exercise you must use the requests library</div>

In [43]:
playlisturl = base_url+"playlists/"+playlist_id+"/tracks"
playlist_response= requests.get(playlisturl,headers=headers)
playlist_response

<Response [200]>

<div class="alert alert-info"><b>Exercise 2 </b>Convert the response to JSON format and store the result in a new variable called <i>playlist</i>.</div>

In [44]:
playlist = playlist_response.json()

Take your time to familiarize yourself with the data and how they are presented. Note that, by default, Spotify's API only returns information about a maximum of 100 tracks in a playlist. If your playlist of choice has more than 100 tracks, you'll retrieve the data only for the first 100 of them.
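If you ever need more than the first page, the response includes a `next` field holding the URL of the following page (or nothing when there are no more pages). The sketch below shows the general idea of following those links. It is not required for this assignment, and the `collect_items` helper and the fake pages are made up so that the example runs without network access.

```python
def collect_items(first_page, fetch_page):
    """Gather 'items' from a page and from every page linked through 'next'."""
    items = []
    page = first_page
    while page is not None:
        items.extend(page['items'])
        next_url = page.get('next')
        # Stop when the API reports that there is no further page
        page = fetch_page(next_url) if next_url else None
    return items

# Stand-in for the API: a second page stored under a made-up URL
fake_pages = {'page-2': {'items': ['c', 'd'], 'next': None}}
first = {'items': ['a', 'b'], 'next': 'page-2'}

all_items = collect_items(first, fetch_page=fake_pages.get)
print(all_items)
```

With the names used in this notebook, `fetch_page` could be something like `lambda u: requests.get(u, headers=headers).json()` and `first_page` would be `playlist`.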

<div class="alert alert-danger">Throughout the following exercises you may come across data that are missing. If so, encode these data using a <b>None</b> (NoneType), unless otherwise stated. </div>
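One possible way to encode missing values as `None` is a small lookup helper that walks through nested dictionaries and lists and gives up gracefully. This is only a sketch: the name `dig` is hypothetical and the sample item is hand-written rather than taken from a real response.

```python
def dig(data, *keys):
    """Follow keys and indices into nested data; return None if a step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, empty list, or a None where a container was expected
            return None
    return current

item = {'track': {'name': 'Creep', 'album': {'artists': []}}}

print(dig(item, 'track', 'name'))                         # value is present
print(dig(item, 'track', 'album', 'artists', 0, 'name'))  # empty list
print(dig({'track': None}, 'track', 'name'))              # track itself is missing
```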

## Retrieving basic track information

In what follows, you are going to retrieve specific data for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 3 </b>Write the code to extract the title (in string form), the name of the album (in string form), the name of the artist (in string form), the duration (in integer form), the track number (in integer form), the popularity (in integer form), and the id (in string form) for all the tracks included in your chosen playlist. Store these data in separate lists called <i>title</i>, <i>album</i>, <i>artist</i>, <i>duration</i>, <i>track_number</i>, <i>track_popularity</i>, and <i>track_id</i>, respectively. In those cases where more than one value is available for these items, retain only the first. In all cases, entries should appear in the same order as they are presented in the playlist.<br><i>[1.25 points]</i></div>

<div class="alert alert-warning">Make sure you correctly set all variable names. They have to be written <b>exactly</b> as given in the instructions.</div>

In [45]:
title = []
album = []
artist = []
duration = []
track_number = []
track_popularity = []
track_id = []

# Each entry of playlist['items'] wraps the track details under the 'track' key
for obj in playlist['items']:
    track = obj['track']
    title.append(track['name'])
    album.append(track['album']['name'])
    # First artist credited on the track itself (not on the album)
    artist.append(track['artists'][0]['name'])
    duration.append(track['duration_ms'])  # duration in milliseconds
    track_number.append(track['track_number'])
    track_popularity.append(track['popularity'])
    track_id.append(track['id'])

<div class="alert alert-info"><b>Exercise 4 </b>Write the code to extract the number of available markets (in integer form) for all the tracks included in your chosen playlist. Store these data in a separate list called <i>n_available_markets</i>. As above, entries in this list should appear in the same order as they are presented in the playlist.</div>

In [46]:
available_markets = []
n_available_markets = []
for obj in playlist['items']:
    available_markets = obj['track']['available_markets']
    n_available_markets.append(len(available_markets))

Each track in the playlist is associated with an album. Let's retrieve the corresponding release date.

<div class="alert alert-info"><b>Exercise 5 </b>Write the code to extract the release year (in int form) and month (in string form) for all the tracks included in your chosen playlist. Store these data in separate lists called <i>release_year</i> and <i>release_month</i>, respectively. Entries should appear in these lists in the same order as they are presented in the playlist.</div>

<div class="alert alert-warning">Make sure you correctly set the variable names. They have to be written <b>exactly</b> as given in the instructions.</div>

<div class="alert alert-warning">Months should be stored with their full names and capitalized: use <i>September</i> for 09, <i>January</i> for 01, etc.</div>

In [47]:
from datetime import date, datetime
import calendar
release_date=[]
release_year = []
release_month = []
release_day = []

for obj in playlist['items']:
    release_d = obj['track']['album']['release_date']
    
    release_date.append(release_d)
    
    d_split = release_d.split('-')
    
    release_year.append(int(d_split[0]))
    
    if len(d_split) > 1:
        release_month.append(calendar.month_name[int(d_split[1])])
        if len(d_split) > 2:
            release_day.append(int(d_split[2]))
        else:
            release_day.append(None)
    else:
        release_month.append(None)
        release_day.append(None)

## Retrieving additional track information

In what follows, you are going to retrieve additional data for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 6 </b>Write the code to extract data about the danceability, energy, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence and tempo for all the tracks included in your chosen playlist. Store these data in separate lists called <i>danceability</i> (in float form), <i>energy</i>, <i>loudness</i> (in float form), <i>mode</i> (in int form), <i>speechiness</i> (in float form), <i>acousticness</i> (in float form), <i>instrumentalness</i> (in int and float form), <i>liveness</i> (in float form), <i>valence</i> (in float form) and <i>tempo</i> (in float form), respectively. In those cases where more than one value is available for these items, retain only the first. In all cases, entries should appear in the same order as they are presented in the playlist.</div>

In [48]:
danceability = []
energy = []
loudness = []
mode = []
speechiness = []
acousticness = []
instrumentalness_int = []
instrumentalness_float = []
liveness = []
valence = []
tempo = []
playlist_tracksinfo = []

for track in track_id:
    params = {'ids': track}
    response = requests.get(url, params=params, headers=headers).json()
    playlist_tracksinfo.append(response)
    # A single id was requested, so the features are the first (and only) entry
    features = response['audio_features'][0]
    danceability.append(float(features['danceability']))
    energy.append(float(features['energy']))
    loudness.append(float(features['loudness']))
    mode.append(int(features['mode']))
    speechiness.append(float(features['speechiness']))
    acousticness.append(float(features['acousticness']))
    instrumentalness_int.append(int(features['instrumentalness']))
    instrumentalness_float.append(float(features['instrumentalness']))
    liveness.append(float(features['liveness']))
    valence.append(float(features['valence']))
    tempo.append(float(features['tempo']))
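Because the `ids` parameter accepts a comma-separated list, the loop above could in principle send far fewer requests by grouping ids. The sketch below only shows the grouping step. It assumes a limit of 100 ids per call, which is what the documentation listed at the time of writing, so check the Docs before relying on it. The helper `chunked` and the sample ids are placeholders.

```python
def chunked(ids, size=100):
    """Yield comma-joined groups containing at most `size` ids each."""
    for start in range(0, len(ids), size):
        yield ','.join(ids[start:start + size])

sample_ids = [f'id{i}' for i in range(250)]  # placeholder ids, not real tracks

batches = list(chunked(sample_ids))
print(len(batches))                           # number of requests needed
print([b.count(',') + 1 for b in batches])    # ids carried by each request
```

Each batch string could then be passed as `{'ids': batch}`, and the entries of the returned `audio_features` list would be processed one by one.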
    

## Retrieving artist information

In what follows, you are going to retrieve data about the artists for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 7 </b>Write the code to extract data about the total number of followers (in int form), the first listed genre (in string form) and the popularity (in int form) for the artists of all the tracks included in your chosen playlist. Store these data in separate lists called <i>artist_followers</i>, <i>genre</i> and <i>artist_popularity</i>. In cases where no genre is given, fill in the corresponding entry using a None.</div>

<div class="alert alert-warning">Note that the artist information has to be extracted from the tracks themselves.</div>

In [49]:
artists_id = []
artist_followers = []
genre = []
artist_popularity = []
all_art = []
for obj in playlist['items']:
    artistID = obj['track']['artists'][0]['id'] # playlist['items'][0]['track']['artists'][0]['id']
    artists_id.append(artistID)
    params = {'ids':artistID}
    art = requests.get(base_url+'artists/',params=params,headers=headers).json()
    genre.append(art['artists'][0]['genres'][0] if art['artists'][0]['genres'] else None)
    artist_popularity.append(art['artists'][0]['popularity'])
    all_art.append(art)
    artist_follow = art['artists'][0]['followers']['total']
    artist_followers.append(int(artist_follow))

In addition to the above, there's also additional information we can retrieve about each artist. For this purpose, let's first retrieve the list of distinct artists for the different tracks in your playlist.

<div class="alert alert-info"><b>Exercise 8 </b>Write the code to identify the list of unique artist ids that correspond to the different tracks in your chosen playlist. Store these data in a list called <i>unique_artist_ids</i>.</div>

<div class="alert alert-warning">Retrieve the artist ids that correspond directly to the tracks.</div>

In [50]:
unique_artist_ids=list(set(artists_id))
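Converting to a `set` removes duplicates but does not keep the order in which the artists appear in the playlist. If that order matters, one alternative is `dict.fromkeys`, which keeps the first occurrence of each key. The ids below are placeholders used only to illustrate the behaviour.

```python
artists_id_sample = ['a1', 'b2', 'a1', 'c3', 'b2']  # placeholder ids

# dict keys are unique and remember insertion order
unique_in_order = list(dict.fromkeys(artists_id_sample))
print(unique_in_order)
```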

We are now interested in retrieving catalog information about each artist’s top tracks. This information is provided by Spotify's API on a country basis. Here, we will retrieve the information corresponding to Spain, whose *ISO 3166-1 alpha-2* code is **ES**. The information we are looking for is stored under the ```top-tracks``` endpoint for ```artists```. Requests to this location retrieve the 10 most famous tracks for a given artist id.

<div class="alert alert-info"><b>Exercise 9 </b>Write the code to retrieve the 10 top tracks for each of the unique artists in your chosen playlist. Store these data in a dictionary called <i>top_tracks</i>. The keys of this dictionary should correspond to the unique artist ids stored in the list <i>unique_artist_ids</i>. The values of this dictionary should include the names of the 10 most popular songs in a list.</div>

<div class="alert alert-warning">In cases where the number of tracks included in the top-tracks endpoint is below 10, simply retrieve the provided list.</div>


In [51]:
artist_top_tracks = {}

for id in unique_artist_ids:
    top_tracks = []
    artists_url = base_url + 'artists/' + id + '/top-tracks'
    params = {'market': 'ES'}
    response = requests.get(artists_url, params=params, headers=headers).json()
    
    for song in response['tracks']:
        top_tracks.append(song['name'])
    
    artist_top_tracks[id] = top_tracks


We can now use this information to identify those songs in your chosen playlist that correspond to each artist's top tracks.

<div class="alert alert-info"><b>Exercise 10 </b>Write the code to check whether the different tracks in your chosen playlist are included among the corresponding artist's top tracks. Store the results in a list called <i>is_top</i>. This list should include the boolean value <i>True</i> whenever the considered track is among the top tracks for that artist and <i>False</i> otherwise.</div>

<div class="alert alert-warning">When comparing track names, consider only <b>exact</b> matches.</div>

In [52]:
# Check each title against the top tracks of its own artist (exact match only)
is_top = []
for song, artist_key in zip(title, artists_id):
    is_top.append(song in artist_top_tracks.get(artist_key, []))

A very interesting feature in Spotify is the song recommender.

When creating a playlist from scratch, Spotify offers help by suggesting additional tracks that may be related to the ones already added. Spotify's API also provides a specific endpoint for `recommendations`. We will use this endpoint to retrieve one recommendation for each song in our original playlist.

<div class="alert alert-info"><b>Exercise 11 </b>Write the code to retrieve the id, title and artist from the list of recommendations for each track in your playlist. Store these data in new lists called <i>recommendation_id</i>, <i>recommendation_title</i> and <i>recommendation_artist</i>.</div>

<div class="alert alert-warning">Retrieve recommendations based solely on each track.</div>

<div class="alert alert-warning">Retrieve information only about the first track in the recommendations list.</div>

<div class="alert alert-warning">When more than one artist is associated with the recommended track, retrieve information only about the first.</div>

In [None]:
recommendation_id=[]
recommendation_title=[]
recommendation_artists=[]

for id in track_id:
    recommendation_url = base_url+'recommendations'
    params={'seed_tracks':id}
    response = requests.get(recommendation_url,params=params,headers=headers).json()
    recommendation_id.append(response['tracks'][0]['id'])
    recommendation_title.append(response['tracks'][0]['name'])
    recommendation_artists.append(response['tracks'][0]['artists'][0]['name'])

The final step is to save the data you have collected to a csv file. You can do so by first saving the data to a pandas DataFrame and then exporting it to a csv file of your choosing, in the same directory where your notebook is located.

## Saving the data

In what follows, you are going to store all the data you just retrieved in a convenient form.

<div class="alert alert-info"><b>Exercise 12 </b>Write the code to save the data about the title, the name of the album, the name of the artist, the duration, the track number, the release date, the popularity, the id, the number of available markets, the danceability, the energy, the loudness, the mode, the speechiness, the acousticness, the instrumentalness, the liveness, the valence and the tempo for all the tracks included in your chosen playlist, as well as the data about the total number of followers, the first listed genre, the popularity for the artists, whether the tracks are included in the top 10 and what the id, the title and the artist of the first song in the recommendations list are in a dataframe called <i>df</i> and save this DataFrame to a .csv file called 'spotify.csv'. When creating the dataframe make sure the column names are <b>exactly</b> the same as those of the lists you created in previous exercises to store the different values.</div>
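Building a DataFrame from a dictionary of lists requires every list to have the same length, so a quick check beforehand can save some debugging. The sketch below uses only the standard library. `check_equal_lengths` is a hypothetical helper and the sample columns hold placeholder values.

```python
def check_equal_lengths(columns):
    """Return the common length of all lists, or raise ValueError if they differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f'Column lengths differ: {lengths}')
    return next(iter(lengths.values()), 0)

# Placeholder columns standing in for the lists built in the exercises
good = {'title': ['song A', 'song B'], 'duration': [1, 2]}
print(check_equal_lengths(good))

bad = {'title': ['song A', 'song B'], 'duration': [1]}
try:
    check_equal_lengths(bad)
except ValueError as error:
    print(error)
```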

In [None]:
import pandas as pd

data = {
    "Title": title,
    "Album Name": album,
    "Artist Name": artist,
    "Duration": duration,
    "Track Number": track_number,
    "Release Date": release_date,
    "Popularity": track_popularity,
    "Track ID": track_id,
    "Available Markets": n_available_markets,
    "Danceability": danceability,
    "Energy": energy,
    "Loudness": loudness,
    "Mode": mode,
    "Speechiness": speechiness,
    "Acousticness": acousticness,
    "Instrumentalness": instrumentalness_float,
    "Liveness": liveness,
    "Valence": valence,
    "Tempo": tempo,
    "Artist Followers":artist_followers,
    "Artist_genre":genre,
    "Artist_popularity":artist_popularity,
    "Included in Top 10": is_top,
    "Recommendation Song Id":recommendation_id,
    "Recommendation Song":recommendation_title,
    "Recommnedation artist":recommendation_artists
}
df = pd.DataFrame(data)

# Save the DataFrame to a CSV file
df.to_csv('spotify.csv', index=False)

print("Data saved to 'spotify.csv'")

Data saved to 'spotify.csv'


In [None]:
df

Unnamed: 0,Title,Album Name,Artist Name,Duration,Track Number,Release Date,Popularity,Track ID,Available Markets,Danceability,...,Liveness,Valence,Tempo,Artist Followers,Artist_genre,Artist_popularity,Included in Top 10,Recommendation Song Id,Recommendation Song,Recommnedation artist
0,Creep,Pablo Honey,Radiohead,238640,2,1993-02-22,1,6b2oQwSGFkzsMtQruIWm2p,0,0.515,...,0.1290,0.104,91.841,8713899,alternative rock,79,True,6LOiWgNOqduROYlkxFy2bQ,Adderall - Rough Demo,Slipknot
1,Black Swan,The Eraser,Thom Yorke,289826,4,2006-07-10,0,4VbV8Zyjuu1qz0QteX1wVC,0,0.613,...,0.1280,0.509,101.066,970782,alternative rock,55,True,7gTMsKyhm6vuUWoxRaFpsJ,Never as Tired as When I'm Waking Up,LCD Soundsystem
2,Two Thousand and Seventeen,New Energy,Four Tet,252255,2,2017-09-29,58,2ZIaH69kaz55RM4Pjx6KXl,184,0.551,...,0.0939,0.498,75.495,678038,electronica,63,True,4R57uZUgbb81z9Gv97K5cY,Great Day (Four Tet Remix),Madvillain
3,High And Dry,The Bends,Radiohead,257480,3,1995-03-28,0,5jafMI8FLibnjkYTZ33m0c,0,0.418,...,0.0896,0.352,87.773,8713899,alternative rock,79,False,3SyzUB8LeQxplnbWTVYTVj,11-Nov,Team Sleep
4,Karma Police,OK Computer,Radiohead,264066,6,1997-05-28,0,3SVAN3BRByDmHOhKyIDxfC,0,0.360,...,0.1720,0.317,74.807,8713899,alternative rock,79,True,4le6DvrwMv2rpyN1SPeL0g,Sappy - Early Demo,Kurt Cobain
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Cape Canaveral,Conor Oberst,Conor Oberst,244173,1,2008-01-01,25,5RlSc9SkmmV1Ma4Hd7TQYZ,78,0.631,...,0.1080,0.631,126.166,156581,chamber pop,46,False,6CbFgDtwqPPGyHMS5ItdV5,Tickle Me Pink,Johnny Flynn
96,Such Great Heights,Around The Well,Iron & Wine,251280,11,2009-05-19,0,7vcuTZAFyu0Z5dgMRLR0h0,0,0.610,...,0.4030,0.552,94.088,1071538,acoustic pop,61,True,3CAX47TnPqTujLIQTw8nwI,Old Pine,Ben Howard
97,Devil Or Angel,Places,Lou Doillon,244280,2,2012-01-01,39,2B0vzJSvp2Y0Bx8HUN4jyT,14,0.610,...,0.1100,0.284,95.891,84023,french indie pop,44,True,0OLLN9RztJcqymCeeB7URB,Come Back to Me,Hollysiz
98,Into the Mystic - 2013 Remaster,Moondance,Van Morrison,205613,5,1970-02,75,3lh3iiiJeiBXHSZw6u0kh6,171,0.608,...,0.1150,0.797,86.204,2970977,classic rock,71,True,1N4MKISvC1ddfRCRQDXDd2,Madame George - 1999 Remaster,Van Morrison


## Bonus exercise

You can complete the exercises below to obtain extra points. These points will be added to your score directly. The maximum score in this assignment is a 10.

<div class="alert alert-danger"><b>Bonus 1 </b>Write the code to find the most popular song in your playlist. If several songs have the same popularity, choose the one for which the artist has the most followers. Save the name of the song to a new variable called <i>most_popular</i>. This variable should <b>only</b> contain the name of the most popular song in string form.</div>

In [None]:
most_popular = df.sort_values(by=['Popularity', 'Artist Followers'], ascending=[False, False]).iloc[0]['Title']
display(most_popular)

'Yellow'
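The same tie-breaking rule can be illustrated without pandas by comparing tuples: Python compares the first element and only looks at the second when the first ones are equal. The rows below are invented for the example and do not come from the playlist.

```python
rows = [
    {'Title': 'A', 'Popularity': 80, 'Artist Followers': 100},
    {'Title': 'B', 'Popularity': 80, 'Artist Followers': 500},
    {'Title': 'C', 'Popularity': 75, 'Artist Followers': 900},
]

# Highest popularity wins; followers decide only between equally popular songs
best = max(rows, key=lambda r: (r['Popularity'], r['Artist Followers']))
print(best['Title'])
```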

You might have noticed slight inconsistencies between the way track titles appear in the playlist and the way they appear in the artists' top-track lists. For example, the fourth song in the provided playlist appears as <i>High And Dry</i> among the playlist titles but as <i>High and Dry</i> (with a lowercase "a" in the word "and") in Radiohead's top tracks. As a result, an exact comparison does not flag this playlist song as one of the artist's top tracks in the is_top list.
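The effect of such cosmetic differences is easy to reproduce. The sketch below compares the two spellings mentioned above, first exactly and then after a simple normalization. The `normalize` helper is an illustrative assumption, not a prescribed solution.

```python
def normalize(name):
    """Case-fold and collapse whitespace so that cosmetic differences are ignored."""
    return ' '.join(name.casefold().split())

playlist_title = 'High And Dry'
top_track_title = 'High and Dry'

print(playlist_title == top_track_title)                        # exact comparison
print(normalize(playlist_title) == normalize(top_track_title))  # normalized comparison
```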

<div class="alert alert-danger"><b>Bonus 2 </b>Write the code to create a new list called <i>is_top_2</i>. This time make sure that you look for partial matches of title names too.</div>

In [None]:
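is_top_2 = []

# Compare each title only with its own artist's top tracks, ignoring case
# and accepting partial (substring) matches
for song, artist_key in zip(df['Title'], artists_id):
    candidates = artist_top_tracks.get(artist_key, [])
    is_top_2.append(any(song.lower() in top.lower() for top in candidates))

df['Included in Top 10_Ver2']=is_top_2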

In [None]:
df

Unnamed: 0,Title,Album Name,Artist Name,Duration,Track Number,Release Date,Popularity,Track ID,Available Markets,Danceability,...,Valence,Tempo,Artist Followers,Artist_genre,Artist_popularity,Included in Top 10,Recommendation Song Id,Recommendation Song,Recommnedation artist,Included in Top 10_Ver2
0,Creep,Pablo Honey,Radiohead,238640,2,1993-02-22,1,6b2oQwSGFkzsMtQruIWm2p,0,0.515,...,0.104,91.841,8713899,alternative rock,79,True,6LOiWgNOqduROYlkxFy2bQ,Adderall - Rough Demo,Slipknot,True
1,Black Swan,The Eraser,Thom Yorke,289826,4,2006-07-10,0,4VbV8Zyjuu1qz0QteX1wVC,0,0.613,...,0.509,101.066,970782,alternative rock,55,True,7gTMsKyhm6vuUWoxRaFpsJ,Never as Tired as When I'm Waking Up,LCD Soundsystem,True
2,Two Thousand and Seventeen,New Energy,Four Tet,252255,2,2017-09-29,58,2ZIaH69kaz55RM4Pjx6KXl,184,0.551,...,0.498,75.495,678038,electronica,63,True,4R57uZUgbb81z9Gv97K5cY,Great Day (Four Tet Remix),Madvillain,True
3,High And Dry,The Bends,Radiohead,257480,3,1995-03-28,0,5jafMI8FLibnjkYTZ33m0c,0,0.418,...,0.352,87.773,8713899,alternative rock,79,False,3SyzUB8LeQxplnbWTVYTVj,11-Nov,Team Sleep,True
4,Karma Police,OK Computer,Radiohead,264066,6,1997-05-28,0,3SVAN3BRByDmHOhKyIDxfC,0,0.360,...,0.317,74.807,8713899,alternative rock,79,True,4le6DvrwMv2rpyN1SPeL0g,Sappy - Early Demo,Kurt Cobain,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,Cape Canaveral,Conor Oberst,Conor Oberst,244173,1,2008-01-01,25,5RlSc9SkmmV1Ma4Hd7TQYZ,78,0.631,...,0.631,126.166,156581,chamber pop,46,False,6CbFgDtwqPPGyHMS5ItdV5,Tickle Me Pink,Johnny Flynn,False
96,Such Great Heights,Around The Well,Iron & Wine,251280,11,2009-05-19,0,7vcuTZAFyu0Z5dgMRLR0h0,0,0.610,...,0.552,94.088,1071538,acoustic pop,61,True,3CAX47TnPqTujLIQTw8nwI,Old Pine,Ben Howard,True
97,Devil Or Angel,Places,Lou Doillon,244280,2,2012-01-01,39,2B0vzJSvp2Y0Bx8HUN4jyT,14,0.610,...,0.284,95.891,84023,french indie pop,44,True,0OLLN9RztJcqymCeeB7URB,Come Back to Me,Hollysiz,True
98,Into the Mystic - 2013 Remaster,Moondance,Van Morrison,205613,5,1970-02,75,3lh3iiiJeiBXHSZw6u0kh6,171,0.608,...,0.797,86.204,2970977,classic rock,71,True,1N4MKISvC1ddfRCRQDXDd2,Madame George - 1999 Remaster,Van Morrison,True
