# Practice: Spotify's API

This notebook contains a set of exercises that will guide you through the different steps of this practice exercise. Devise solutions that are code-based, i.e. not hard-coded or manually computed. 

<div class="alert alert-success">The aim of this exercise is to create and save a dataset containing information about every song in a given playlist by requesting data from Spotify's API.</div>

## Getting client credentials

Spotify's API uses an Authentication scheme called OAuth. Hence, before starting to make requests, you need to get your client credentials to the Spotify API. 

To do so, you need to have a Spotify account (free or paid). If you don't have one yet, please create a free account before moving on. Once you do, head over to Spotify for Developers, open your [Dashboard](https://developer.spotify.com/dashboard/) and log in with your account. 

Click on “CREATE APP”. Fill in the name and description for your project. Under "Redirect URI", enter "http://localhost:8888/callback" (don't worry about this, as it won't be used). Finally, it the tick boxes, select "Web API". 

Once your App has been created, you should see the app name in the Dashboard. Click on it, then on "Settings", and then on "View Client Secret".  The numbers for “Client ID” and “Client Secret” on correspond to your client credentials.

<div class="alert alert-info">Create two new variables, <b>client_id</b> and <b>client_secret</b>, that store your ID and Key, respectively</div>

In [1]:
client_id = "5236f001113d42aa982409f077a7d5a7"
client_secret = "648c84f93dfc4588b53b6197d410cd40"


These are the keys that you will use throughout the remaining of the assignment.



Great! We are good to go. Next step is getting an access token.

## Getting an access token / key

In order to access the various endpoints of the Spotify API, we need to pass an access token. 

To get one, we need to pass a ```POST``` request with our client credentials. This request will create a token resource in the server and respond back with it. We can build this ```POST``` request using ```requests``` library. In other words, the first API call in the spotify API is the call to get the access token. 

<div class="alert alert-info">Run the following cell to built your POST request</div>

In [2]:
pip install requests

Note: you may need to restart the kernel to use updated packages.


In [3]:
import requests

# URL for token resource
auth_url = 'https://accounts.spotify.com/api/token'

# request body
params = {'grant_type': 'client_credentials',
          'client_id': client_id,
          'client_secret': client_secret}

# POST the request
response =  requests.post(auth_url, data=params)
response.json()


{'access_token': 'BQBWetEdmBkAFFEqkiitkFoLgiV5GnaK4skaYIngTbYWKMLGUWxvT0WDtqoctGpCYCS1-iViSKM-k87T1dDihmSFnFpmEkcRmz-u1qd7n-Q5WA2RUgQ',
 'token_type': 'Bearer',
 'expires_in': 3600}

Take a look at the response you obtained. The access_token you just retrieved will expire after one hour.

<div class="alert alert-info">Retrieve your token from <b>auth_response</b> and save it in a new variable called <b>access_token</b>. </div>

<div class="alert alert-warning">Make sure you define the <i>access_token</i> variable such that it will be updated each time your code is run from scratch, i.e. make sure it hasn't expired by the time your code is graded.</div>

In [4]:
auth_response = response.json()
access_token = auth_response['access_token']

print("Access Token:", access_token)

Access Token: BQBWetEdmBkAFFEqkiitkFoLgiV5GnaK4skaYIngTbYWKMLGUWxvT0WDtqoctGpCYCS1-iViSKM-k87T1dDihmSFnFpmEkcRmz-u1qd7n-Q5WA2RUgQ


As above, the access token will be used throughout the remaining of the assignment.



This token is your golden ticket to access Spotify's API. A copy of this string is now stored in the server, so that everytime you make a request to the API the server will check that the token you provide and the one it has in store match.

<img src="https://www.dropbox.com/s/hgb02k4h1mtdv22/header.png?raw=1" width="500">

Similar to the Cohere API, Spotify's API expects you to include your access token in the requests header. We won't use headers in this course and this is the only exception: when some APIs require to specify the authentication in the header. Hence, the header has already been formatted for you. 

<div class="alert alert-info">Run the following cell to save the header in a new variable so that you can use it later on.</div>

In [5]:
headers = {'Authorization': 'Bearer {token}'.format(token=access_token)}

## Poking around

Spotify's API provides numerous endpoints to access things like album listings, artist information, playlists, even Spotify-generated audio analysis of individual tracks, which include their time signature or measurements such as their “danceability” or "loudness". You can take a look at all the information available by reading the [Docs](https://developer.spotify.com/documentation/web-api/reference/). In this assignment you will use several of these endpoints.

In order to get a feel of how the API works, we will begin by making a ```GET``` request to the ```audio-features``` endpoint to extract data for a specific track. In particular, let's retrieve all the information for Radiohead's *Creep* song. 

The first thing you need is to identify the appropriate URL or path to direct your request to. The urls for all Spotify API endpoints follow the same structure. They all use the base URL for the API and are then defined as a concatenation of ```base_url + endpoint```. Sometimes, you will also need to provide some additional information as part of the URL. In the case of ```audio-features```, however, it is enough with just the ```base_url``` and the ```endpoint``` name.

The ```base_url``` is defined below:

In [6]:
base_url = 'https://api.spotify.com/v1'

<div class="alert alert-info">Define the url for the audio-features endpoint by following the instructions above. Store it in a variable called <b>audio_features_endpoint</b>.</div>

In [7]:


audio_features_endpoint = f"{base_url}/audio-features"

print(audio_features_endpoint)


https://api.spotify.com/v1/audio-features


Next thing we need is to fill in the request body. If you check the documentation you'll see that the ```audio-features``` endpoint takes the following query parameters.

<img src="https://www.dropbox.com/s/s4zs6wlue0u16cu/body.png?raw=1" width="500">

Hence, the final thing you need to extract data about Radiohead's Creep song is to locate its ```id```. This is its unique identifier. Spotify has unique ids for tracks, for artists, for albums, for playlists, etc.

![Creep](https://www.dropbox.com/s/kufj6ww2yn069gb/creep.png?raw=1)

You can get the ```id``` for any song by going to Spotify, looking for the song, clicking the “…” by the song name, then “Share” and then “Copy Spotify URI”. 

<div class="alert alert-warning">Note that this procedure also works for retrieving ids for artists, albums or any other resource type. Alternatively, you can also retrieve the id for a song or a playlist by clicking "..." by the name, then "Share" and the "Copy Link". This will return a full link to the resource. The id corresponds to the string right after the last slash sign / and before the ? sign</div>

This URI should be a string that includes something like **spotify:track:**, followed by an alphanumeric sequence. This sequence is the ID you are looking for.

<div class="alert alert-info">Create a new variable called <b>track_id</b> that stores the ID for Radiohead's song Creep.</div>

In [8]:
track_id = '70LcF31zb1H0PyJoS1Sx1r'

print(f"Track ID: {track_id}")

Track ID: 70LcF31zb1H0PyJoS1Sx1r


Now that you have the id, let's format the URL of the request and send the actual GET request to retrieve the data:

In [9]:
# YOUR CODE HERE
import requests

base_url = "https://api.spotify.com/v1/audio-features"
track_id = "70LcF31zb1H0PyJoS1Sx1r"
request_url = f"{base_url}/{track_id}"
access_token = auth_response['access_token']
headers = {
    "Authorization": f"Bearer {access_token}"
}
response = requests.get(request_url, headers=headers)
print(response.json())

{'danceability': 0.515, 'energy': 0.43, 'key': 7, 'loudness': -9.935, 'mode': 1, 'speechiness': 0.0372, 'acousticness': 0.0097, 'instrumentalness': 0.000133, 'liveness': 0.129, 'valence': 0.104, 'tempo': 91.844, 'type': 'audio_features', 'id': '70LcF31zb1H0PyJoS1Sx1r', 'uri': 'spotify:track:70LcF31zb1H0PyJoS1Sx1r', 'track_href': 'https://api.spotify.com/v1/tracks/70LcF31zb1H0PyJoS1Sx1r', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/70LcF31zb1H0PyJoS1Sx1r', 'duration_ms': 238640, 'time_signature': 4}


Now that everything is ready, you can send the actual GET request to retrieve the data.

<div class="alert alert-info">Remember to pass the <i>url</i>, the <i>headers</i> and the <i>params</i> dictionary as arguments to the <i>get</i> function. Convert the response to <i>json</i> format and store it in a new variable called <b>creep</b>.</div>

In [10]:
import requests

base_url = 'https://api.spotify.com/v1/audio-features'
track_id = '70LcF31zb1H0PyJoS1Sx1r'
request_url = f"{base_url}/{track_id}"
access_token = auth_response["access_token"]

headers = {
    'Authorization': f'Bearer {access_token}'
}

response = requests.get(request_url, headers=headers)
creep = response.json()

print(creep)

{'danceability': 0.515, 'energy': 0.43, 'key': 7, 'loudness': -9.935, 'mode': 1, 'speechiness': 0.0372, 'acousticness': 0.0097, 'instrumentalness': 0.000133, 'liveness': 0.129, 'valence': 0.104, 'tempo': 91.844, 'type': 'audio_features', 'id': '70LcF31zb1H0PyJoS1Sx1r', 'uri': 'spotify:track:70LcF31zb1H0PyJoS1Sx1r', 'track_href': 'https://api.spotify.com/v1/tracks/70LcF31zb1H0PyJoS1Sx1r', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/70LcF31zb1H0PyJoS1Sx1r', 'duration_ms': 238640, 'time_signature': 4}



<div class="alert alert-warning">Often Spotify has several instances of a track in its catalogue, each available in a different set of markets. This commonly happens when the track the album is on has been released multiple times under different licenses in different markets. These tracks are linked together so that when a user tries to play a track that isn’t available in their own market, the Spotify mobile, desktop, and web players try to play another instance of the track that is available in the user’s market. Even if this issue won't prevent you from being able to play the  song in your device, it might result in different ids for the different markets. Note that the test below is written taking the values for the Spanish market into account.</div>

<div class="alert alert-warning">Note that you can specify the market for this and for future queries through the <i>market</i> parameter.</div>

Congrats! You just made your first successful request to Spotify's API! Feel free to take a look at the information included in the response. Pay special attention to the way in which information is presented. Once you are done, let's move on to some actual work!

## Getting data from a playlist

In the following exercise you will build a dataset containing data about different songs. You can either use a playlist of your own, or use the one mentioned in the documentation.


<div class="alert alert-info">Create a variable called <b>playlist_id</b> that stores the id of your playlist of choice.</div>

<div class="alert alert-warning">Your <i>playlist_id</i> should contain only alphanumeric characters. This means that characters such as ? $ % & / ! should not be included.</div>

In [11]:
playlist_id = "37i9dQZF1DZ06evO0hrh28"


In the following you are going to extract information about the different tracks included in this playlist. Hence, let's make sure that the variabl ei spropelry created and that it refers to an actual id.

The next step will be making a request to get full details of the tracks included in your chosen playlist. Remember that you can take a look at all the information available at the different endpoints in Spotify's API by reading the [Docs](https://developer.spotify.com/documentation/web-api/reference/). Locate the right endpoint for your query and read the Docs to find out how to build your request. 

<div class="alert alert-info"><b>Exercise 1 </b>Write the code to retrieve all the items from your chosen playlist from the <i>tracks</i> endpoint. When making your request don't use any of the optional arguments. Direct your request to the right endpoint instead. Store the <i>raw</i> response in a new variable called <i>playlist_response</i>.
</div>

<div class="alert alert-warning">To complete this exercise you must use the requests library</div>

In [12]:
import requests

playlist_id = "37i9dQZF1DZ06evO0hrh28"
base_url = "https://api.spotify.com/v1/playlists"
request_url = f"{base_url}/{playlist_id}/tracks"
access_token = auth_response['access_token']
headers = {
    "Authorization": f"Bearer {access_token}"
}
playlist_response = requests.get(request_url, headers=headers)
print(playlist_response.json())


{'href': 'https://api.spotify.com/v1/playlists/37i9dQZF1DZ06evO0hrh28/tracks?offset=0&limit=100', 'items': [{'added_at': '1970-01-01T00:00:00Z', 'added_by': {'external_urls': {'spotify': 'https://open.spotify.com/user/'}, 'href': 'https://api.spotify.com/v1/users/', 'id': '', 'type': 'user', 'uri': 'spotify:user:'}, 'is_local': False, 'primary_color': None, 'track': {'preview_url': 'https://p.scdn.co/mp3-preview/6b173d09175867ad5af627076c9ec1986365c384?cid=5236f001113d42aa982409f077a7d5a7', 'available_markets': ['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA', 'CL', 'CO', 'CR', 'CY', 'CZ', 'DK', 'DO', 'DE', 'EC', 'EE', 'SV', 'FI', 'FR', 'GR', 'GT', 'HN', 'HK', 'HU', 'IS', 'IE', 'IT', 'LV', 'LT', 'LU', 'MY', 'MT', 'MX', 'NL', 'NZ', 'NI', 'NO', 'PA', 'PY', 'PE', 'PH', 'PL', 'PT', 'SG', 'SK', 'ES', 'SE', 'CH', 'TW', 'TR', 'UY', 'US', 'GB', 'AD', 'LI', 'MC', 'ID', 'JP', 'TH', 'VN', 'RO', 'IL', 'ZA', 'SA', 'AE', 'BH', 'QA', 'OM', 'KW', 'EG', 'MA', 'DZ', 'TN', 'LB', 'JO', 'PS', 'IN', 'BY', 'K

<div class="alert alert-info"><b>Exercise 2 </b>Convert the response to JSON format and store the result in a new variable called <i>playlist</i>.</div>

In [13]:
playlist = "playlist_response.json()"
print(playlist)

playlist_response.json()


The following cells contain the tests that will grade your code. **Don't modify them**. Simply leave them as they are.

Take your time to familiarize yourself with the data and how they are presented. Note that, by default, Spotify's API only returns information about a maximum of 100 tracks in a playlist. If your playlist of choice has more that 100 tracks, you'll retrieve the data only for the first 100 of them.

<div class="alert alert-danger">Throghout the following exercises you may come across data that are missing. If so, encode these data using a <b>None</b> (Nonetype), unless otherwise stated. </div>

## Retrieving basic track information

In what follows, you are going to retrieve specific data for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 3 </b>Write the code to extract the title (in string form), the name of the album (in string form), the name of the artist (in string form), the duration (in integer form), the track number (in integer form), the popularity (in integer form), and the id (in string form) for all the tracks included in your chosen playlist. Store these data in separate lists called <i>title</i>, <i>album</i>, <i>artist</i>, <i>duration</i>, <i>track_number</i>, <i>track_popularity</i>, and <i>track_id</i>, respectively. In those cases where more than one value is available for these items, retain only the first. In all cases, entries should appear in the same order as they are presented in the playlist.<br></div>

<div class="alert alert-warning">Make sure you correctly set all variable names. They have to be written <b>exactly</b> as given in the instructions.</div>

In [14]:
import requests

playlist_id = "37i9dQZF1DZ06evO0hrh28"
base_url = "https://api.spotify.com/v1"
access_token = auth_response['access_token']

headers = {
    "Authorization": f"Bearer {access_token}"
}

playlist_tracks_url = f"{base_url}/playlists/{playlist_id}/tracks"
playlist_response = requests.get(playlist_tracks_url, headers=headers)
playlist = playlist_response.json()

title = []
album = []
artist = []
duration = []
track_number = []
track_popularity = []
track_id = []

for item in playlist['items']:
    track = item['track']
    
    title.append(track['name'])
    album.append(track['album']['name'])
    artist.append(track['artists'][0]['name'])  
    duration.append(track['duration_ms'])
    track_number.append(track['track_number'])
    track_popularity.append(track['popularity'])
    track_id.append(track['id'])

print(f"Titles: {title}")
print(f"Albums: {album}")
print(f"Artists: {artist}")
print(f"Durations (ms): {duration}")
print(f"Track Numbers: {track_number}")
print(f"Track Popularity: {track_popularity}")
print(f"Track IDs: {track_id}")

Titles: ['intoxicao', 'Los del Espacio', 'noviogangsta <3', 'Perdonarte, ¿Para Qué?', 'Alegría', 'CARITA TRISTE', 'La_Playlist.mpeg', 'La_Original.mp3', 'Una Foto Remix (feat. Emilia)', 'No_se_ve.mp3', 'En La Intimidad | CROSSOVER #1', 'Exclusive.mp3', 'como si no importara', 'TU Y YO', 'Uno los Dos', 'Jagger.mp3', 'Quieres', 'cuatro veinte', 'Salgo a Bailar', 'GTA.mp3', 'La Chain', 'cielo en la mente', 'Underground', 'IConic.mp3', 'rápido lento', 'mi otra mitad', 'BB', 'Facts.mp3', 'De Enero a Diciembre', 'la balada', 'Esto Recién Empieza', 'Perreito Salvaje', 'El Plan', 'latin girl', 'Tu Recuerdo', 'NAGASAKI', 'ULTRAVIOLETA', 'HISTERIQUEO', 'No Soy Yo', 'Recalienta', 'Boomshakalaka', 'Bendición', 'Q-Lito', 'Billion', '828', 'BB - Emilia en Vivo', 'Pal Perreo', 'No Más']
Albums: ['intoxicao', 'Los del Espacio', 'noviogangsta <3', 'Perdonarte, ¿Para Qué?', 'Alegría', 'CARITA TRISTE', 'La_Playlist.mpeg', '.mp3', 'Una Foto Remix (feat. Emilia)', 'No_se_ve.mp3', 'En La Intimidad | CROSSOV

<div class="alert alert-info"><b>Exercise 4 </b>Write the code to extract the number of available markets (in integer form) for all the tracks included in your chosen playlist. Store these data in a separate list called <i>n_available_markets</i>. As above, entries in this list should appear in the same order as they are presented in the playlist.</div>

In [15]:

if playlist_response.status_code == 200:
    playlist = playlist_response.json()

    n_available_markets = []

    for item in playlist['items']:
        track = item['track']
        
        n_markets = len(track['available_markets'])
        n_available_markets.append(n_markets)

    print(n_available_markets)
else:
    print(f"Failed to retrieve playlist tracks. Status code: {playlist_response.status_code}")
    print(playlist_response.text)


[185, 184, 185, 183, 184, 185, 185, 185, 184, 185, 185, 185, 185, 185, 185, 185, 183, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 185, 183, 185, 185, 185, 23, 185, 183, 185, 185, 185, 185, 185]


Each track in the playlist is associated to an album. Let's retrieve the corresponding release date.

<div class="alert alert-info"><b>Exercise 5 </b>Write the code to extract the release year (in int form) and month (in string form) for all the tracks included in your chosen playlist. Store these data in separate lists called <i>release_year</i> and <i>release_month</i>, respectively. Entries should appear in these lists in the same order as they are presented in the playlist.</div>

<div class="alert alert-warning">Make sure you correctly set the variable names. They have to be written <b>exactly</b> as given in the instructions.</div>

<div class="alert alert-warning">Months should be stored with their full names and capitalized: use <i>September</i> for 09, <i>January</i> for 01, etc.</div>

In [16]:

from datetime import datetime


release_year = []
release_month = []

for item in playlist['items']:
    album_release_date = item['track']['album']['release_date']
    
    year, month, _ = album_release_date.split('-')
    
    release_year.append(int(year))
    
    month_name = datetime.strptime(month, '%m').strftime('%B')
    release_month.append(month_name)

print(f"Release Years: {release_year}")
print(f"Release Months: {release_month}")

Release Years: [2022, 2023, 2024, 2024, 2024, 2024, 2024, 2023, 2024, 2023, 2023, 2023, 2022, 2024, 2023, 2023, 2022, 2022, 2023, 2023, 2022, 2022, 2022, 2023, 2021, 2022, 2021, 2023, 2021, 2022, 2022, 2021, 2022, 2022, 2023, 2023, 2023, 2020, 2019, 2019, 2019, 2020, 2022, 2019, 2021, 2022, 2022, 2020]
Release Months: ['May', 'June', 'October', 'May', 'August', 'August', 'June', 'November', 'January', 'May', 'February', 'November', 'May', 'February', 'April', 'November', 'July', 'May', 'July', 'November', 'August', 'May', 'November', 'November', 'September', 'May', 'November', 'November', 'December', 'May', 'February', 'May', 'November', 'May', 'March', 'June', 'April', 'March', 'August', 'March', 'November', 'October', 'July', 'December', 'November', 'October', 'March', 'February']


## Retrieving additional track information

In what follows, you are going to retrieve additional data for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 6 </b>Write the code to extract data about the danceability, energy, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence and tempo for all the tracks included in your chosen playlist. Store these data in separate lists called <i>danceability</i> (in float form), <i>energy</i>, <i>loudness</i> (in float form), <i>mode</i> (in int form), <i>speechiness</i> (in float form), <i>acousticness</i> (in float form), <i>instrumentalness</i> (in int and float form), <i>liveness</i> (in float form), <i>valence</i> (in float form) and <i>tempo</i> (in float form), respectively. In those cases where more than one value is available for these items, retain only the first. In all cases, entries should appear in the same order as they are presented in the playlist.</div>

In [17]:

danceability = []
energy = []
loudness = []
mode = []
speechiness = []
acousticness = []
instrumentalness = []
liveness = []
valence = []
tempo = []

for item in playlist['items']:
    track_id = item['track']['id']
    
    audio_features_url = f"{base_url}/audio-features/{track_id}"
    audio_features_response = requests.get(audio_features_url, headers=headers)
    audio_features = audio_features_response.json()

    danceability.append(audio_features['danceability'])
    energy.append(audio_features['energy'])
    loudness.append(audio_features['loudness'])
    mode.append(audio_features['mode'])
    speechiness.append(audio_features['speechiness'])
    acousticness.append(audio_features['acousticness'])
    instrumentalness.append(audio_features['instrumentalness'])
    liveness.append(audio_features['liveness'])
    valence.append(audio_features['valence'])
    tempo.append(audio_features['tempo'])

print(f"Danceability: {danceability}")
print(f"Energy: {energy}")
print(f"Loudness: {loudness}")
print(f"Mode: {mode}")
print(f"Speechiness: {speechiness}")
print(f"Acousticness: {acousticness}")
print(f"Instrumentalness: {instrumentalness}")
print(f"Liveness: {liveness}")
print(f"Valence: {valence}")
print(f"Tempo: {tempo}")

Danceability: [0.929, 0.813, 0.816, 0.672, 0.582, 0.681, 0.753, 0.833, 0.76, 0.765, 0.777, 0.858, 0.843, 0.793, 0.764, 0.543, 0.796, 0.761, 0.73, 0.882, 0.841, 0.801, 0.803, 0.721, 0.793, 0.761, 0.769, 0.871, 0.85, 0.687, 0.866, 0.764, 0.659, 0.9, 0.713, 0.866, 0.852, 0.781, 0.751, 0.754, 0.727, 0.785, 0.639, 0.788, 0.763, 0.618, 0.903, 0.785]
Energy: [0.642, 0.682, 0.872, 0.702, 0.632, 0.793, 0.836, 0.809, 0.703, 0.66, 0.599, 0.677, 0.534, 0.559, 0.408, 0.6, 0.646, 0.696, 0.656, 0.783, 0.616, 0.718, 0.818, 0.821, 0.352, 0.482, 0.764, 0.739, 0.442, 0.352, 0.635, 0.596, 0.648, 0.587, 0.873, 0.612, 0.711, 0.79, 0.82, 0.77, 0.824, 0.594, 0.499, 0.64, 0.693, 0.592, 0.557, 0.757]
Loudness: [-3.179, -3.598, -4.27, -4.662, -3.457, -3.47, -2.584, -2.751, -3.667, -5.059, -4.072, -4.294, -6.658, -4.192, -8.212, -7.185, -3.925, -3.817, -2.628, -4.023, -4.451, -5.213, -3.232, -3.695, -7.675, -8.382, -4.709, -4.891, -5.779, -5.548, -4.042, -6.059, -2.519, -6.117, -3.961, -6.251, -4.017, -3.397, -4.

## Retrieving artist information

In what follows, you are going to retrieve data about the artists for each of the tracks contained in your playlist.

<div class="alert alert-info"><b>Exercise 7 </b>Write the code to extract data about the total number of followers (in int form), the first listed genre (in string form) and the popularity (in int form) for the artists of all the tracks included in your chosen playlist. Store these data in separate lists called <i>artist_followers</i>, <i>genre</i> and <i>artist_popularity</i>. In cases where no genre is given, fill in the corresponding entry using a None.</div>

<div class="alert alert-warning">Note that the artist information has to to be extracted from the tracks themselves.</div>

In [18]:

artist_followers = []
genre = []
artist_popularity = []

for item in playlist['items']:
    track = item['track']
    artist_id = track['artists'][0]['id']  
    
    artist_url = f"{base_url}/artists/{artist_id}"
    artist_response = requests.get(artist_url, headers=headers)
    artist_data = artist_response.json()
    
    followers_count = artist_data['followers']['total']
    artist_followers.append(followers_count)
    
    if artist_data['genres']:
        genre.append(artist_data['genres'][0])  
    else:
        genre.append(None)
    
    popularity = artist_data['popularity']
    artist_popularity.append(popularity)

print(f"Artist Followers: {artist_followers}")
print(f"Genres: {genre}")
print(f"Artist Popularity: {artist_popularity}")

Artist Followers: [3641720, 5765219, 3641720, 7776529, 5365191, 948907, 3641720, 3641720, 517743, 3641720, 633460, 3641720, 3641720, 5748679, 1890155, 3641720, 3092597, 3641720, 1594614, 3641720, 3641720, 3641720, 3641720, 3641720, 3641720, 3641720, 751750, 3641720, 3641720, 3641720, 11304451, 3641720, 1806162, 3641720, 9244048, 576684, 1352027, 751750, 3641720, 3641720, 3517791, 3641720, 58417, 3641720, 44331, 3641720, 1594614, 3641720]
Genres: ['pop argentino', 'argentine hip hop', 'pop argentino', 'gruperas inmortales', 'trap argentino', 'spanish pop', 'pop argentino', 'pop argentino', 'rap uruguayo', 'pop argentino', 'trap argentino', 'pop argentino', 'pop argentino', 'argentine hip hop', 'electronica argentina', 'pop argentino', 'latin talent show', 'pop argentino', 'trap argentino', 'pop argentino', 'pop argentino', 'pop argentino', 'pop argentino', 'pop argentino', 'pop argentino', 'pop argentino', 'cumbia pop', 'pop argentino', 'pop argentino', 'pop argentino', 'argentine hip h

In addition to the above, there's also additional information we can retrieve about each artist. For this purpose, let's first retrieve the list of distinct artists for the different tracks in your playlist.

<div class="alert alert-info"><b>Exercise 8 </b>Write the code to identify the list of unique artist ids that correspond to the different tracks in your chosen playlist. Store these data in a list called <i>unique_artist_ids</i>.</div>

<div class="alert alert-warning">Retrieve the artist ids that correspod directly to the tracks.</div>

In [19]:
unique_artist_ids = set()

for item in playlist['items']:
    track = item['track']
    for artist in track['artists']:
        unique_artist_ids.add(artist['id'])  

unique_artist_ids = list(unique_artist_ids)

print(f"Unique Artist IDs: {unique_artist_ids}")

Unique Artist IDs: ['3E6xrwgnVfYCrCs0ePERDz', '1TtXnWcUs0FCkaZDPGYHdf', '3CDoRporvSjdzTrm99a3gi', '0BqURncJM5B1BBu7UM51eq', '2IKdK6PbitvCiXt1t2bPU6', '0AqlFI0tz2DsEoJlKSIiT9', '1DxLCyH42yaHKGK3cl5bvG', '1vqR17Iv8VFdzure1TAXEq', '4YYxffPVDFe9XoqqbRW6Bq', '6k8mwkKJKKjBILo7ypBspl', '6Itjwvv5YmsC8ZcI5N4Jux', '73jBynjsVtofjRpdpRAJGk', '28gNT5KBp7IjEOQoevXf9N', '7eLcDZDYHXZCebtQmVFL25', '2NfSBtmWe7oPw1EmetJVso', '1Ts9of7VPZElwPQnqnDSfW', '3Apb2lGmGJaBmr0TTBJvIZ', '0ZCO8oVkMj897cKgFH7fRW', '4m6ubhNsdwF4psNf3R8kwR', '1bAftSH8umNcGZ0uyV7LMg', '3wtMPMvPtiFylbnNXF6CAj', '5Y3MV9DZ0d87NnVm56qSY1', '0dUyjgCyjfj5eMx6bX2TWf', '2eEmsgWmUFMbtU7agJpnjY', '6GRwwWAtmusrgAL5JF9Dfr', '5UN0rzL594mWY2RbOtZqIN', '2UZIAOlrnyZmyzt1nuXr9y', '3bvfu2KAve4lPHrhEFDZna', '2Yia9CwtgcrsN5kVOQ0qRA', '7vXDAI8JwjW531ouMGbfcp', '768O5GliF0bqscyghggrbE', '2DspEsT7UXGKd2VaaedgG4', '18qC8mrcJ9ZjChRDPvpadi', '07YUOmWljBTXwIseAUd9TW', '2OhUNb01gLwygOizYvTm0e', '5Rj6rNR8zIlUUDCs1OyPmW', '7FNnA9vBm6EKceENgCGRMb']


We are now interested in retrieving catalog information about each artist’s top tracks. This information is provided by Spotify's API on a country basis. Here, we will retrieve the information corresponding to Spain, whose *ISO 3166-1 alpha-2* code is **ES**. The information we are looking for is stored under the ```top-tracks``` endpoint for ```artists```. Requests to this location retrieve the 10 most famous tracks for a given artist id.

<div class="alert alert-info"><b>Exercise 9 </b>Write the code to retrieve the 10 top tracks for each of the unique artists in your chosen playlist. Store these data in a dictionary called <i>top_tracks</i>. The keys of this dictionary should correspond to the unique artist ids stored in the list <i>unique_artist_id</i>. The values of this dictionary should include the names of the 10 most popular songs in a list.</div>

<div class="alert alert-warning">In cases where the number of tracks included in the top-tracks endpoint is below 10, simply retrieve the provided list.</div>


In [20]:

unique_artist_ids = []

for item in playlist['items']:
    track = item['track']
    artist_id = track['artists'][0]['id']
    if artist_id not in unique_artist_ids:
        unique_artist_ids.append(artist_id)

top_tracks = {}

for artist_id in unique_artist_ids:
    top_tracks_url = f"{base_url}/artists/{artist_id}/top-tracks?market=US"
    top_tracks_response = requests.get(top_tracks_url, headers=headers)
    top_tracks_data = top_tracks_response.json()

    track_names = [track['name'] for track in top_tracks_data['tracks']]

    top_tracks[artist_id] = track_names[:10]  

print(f"Top Tracks: {top_tracks}")

Top Tracks: {'0AqlFI0tz2DsEoJlKSIiT9': ['Alegría', 'Perdonarte ¿Para Qué?', 'CARITA TRISTE', 'La_Original.mp3', 'Una Foto Remix (feat. Emilia)', 'Los del Espacio', 'La_Playlist.mpeg', 'No_se_ve.mp3', 'Exclusive.mp3', 'como si no importara'], '1vqR17Iv8VFdzure1TAXEq': ['Los del Espacio', 'Entre Nosotros', 'Entre Nosotros (Remix) [con Nicki Nicole]', 'Somos 3', 'Carta de Despedida', 'Mala Mía', 'AEROBICO REMIX', 'La Trampa es Ley', 'LuXxX', 'BAD BOY (feat. Juhn, Jairo Vera, Sayian Jimmy, Nysix Music, CamiMusic & Montana the Producer) - Remix'], '0ZCO8oVkMj897cKgFH7fRW': ['Perdonarte ¿Para Qué?', 'El Amor De Mi Vida', 'Nunca Es Suficiente', 'El Listón De Tu Pelo', 'Cómo Te Voy A Olvidar', 'Otra Noche', 'Amor A Primera Vista', '17 Años', 'Tú Y Tú', 'Entrega De Amor'], '5Y3MV9DZ0d87NnVm56qSY1': ['Piel', 'Alegría', 'Una Foto Remix (feat. Emilia)', 'BESAME (feat. Tiago PZK, Khea & Neo Pistea) - Remix', 'Los del Espacio', 'De Vuelta', 'Entre Nosotros', 'Entre Nosotros (Remix) [con Nicki Nicole

We can now use this information to identify those songs in your chosen playlist that correspond to each artist's top tracks.

<div class="alert alert-info"><b>Exercise 10 </b>Write the code to check whether the different tracks in your chosen playlist are included among the corresponding artist's top tracks. Store the results in a list called <i>is_top</i>. This list should include the boolean value <i>True</i> whenever the considered track is among the top tracks for that artist and <i>False</i> otherwise.</div>

<div class="alert alert-warning">When comparing track names, consider only <b>exact</b> matches.</div>

In [21]:
is_top = []

for item in playlist['items']:
    track = item['track']
    track_name = track['name']
    artist_id = track['artists'][0]['id']
    
    top_tracks_url = f"{base_url}/artists/{artist_id}/top-tracks?market=US"
    top_tracks_response = requests.get(top_tracks_url, headers=headers)
    top_tracks = top_tracks_response.json()

    top_track_names = [top_track['name'] for top_track in top_tracks['tracks']]
    is_top.append(track_name in top_track_names)

print(f"Is Top Track: {is_top}")

Is Top Track: [False, True, False, False, True, True, True, True, True, True, True, True, True, False, True, False, False, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, True, False, False, False]


A very interesting feature in Spotify is the song recommender.

When creating a playlist from scratch, Spotify offers help by suggesting additional tracks that may be related to the ones already added. Spotify's API also provides a specific endpoint for `recommendations`. We will use this endpoint to retrieve one recommendation for each song in our original playlist.

<div class="alert alert-info"><b>Exercise 11 </b>Write the code to retrieve the id, title and artist from the list of recommendations for each track in your playlist. Store these data in new lists called <i>recommendatin_id</i>, <i>recommendation_title</i> and <i>recommendation_artist</i>.</div>

<div class="alert alert-warning">Retrieve recommendations based solely on each track.</div>

<div class="alert alert-warning">Retrieve information only about the first track in the recommendations list.</div>

<div class="alert alert-warning">When more than one artist is associated to the recommended track, retrieve information only about the first.</div>

In [22]:

recommendation_id = []
recommendation_title = []
recommendation_artist = []

for item in playlist['items']:
    track = item['track']
    track_id = track['id']
    
    recommendations_url = f"{base_url}/recommendations?seed_tracks={track_id}"
    recommendations_response = requests.get(recommendations_url, headers=headers)
    recommendations = recommendations_response.json()

    if recommendations['tracks']:
        recommended_track = recommendations['tracks'][0]
        recommendation_id.append(recommended_track['id'])
        recommendation_title.append(recommended_track['name'])
        recommendation_artist.append(recommended_track['artists'][0]['name'])
    else:
        recommendation_id.append(None)
        recommendation_title.append(None)
        recommendation_artist.append(None)

print("Recommendation IDs:", recommendation_id)
print("Recommendation Titles:", recommendation_title)
print("Recommendation Artists:", recommendation_artist)

Recommendation IDs: ['0H9WU0OIXPpbOVgzzOanXb', '4nrPB8O7Y7wsOCJdgXkthe', '0mjFlWIfnh1Ahnn05xeRu5', '0xsA8UB6GyQ1xCl3uJ7zeG', '0kTMK4gNFfLXaTb62w1UaJ', '4OrAAeefsDOdUSfE87C6WR', '2qEoAz0i6yEz5dPggABcLH', '1egG8nOtq5l9ZiqVRT7KM6', '3eOkhJYDaTELGhWHBNbf4A', '5DSMm0BZXlSvqArsTX00Ge', '5A1vGSsqw0UQCBxHOyPOjG', '61wAL7pFMMfvYv9SXIiYAB', '4VueCOBroWZxBtDmNbon2r', '3YTe42RPu0iJVr1ZYJHHyC', '0Xn0NaJ2eqT7yWChjfy2E6', '0Sfn2TYbpQtCGMBf6C0Y6T', '4hceSKjrkDTO0nMKFcb3sj', '6WkfdgfHTdpmVHcB3Jn4ks', '4xVIjXgpkGAEejcU0owkSj', '7zJu3gmfC46vRoCQ8MZSIY', '5hqAc7ZD21UREu9mNKuLhr', '1p22yVi9e8DT6BEUvOZ0TL', '6KEb17S00Inf0v1qYDgUAj', '4JQgopha31U9YQn2Hblgah', '4sykJxUsqdXZRqRv1J9fEQ', '4bADyrVual0RgJodXqRQRu', '2l0hr2jYYXdkbF393AThLm', '71q4nNysHdBhXQJI9qnhNd', '3e5gPPx7MTGoYffXH70Tqf', '7w7BrPbOjF5OxChs2dxFve', '2P25SKGU0f3OdZOcxFmZ0x', '21FrtQ99WXCeUR8zAg5DV5', '24iXuKsb2nPPZR6uj50hRu', '4yb5Z9dwnKhO9gC0AZQYDe', '2sGmT34Yii7lrcrtxC4Pkt', '5bGyhsYRNyCwsSYYSCzhxN', '4rBQxFVwTnf99p818azSon', '31dmZHNvM3jTkjIk

## Bonus exercise

You can complete the exercises below to obtain extra points. These points will be added to your score directly. The maximum score in this assignment is a 10.

<div class="alert alert-danger"><b>Bonus 1 </b>Write the code to find the most popular song in your playlist. If several songs have the same popularity, choose the one for which the artist has the most followers. Save the name of the song to a new variable called <i>most_popular</i>. This variable should <b>only</b> contain the name of the most popular song in string form.</div>

In [23]:

most_popular = None
highest_popularity = -1
most_followers = -1

for item in playlist['items']:
    track = item['track']
    track_name = track['name']
    track_popularity = track['popularity']
    artist_id = track['artists'][0]['id']

    artist_url = f"{base_url}/artists/{artist_id}"
    artist_response = requests.get(artist_url, headers=headers)
    artist = artist_response.json()
    artist_followers = artist['followers']['total']

    if track_popularity > highest_popularity or (track_popularity == highest_popularity and artist_followers > most_followers):
        highest_popularity = track_popularity
        most_followers = artist_followers
        most_popular = track_name

print(f" The most popular song is: {most_popular}")

 The most popular song is: Perdonarte, ¿Para Qué?


You might have noticed that there exist slight incosistencies in the way in which the track title are included in the playlist, versus the top 10 artist tracks. For example, the fourht song in the provided playlist appears as <i>High And Dry</i> within the title of the playlist songs but as <i>High and Dry</i> (with a small "a" for the word "and") in the Radiohead's top tracks. This results in the fact that the song of the playlist will not be included as one of the artist's top tracks in the is_top list.

<div class="alert alert-danger"><b>Bonus 2 </b>Write the code to create a new list called <i>is_top_2</i>. This time make sure that you look for partial matches of title names too.</div>

In [49]:
import re

is_top_2 = []

def clean_track_name(name): 
    cleaned_name = re.sub(r'\(.*?\)', '', name).strip().lower()
    return cleaned_name

for item in track:
    track_name = track['name']
    artist_id = track['artists'][0]['id']
    top_tracks_for_artist = top_tracks.get(artist_id, [])
    
    if track_name in top_tracks_for_artist:
        is_top_2.append(True)
    else:
        cleaned_track_name = clean_track_name(track_name)
        cleaned_top_tracks = [clean_track_name(tt) for tt in top_tracks_for_artist]
        
        is_top_partial = cleaned_track_name in cleaned_top_tracks
        is_top_2.append(is_top_partial)

In [51]:
print(is_top_2)

[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False]


In [53]:
import re
from difflib import SequenceMatcher

is_top_2 = []

def clean_track_name(name):
    cleaned_name = re.sub(r'\(.*?\)', '', name).strip().lower()
    return cleaned_name

def is_partial_match(name1, name2, threshold=0.7):
    """Returns True if the similarity ratio between name1 and name2 is above the threshold."""
    return SequenceMatcher(None, name1, name2).ratio() > threshold

for item in track:
    print(type(item), item)  # Check the type and content of 'item'

    # Assuming 'item' is a dictionary, try to access the values safely
    if isinstance(item, dict):
        track_name = item['track']['name']
        artist_id = item['track']['artists'][0]['id']
        top_tracks_for_artist = top_tracks.get(artist_id, [])
        
        cleaned_track_name = clean_track_name(track_name)
        cleaned_top_tracks = [clean_track_name(tt) for tt in top_tracks_for_artist]
        
        is_exact_match = cleaned_track_name in cleaned_top_tracks
        is_partial_match_found = any(is_partial_match(cleaned_track_name, tt) for tt in cleaned_top_tracks)
        
        is_top_2.append(is_exact_match or is_partial_match_found)
    else:
        print("Unexpected item format:", item)



<class 'str'> preview_url
Unexpected item format: preview_url
<class 'str'> available_markets
Unexpected item format: available_markets
<class 'str'> explicit
Unexpected item format: explicit
<class 'str'> type
Unexpected item format: type
<class 'str'> episode
Unexpected item format: episode
<class 'str'> track
Unexpected item format: track
<class 'str'> album
Unexpected item format: album
<class 'str'> artists
Unexpected item format: artists
<class 'str'> disc_number
Unexpected item format: disc_number
<class 'str'> track_number
Unexpected item format: track_number
<class 'str'> duration_ms
Unexpected item format: duration_ms
<class 'str'> external_ids
Unexpected item format: external_ids
<class 'str'> external_urls
Unexpected item format: external_urls
<class 'str'> href
Unexpected item format: href
<class 'str'> id
Unexpected item format: id
<class 'str'> name
Unexpected item format: name
<class 'str'> popularity
Unexpected item format: popularity
<class 'str'> uri
Unexpected item 