## Spotify API project

This project uses the Spotify API to retrieve information about one of my playlists, such as audio features, top artists and their top tracks, and genres.


This project was done during my MSc in Business Analytics.
For this project we refer to Spotify's Web API documentation: https://developer.spotify.com/documentation/web-api/

In [None]:
client_id = ''  # insert your client ID
client_secret = ''  # insert your client secret

We start by requesting an access token from the API

In [None]:
import requests

# URL for token resource
auth_url = 'https://accounts.spotify.com/api/token'

# request body
params = {'grant_type': 'client_credentials',
          'client_id': client_id,
          'client_secret': client_secret}

# POST the request
auth_response = requests.post(auth_url, data=params).json()

We then retrieve the access token

In [None]:
access_token = auth_response['access_token']
print(access_token)

Store the token in a request header that we will send with every call

In [None]:
headers = {'Authorization': 'Bearer {token}'.format(token=access_token)}
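
If the credentials are wrong, the token response carries an error message instead of an `access_token` key, and the lookup above fails with a bare `KeyError`. A minimal sketch of a more defensive header builder is shown below. The function name `bearer_header` is hypothetical, and the `error` / `error_description` keys are assumed to follow the usual OAuth 2.0 error format.

```python
def bearer_header(auth_response):
    """Build the Authorization header from a token response dict.

    Hypothetical helper: raises a readable error when no token is present.
    """
    token = auth_response.get('access_token')
    if not token:
        # Assumed OAuth 2.0 style error payload:
        # {'error': ..., 'error_description': ...}
        reason = (auth_response.get('error_description')
                  or auth_response.get('error')
                  or 'unknown reason')
        raise ValueError('No access token in response: {}'.format(reason))
    return {'Authorization': 'Bearer {}'.format(token)}


# Example with a made-up token value
example_headers = bearer_header({'access_token': 'abc123', 'token_type': 'Bearer'})
print(example_headers)
```

In the notebook, `headers = bearer_header(auth_response)` would replace the two cells above.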

We first build the URL of the audio features endpoint

In [1]:
#define the base url
base_url = 'https://api.spotify.com/v1/'
#use the audio_feature endpoint
audio_features_endpoint = base_url+'audio-features'

# Retrieving one song's features

As a first exercise, we will retrieve the features of the song Creep by Radiohead

In [None]:
#extract the track id from the song's Spotify URI
string =  'spotify:track:6b2oQwSGFkzsMtQruIWm2p'
track_id = string[14:]

We then make a request to the API

In [None]:
params = {'ids': track_id}
r = requests.get(audio_features_endpoint, params=params, headers=headers)
creep = r.json()
print(creep)

# Retrieving information from a playlist

As a second exercise, we will retrieve data from a playlist of your choice. Start by copying the playlist's link and extracting its id.

In [None]:
#retrieved link from : https://open.spotify.com/playlist/0aUiX1IM1m1W3LZf447qpi?si=a619ba7f28dc483b
string ='spotify:playlist:0aUiX1IM1m1W3LZf447qpi'
playlist_id = string[17:]
print(playlist_id)
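
The fixed slice offsets used above (`[14:]`, `[17:]`) depend on the exact length of the prefix. As an alternative, the sketch below splits on the separators instead, so the same function handles track and playlist URIs as well as the shared link itself. `spotify_id` is a hypothetical helper name, not part of any library.

```python
from urllib.parse import urlparse


def spotify_id(link):
    """Return the id from a Spotify URI or an open.spotify.com URL."""
    if link.startswith('spotify:'):
        # URI form, e.g. 'spotify:playlist:<id>': the id is the last field
        return link.split(':')[-1]
    # URL form, e.g. 'https://open.spotify.com/playlist/<id>?si=...':
    # urlparse separates the query string, so only the path is inspected
    path = urlparse(link).path
    return path.rstrip('/').split('/')[-1]


print(spotify_id('spotify:track:6b2oQwSGFkzsMtQruIWm2p'))
print(spotify_id('https://open.spotify.com/playlist/0aUiX1IM1m1W3LZf447qpi?si=a619ba7f28dc483b'))
```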


In [None]:
#create appropriate URL first and then add playlist id and same header params
playlist_response = requests.get(base_url+'playlists/'+playlist_id +'/tracks',headers=headers)
print(playlist_response)

In [None]:
playlist = playlist_response.json()
playlist
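
The response is a paging object: besides `items`, it carries a `next` field holding the URL of the following page, or `null` when there is none. A long playlist may therefore need several requests. The sketch below follows `next` until it runs out. `fetch_all_items` and its `get_json` argument are hypothetical names, and the fake pages exist only so the example runs without network access.

```python
def fetch_all_items(first_url, get_json):
    """Collect the `items` of every page, following the `next` links.

    `get_json` is any callable that takes a URL and returns the parsed page,
    e.g. lambda url: requests.get(url, headers=headers).json()
    """
    items = []
    url = first_url
    while url:
        page = get_json(url)
        items.extend(page['items'])
        url = page.get('next')  # None on the last page
    return items


# Offline illustration with two made-up pages
fake_pages = {
    'page1': {'items': ['a', 'b'], 'next': 'page2'},
    'page2': {'items': ['c'], 'next': None},
}
all_items = fetch_all_items('page1', fake_pages.get)
print(all_items)
```

With the notebook's variables, the call would look like `fetch_all_items(base_url + 'playlists/' + playlist_id + '/tracks', lambda url: requests.get(url, headers=headers).json())`.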

Now that the playlist request is done, we will create lists to store the different items: title, album, artist, duration, track number, release date, track popularity, track id, and number of available markets

In [None]:
#extract the list of items (one per track) from the response
playlist_items = playlist['items']

#create an empty list for each item
title = []
album = []
artist = []
duration = []
track_number = []
release_date = []
track_popularity = []
track_id = []
n_available_markets = []

#loop through the items and append them

for i in playlist_items:
    title.append(i['track']['name'])
    album.append(i['track']['album']['name'])
    artist.append(i['track']['artists'][0]['name'])
    duration.append(i['track']['duration_ms'])
    track_number.append(i['track']['track_number'])
    release_date.append(i['track']['album']['release_date'])
    track_popularity.append(i['track']['popularity'])
    track_id.append(i['track']['id'])
    
    #store None instead of 0 when no markets are listed
    if len(i['track']['album']['available_markets']) == 0:
        n_available_markets.append(None)
    else:
        n_available_markets.append(len(i['track']['album']['available_markets']))
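
The loop above assumes every item contains a complete `track` object. As a sketch of a more defensive variant, the function below builds one dictionary per track and skips items whose `track` is missing, which we assume can occur for entries that are no longer available. `track_row` and `sample_items` are hypothetical names, and the sample data is made up.

```python
def track_row(item):
    """Flatten one playlist item into a dict, or return None if it has no track."""
    track = item.get('track')
    if not track:
        return None
    markets = track['album'].get('available_markets') or []
    return {
        'title': track['name'],
        'album': track['album']['name'],
        'artist': track['artists'][0]['name'],
        'duration': track['duration_ms'],
        'track_number': track['track_number'],
        'release_date': track['album']['release_date'],
        'track_popularity': track['popularity'],
        'track_id': track['id'],
        # None when no markets are listed, as in the loop above
        'n_available_markets': len(markets) or None,
    }


sample_items = [
    {'track': {'name': 'Song A', 'id': 'id_a', 'duration_ms': 1000,
               'track_number': 1, 'popularity': 50,
               'artists': [{'name': 'Artist A'}],
               'album': {'name': 'Album A', 'release_date': '2000-01-01',
                         'available_markets': ['ES', 'FR']}}},
    {'track': None},
]
rows = [row for row in map(track_row, sample_items) if row is not None]
print(rows)
```

A list of dictionaries like `rows` can be passed directly to `pd.DataFrame`, which keeps all fields of a track together in one record.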

We will now retrieve further information about our playlist songs, more specifically the *audio features*

In [None]:
#make the request to the API
params = {'ids' : ','.join(track_id)}
result = requests.get(audio_features_endpoint, headers = headers , params = params).json()


#create main list
playlist_audio_features = result['audio_features']

#create an empty list for each feature
danceability = []
energy = []
loudness = []
mode = []
speechiness = []
acousticness = []
instrumentalness = []
liveness = []
valence = []
tempo = []



#loop through the items and append them

for i in playlist_audio_features:
    danceability.append(i['danceability'])
    energy.append(i['energy'])
    loudness.append(i['loudness'])
    mode.append(i['mode'])
    speechiness.append(i['speechiness'])
    acousticness.append(i['acousticness'])
    instrumentalness.append(i['instrumentalness'])
    liveness.append(i['liveness'])
    valence.append(i['valence'])
    tempo.append(i['tempo'])
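
The audio features endpoint accepts a limited number of ids per request (100 according to Spotify's documentation at the time of writing), so a longer playlist has to be split into batches. The sketch below shows the batching step only. `chunked` is a hypothetical helper, and the commented loop reuses the notebook's variables.

```python
def chunked(values, size):
    """Yield consecutive slices of `values` with at most `size` elements each."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


# Made-up ids to illustrate the batching
example_ids = ['id{}'.format(n) for n in range(5)]
batches = list(chunked(example_ids, 2))
print(batches)

# Possible use with the notebook's variables (not executed here):
# playlist_audio_features = []
# for batch in chunked(track_id, 100):
#     page = requests.get(audio_features_endpoint, headers=headers,
#                         params={'ids': ','.join(batch)}).json()
#     playlist_audio_features.extend(page['audio_features'])
```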


# Retrieving artist information

Using the same playlist, we will now retrieve *artist information*: the number of followers, the genre, and the artist popularity.

In [None]:
artist_info = []

#collect the id of the first listed artist of each track
for i in playlist_items:
    artist_info.append(i['track']['artists'][0]['id'])

#request the artist objects for the first 50 ids collected above
artist_url = 'https://api.spotify.com/v1/artists'
params = {'ids': ','.join(artist_info[:50])}
r = requests.get(artist_url, headers=headers, params=params).json()
artist_response = r['artists']

artist_followers = []
genre =[]
artist_popularity=[]

for i in artist_response:
    artist_followers.append(i['followers']['total'])
    artist_popularity.append(i['popularity'])
    if len(i['genres']) == 0:
        genre.append(None)
    else:
        genre.append(i['genres'][0])

print(artist_followers)
print(genre)
print(artist_popularity)
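
Because only the first 50 ids are requested, `artist_followers`, `genre` and `artist_popularity` can end up shorter than the track lists when the playlist is longer. One way to keep everything aligned, sketched below under the assumption that all needed artist objects have already been fetched (possibly over several requests), is to index the artist objects by id and then look up each track's artist. `artist_columns` is a hypothetical name and the sample objects are made up.

```python
def artist_columns(artist_ids, artist_objects):
    """Return followers, genre and popularity lists aligned with `artist_ids`."""
    by_id = {a['id']: a for a in artist_objects}
    followers, genres, popularity = [], [], []
    for artist_id in artist_ids:
        info = by_id.get(artist_id)
        if info is None:
            # Artist object not available: keep the row, fill with None
            followers.append(None)
            genres.append(None)
            popularity.append(None)
            continue
        followers.append(info['followers']['total'])
        popularity.append(info['popularity'])
        # First listed genre, or None when the list is empty
        genres.append(info['genres'][0] if info['genres'] else None)
    return followers, genres, popularity


sample_artists = [
    {'id': 'a1', 'followers': {'total': 10}, 'popularity': 70, 'genres': ['rock']},
    {'id': 'a2', 'followers': {'total': 5}, 'popularity': 40, 'genres': []},
]
followers, genres, popularity = artist_columns(['a1', 'a2', 'a1', 'a3'], sample_artists)
print(followers, genres, popularity)
```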

As a second step, we will now retrieve the *unique artist ids* of the artists in the playlist

In [None]:
unique_artist_id = []
for artist_id in artist_info:
    #keep only the first occurrence of each id
    if artist_id not in unique_artist_id:
        unique_artist_id.append(artist_id)
unique_artist_id

We are now interested in retrieving catalog information about each artist's top tracks. Spotify's API provides this information on a per-country basis. Here, we will retrieve the information for Spain, whose ISO 3166-1 alpha-2 code is ES. The information we are looking for is stored under the top-tracks endpoint for artists. Requests to this endpoint return up to 10 of the most popular tracks for a given artist id.

In [None]:
top_tracks = dict.fromkeys(artist_info)
params = {'market': 'ES'}

#request the top tracks of each unique artist and keep the track names
#note: the loop variable must not be called `artist`, which holds the list of artist names
for artist_id in unique_artist_id:
    url = 'https://api.spotify.com/v1/artists/' + artist_id + '/top-tracks'
    tracks_artist = requests.get(url, params=params, headers=headers).json()['tracks']
    top_tracks[artist_id] = [track['name'] for track in tracks_artist]

top_tracks

We will now write code to identify whether each track in our chosen playlist is among its artist's top tracks. We will store the results in a list called *is_top*. It will contain "Yes" when the track is one of the artist's top tracks, and None otherwise. The code looks for exact matches on the track id only.

In [None]:
params = {'market': 'ES'}
is_top = []

#compare each playlist track with the top tracks of its first listed artist
for current_track_id, artist_id in zip(track_id, artist_info):
    url = 'https://api.spotify.com/v1/artists/' + artist_id + '/top-tracks'
    tracks_artist = requests.get(url, params=params, headers=headers).json()['tracks']
    tracks_artist_ids = [track['id'] for track in tracks_artist]
    if current_track_id in tracks_artist_ids:
        is_top.append('Yes')
    else:
        is_top.append(None)

#count how many tracks are (and are not) among the top tracks
yes_counter = is_top.count('Yes')
none_counter = is_top.count(None)

is_top
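
The loop above sends one request per track, even when several tracks share an artist. A sketch of an alternative is to fetch each unique artist's top-track ids once, keep them in a dictionary, and do the comparison offline. `flag_top_tracks` and `top_ids_by_artist` are hypothetical names. The dictionary is filled with made-up ids here so the example runs without the API.

```python
def flag_top_tracks(track_ids, artist_ids, top_ids_by_artist):
    """Return 'Yes' or None per track, depending on whether its id is among
    the top-track ids stored for its artist."""
    flags = []
    for current_track_id, artist_id in zip(track_ids, artist_ids):
        top_ids = top_ids_by_artist.get(artist_id, [])
        flags.append('Yes' if current_track_id in top_ids else None)
    return flags


# Made-up data: two tracks by artist 'a1', one by 'a2'
top_ids_by_artist = {'a1': ['t1', 't9'], 'a2': ['t7']}
flags = flag_top_tracks(['t1', 't2', 't3'], ['a1', 'a1', 'a2'], top_ids_by_artist)
print(flags)
```

In the notebook, `top_ids_by_artist` would be built in the same loop that fills `top_tracks`, storing `track['id']` instead of `track['name']`.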


Finally, we store this data in a dataframe and save it as a CSV file

In [None]:
import pandas as pd

df = pd.DataFrame(data={
    'title':title,
    'album':album,
    'artist':artist,
    'duration':duration,
    'track_number':track_number,
    'release_date':release_date,
    'track_id':track_id,
    'track_popularity': track_popularity,
    'n_available_markets':n_available_markets,
    'danceability':danceability,
    'energy':energy,
    'loudness':loudness,
    'mode':mode,
    'speechiness':speechiness,
    'acousticness':acousticness,
    'instrumentalness':instrumentalness,
    'liveness':liveness,
    'valence':valence,
    'tempo':tempo,
    'followers':artist_followers,
    'genre':genre,
    'popularity':artist_popularity,
    'top 10': is_top
})


df.to_csv(r'#PATH TO YOUR SAVING LOCATION')
df
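
`pd.DataFrame` raises an error when the lists passed as columns differ in length, which can happen here if one of the requests returned fewer entries than the playlist has tracks. A small check run before building the dataframe makes the offending column easier to spot. This is only a sketch: `mismatched_columns` is a hypothetical helper and the sample columns are made up.

```python
def mismatched_columns(columns):
    """Return {name: length} for columns whose length differs from the first one."""
    lengths = {name: len(values) for name, values in columns.items()}
    expected = next(iter(lengths.values()), 0)
    return {name: n for name, n in lengths.items() if n != expected}


sample_columns = {
    'title': ['a', 'b', 'c'],
    'energy': [0.1, 0.2, 0.3],
    'genre': ['rock', None],
}
problems = mismatched_columns(sample_columns)
print(problems)
```

An empty result means all columns line up and the dictionary can be handed to `pd.DataFrame` as is.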