## Spotify Script

We want to get the audio features of the songs that did well in past Eurovisions.

We assume the input is two lists of strings, one of song names and one of artist names, aligned by index: the artist at index i of the artist list performed the song at index i of the song list.
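As a minimal sketch of that input convention, the two lists can be checked for equal length and paired up with `zip`. The length check is an added safeguard, not part of the original script:

```python
songs = ['Euphoria', 'Rise Like a Phoenix']
artists = ['Loreen', 'Conchita Wurst']

# Guard against the two lists drifting out of sync (added safeguard).
if len(songs) != len(artists):
    raise ValueError("songs and artists must have the same length")

# zip pairs the element at index i of one list with index i of the other.
pairs = list(zip(songs, artists))
for song, artist in pairs:
    print(f"{song} by {artist}")
```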

In [1]:
import json
import urllib.request
from urllib.request import Request
# json_normalize is exposed at the top level since pandas 1.0; the pandas.io.json path is deprecated
from pandas import json_normalize

We now need a token to access the Spotify API. For now, I'm using a temporary token generated with the "Get Token" button here (it expires quickly, so this string will need to be replaced very often): https://developer.spotify.com/console/get-audio-features-track/?id=06AKEBrKUckW0KREUWRnvT

In [2]:
current_token = 'BQAtamt872wkWRW0SWSWTUfKS5orAzeXKBRd4hyT1aYHGCzEHhFmpWuSe-P_02-H0bjlwhFAOgyEiqZbewpmLJ6Kl7T1w7kfyDHdumtckl9eTvv-4YM0EQzv2YTyMaN4PMfiVuQaFeqhE6JtvJQ'
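Since the temporary token expires quickly, one possible alternative is Spotify's Client Credentials flow, which exchanges an app's client ID and secret for an access token. The sketch below only builds the request and does not send it. The function name `build_token_request` and the placeholder credentials are hypothetical, and it assumes you have registered an app that is allowed to call the endpoints used in this notebook:

```python
import base64
import urllib.parse
from urllib.request import Request

TOKEN_URL = 'https://accounts.spotify.com/api/token'

def build_token_request(client_id, client_secret):
    """Build (but do not send) a Client Credentials token request."""
    # The credentials go in a Basic auth header as base64("id:secret").
    credentials = f"{client_id}:{client_secret}".encode()
    body = urllib.parse.urlencode({'grant_type': 'client_credentials'}).encode()
    request = Request(TOKEN_URL, data=body, method='POST')
    request.add_header('Authorization', 'Basic ' + base64.b64encode(credentials).decode())
    request.add_header('Content-Type', 'application/x-www-form-urlencoded')
    return request

# Placeholder credentials (hypothetical); replace with your app's values.
token_request = build_token_request('my-client-id', 'my-client-secret')
print(token_request.get_method(), token_request.full_url)
```

Sending the request with `urllib.request.urlopen(token_request)` and parsing the body with `json.load` should yield an object whose `access_token` field can be used as `current_token`.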

In [3]:
songs = ['Euphoria', 'Rise Like a Phoenix']
artists = ['Loreen','Conchita Wurst']

We also want to keep track of null values (songs that the Spotify search does not find):

In [4]:
countNulls = 0

`data` is our result list; it collects the JSON object returned for each song:

In [6]:
from urllib.parse import quote

data = []
for name, artist_name in zip(songs, artists):

    # percent-encoding spaces and any other special characters (accents, &, ...)
    song = quote(name, safe='')
    artist = quote(artist_name, safe='')

    # going from artist / song name to track ID (https://developer.spotify.com/documentation/web-api/reference/search/search/)
    request = Request('https://api.spotify.com/v1/search?q=track:' + song + '%20artist:' + artist + '&type=track&limit=1')
    request.add_header('Accept', 'application/json')
    request.add_header('Content-Type', 'application/json')
    request.add_header('Authorization', 'Bearer ' + current_token)
    res = urllib.request.urlopen(request)
    resObject = json.load(res)

    if len(resObject["tracks"]["items"]) == 0:
        # the search returned no match for this song
        countNulls += 1
    else:
        # despite the variable name, this is the track's Spotify ID, not its full URI
        songURI = resObject["tracks"]["items"][0]["id"]

        # going from track ID -> audio features (https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/)
        audioRequest = Request('https://api.spotify.com/v1/audio-features/' + songURI)
        audioRequest.add_header('Accept', 'application/json')
        audioRequest.add_header('Content-Type', 'application/json')
        audioRequest.add_header('Authorization', 'Bearer ' + current_token)
        audioRes = urllib.request.urlopen(audioRequest)
        jsonObject = json.load(audioRes)

        # adding the original (unencoded) song name into the JSON
        jsonObject["name"] = name

        data.append(jsonObject)

# converting the list of JSON objects -> a dataframe
df = json_normalize(data)
print(df)

   danceability  energy  key  loudness  mode  speechiness  acousticness  \
0         0.562   0.783   11    -4.727     0       0.0428         0.135   
1         0.425   0.503    2    -6.324     0       0.0256         0.142   

   instrumentalness  liveness  valence    tempo            type  \
0          0.000014     0.282    0.391  132.060  audio_features   
1          0.000000     0.105    0.230   81.653  audio_features   

                       id                                   uri  \
0  4Qh4H4KxrlvTjPA6sAJC07  spotify:track:4Qh4H4KxrlvTjPA6sAJC07   
1  1ijX03QOR6a1wI322HifSV  spotify:track:1ijX03QOR6a1wI322HifSV   

                                          track_href  \
0  https://api.spotify.com/v1/tracks/4Qh4H4KxrlvT...   
1  https://api.spotify.com/v1/tracks/1ijX03QOR6a1...   

                                        analysis_url  duration_ms  \
0  https://api.spotify.com/v1/audio-analysis/4Qh4...       181787   
1  https://api.spotify.com/v1/audio-analysis/1ijX...       1830
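The two-step lookup in the loop (search for the track, then fetch its audio features) can also be wrapped in a function, which makes it possible to try the logic without a valid token. This is only a sketch: `search_url`, `audio_features_for`, and the `get_json` parameter are hypothetical names, and the fake fetcher returns made-up values rather than real Spotify data:

```python
from urllib.parse import quote

API = 'https://api.spotify.com/v1'

def search_url(song, artist):
    # Percent-encode the whole query, including spaces and colons.
    query = quote(f'track:{song} artist:{artist}')
    return f'{API}/search?q={query}&type=track&limit=1'

def audio_features_for(song, artist, get_json):
    """Return the audio-features dict for a song, or None if the search is empty.

    get_json is any callable that maps a URL to parsed JSON, for example a
    wrapper around urllib.request.urlopen that adds the Authorization header.
    """
    items = get_json(search_url(song, artist))['tracks']['items']
    if not items:
        return None
    features = get_json(f"{API}/audio-features/{items[0]['id']}")
    features['name'] = song
    return features

# Fake fetcher with made-up responses, standing in for the real API.
def fake_get_json(url):
    if '/search' in url:
        return {'tracks': {'items': [{'id': 'abc123'}]}}
    return {'id': 'abc123', 'tempo': 120.0}

result = audio_features_for('Euphoria', 'Loreen', fake_get_json)
print(result)
```

Passing the fetcher in as an argument keeps the URL-building and null-handling logic separate from the network call, so each can be checked on its own.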

We then export our results from the dataframe into CSV:

In [7]:
df.to_csv(r'spotify_audio_features.csv')

### Number of Nulls

We now see how many null values there were:

In [8]:
print("Number of Nulls: ", countNulls)
print("Number of Songs: ", len(df))

Number of Nulls:  0
Number of Songs:  2
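A counter says how many songs were missing but not which ones. One possible variation, sketched here with made-up search results instead of live API calls, is to collect the missing (song, artist) pairs in a list. The `fake_results` dictionary and the second song in it are hypothetical:

```python
# Hypothetical search results keyed by (song, artist). An empty "items" list
# mimics what the search endpoint returns when nothing matches.
fake_results = {
    ('Euphoria', 'Loreen'): {'tracks': {'items': [{'id': '4Qh4H4KxrlvTjPA6sAJC07'}]}},
    ('Not A Real Song', 'Nobody'): {'tracks': {'items': []}},
}

missing = []
for (song, artist), res_object in fake_results.items():
    if not res_object['tracks']['items']:
        missing.append((song, artist))

# len(missing) plays the role of countNulls.
print("Number of Nulls: ", len(missing))
print("Missing: ", missing)
```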


### Miscellaneous Stuff

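To see what a res object (where res = urllib.request.urlopen(request)) looks like as raw JSON text, print its body. The body can only be read once, so do this instead of calling json.load(res), not after it: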

    print(res.read().decode())
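The read-once behaviour can be seen without calling the API. In this sketch an `io.BytesIO` object stands in for the HTTP response (an assumption for illustration; a real response is a different type but is also a stream that is consumed by reading), and the JSON payload is made up:

```python
import io
import json

# io.BytesIO stands in for the response returned by urlopen.
fake_res = io.BytesIO(b'{"tracks": {"items": []}}')

# Read the body once, keep the text, and parse it from the saved string.
raw = fake_res.read().decode()
print(raw)
res_object = json.loads(raw)

# A second read returns nothing because the stream is already consumed.
leftover = fake_res.read()
print(leftover)
```

Saving the decoded text in a variable lets you both inspect it and parse it with `json.loads`.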

Here's a dataset whose description explains what each Spotify track feature means: https://www.kaggle.com/nadintamer/top-spotify-tracks-of-2018