# Retrieving Spotify Data Using the Spotify API

The goal of this notebook is to use the <a href='https://developer.spotify.com/documentation/web-api/'>Spotify API</a> to retrieve my personal liked tracks. After the track IDs are retrieved, the Spotify track features endpoint is called for each one. The track features describe a track in terms of danceability, energy, key, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence, and tempo.

In [None]:
import os 
import requests
import joblib
import datetime
import time

Create a `SpotifyAPI` class whose methods supply the authentication parameters to the Spotify API so that an authentication token can be retrieved.

In [None]:
class SpotifyAPI:
    """Small helper around the Spotify authorization and saved-tracks endpoints."""

    authorize_url = 'https://accounts.spotify.com/authorize'
    tracks_request_url = 'https://api.spotify.com/v1/me/tracks'

    def __init__(self, client_id, client_secret):
        self.client_id = client_id
        self.client_secret = client_secret

    def get_authorization_code(self):
        """Print and return the URL to open in a browser to authorize the app."""

        authorize_params = {
            'client_id': self.client_id,
            'response_type': 'token',
            'redirect_uri': 'https://developer.spotify.com/documentation/web-api/reference-beta/#endpoint-get-users-saved-tracks',
            'scope': 'user-library-read user-read-recently-played'
        }

        r = requests.get(self.authorize_url, params=authorize_params)

        print(r.url)

        # Return the URL as well, so that callers can store it in a variable.
        return r.url

    def get_track_data(self, access_token):
        """Request the saved tracks with the given token and return the HTTP status code."""

        # Scopes are granted during authorization, so only the bearer token is sent here.
        tracks_request_header = {
            'Authorization': f'Bearer {access_token}'
        }

        r = requests.get(self.tracks_request_url, headers=tracks_request_header)

        return r.status_code
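
The URL printed by `get_authorization_code` can also be assembled without sending a request. The following is a minimal sketch using only the standard library; the function name `build_authorize_url`, the placeholder client ID, and the `example.com` redirect URI are illustrative assumptions and not part of the original notebook.

```python
from urllib.parse import urlencode

AUTHORIZE_URL = 'https://accounts.spotify.com/authorize'


def build_authorize_url(client_id, redirect_uri, scopes):
    """Return the URL a user opens in a browser to grant access (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'response_type': 'token',
        'redirect_uri': redirect_uri,
        # Scopes are sent as a single space-separated string.
        'scope': ' '.join(scopes),
    }
    return f'{AUTHORIZE_URL}?{urlencode(params)}'


demo_authorize_url = build_authorize_url(
    'my-client-id',                  # placeholder, not a real client ID
    'https://example.com/callback',  # hypothetical redirect URI
    ['user-library-read', 'user-read-recently-played'],
)
print(demo_authorize_url)
```

Building the URL locally avoids a network round trip, though unlike the method above it does not show the login redirect that Spotify applies when the user is not signed in.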

To get credentials, you must create a Spotify web app on your <a href='https://developer.spotify.com/dashboard/login'>dashboard</a>. Once the web app is created, the client ID and secret key are listed under the web app details.

In [None]:
spotify_client_id = os.environ['SPOTIFY_API_CLIENT_ID']
spotify_secret_key = os.environ['SPOTIFY_API_SECRET_KEY']

spotify_client = SpotifyAPI(spotify_client_id, spotify_secret_key)

The authentication token is obtained by opening the URL produced below. After you follow the link and authorize the app, the token is contained in the URL of the redirect page.

In [5]:
auth_code_url = spotify_client.get_authorization_code()

https://accounts.spotify.com/login?continue=https%3A%2F%2Faccounts.spotify.com%2Fauthorize%3Fscope%3Duser-library-read%2Buser-read-recently-played%26response_type%3Dtoken%26redirect_uri%3Dhttps%253A%252F%252Fdeveloper.spotify.com%252Fdocumentation%252Fweb-api%252Freference-beta%252F%2523endpoint-get-users-saved-tracks%26client_id%3D853fe527a1704e18845533fb15278ead
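
Copying the token out of the redirect URL by hand is error-prone, so it may help to parse it. The sketch below assumes the token arrives as an `access_token` parameter in the URL fragment (the part after `#`), which is how the `response_type=token` flow typically returns it. The function name and the example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def extract_access_token(redirect_url):
    """Return the access_token value found in the fragment of redirect_url."""
    fragment = urlparse(redirect_url).fragment
    params = parse_qs(fragment)
    tokens = params.get('access_token')
    if not tokens:
        raise ValueError('no access_token found in the URL fragment')
    return tokens[0]


# Made-up redirect URL, for illustration only.
example_redirect = (
    'https://example.com/callback'
    '#access_token=abc123&token_type=Bearer&expires_in=3600'
)
example_token = extract_access_token(example_redirect)
print(example_token)
```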


Again, the token is taken from the URL of the redirect page. The for loop below gets the Spotify track IDs of all the liked songs in my Spotify account, and the list is dumped as a joblib object to be accessed in other notebooks.

In [6]:
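# `token` must hold the access token copied from the URL of the redirect page.

track_ids = []

# The same authorization header is reused for every request.
header = {
    'Authorization': f'Bearer {token}'
}

# Request up to 20 pages of 50 tracks each.
for i in range(20):

    offset = i*50

    tracks_url = f'https://api.spotify.com/v1/me/tracks?offset={offset}&limit=50'

    tracks_response = requests.get(tracks_url, headers=header)

    # Fail with a clear HTTP error (for example, an expired token) instead of a KeyError below.
    tracks_response.raise_for_status()

    tracks_json = tracks_response.json()

    for track in tracks_json['items']:
        track_id = track['track']['id']
        track_ids.append(track_id)

joblib.dump(track_ids,'../Joblib_Objects/liked_track_ids')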

['Joblib_Objects/liked_track_ids']
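
The loop above assumes the library holds at most 20 pages of 50 tracks. As an alternative sketch, the paged response also carries a `next` field with the URL of the following page (or `null` on the last page), so the pages can be followed until none remain. The helper name `collect_track_ids` and the `fetch_page` callable are assumptions introduced here so that the logic can be exercised without network access.

```python
def collect_track_ids(fetch_page, first_url):
    """Follow `next` links from first_url and return all track IDs found.

    fetch_page is any callable that takes a URL and returns the parsed JSON page.
    """
    track_ids = []
    url = first_url
    while url:
        page = fetch_page(url)
        for item in page.get('items', []):
            track = item.get('track')
            # Skip entries without a usable track ID.
            if track and track.get('id'):
                track_ids.append(track['id'])
        url = page.get('next')
    return track_ids


# Canned pages standing in for API responses, for illustration only.
fake_pages = {
    'page-1': {'items': [{'track': {'id': 'a'}}, {'track': {'id': 'b'}}], 'next': 'page-2'},
    'page-2': {'items': [{'track': {'id': 'c'}}], 'next': None},
}
demo_ids = collect_track_ids(fake_pages.__getitem__, 'page-1')
print(demo_ids)
```

Against the real API, `fetch_page` could be something like `lambda url: requests.get(url, headers=header).json()`, starting from `https://api.spotify.com/v1/me/tracks?limit=50`.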

Once all the liked track IDs are saved in a list, the track features can be retrieved. The Spotify audio/track features endpoint does not include the genre of the song. To get the genre, I collected the ID of each track's first listed artist and used it to retrieve that artist's details, including their genres. These artist genres are what define the genres of the tracks. All the data was saved into a pandas DataFrame and exported as a CSV.
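
One detail worth noting: the artist endpoint returns `genres` as a list of strings, so writing that column to a CSV stores the list's string representation. If a flat text field is preferred, one option is to join the genres with a separator. This is a sketch only; the helper names, the `|` separator, and the sample genre values are assumptions rather than part of the original notebook.

```python
def genres_to_field(genres, sep='|'):
    """Join a list of genre names into one CSV-friendly string."""
    return sep.join(genres) if genres else ''


def field_to_genres(field, sep='|'):
    """Split a joined genre string back into a list of genre names."""
    return field.split(sep) if field else []


sample_genres = ['indie pop', 'dream pop']  # made-up values for illustration
joined = genres_to_field(sample_genres)
print(joined)
print(field_to_genres(joined))
```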

In [39]:
import pandas as pd

feature_columns = ['danceability','energy','key','loudness','mode','speechiness','acousticness',
                   'instrumentalness','liveness','valence','tempo']

# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows = []

for track in track_ids:

    try:
        audio_features_response = requests.get(f'https://api.spotify.com/v1/audio-features/{track}', headers=header)
        audio_features_response_json = audio_features_response.json()

        track_response = requests.get(f'https://api.spotify.com/v1/tracks/{track}',headers=header)
        track_response_json = track_response.json()

        # Use the first listed artist to look up the genres.
        artist_id = track_response_json['artists'][0]['id']

        artist_response = requests.get(f'https://api.spotify.com/v1/artists/{artist_id}',headers=header)
        artist_response_json = artist_response.json()
        genres = artist_response_json['genres']
        artist = artist_response_json['name']

        track_features = {
            'track_id':track,
            'genre':genres,
            'artist':artist
        }

        # Copy each audio feature from the response into the row.
        for feature in feature_columns:
            track_features[feature] = audio_features_response_json[feature]

        rows.append(track_features)

    except Exception:
        # Print the IDs of tracks whose data could not be retrieved.
        print(track)

track_features_df = pd.DataFrame(rows, columns=['track_id','genre','artist'] + feature_columns)

track_features_df.to_csv('../CSV_Data/liked_songs.csv')

0P6AWOA4LG1XOctzaVu5tt
2RxbUs1UfebXymMwrAKZVB
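
The loop above sends three requests per track, which can be slow for a large library. A possible refinement, assuming the batch form of the audio features endpoint (`/v1/audio-features?ids=...`, documented as accepting up to 100 comma-separated IDs) is available to the app, is to split the IDs into chunks and build one URL per chunk. The helper names below are hypothetical.

```python
def chunked(ids, size=100):
    """Yield consecutive slices of ids, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def batch_urls(track_ids, size=100):
    """Return one audio-features URL per chunk of track IDs."""
    base = 'https://api.spotify.com/v1/audio-features'
    return [f"{base}?ids={','.join(chunk)}" for chunk in chunked(track_ids, size)]


# Tiny chunk size so that the split is visible in the output.
demo_urls = batch_urls(['a', 'b', 'c'], size=2)
for url in demo_urls:
    print(url)
```

Each URL would then be requested with the same `header` as before; the batch response is expected to hold the per-track results in an `audio_features` list.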
