# Recommender with Spotify API Integration

## 1: Create Spotify Authorization Token

The Spotify API requires authentication, so you need to register your Spotify account with Spotify for Developers before you can run the code in this notebook. Since you can't log in from the notebook itself, you create an OAuth token that validates your credentials and lets the notebook retrieve information from your Spotify account. 

### Steps

1. Go to https://developer.spotify.com/dashboard/ and log in (or sign up if you don't already have an account).  
2. Go to https://developer.spotify.com/console/post-playlists/ and press "Get Token".
3. Select the following checkboxes:  
    a. playlist-modify-public  
    b. playlist-modify-private  
    c. user-read-recently-played  
4. Press "Request Token" and "Agree". Now you should see a field value under **OAuth Token**. Copy this, as we'll need it later.  
5. Next, open Spotify. Open the drop-down next to your name in the top-right corner, and select **Account**.  
6. Copy the **Username** as we'll need it later.  
7. Open a terminal on your system. 
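8. Enter the following two commands one at a time in your terminal, replacing the `<>` placeholders with the values you copied earlier. This lets you validate your Spotify credentials without hard-coding them in this notebook's code cells. The commands below are for Windows. If you're on a Mac, use _export_ instead of _setx_ and put an equals sign between the name and the value, for example `export SPOTIFY_USER_ID=<yourUsername>`. Note that _export_ only applies to the current shell session, so either launch Jupyter from that same terminal or add the lines to your shell profile.  
**setx SPOTIFY_AUTHORIZATION_TOKEN \<yourOAuthToken\>  
setx SPOTIFY_USER_ID \<yourUsername\>**  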

<img src="./terminal.JPG">

Restart your terminal and Jupyter-lab for the changes to take effect. In some cases, restarting the system is required.

**Note: Spotify Access Tokens expire every hour, so you may need to go through the previous steps again if any of the code cells fail.**
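
Because a missing or stale variable only shows up later as a failed request, it can help to check both variables up front. The following is a minimal sketch, not part of the original notebook: `load_credentials` and `REQUIRED_VARS` are hypothetical names, and the function accepts any mapping so it can be tried without touching the real environment.

```python
import os

# The two variable names come from the setup steps above.
REQUIRED_VARS = ("SPOTIFY_AUTHORIZATION_TOKEN", "SPOTIFY_USER_ID")

def load_credentials(env=os.environ):
    """Return (token, user_id), or raise if either variable is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["SPOTIFY_AUTHORIZATION_TOKEN"], env["SPOTIFY_USER_ID"]

# Stand-in mapping for illustration; real use would call load_credentials().
fake_env = {
    "SPOTIFY_AUTHORIZATION_TOKEN": "example-token",
    "SPOTIFY_USER_ID": "example-user",
}
token, user = load_credentials(fake_env)
```

This check cannot tell whether a token has expired. It only confirms that the variables are visible to the Python process.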

In [1]:
#!pip install spotipy
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import json
import os
import requests

# instantiate global variables
auth_token = os.environ.get("SPOTIFY_AUTHORIZATION_TOKEN") 
user_id = os.getenv("SPOTIFY_USER_ID")

# if the lines below print None, please restart your system
#print(auth_token)
#print(user_id)

if auth_token is None:
    print("Authorization Token is None. Please restart the application after setting up the token.")
else:
    print("Found an Authorization Token")

Found an Authorization Token


In [2]:
def getAPIrequest(auth_token, url):
    """
    Function to place GET requests to the Spotify API.
    """
    response = requests.get(
            url,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {auth_token}"
            }
        )
    return response

def postAPIrequest(auth_token, url, data):
    """
    Function to place POST requests to the Spotify API.
    """
    response = requests.post(
           url,
           data=data,
           headers={
               "Content-Type": "application/json",
               "Authorization": f"Bearer {auth_token}"
           }
    )
    return response
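
Both helpers return the raw response, so the caller decides what a failure means. As a rough sketch of how the status code could be interpreted, the hypothetical helper below maps a few common HTTP statuses to hints. The wording of the hints is an assumption. Only the general meaning of 401, 403, and 429 is taken from standard HTTP usage.

```python
def explain_status(status_code):
    """Map an HTTP status code to a short hint (hypothetical helper)."""
    if 200 <= status_code < 300:
        return "ok"
    hints = {
        401: "token missing, invalid, or expired; request a new one",
        403: "request refused; the token may lack a required scope",
        429: "rate limited; wait before retrying",
    }
    # Anything else is reported as-is rather than guessed at.
    return hints.get(status_code, f"unexpected status {status_code}")

# Usage with one of the helpers above would look like:
#   response = getAPIrequest(auth_token, url)
#   print(explain_status(response.status_code))
print(explain_status(401))
```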

## 2: Retrieve User's Recently Played Songs
Now, we move on to retrieving the last _numOfTracks_ songs played by the user. We do this by sending a GET request to the Spotify API.

In [3]:
def getLastPlayedSongs(numOfTracks):
    """
    Function to retrieve the last numOfTracks songs played by the user.
    """
    url = f"https://api.spotify.com/v1/me/player/recently-played?limit={numOfTracks}"
    response = getAPIrequest(auth_token, url)
    response_json = response.json()
    songs = []
    #print(json.dumps(response_json, indent=4))
    try:
        for song in response_json["items"]:
            songs.append(song)
    except KeyError:
        print("Your Spotify Access Token expired.")
        print("Please obtain a new one and try again.")
    return songs
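
To make the shape of the response easier to see, here is a small sketch that pulls the fields used later in this notebook out of a payload. The `sample` dictionary is a hand-written stand-in that contains only the keys the notebook reads. A real response has many more fields, and `summarize_items` is a hypothetical name.

```python
def summarize_items(payload):
    """Return (name, first artist, release year) for each played item."""
    rows = []
    # An error payload has no "items" key, so this yields an empty list.
    for item in payload.get("items", []):
        track = item["track"]
        rows.append((
            track["name"],
            track["artists"][0]["name"],
            track["album"]["release_date"][:4],
        ))
    return rows

# Hand-written stand-in for a response, not real API output.
sample = {
    "items": [
        {
            "track": {
                "name": "Walk of Life",
                "artists": [{"name": "Dire Straits"}],
                "album": {"release_date": "1985"},
            }
        }
    ]
}
print(summarize_items(sample))
```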

In [4]:
num = int(input("How many tracks would you like to visualize? "))
lastPlayed = getLastPlayedSongs(num)
print(f"\nHere are the last {num} tracks you listened to on Spotify:")
for index, track in enumerate(lastPlayed):
    print(f"\n {index+1}: {track['track']['name']}, {track['track']['artists'][0]['name']} ({track['track']['album']['release_date'][:4]})")

How many tracks would you like to visualize?  25



Here are the last 25 tracks you listened to on Spotify:

 1: Walk of Life, Dire Straits (1985)

 2: Sultans of Swing, Dire Straits (1978)

 3: Lady Writer, Dire Straits (1979)

 4: FACE, BROCKHAMPTON (2017)

 5: How to Save a Life - Live in NYC - 2009, The Fray (2009)

 6: Hold My Liquor, Kanye West (2013)

 7: 12:45 - Stripped, Etham (2018)

 8: Bound 2, Kanye West (2013)

 9: Lady Writer, Dire Straits (1979)

 10: Million Reasons, Lady Gaga (2016)

 11: Dancing, Mellow Fellow (2017)

 12: Taro, alt-J (2012)

 13: Come and See Me (feat. Drake), PARTYNEXTDOOR (2016)

 14: Nice For What, Drake (2018)

 15: The Motto, Drake (2011)

 16: HIGHEST IN THE ROOM, Travis Scott (2019)

 17: 90210 (feat. Kacy Hill), Travis Scott (2015)

 18: DNA., Kendrick Lamar (2017)

 19: LOVE. FEAT. ZACARI., Kendrick Lamar (2017)

 20: All The Stars (with SZA), Kendrick Lamar (2018)

 21: 911 / Mr. Lonely (feat. Frank Ocean & Steve Lacy), Tyler, The Creator (2017)

 22: Where'd You Go (feat. Holly Brook & Jo

## 3: Get User's Preferences

Now, we ask the user to specify the songs that they would like as a basis for their recommendations. Enter the list as a space-separated series of indices. For instance, if you want the first, third and fifth songs, enter: 1 3 5

In [5]:
ref_tracks = input("\nEnter a list of up to 5 tracks to be used as seed tracks: ") # enter space-separated track numbers, e.g. 1 3 5
ref_tracks = ref_tracks.split()
seed_tracks = [lastPlayed[int(i)-1] for i in ref_tracks]
# print(seed_tracks)


Enter a list of up to 5 tracks to be used as seed tracks:  4 14 16
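
The cell above trusts the input: a non-numeric entry or an out-of-range number raises an exception, and nothing enforces the limit of 5. A possible validation step is sketched below, assuming the same 1-based numbering as the printed list. `parse_seed_indices` is a hypothetical helper, not part of the notebook.

```python
def parse_seed_indices(raw, n_available, max_seeds=5):
    """Turn '4 14 16' into zero-based indices, rejecting bad input."""
    indices = []
    for token in raw.split():
        if not token.isdecimal():
            raise ValueError(f"Not a track number: {token!r}")
        position = int(token)
        if not 1 <= position <= n_available:
            raise ValueError(f"Track number {position} is outside 1..{n_available}")
        # Skip repeats so the same track is not used as a seed twice.
        if position - 1 not in indices:
            indices.append(position - 1)
    if not indices:
        raise ValueError("Enter at least one track number")
    if len(indices) > max_seeds:
        raise ValueError(f"Enter at most {max_seeds} track numbers")
    return indices

print(parse_seed_indices("4 14 16", n_available=25))
```

The returned indices could then be used directly, as in `[lastPlayed[i] for i in indices]`.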


## 4: Data Preprocessing
We convert the user's choices into model-friendly input whose format matches the dataframe we are using.

In [6]:
def get_song_info(song_list):
    """
    Function to get the name and first-listed artist of seed tracks.
    The artist is stored as the string form of a one-element list,
    which is the format used by the 'artists' column of the dataset.
    """
    seeds = []
    for item in song_list:
        track = item['track']
        song = {'name': track['name'], 'artists': str([track['artists'][0]['name']])}
        seeds.append(song)
    return seeds

get_song_info(seed_tracks)

[{'name': 'FACE', 'artists': "['BROCKHAMPTON']"},
 {'name': 'Nice For What', 'artists': "['Drake']"},
 {'name': 'HIGHEST IN THE ROOM', 'artists': "['Travis Scott']"}]
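
The `artists` values look unusual because they are strings that contain the printed form of a Python list. The sketch below shows the round trip, assuming the dataset column uses exactly this format. `format_artists` and `parse_artists` are hypothetical helper names.

```python
import ast

def format_artists(names):
    """Render artist names the way the dataset stores them: a stringified list."""
    return str(list(names))

def parse_artists(cell):
    """Recover the list of names from a stringified list."""
    return ast.literal_eval(cell)

print(format_artists(["Drake"]))                 # the form used for seed tracks
print(parse_artists("['Bad Bunny', 'Yaviah']"))  # back to a real list
```

One consequence of keeping only the first artist is that a seed track stored in the dataset with several artists will not match on this column.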

## 5: Model Building and Training
We move on to building and training our model. The code to do so exactly matches the code in Recommender.ipynb. 

In [7]:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import plotly.express as px
%matplotlib inline

In [8]:
song_data = pd.read_csv('./data/data.csv')

In [9]:
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
song_cluster_pipeline = Pipeline([('scaler', StandardScaler()), 
                                  # n_jobs was removed from KMeans in scikit-learn 1.0, so it is omitted here
                                  ('kmeans', KMeans(n_clusters=20, 
                                   verbose=2))],verbose=True)
X = song_data.select_dtypes(np.number)
number_cols = list(X.columns)
song_cluster_pipeline.fit(X)
song_cluster_labels = song_cluster_pipeline.predict(X)
song_data['cluster_label'] = song_cluster_labels

[Pipeline] ............ (step 1 of 2) Processing scaler, total=   0.1s




Initialization complete
Iteration 0, inertia 1612513.6994143808
Iteration 1, inertia 1244456.3557672438
Iteration 2, inertia 1206077.1940710181
Iteration 3, inertia 1190908.1017862672
Iteration 4, inertia 1178986.1559294919
Iteration 5, inertia 1168661.2889503606
Iteration 6, inertia 1160049.5823167677
Iteration 7, inertia 1153387.0876455668
Iteration 8, inertia 1146544.6464971923
Iteration 9, inertia 1139896.397420521
Iteration 10, inertia 1136489.5803730509
Iteration 11, inertia 1134317.648424751
Iteration 12, inertia 1132249.0749925573
Iteration 13, inertia 1130243.4982501687
Iteration 14, inertia 1128398.4007926527
Iteration 15, inertia 1126287.7256307367
Iteration 16, inertia 1123320.8835640284
Iteration 17, inertia 1120487.5162967455
Iteration 18, inertia 1119386.0974660388
Iteration 19, inertia 1118866.067231291
Iteration 20, inertia 1118559.5207787186
Iteration 21, inertia 1118363.2203863256
Iteration 22, inertia 1118228.2525210606
Iteration 23, inertia 1118118.911889206
Iterat

## 6: Making Recommendations
We now proceed to make recommendations based on the user's preferences. The algorithm is the same as in Recommender.ipynb. The only changes are in the input and output columns, which make the data suitable for interaction with the Spotify API endpoint. 

In [10]:
from collections import defaultdict
from scipy.spatial.distance import cdist
import difflib
    
def get_song_data(song, song_data):
    
    """
    Gets the song data for a specific song. The song argument takes the form of a dictionary with 
    key-value pairs for the name and artists of the song.
    """
    try:
        song_info = song_data[(song_data['name'] == song['name']) 
                            & (song_data['artists'] == song['artists'])].iloc[0]
        return song_info
    except IndexError:
        return None

def get_mean_vector(song_list, song_data):
    """
    Gets the mean vector for a list of songs.
    """
    song_vectors = []
    for song in song_list:
        song_info = get_song_data(song, song_data)
        if song_info is None:
            print('Warning: {} does not exist in database'.format(song['name']))
            continue
        song_vector = song_info[number_cols].values
        song_vectors.append(song_vector)  
    song_matrix = np.array(list(song_vectors))
    return np.mean(song_matrix, axis=0)

def flatten_dict_list(dict_list):
    """
    Utility function for flattening a list of dictionaries.
    """
    flattened_dict = defaultdict()
    for key in dict_list[0].keys():
        flattened_dict[key] = []
    for dictionary in dict_list:
        for key, value in dictionary.items():
            flattened_dict[key].append(value)
    return flattened_dict

In [11]:
def recommend_songs(song_list, song_data, n_songs=12):
    """
    Recommends songs based on a list of previous songs that a user has listened to.
    """
    metadata_cols = ['name', 'year', 'artists', 'id']
    song_dict = flatten_dict_list(song_list)
    
    song_center = get_mean_vector(song_list, song_data)
    scaler = song_cluster_pipeline.steps[0][1]
    scaled_data = scaler.transform(song_data[number_cols])
    scaled_song_center = scaler.transform(song_center.reshape(1, -1))
    distances = cdist(scaled_song_center, scaled_data, 'cosine')
    index = list(np.argsort(distances)[:, :n_songs][0])
    
    rec_songs = song_data.iloc[index]
    rec_songs = rec_songs[~rec_songs['name'].isin(song_dict['name'])]
    return rec_songs[metadata_cols].to_dict(orient='records')[1:]
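
The core of `recommend_songs` is two steps: average the seed vectors, then rank every song by cosine distance to that average. The sketch below illustrates the idea in plain Python on toy two-dimensional vectors. It leaves out the scaling step, and the function names and data are made up for illustration. It is not the notebook's implementation.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(center, catalogue, n):
    """Names of the n catalogue entries closest to center."""
    ranked = sorted(catalogue, key=lambda name: cosine_distance(center, catalogue[name]))
    return ranked[:n]

# Toy data: two seed tracks and a three-song catalogue.
seeds = [[1.0, 0.0], [0.8, 0.2]]
catalogue = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
center = mean_vector(seeds)
print(nearest(center, catalogue, 2))
```

Cosine distance compares direction rather than magnitude, which is one reason the features are standardized first in the real function.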

In [12]:
recommended = recommend_songs(get_song_info(seed_tracks), song_data)
print(recommended)

[{'name': 'Hear Me Calling', 'year': 2019, 'artists': "['Juice WRLD']", 'id': '13ZyrkCDmRz5xY3seuAWYk'}, {'name': 'Better Off Dead', 'year': 2020, 'artists': "['jxdn']", 'id': '4ih3Y0t86lfK8m8pTgEx4I'}, {'name': 'Bichiyal', 'year': 2020, 'artists': "['Bad Bunny', 'Yaviah']", 'id': '4j4w4CXm6BSr0s25wAWrrX'}, {'name': 'Coco (feat. DaBaby)', 'year': 2020, 'artists': "['24kGoldn', 'DaBaby']", 'id': '2V6pjeOCuPBVlatcWTXUtP'}, {'name': 'hooligan', 'year': 2020, 'artists': "['Baby Keem']", 'id': '02iYJG3KLBJODa5JkQ4O6y'}, {'name': 'No Church In The Wild', 'year': 2011, 'artists': "['JAY-Z', 'Kanye West', 'Frank Ocean', 'The-Dream']", 'id': '3Osd3Yf8K73aj4ySn6LrvK'}, {'name': 'Phases', 'year': 2019, 'artists': "['PRETTYMUCH']", 'id': '3je88Q4OvTqIx7BFRFYvRA'}, {'name': 'Enjoy Yourself (feat. Karol G)', 'year': 2020, 'artists': "['Pop Smoke', 'KAROL G']", 'id': '3NWrHCwvyII4fTx05PN3IO'}, {'name': 'Nobody', 'year': 2019, 'artists': "['Martin Jensen', 'James Arthur']", 'id': '2qfEcCkEo5NscA9GL7ER

## 7: Creating a Playlist
We now create a playlist full of the recommendations we computed.  

First, we create a playlist with a user-specified title by making a POST request to the API.

Then, we match the 'id' of the songs in the _recommended_ list of songs with their respective Spotify URIs through a series of GET requests.  

These URIs are all appended to a JSON list, which is then used in a POST request to the API, to add these songs to the playlist we just created.

In [13]:
playlist_name = input("Enter a playlist name:")
playlist_description = "We hope you enjoy the music we curated for you!"

Enter a playlist name: test


In [14]:
def createPlaylist(name=playlist_name, description=playlist_description, user_id=user_id):
    """
    This function creates a playlist for the user with a specified name and description.
    """
    data = json.dumps({
            "name": name,
            "description": description,
            "public": True
        })
    url = f"https://api.spotify.com/v1/users/{user_id}/playlists"
    response = postAPIrequest(auth_token, url, data)
    response_json = response.json()
    playlist_id = response_json["id"]
    return playlist_id

In [15]:
def searchForTrack(track):
    """
    This function matches a song's id with its URI in Spotify.
    URIs are required to add songs to a playlist.
    """
    url = f"https://api.spotify.com/v1/tracks/{track['id']}"
    response = getAPIrequest(auth_token, url)
    response_json = response.json()
    track_uri = response_json["uri"]
    return track_uri
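
Spotify track URIs follow the pattern `spotify:track:<id>`, so a URI can also be assembled locally from the id, which saves one GET request per track. A minimal sketch, assuming the ids in the dataset are valid Spotify track ids. `track_uri_from_id` is a hypothetical helper.

```python
def track_uri_from_id(track_id):
    """Build a Spotify track URI from a track id without calling the API."""
    if not track_id:
        raise ValueError("track_id must be a non-empty string")
    return f"spotify:track:{track_id}"

# The id below is taken from the recommendations printed earlier.
print(track_uri_from_id("13ZyrkCDmRz5xY3seuAWYk"))
```

Unlike the GET request, this does not confirm that the track still exists on Spotify.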
        

In [16]:
def addSongsToPlaylist(playlist_id, tracks):
    """
    This function finds the URIs of all recommended songs and then adds them to the playlist.
    """
    track_uris = [searchForTrack(track) for track in tracks]
    #print(track_uris)
    data = json.dumps(track_uris)
    url = f"https://api.spotify.com/v1/playlists/{playlist_id}/tracks"
    response = postAPIrequest(auth_token, url, data)
    response_json = response.json()
    return response_json
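
The Spotify endpoint for adding items accepts at most 100 items per request, so a longer recommendation list would need to be split. The sketch below shows one way to do that. `batch_uris` is a hypothetical helper. It wraps each batch in a `uris` field, the form described in the API reference, whereas the cell above sends a bare list.

```python
import json

def batch_uris(track_uris, batch_size=100):
    """Yield JSON request bodies holding at most batch_size URIs each."""
    for start in range(0, len(track_uris), batch_size):
        yield json.dumps({"uris": track_uris[start:start + batch_size]})

# Placeholder URIs for illustration only.
example_uris = [f"spotify:track:{i}" for i in range(250)]
bodies = list(batch_uris(example_uris))
print(len(bodies))  # number of POST requests that would be needed
```

Each body could then be passed to `postAPIrequest(auth_token, url, body)` in turn.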

In [17]:
addSongsToPlaylist(createPlaylist(), recommended)

{'snapshot_id': 'Myw5ODJkYzBkYjkwZTFmMmQzMmNiNGIwMDY1ODY4ODI3YmYwMGI0NjBi'}

**If the previous cell returns a snapshot_id, it means the playlist was created successfully! Please check your Spotify account.**
