# Recommenders

This notebook explores three recommenders:

[Track Based](#track_based)

[Artist Based](#artist_based)

[One-Click](#one_click)


# Log-In
Client ID and Client Secret are application identification codes provided by the [Spotify Developers Program](https://developer.spotify.com/). These codes are removed from my code before pushing to GitHub.

In [1]:
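## NEVER COMMIT CLIENT_ID OR CLIENT_SECRET: READ THEM FROM THE ENVIRONMENT ##
import os

# SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET are the variable names Spotipy itself looks for
CLIENT_ID = os.environ['SPOTIPY_CLIENT_ID']
CLIENT_SECRET = os.environ['SPOTIPY_CLIENT_SECRET']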

In [2]:
# This code is all of the necessary imports and a quick way to access my personal data
# using Spotipy. 
import spotipy
from spotipy import util
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
import pandas as pd
import matplotlib.pyplot as plt
import urllib

USERNAME = 'elw86ve5g5t944wwlef6qyzu3' # Alex Fioto's user id
SCOPE = 'playlist-modify-public user-top-read'
LOCAL_REDIRECT_URI = 'http://127.0.0.1:8080'
REDIRECT_URI = 'https://alexaurusrecs.herokuapp.com/'

# Requesting access token
token = util.prompt_for_user_token(username=USERNAME,
                                   scope=SCOPE,
                                   client_id=CLIENT_ID,
                                   client_secret=CLIENT_SECRET,
                                   redirect_uri=LOCAL_REDIRECT_URI) 
# Instantiating OAuth object
spotify = spotipy.Spotify(auth=token)

<a id='track_based'></a>
# Track Based Recommender

**Overview**: The track-based recommender retrieves base recommendations from Spotify based on the parameters the user supplies. I then request the audio features for each of the recommended tracks. Learn more about the audio features [HERE](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/). 

From there, I pull the user's most-listened-to tracks, along with their audio features.

Lastly, I use scikit-learn's `cosine_similarity` and `pairwise_distances` to compare the audio features of the two sets of tracks and return the recommended tracks that are most similar to the user's top tracks.
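To make the comparison step concrete, here is a minimal pure-Python sketch of cosine similarity on toy feature vectors. The three-feature vectors and their values are made up for illustration, and `cosine_sim` is a hypothetical helper rather than part of the app. scikit-learn's `pairwise_distances(..., metric='cosine')` returns one minus this quantity.

```python
import math

def cosine_sim(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical (danceability, energy, valence) vectors, for illustration only
liked_track = [0.70, 0.52, 0.38]
candidate_1 = [0.67, 0.75, 0.28]
candidate_2 = [0.20, 0.95, 0.90]

sim_1 = cosine_sim(liked_track, candidate_1)
sim_2 = cosine_sim(liked_track, candidate_2)

# The candidate with the higher score points in a direction closer to the liked track
closest = 'candidate_1' if sim_1 > sim_2 else 'candidate_2'
print(closest, round(sim_1, 4), round(sim_2, 4))
```

A score of 1 means the two vectors point in exactly the same direction; lower scores mean the feature profiles diverge.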

In [12]:
# Imports
from sklearn.metrics.pairwise import pairwise_distances, cosine_distances, cosine_similarity
from sklearn.preprocessing import StandardScaler
import pandas as pd
pd.set_option("display.precision", 14)

Below is a Spotify [URI](https://community.spotify.com/t5/Spotify-Answers/What-s-a-Spotify-URI/ta-p/919201). Essentially, it's a code used by Spotify to identify artists, playlists, tracks and more. 
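As a small illustration of the URI format, the sketch below splits a URI into its resource type and ID and rebuilds the matching `open.spotify.com` link. It assumes the common three-part form `spotify:<type>:<id>`; `parse_spotify_uri` and `uri_to_url` are hypothetical helpers, not Spotipy functions.

```python
def parse_spotify_uri(uri):
    """Split 'spotify:<type>:<id>' into (type, id). Assumes the three-part form."""
    scheme, kind, spotify_id = uri.split(':')
    if scheme != 'spotify':
        raise ValueError(f'Not a Spotify URI: {uri}')
    return kind, spotify_id

def uri_to_url(uri):
    """Build the open.spotify.com link for a three-part Spotify URI."""
    kind, spotify_id = parse_spotify_uri(uri)
    return f'https://open.spotify.com/{kind}/{spotify_id}'

kind, spotify_id = parse_spotify_uri('spotify:track:5jQYkYhoOlBW4vJ2l4TCxl')
url = uri_to_url('spotify:artist:4RnBFZRiMLRyZy0AzzTg2C')
print(kind, spotify_id, url)
```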

I will use the track Blockbuster Night, Pt. 1 by Run the Jewels as my test track. Listen to it [here](https://open.spotify.com/album/4Loc7NtCAo9mypHO6kbviD?highlight=spotify:track:5jQYkYhoOlBW4vJ2l4TCxl).

**The scenario is that I LOVE this song. I want to hear more just like it. Let's find some good recommendations!**

In [13]:
# Test track from Run the Jewels.
blockbuster = 'spotify:track:5jQYkYhoOlBW4vJ2l4TCxl' 

In [14]:
# Returns a dictionary of recommended tracks. These are the base recommendations.
recs = spotify.recommendations(seed_tracks = [blockbuster], limit=20)
type(recs)

dict

In [15]:
# Track names and URIs of base recommendations
rec_track_names = [track['name'] for track in recs['tracks']]
uris = [track['uri'] for track in recs['tracks']]
rec_track_names

['Millions',
 'The Space Program',
 'Trouble in Paradise',
 'Blue Suede',
 'Connect Four',
 'Reagan',
 'Put Jewels on It',
 'Father Sister Berzerker',
 'Easy Rider',
 'Kill Jill (feat. Killer Mike & Jeezy)',
 'Acid Raindrops',
 'Rings',
 'Lie, Cheat, Steal',
 'Alphabet Aerobics',
 'Untouchable',
 'Wolf Like Me',
 'Drowning',
 'War Ready',
 'Above The Clouds',
 'Bar Breaker']

At this point in the recommender we come to a crossroads. The user can choose not to use their listening history, in which case we return the URIs of the base recommendations. But where's the fun in that!! 🎺🎶🎵

Below, you can see a dataframe that includes the audio features: danceability, energy, etc. These are the features we are going to compare against the user's top tracks.

In [17]:
# Creating audio features dataframe of recommended tracks
rec_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
rec_df['track_name'] = rec_track_names
rec_df['uri'] = uris
rec_df.head(3)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.667,0.747,4,-4.918,0,0.277,0.0484,0.0,0.108,0.281,136.931,audio_features,7clUOYJb0WsAgFQEgDxSIT,spotify:track:7clUOYJb0WsAgFQEgDxSIT,https://api.spotify.com/v1/tracks/7clUOYJb0WsA...,https://api.spotify.com/v1/audio-analysis/7clU...,309947,4,Millions
1,0.612,0.705,1,-5.898,1,0.352,0.427,0.0,0.658,0.533,101.876,audio_features,203xmWRHAyqwW6AkGkhhVM,spotify:track:203xmWRHAyqwW6AkGkhhVM,https://api.spotify.com/v1/tracks/203xmWRHAyqw...,https://api.spotify.com/v1/audio-analysis/203x...,341040,4,The Space Program
2,0.814,0.851,7,-6.725,1,0.212,0.155,0.00251,0.232,0.245,110.013,audio_features,4AHo9T0MVpNTPybmv0ITfJ,spotify:track:4AHo9T0MVpNTPybmv0ITfJ,https://api.spotify.com/v1/tracks/4AHo9T0MVpNT...,https://api.spotify.com/v1/audio-analysis/4AHo...,126945,4,Trouble in Paradise


In [18]:
# Fetching a dictionary of user's top tracks
user_tracks = spotify.current_user_top_tracks(limit=15, time_range='medium_term')
type(user_tracks)

dict

In [19]:
# Saving lists of track names and URIs of user top tracks
user_track_names = [track['name'] for track in user_tracks['items']]
uris = [track['uri'] for track in user_tracks['items']]
user_track_names # How embarrassing 😳

['Blockbuster Night, Pt. 1',
 'When I Grow Up',
 'The Dark',
 'No Excuses',
 'Nate',
 'Returns',
 'PAID MY DUES',
 'The Search',
 'Leave Me Alone',
 'Blood // Water',
 'The Visitor',
 'Let Me Go',
 'Venom - Music From The Motion Picture',
 'Only',
 'Talk to Me']

In [20]:
# Creating audio features dataframe of user top_tracks
user_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
user_df['track_name'] = user_track_names
user_df['uri'] = uris
user_df.head(1)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.701,0.525,1,-7.938,0,0.382,0.0353,2.55e-06,0.0615,0.385,75.693,audio_features,5jQYkYhoOlBW4vJ2l4TCxl,spotify:track:5jQYkYhoOlBW4vJ2l4TCxl,https://api.spotify.com/v1/tracks/5jQYkYhoOlBW...,https://api.spotify.com/v1/audio-analysis/5jQY...,152253,4,"Blockbuster Night, Pt. 1"


We are slicing our data frames to only include numeric columns to be compared.

In [23]:
# Setting new dataframes of only numeric features to be compared
rec_df_numeric = rec_df.loc[:, rec_df.columns[:11]]
user_df_numeric = user_df.loc[:, user_df.columns[:11]]
rec_df_numeric.head(1)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
0,0.667,0.747,4,-4.918,0,0.277,0.0484,0.0,0.108,0.281,136.931


Here is another crossroads for the user. At this point a user may specify what exactly they like about the song. If they only like the "beat", we may only include danceability, energy and tempo. If they like the message behind the song, we may only include features such as valence, mode and speechiness. 

Unfortunately, this user choice is not implemented in the app yet due to my lack of front-end web development skills. Coming soon! 
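One way this choice could eventually be wired in is sketched below. The preset names and the `cols_to_drop` helper are hypothetical and not part of the app; the feature groupings come straight from the two examples above.

```python
# The eleven numeric audio features compared in this notebook
ALL_FEATURES = ['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness',
                'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo']

# Hypothetical presets, taken from the "beat" and "message" examples in the text
PRESETS = {
    'beat': ['danceability', 'energy', 'tempo'],
    'message': ['valence', 'mode', 'speechiness'],
}

def cols_to_drop(preference):
    """Return every feature that is NOT part of the chosen preset."""
    keep = PRESETS[preference]
    return [feature for feature in ALL_FEATURES if feature not in keep]

beat_drop_cols = cols_to_drop('beat')
print(beat_drop_cols)
```

The resulting list has the shape expected by the `drop_cols` parameter of the `recommend_songs` function defined later in this notebook.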

In [25]:
# Comparing the two dataframes using pairwise_distances and setting metric to cosine
comps = pairwise_distances(rec_df_numeric, user_df_numeric, metric='cosine')

# Creating dataframe of these distances and setting the columns to the user_tracks and index to recommendation tracks
comps_df = pd.DataFrame(comps, columns=user_track_names, index=rec_track_names)

# pairwise_distances calculates cosine distance, but we want similarity.
comps_df = 1 - comps_df

# Dropping any duplicate tracks
comps_df.drop_duplicates(inplace=True)

comps_df.head(3)

Unnamed: 0,"Blockbuster Night, Pt. 1",When I Grow Up,The Dark,No Excuses,Nate,Returns,PAID MY DUES,The Search,Leave Me Alone,Blood // Water,The Visitor,Let Me Go,Venom - Music From The Motion Picture,Only,Talk to Me
Millions,0.99750199236698,0.99988936622154,0.99928910934323,0.99969668028341,0.99490343489182,0.99898916838312,0.99983087569199,0.99986462571067,0.99963036013202,0.9999483253942,0.99777219905774,0.99954564228239,0.99953972564343,0.99935245754999,0.99947921314963
The Space Program,0.99882923579216,0.99971043624963,0.99948685227837,0.99956354001272,0.99281667873934,0.99834751832921,0.99953952809482,0.99979926361571,0.99935042684243,0.99927006893044,0.99898588436804,0.99906472091026,0.99971712573011,0.99813693400984,0.99895211104388
Trouble in Paradise,0.99774200287826,0.99866710443145,0.99942652038497,0.99786476290929,0.99770140663709,0.99991324570446,0.99823742792039,0.99867898557355,0.9976224357366,0.99934551862418,0.99815421778617,0.9998757155409,0.99769662672427,0.99961334431696,0.99992206406963


Now we have a cosine similarity score for each pair of base recommendation and user top track. This model assumes that you like the songs you listen to the most. 

Next, for each user track, we will find the highest similarity score and add that to our list of recommendations.

In [26]:
song_list = []

for user_track in comps_df.columns:
    max_cosine = comps_df[user_track].max()
    song_name = comps_df[comps_df[user_track] == max_cosine].index[0]
    song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
    #print(f'{user_track}------ closest to ---------> {song_name}')
    song_list.append((song_name, song_uri))
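The loop above finds each winner by track name, which could pick the wrong URI if two recommendations happened to share a name. A possible variant, sketched here in plain Python, keys the scores by URI instead and removes repeats while keeping order. The scores are rounded from the first three rows of the similarity table shown earlier, so this toy result will not match the full 20-track output; `best_match_per_user_track` is a hypothetical helper.

```python
MILLIONS = 'spotify:track:7clUOYJb0WsAgFQEgDxSIT'
SPACE_PROGRAM = 'spotify:track:203xmWRHAyqwW6AkGkhhVM'
TROUBLE = 'spotify:track:4AHo9T0MVpNTPybmv0ITfJ'

# {user track: {recommended URI: cosine similarity}}, rounded from the table above
scores = {
    'Blockbuster Night, Pt. 1': {MILLIONS: 0.99750, SPACE_PROGRAM: 0.99883, TROUBLE: 0.99774},
    'When I Grow Up': {MILLIONS: 0.99989, SPACE_PROGRAM: 0.99971, TROUBLE: 0.99867},
    'No Excuses': {MILLIONS: 0.99970, SPACE_PROGRAM: 0.99956, TROUBLE: 0.99786},
    'Returns': {MILLIONS: 0.99899, SPACE_PROGRAM: 0.99835, TROUBLE: 0.99991},
}

def best_match_per_user_track(scores):
    """Pick the highest-scoring URI for each user track, then drop repeats in order."""
    picks = [max(recs, key=recs.get) for recs in scores.values()]
    # dict.fromkeys keeps the first occurrence of each URI and preserves order
    return list(dict.fromkeys(picks))

unique_uris = best_match_per_user_track(scores)
print(unique_uris)
```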

In [27]:
# Keeps unique URIs in their original order. Drops songs if they are duplicates.
uri_list = list(dict.fromkeys(uri for _, uri in song_list))

In [30]:
# List of songs that have the highest scores to my top tracks and their associated URIs
song_list

[('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Bar Breaker', 'spotify:track:6LyzqxbVyIUoDtr4dYshIl'),
 ('Connect Four', 'spotify:track:7uer04FLTB7mt2ImHP6XjE'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Put Jewels on It', 'spotify:track:01zffoSenvp9JnSbl0UgMa'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Father Sister Berzerker', 'spotify:track:3ROmYjCVAJeoqi5ozYFi6J'),
 ('Connect Four', 'spotify:track:7uer04FLTB7mt2ImHP6XjE')]

At this point, you can copy and paste the URIs into a blank Spotify playlist. My app will create the playlist for the user so that step is eliminated. Please see "Create Playlist" function in app_utilities.ipynb for more information.

Below is code to perform the same calculations as above, but this time it includes a standard scaler. I hypothesized that the scaled data would give better recommendations, but I have found that not to be the case. I may explore this more in the future.
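To see why scaling changes the scores at all, here is a rough pure-Python sketch, not part of the app. It uses only three features (danceability, energy, tempo) for the three recommended tracks and the one user track displayed earlier, so the numbers will not match the notebook's 11-feature outputs. Like the cell below, it fits the mean and population standard deviation on the recommendations and applies them to the user track. `cosine_sim` and `scale` are hypothetical helpers.

```python
import math
from statistics import mean, pstdev

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# (danceability, energy, tempo) copied from the dataframes shown earlier
recs = {
    'Millions': [0.667, 0.747, 136.931],
    'The Space Program': [0.612, 0.705, 101.876],
    'Trouble in Paradise': [0.814, 0.851, 110.013],
}
user_track = [0.701, 0.525, 75.693]  # 'Blockbuster Night, Pt. 1'

# Fit on the recommendations only, mirroring fit_transform(rec) then transform(user)
columns = list(zip(*recs.values()))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]

def scale(vec):
    """Standardize a vector with the means and deviations fitted above."""
    return [(x - m) / s for x, m, s in zip(vec, means, stds)]

raw = {name: cosine_sim(vec, user_track) for name, vec in recs.items()}
scaled = {name: cosine_sim(scale(vec), scale(user_track)) for name, vec in recs.items()}

for name in recs:
    print(f'{name}: raw={raw[name]:.5f} scaled={scaled[name]:.5f}')
```

On raw values, tempo is orders of magnitude larger than the 0-to-1 features, so every pair looks almost identical. After scaling, the scores spread out and can even turn negative. Whether that spread yields better recommendations is a separate question, as noted above.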

In [33]:
# Scaled data
ss = StandardScaler()
rec_df_scaled = pd.DataFrame(ss.fit_transform(rec_df_numeric.values), columns=rec_df_numeric.columns, index=rec_df_numeric.index)
user_df_scaled = pd.DataFrame(ss.transform(user_df_numeric.values), columns=user_df_numeric.columns, index=user_df_numeric.index)
comps_scaled = pairwise_distances(rec_df_scaled, user_df_scaled, metric='cosine')
comps_df_scaled = pd.DataFrame(comps_scaled, columns=user_track_names, index=rec_track_names)
comps_df_scaled = 1 - comps_df_scaled
comps_df_scaled.drop_duplicates(inplace=True)
song_list_scaled = []

for user_track in comps_df_scaled.columns:
    max_cosine = comps_df_scaled[user_track].max()
    song_name = comps_df_scaled[comps_df_scaled[user_track] == max_cosine].index[0]
    song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
    #print(f'{user_track}------ closest to ---------> {song_name}')
    song_list_scaled.append((song_name, song_uri))
    
uri_list_scaled = []
[uri_list_scaled.append(track[1]) for track in song_list_scaled if track[1] not in uri_list_scaled]
song_list_scaled

[('Drowning', 'spotify:track:70bx1d7tQbVm768tZKbRhn'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('War Ready', 'spotify:track:2HwqshjCe5Tuvq2s6Fx3K5'),
 ('Drowning', 'spotify:track:70bx1d7tQbVm768tZKbRhn'),
 ('Untouchable', 'spotify:track:4AHZRMJCpscmxygCNtC2Qq'),
 ('Bar Breaker', 'spotify:track:6LyzqxbVyIUoDtr4dYshIl'),
 ('Millions', 'spotify:track:7clUOYJb0WsAgFQEgDxSIT'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Above The Clouds', 'spotify:track:3ZBSXNYdTZVaBUQI3E2rF6'),
 ('Millions', 'spotify:track:7clUOYJb0WsAgFQEgDxSIT'),
 ('War Ready', 'spotify:track:2HwqshjCe5Tuvq2s6Fx3K5'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG')]

Below is a function that includes all of the above code with additional parameters for the base recommendations.

In [4]:
def recommend_songs(spotify, artists=None, genres=None, tracks=None, limit=100, n_tracks=10, listener_based=True, time_range='medium_term', drop_cols=None):
    '''
    This function will recommend songs based on a seed artist, seed genres and/or seed tracks.
    The searched songs will be compared to user top tracks and return only those songs with the 
    highest cosine similarity.
   
    Parameters:
    
    spotify: OAuth user instance
    
    artists: list of Spotify artist URIs
    
    genres: list of seed genres
    
    tracks: list of Spotify track URIs
    
    limit: number of basic Spotify recommendations for comparison
    
    n_tracks: number of tracks to pull from user listening history
    
    listener_based: boolean, if True, function will pull user data and compare to base recommendations. If False, function will return n_tracks of base recommendations
    
    time_range: the listening history time range to pull from user listening data
    
    drop_cols: columns to drop from numeric feature comparison. 
    Choose from:    danceability
                    energy
                    key
                    loudness
                    mode
                    speechiness
                    acousticness
                    instrumentalness
                    liveness
                    valence
                    tempo
    
    '''
    
    # Fetching Spotify recommendations 
    recs = spotify.recommendations(seed_artists=artists, seed_genres=genres, seed_tracks=tracks, limit=limit)
    
    # Saving lists of track names and URIs of recommended tracks
    rec_track_names = [track['name'] for track in recs['tracks']]
    uris = [track['uri'] for track in recs['tracks']]
    
    # Creating audio features dataframe of recommended tracks
    rec_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
    rec_df['track_name'] = rec_track_names
    rec_df['uri'] = uris
    
    # If not listener based, returning list of URIs for Spotify recommended songs
    if not listener_based:
        return list(rec_df[:n_tracks]['uri'])
    
    else: # If listener based
        
        # Fetching user top tracks
        user_tracks = spotify.current_user_top_tracks(limit=n_tracks, time_range=time_range)
        
        # Saving lists of track names and URIs of user top tracks
        user_track_names = [track['name'] for track in user_tracks['items']]
        uris = [track['uri'] for track in user_tracks['items']]
        
        # Creating audio features dataframe of user top_tracks
        user_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
        user_df['track_name'] = user_track_names
        user_df['uri'] = uris
        
        # Setting new dataframes of only numeric features to be compared
        rec_df_numeric = rec_df.loc[:, rec_df.columns[:11]]
        user_df_numeric = user_df.loc[:, user_df.columns[:11]]
        
        # If drop_cols, dropping appropriate columns from each numeric dataframe
        # (reassigning instead of inplace=True avoids modifying a slice of the original)
        if drop_cols:
            rec_df_numeric = rec_df_numeric.drop(columns=drop_cols)
            user_df_numeric = user_df_numeric.drop(columns=drop_cols)
              
        
        comps = pairwise_distances(rec_df_numeric, user_df_numeric, metric='cosine')
        comps_df = pd.DataFrame(comps, columns=user_track_names, index=rec_track_names)
        comps_df = 1 - comps_df
        comps_df.drop_duplicates(inplace=True)
        
        song_list = []
    
        for user_track in comps_df.columns:
            max_cosine = comps_df[user_track].max()
            song_name = comps_df[comps_df[user_track] == max_cosine].index[0]
            song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
            #print(f'{user_track}------ closest to ---------> {song_name}')
            song_list.append((song_name, song_uri))

        uri_list = []
        [uri_list.append(track[1]) for track in song_list if track[1] not in uri_list]
    
    
        return uri_list

<a id="artist_based"></a>
# Artist Based Recommender

In [55]:
# Test Artists URIs
logslaught = 'spotify:artist:1I471vwcRhqQl6QonGZlen'
grandson = 'spotify:artist:4ZgQDCtRqZlhLswVS6MHN4'
run_the_jewels = 'spotify:artist:4RnBFZRiMLRyZy0AzzTg2C'
hatebreed = 'spotify:artist:17Mb968quDHpjCkIyq30QV'
jxdn = 'spotify:artist:6Y64EaNqpqcZYTgs4c76gF'

In [55]:
# `sp` is the name used for the authenticated client throughout this section
sp = spotify
albums = sp.artist_albums(artist_id=run_the_jewels, limit=50)
len(albums['items'])

In [57]:
def get_artist_tracks(sp, artists, n_albums=50):
    '''
    Function takes in the Spotify URI of one or more artists and returns a Pandas dataframe with Spotify's proprietary audio features.
    
    Parameters:
    sp: Spotipy user authentication object. 
    artists: list of one or more artist URIs
    n_albums: number of albums to fetch per artist. 1-50 acceptable range
    
    
    '''
    # Empty list to hold the track uris, album uris and dataframes to concatenate
    uris = []
    album_uris = []
    df_list = []
    
    # Checking if user input artists as a list. 
    if not isinstance(artists, list):
        # Rectifying if user did not input list
        artists = [artists]
    
    # Iterate through artists
    for artist in artists:
        # Fetches the albums JSONs
        albums = sp.artist_albums(artist, limit=n_albums)
        # Iterating through the albums and appending the album URIs
        for album in albums['items']:
            album_uris.append(album['uri'])
    
    # Iterate through the album URIs
    for uri in album_uris:
        # Fetching album JSON
        album = sp.album(uri)
        # Fetching album name
        album_name = album['name']
        # Fetching all artists on album
        album_artist = ', '.join([artist['name'] for artist in album['artists']])
        
        # Fetching all of the track URIs on the album
        tracks = sp.album_tracks(uri)
        # Listing track URIs and names
        track_uris = [track['uri'] for track in tracks['items']]
        track_names = [track['name'] for track in tracks['items']]
        
        # Fetching audio features from the track URIs and assigning track name, album name and album artist
        audio_features_df = pd.DataFrame.from_dict(sp.audio_features(track_uris))
        audio_features_df['track_name'] = track_names
        audio_features_df['album_name'] = album_name
        audio_features_df['artist_name'] = album_artist
        
        # Appending dataframe to list to concatenate
        df_list.append(audio_features_df)
    
    # Concatenating dataframes, dropping duplicates and resetting index
    df = pd.concat(df_list)
    # drop_duplicates returns a new dataframe, so the result must be assigned
    df = df.drop_duplicates().reset_index(drop=True)
    return df
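`get_artist_tracks` makes one `audio_features` request per album. Assuming the documented cap of 100 IDs per audio-features request, a possible variant would collect all track URIs first and request features in batches. The sketch below shows only the batching step; `chunked` is a hypothetical helper and the URIs are placeholders, not real tracks.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Placeholder URIs standing in for track URIs gathered across several albums
track_uris = [f'spotify:track:placeholder{i}' for i in range(250)]

batches = list(chunked(track_uris))
print([len(batch) for batch in batches])

# With a real client, each batch could then be passed to sp.audio_features(batch)
# and the resulting lists concatenated before building the dataframe.
```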

In [58]:
art = get_artist_tracks(sp, 'spotify:artist:6Y64EaNqpqcZYTgs4c76gF')

In [59]:
art

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name,album_name,artist_name
0,0.684,0.672,10,-5.762,0,0.0341,0.15100,0.000000,0.1160,0.381,...,audio_features,4ih3Y0t86lfK8m8pTgEx4I,spotify:track:4ih3Y0t86lfK8m8pTgEx4I,https://api.spotify.com/v1/tracks/4ih3Y0t86lfK...,https://api.spotify.com/v1/audio-analysis/4ih3...,155674,4,Better Off Dead,Better Off Dead,jxdn
1,0.501,0.633,5,-6.560,1,0.0466,0.00358,0.000000,0.3510,0.463,...,audio_features,4C7dLHxwVRA0cYNzqbA7aP,spotify:track:4C7dLHxwVRA0cYNzqbA7aP,https://api.spotify.com/v1/tracks/4C7dLHxwVRA0...,https://api.spotify.com/v1/audio-analysis/4C7d...,135573,4,Tonight (feat. iann dior),Better Off Dead,jxdn
2,0.552,0.333,0,-7.836,1,0.0270,0.08680,0.000000,0.5570,0.274,...,audio_features,1JMaSFk9pS772cWad0gp4F,spotify:track:1JMaSFk9pS772cWad0gp4F,https://api.spotify.com/v1/tracks/1JMaSFk9pS77...,https://api.spotify.com/v1/audio-analysis/1JMa...,119371,4,Pray,Better Off Dead,jxdn
3,0.598,0.662,4,-5.797,1,0.0480,0.00301,0.000167,0.0812,0.531,...,audio_features,0nAsW4plPTFBHoQkU1z6CA,spotify:track:0nAsW4plPTFBHoQkU1z6CA,https://api.spotify.com/v1/tracks/0nAsW4plPTFB...,https://api.spotify.com/v1/audio-analysis/0nAs...,141207,4,So What!,Better Off Dead,jxdn
4,0.554,0.800,8,-3.936,1,0.0395,0.11900,0.000000,0.6290,0.463,...,audio_features,3himokjS8OCS9Qfgzn1JAH,spotify:track:3himokjS8OCS9Qfgzn1JAH,https://api.spotify.com/v1/tracks/3himokjS8OCS...,https://api.spotify.com/v1/audio-analysis/3him...,160120,4,Angels & Demons,Better Off Dead,jxdn
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
344,0.741,0.821,9,-5.186,1,0.0507,0.18200,0.000000,0.2630,0.161,...,audio_features,48pbmUwTtlSDpi3sRd3EzR,spotify:track:48pbmUwTtlSDpi3sRd3EzR,https://api.spotify.com/v1/tracks/48pbmUwTtlSD...,https://api.spotify.com/v1/audio-analysis/48pb...,182693,4,Replay (feat. Flo Rida),UK Hits,Various Artists
345,0.598,0.662,4,-5.797,1,0.0480,0.00301,0.000167,0.0812,0.531,...,audio_features,5LX9kAJoI2KarxvMZHFQtl,spotify:track:5LX9kAJoI2KarxvMZHFQtl,https://api.spotify.com/v1/tracks/5LX9kAJoI2Ka...,https://api.spotify.com/v1/audio-analysis/5LX9...,141207,4,So What!,UK Hits,Various Artists
346,0.920,0.610,9,-6.615,1,0.2910,0.00558,0.000000,0.0938,0.424,...,audio_features,5dqqWVQmrKDeNDd1a0yjxh,spotify:track:5dqqWVQmrKDeNDd1a0yjxh,https://api.spotify.com/v1/tracks/5dqqWVQmrKDe...,https://api.spotify.com/v1/audio-analysis/5dqq...,135493,4,Cool Off,UK Hits,Various Artists
347,0.692,0.653,5,-5.715,1,0.0392,0.49000,0.006560,0.4550,0.901,...,audio_features,4CkweyEaChnaHSNLaUFmSO,spotify:track:4CkweyEaChnaHSNLaUFmSO,https://api.spotify.com/v1/tracks/4CkweyEaChna...,https://api.spotify.com/v1/audio-analysis/4Ckw...,192267,4,Right Back Where We Started From,UK Hits,Various Artists


In [60]:
def get_user_tracks(sp, limit=10, time_range='medium_term'):
    '''
    Fetches a Spotify user's top tracks and returns a dataframe of 
    Spotify audio features
    
    Parameters:
    sp: Spotipy user authentication object  
    limit: number of user tracks to fetch
    time_range: choose from 'short_term', 'medium_term' and 'long_term'
    '''
    
    # Fetching user's top tracks
    user_tracks = sp.current_user_top_tracks(limit=limit, time_range=time_range)
    # Fetching audio features from user's top tracks
    audio_features =  sp.audio_features([item['uri'] for item in user_tracks['items']])
    # Listing names of tracks
    names = [item['name'] for item in user_tracks['items']]
    # Creating dataframe of audio features
    df = pd.DataFrame.from_dict(audio_features)
    # Setting track names
    df['track_name'] = names
    return df

In [61]:
def user_tracks_dataframe(spotify, limit=10):
    '''
    This function takes in a Spotify user authentication object and a limit of songs.
    Returns a pandas dataframe with the user's top tracks (names and artists) 
    within the short-, medium- and long-term time ranges.

    '''
    terms = ['short_term', 'medium_term', 'long_term']
    df_list = []
    for term in terms:
        tracks = spotify.current_user_top_tracks(limit=limit, time_range=term)
        track_names = [track['name'] for track in tracks['items']]
        artist_list = []
        for track in tracks['items']:
            artists = ', '.join([artist['name'] for artist in track['artists']])
            artist_list.append(artists)
        df = pd.DataFrame(list(zip(track_names, artist_list)), 
                          columns = ['track_name', 'artist_name'])
        df['time_range'] = term
        df_list.append(df)
    df = pd.concat(df_list)
    df.reset_index(inplace=True, drop=True)
    return df
        

In [62]:
def get_playlist_embeddings(spotify):
    '''
    Takes in user authentication object and returns HTML embedding codes for user playlists
    '''
    playlists = spotify.current_user_playlists()
    playlist_ids = [playlist['id'] for playlist in playlists['items']]
    embed_code = '<iframe src="https://open.spotify.com/embed/playlist/####" width="300" height="380" frameborder="0" allowtransparency="true" allow="encrypted-media"></iframe>'
    codes = ' '.join([embed_code.replace('####', playlist_id) for playlist_id in playlist_ids])
    return codes

In [64]:
from sklearn.metrics.pairwise import pairwise_distances, cosine_distances, cosine_similarity
pd.set_option("display.precision", 14)

In [69]:
def compare_songs(artist_df, user_df):
    '''
    Takes in a dataframe of artist tracks and audio features, and a dataframe of user top tracks. 
    Returns the URIs of the tracks in artist_df that are most aligned with the user tracks.
    '''
    
    band_track_names = artist_df['track_name']
    user_track_names = user_df['track_name']
    
    df_numeric = artist_df.loc[:, artist_df.columns[:11]]
    user_df_numeric = user_df.loc[:, user_df.columns[:11]]
    
    recs = pairwise_distances(df_numeric, user_df_numeric, metric='cosine')
    rec_df = pd.DataFrame(recs, columns=user_track_names, index=band_track_names)    
    rec_df = 1 - rec_df
    rec_df.drop_duplicates(inplace=True)
    
    song_list = []
    
    for user_track in rec_df.columns:
        max_cosine = rec_df[user_track].max()
        song_name = rec_df[rec_df[user_track] == max_cosine].index[0]
        song_uri = list(artist_df.loc[artist_df['track_name'] == song_name, 'uri'])[0]
        #print(f'{user_track}------ closest to ---------> {song_name}')
        song_list.append((song_name, song_uri))
        
    uri_list = []
    [uri_list.append(track[1]) for track in song_list if track[1] not in uri_list]
    
    return uri_list

# Test out Functions!
Functions under test: `get_artist_tracks`, `get_user_tracks` and `compare_songs`.

In [75]:
glass_animals = 'spotify:artist:4yvcSjfu4PC0CYQyLy4wSq'
ewf = 'spotify:artist:4QQgXkCYTt3BlENzhyNETg'

In [80]:
artist_tracks = get_artist_tracks(sp, artists = ewf )
user_tracks = get_user_tracks(sp, limit=12, time_range='long_term')

In [81]:
artist_tracks.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name,album_name,artist_name
0,0.399,0.336,6,-15.191,0,0.116,0.369,8.95e-05,0.136,0.0579,...,audio_features,52IRsLlMfFeLb0lIyRMV4Y,spotify:track:52IRsLlMfFeLb0lIyRMV4Y,https://api.spotify.com/v1/tracks/52IRsLlMfFeL...,https://api.spotify.com/v1/audio-analysis/52IR...,93413,4,Intro : Grandmix - Earth Wind & Fire And Friends,Grandmix Earth Wind & Fire (mixed by Ben Liebr...,"Earth, Wind & Fire, Friends"
1,0.644,0.813,4,-8.584,0,0.0558,0.242,0.00117,0.63,0.774,...,audio_features,3C4G2ZtQBwqLHyS3yfAmYM,spotify:track:3C4G2ZtQBwqLHyS3yfAmYM,https://api.spotify.com/v1/tracks/3C4G2ZtQBwqL...,https://api.spotify.com/v1/audio-analysis/3C4G...,104693,4,Fantasy,Grandmix Earth Wind & Fire (mixed by Ben Liebr...,"Earth, Wind & Fire, Friends"
2,0.493,0.902,4,-7.602,0,0.0817,0.282,5.68e-06,0.115,0.571,...,audio_features,2GuvbIb0MqNbEVlm9WjUVr,spotify:track:2GuvbIb0MqNbEVlm9WjUVr,https://api.spotify.com/v1/tracks/2GuvbIb0MqNb...,https://api.spotify.com/v1/audio-analysis/2Guv...,96427,4,Sun Goddess,Grandmix Earth Wind & Fire (mixed by Ben Liebr...,"Earth, Wind & Fire, Friends"
3,0.795,0.766,1,-8.047,0,0.0522,0.259,0.000178,0.141,0.927,...,audio_features,2y5BrHSV5E0c2cEIBWyD83,spotify:track:2y5BrHSV5E0c2cEIBWyD83,https://api.spotify.com/v1/tracks/2y5BrHSV5E0c...,https://api.spotify.com/v1/audio-analysis/2y5B...,104053,4,On Your Face,Grandmix Earth Wind & Fire (mixed by Ben Liebr...,"Earth, Wind & Fire, Friends"
4,0.78,0.579,2,-6.96,0,0.0228,0.0404,3.01e-05,0.316,0.872,...,audio_features,2qOrEjVUu1Ww8LPvXd1OXA,spotify:track:2qOrEjVUu1Ww8LPvXd1OXA,https://api.spotify.com/v1/tracks/2qOrEjVUu1Ww...,https://api.spotify.com/v1/audio-analysis/2qOr...,97747,4,Wonderland,Grandmix Earth Wind & Fire (mixed by Ben Liebr...,"Earth, Wind & Fire, Friends"


In [89]:
song_list = compare_songs(artist_tracks, user_tracks)

In [454]:
tracks = get_artist_tracks(sp, artists='spotify:artist:3bcLBxvaI7GsBzGp3WHnwQ')
user_tracks = get_user_tracks(sp)

In [453]:
len(tracks)

499

In [456]:
len(user_tracks)

10

In [459]:
compare_songs(tracks, user_tracks)

['spotify:track:2IF8sqTWsKptMllsYglkeK',
 'spotify:track:0POBjbE1ovNWInRDVQbREC',
 'spotify:track:7KsGeiA9C4x63MCJcdpL12',
 'spotify:track:1jIluyycudIsv7OrURzBBU',
 'spotify:track:0BhFrPI9q21LadxoIgT2zf',
 'spotify:track:76g5npBAaPI8RDQETVehR7',
 'spotify:track:0OkNhfAJPFb54zavGW0RAN',
 'spotify:track:4e2rhIyiyAeWuU23k0kMFC',
 'spotify:track:7DFTzN8Tg1U6zIYn8rgL9y']

<a id="one_click"></a>
# One Click Recommender

In [90]:
#CLIENT_ID=*SEE NOTES*
#CLIENT_SECRET=*SEE NOTES*


In [93]:
import spotipy
from spotipy import util
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
import pandas as pd
import matplotlib.pyplot as plt
import urllib

## REMOVE CLIENT_ID AND CLIENT_SECRET BEFORE PUSHING ##

#CLIENT_ID
#CLIENT_SECRET
USERNAME = 'elw86ve5g5t944wwlef6qyzu3' # Alex Fioto's user id
SCOPE = 'playlist-modify-public user-top-read'
LOCAL_REDIRECT_URI = 'http://127.0.0.1:8080'
REDIRECT_URI = 'https://fioto-spotify-flask.herokuapp.com/'
scope='user-read-currently-playing playlist-modify-public user-top-read playlist-read-private user-read-playback-state'

# Requesting access token
token = util.prompt_for_user_token(username=USERNAME,
                                   scope=scope,
                                   client_id=CLIENT_ID,
                                   client_secret=CLIENT_SECRET,
                                   redirect_uri=LOCAL_REDIRECT_URI) 
# Instantiating OAuth object
spotify = spotipy.Spotify(auth=token)

In [82]:
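def one_click_rec(spotify):
    '''
    Returns (rec_uris, seed_uris): Spotify recommendations seeded with the user's top tracks,
    falling back to the first featured playlist if the top-tracks request fails.
    '''
    try:
        # NOTE: this limit is far above the API maximum of 50, so the request fails with
        # HTTP 400 and the fallback below runs (see the output of the test cell further down).
        uris = [uri['uri'] for uri in spotify.current_user_top_tracks(limit=55352345435234523453)['items']]
    except spotipy.SpotifyException:
        # Fallback: seed with the first five tracks of the first featured playlist
        playlist_id = spotify.featured_playlists(limit=1)['playlists']['items'][0]['id']
        playlist = spotify.playlist(playlist_id, additional_types=('track', ))
        uris = [uri['track']['uri'] for uri in playlist['tracks']['items'][:5]]
    recs = spotify.recommendations(seed_artists=None, seed_genres=None, seed_tracks=uris, limit=20)
    rec_uris = [uri['uri'] for uri in recs['tracks']]
    return rec_uris, uris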

In [45]:
def make_playlist(spotify, playlist_uris, playlist_name):
    user = spotify.current_user()['id']
    playlist = spotify.user_playlist_create(user=user, name=playlist_name, public=True)
    playlist_id = playlist['id']
    spotify.user_playlist_add_tracks(user=user,
                                     playlist_id=playlist_id,
                                     tracks=playlist_uris)

In [46]:
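# one_click_rec returns (rec_uris, seed_uris); only the recommendations go into the playlist
# rec_uris, seed_uris = one_click_rec(spotify)
# make_playlist(spotify, rec_uris, 'Alex\'s One Click Rec')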

In [75]:
playlist_id = spotify.featured_playlists(limit=1)['playlists']['items'][0]['id']
playlist = spotify.playlist(playlist_id, additional_types=('track', ))
uris = [uri['track']['uri'] for uri in playlist['tracks']['items'][:5]]
uris

['spotify:track:4R59wt5nnhYo88PIu3cUIt',
 'spotify:track:1MKT2Vf6qowCN48KlH0aCA',
 'spotify:track:1fYouLdK3PkOMPnx4CPxTY',
 'spotify:track:0NeJjNlprGfZpeX2LQuN6c',
 'spotify:track:2f6pqUyFcs3NUSoz49H9nw']

In [83]:
one_click_rec(spotify)

HTTP Error for GET to https://api.spotify.com/v1/me/top/tracks returned 400 due to invalid request


(['spotify:track:7hyFz1ms1XEHbE2KqUbUQ8',
  'spotify:track:3Rl26h1HiMCV0HFHHVb2IM',
  'spotify:track:5RRNZFyOi17nTh2bPEKPtp',
  'spotify:track:7tOYSMYowhxJ0uK3WMoL5n',
  'spotify:track:2uTrOSgkY7bPDh7VLeaicM',
  'spotify:track:4mGgFYhPcF6VuMn4jaHCbx',
  'spotify:track:7LmMQyOx62rbprC2eyPtHO',
  'spotify:track:5G4W4UzaJIpYl0ar95Cs17',
  'spotify:track:403vzOZN0tETDpvFipkNIL',
  'spotify:track:3Ulj4AXe7s1gUZ7iByb9Jy',
  'spotify:track:27ytYULTu6QSZBhGaOKq9i',
  'spotify:track:1hI9ZhG2wlCbQKJnw3krPU',
  'spotify:track:09HN59mQtAlKYzM2i5sGbO',
  'spotify:track:711WfDztCZpnmJg7Uvwod3',
  'spotify:track:52FKX00U3PnzrBQmbMTB8b',
  'spotify:track:25fcj6d2W1l8DQL11Czdzb',
  'spotify:track:6ZMYbLF33jIECoG2MClauD',
  'spotify:track:1TE21NTIHAUUOVd1GVXNOw',
  'spotify:track:3kQDykkw9HVowlm3HxTcuR',
  'spotify:track:77OBKDqQD0tvocHP5AXDDV'],
 ['spotify:track:4R59wt5nnhYo88PIu3cUIt',
  'spotify:track:1MKT2Vf6qowCN48KlH0aCA',
  'spotify:track:1fYouLdK3PkOMPnx4CPxTY',
  'spotify:track:0NeJjNlprGfZpeX2

In [97]:
spotify.current_playback()

{'device': {'id': '61132c7ae806abd04502be3994fcc2408fb6fd88',
  'is_active': True,
  'is_private_session': False,
  'is_restricted': False,
  'name': 'Alex’s MacBook Pro',
  'type': 'Computer',
  'volume_percent': 100},
 'shuffle_state': False,
 'repeat_state': 'off',
 'timestamp': 1609623733109,
 'context': {'external_urls': {'spotify': 'https://open.spotify.com/playlist/4qOSWuAVcnM4lgsLAzg2Pm'},
  'href': 'https://api.spotify.com/v1/playlists/4qOSWuAVcnM4lgsLAzg2Pm',
  'type': 'playlist',
  'uri': 'spotify:playlist:4qOSWuAVcnM4lgsLAzg2Pm'},
 'progress_ms': 8574,
 'item': {'album': {'album_type': 'album',
   'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/3Uobr6LgQpBbk6k4QGAb3V'},
     'href': 'https://api.spotify.com/v1/artists/3Uobr6LgQpBbk6k4QGAb3V',
     'id': '3Uobr6LgQpBbk6k4QGAb3V',
     'name': 'I Prevail',
     'type': 'artist',
     'uri': 'spotify:artist:3Uobr6LgQpBbk6k4QGAb3V'}],
   'available_markets': ['AD',
    'AE',
    'AL',
    'AR',
    