<a id="home"></a>
# Alexaurus Recs Recommenders

### Recommenders explored in this notebook. 

[Track Based](#track_based)

[Artist Based](#artist_based)

[One-Click](#one_click)

[Playlist Based - work in progress](#playlist_based)

# Log-In
Client ID and Client Secret are application identification codes provided by the [Spotify Developers Program](https://developer.spotify.com/). These codes are removed from my code before pushing to GitHub.

In [53]:
## REMOVE CLIENT_ID AND CLIENT_SECRET BEFORE PUSHING ##
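
One way to avoid having to strip the credentials by hand is to read them from environment variables. The sketch below assumes the variables `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are set before the notebook starts; the helper name `load_spotify_credentials` is hypothetical and not part of Spotipy.

```python
import os


def load_spotify_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running this notebook."
        )
    return client_id, client_secret
```

With that in place, `CLIENT_ID, CLIENT_SECRET = load_spotify_credentials()` would keep the secrets out of the notebook file.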


In [48]:
# This code is all of the necessary imports and a quick way to access my personal data
# using Spotipy. 
import spotipy
from spotipy import util
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
import pandas as pd
import matplotlib.pyplot as plt
import urllib

USERNAME = 'elw86ve5g5t944wwlef6qyzu3' # Alex Fioto's user id
SCOPE = 'playlist-modify-public user-top-read'
LOCAL_REDIRECT_URI = 'http://127.0.0.1:8080'
REDIRECT_URI = 'https://alexaurusrecs.herokuapp.com/'

# Requesting access token
token = util.prompt_for_user_token(username=USERNAME,
                                   scope=SCOPE,
                                   client_id=CLIENT_ID,
                                   client_secret=CLIENT_SECRET,
                                   redirect_uri=LOCAL_REDIRECT_URI) 
# Instantiating OAuth object
spotify = spotipy.Spotify(auth=token)

<a id='track_based'></a>
# Track Based Recommender
[Back to Top](#home)

### Overview: 

The track-based recommender retrieves base recommendations from Spotify based on the parameters the user supplies. I then request the audio features for each of the recommended tracks. Learn more about the audio features [HERE](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/). 

From there, I will pull the user's top listened-to tracks. I will pull the audio features from them as well.

Lastly, I will use scikit-learn's `cosine_similarity` and `pairwise_distances` to compare the audio features between the two sets of tracks and return a list of the tracks that are most similar.
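
For reference, here is the measure in symbols, where $\mathbf{a}$ and $\mathbf{b}$ are the audio-feature vectors of two tracks. The notation is mine. With the cosine metric, scikit-learn's `pairwise_distances` returns the distance $d_{\cos}$, which is why the similarity is recovered later as one minus the distance.

```latex
\[
\operatorname{sim}(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert},
\qquad
d_{\cos}(\mathbf{a}, \mathbf{b}) = 1 - \operatorname{sim}(\mathbf{a}, \mathbf{b})
\]
```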

### Let's get started

In [12]:
# Imports
from sklearn.metrics.pairwise import pairwise_distances, cosine_distances, cosine_similarity
from sklearn.preprocessing import StandardScaler
import pandas as pd
pd.set_option("display.precision", 14)

Below is a Spotify [URI](https://community.spotify.com/t5/Spotify-Answers/What-s-a-Spotify-URI/ta-p/919201). Essentially, it's a code used by Spotify to identify artists, playlists, tracks and more. 

I will use the track Blockbuster Night, Pt. 1 by Run the Jewels as my test track. Listen to it [here](https://open.spotify.com/album/4Loc7NtCAo9mypHO6kbviD?highlight=spotify:track:5jQYkYhoOlBW4vJ2l4TCxl).

**The scenario is that I LOVE this song. I want to hear more just like it. Let's find some good recommendations!**

In [13]:
# Test track from Run the Jewels.
blockbuster = 'spotify:track:5jQYkYhoOlBW4vJ2l4TCxl' 

In [14]:
# Returns a dictionary of recommended tracks. These are the base recommendations.
recs = spotify.recommendations(seed_tracks = [blockbuster], limit=20)
type(recs)

dict

In [15]:
# Track names and URIs of base recommendations
rec_track_names = [track['name'] for track in recs['tracks']]
uris = [track['uri'] for track in recs['tracks']]
rec_track_names

['Millions',
 'The Space Program',
 'Trouble in Paradise',
 'Blue Suede',
 'Connect Four',
 'Reagan',
 'Put Jewels on It',
 'Father Sister Berzerker',
 'Easy Rider',
 'Kill Jill (feat. Killer Mike & Jeezy)',
 'Acid Raindrops',
 'Rings',
 'Lie, Cheat, Steal',
 'Alphabet Aerobics',
 'Untouchable',
 'Wolf Like Me',
 'Drowning',
 'War Ready',
 'Above The Clouds',
 'Bar Breaker']

At this point in the recommender there is a user **decision-point**. The user can choose not to use their listening history. If so, we return the URIs of the base recommendations. But where's the fun in that!!🎺🎶🎵

Below, you can see a dataframe that includes the audio features: danceability, energy, etc. These are the features we are going to compare against the user's top tracks.

In [17]:
# Creating audio features dataframe of recommended tracks
rec_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
rec_df['track_name'] = rec_track_names
rec_df['uri'] = uris
rec_df.head(3)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.667,0.747,4,-4.918,0,0.277,0.0484,0.0,0.108,0.281,136.931,audio_features,7clUOYJb0WsAgFQEgDxSIT,spotify:track:7clUOYJb0WsAgFQEgDxSIT,https://api.spotify.com/v1/tracks/7clUOYJb0WsA...,https://api.spotify.com/v1/audio-analysis/7clU...,309947,4,Millions
1,0.612,0.705,1,-5.898,1,0.352,0.427,0.0,0.658,0.533,101.876,audio_features,203xmWRHAyqwW6AkGkhhVM,spotify:track:203xmWRHAyqwW6AkGkhhVM,https://api.spotify.com/v1/tracks/203xmWRHAyqw...,https://api.spotify.com/v1/audio-analysis/203x...,341040,4,The Space Program
2,0.814,0.851,7,-6.725,1,0.212,0.155,0.00251,0.232,0.245,110.013,audio_features,4AHo9T0MVpNTPybmv0ITfJ,spotify:track:4AHo9T0MVpNTPybmv0ITfJ,https://api.spotify.com/v1/tracks/4AHo9T0MVpNT...,https://api.spotify.com/v1/audio-analysis/4AHo...,126945,4,Trouble in Paradise


In [18]:
# Fetching a dictionary of user's top tracks
user_tracks = spotify.current_user_top_tracks(limit=15, time_range='medium_term')
type(user_tracks)

dict

In [19]:
# Saving lists of track names and URIs of user top tracks
user_track_names = [track['name'] for track in user_tracks['items']]
uris = [track['uri'] for track in user_tracks['items']]
user_track_names # How embarrassing 😳

['Blockbuster Night, Pt. 1',
 'When I Grow Up',
 'The Dark',
 'No Excuses',
 'Nate',
 'Returns',
 'PAID MY DUES',
 'The Search',
 'Leave Me Alone',
 'Blood // Water',
 'The Visitor',
 'Let Me Go',
 'Venom - Music From The Motion Picture',
 'Only',
 'Talk to Me']

In [20]:
# Creating audio features dataframe of user top_tracks
user_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
user_df['track_name'] = user_track_names
user_df['uri'] = uris
user_df.head(1)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.701,0.525,1,-7.938,0,0.382,0.0353,2.55e-06,0.0615,0.385,75.693,audio_features,5jQYkYhoOlBW4vJ2l4TCxl,spotify:track:5jQYkYhoOlBW4vJ2l4TCxl,https://api.spotify.com/v1/tracks/5jQYkYhoOlBW...,https://api.spotify.com/v1/audio-analysis/5jQY...,152253,4,"Blockbuster Night, Pt. 1"


<a id='artist_reference'></a>
We are slicing our data frames to only include numeric columns to be compared.

In [23]:
# Setting new dataframes of only numeric features to be compared
rec_df_numeric = rec_df.loc[:, rec_df.columns[:11]]
user_df_numeric = user_df.loc[:, user_df.columns[:11]]
rec_df_numeric.head(1)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
0,0.667,0.747,4,-4.918,0,0.277,0.0484,0.0,0.108,0.281,136.931


Here is another **decision-point** for the user. At this point a user may determine specifically what they like about the song. If they only like the "beat", we may only include danceability, energy and tempo. If they like the message behind the song, we may only include features such as valence, mode and speechiness. 

Unfortunately, this user choice is not implemented in the app yet due to my lack of front-end web development skills. Coming soon! 
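
Until the front end exists, the choice could be expressed as a simple lookup. This is only a sketch: the group names `"beat"` and `"message"` and the helper `columns_to_drop` are hypothetical, and the groupings are just the two examples given above.

```python
# The eleven feature columns compared in this notebook.
AUDIO_FEATURES = [
    "danceability", "energy", "key", "loudness", "mode", "speechiness",
    "acousticness", "instrumentalness", "liveness", "valence", "tempo",
]

# Hypothetical groupings, taken from the two examples in the text above.
FEATURE_GROUPS = {
    "beat": ["danceability", "energy", "tempo"],
    "message": ["valence", "mode", "speechiness"],
}


def columns_to_drop(preference):
    """Return every feature column that is NOT part of the chosen group."""
    keep = FEATURE_GROUPS[preference]
    return [col for col in AUDIO_FEATURES if col not in keep]
```

The returned list has the same shape as the `drop_cols` argument of the `recommend_songs` function defined later in this section.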

### Now we compare the two dataframes

In [25]:
# Comparing the two dataframes using pairwise_distances and setting metric to cosine
comps = pairwise_distances(rec_df_numeric, user_df_numeric, metric='cosine')

# Creating dataframe of these distances and setting the columns to the user_tracks and index to recommendation tracks
comps_df = pd.DataFrame(comps, columns=user_track_names, index=rec_track_names)

# pairwise_distances returns cosine distance, but we want similarity.
comps_df = 1 - comps_df

# Dropping any duplicate tracks
comps_df.drop_duplicates(inplace=True)

comps_df.head(3)

Unnamed: 0,"Blockbuster Night, Pt. 1",When I Grow Up,The Dark,No Excuses,Nate,Returns,PAID MY DUES,The Search,Leave Me Alone,Blood // Water,The Visitor,Let Me Go,Venom - Music From The Motion Picture,Only,Talk to Me
Millions,0.99750199236698,0.99988936622154,0.99928910934323,0.99969668028341,0.99490343489182,0.99898916838312,0.99983087569199,0.99986462571067,0.99963036013202,0.9999483253942,0.99777219905774,0.99954564228239,0.99953972564343,0.99935245754999,0.99947921314963
The Space Program,0.99882923579216,0.99971043624963,0.99948685227837,0.99956354001272,0.99281667873934,0.99834751832921,0.99953952809482,0.99979926361571,0.99935042684243,0.99927006893044,0.99898588436804,0.99906472091026,0.99971712573011,0.99813693400984,0.99895211104388
Trouble in Paradise,0.99774200287826,0.99866710443145,0.99942652038497,0.99786476290929,0.99770140663709,0.99991324570446,0.99823742792039,0.99867898557355,0.9976224357366,0.99934551862418,0.99815421778617,0.9998757155409,0.99769662672427,0.99961334431696,0.99992206406963


Now we have a cosine similarity score for each pair of base recommendation and user top track. This model assumes that you like the songs you listen to the most. 

Next, for each user track, we will find the highest similarity score and add that to our list of recommendations.

In [26]:
song_list = []

for user_track in comps_df.columns:
    max_cosine = comps_df[user_track].max()
    song_name = comps_df[comps_df[user_track] == max_cosine].index[0]
    song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
    #print(f'{user_track}------ closest to ---------> {song_name}')
    song_list.append((song_name, song_uri))

In [27]:
# This adds unique URIs. Drops songs if they are duplicates.

uri_list= []

[uri_list.append(track[1]) for track in song_list if track[1] not in uri_list]

[None, None, None, None, None, None, None]
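
The list of `None` values above is a side effect of using a list comprehension only for its `append` calls. A possible alternative, sketched here with a hypothetical helper name `unique_uris`, relies on the fact that dictionaries keep insertion order (Python 3.7+), so duplicates are dropped while the first-seen order is preserved.

```python
def unique_uris(song_list):
    """Return the URIs from (name, uri) pairs, without duplicates, in first-seen order."""
    return list(dict.fromkeys(uri for _name, uri in song_list))


# Small sample in the same (name, uri) shape as song_list
sample = [
    ("Acid Raindrops", "spotify:track:4MbV8zrWudQflnbiIzp29t"),
    ("Rings", "spotify:track:3SVqCyJ7ahjO2kAa4XhGUG"),
    ("Acid Raindrops", "spotify:track:4MbV8zrWudQflnbiIzp29t"),
]
sample_uris = unique_uris(sample)
```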

In [30]:
# List of songs that have the highest scores to my top tracks and their associated URIs
song_list

[('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Bar Breaker', 'spotify:track:6LyzqxbVyIUoDtr4dYshIl'),
 ('Connect Four', 'spotify:track:7uer04FLTB7mt2ImHP6XjE'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Put Jewels on It', 'spotify:track:01zffoSenvp9JnSbl0UgMa'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Father Sister Berzerker', 'spotify:track:3ROmYjCVAJeoqi5ozYFi6J'),
 ('Connect Four', 'spotify:track:7uer04FLTB7mt2ImHP6XjE')]

At this point, you can copy and paste the URIs into a blank Spotify playlist. My app will create the playlist for the user so that step is eliminated. Please see "Create Playlist" function in app_utilities.ipynb for more information.

Below is code to perform the same calculations as above, but this time it includes a standard scaler. I hypothesized that the scaled data would give better recommendations, but I have found that not to be the case. I may explore this more in the future.

In [33]:
# Scaled data
ss = StandardScaler()
rec_df_scaled = pd.DataFrame(ss.fit_transform(rec_df_numeric.values), columns=rec_df_numeric.columns, index=rec_df_numeric.index)
user_df_scaled = pd.DataFrame(ss.transform(user_df_numeric.values), columns=user_df_numeric.columns, index=user_df_numeric.index)
comps_scaled = pairwise_distances(rec_df_scaled, user_df_scaled, metric='cosine')
comps_df_scaled = pd.DataFrame(comps_scaled, columns=user_track_names, index=rec_track_names)
comps_df_scaled = 1 - comps_df_scaled
comps_df_scaled.drop_duplicates(inplace=True)
song_list_scaled = []

for user_track in comps_df_scaled.columns:
    max_cosine = comps_df_scaled[user_track].max()
    song_name = comps_df_scaled[comps_df_scaled[user_track] == max_cosine].index[0]
    song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
    #print(f'{user_track}------ closest to ---------> {song_name}')
    song_list_scaled.append((song_name, song_uri))
    
uri_list_scaled = []
[uri_list_scaled.append(track[1]) for track in song_list_scaled if track[1] not in uri_list_scaled]
song_list_scaled

[('Drowning', 'spotify:track:70bx1d7tQbVm768tZKbRhn'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('War Ready', 'spotify:track:2HwqshjCe5Tuvq2s6Fx3K5'),
 ('Drowning', 'spotify:track:70bx1d7tQbVm768tZKbRhn'),
 ('Untouchable', 'spotify:track:4AHZRMJCpscmxygCNtC2Qq'),
 ('Bar Breaker', 'spotify:track:6LyzqxbVyIUoDtr4dYshIl'),
 ('Millions', 'spotify:track:7clUOYJb0WsAgFQEgDxSIT'),
 ('Acid Raindrops', 'spotify:track:4MbV8zrWudQflnbiIzp29t'),
 ('Above The Clouds', 'spotify:track:3ZBSXNYdTZVaBUQI3E2rF6'),
 ('Millions', 'spotify:track:7clUOYJb0WsAgFQEgDxSIT'),
 ('War Ready', 'spotify:track:2HwqshjCe5Tuvq2s6Fx3K5'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG'),
 ('Kill Jill (feat. Killer Mike & Jeezy)',
  'spotify:track:6npcJbgDhtWeNT5iOGsjZI'),
 ('Rings', 'spotify:track:3SVqCyJ7ahjO2kAa4XhGUG')]
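
One possible reason the scaled and unscaled picks differ so much is that, on the raw data, tempo is on a far larger scale than the other features and may dominate the cosine score. The sketch below is my own check rather than part of the recommender. It uses the feature rows for "Millions" and "Blockbuster Night, Pt. 1" shown earlier, in plain Python, and compares the score with and without tempo.

```python
import math

# Feature order: danceability, energy, key, loudness, mode, speechiness,
# acousticness, instrumentalness, liveness, valence, tempo
millions = [0.667, 0.747, 4, -4.918, 0, 0.277, 0.0484, 0.0, 0.108, 0.281, 136.931]
blockbuster = [0.701, 0.525, 1, -7.938, 0, 0.382, 0.0353, 2.55e-06, 0.0615, 0.385, 75.693]


def cosine_similarity_1d(a, b):
    """Cosine similarity between two equal-length sequences of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


with_tempo = cosine_similarity_1d(millions, blockbuster)
# Same comparison with the last feature (tempo) left out
without_tempo = cosine_similarity_1d(millions[:-1], blockbuster[:-1])
```

Here `with_tempo` matches the first entry of the unscaled similarity table (about 0.9975), while `without_tempo` comes out noticeably lower. That suggests the raw scores sit so close to 1 largely because of tempo.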

Below is a function that includes all of the above code with additional parameters for the base recommendations.

In [4]:
def recommend_songs(spotify, artists=None, genres=None, tracks=None, limit=100, n_tracks=10, listener_based=True, time_range='medium_term', drop_cols=None):
    '''
    This function will recommend songs based on a seed artist, seed genres and/or seed tracks.
    The searched songs will be compared to user top tracks and return only those songs with the 
    highest cosine similarity.
   
    Parameters:
    
    spotify: OAuth user instance
    
    artists: list of Spotify artist URIs
    
    genres: list of seed genres
    
    tracks: list of Spotify track URIs
    
    limit: number of basic Spotify recommendations for comparison
    
    n_tracks: number of tracks to pull from user listening history
    
    listener_based: boolean, if True, function will pull user data and compare to base recommendations. If False, function will return n_tracks of base recommendations
    
    time_range: the listening history time range to pull from user listening data
    
    drop_cols: columns to drop from numeric feature comparison. 
    Choose from:    danceability
                    energy
                    key
                    loudness
                    mode
                    speechiness
                    acousticness
                    instrumentalness
                    liveness
                    valence
                    tempo
    
    '''
    
    # Fetching Spotify recommendations 
    recs = spotify.recommendations(seed_artists=artists, seed_genres=genres, seed_tracks=tracks, limit=limit)
    
    # Saving lists of track names and URIs of recommended tracks
    rec_track_names = [track['name'] for track in recs['tracks']]
    uris = [track['uri'] for track in recs['tracks']]
    
    # Creating audio features dataframe of recommended tracks
    rec_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
    rec_df['track_name'] = rec_track_names
    rec_df['uri'] = uris
    
    # If not listener based, returning list of URIs for Spotify recommended songs
    if not listener_based:
        return list(rec_df[:n_tracks]['uri'])
    
    else: # If listener based
        
        # Fetching user top tracks
        user_tracks = spotify.current_user_top_tracks(limit=n_tracks, time_range=time_range)
        
        # Saving lists of track names and URIs of user top tracks
        user_track_names = [track['name'] for track in user_tracks['items']]
        uris = [track['uri'] for track in user_tracks['items']]
        
        # Creating audio features dataframe of user top_tracks
        user_df = pd.DataFrame.from_dict(spotify.audio_features(uris))
        user_df['track_name'] = user_track_names
        user_df['uri'] = uris
        
        # Setting new dataframes of only numeric features to be compared
        rec_df_numeric = rec_df.loc[:, rec_df.columns[:11]]
        user_df_numeric = user_df.loc[:, user_df.columns[:11]]
        
        # If drop_cols, dropping appropriate columns from each numeric dataframe
        if drop_cols:
            rec_df_numeric = rec_df_numeric.drop(columns=drop_cols)
            user_df_numeric = user_df_numeric.drop(columns=drop_cols)
              
        
        comps = pairwise_distances(rec_df_numeric, user_df_numeric, metric='cosine')
        comps_df = pd.DataFrame(comps, columns=user_track_names, index=rec_track_names)
        comps_df = 1 - comps_df
        comps_df.drop_duplicates(inplace=True)
        
        song_list = []
    
        for user_track in comps_df.columns:
            max_cosine = comps_df[user_track].max()
            song_name = comps_df[comps_df[user_track] == max_cosine].index[0]
            song_uri = list(rec_df.loc[rec_df['track_name'] == song_name, 'uri'])[0]
            #print(f'{user_track}------ closest to ---------> {song_name}')
            song_list.append((song_name, song_uri))

        # Keeping unique URIs in first-seen order
        uri_list = list(dict.fromkeys(track[1] for track in song_list))
    
    
        return uri_list

<a id="artist_based"></a>
# Artist Based Recommender
[Back to Top](#home)

**Overview**: 

The goal with this recommender is to return songs from a target artist that closely match your listening taste. I've broken this recommender into 3 functions:

- `get_artist_tracks`: pulls all of the songs from an artist and returns the audio features
- `get_user_tracks`: pulls a designated number of tracks from a user's listening history and returns their audio features
- `compare_songs`: performs a pairwise comparison of the outputs of the two previous functions and returns a list of URIs 

In [16]:
# Test Artists URIs
logslaught = 'spotify:artist:1I471vwcRhqQl6QonGZlen'
grandson = 'spotify:artist:4ZgQDCtRqZlhLswVS6MHN4'
run_the_jewels = 'spotify:artist:4RnBFZRiMLRyZy0AzzTg2C'
hatebreed = 'spotify:artist:17Mb968quDHpjCkIyq30QV'
jxdn = 'spotify:artist:6Y64EaNqpqcZYTgs4c76gF'

First we need to ask the Spotify API for the maximum allowed number of albums for a given artist. In this case, we use our favorite, [Run the Jewels](https://open.spotify.com/artist/4RnBFZRiMLRyZy0AzzTg2C).

In [17]:
albums = spotify.artist_albums(artist_id=run_the_jewels, limit=50)

In [18]:
# List of all album URIs
album_uris = []
for album in albums['items']:
    album_uris.append(album['uri'])

# Display the first four URIs
album_uris[:4]

['spotify:album:6cx4GVNs03Pu4ZczRnWiLd',
 'spotify:album:74LenWKjMdGa2qJiI1sxwT',
 'spotify:album:0kx0GhFSzqxYDeUiyyWl8i',
 'spotify:album:0DLkBrivkPhPGeSowPUyde']
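
A single request returns at most 50 albums, so an artist with a longer discography would be cut off. As a sketch of how the remaining pages could be collected, the helper below follows the `next` link of a paged response using Spotipy's `next` method. The name `fetch_all_items` is hypothetical.

```python
def fetch_all_items(sp, first_page):
    """Collect 'items' from a paged Spotify response and every page after it."""
    items = []
    page = first_page
    while page:
        items.extend(page["items"])
        # Stop when the response has no link to a following page
        page = sp.next(page) if page.get("next") else None
    return items
```

For example, `fetch_all_items(spotify, albums)` would return the album dictionaries from every page, not only the first 50.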

The code snippet below iterates through all of the albums. For each album, it pulls the album's artists (some albums are collaborations with more than one artist), pulls all of the album's tracks, grabs their audio features and renders them in a pandas dataframe. From there, we append each album dataframe to a list called `df_list`. 

In [19]:
df_list = []
# Iterate through the album URIs
for uri in album_uris:
    # Fetching album JSON
    album = spotify.album(uri)
    # Fetching album name
    album_name = album['name']
    # Fetching all artists on album
    album_artist = ', '.join([artist['name'] for artist in album['artists']])

    # Fetching all of the track URIs on the album
    tracks = spotify.album_tracks(uri)
    # Listing track URIs and names
    track_uris = [track['uri'] for track in tracks['items']]
    track_names = [track['name'] for track in tracks['items']]

    # Fetching audio features from the track URIs and assigning track name, album name and album artist
    audio_features_df = pd.DataFrame.from_dict(spotify.audio_features(track_uris))
    audio_features_df['track_name'] = track_names
    audio_features_df['album_name'] = album_name
    audio_features_df['artist_name'] = album_artist

    # Appending dataframe to list to concatenate
    df_list.append(audio_features_df)
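
The loop above makes one audio-features request per album. As an alternative sketch, the track URIs could be pooled first and requested in batches, assuming the endpoint's limit of 100 tracks per request. The helper names `chunked` and `fetch_audio_features` are hypothetical.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_audio_features(sp, track_uris, batch_size=100):
    """Request audio features in batches and return one flat list."""
    features = []
    for batch in chunked(track_uris, batch_size):
        features.extend(sp.audio_features(batch))
    return features
```

This trades the per-album dataframes for a single list, so album and artist names would have to be attached in a separate step.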

Below we concatenate all of the album dataframes in `df_list`. This gives us a dataframe named `df_artist` that includes the audio features for every one of the artist's tracks.

In [20]:
# Concatenating dataframes, resetting index and dropping duplicates
df_artist = pd.concat(df_list)
# drop_duplicates returns a new dataframe, so the result must be assigned
df_artist = df_artist.drop_duplicates()
df_artist.reset_index(inplace=True, drop=True)
df_artist.head(3)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name,album_name,artist_name
0,0.595,0.878,10,-5.63,0,0.423,0.172,0.0,0.303,0.746,...,audio_features,1EjzcBTVLV7ATtdsQwyV31,spotify:track:1EjzcBTVLV7ATtdsQwyV31,https://api.spotify.com/v1/tracks/1EjzcBTVLV7A...,https://api.spotify.com/v1/audio-analysis/1Ejz...,146573,4,yankee and the brave (ep. 4),RTJ4,Run The Jewels
1,0.686,0.824,11,-5.413,1,0.34,0.157,0.0,0.242,0.405,...,audio_features,5taqLrLouA4vCjM7ZQpEtW,spotify:track:5taqLrLouA4vCjM7ZQpEtW,https://api.spotify.com/v1/tracks/5taqLrLouA4v...,https://api.spotify.com/v1/audio-analysis/5taq...,183827,4,ooh la la (feat. Greg Nice & DJ Premier),RTJ4,Run The Jewels
2,0.584,0.722,9,-7.733,1,0.28,0.0183,1.4e-05,0.143,0.685,...,audio_features,2uxudaBcJamtfgvUjSDdkZ,spotify:track:2uxudaBcJamtfgvUjSDdkZ,https://api.spotify.com/v1/tracks/2uxudaBcJamt...,https://api.spotify.com/v1/audio-analysis/2uxu...,201133,4,out of sight (feat. 2 Chainz),RTJ4,Run The Jewels


### This is all of the code above compiled in a function

In [22]:
def get_artist_tracks(sp, artists, n_albums=50):
    '''
    Function takes in the Spotify URI of one or more artists and returns a Pandas dataframe with Spotify's proprietary audio features.
    
    Parameters:
    sp: Spotipy user authentication object. 
    artists: list of one or more artist URIs
    n_albums: number of albums to fetch per artist. 1-50 acceptable range
    
    
    '''
    # Empty list to hold the track uris, album uris and dataframes to concatenate
    uris = []
    album_uris = []
    df_list = []
    
    # Checking if user input artists as a list. 
    if not isinstance(artists, list):
        # Rectifying if user did not input list
        artists = [artists]
    
    # Iterate through artists
    for artist in artists:
        # Fetches the albums JSONs
        albums = sp.artist_albums(artist, limit=n_albums)
        # Iterating through the albums and appending the album URIs
        for album in albums['items']:
            album_uris.append(album['uri'])
    
    # Iterate through the album URIs
    for uri in album_uris:
        # Fetching album JSON
        album = sp.album(uri)
        # Fetching album name
        album_name = album['name']
        # Fetching all artists on album
        album_artist = ', '.join([artist['name'] for artist in album['artists']])
        
        # Fetching all of the track URIs on the album
        tracks = sp.album_tracks(uri)
        # Listing track URIs and names
        track_uris = [track['uri'] for track in tracks['items']]
        track_names = [track['name'] for track in tracks['items']]
        
        # Fetching audio features from the track URIs and assigning track name, album name and album artist
        audio_features_df = pd.DataFrame.from_dict(sp.audio_features(track_uris))
        audio_features_df['track_name'] = track_names
        audio_features_df['album_name'] = album_name
        audio_features_df['artist_name'] = album_artist
        
        # Appending dataframe to list to concatenate
        df_list.append(audio_features_df)
    
    # Concatenating dataframes, resetting index and dropping duplicates
    df = pd.concat(df_list)
    # drop_duplicates returns a new dataframe, so the result must be assigned
    df = df.drop_duplicates()
    df.reset_index(inplace=True, drop=True)
    return df

### Let's test `get_artist_tracks`!

In [23]:
artist_df = get_artist_tracks(spotify, run_the_jewels)
artist_df.head(3)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name,album_name,artist_name
0,0.595,0.878,10,-5.63,0,0.423,0.172,0.0,0.303,0.746,...,audio_features,1EjzcBTVLV7ATtdsQwyV31,spotify:track:1EjzcBTVLV7ATtdsQwyV31,https://api.spotify.com/v1/tracks/1EjzcBTVLV7A...,https://api.spotify.com/v1/audio-analysis/1Ejz...,146573,4,yankee and the brave (ep. 4),RTJ4,Run The Jewels
1,0.686,0.824,11,-5.413,1,0.34,0.157,0.0,0.242,0.405,...,audio_features,5taqLrLouA4vCjM7ZQpEtW,spotify:track:5taqLrLouA4vCjM7ZQpEtW,https://api.spotify.com/v1/tracks/5taqLrLouA4v...,https://api.spotify.com/v1/audio-analysis/5taq...,183827,4,ooh la la (feat. Greg Nice & DJ Premier),RTJ4,Run The Jewels
2,0.584,0.722,9,-7.733,1,0.28,0.0183,1.4e-05,0.143,0.685,...,audio_features,2uxudaBcJamtfgvUjSDdkZ,spotify:track:2uxudaBcJamtfgvUjSDdkZ,https://api.spotify.com/v1/tracks/2uxudaBcJamt...,https://api.spotify.com/v1/audio-analysis/2uxu...,201133,4,out of sight (feat. 2 Chainz),RTJ4,Run The Jewels


### Now let's get a user's tracks!
We are going to go through the same process, but with fewer steps. Looking forward, we are going to parameterize `limit` in the `get_user_tracks` function so the user can choose how many songs to pull and compare to the artist's tracks, but for now we will leave it at 20 tracks. We will do the same for `time_range`; for now we will leave it at `'medium_term'`.

In [25]:
# Fetching user's top tracks
user_tracks = spotify.current_user_top_tracks(limit=20, time_range='medium_term')
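
Once `limit` and `time_range` become user inputs, it may be worth checking them before calling the API. The sketch below assumes a `limit` between 1 and 50 and the three time ranges `'short_term'`, `'medium_term'` and `'long_term'`. The helper name `validate_top_tracks_args` is hypothetical.

```python
VALID_TIME_RANGES = ("short_term", "medium_term", "long_term")


def validate_top_tracks_args(limit, time_range):
    """Raise ValueError for arguments outside the assumed accepted ranges."""
    if not 1 <= limit <= 50:
        raise ValueError(f"limit must be between 1 and 50, got {limit}")
    if time_range not in VALID_TIME_RANGES:
        raise ValueError(f"time_range must be one of {VALID_TIME_RANGES}, got {time_range!r}")
    return limit, time_range
```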

In [28]:
# Fetching audio features from user's top tracks
audio_features =  spotify.audio_features([item['uri'] for item in user_tracks['items']])

Now that we have fetched the user's top tracks and their audio features, we need to create a dataframe that also holds the track names so we can compare them.

In [29]:
# Listing names of tracks
names = [item['name'] for item in user_tracks['items']]

# Creating dataframe of audio features
df_user = pd.DataFrame.from_dict(audio_features)

# Setting track names
df_user['track_name'] = names

df_user.head(2)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.701,0.525,1,-7.938,0,0.382,0.0353,3e-06,0.0615,0.385,75.693,audio_features,5jQYkYhoOlBW4vJ2l4TCxl,spotify:track:5jQYkYhoOlBW4vJ2l4TCxl,https://api.spotify.com/v1/tracks/5jQYkYhoOlBW...,https://api.spotify.com/v1/audio-analysis/5jQY...,152253,4,"Blockbuster Night, Pt. 1"
1,0.817,0.814,2,-3.985,1,0.185,0.32,0.0,0.152,0.247,109.994,audio_features,5rLyYxZNzca00ENADO9m54,spotify:track:5rLyYxZNzca00ENADO9m54,https://api.spotify.com/v1/tracks/5rLyYxZNzca0...,https://api.spotify.com/v1/audio-analysis/5rLy...,196800,4,When I Grow Up


### This is all of the above code in a function

In [33]:
def get_user_tracks(sp, limit=10, time_range='medium_term'):
    '''
    Fetches a Spotify user's top tracks and returns a dataframe of 
    Spotify audio features
    
    Parameters:
    sp: Spotipy user authentication object  
    limit: number of user tracks to fetch
    time_range: choose from 'short_term', 'medium_term' and 'long_term'
    '''
    
    # Fetching user's top tracks
    user_tracks = sp.current_user_top_tracks(limit=limit, time_range=time_range)
    # Fetching audio features from user's top tracks
    audio_features =  sp.audio_features([item['uri'] for item in user_tracks['items']])
    # Listing names of tracks
    names = [item['name'] for item in user_tracks['items']]
    # Creating dataframe of audio features
    df = pd.DataFrame.from_dict(audio_features)
    # Setting track names
    df['track_name'] = names
    return df

### Let's test `get_user_tracks`!

In [35]:
df_user = get_user_tracks(spotify, limit=20, time_range='long_term')
df_user.head(2)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,track_name
0,0.701,0.525,1,-7.938,0,0.382,0.0353,3e-06,0.0615,0.385,75.693,audio_features,5jQYkYhoOlBW4vJ2l4TCxl,spotify:track:5jQYkYhoOlBW4vJ2l4TCxl,https://api.spotify.com/v1/tracks/5jQYkYhoOlBW...,https://api.spotify.com/v1/audio-analysis/5jQY...,152253,4,"Blockbuster Night, Pt. 1"
1,0.817,0.814,2,-3.985,1,0.185,0.32,0.0,0.152,0.247,109.994,audio_features,5rLyYxZNzca00ENADO9m54,spotify:track:5rLyYxZNzca00ENADO9m54,https://api.spotify.com/v1/tracks/5rLyYxZNzca0...,https://api.spotify.com/v1/audio-analysis/5rLy...,196800,4,When I Grow Up


## We've got an artist's tracks and a user's tracks. Let's compare!

This comparison is the same as it was in the earlier function. [Refer here for a detailed explanation of what is happening](#artist_reference)

In [37]:
from sklearn.metrics.pairwise import pairwise_distances, cosine_distances, cosine_similarity
pd.set_option("display.precision", 14)

artist_track_names = df_artist['track_name']
user_track_names = df_user['track_name']

df_numeric = df_artist.loc[:, df_artist.columns[:11]]
user_df_numeric = df_user.loc[:, df_user.columns[:11]]

recs = pairwise_distances(df_numeric, user_df_numeric, metric='cosine')
rec_df = pd.DataFrame(recs, columns=user_track_names, index=artist_track_names)    
rec_df = 1 - rec_df
rec_df.drop_duplicates(inplace=True)

song_list = []

for user_track in rec_df.columns:
    max_cosine = rec_df[user_track].max()
    song_name = rec_df[rec_df[user_track] == max_cosine].index[0]
    song_uri = list(df_artist.loc[df_artist['track_name'] == song_name, 'uri'])[0]
    #print(f'{user_track}------ closest to ---------> {song_name}')
    song_list.append((song_name, song_uri))

uri_list = []
[uri_list.append(track[1]) for track in song_list if track[1] not in uri_list]

song_list

[('Blockbuster Night, Pt. 1', 'spotify:track:6zqYITGsQw58qXz31AtQah'),
 ('Pew Pew Pew - Live From SXSW / 2015',
  'spotify:track:5D8iavYbqpalfnXmUXtFf8'),
 ('All My Life', 'spotify:track:4oc6T2mb67AAVxI7b4E12u'),
 ('Get It', 'spotify:track:63i0xlsKuhhHqt9fmpRrMJ'),
 ('yankee and the brave (ep. 4)', 'spotify:track:1EjzcBTVLV7ATtdsQwyV31'),
 ('Hey Kids', 'spotify:track:2ZPHQZkhWNNjKQ32d8aimv'),
 ('Get It', 'spotify:track:63i0xlsKuhhHqt9fmpRrMJ'),
 ('Pew Pew Pew - Bonus Track', 'spotify:track:1UwINkY4CL6d0WXgmgdGtv'),
 ('Get It', 'spotify:track:63i0xlsKuhhHqt9fmpRrMJ'),
 ('Twin Hype Back', 'spotify:track:6oGoJ9RJynv7gLaJB7sffR'),
 ('Run the Jewels', 'spotify:track:3mdo1znA61tI2RVALup1ft'),
 ('Down - Z Dot UK Remix', 'spotify:track:0C5sm79K5QIvJkOgreMpjp'),
 ('walking in the snow', 'spotify:track:2pVvB487ZwqdzTxEvEEors'),
 ('ooh la la (feat. Greg Nice & DJ Premier)',
  'spotify:track:5taqLrLouA4vCjM7ZQpEtW'),
 ('Talk to Me', 'spotify:track:0QCymIumn4YPTlKOuvDsS1'),
 ('Down (feat. Joi)', 's

### This is all of the code above compiled in a function

In [40]:
def compare_songs(artist_df, user_df):
    '''
    Takes in a dataframe of artist tracks and audio features, and a dataframe of user top tracks. 
    Returns tracks in artist_df that are most aligned with user tracks.
    '''
    
    band_track_names = artist_df['track_name']
    user_track_names = user_df['track_name']
    
    df_numeric = artist_df.loc[:, artist_df.columns[:11]]
    user_df_numeric = user_df.loc[:, user_df.columns[:11]]
    
    recs = pairwise_distances(df_numeric, user_df_numeric, metric='cosine')
    rec_df = pd.DataFrame(recs, columns=user_track_names, index=band_track_names)    
    rec_df = 1 - rec_df
    rec_df.drop_duplicates(inplace=True)
    
    song_list = []
    
    for user_track in rec_df.columns:
        max_cosine = rec_df[user_track].max()
        song_name = rec_df[rec_df[user_track] == max_cosine].index[0]
        song_uri = list(artist_df.loc[artist_df['track_name'] == song_name, 'uri'])[0]
        #print(f'{user_track}------ closest to ---------> {song_name}')
        song_list.append((song_name, song_uri))
        
    # De-duplicate the URIs while preserving their original order
    uri_list = list(dict.fromkeys(uri for _, uri in song_list))
    
    return uri_list
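
To make the matching step easier to follow, here is a minimal pure-Python sketch of the same idea: compute the cosine similarity between feature vectors, then keep the best-matching artist track for each user track. The track names and feature values below are made up for illustration, and the helper `cosine_similarity` is written by hand here rather than taken from scikit-learn.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up feature vectors (think danceability, energy, valence); not real data.
artist_tracks = {
    "artist_track_a": [0.9, 0.8, 0.7],
    "artist_track_b": [0.2, 0.3, 0.9],
}
user_tracks = {
    "user_track_1": [0.8, 0.9, 0.6],
    "user_track_2": [0.1, 0.2, 0.8],
}

# For each user track, keep the artist track with the highest similarity.
best_match = {
    user_name: max(
        artist_tracks,
        key=lambda name: cosine_similarity(artist_tracks[name], user_vec),
    )
    for user_name, user_vec in user_tracks.items()
}
print(best_match)
```

`compare_songs` does the same thing at scale: `pairwise_distances(..., metric='cosine')` returns cosine distances, so `1 - rec_df` converts them back into similarities before the maximum is taken per user track.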

### Let's test out all three functions!
- `get_artist_tracks`
- `get_user_tracks`
- `compare_songs`

In [41]:
glass_animals = 'spotify:artist:4yvcSjfu4PC0CYQyLy4wSq'

In [42]:
artist_tracks = get_artist_tracks(spotify, artists=glass_animals)
user_tracks = get_user_tracks(spotify, limit=12, time_range='long_term')
song_list = compare_songs(artist_tracks, user_tracks)
song_list

['spotify:track:3mRBPZbnwsmddN0JH5BNVl',
 'spotify:track:29hdc5G2JCbWMfuESdrlkK',
 'spotify:track:3ZMXLTcves58BnHyLytiTp',
 'spotify:track:2zZ9AeS6VQzXQsfzs1HW36',
 'spotify:track:1Z1ukvH16jnsmjVidIVueE',
 'spotify:track:6XADhsXM5S9Ck89bbMMTGw',
 'spotify:track:5MwthEwpoFUVayAlhjoBs2',
 'spotify:track:2Svizy2uKSW0YyuffAX78U',
 'spotify:track:7t3MxxnVUOEkmdvEWxXSaH',
 'spotify:track:05rgpw5CeSege0heFDGSqA',
 'spotify:track:65PsOtxOpgaGvjztMGzoj2',
 'spotify:track:67QuYzEv1pkYrP6QyugCDZ']

**AWESOME!** We now have some cool songs to listen to by [Glass Animals](https://open.spotify.com/artist/4yvcSjfu4PC0CYQyLy4wSq)
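
The URIs returned above are not clickable on their own. As a small convenience, the sketch below converts a Spotify URI into the matching `open.spotify.com` link. The helper name `uri_to_url` is hypothetical and not part of Spotipy.

```python
def uri_to_url(uri):
    """Turn 'spotify:<type>:<id>' into 'https://open.spotify.com/<type>/<id>'."""
    # Hypothetical helper: assumes the three-part 'spotify:<type>:<id>' form.
    prefix, kind, spotify_id = uri.split(":")
    if prefix != "spotify":
        raise ValueError(f"not a Spotify URI: {uri!r}")
    return f"https://open.spotify.com/{kind}/{spotify_id}"

print(uri_to_url("spotify:artist:4yvcSjfu4PC0CYQyLy4wSq"))
```

Mapping it over the output of `compare_songs` gives a list of links that can be opened directly in a browser.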

<a id="one_click"></a>
# One Click Recommender
[Back to Top](#home)

I received feedback that it would be nice to be able to create a playlist with one click. There is no data science here, but here is a fun little function that gathers the recommended tracks for such a playlist!

In [51]:
def one_click_rec(spotify, limit=20):
    '''
    Create a basic list of recommended Spotify tracks based on your listening history.
    If the listening history is not deep enough, fall back to a current Spotify featured playlist.
    
    Parameters:
        
        spotify: OAuth user authentication object
        
        limit: number of recommended tracks to return
        
    Returns:
        
        List of Spotify URI codes
        
    '''
    try:
        # Attempt to pull the user's top tracks (recommendations accept at most five seeds)
        uris = [track['uri'] for track in spotify.current_user_top_tracks(limit=5)['items']]
    except Exception:
        uris = []
    
    if not uris:
        # If there is not enough user data, grab URIs from a Spotify featured playlist
        playlist_id = spotify.featured_playlists(limit=1)['playlists']['items'][0]['id']
        playlist = spotify.playlist(playlist_id, additional_types=('track', ))
        uris = [item['track']['uri'] for item in playlist['tracks']['items'][:5]]
    
    # Request `limit` recommended songs for the seed URIs designated above
    recs = spotify.recommendations(seed_tracks=uris, limit=limit)
    rec_uris = [track['uri'] for track in recs['tracks']]
    return rec_uris

In [52]:
one_click_rec(spotify)

['spotify:track:5epMdylltsD0tLc8TCkt9M',
 'spotify:track:5C5FqRfbXF9oY4HSRYyqn3',
 'spotify:track:2TfoabXa0CbEwcqpxOn9z3',
 'spotify:track:5VndRF0wlvbMQsZ25GErQM',
 'spotify:track:4CxFN5zON70B3VOPBYbd6P',
 'spotify:track:38gP5OiDg8u1FvHWY6F53Y',
 'spotify:track:0C9JzWHcKsyY8AzgCM97h0',
 'spotify:track:53BHUFdQphHiZUUG3nx9zn',
 'spotify:track:3ebfDw8WEbf0DxGh47R2lo',
 'spotify:track:0hsl5A7PsMB5i4alm7ozIu',
 'spotify:track:0ZbZYmpo4XXcw1rmaRlAVT',
 'spotify:track:2AHKsVXN6BjBO8SEVX6cBC',
 'spotify:track:4A8pfRjSc0U3xA73UtxIA0',
 'spotify:track:2HRYa6iG1M5DRefO8pK2I3',
 'spotify:track:3JyvSSU0VnlMUsQckyEVfX',
 'spotify:track:6w6SW8zyEcyxwSR7Wya45a',
 'spotify:track:1qRabaD5y56JZzQSm4qB0n',
 'spotify:track:21fThCIz6c4Brj6stR2heN',
 'spotify:track:6VDz4m1vEGHDDqQojf1mpc',
 'spotify:track:5HkUROhtt0H7ivLukKvRka']
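
`one_click_rec` only returns URIs, so one more step is needed to turn them into an actual playlist. The sketch below shows one way that could look, assuming `spotify` is an authenticated Spotipy client whose token carries the `playlist-modify-public` scope. The function name `save_recs_as_playlist` and the default playlist name are hypothetical.

```python
def save_recs_as_playlist(spotify, uris, name="One-Click Recs"):
    """Create a public playlist for the current user and fill it with `uris`.

    Hypothetical helper: `spotify` is assumed to be an authenticated
    spotipy.Spotify client with the `playlist-modify-public` scope.
    """
    user_id = spotify.current_user()["id"]
    playlist = spotify.user_playlist_create(user_id, name, public=True)

    # Spotify accepts at most 100 items per add request, so send in chunks.
    for start in range(0, len(uris), 100):
        spotify.playlist_add_items(playlist["id"], uris[start:start + 100])

    return playlist["id"]
```

With a live client this would be called as `save_recs_as_playlist(spotify, one_click_rec(spotify))`, which returns the ID of the new playlist.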

<a id="playlist_based"></a>
# Playlist Recommender
[Back to Top](#home)
### Work in Progress

In [75]:
playlist_id = spotify.featured_playlists(limit=1)['playlists']['items'][0]['id']
playlist = spotify.playlist(playlist_id, additional_types=('track', ))
uris = [uri['track']['uri'] for uri in playlist['tracks']['items'][:5]]
uris

['spotify:track:4R59wt5nnhYo88PIu3cUIt',
 'spotify:track:1MKT2Vf6qowCN48KlH0aCA',
 'spotify:track:1fYouLdK3PkOMPnx4CPxTY',
 'spotify:track:0NeJjNlprGfZpeX2LQuN6c',
 'spotify:track:2f6pqUyFcs3NUSoz49H9nw']

In [97]:
spotify.current_playback()

{'device': {'id': '61132c7ae806abd04502be3994fcc2408fb6fd88',
  'is_active': True,
  'is_private_session': False,
  'is_restricted': False,
  'name': 'Alex’s MacBook Pro',
  'type': 'Computer',
  'volume_percent': 100},
 'shuffle_state': False,
 'repeat_state': 'off',
 'timestamp': 1609623733109,
 'context': {'external_urls': {'spotify': 'https://open.spotify.com/playlist/4qOSWuAVcnM4lgsLAzg2Pm'},
  'href': 'https://api.spotify.com/v1/playlists/4qOSWuAVcnM4lgsLAzg2Pm',
  'type': 'playlist',
  'uri': 'spotify:playlist:4qOSWuAVcnM4lgsLAzg2Pm'},
 'progress_ms': 8574,
 'item': {'album': {'album_type': 'album',
   'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/3Uobr6LgQpBbk6k4QGAb3V'},
     'href': 'https://api.spotify.com/v1/artists/3Uobr6LgQpBbk6k4QGAb3V',
     'id': '3Uobr6LgQpBbk6k4QGAb3V',
     'name': 'I Prevail',
     'type': 'artist',
     'uri': 'spotify:artist:3Uobr6LgQpBbk6k4QGAb3V'}],
   'available_markets': ['AD',
    'AE',
    'AL',
    'AR',
    'A