# Personalized Spotify Playlist Creator (Web Application)

## About the project

Using **Spotify's Web API** (application programming interface) through the Python client library **spotipy**, we'll build a personalized recommendation system whose outcome is a playlist based on the user's 20 most-listened songs. 
The Python program will be wrapped in a web application using the **Flask** module. 

## 1. Data collection

### 1.1 API Configuration

In [2]:
%pip install spotipy
import spotipy

Note: you may need to restart the kernel to use updated packages.


In [3]:
from  spotipy.oauth2 import SpotifyClientCredentials
import spotipy.util as util

1. Go to [developer.spotify.com](https://developer.spotify.com/)
2. Click the "DASHBOARD" tab
3. Log in with the email and password of your Spotify account
4. Click "CREATE AN APP", name the application, and fill in the description
5. In the "SETTINGS" of the newly created app, fill in "Redirect URIs" with a localhost address, e.g. *http://localhost:8888/callback* (it must match the `redirect_uri` used in the code)
6. In "USERS AND ACCESS", define which users are authorized to use the app; in this case I'll only authorize my own user.
7. Copy the "CLIENT ID" and "CLIENT SECRET", the keys that give the app access to Spotify's API and let it read and modify our own account.


In [4]:
import os
# keep credentials out of the notebook: read them from environment variables
cid=os.environ["SPOTIPY_CLIENT_ID"]
secret=os.environ["SPOTIPY_CLIENT_SECRET"]
username="12145924164"
redirect_uri="http://localhost:8888/callback"

In [5]:
# the scope defines the privileges granted to our token:
# "user-top-read" lets current_user_top_tracks read the user's top songs, "playlist-modify-public" lets us create the playlist.
scope="playlist-modify-public,user-top-read"
token=util.prompt_for_user_token(username,scope,cid,secret,redirect_uri)
sp=spotipy.Spotify(auth=token)

### 1.2 Get top 20 tracks of user

The data used to make a recommendation can be obtained in several ways; in this case we will use spotipy's function **current_user_top_tracks**. <br>
Other starting points for a recommendation based on the user's current tastes are:
- **sp.current_user_followed_artists**
- **sp.current_user_playlists**
- **sp.current_user_recently_played**
- **sp.current_user_top_artists**

In this instance, we will get the user's top 20 songs over the past few weeks (for more information about the time range, read spotipy's documentation). 
The function ***current_user_top_tracks*** returns a *dictionary* whose "**items**" field holds the information of interest about each track.


In [6]:
top20 = sp.current_user_top_tracks(time_range='short_term',limit=20)
top20

{'items': [{'album': {'album_type': 'SINGLE',
    'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/2bdcBjvuI9worc472GbeU0'},
      'href': 'https://api.spotify.com/v1/artists/2bdcBjvuI9worc472GbeU0',
      'id': '2bdcBjvuI9worc472GbeU0',
      'name': 'Samuel Kim',
      'type': 'artist',
      'uri': 'spotify:artist:2bdcBjvuI9worc472GbeU0'}],
    'available_markets': ['AD',
     'AE',
     'AR',
     'AT',
     'AU',
     'BE',
     'BG',
     'BH',
     'BO',
     'BR',
     'CA',
     'CH',
     'CL',
     'CO',
     'CR',
     'CY',
     'CZ',
     'DE',
     'DK',
     'DO',
     'DZ',
     'EC',
     'EE',
     'EG',
     'ES',
     'FI',
     'FR',
     'GB',
     'GR',
     'GT',
     'HK',
     'HN',
     'HU',
     'ID',
     'IE',
     'IL',
     'IN',
     'IS',
     'IT',
     'JO',
     'JP',
     'KW',
     'LB',
     'LI',
     'LT',
     'LU',
     'LV',
     'MA',
     'MC',
     'MT',
     'MX',
     'MY',
     'NI',
     'NL',
     'NO',
  

In [7]:
for i,item in enumerate(top20["items"]):
    # the artist's name comes from the "name" subfield of the first entry in "artists"
    print(i+1,item["name"],"//",item["artists"][0]["name"])

1 Avatar's Love // Samuel Kim
2 She's A Rainbow // The Rolling Stones
3 Saturn // Bryce Dessner
4 Bella Traicion // Belinda
5 Blanket Me - Recorded at Spotify Studios NYC // Hundred Waters
6 never been in love // Gatlin
7 cruel intentions // Lexi Jayde
8 drunk text me // Lexi Jayde
9 Helium // Sia
10 Cherry Lips (Go Baby Go!) // Garbage
11 If I Say // Mumford & Sons
12 walkin away // Lexi Jayde
13 It's Foggy Today // Evgeny Grinko
14 The Best Part // Bien
15 Baby Blue // Luke Hemmings
16 Whirlwind // Geowulf
17 Out of the Blue // Polock
18 I Love You // RIOPY
19 Bad For Me // Allocai
20 Ray of Light // Madonna
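The nested lookups in the loop above are easier to follow against a small stand-in for the response. The following is a minimal sketch: `sample_response` is a hand-written dictionary that only mimics the fields used in this notebook (it is not full API output), and `summarize_tracks` is a hypothetical helper name.

```python
# Hand-written stand-in for the dictionary returned by current_user_top_tracks.
# Only the fields used in this notebook are included.
sample_response = {
    "items": [
        {"id": "0s3djTXa7o53xNE9zrxwya",
         "name": "Avatar's Love",
         "artists": [{"id": "2bdcBjvuI9worc472GbeU0", "name": "Samuel Kim"}]},
        {"id": "6KOtheMY0KN4s9TrQHr9It",
         "name": "She's A Rainbow",
         "artists": [{"id": "22bE4uQ6baNwSHPVcDxLCe", "name": "The Rolling Stones"}]},
    ]
}

def summarize_tracks(response):
    """Return (rank, track name, first artist name) for each track in the response."""
    rows = []
    for rank, item in enumerate(response["items"], start=1):
        first_artist = item["artists"][0]["name"]
        rows.append((rank, item["name"], first_artist))
    return rows

for rank, name, artist in summarize_tracks(sample_response):
    print(rank, name, "//", artist)
```

Each entry of "items" is itself a dictionary, and "artists" is a list, which is why the first artist is read with `[0]`.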


### 1.3 Creation of a dataframe with the musical features of each track

In [8]:
# we'll create a dataframe named "top20_df"
import pandas as pd
tracks=top20["items"]
track_ids=[]
track_names=[]
features=[]
for track in tracks:
    track_id=track["id"]
    track_name=track["name"]
    # the "audio_features" method returns the musical features for a track id
    audio_features =sp.audio_features(track_id)
    
    track_ids.append(track_id)
    track_names.append(track_name)
    features.append(audio_features[0])
top20_df=pd.DataFrame(features, index=track_names)
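The loop above issues one request per track. spotipy's `audio_features` also accepts a list of track IDs (the Web API documents a limit of 100 IDs per request), so the IDs could be grouped into batches first. Below is a minimal sketch of the batching step only; `chunked` is a hypothetical helper, the IDs are placeholders, and the commented lines show how it might be combined with the `sp` client from this notebook.

```python
def chunked(items, size):
    """Split a list into consecutive sublists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[start:start + size] for start in range(0, len(items), size)]

# Example with placeholder IDs (not real Spotify IDs).
example_ids = [f"track_{n}" for n in range(250)]
batches = chunked(example_ids, 100)
print([len(batch) for batch in batches])  # [100, 100, 50]

# Possible use with the notebook's client (assumes `sp` and `track_ids` exist):
# features = []
# for batch in chunked(track_ids, 100):
#     features.extend(sp.audio_features(batch))
```

For 20 tracks the difference is small, but for the larger candidate set built later it would reduce the number of requests considerably.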

In [9]:
top20_df.head()

| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | type | id | uri | track_href | analysis_url | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Avatar's Love | 0.193 | 0.324 | 2 | -8.816 | 1 | 0.0305 | 0.73 | 0.924 | 0.0805 | 0.0385 | 78.307 | audio_features | 0s3djTXa7o53xNE9zrxwya | spotify:track:0s3djTXa7o53xNE9zrxwya | https://api.spotify.com/v1/tracks/0s3djTXa7o53... | https://api.spotify.com/v1/audio-analysis/0s3d... | 205428 | 4 |
| She's A Rainbow | 0.464 | 0.824 | 10 | -7.359 | 1 | 0.0374 | 0.467 | 0.018 | 0.33 | 0.539 | 109.298 | audio_features | 6KOtheMY0KN4s9TrQHr9It | spotify:track:6KOtheMY0KN4s9TrQHr9It | https://api.spotify.com/v1/tracks/6KOtheMY0KN4... | https://api.spotify.com/v1/audio-analysis/6KOt... | 253720 | 4 |
| Saturn | 0.465 | 0.743 | 10 | -9.4 | 0 | 0.0332 | 0.527 | 0.115 | 0.103 | 0.232 | 143.986 | audio_features | 3lb5jRUvwyNloY5Uhnzhwn | spotify:track:3lb5jRUvwyNloY5Uhnzhwn | https://api.spotify.com/v1/tracks/3lb5jRUvwyNl... | https://api.spotify.com/v1/audio-analysis/3lb5... | 231023 | 4 |
| Bella Traicion | 0.532 | 0.935 | 2 | -3.126 | 1 | 0.0686 | 0.00644 | 0.0 | 0.11 | 0.662 | 139.887 | audio_features | 5eIsMbKPT1IJ0b0rdvgtlZ | spotify:track:5eIsMbKPT1IJ0b0rdvgtlZ | https://api.spotify.com/v1/tracks/5eIsMbKPT1IJ... | https://api.spotify.com/v1/audio-analysis/5eIs... | 225373 | 4 |
| Blanket Me - Recorded at Spotify Studios NYC | 0.358 | 0.376 | 4 | -12.016 | 1 | 0.0427 | 0.837 | 0.00128 | 0.0718 | 0.292 | 122.043 | audio_features | 4bMHWD1vEnq76Ro208gZHd | spotify:track:4bMHWD1vEnq76Ro208gZHd | https://api.spotify.com/v1/tracks/4bMHWD1vEnq7... | https://api.spotify.com/v1/audio-analysis/4bMH... | 303196 | 3 |


Keeping only the columns we need gives a tidier dataset: our top 20 tracks with 12 columns, the track id plus the 11 audio features used to make a prediction.

In [10]:
top20_df=top20_df[["id","danceability","energy", "key", "loudness","mode","speechiness","acousticness","instrumentalness","liveness","valence","tempo"]]
top20_df.shape

(20, 12)

### 1.4 Get candidate tracks

1. We'll extract the artists (without repetitions) that are of interest to the user
2. Using the API, we'll get similar artists (sp.artist_related_artists) and artists with new album releases (sp.new_releases)
3. For each artist in this expanded list we'll get one album, and from each album we'll extract 3 tracks
4. Using the API, we'll get the musical features of these new tracks, which we'll call our candidate tracks. 
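Steps 1 and 2 amount to taking a set union of artist IDs. The sketch below shows that logic without calling the API: `expand_artists` and `fake_related` are hypothetical names, the IDs are made up, and the related-artist lookup is passed in as a function so that a stub can stand in for `sp.artist_related_artists`.

```python
def expand_artists(seed_ids, related_lookup, new_release_ids):
    """Union of seed artists, their related artists, and artists with new releases."""
    expanded = set(seed_ids)
    for artist_id in seed_ids:
        expanded.update(related_lookup(artist_id))
    expanded.update(new_release_ids)
    return expanded

# Stub standing in for the related-artists API call (made-up IDs).
def fake_related(artist_id):
    table = {"a1": ["b1", "b2"], "a2": ["b2", "b3"]}
    return table.get(artist_id, [])

artists = expand_artists(["a1", "a2", "a1"], fake_related, ["c1", "b3"])
print(sorted(artists))  # ['a1', 'a2', 'b1', 'b2', 'b3', 'c1']
```

Using a set means an artist reached through several routes (for example, related to two seed artists) is counted only once.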


#### 1.4.1 Get artists from the user's top20_df

In [11]:
ids_artists=[]
print("Artists on my top 20")
print('=====================')
for item in top20["items"]:
    artist_id=item["artists"][0]["id"]
    artist_name=item["artists"][0]["name"]
    print(f'{artist_id}:{artist_name}')
    ids_artists.append(artist_id)
# a set is a collection of objects without repetition, so converting the list
# to a set (and back to a list) removes the duplicate artists
ids_artists=list(set(ids_artists))
print(f'Number of artists (without repetition):{len(ids_artists)}')

Artists on my top 20
2bdcBjvuI9worc472GbeU0:Samuel Kim
22bE4uQ6baNwSHPVcDxLCe:The Rolling Stones
5HHKeO04SOcxUxjruFXf5l:Bryce Dessner
5LeiVcEnsZcwc133TUhJNW:Belinda
108ugtkRFQzP9nGgNiyERO:Hundred Waters
1KGcdM5KxCVydaHe29QAj9:Gatlin
69761NObDw2KwmmFgZmxzC:Lexi Jayde
69761NObDw2KwmmFgZmxzC:Lexi Jayde
5WUlDfRSoLAfcVSX1WnrxN:Sia
6S0GHTqz5sxK5f9HtLXn9q:Garbage
3gd8FJtBJtkRxdfbTu19U2:Mumford & Sons
69761NObDw2KwmmFgZmxzC:Lexi Jayde
69RwhKw37lY73bMGaSts7C:Evgeny Grinko
2uodbv2953Z7R1ipwucK2A:Bien
4UFShyUQuA8dguoZrqX0jQ:Luke Hemmings
01TQ6CLvPSOYGUqRQ3nWgZ:Geowulf
0dRPyk7hiV4FJxSG1VHf2L:Polock
4ytDz3A9nHcVMjEbdNSKzA:RIOPY
35hdnctklAphlKzgR8aHpq:Allocai
6tbjWDEIzxoDsBA1FuhfPW:Madonna
Number of artists (without repetition):18


#### 1.4.2 Artists related to the user's top 20 artists

In [12]:
print("")
print("similar artists")
print('=====================')
ids_similar_artists=[]
for artist_id in ids_artists:
    # from the dictionary given we are only interested in the artists field
    artists=sp.artist_related_artists(artist_id)["artists"]
    for item in artists:
        artist_id=item["id"]
        artist_name=item["name"]
        print(f'{artist_id}:{artist_name}')
        ids_similar_artists.append(artist_id)
ids_artists.extend(ids_similar_artists)
ids_artists=list(set(ids_artists))
print(f'Number of artists (without repetition):{len(ids_artists)}')


similar artists
5rQt28kugM2JTzkJtgNovi:Mik
2hE2ofc7IKD1iQ39HYU5N4:KOHTA YAMAMOTO
2pXwZkLpbcY2hWndxzes1B:Aleta
7ArkPxPSwZoQw5ESuxt4oG:Rok Nardin
6krMKleBJfXYPdCP5q3ypW:Kevin Kiner
3cxdsMDGKccNG6J9CbaYaZ:AtinPiano
56C07WwXqmNmRXSmkmJFsK:CJ Music
3VudC1DraV4LjIdLzpqQ80:Friedrich Habetler
0Riv2KnFcLZA3JSVryRg4y:Hiroyuki Sawano
4UIwKmZD5HU9E0RjeZaxVt:Krutikov Music
3oGVQWQy7lgTMuTnKUZZNZ:Yuki Hayashi
4uXJgaCc1GtHWtFq8CmPmQ:Hiroaki Tsutsumi
56DDzGJXY0xndL9wu9aHUD:Yugo Kanno
64mecceQewFCKwCK6JBW0o:Yasuharu Takanashi
3vJCYheZF5PWUvTIykvNG5:Secession Studios
4JPXL1WEPgSUSBsWB1GMim:A Samurai In Tokyo
04ZLnodB6WbVvYg2LECqpQ:Natalie Holt
0G8iubd5gKv5vdPnmHRaxT:Yutaka Yamada
6QeQA8W6WZNwHfDU1mOA2e:Joseph Shirley
2RzQznPCFWvnq3wBh0zzD2:Charli Adams
6Y2m4AEOS9JFrsK2goyg7T:Babygirl
3vldh5Ceynytj6Iglw4haP:Tommy Lefroy
3gBjSrNsYzzbeo0nwsL21J:ella jane
3MNLhvqJkWsO6tcjY9ps62:Maude Latour
5NnHZjAQDhWUb5ISZO7FSw:Love You Later
26DvqLYszG0oIOeelTF5kE:Trousdale
7GlBOeep6PqTfFi59PTUUN:Chappell Roan
2AmfMGi3W

#### 1.4.3 Expand our list with artists that have new albums

In [13]:
print('')
print("Artists with new releases:")
print("===========================")
new_releases =sp.new_releases(limit=20)["albums"]
for item in new_releases["items"]:
    artist_id=item["artists"][0]["id"]
    artist_name=item["artists"][0]["name"]
    album_name=item["name"]
    release_date=item["release_date"]
    print(f'{artist_id}:{artist_name}-//{album_name},{release_date}')
    ids_artists.append(artist_id)
ids_artists=list(set(ids_artists))
print(f'Number of artists (without repetition):{len(ids_artists)}')


Artists with new releases:
0CDUUM6KNRvgBFYIbWxJwV:Dawes-//Misadventures Of Doomscroller,2022-07-22
4MoAOfV4ROWofLG3a3hhBN:Jon Pardi-//Mr. Saturday Night,2022-07-22
3M3wTTCDwicRubwMyHyEDy:Shygirl-//Coochie (a bedtime story),2022-07-20
181bsRPaVXVlUKXrxwZfHK:Megan Thee Stallion-//Pressurelicious (feat. Future),2022-07-21
6qqNVTkY8uBg9cP3Jd7DAH:Billie Eilish-//Guitar Songs,2022-07-21
2P5sC9cVZDToPxyomzF1UH:Joey Bada$$-//2000,2022-07-22
5nvWOyAkfNgVLKESq4fOj2:Montell Fish-//JAMIE,2022-07-22
23DYJsw4uSCguIqiTIDtcN:Southside-//Save Me,2022-07-22
08PvCOlef4xdOr20jFSTPd:Flo Milli-//You Still Here, Ho ?,2022-07-20
5qa31A9HySw3T7MKWI9bGg:FLETCHER-//Becky’s So Hot,2022-07-20
7okSU80WTrn4LXlyXYbX3P:Clinton Kane-//MAYBE SOMEDAY IT'LL ALL BE OK,2022-07-22
2vnB6tuQMaQpORiRdvXF9H:Beach Bunny-//Emotional Creature,2022-07-22
21mKp7DqtSNHhCAU2ugvUw:ODESZA-//The Last Goodbye,2022-07-22
3oSJ7TBVCWMDMiYjXNiCKE:Kane Brown-//Grand,2022-07-22
5Y3MV9DZ0d87NnVm56qSY1:Tiago PZK-//Portales,2022-07-21
1z7b1Pr1rSl

#### 1.4.4 Creation of candidate tracks dataframe

From each artist of interest, get one of their albums

In [14]:
id_albums=[]
nartists=len(ids_artists)
for i,ids_artist in enumerate(ids_artists):
    print(f'Processing artist {i+1} of {nartists}...')
    albums=sp.artist_albums(ids_artist,limit=1)
    for album in albums['items']:
        id_albums.append(album["id"])
print("Ready")
    

Processing artist 1 of 391...
Processing artist 2 of 391...
Processing artist 3 of 391...
Processing artist 4 of 391...
Processing artist 5 of 391...
Processing artist 6 of 391...
Processing artist 7 of 391...
Processing artist 8 of 391...
Processing artist 9 of 391...
Processing artist 10 of 391...
Processing artist 11 of 391...
Processing artist 12 of 391...
Processing artist 13 of 391...
Processing artist 14 of 391...
Processing artist 15 of 391...
Processing artist 16 of 391...
Processing artist 17 of 391...
Processing artist 18 of 391...
Processing artist 19 of 391...
Processing artist 20 of 391...
Processing artist 21 of 391...
Processing artist 22 of 391...
Processing artist 23 of 391...
Processing artist 24 of 391...
Processing artist 25 of 391...
Processing artist 26 of 391...
Processing artist 27 of 391...
Processing artist 28 of 391...
Processing artist 29 of 391...
Processing artist 30 of 391...
Processing artist 31 of 391...
Processing artist 32 of 391...
Processing artist

Extract up to 3 tracks from each album in the list

In [15]:
id_tracks=[]
nalbums=len(id_albums)
for i,id_album in enumerate(id_albums):
    print(f'Processing album {i+1} of {nalbums}...')
    # sp.album_tracks returns the tracks of an album
    album_tracks=sp.album_tracks(id_album,limit=3)
    for track in album_tracks['items']:
        id_tracks.append(track["id"])
print(f'ready! total number of preselected tracks of new releases:{len(id_tracks)}')
    

Processing album 1 of 391...
Processing album 2 of 391...
Processing album 3 of 391...
Processing album 4 of 391...
Processing album 5 of 391...
Processing album 6 of 391...
Processing album 7 of 391...
Processing album 8 of 391...
Processing album 9 of 391...
Processing album 10 of 391...
Processing album 11 of 391...
Processing album 12 of 391...
Processing album 13 of 391...
Processing album 14 of 391...
Processing album 15 of 391...
Processing album 16 of 391...
Processing album 17 of 391...
Processing album 18 of 391...
Processing album 19 of 391...
Processing album 20 of 391...
Processing album 21 of 391...
Processing album 22 of 391...
Processing album 23 of 391...
Processing album 24 of 391...
Processing album 25 of 391...
Processing album 26 of 391...
Processing album 27 of 391...
Processing album 28 of 391...
Processing album 29 of 391...
Processing album 30 of 391...
Processing album 31 of 391...
Processing album 32 of 391...
Processing album 33 of 391...
Processing album 34

Get the musical features of all the preselected tracks

In [28]:
track_names=[]
features=[]
ntracks=len(id_tracks)
for i, track_id in enumerate(id_tracks):
    print(f'processing track{i+1} of {ntracks}...')
    track_name=sp.track(track_id)["name"]
    audio_features=sp.audio_features(track_id)
    
    # skip tracks with no musical features
    # (as is the case for new songs that Spotify has not processed yet)
    if audio_features[0] is not None:
        track_names.append(track_name)
        features.append(audio_features[0])
print('Ready')
candidate_tracks_df=pd.DataFrame(features,index=track_names)

processing track1 of 1078...
processing track2 of 1078...
processing track3 of 1078...
processing track4 of 1078...
processing track5 of 1078...
processing track6 of 1078...
processing track7 of 1078...
processing track8 of 1078...
processing track9 of 1078...
processing track10 of 1078...
processing track11 of 1078...
processing track12 of 1078...
processing track13 of 1078...
processing track14 of 1078...
processing track15 of 1078...
processing track16 of 1078...
processing track17 of 1078...
processing track18 of 1078...
processing track19 of 1078...
processing track20 of 1078...
processing track21 of 1078...
processing track22 of 1078...
processing track23 of 1078...
processing track24 of 1078...
processing track25 of 1078...
processing track26 of 1078...
processing track27 of 1078...
processing track28 of 1078...
processing track29 of 1078...
processing track30 of 1078...
processing track31 of 1078...
processing track32 of 1078...
processing track33 of 1078...
processing track34 

We'll drop the additional columns, as we did previously with our top 20 songs dataframe

In [29]:
candidate_tracks_df.head(10)
candidate_tracks_df=candidate_tracks_df[["id","danceability","energy", "key", "loudness","mode","speechiness","acousticness","instrumentalness","liveness","valence","tempo"]]

## 2. Filtering based on content

We will compare our **user's top 20 tracks** with our **candidate tracks**. <br>
Each song is characterized by 11 numeric values (the `id` column is only an identifier), so each track can be represented by an 11-dimensional vector.
Our content-based filtering will compare the vectors (tracks) from the *candidate songs dataframe* with those from the *user's top 20 songs dataframe*.
The similarity between two vectors is measured with cosine similarity, which is the cosine of the angle between them: the closer a candidate is to our target, the smaller the angle and the closer the value is to 1.

$$\cos(\overline{x}_i, \overline{y}_i)=\frac{\overline{x}_i \cdot \overline{y}_i}{\Vert \overline{x}_i\Vert \, \Vert \overline{y}_i\Vert}$$
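As a worked instance of the formula, the sketch below computes the cosine similarity of a few small vectors in plain Python. The function name `cosine_similarity` and the example vectors are illustrative choices, not values from the dataset.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))   # perpendicular: 0.0
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction: -1.0
```

The first pair shows that only direction matters: doubling the length of a vector leaves the similarity unchanged.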

### 2.2 Standardizing Values

Each **column** has a different scale, so we standardize the columns. The function StandardScaler subtracts the mean from each column and divides by its standard deviation.

In [30]:
# extracting the values in a numpy array format
top20_mtx=top20_df.iloc[:,1:].values
candidates_mtx=candidate_tracks_df.iloc[:,1:].values

In [31]:
from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()

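t20_scaled=scaler.fit_transform(top20_mtx)
# note: fit_transform refits the scaler, so the candidates are standardized with their own mean and standard deviation
can_scaled=scaler.fit_transform(candidates_mtx)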
print(t20_scaled.mean(axis=0))
print(t20_scaled.std(axis=0))


[-1.41553436e-16 -1.44328993e-16 -4.44089210e-17  3.48332474e-16
 -1.66533454e-17 -8.04911693e-16  2.22044605e-17  7.21644966e-17
 -2.66453526e-16 -1.77635684e-16 -4.44089210e-17]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
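For reference, the per-column computation can be written out by hand. This is a minimal sketch in plain Python, assuming the population standard deviation (the convention `StandardScaler` uses) and assuming no column is constant; `standardize_columns` and the toy matrix are illustrative.

```python
import math

def standardize_columns(rows):
    """Subtract each column's mean and divide by its population standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # population standard deviation (divide by n, not n - 1);
    # a constant column would give 0 here and is not handled in this sketch
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Toy matrix: two features on very different scales (e.g. a 0-1 score and a tempo).
toy = [[0.2, 80.0], [0.4, 100.0], [0.6, 120.0]]
scaled = standardize_columns(toy)
for row in scaled:
    print([round(v, 3) for v in row])
```

After this step both toy columns hold the same standardized values, even though the raw numbers differed by orders of magnitude, which is the point of standardizing before comparing vectors.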


### 2.3 Normalization of vectors

Now every vector has a different magnitude, and we want them all to have the same length. We can normalize them by dividing the scaled values by the magnitude of the vector.
After normalizing each vector (row), its Euclidean norm (the square root of the sum of its squared features) equals 1.

In [32]:
import numpy as np
#magnitude of each vector
t20_norm=np.sqrt((t20_scaled*t20_scaled).sum(axis=1))
can_norm=np.sqrt((can_scaled*can_scaled).sum(axis=1))
#normalization
nt20=t20_scaled.shape[0]
ncan=can_scaled.shape[0]
t20=t20_scaled/t20_norm.reshape(nt20,1)
can=can_scaled/can_norm.reshape(ncan,1)

print(np.sqrt((t20*t20).sum(axis=1)))
print(np.sqrt((can*can).sum(axis=1)))


[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. ... 1. 1. 1.]
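The effect of this step can be checked on a small vector: after dividing by the Euclidean norm, the squares of the components sum to 1 (the components themselves generally do not), and the dot product of two normalized vectors equals the cosine similarity of the originals. A sketch in plain Python; `unit_vector` and `dot` are illustrative helper names.

```python
import math

def unit_vector(v):
    """Scale a vector to Euclidean length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

a = unit_vector([3.0, 4.0])
b = unit_vector([4.0, 3.0])
print(a)                      # [0.6, 0.8]
print(sum(x * x for x in a))  # close to 1.0 (sum of squares)
print(sum(a))                 # about 1.4 (the plain sum is not 1)
print(dot(a, b))              # about 0.96, the cosine of the angle between the originals
```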


### 2.4 Similarity between vectors

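Because every vector now has unit length, the dot product of two vectors equals their cosine similarity. sklearn's linear_kernel computes these dot products for all pairs, and its outcome is a matrix with one row per top 20 track and one column per candidate.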

In [33]:
from sklearn.metrics.pairwise import linear_kernel
# on unit-length vectors, linear_kernel (a dot product) gives the cosine similarity between each top 20 track and each candidate
cos_sim=linear_kernel(t20,can)
cos_sim.shape

(20, 1077)

### 2.5 Defining Filtering Function

In [34]:
# pos is the row of cos_sim (one of the top 20 tracks), cos_sim the similarity matrix,
# ncands the maximum number of candidates to return, and umbral the minimum similarity
# (threshold) for a track to be considered similar
def get_candidate(pos,cos_sim,ncands,umbral=0.8):
    # get the candidate tracks at or above the threshold
    idx=np.where(cos_sim[pos,:]>=umbral)[0]
    # sort from highest to lowest similarity (argsort returns ascending order, so we reverse it)
    idx=idx[np.argsort(cos_sim[pos,idx])[::-1]]
    if len(idx)>=ncands:
        cands=idx[0:ncands]
    else:
        cands=idx
    return cands
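To see what the function returns, the same selection logic can be traced on one short row of similarities. The sketch below is a plain-Python restatement (no NumPy), with `top_candidates` as a hypothetical name and made-up similarity values; the order of exactly tied values may differ from the NumPy version.

```python
def top_candidates(similarities, ncands, threshold=0.8):
    """Indices of the `ncands` highest values that are >= threshold, best first."""
    above = [i for i, s in enumerate(similarities) if s >= threshold]
    above.sort(key=lambda i: similarities[i], reverse=True)
    return above[:ncands]

row = [0.95, 0.10, 0.83, 0.79, 0.88, 0.91]
print(top_candidates(row, 3))        # [0, 5, 4]
print(top_candidates(row, 10))       # [0, 5, 4, 2]
print(top_candidates(row, 3, 0.99))  # []
```

The second and third calls show the two edge cases handled by the function: fewer matches than requested, and no matches at all.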

In [35]:
# Example
ids_t20=[]
ids_playlist=[]
for i in range(20):
    cands=get_candidate(i,cos_sim,5)
    print(f'{i}==>candidate tracks:{cands},similarity:{cos_sim[i,cands]}')

0==>candidate tracks:[ 901  900   54 1021  158],similarity:[0.97178418 0.9459791  0.94420648 0.93603118 0.92040656]
1==>candidate tracks:[ 200 1046  623  712  617],similarity:[0.89026769 0.85046021 0.83970622 0.80838118 0.80472796]
2==>candidate tracks:[748 998],similarity:[0.89271696 0.81529661]
3==>candidate tracks:[244 529 905 990 202],similarity:[0.91101959 0.90938024 0.90737865 0.89117887 0.85469827]
4==>candidate tracks:[829 423 677 162 503],similarity:[0.87705816 0.87075724 0.86883671 0.85769354 0.83891624]
5==>candidate tracks:[ 125  581 1036],similarity:[0.8209998  0.81823729 0.80624293]
6==>candidate tracks:[478 129 116 975 295],similarity:[0.9010857  0.84540676 0.84473608 0.82357402 0.81903157]
7==>candidate tracks:[471 169 655 747 967],similarity:[0.98460031 0.93022656 0.9006087  0.89451811 0.88602795]
8==>candidate tracks:[723 375 335 673 503],similarity:[0.8685132  0.85076941 0.84617098 0.83739086 0.83329749]
9==>candidate tracks:[539],similarity:[0.83146047]
10==>candida

## 3. Playlist Creation

### 3.1 IDs of the tracks that will be included in the new playlist

In [36]:
ids_t20=[]
ids_playlist=[]
for i in range(top20_df.shape[0]):
    print(top20_df.index[i])
    ids_t20.append(top20_df["id"].iloc[i])
    
    cands=get_candidate(i,cos_sim,5,umbral=0.8)
    
    if len(cands)==0:
        print('     ***No similar tracks found***')
    else:
        for j in cands:
            # extract the candidate's id by position
            id_cand=candidate_tracks_df["id"].iloc[j]
            # add the candidate's id to the playlist
            ids_playlist.append(id_cand)
            print(f"    {candidate_tracks_df.index[j]}")


Avatar's Love
    Healing Is A Miracle
    Oh, Memory
    So Soft
    Sense
    The Lake
She's A Rainbow
    Street Fighting Man - Live
    Shu Ba Da Du Ma Ma Ma Ma - Live
    Solo Quédate En Silencio - En Vivo
    Who Knew - Live
    Lost Cause - Live 1992
Saturn
    Alibis
    Aşkın Kenarından
Bella Traicion
    You Deserve A Better Man
    Wake UP!
    What Life Is
    Where I Belong
    Intrusive
Blanket Me - Recorded at Spotify Studios NYC
    holding my own hand (as written)
    Granted
    Losing My Mind
    Jamie
    Oh My God Please
never been in love
    Contigo
    Loco
    BURNITUP! (feat. Missy Elliott)
cruel intentions
    Someone Else’s Cafe / Doomscroller Tries To Relax
    Jenny
    Don't Let It Break Your Heart
    See You When The End's Near
    My Plan
drunk text me
    drunk text me
    Me Detengo
    Shapeshifter
    Child In The Dark
    Así Caen Los Días
Helium
    Hell
    doomsday
    Hope
    i walked a mile in my room
    Oh My God Please
Cherry Lips (Go Bab

### 3.2 Removing Duplicates

Remove recommended tracks that are already in the user's top 20

In [37]:
ids_playlist_dep=[x for x in ids_playlist if x not in ids_t20]
# eliminate possible repetitions
ids_playlist_dep=list(set(ids_playlist_dep))
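Note that `set` does not preserve order, so the most similar tracks are no longer first after this step. If the ranking should be kept, one option is an order-preserving de-duplication. A minimal sketch with placeholder IDs; `dedupe_excluding` is a hypothetical helper, not part of the notebook.

```python
def dedupe_excluding(candidate_ids, excluded_ids):
    """Drop excluded IDs and repeated IDs while keeping first-seen order."""
    excluded = set(excluded_ids)
    # dict preserves insertion order, so keys come back in first-seen order
    return list(dict.fromkeys(i for i in candidate_ids if i not in excluded))

playlist = ["t3", "t1", "t7", "t3", "t9", "t1"]
top20_ids = ["t1"]
print(dedupe_excluding(playlist, top20_ids))  # ['t3', 't7', 't9']
```

Converting the excluded IDs to a set first also makes each membership test constant-time, which matters little for 20 tracks but keeps the helper reasonable for longer lists.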

In [38]:
print(ids_playlist_dep)
len(ids_playlist_dep)

['0y38hbZ7yGrwOilYoqOv8c', '76vMKwFtdDDCLcM6zXybjB', '2mcT0fTjRFly3tbb7Evdiv', '2mpezzl8FlxttNlaUSDzuV', '4QxoIotDqwd6zhgYQANlTL', '2FOf3j3prkO2JyooyAYBSH', '7MiEayONNCLMKXqNXBFOnO', '5p8QLRVnBk20xuvwiDZtTr', '7xsYgGd2BRk56uO3njXUWi', '5UkdHh2Gexv89qubDJllan', '1mVva2Aa3gD9Wwlqg2G1y3', '3gYvFUEdWIlvxVxKfimAJr', '63PbZwickD5dgkGxMWhvzi', '7mtfb9JlOQ0YMryiG7VwP7', '7rA1vnAY96HkvulZgeQEhl', '6dzfRM7vrKSLCjLejwBVNW', '0CgYN1d9Dku244f1mWmS0Z', '33Cp9v9wX2NuWZHNQ3FV1Q', '6GVgVJBKkGJoRfarYRvGTU', '2dtVT9pXvSP8Nc741vegrv', '7EcAMovmbjTYE9y5MuiVHX', '3WToPa41ES9m0c5jtkl9lH', '0KfhNXqugNGgCgvzE9RUqI', '1MFiEyZeJAJVMu5kaWGhUD', '3webTMHp27ZhruiSba81Ct', '0UP7r8lFBcqwCZanYdp5ae', '2XGx0pnGVlu0ZYMksDgIPX', '3DC8V0NDZ8h06R9xQhi0AH', '13WEVW04Jf6yEaACl9ABOt', '6O8ECP0BRhR8emAlSdNC9d', '49msJ7xl3s6XrsGACoRP1g', '7bn5Z1XBT7zjUGclKljmSi', '6bpZMLiRQDm2XcjZudu3xV', '0T4thAUeigWf0gwoVIA2vN', '7vqWwggwPAMu5y9I13Gxrv', '4xxcd6cnmFfj67AmnaF6tc', '3pbvFvLSQo8cMvVBZdFh25', '79VHH0352XN3e543jYOl56', '25mKTG9WMd

84

### 3.3 Adding the new playlist to the user's Spotify account

In [39]:
sp=spotipy.Spotify(auth=token)
pl=sp.user_playlist_create(user= username,
                            name="Spotipy Recommender Playlist ",
                            description="Playlist based on your top 20 tracks")
sp.playlist_add_items(pl["id"],ids_playlist_dep)


{'snapshot_id': 'MyxkNzI5Y2I5ZmQ5YmU5Y2QxOGM4YTkxNGJjMDY5MjExYzg4MTAxZTQx'}

## 4. Result

![alt text](static/images/result.png)