Spotify Recommendation Algorithm 
Steps:
1. Import the required packages and dependencies, then read the CSV file containing song data
2. Clean the CSV data and convert the dataframe into an item-feature matrix
3. Read a Spotify playlist URL and gather the playlist data from the Spotify API
4. Make sure the playlist dataframe and the song database dataframe have the same features (columns)
5. Compute the cosine similarity between the playlist and the song database to recommend songs
6. Show the recommendations to the user!
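
Step 5 relies on cosine similarity: the dot product of two feature vectors divided by the product of their lengths. The following is a minimal sketch on made-up three-feature vectors, not the notebook's actual data. The names `cosine_scores`, `songs`, and `playlist_vector` are hypothetical and only serve the illustration.

```python
import numpy as np

def cosine_scores(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    dots = matrix @ query
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    # Guard against all-zero rows, which have no direction
    return np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)

# Toy data (illustrative values only): rows are songs, columns are features
songs = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])
playlist_vector = np.array([0.2, 0.2, 0.0])

scores = cosine_scores(playlist_vector, songs)
ranking = np.argsort(-scores)  # song indices, most similar first
```

The first song points in the same direction as the playlist vector, so it ranks highest even though its magnitude differs.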

In [95]:
# Import packages and dependencies
import pandas as pd
import matplotlib.pyplot as plt 
import numpy as np
import spotipy
import json
from spotipy.oauth2 import SpotifyOAuth
from sklearn.preprocessing import MinMaxScaler
from joblib import Parallel, delayed
from sklearn.metrics.pairwise import cosine_similarity

In [96]:
# Read CSV file 
# @st.cache_data
df = pd.read_csv('spotify_data.csv')

# Create Feature Set, drop unnecessary columns 
feat_vec = df.drop(columns=['Unnamed: 0','artist_name', 'track_name', 'key', 'duration_ms', 'time_signature'])

pd.set_option('display.max_columns', None)

feat_vec

Unnamed: 0,track_id,popularity,year,genre,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
0,53QF56cjZA9RTuuMZDrSA6,68,2012,acoustic,0.483,0.303,-10.058,1,0.0429,0.6940,0.000000,0.1150,0.1390,133.406
1,1s8tP3jP4GZcyHDsjvw218,50,2012,acoustic,0.572,0.454,-10.286,1,0.0258,0.4770,0.000014,0.0974,0.5150,140.182
2,7BRCa8MPiyuvr2VU3O9W0F,57,2012,acoustic,0.409,0.234,-13.711,1,0.0323,0.3380,0.000050,0.0895,0.1450,139.832
3,63wsZUhUZLlh1OsyrZq7sz,58,2012,acoustic,0.392,0.251,-9.845,1,0.0363,0.8070,0.000000,0.0797,0.5080,204.961
4,6nXIYClvJAfi6ujLiKqEq8,54,2012,acoustic,0.430,0.791,-5.419,0,0.0302,0.0726,0.019300,0.1100,0.2170,171.864
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1159759,0m27F0IGHLGAWhqd6ccYst,4,2011,trip-hop,0.373,0.742,-6.453,0,0.0736,0.3250,0.000141,0.1590,0.5220,107.951
1159760,6er9p611eHEcUCU50j7D57,3,2011,trip-hop,0.516,0.675,-7.588,0,0.0326,0.7880,0.000129,0.1300,0.2640,119.897
1159761,7jsMMqxy1tt0rH5FzYcZTQ,2,2011,trip-hop,0.491,0.440,-8.512,1,0.0274,0.4770,0.003130,0.0936,0.0351,100.076
1159762,77lA1InUaXztuRk2vOzD1S,0,2011,trip-hop,0.480,0.405,-13.343,1,0.0276,0.4310,0.000063,0.1250,0.2020,133.885


**Using Multi-Hot Encoding to Represent Genres** 

To build an item-feature matrix for the cosine similarity algorithm, every column must be numeric, so the genre strings need to be converted into numbers. Multi-hot encoding represents categorical data as binary vectors of 0s and 1s, with one column per category.
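
As a small illustration of the encoding, the sketch below uses a hypothetical three-genre subset and made-up songs. When each song has exactly one genre, every row contains a single 1 (one-hot); when a song can carry several genres, a row may contain more than one 1 (multi-hot). The frame and column names here are assumptions for the example only.

```python
import pandas as pd

genres = ['folk', 'indie', 'pop']  # hypothetical subset of genre labels

# One genre per song: each row has exactly one 1 (one-hot)
songs = pd.DataFrame({'genre': ['folk', 'pop', 'indie']})
for g in genres:
    songs['genre_' + g] = (songs['genre'] == g).astype(int)

# Several genres per song: a row may have more than one 1 (multi-hot)
tagged = pd.DataFrame({'genres': [['folk', 'pop'], ['indie'], []]})
for g in genres:
    tagged['genre_' + g] = tagged['genres'].apply(lambda tags: int(g in tags))
```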

In [97]:
# Create genre columns; there are many genres, so keep only the most popular ones
genre_list = feat_vec['genre'].unique().tolist()

genres_to_remove = ['afrobeat','black-metal','breakbeat','cantopop','chicago-house','comedy','death-metal','deep-house','detroit-techno','drum-and-bass','dubstep','electronic','forro','french','garage','german','grindcore','hard-rock','hardcore','hardstyle','heavy-metal','indian','metalcore','industrial','minimal-techno','new-age','pop-film','power-pop','progressive-house','psych-rock','punk-rock','sertanejo','show-tunes','ska','swedish','trance','trip-hop']
updated_genre_list = list(filter(lambda x: x not in genres_to_remove, genre_list))
 
# replace indie-pop to indie
index = updated_genre_list.index('indie-pop')
updated_genre_list[index] = 'indie'
#need to update df as well
feat_vec.loc[feat_vec['genre'] == 'indie-pop', 'genre'] = 'indie'

print(updated_genre_list)

# use one-hot encoding to convert the genre categories into binary columns
# for each genre, set the column to 1 where the song's genre matches, else 0
for item in updated_genre_list:
    feat_vec['genre_'+item] = (feat_vec['genre'] == item).astype(int)
    
# drop genre column in feat_vec df
feat_vec.drop('genre', axis=1, inplace=True)

feat_vec

['acoustic', 'alt-rock', 'ambient', 'blues', 'chill', 'classical', 'club', 'country', 'dance', 'dancehall', 'disco', 'dub', 'edm', 'electro', 'emo', 'folk', 'funk', 'gospel', 'goth', 'groove', 'guitar', 'hip-hop', 'house', 'indie', 'jazz', 'k-pop', 'metal', 'opera', 'party', 'piano', 'pop', 'punk', 'rock', 'rock-n-roll', 'romance', 'sad', 'salsa', 'samba', 'singer-songwriter', 'sleep', 'songwriter', 'soul', 'spanish', 'tango', 'techno']


Unnamed: 0,track_id,popularity,year,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno
0,53QF56cjZA9RTuuMZDrSA6,68,2012,0.483,0.303,-10.058,1,0.0429,0.6940,0.000000,0.1150,0.1390,133.406,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1s8tP3jP4GZcyHDsjvw218,50,2012,0.572,0.454,-10.286,1,0.0258,0.4770,0.000014,0.0974,0.5150,140.182,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,7BRCa8MPiyuvr2VU3O9W0F,57,2012,0.409,0.234,-13.711,1,0.0323,0.3380,0.000050,0.0895,0.1450,139.832,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,63wsZUhUZLlh1OsyrZq7sz,58,2012,0.392,0.251,-9.845,1,0.0363,0.8070,0.000000,0.0797,0.5080,204.961,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,6nXIYClvJAfi6ujLiKqEq8,54,2012,0.430,0.791,-5.419,0,0.0302,0.0726,0.019300,0.1100,0.2170,171.864,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1159759,0m27F0IGHLGAWhqd6ccYst,4,2011,0.373,0.742,-6.453,0,0.0736,0.3250,0.000141,0.1590,0.5220,107.951,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1159760,6er9p611eHEcUCU50j7D57,3,2011,0.516,0.675,-7.588,0,0.0326,0.7880,0.000129,0.1300,0.2640,119.897,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1159761,7jsMMqxy1tt0rH5FzYcZTQ,2,2011,0.491,0.440,-8.512,1,0.0274,0.4770,0.003130,0.0936,0.0351,100.076,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1159762,77lA1InUaXztuRk2vOzD1S,0,2011,0.480,0.405,-13.343,1,0.0276,0.4310,0.000063,0.1250,0.2020,133.885,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [98]:
# Create a column for each categorical time period (bucketing)

# Find min and max values of year 
print('Min Value', feat_vec['year'].min())
print('Max Value', feat_vec['year'].max())

# Make columns for each time period
feat_vec['year_2000-2004'] = feat_vec['year'].apply(lambda year: 1 if year>=2000 and year<2005 else 0)
feat_vec['year_2005-2009'] = feat_vec['year'].apply(lambda year: 1 if year>=2005 and year<2010 else 0)
feat_vec['year_2010-2014'] = feat_vec['year'].apply(lambda year: 1 if year>=2010 and year<2015 else 0)
feat_vec['year_2015-2019'] = feat_vec['year'].apply(lambda year: 1 if year>=2015 and year<2020 else 0)
feat_vec['year_2020-2023'] = feat_vec['year'].apply(lambda year: 1 if year>=2020 and year<2024 else 0)

# Drop year column, no longer needed
feat_vec = feat_vec.drop(columns=['year'])

feat_vec




Min Value 2000
Max Value 2023


Unnamed: 0,track_id,popularity,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023
0,53QF56cjZA9RTuuMZDrSA6,68,0.483,0.303,-10.058,1,0.0429,0.6940,0.000000,0.1150,0.1390,133.406,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,1s8tP3jP4GZcyHDsjvw218,50,0.572,0.454,-10.286,1,0.0258,0.4770,0.000014,0.0974,0.5150,140.182,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
2,7BRCa8MPiyuvr2VU3O9W0F,57,0.409,0.234,-13.711,1,0.0323,0.3380,0.000050,0.0895,0.1450,139.832,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
3,63wsZUhUZLlh1OsyrZq7sz,58,0.392,0.251,-9.845,1,0.0363,0.8070,0.000000,0.0797,0.5080,204.961,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
4,6nXIYClvJAfi6ujLiKqEq8,54,0.430,0.791,-5.419,0,0.0302,0.0726,0.019300,0.1100,0.2170,171.864,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1159759,0m27F0IGHLGAWhqd6ccYst,4,0.373,0.742,-6.453,0,0.0736,0.3250,0.000141,0.1590,0.5220,107.951,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1159760,6er9p611eHEcUCU50j7D57,3,0.516,0.675,-7.588,0,0.0326,0.7880,0.000129,0.1300,0.2640,119.897,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1159761,7jsMMqxy1tt0rH5FzYcZTQ,2,0.491,0.440,-8.512,1,0.0274,0.4770,0.003130,0.0936,0.0351,100.076,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1159762,77lA1InUaXztuRk2vOzD1S,0,0.480,0.405,-13.343,1,0.0276,0.4310,0.000063,0.1250,0.2020,133.885,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
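
The year bucketing above can also be expressed with `pd.cut` followed by `pd.get_dummies`. This is an alternative sketch on a few made-up years, not the code used in this notebook; the variable names are hypothetical.

```python
import pandas as pd

years = pd.Series([2003, 2012, 2019, 2023])  # illustrative release years

# Bin edges are right-inclusive: (1999, 2004], (2004, 2009], ...
edges = [1999, 2004, 2009, 2014, 2019, 2023]
labels = ['2000-2004', '2005-2009', '2010-2014', '2015-2019', '2020-2023']
periods = pd.cut(years, bins=edges, labels=labels)

# One indicator column per period, including periods with no songs
year_cols = pd.get_dummies(periods, prefix='year').astype(int)
```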


**Normalizing Feature Vectors**

All feature values should be on a scale from 0 to 1. Cosine similarity compares the direction of two vectors rather than their magnitude, but a feature with a much larger range than the others (such as tempo) would dominate that direction and carry more weight in the score.

Popularity ranges from 0 to 100, loudness from -60 to 0, and tempo from 0 to 250. These features must be scaled to 0-1 to get a more meaningful cosine similarity score.
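
Min-max scaling maps a value x to (x - min) / (max - min). The sketch below applies that formula directly with NumPy, assuming the fixed bounds used for scaling in this notebook; `BOUNDS` and `scale_feature` are hypothetical names. Note that a value outside the assumed bounds lands outside 0-1 unless it is clipped.

```python
import numpy as np

# Assumed feature bounds (minimum, maximum) used for scaling
BOUNDS = {'popularity': (0, 100), 'loudness': (-60, 0), 'tempo': (0, 250)}

def scale_feature(values, name, clip=False):
    """Scale raw values of the named feature to 0-1 using its assumed bounds."""
    low, high = BOUNDS[name]
    scaled = (np.asarray(values, dtype=float) - low) / (high - low)
    return np.clip(scaled, 0.0, 1.0) if clip else scaled

tempo_scaled = scale_feature([133.406, 204.961], 'tempo')
popularity_scaled = scale_feature([68, 0], 'popularity')
loud_scaled = scale_feature([-30.0, 3.0], 'loudness', clip=True)
```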

In [99]:
# popularity scale: 0-100, loudness scale: -60-0, tempo scale: 0-250; scale these features to 0-1
# append one row of minimum values and one row of maximum values so the scaler sees the full range,
# then remove both rows once scaling is done
min_row = {'popularity': 0, 'loudness': -60, 'tempo': 0}
max_row = {'popularity': 100, 'loudness': 0, 'tempo': 250}

min_row_df = pd.DataFrame([min_row])
max_row_df = pd.DataFrame([max_row])

feat_vec = pd.concat([feat_vec, min_row_df], ignore_index=True)
feat_vec = pd.concat([feat_vec, max_row_df], ignore_index=True)

# scale popularity, loudness, and tempo features to 0-1
scale = ['popularity', 'loudness', 'tempo']
scaler = MinMaxScaler()
feat_vec[scale] = scaler.fit_transform(feat_vec[scale])

# drop min and max values
feat_vec = feat_vec.iloc[:-2]

feat_vec

Unnamed: 0,track_id,popularity,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023
0,53QF56cjZA9RTuuMZDrSA6,0.68,0.483,0.303,0.754730,1.0,0.0429,0.6940,0.000000,0.1150,0.1390,0.533624,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
1,1s8tP3jP4GZcyHDsjvw218,0.50,0.572,0.454,0.751285,1.0,0.0258,0.4770,0.000014,0.0974,0.5150,0.560728,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
2,7BRCa8MPiyuvr2VU3O9W0F,0.57,0.409,0.234,0.699525,1.0,0.0323,0.3380,0.000050,0.0895,0.1450,0.559328,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
3,63wsZUhUZLlh1OsyrZq7sz,0.58,0.392,0.251,0.757949,1.0,0.0363,0.8070,0.000000,0.0797,0.5080,0.819844,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
4,6nXIYClvJAfi6ujLiKqEq8,0.54,0.430,0.791,0.824835,0.0,0.0302,0.0726,0.019300,0.1100,0.2170,0.687456,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1159759,0m27F0IGHLGAWhqd6ccYst,0.04,0.373,0.742,0.809209,0.0,0.0736,0.3250,0.000141,0.1590,0.5220,0.431804,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
1159760,6er9p611eHEcUCU50j7D57,0.03,0.516,0.675,0.792057,0.0,0.0326,0.7880,0.000129,0.1300,0.2640,0.479588,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
1159761,7jsMMqxy1tt0rH5FzYcZTQ,0.02,0.491,0.440,0.778093,1.0,0.0274,0.4770,0.003130,0.0936,0.0351,0.400304,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
1159762,77lA1InUaXztuRk2vOzD1S,0.00,0.480,0.405,0.705087,1.0,0.0276,0.4310,0.000063,0.1250,0.2020,0.535540,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0


**Create Item-Feature Matrix of User's Playlist**

The item-feature matrix of the database of 1M+ Spotify songs is now ready to be used for the cosine similarity algorithm. Next, create an item-feature matrix of the user's playlist. 

First, read the URL of the user's playlist and gather the songs along with the associated audio features. 
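
Playlist links can come in more than one shape (a web URL with a query string, or a `spotify:playlist:` URI), so it can help to parse them defensively. The helper below is a sketch using only the standard library; `extract_playlist_id` is a hypothetical name and is not part of spotipy.

```python
from urllib.parse import urlparse

def extract_playlist_id(link):
    """Return the playlist ID from an open.spotify.com URL or a spotify: URI."""
    if link.startswith('spotify:'):
        kind, _, playlist_id = link[len('spotify:'):].partition(':')
    else:
        parts = [p for p in urlparse(link).path.split('/') if p]
        # Expect a path ending in .../playlist/<id>; the query string is ignored
        kind, playlist_id = (parts[-2], parts[-1]) if len(parts) >= 2 else ('', '')
    if kind != 'playlist' or not playlist_id:
        raise ValueError(f'Not a playlist link: {link!r}')
    return playlist_id

playlist_id = extract_playlist_id(
    'https://open.spotify.com/playlist/37i9dQZF1DX5Q5wA1hY6bS?si=e43b100b9c734ba3'
)
```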

In [100]:
# Connect to the Spotify API
# Read the API credentials from environment variables instead of hard-coding them
import os

client_id = os.environ['SPOTIPY_CLIENT_ID']
client_secret = os.environ['SPOTIPY_CLIENT_SECRET']
redirect_uri = 'http://localhost:3000'

# Initialize the Spotipy client with authentication
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(client_id=client_id, client_secret=client_secret, redirect_uri=redirect_uri))

# Retrieve the playlist id from the playlist link (last path segment, without the query string)
playlist_link = 'https://open.spotify.com/playlist/37i9dQZF1DX5Q5wA1hY6bS?si=e43b100b9c734ba3'
playlist_id = playlist_link.split('?')[0].rstrip('/').split('/')[-1]

# Get the first 50 songs of the playlist
playlist_tracks = sp.playlist_tracks(playlist_id, limit=50)

# Create lists to hold track titles, artist names, and track URIs
titles, artists, uri = [], [], []

# Iterate through the tracks and collect title, artist, and uri from each song
for item in playlist_tracks['items']:
    track = item['track']
    titles.append(track['name'])
    artist_names = ', '.join([artist['name'] for artist in track['artists']])
    artists.append(artist_names)
    uri.append(track['uri'])

# Create a DataFrame
data = {'Title': titles, 'Artist': artists, 'uri': uri}
playlist = pd.DataFrame(data)

# create new feature columns with placeholder values of 0
new_feat = ['danceability', 'energy', 'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo']
for item in new_feat:
    playlist[item] = 0
    
# replace the placeholders with each track's audio feature values
for i in range(len(playlist)):
    track_uri = playlist.iloc[i].uri
    # audio_features returns a list with one dictionary per requested track
    features = sp.audio_features(track_uri)[0]
    
    # update feature values
    for feature in new_feat:
        playlist.loc[i, feature] = features[feature]
    
playlist

Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,-7.526,1,0.0783,0.393,0.041,0.117,0.219,83.856
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,-6.652,1,0.0548,0.443,0.00912,0.121,0.601,170.043
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,-6.53,1,0.028,0.00114,1e-06,0.151,0.203,144.218
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,-10.84,1,0.0804,0.824,0.000325,0.0953,0.534,71.771
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,-9.643,1,0.0301,0.599,0.0,0.112,0.379,169.909
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,-8.256,1,0.0349,0.047,0.0286,0.0995,0.272,108.043
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,-10.836,1,0.033,0.45,0.0503,0.162,0.224,129.57
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,-12.793,1,0.0455,0.411,0.00107,0.0941,0.224,106.05
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,-11.475,1,0.0308,0.805,0.884,0.126,0.126,117.053
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,-8.243,1,0.0346,0.0404,7e-05,0.112,0.264,135.188
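
The loop above makes one API request per track. To my knowledge, spotipy's `audio_features` also accepts a list of track URIs, and Spotify documents a limit of 100 IDs per request for that endpoint, so the requests could be batched. The sketch below shows only the batching logic; `batched_audio_features` is a hypothetical helper, and `fake_fetch` is a stand-in for `sp.audio_features` so the example runs offline.

```python
def batched_audio_features(uris, fetch, batch_size=100):
    """Call fetch on chunks of uris and return one flat list of results."""
    results = []
    for start in range(0, len(uris), batch_size):
        results.extend(fetch(uris[start:start + batch_size]))
    return results

calls = []

# Stand-in for sp.audio_features; records the size of each request
def fake_fetch(chunk):
    calls.append(len(chunk))
    return [{'uri': uri, 'tempo': 120.0} for uri in chunk]

uris = [f'spotify:track:{i}' for i in range(250)]
features = batched_audio_features(uris, fake_fetch)
```

With a real client, the call would presumably look like `batched_audio_features(playlist['uri'].tolist(), sp.audio_features)`.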


**Finding Genres of Songs in Playlist**

The Spotify API does not provide genres for individual songs, but it does provide genres for each artist. So for each song, I use the artist to find the genres associated with the song.

**Task Parallelization**

To do this, I have to iterate through each song and query the Spotify API for the genres of the song's artist, which takes a long time when the requests run one after another. To reduce the run time, I used task parallelization. By default, a Python script typically runs on a single CPU core. To make use of the other available cores, I integrated the Python package Joblib for parallel processing. It distributes the function calls across multiple CPUs, resulting in an average 30.7% reduction in processing time.
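
The pattern can be sketched in isolation as follows. `slow_lookup` is a made-up stand-in for an API request, and `prefer='threads'` is an assumption on my part: it is a Joblib option that tends to suit tasks that mostly wait on the network, whereas the notebook itself uses Joblib's default backend with `n_jobs=-1`.

```python
import time
from joblib import Parallel, delayed

def slow_lookup(name):
    """Stand-in for a network request such as an artist search."""
    time.sleep(0.05)  # simulate waiting on the API
    return name.upper()

names = ['yoke lore', 'mt. joy', 'vance joy', 'ben howard']

# Run up to four lookups at once; results come back in input order
results = Parallel(n_jobs=4, prefer='threads')(
    delayed(slow_lookup)(name) for name in names
)
```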

In [101]:
# Create a list of artist names from the playlist
artist_names = playlist['Artist'].tolist()

# Create an empty list to store genres
genres = []

# Parallelization process function to iterate through artist names and retrieve genres
def process_artist(artist_name, sp):
    search_results = sp.search(q=artist_name, type='artist')
    genres_info = []

    if 'artists' in search_results and 'items' in search_results['artists']:
        artists = search_results['artists']['items']

        for artist in artists:
            if artist['name'] == artist_name:
                genres_info = artist.get('genres', [])
                break

    genre_string = ', '.join(genres_info) if genres_info else 'No Genre Found'
    return genre_string

# Fill genre for each song using Parallelization
genres = Parallel(n_jobs=-1)(delayed(process_artist)(artist_name, sp) for artist_name in artist_names)

playlist['Genre'] = genres

playlist

Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,Genre
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,-7.526,1,0.0783,0.393,0.041,0.117,0.219,83.856,nyc pop
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,-6.652,1,0.0548,0.443,0.00912,0.121,0.601,170.043,australian indie
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,-6.53,1,0.028,0.00114,1e-06,0.151,0.203,144.218,"pov: indie, stomp and holler"
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,-10.84,1,0.0804,0.824,0.000325,0.0953,0.534,71.771,No Genre Found
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,-9.643,1,0.0301,0.599,0.0,0.112,0.379,169.909,pov: indie
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,-8.256,1,0.0349,0.047,0.0286,0.0995,0.272,108.043,"folk-pop, modern rock"
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,-10.836,1,0.033,0.45,0.0503,0.162,0.224,129.57,"british singer-songwriter, folk-pop"
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,-12.793,1,0.0455,0.411,0.00107,0.0941,0.224,106.05,indie folk
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,-11.475,1,0.0308,0.805,0.884,0.126,0.126,117.053,"ambient folk, indie folk"
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,-8.243,1,0.0346,0.0404,7e-05,0.112,0.264,135.188,"indie folk, modern folk rock, new americana, s..."


**Spotify's Unique Genre Names**

The genres that Spotify provides for each artist are very specific (for example, "stomp and holler"). To run the cosine similarity algorithm, the genre columns of both item-feature matrices need to match. To achieve this, I took the list of genres from the database's item-feature matrix and ran a substring search for each one against the artist's genre string, assigning a binary value in the playlist item-feature matrix.
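
A minimal sketch of the substring search, using a hypothetical four-genre subset and made-up genre strings:

```python
import pandas as pd

known_genres = ['folk', 'indie', 'pop', 'rock']  # hypothetical subset

artists = pd.DataFrame({
    'Genre': ['indie folk, modern folk rock', 'nyc pop', 'No Genre Found'],
})

for genre in known_genres:
    # regex=False treats the genre name as plain text, not a pattern
    artists['genre_' + genre] = (
        artists['Genre'].str.contains(genre, regex=False).astype(int)
    )
```

Plain substring matching is permissive: 'pop' also matches 'folk-pop', so a song can be flagged for several genres at once.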

In [102]:
# count how many playlist songs fall under each genre
genre_count = {}

# Substring search: assign 1 in the playlist item-feature matrix when the genre name appears in the song's genre string
for genre in updated_genre_list:
    playlist['genre_'+genre] = playlist['Genre'].str.contains(genre, regex=False).astype(int)
    # gather the count of each genre found in the playlist
    if playlist['genre_'+genre].sum() > 0:
        genre_count[genre] = playlist['genre_'+genre].sum()
            
playlist = playlist.drop(columns=['Genre'])

#get top 3 genres for recommendation 
top_3_genres = sorted(genre_count, key=genre_count.get, reverse=True)[:3]
            
playlist


Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,-7.526,1,0.0783,0.393,0.041,0.117,0.219,83.856,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,-6.652,1,0.0548,0.443,0.00912,0.121,0.601,170.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,-6.53,1,0.028,0.00114,1e-06,0.151,0.203,144.218,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,-10.84,1,0.0804,0.824,0.000325,0.0953,0.534,71.771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,-9.643,1,0.0301,0.599,0.0,0.112,0.379,169.909,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,-8.256,1,0.0349,0.047,0.0286,0.0995,0.272,108.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,-10.836,1,0.033,0.45,0.0503,0.162,0.224,129.57,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,-12.793,1,0.0455,0.411,0.00107,0.0941,0.224,106.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,-11.475,1,0.0308,0.805,0.884,0.126,0.126,117.053,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,-8.243,1,0.0346,0.0404,7e-05,0.112,0.264,135.188,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0


In [103]:
# Need to find the year and popularity of each song in the playlist 
    
playlist['year'] = [0]*len(playlist)
playlist['popularity'] = [0]*len(playlist)

# iterate through each song to find popularity and release year
for index, row in playlist.iterrows():
    track_uri = row['uri']
    # Get the track's metadata (album, popularity, etc.)
    track_info = sp.track(track_uri)

    # Extract release date from track info
    release_date = track_info['album']['release_date']
    popularity = track_info['popularity']

    # Extract year from release date
    release_year = int(release_date.split('-')[0])

    playlist.loc[index, 'year'] = int(release_year)
    playlist.loc[index,'popularity'] = int(popularity)

playlist

Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,year,popularity
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,-7.526,1,0.0783,0.393,0.041,0.117,0.219,83.856,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2017,54
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,-6.652,1,0.0548,0.443,0.00912,0.121,0.601,170.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2023,59
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,-6.53,1,0.028,0.00114,1e-06,0.151,0.203,144.218,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2018,76
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,-10.84,1,0.0804,0.824,0.000325,0.0953,0.534,71.771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2020,57
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,-9.643,1,0.0301,0.599,0.0,0.112,0.379,169.909,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2023,80
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,-8.256,1,0.0349,0.047,0.0286,0.0995,0.272,108.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2014,68
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,-10.836,1,0.033,0.45,0.0503,0.162,0.224,129.57,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,2011,68
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,-12.793,1,0.0455,0.411,0.00107,0.0941,0.224,106.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2017,64
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,-11.475,1,0.0308,0.805,0.884,0.126,0.126,117.053,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2017,78
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,-8.243,1,0.0346,0.0404,7e-05,0.112,0.264,135.188,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2015,62


In [104]:
# make buckets based on year to match item-feature matrix of 1M+ song database
 
# Make columns for each time period
playlist['year_2000-2004'] = playlist['year'].apply(lambda year: 1 if year>=2000 and year<2005 else 0)
playlist['year_2005-2009'] = playlist['year'].apply(lambda year: 1 if year>=2005 and year<2010 else 0)
playlist['year_2010-2014'] = playlist['year'].apply(lambda year: 1 if year>=2010 and year<2015 else 0)
playlist['year_2015-2019'] = playlist['year'].apply(lambda year: 1 if year>=2015 and year<2020 else 0)
playlist['year_2020-2023'] = playlist['year'].apply(lambda year: 1 if year>=2020 and year<2024 else 0)
 
# Drop year column, no longer needed
playlist = playlist.drop(columns=['year'])

playlist

Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,popularity,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,-7.526,1,0.0783,0.393,0.041,0.117,0.219,83.856,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,54,0,0,0,1,0
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,-6.652,1,0.0548,0.443,0.00912,0.121,0.601,170.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,59,0,0,0,0,1
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,-6.53,1,0.028,0.00114,1e-06,0.151,0.203,144.218,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,76,0,0,0,1,0
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,-10.84,1,0.0804,0.824,0.000325,0.0953,0.534,71.771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,57,0,0,0,0,1
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,-9.643,1,0.0301,0.599,0.0,0.112,0.379,169.909,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,80,0,0,0,0,1
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,-8.256,1,0.0349,0.047,0.0286,0.0995,0.272,108.043,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,68,0,0,1,0,0
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,-10.836,1,0.033,0.45,0.0503,0.162,0.224,129.57,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,68,0,0,1,0,0
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,-12.793,1,0.0455,0.411,0.00107,0.0941,0.224,106.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,64,0,0,0,1,0
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,-11.475,1,0.0308,0.805,0.884,0.126,0.126,117.053,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,78,0,0,0,1,0
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,-8.243,1,0.0346,0.0404,7e-05,0.112,0.264,135.188,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,62,0,0,0,1,0
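
The five lambda columns above can also be produced in one pass. The following is a minimal sketch on hypothetical toy data (the `toy` frame and its years are made up for illustration), assuming the same bucket boundaries as the cell above; it uses `pd.cut` with left-inclusive bins and then one-hot encodes the result.

```python
import pandas as pd

# Hypothetical toy data; the notebook itself works on the playlist dataframe.
toy = pd.DataFrame({'year': [2001, 2009, 2014, 2017, 2023, 1995]})

# right=False makes each bin left-inclusive: [2000, 2005), [2005, 2010), ...
edges = [2000, 2005, 2010, 2015, 2020, 2024]
labels = ['2000-2004', '2005-2009', '2010-2014', '2015-2019', '2020-2023']
buckets = pd.cut(toy['year'], bins=edges, labels=labels, right=False)

# One-hot encode the buckets; a year outside every bin (1995 here) becomes an all-zero row.
year_dummies = pd.get_dummies(buckets, prefix='year', dtype=int)

# Replace the raw year column with the bucket columns.
toy = toy.drop(columns=['year']).join(year_dummies)
```

Because the buckets are categorical, every `year_*` column is created even when no song falls into it, which keeps the column set identical to the database matrix.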


In [105]:
# Apply the same scaling to the playlist dataframe so its feature values are normalized too.
# Append one row of known minimums and one row of known maximums so that
# MinMaxScaler scales against fixed ranges rather than the playlist's own min/max.
min_row = {'popularity': 0, 'loudness': -60, 'tempo': 0}
max_row = {'popularity': 100, 'loudness': 0, 'tempo': 250}

min_row_df = pd.DataFrame([min_row])
max_row_df = pd.DataFrame([max_row])

playlist = pd.concat([playlist, min_row_df], ignore_index=True)
playlist = pd.concat([playlist, max_row_df], ignore_index=True)

# scale popularity, loudness, and tempo features to 0-1
scale = ['popularity', 'loudness', 'tempo']
scaler = MinMaxScaler()
playlist[scale] = scaler.fit_transform(playlist[scale])

# drop the two helper rows (min and max) appended above
playlist = playlist.iloc[:-2]

playlist

Unnamed: 0,Title,Artist,uri,danceability,energy,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,popularity,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023
0,Beige,Yoke Lore,spotify:track:5bs5GopDitBx9xjoHHRDoo,0.47,0.67,0.874567,1.0,0.0783,0.393,0.041,0.117,0.219,0.335424,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.54,0.0,0.0,0.0,1.0,0.0
1,Letting Go,Angie McMahon,spotify:track:2XHznZZIWLkh7xO3WQAjpp,0.485,0.753,0.889133,1.0,0.0548,0.443,0.00912,0.121,0.601,0.680172,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.59,0.0,0.0,0.0,0.0,1.0
2,Silver Lining,Mt. Joy,spotify:track:0i5QVxsK3IvEDbUjTA64Li,0.541,0.616,0.891167,1.0,0.028,0.00114,1e-06,0.151,0.203,0.576872,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.76,0.0,0.0,0.0,1.0,0.0
3,Weekend,"Sumbuck, Savannah Conley",spotify:track:2TEZu0Rk7Rr6aEARBCMmhj,0.715,0.332,0.819333,1.0,0.0804,0.824,0.000325,0.0953,0.534,0.287084,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.57,0.0,0.0,0.0,0.0,1.0
4,You’re Gonna Go Far,Noah Kahan,spotify:track:4nHJcUtNSUVjXRnjdP29Bk,0.59,0.36,0.839283,1.0,0.0301,0.599,0.0,0.112,0.379,0.679636,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.8,0.0,0.0,0.0,0.0,1.0
5,Mess Is Mine,Vance Joy,spotify:track:29jtZGdgpE2lWm2mkIt6HS,0.595,0.723,0.8624,1.0,0.0349,0.047,0.0286,0.0995,0.272,0.432172,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.68,0.0,0.0,1.0,0.0,0.0
6,Old Pine,Ben Howard,spotify:track:3CAX47TnPqTujLIQTw8nwI,0.401,0.364,0.8194,1.0,0.033,0.45,0.0503,0.162,0.224,0.51828,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.68,0.0,0.0,1.0,0.0,0.0
7,The Woods,Hollow Coves,spotify:track:5377z0OljWvRR7CdSQrJxP,0.802,0.41,0.786783,1.0,0.0455,0.411,0.00107,0.0941,0.224,0.4242,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.64,0.0,0.0,0.0,1.0,0.0
8,Anchor,Novo Amor,spotify:track:7qH9Z4dJEN0l9bidizW7fq,0.457,0.407,0.80875,1.0,0.0308,0.805,0.884,0.126,0.126,0.468212,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.78,0.0,0.0,0.0,1.0,0.0
9,Sedona,Houndmouth,spotify:track:65T1aY3I9qfNUDVAnaM9bq,0.394,0.654,0.862617,1.0,0.0346,0.0404,7e-05,0.112,0.264,0.540752,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.62,0.0,0.0,0.0,1.0,0.0
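
Since the minimum and maximum of each feature are fixed in advance, the same scaling can be written directly as (x - min) / (max - min), without appending helper rows. This is a minimal sketch, not the notebook's method; `BOUNDS` and `scale_fixed` are hypothetical names, and the ranges are the ones used in the cell above.

```python
import pandas as pd

# Assumed fixed ranges, copied from the min/max rows in the cell above.
BOUNDS = {'popularity': (0, 100), 'loudness': (-60, 0), 'tempo': (0, 250)}

def scale_fixed(df, bounds=BOUNDS):
    """Return a copy of df where each listed column is rescaled so lo -> 0 and hi -> 1."""
    out = df.copy()
    for col, (lo, hi) in bounds.items():
        out[col] = (out[col] - lo) / (hi - lo)
    return out

# First playlist row from the output above (Beige, Yoke Lore).
toy = pd.DataFrame({'popularity': [54], 'loudness': [-7.526], 'tempo': [83.856]})
scaled = scale_fixed(toy)
```

For the first playlist row this reproduces the values in the table above (loudness 0.874567, tempo 0.335424, popularity 0.54). A value outside the assumed range would land outside 0-1, so clipping may be worth adding if that can happen.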


In [106]:
# Sort the columns of both dataframes alphabetically so that each column of the playlist
# lines up with the same column of the song database for the cosine similarity algorithm
playlist = playlist.sort_index(axis=1)
feat_vec = feat_vec.sort_index(axis=1)

# Cosine similarity needs numerical values only, so drop the track_id column from the database
feat_vec_cosine_sim = feat_vec.drop('track_id', axis=1)

# Drop Artist, Title, and uri from the playlist dataframe as well, since they are not numerical
columns_dropped = ['Artist', 'Title', 'uri']
playlist_cosine_sim = playlist.drop(columns_dropped, axis=1)
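
Sorting both dataframes only lines the columns up if they already contain exactly the same set of columns. A minimal sketch of a stricter alignment, on hypothetical miniature frames (`db` and `pl` are made up for illustration), uses `reindex` to force the playlist into the database's column order and then checks the result.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the song database and the playlist.
db = pd.DataFrame({'energy': [0.3, 0.8], 'genre_folk': [1, 0], 'genre_pop': [0, 1]})
pl = pd.DataFrame({'genre_folk': [1, 1], 'energy': [0.5, 0.7]})

# Reorder the playlist columns to match the database; columns the playlist lacks are filled with 0.
pl_aligned = pl.reindex(columns=db.columns, fill_value=0)

# Fail loudly if the two matrices still disagree on column order.
if list(pl_aligned.columns) != list(db.columns):
    raise ValueError('playlist and database columns do not match')
```

Filling a missing genre column with 0 is an assumption that suits one-hot features; it would not be appropriate for a missing audio feature such as tempo.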


In [107]:
# Calculate column averages of the playlist dataframe
column_averages = playlist_cosine_sim.mean()

# Create a new DataFrame for the averages and totals
averages_cosine_sim = pd.DataFrame([column_averages], index=['Average'])

averages_cosine_sim

Unnamed: 0,acousticness,danceability,energy,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,instrumentalness,liveness,loudness,mode,popularity,speechiness,tempo,valence,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023
Average,0.358024,0.5695,0.5396,0.04,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.64,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.56,0.0,0.36,0.0,0.0,0.0,0.0,0.52,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.03578,0.150602,0.855976,0.9,0.5786,0.039242,0.465931,0.39978,0.02,0.02,0.26,0.48,0.22


In [108]:
# Generate similarity scores between every song in the database and the playlist average
similarity_scores = cosine_similarity(feat_vec_cosine_sim, averages_cosine_sim)
 
feat_vec['similarity_score'] = similarity_scores
 
# Sort from highest to lowest similarity score so the best matches come first
top_similarities = feat_vec.sort_values(by='similarity_score', ascending=False)

# Remove songs that are already in the user's playlist.
# track_id holds bare IDs while playlist['uri'] looks like 'spotify:track:<id>',
# so strip the prefix before comparing; otherwise nothing would ever match.
playlist_ids = playlist['uri'].str.split(':').str[-1]
top_similarities = top_similarities[~top_similarities['track_id'].isin(playlist_ids)]


# get song recs from top 3 genres
first_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[0]] == 1].head(45)
second_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[1]] == 1].head(30)
third_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[2]] == 1].head(15)

top_similarities = pd.concat([first_genre, second_genre, third_genre], ignore_index=True)

top_similarities

Unnamed: 0,acousticness,danceability,energy,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,instrumentalness,liveness,loudness,mode,popularity,speechiness,tempo,track_id,valence,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023,similarity_score
0,0.456,0.705,0.780,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00152,0.1150,0.812020,1.0,0.64,0.0805,0.568112,19cL3SOKpwnwoKkII7U3Wh,0.457,0.0,0.0,0.0,1.0,0.0,0.875443
1,0.484,0.637,0.864,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.2220,0.840522,1.0,0.68,0.0468,0.399980,2RiBogNRfulkNf7fVbPOrJ,0.706,0.0,0.0,0.0,1.0,0.0,0.870748
2,0.286,0.673,0.735,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.2420,0.835202,1.0,0.75,0.0457,0.462976,4ofwffwvvnbSkrMSCKQDaC,0.754,0.0,0.0,0.0,1.0,0.0,0.870691
3,0.536,0.704,0.545,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.1250,0.792707,1.0,0.60,0.0323,0.439812,463XKKFCPlcrhtlwGbEovu,0.589,0.0,0.0,0.0,1.0,0.0,0.870220
4,0.305,0.612,0.725,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07840,0.0548,0.822070,1.0,0.62,0.0416,0.584032,4gsR34XSIE2fUY4odwZqym,0.333,0.0,0.0,0.0,1.0,0.0,0.870019
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
85,0.448,0.747,0.760,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.1530,0.841836,1.0,0.64,0.1870,0.399900,3Lfiu5sZ4M4B6JaKMBc0FU,0.682,0.0,0.0,0.0,1.0,0.0,0.848837
86,0.643,0.785,0.702,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.1750,0.838466,1.0,0.54,0.0464,0.464084,3JTtZUOiZljuWbNiasfHB6,0.569,0.0,0.0,0.0,1.0,0.0,0.848808
87,0.277,0.531,0.834,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.1310,0.847126,1.0,0.77,0.0520,0.568104,1iRvhKiXRElIH2Uf4gd95P,0.526,0.0,0.0,0.0,1.0,0.0,0.848710
88,0.316,0.549,0.705,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00000,0.1200,0.822659,1.0,0.62,0.0564,0.584096,0gMW8XpPFPjoApDii5Tj1u,0.520,0.0,0.0,0.0,1.0,0.0,0.848501
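
The similarity score is the cosine of the angle between a song's feature vector and the playlist's average vector: their dot product divided by the product of their lengths. The following is a minimal NumPy sketch of that formula for a single profile vector; `cosine_to_profile` is a hypothetical helper written for illustration, not part of scikit-learn.

```python
import numpy as np

def cosine_to_profile(matrix, profile):
    """Cosine similarity between each row of `matrix` and the 1-D `profile`."""
    matrix = np.asarray(matrix, dtype=float)
    profile = np.asarray(profile, dtype=float)
    dots = matrix @ profile
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(profile)
    # Rows with zero length get a score of 0 instead of a division-by-zero NaN.
    return np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)

# Toy example: three "songs" with two features each, compared against one profile.
songs = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
profile = np.array([1.0, 0.0])
scores = cosine_to_profile(songs, profile)
```

Here the first song points in the same direction as the profile (score 1), the second is at 45 degrees (about 0.707), and the all-zero song scores 0. Because cosine similarity ignores vector length, only the relative mix of features matters, not their overall magnitude.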


In [109]:
# Find the track name, artist, and 30s audio preview of each song using its track_id
track_names = []
artist_names = []
preview_urls = []

# Look up each recommended track by ID (referenced by column name rather than position)
for track_id in top_similarities['track_id']:
    track_info = sp.track(track_id)
    track_names.append(track_info['name'])
    artist_names.append(track_info['artists'][0]['name'])
    preview_urls.append(track_info['preview_url'])

# Add the looked-up values as new columns
top_similarities['track'] = track_names
top_similarities['artist'] = artist_names
top_similarities['preview'] = preview_urls
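
The loop above makes one API request per song. spotipy also provides `sp.tracks`, which accepts a list of track IDs and returns a dict with a `'tracks'` list, so the lookups can be batched. This is a minimal sketch assuming a limit of 50 IDs per request; `fetch_track_metadata` and `FakeClient` are hypothetical names, and the fake client exists only so the example runs without network access.

```python
def fetch_track_metadata(sp, track_ids, batch_size=50):
    """Return (name, first artist, preview_url) tuples, one per track ID, in order."""
    rows = []
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        for info in sp.tracks(batch)['tracks']:
            rows.append((info['name'], info['artists'][0]['name'], info.get('preview_url')))
    return rows


class FakeClient:
    """Hypothetical stand-in for a spotipy client so the sketch runs offline."""

    def __init__(self):
        self.calls = 0

    def tracks(self, ids):
        self.calls += 1
        return {'tracks': [{'name': f'song-{i}', 'artists': [{'name': 'artist'}],
                            'preview_url': None} for i in ids]}


client = FakeClient()
rows = fetch_track_metadata(client, [str(n) for n in range(120)])
```

With 120 IDs the sketch issues three requests instead of 120. In the notebook, the real `sp` client and `top_similarities['track_id'].tolist()` would take the place of the fake client and the generated IDs.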

In [110]:
# Get the genres of each recommended track from its artist
artist_names = top_similarities['artist'].tolist()
    
# Look up the genres for each artist; with n_jobs=1 the lookups run sequentially
genres = Parallel(n_jobs=1)(delayed(process_artist)(artist_name, sp) for artist_name in artist_names)

# Add genres to the dataframe
top_similarities['genre'] = genres

# Genre keywords for regional or non-English-language music to exclude from the recommendations
ethnic_genres = ['colombia', 'latin', 'mexican', 'puerto rican', 'dominican', 'italian', 'spanish', 'brasil', 'argentine', 'anime', 'japanese', 'indonesian', 'vietnamese', 'korean', 'chinese', 'taiwan']
    
# Remove any songs whose genre string contains one of these keywords (songs with a missing genre are kept)
mask = top_similarities['genre'].str.contains('|'.join(ethnic_genres), case=False, na=False)
top_similarities = top_similarities[~mask]
    

# 15 songs from 1st genre, 10 songs from 2nd genre, 5 songs from 3rd genre
first_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[0]] == 1].head(15)
second_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[1]] == 1].head(10)
third_genre = top_similarities.loc[top_similarities['genre_'+top_3_genres[2]] == 1].head(5)
top_similarities = pd.concat([first_genre, second_genre, third_genre], ignore_index=True)
    
top_similarities

Unnamed: 0,acousticness,danceability,energy,genre_acoustic,genre_alt-rock,genre_ambient,genre_blues,genre_chill,genre_classical,genre_club,genre_country,genre_dance,genre_dancehall,genre_disco,genre_dub,genre_edm,genre_electro,genre_emo,genre_folk,genre_funk,genre_gospel,genre_goth,genre_groove,genre_guitar,genre_hip-hop,genre_house,genre_indie,genre_jazz,genre_k-pop,genre_metal,genre_opera,genre_party,genre_piano,genre_pop,genre_punk,genre_rock,genre_rock-n-roll,genre_romance,genre_sad,genre_salsa,genre_samba,genre_singer-songwriter,genre_sleep,genre_songwriter,genre_soul,genre_spanish,genre_tango,genre_techno,instrumentalness,liveness,loudness,mode,popularity,speechiness,tempo,track_id,valence,year_2000-2004,year_2005-2009,year_2010-2014,year_2015-2019,year_2020-2023,similarity_score,track,artist,preview,genre
0,0.456,0.705,0.78,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00152,0.115,0.81202,1.0,0.64,0.0805,0.568112,19cL3SOKpwnwoKkII7U3Wh,0.457,0.0,0.0,0.0,1.0,0.0,0.875443,Geronimo,Sheppard,https://p.scdn.co/mp3-preview/c6051dbaff789b8f...,"australian indie, folk-pop"
1,0.484,0.637,0.864,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222,0.840522,1.0,0.68,0.0468,0.39998,2RiBogNRfulkNf7fVbPOrJ,0.706,0.0,0.0,0.0,1.0,0.0,0.870748,Saturday Sun,Vance Joy,https://p.scdn.co/mp3-preview/342a7984570355ae...,"folk-pop, modern rock"
2,0.286,0.673,0.735,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.242,0.835202,1.0,0.75,0.0457,0.462976,4ofwffwvvnbSkrMSCKQDaC,0.754,0.0,0.0,0.0,1.0,0.0,0.870691,Shotgun,George Ezra,https://p.scdn.co/mp3-preview/3d87ba7cbe8d7c74...,"folk-pop, neo-singer-songwriter"
3,0.305,0.612,0.725,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0784,0.0548,0.82207,1.0,0.62,0.0416,0.584032,4gsR34XSIE2fUY4odwZqym,0.333,0.0,0.0,0.0,1.0,0.0,0.870019,Seventeen,Sjowgren,https://p.scdn.co/mp3-preview/d4c65a009037f110...,"indie pop, indie poptimism"
4,0.513,0.563,0.606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0391,0.132,0.783428,1.0,0.55,0.0267,0.6,1rCPg5GOtes0FIo1BzgvUi,0.425,0.0,0.0,0.0,1.0,0.0,0.869853,A Trick of the Light,Villagers,https://p.scdn.co/mp3-preview/3819d412866b3ad3...,"chamber pop, indie folk, irish rock"
5,0.461,0.691,0.642,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0775,0.772124,1.0,0.51,0.0772,0.5799,3Ttylc1DWh3GVKP2BzTi1s,0.577,0.0,0.0,0.0,1.0,0.0,0.869806,Outgrown,Dermot Kennedy,https://p.scdn.co/mp3-preview/83b3245082f4d571...,"folk-pop, irish pop, uk pop"
6,0.597,0.614,0.527,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.11,0.810403,1.0,0.71,0.0302,0.487764,42bbDWZ8WmXTH7PkYAlGLu,0.354,0.0,0.0,0.0,1.0,0.0,0.869409,Hold My Girl,George Ezra,https://p.scdn.co/mp3-preview/c17499c08f817abd...,"folk-pop, neo-singer-songwriter"
7,0.274,0.793,0.636,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.096,0.836487,1.0,0.49,0.0358,0.520056,1lvO0JnHYoR2mF1UnkpViN,0.546,0.0,0.0,0.0,1.0,0.0,0.869234,Freida,Morningsiders,https://p.scdn.co/mp3-preview/cfab3b4a26d2f445...,folk-pop
8,0.32,0.639,0.598,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0971,0.819334,1.0,0.53,0.0354,0.471484,5QjoIwD2oUBUUlrHpZa1nZ,0.379,0.0,0.0,0.0,1.0,0.0,0.868936,Bust Your Kneecaps,Pomplamoose,https://p.scdn.co/mp3-preview/d2f056fab63e99e9...,folk-pop
9,0.378,0.628,0.844,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000713,0.225,0.806776,1.0,0.49,0.0324,0.57606,0TD0ydYJuFPEaqshquDEpw,0.607,0.0,0.0,0.0,1.0,0.0,0.868773,Valentina,The Hunts,https://p.scdn.co/mp3-preview/4f06473edda5d3fb...,"folk-pop, hampton roads indie"
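
The keyword filter in the cell above works by building one regular expression out of the keyword list and testing each genre string against it. The following is a minimal sketch on hypothetical toy data (`recs` and `blocked` are made up for illustration) showing how the mask behaves, including for a row with no genre.

```python
import re
import pandas as pd

# Hypothetical toy recommendations; row 'c' has no genre information.
recs = pd.DataFrame({
    'track': ['a', 'b', 'c', 'd'],
    'genre': ['folk-pop, modern rock', 'latin pop', None, 'K-Pop, korean r&b'],
})
blocked = ['latin', 'korean']  # hypothetical short keyword list

# Escape each keyword so it is matched literally, then join them as alternatives.
pattern = '|'.join(re.escape(word) for word in blocked)

# case=False ignores capitalization; na=False treats a missing genre as "no match".
mask = recs['genre'].str.contains(pattern, case=False, na=False)
kept = recs[~mask].reset_index(drop=True)
```

Only tracks `a` and `c` survive here. Note that this is substring matching, so a keyword also matches inside longer genre names, which may or may not be the intent.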


In [114]:
# Show only the columns that are useful to the user
display_features = ['track', 'artist', 'similarity_score', 'genre', 'preview']

# Copy the selection so the assignment below modifies an independent dataframe
# rather than a slice of top_similarities (this avoids pandas' SettingWithCopyWarning)
playlist_recs = top_similarities[display_features].copy()

# Express the similarity score as a percentage
playlist_recs['similarity_score'] = (playlist_recs['similarity_score']*100).round(2)

playlist_recs


Unnamed: 0,track,artist,similarity_score,genre,preview
0,Geronimo,Sheppard,87.54,"australian indie, folk-pop",https://p.scdn.co/mp3-preview/c6051dbaff789b8f...
1,Saturday Sun,Vance Joy,87.07,"folk-pop, modern rock",https://p.scdn.co/mp3-preview/342a7984570355ae...
2,Shotgun,George Ezra,87.07,"folk-pop, neo-singer-songwriter",https://p.scdn.co/mp3-preview/3d87ba7cbe8d7c74...
3,Seventeen,Sjowgren,87.0,"indie pop, indie poptimism",https://p.scdn.co/mp3-preview/d4c65a009037f110...
4,A Trick of the Light,Villagers,86.99,"chamber pop, indie folk, irish rock",https://p.scdn.co/mp3-preview/3819d412866b3ad3...
5,Outgrown,Dermot Kennedy,86.98,"folk-pop, irish pop, uk pop",https://p.scdn.co/mp3-preview/83b3245082f4d571...
6,Hold My Girl,George Ezra,86.94,"folk-pop, neo-singer-songwriter",https://p.scdn.co/mp3-preview/c17499c08f817abd...
7,Freida,Morningsiders,86.92,folk-pop,https://p.scdn.co/mp3-preview/cfab3b4a26d2f445...
8,Bust Your Kneecaps,Pomplamoose,86.89,folk-pop,https://p.scdn.co/mp3-preview/d2f056fab63e99e9...
9,Valentina,The Hunts,86.88,"folk-pop, hampton roads indie",https://p.scdn.co/mp3-preview/4f06473edda5d3fb...
