# Project: Song Recommender

### WebScraping III: revenge of the prototype

----------------------------------------------------------------------------------------------------
Following the prototype discussed in class, build an MVP, where the client will input a song, and the app will check if that song is in top 100 list. If it is, you will recommend another song from billboard 100, if it isn't, you will recommend a random song (for now) from another website, source of music boards.

----------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------
In the two previous labs, 'Lab | Web Scraping Single Page' and 'Lab | Web Scraping Multiple Pages' we created two csv files, hot_songs and other_songs, which we'll use to build the MVP

----------------------------------------------------------------------------------------------------------

In [4]:
# Importing the necessary libraries
import pandas as pd

In [2]:
#Importing the hot_songs csv file into a variable
hot_songs = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\hot_songs.csv")
hot_songs

Unnamed: 0,Artist,Song
0,Teddy Swims,Lose Control
1,Benson Boone,Beautiful Things
2,Ariana Grande,We Can't Be Friends (Wait For Your Love)
3,Jack Harlow,Lovin On Me
4,¥$: Ye & Ty Dolla $ign Featuring Rich The Kid ...,Carnival
...,...,...
95,Bryan Martin,We Ride
96,Zach Bryan,Tourniquet
97,FloyyMenor X Cris Mj,Gata Only
98,Kali Uchis & Peso Pluma,Igual Que Un Angel


In [3]:
#Importing the other_songs csv file into a variable
other_songs = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\other_songs.csv")
other_songs

Unnamed: 0,Artist,Song
0,Taylor Swift,All Too Well (10 Minute Version) (Taylor's Ver...
1,Rhiannon Giddens & Francesco Turrisi,Avalon
2,Anna B Savage,Baby Grand
3,Moneybagg Yo,Wockesha
4,Snail Mail,Automate
...,...,...
195,ROSALÍA,SAOKO
196,Alex G,Runner
197,Bad Bunny,El Apagón
198,Beyoncé,ALIEN SUPERSTAR


In [4]:
# Randomly sample one row (song and its artist)
random_song_row = hot_songs.sample(n=1)

# Extract the artist and song information from the sampled row
artist = random_song_row['Artist'].iloc[0]  # Use .iloc[0] to access the first (and only) item
song = random_song_row['Song'].iloc[0]

# Display the sampled song and artist
print(f"Artist: {artist}")
print(f"Song: {song}")

Artist: Peso Pluma & Anitta
Song: Bellakeo


In [11]:
# Function that checks if the input song is in the billoard 100, if so it return another hot song, 
# if not it returns another song from other_songs list
def song_recommender(song_name, artist_name, hot_songs, other_songs):
    # Check if the song and artist are in hot_songs
    is_in_hot_songs = hot_songs[(hot_songs['Song'] == song_name) & (hot_songs['Artist'] == artist_name)]
    
    # If the song is in hot_songs, recommend another song from hot_songs
    if not is_in_hot_songs.empty:
        random_song_row = hot_songs.sample(n=1)
    else:
        # If not, recommend a song from other_songs
        random_song_row = other_songs.sample(n=1)
    
    # Extract the artist and song information from the sampled row
    artist = random_song_row['Artist'].iloc[0]
    song = random_song_row['Song'].iloc[0]
    
    # Prepare the return string based on the source of the recommendation
    source = "billboard 100" if not is_in_hot_songs.empty else "somewhere else"
    return f"another song from {source}: {song}, {artist}"

In [14]:
song_name = input('Write down a song:')
artist_name = input('Write down the artist:')
song_recommender(song_name, artist_name, hot_songs, other_songs)

Write down a song:Schism
Write down the artist:Tool


'another song from somewhere else: Avalon, Rhiannon Giddens & Francesco Turrisi'

### Using Unspervised Learning

-----------------------------------------------------------------------------------------------------
It's the moment to perform clustering on the songs you collected. Remember that the ultimate goal of this little project is to improve the recommendations of artists. Clustering the songs will allow the recommendation system to limit the scope of the recommendations to only songs that belong to the same cluster - songs with similar audio features.

The experiments you did with the Spotify API and the Billboard web scraping will allow you to create a pipeline such that when the user enters a song, you:

1. Check whether or not the song is in the Billboard Hot 200.
2. Collect the audio features from the Spotify API.
After that, you want to send the Spotify audio features of the submitted song to the clustering model, which should return a cluster number.

We want to have as many songs as possible to create the clustering model, so we will add the songs you collected to a bigger dataset available on Kaggle containing 160 thousand songs.

------------------------------------------------------------------------------------------------

In [1]:
import spotipy
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from spotipy.oauth2 import SpotifyClientCredentials
from sklearn.metrics import pairwise_distances_argmin_min

In [2]:
#Initialize SpotiPy with user credentias
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id="c43ae4f18c0d4b2c8b04c93649fa4b72",
                                                           client_secret="1f7865f1e7aa439e9e997bc38b591855"))

---------------------------------------------------------------------------------------------------
Let's start with importing the spotify tracks we extracted from spotify.

-------------------------------------------------------------------------------------------------

###### 1. Import dataframes

In [5]:
# Importing the track features csv file
spotify_track_features = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\files_for_lab\spotify_track_features.csv")
spotify_track_features

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.695,0.700,1,-1.587,0,0.0332,0.104000,0.000000,0.192,0.461,94.959,audio_features,2ikmBwZKZr0ahGcX4x8qtj,spotify:track:2ikmBwZKZr0ahGcX4x8qtj,https://api.spotify.com/v1/tracks/2ikmBwZKZr0a...,https://api.spotify.com/v1/audio-analysis/2ikm...,183296,4
1,0.593,0.741,4,-4.353,0,0.0359,0.022100,0.000000,0.393,0.460,96.978,audio_features,2MxErftY5S07dFtIdxQOSF,spotify:track:2MxErftY5S07dFtIdxQOSF,https://api.spotify.com/v1/tracks/2MxErftY5S07...,https://api.spotify.com/v1/audio-analysis/2MxE...,220670,4
2,0.660,0.765,2,-6.217,1,0.0299,0.125000,0.000956,0.235,0.681,123.051,audio_features,19meO0ADnoTjRuBMXZCdbs,spotify:track:19meO0ADnoTjRuBMXZCdbs,https://api.spotify.com/v1/tracks/19meO0ADnoTj...,https://api.spotify.com/v1/audio-analysis/19me...,175333,4
3,0.577,0.891,0,-4.672,1,0.0359,0.001230,0.000000,0.114,0.846,144.989,audio_features,3OPyobYAM5MgTm35AJV99O,spotify:track:3OPyobYAM5MgTm35AJV99O,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,https://api.spotify.com/v1/audio-analysis/3OPy...,155707,4
4,0.531,0.693,6,-5.203,0,0.0374,0.009310,0.000003,0.119,0.555,157.960,audio_features,4nDfJDZaUVtwOSnGROb2GN,spotify:track:4nDfJDZaUVtwOSnGROb2GN,https://api.spotify.com/v1/tracks/4nDfJDZaUVtw...,https://api.spotify.com/v1/audio-analysis/4nDf...,164453,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3003,0.406,0.807,7,-3.871,1,0.0507,0.008350,0.000000,0.118,0.290,159.713,audio_features,38ODYA4I5jEhFr4xJJd1RG,spotify:track:38ODYA4I5jEhFr4xJJd1RG,https://api.spotify.com/v1/tracks/38ODYA4I5jEh...,https://api.spotify.com/v1/audio-analysis/38OD...,226861,3
3004,0.502,0.961,9,-4.389,1,0.0905,0.000075,0.000002,0.124,0.281,110.028,audio_features,3oGNDHK33fp1GMqU9e4HQ7,spotify:track:3oGNDHK33fp1GMqU9e4HQ7,https://api.spotify.com/v1/tracks/3oGNDHK33fp1...,https://api.spotify.com/v1/audio-analysis/3oGN...,234550,4
3005,0.639,0.832,1,-4.976,1,0.1180,0.003080,0.000382,0.121,0.482,119.045,audio_features,2jPqRiw1kJvxDKIibCPhHu,spotify:track:2jPqRiw1kJvxDKIibCPhHu,https://api.spotify.com/v1/tracks/2jPqRiw1kJvx...,https://api.spotify.com/v1/audio-analysis/2jPq...,166467,4
3006,0.741,0.810,11,-5.808,0,0.1650,0.002650,0.018400,0.131,0.799,132.076,audio_features,0mH0iiNINYULYFwszeqWnW,spotify:track:0mH0iiNINYULYFwszeqWnW,https://api.spotify.com/v1/tracks/0mH0iiNINYUL...,https://api.spotify.com/v1/audio-analysis/0mH0...,125455,4


In [6]:
# Importing the track data csv file
spotify_track_data = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\files_for_lab\spotify_track_data.csv")
spotify_track_data

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,href,id,is_local,name,popularity,...,album.name,album.release_date,album.release_date_precision,album.total_tracks,album.type,album.uri,external_ids.isrc,external_urls.spotify,artist_name,artist_id
0,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,198123,False,https://api.spotify.com/v1/tracks/7hgIaQykdol1...,7hgIaQykdol1sWnj1uqBup,False,CULT CLASSIC,31,...,Cult Classic,2023-11-10,day,6,album,spotify:album:3vw9IZ3YV2T5bqYTYG0IXr,QMRSZ2302147,https://open.spotify.com/track/7hgIaQykdol1sWn...,Holy Wars,2dTOWcCL0cYviin0Uz1lj4
1,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,188808,False,https://api.spotify.com/v1/tracks/4pYiIZn2DKrK...,4pYiIZn2DKrK8MBYuS946R,False,BACKSTABBER,31,...,BACKSTABBER,2024-01-05,day,1,album,spotify:album:1RFijpHOUqGPZ3hdvcm8IM,QZWFH2374284,https://open.spotify.com/track/4pYiIZn2DKrK8MB...,"Ergo, Bria",0AF9HrL08aOaZPsIiO8GmA
2,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,195226,False,https://api.spotify.com/v1/tracks/3xeyZGEEVW8S...,3xeyZGEEVW8SZPmteR9Fw6,False,Love Goes On,31,...,Love Goes On,2023-05-12,day,1,album,spotify:album:30eJbXVjjtYtYtwkt337Sr,QZQAY2345784,https://open.spotify.com/track/3xeyZGEEVW8SZPm...,Kelsy Karter & The Heroines,2mAAO54PkHr3NjdlRpzEDl
3,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,93251,False,https://api.spotify.com/v1/tracks/1mj4y7NHlTq6...,1mj4y7NHlTq6YfTZOeNqOx,False,Tornillo,31,...,Tornillo,2023-11-16,day,1,album,spotify:album:6MSMis63C7wWLVSSmSl92b,USHR22316801,https://open.spotify.com/track/1mj4y7NHlTq6YfT...,Margaritas Podridas,5O9NicFLG2F9Xr7OHxmrb7
4,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,155706,True,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,3OPyobYAM5MgTm35AJV99O,False,you don't like me like that,31,...,you don't like me like that,2023-05-19,day,2,album,spotify:album:35YQAprIaYWKCWtr7iS5UT,USHR22316102,https://open.spotify.com/track/3OPyobYAM5MgTm3...,Zeph,502gYHkFCtLzBIcU4ctPLd
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3003,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,199906,True,https://api.spotify.com/v1/tracks/5W8YXBz9MTID...,5W8YXBz9MTIDyrpYaCg2Ky,False,Last Resort,81,...,Infest,2001-04-25,day,13,album,spotify:album:0BHa0ePkvGAVKymB4FU58m,USDW10021712,https://open.spotify.com/track/5W8YXBz9MTIDyrp...,Papa Roach,4RddZ3iHvSpGV4dvATac9X
3004,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,210240,False,https://api.spotify.com/v1/tracks/2DlHlPMa4M17...,2DlHlPMa4M17kufBvI2lEN,False,Chop Suey!,85,...,Toxicity,2001-09-04,day,15,album,spotify:album:6jWde94ln40epKIQCd8XUh,USSM10107256,https://open.spotify.com/track/2DlHlPMa4M17kuf...,System Of A Down,5eAWCfyUhZtHHtBdNk56l1
3005,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CL...",1,157333,False,https://api.spotify.com/v1/tracks/3K4HG9evC7dg...,3K4HG9evC7dg3N0R9cYqk4,False,One Step Closer,82,...,Hybrid Theory (Bonus Edition),2000,year,15,album,spotify:album:6hPkbAV3ZXpGZBGUvL6jVM,USWB10002399,https://open.spotify.com/track/3K4HG9evC7dg3N0...,Linkin Park,6XyY86QOPPrYVGvF9ch6wz
3006,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,225306,True,https://api.spotify.com/v1/tracks/6nJPHXRpKYv2...,6nJPHXRpKYv2yqtalEjKy5,False,Got the Life,73,...,Follow The Leader,1998-08-18,day,14,album,spotify:album:0gsiszk6JWYwAyGvaTTud4,USSM19801763,https://open.spotify.com/track/6nJPHXRpKYv2yqt...,Korn,3RNrq3jvMZxD9ZyoOZbQOD


------------------------------------------------------------------------------------------------
We need to get the audio features from the songs we extract on the Web Scraping lab.

----------------------------------------------------------------------------------------------

In [7]:
# Importing the other track features csv file
other_track_features = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\files_for_lab\other_track_features.csv")
other_track_features

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.864,0.476,1,-10.068,1,0.4360,0.03740,0.000000,0.3740,0.647,157.144,audio_features,1vrFJDrysqmsNAgyjBzx4f,spotify:track:1vrFJDrysqmsNAgyjBzx4f,https://api.spotify.com/v1/tracks/1vrFJDrysqms...,https://api.spotify.com/v1/audio-analysis/1vrF...,137580,4
1,0.545,0.641,10,-6.398,0,0.0998,0.00453,0.000066,0.1710,0.464,121.892,audio_features,1Hohk6AufHZOrrhMXZppax,spotify:track:1Hohk6AufHZOrrhMXZppax,https://api.spotify.com/v1/tracks/1Hohk6AufHZO...,https://api.spotify.com/v1/audio-analysis/1Hoh...,215460,4
2,0.629,0.698,8,-4.485,1,0.3080,0.05090,0.001660,0.0909,0.599,117.765,audio_features,0UvZcEfpzVyx47QsRbjyBz,spotify:track:0UvZcEfpzVyx47QsRbjyBz,https://api.spotify.com/v1/tracks/0UvZcEfpzVyx...,https://api.spotify.com/v1/audio-analysis/0UvZ...,201816,4
3,0.732,0.703,11,-7.224,1,0.0314,0.10400,0.041700,0.1090,0.836,108.979,audio_features,5DRnssBoVo8e7uAQZkNT8O,spotify:track:5DRnssBoVo8e7uAQZkNT8O,https://api.spotify.com/v1/tracks/5DRnssBoVo8e...,https://api.spotify.com/v1/audio-analysis/5DRn...,156772,4
4,0.827,0.768,0,-5.702,1,0.2650,0.79000,0.000024,0.4970,0.734,99.988,audio_features,2FYGZDfsAnNsrm1gVbyKnG,spotify:track:2FYGZDfsAnNsrm1gVbyKnG,https://api.spotify.com/v1/tracks/2FYGZDfsAnNs...,https://api.spotify.com/v1/audio-analysis/2FYG...,137533,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
148,0.645,0.531,8,-6.239,1,0.0272,0.08520,0.000004,0.1550,0.537,125.986,audio_features,6ZXH6YfcouRBWanewWxuuz,spotify:track:6ZXH6YfcouRBWanewWxuuz,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,https://api.spotify.com/v1/audio-analysis/6ZXH...,189027,3
149,0.582,0.537,0,-9.229,1,0.6190,0.12900,0.000000,0.0749,0.771,82.103,audio_features,0nqhKXDjsyBMvbeWmgijD0,spotify:track:0nqhKXDjsyBMvbeWmgijD0,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,https://api.spotify.com/v1/audio-analysis/0nqh...,180930,4
150,0.429,0.181,3,-12.005,1,0.0412,0.73400,0.000008,0.1010,0.259,80.106,audio_features,1AU6deCmEZI4di2DzzEt0U,spotify:track:1AU6deCmEZI4di2DzzEt0U,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,https://api.spotify.com/v1/audio-analysis/1AU6...,316170,4
151,0.569,0.480,7,-7.533,1,0.0412,0.78100,0.082200,0.1000,0.353,123.700,audio_features,3cJI6VFdyRdriDVwB0sU3Y,spotify:track:3cJI6VFdyRdriDVwB0sU3Y,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,https://api.spotify.com/v1/audio-analysis/3cJI...,234827,5


In [8]:
# Importing the other track data csv file
other_track_data = pd.read_csv(r"C:\Users\mafal\Documents\ironhack\projects\project-song-recommender\files_for_lab\other_track_data.csv")
other_track_data

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,href,id,is_local,name,popularity,...,album.release_date,album.release_date_precision,album.total_tracks,album.type,album.uri,external_ids.isrc,external_urls.spotify,artist_name,artist_id,artist_genre
0,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,137579,True,https://api.spotify.com/v1/tracks/1vrFJDrysqms...,1vrFJDrysqmsNAgyjBzx4f,False,F.N.F. (Let's Go),59,...,2022-05-03,day,1,album,spotify:album:1FkcZKerCfWg4nUItVHf9B,QZRD92201643,https://open.spotify.com/track/1vrFJDrysqmsNAg...,Hitkidd,5pR1zWq3UPsOpW1pTWayLf,['memphis hip hop']
1,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,215459,True,https://api.spotify.com/v1/tracks/1Hohk6AufHZO...,1Hohk6AufHZOrrhMXZppax,False,ALIEN SUPERSTAR,73,...,2022-07-29,day,16,album,spotify:album:6FJxoadUE4JNVwWHghBwnb,USSM12206231,https://open.spotify.com/track/1Hohk6AufHZOrrh...,Beyoncé,6vWDO969PvNqNYHIOW5v0m,"['pop', 'r&b']"
2,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,201816,True,https://api.spotify.com/v1/tracks/0UvZcEfpzVyx...,0UvZcEfpzVyx47QsRbjyBz,False,El Apagón,70,...,2022-05-06,day,23,album,spotify:album:3RQQmkQEvNCY4prGKE6oc5,QM6MZ2214890,https://open.spotify.com/track/0UvZcEfpzVyx47Q...,Bad Bunny,4q3ewBCX7sLwd24euuV69X,"['reggaeton', 'trap latino', 'urbano latino']"
3,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,156772,False,https://api.spotify.com/v1/tracks/5DRnssBoVo8e...,5DRnssBoVo8e7uAQZkNT8O,False,Runner,54,...,2022-09-23,day,13,album,spotify:album:6TzgWk5HZItbFmMT7hH4bU,GBCEL2200057,https://open.spotify.com/track/5DRnssBoVo8e7uA...,Alex G,6lcwlkAjBPSKnFBZjjZFJs,"['philly indie', 'pov: indie']"
4,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,137533,True,https://api.spotify.com/v1/tracks/2FYGZDfsAnNs...,2FYGZDfsAnNsrm1gVbyKnG,False,SAOKO,66,...,2022-03-18,day,16,album,spotify:album:6jbtHi5R0jMXoliU2OS0lo,USSM12109218,https://open.spotify.com/track/2FYGZDfsAnNsrm1...,ROSALÍA,7ltDVBr6mKbRvohxheJ9h1,"['pop', 'r&b en espanol']"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
151,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,189026,False,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,6ZXH6YfcouRBWanewWxuuz,False,Automate,32,...,2021-11-05,day,10,album,spotify:album:0zNWhYDalgisc4uweLIGZJ,USMTD2100282,https://open.spotify.com/track/6ZXH6YfcouRBWan...,Snail Mail,4QkSD9TRUnMtI8Fq1jXJJe,"['art pop', 'baltimore indie', 'bubblegrunge',..."
152,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,180930,True,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,0nqhKXDjsyBMvbeWmgijD0,False,Wockesha,64,...,2021-04-23,day,22,album,spotify:album:5ffogo3K3fYibGWa93IzUe,USUM72105621,https://open.spotify.com/track/0nqhKXDjsyBMvbe...,Moneybagg Yo,3tJoFztHeIJkJWMrx0td2f,"['memphis hip hop', 'rap', 'southern hip hop',..."
153,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA', 'CL...",1,316169,False,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,1AU6deCmEZI4di2DzzEt0U,False,Baby Grand,9,...,2021-01-29,day,10,album,spotify:album:3Jr2EYny7lPAoB1XPWaxe5,DED622000036,https://open.spotify.com/track/1AU6deCmEZI4di2...,Anna B Savage,6nbtlXRy0S6adYpDVoRdNi,[]
154,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,234826,False,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,3cJI6VFdyRdriDVwB0sU3Y,False,Avalon (with Francesco Turrisi),20,...,2021-04-09,day,12,album,spotify:album:75qSKqLqEt7zOei7If7Lms,USNO12100014,https://open.spotify.com/track/3cJI6VFdyRdriDV...,Rhiannon Giddens,1EI0NtLHoh9KBziYCeN1vM,"['black americana', 'folk', 'new americana', '..."


--------------------------------------------------------------------------------------------------------
Let's now concat both dfs spotify tracks and other tracks to get a larger dataset for our song recommender.

--------------------------------------------------------------------------------------------------------

In [9]:
# Concatenate DataFrames by appending rows
track_features = pd.concat([spotify_track_features, other_track_features], ignore_index=True)
track_features

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.695,0.700,1,-1.587,0,0.0332,0.10400,0.000000,0.1920,0.461,94.959,audio_features,2ikmBwZKZr0ahGcX4x8qtj,spotify:track:2ikmBwZKZr0ahGcX4x8qtj,https://api.spotify.com/v1/tracks/2ikmBwZKZr0a...,https://api.spotify.com/v1/audio-analysis/2ikm...,183296,4
1,0.593,0.741,4,-4.353,0,0.0359,0.02210,0.000000,0.3930,0.460,96.978,audio_features,2MxErftY5S07dFtIdxQOSF,spotify:track:2MxErftY5S07dFtIdxQOSF,https://api.spotify.com/v1/tracks/2MxErftY5S07...,https://api.spotify.com/v1/audio-analysis/2MxE...,220670,4
2,0.660,0.765,2,-6.217,1,0.0299,0.12500,0.000956,0.2350,0.681,123.051,audio_features,19meO0ADnoTjRuBMXZCdbs,spotify:track:19meO0ADnoTjRuBMXZCdbs,https://api.spotify.com/v1/tracks/19meO0ADnoTj...,https://api.spotify.com/v1/audio-analysis/19me...,175333,4
3,0.577,0.891,0,-4.672,1,0.0359,0.00123,0.000000,0.1140,0.846,144.989,audio_features,3OPyobYAM5MgTm35AJV99O,spotify:track:3OPyobYAM5MgTm35AJV99O,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,https://api.spotify.com/v1/audio-analysis/3OPy...,155707,4
4,0.531,0.693,6,-5.203,0,0.0374,0.00931,0.000003,0.1190,0.555,157.960,audio_features,4nDfJDZaUVtwOSnGROb2GN,spotify:track:4nDfJDZaUVtwOSnGROb2GN,https://api.spotify.com/v1/tracks/4nDfJDZaUVtw...,https://api.spotify.com/v1/audio-analysis/4nDf...,164453,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3156,0.645,0.531,8,-6.239,1,0.0272,0.08520,0.000004,0.1550,0.537,125.986,audio_features,6ZXH6YfcouRBWanewWxuuz,spotify:track:6ZXH6YfcouRBWanewWxuuz,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,https://api.spotify.com/v1/audio-analysis/6ZXH...,189027,3
3157,0.582,0.537,0,-9.229,1,0.6190,0.12900,0.000000,0.0749,0.771,82.103,audio_features,0nqhKXDjsyBMvbeWmgijD0,spotify:track:0nqhKXDjsyBMvbeWmgijD0,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,https://api.spotify.com/v1/audio-analysis/0nqh...,180930,4
3158,0.429,0.181,3,-12.005,1,0.0412,0.73400,0.000008,0.1010,0.259,80.106,audio_features,1AU6deCmEZI4di2DzzEt0U,spotify:track:1AU6deCmEZI4di2DzzEt0U,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,https://api.spotify.com/v1/audio-analysis/1AU6...,316170,4
3159,0.569,0.480,7,-7.533,1,0.0412,0.78100,0.082200,0.1000,0.353,123.700,audio_features,3cJI6VFdyRdriDVwB0sU3Y,spotify:track:3cJI6VFdyRdriDVwB0sU3Y,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,https://api.spotify.com/v1/audio-analysis/3cJI...,234827,5


In [15]:
# Checking for duplicates
track_features.duplicated().sum()

132

In [16]:
# Dropping the duplicates
track_features = track_features.drop_duplicates()
track_features

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.695,0.700,1,-1.587,0,0.0332,0.10400,0.000000,0.1920,0.461,94.959,audio_features,2ikmBwZKZr0ahGcX4x8qtj,spotify:track:2ikmBwZKZr0ahGcX4x8qtj,https://api.spotify.com/v1/tracks/2ikmBwZKZr0a...,https://api.spotify.com/v1/audio-analysis/2ikm...,183296,4
1,0.593,0.741,4,-4.353,0,0.0359,0.02210,0.000000,0.3930,0.460,96.978,audio_features,2MxErftY5S07dFtIdxQOSF,spotify:track:2MxErftY5S07dFtIdxQOSF,https://api.spotify.com/v1/tracks/2MxErftY5S07...,https://api.spotify.com/v1/audio-analysis/2MxE...,220670,4
2,0.660,0.765,2,-6.217,1,0.0299,0.12500,0.000956,0.2350,0.681,123.051,audio_features,19meO0ADnoTjRuBMXZCdbs,spotify:track:19meO0ADnoTjRuBMXZCdbs,https://api.spotify.com/v1/tracks/19meO0ADnoTj...,https://api.spotify.com/v1/audio-analysis/19me...,175333,4
3,0.577,0.891,0,-4.672,1,0.0359,0.00123,0.000000,0.1140,0.846,144.989,audio_features,3OPyobYAM5MgTm35AJV99O,spotify:track:3OPyobYAM5MgTm35AJV99O,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,https://api.spotify.com/v1/audio-analysis/3OPy...,155707,4
4,0.531,0.693,6,-5.203,0,0.0374,0.00931,0.000003,0.1190,0.555,157.960,audio_features,4nDfJDZaUVtwOSnGROb2GN,spotify:track:4nDfJDZaUVtwOSnGROb2GN,https://api.spotify.com/v1/tracks/4nDfJDZaUVtw...,https://api.spotify.com/v1/audio-analysis/4nDf...,164453,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3156,0.645,0.531,8,-6.239,1,0.0272,0.08520,0.000004,0.1550,0.537,125.986,audio_features,6ZXH6YfcouRBWanewWxuuz,spotify:track:6ZXH6YfcouRBWanewWxuuz,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,https://api.spotify.com/v1/audio-analysis/6ZXH...,189027,3
3157,0.582,0.537,0,-9.229,1,0.6190,0.12900,0.000000,0.0749,0.771,82.103,audio_features,0nqhKXDjsyBMvbeWmgijD0,spotify:track:0nqhKXDjsyBMvbeWmgijD0,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,https://api.spotify.com/v1/audio-analysis/0nqh...,180930,4
3158,0.429,0.181,3,-12.005,1,0.0412,0.73400,0.000008,0.1010,0.259,80.106,audio_features,1AU6deCmEZI4di2DzzEt0U,spotify:track:1AU6deCmEZI4di2DzzEt0U,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,https://api.spotify.com/v1/audio-analysis/1AU6...,316170,4
3159,0.569,0.480,7,-7.533,1,0.0412,0.78100,0.082200,0.1000,0.353,123.700,audio_features,3cJI6VFdyRdriDVwB0sU3Y,spotify:track:3cJI6VFdyRdriDVwB0sU3Y,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,https://api.spotify.com/v1/audio-analysis/3cJI...,234827,5


In [10]:
# Concatenate DataFrames by appending rows
track_data = pd.concat([spotify_track_data, other_track_data], ignore_index=True)
track_data

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,href,id,is_local,name,popularity,...,album.release_date,album.release_date_precision,album.total_tracks,album.type,album.uri,external_ids.isrc,external_urls.spotify,artist_name,artist_id,artist_genre
0,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,198123,False,https://api.spotify.com/v1/tracks/7hgIaQykdol1...,7hgIaQykdol1sWnj1uqBup,False,CULT CLASSIC,31,...,2023-11-10,day,6,album,spotify:album:3vw9IZ3YV2T5bqYTYG0IXr,QMRSZ2302147,https://open.spotify.com/track/7hgIaQykdol1sWn...,Holy Wars,2dTOWcCL0cYviin0Uz1lj4,
1,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,188808,False,https://api.spotify.com/v1/tracks/4pYiIZn2DKrK...,4pYiIZn2DKrK8MBYuS946R,False,BACKSTABBER,31,...,2024-01-05,day,1,album,spotify:album:1RFijpHOUqGPZ3hdvcm8IM,QZWFH2374284,https://open.spotify.com/track/4pYiIZn2DKrK8MB...,"Ergo, Bria",0AF9HrL08aOaZPsIiO8GmA,
2,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,195226,False,https://api.spotify.com/v1/tracks/3xeyZGEEVW8S...,3xeyZGEEVW8SZPmteR9Fw6,False,Love Goes On,31,...,2023-05-12,day,1,album,spotify:album:30eJbXVjjtYtYtwkt337Sr,QZQAY2345784,https://open.spotify.com/track/3xeyZGEEVW8SZPm...,Kelsy Karter & The Heroines,2mAAO54PkHr3NjdlRpzEDl,
3,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,93251,False,https://api.spotify.com/v1/tracks/1mj4y7NHlTq6...,1mj4y7NHlTq6YfTZOeNqOx,False,Tornillo,31,...,2023-11-16,day,1,album,spotify:album:6MSMis63C7wWLVSSmSl92b,USHR22316801,https://open.spotify.com/track/1mj4y7NHlTq6YfT...,Margaritas Podridas,5O9NicFLG2F9Xr7OHxmrb7,
4,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,155706,True,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,3OPyobYAM5MgTm35AJV99O,False,you don't like me like that,31,...,2023-05-19,day,2,album,spotify:album:35YQAprIaYWKCWtr7iS5UT,USHR22316102,https://open.spotify.com/track/3OPyobYAM5MgTm3...,Zeph,502gYHkFCtLzBIcU4ctPLd,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3159,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,189026,False,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,6ZXH6YfcouRBWanewWxuuz,False,Automate,32,...,2021-11-05,day,10,album,spotify:album:0zNWhYDalgisc4uweLIGZJ,USMTD2100282,https://open.spotify.com/track/6ZXH6YfcouRBWan...,Snail Mail,4QkSD9TRUnMtI8Fq1jXJJe,"['art pop', 'baltimore indie', 'bubblegrunge',..."
3160,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,180930,True,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,0nqhKXDjsyBMvbeWmgijD0,False,Wockesha,64,...,2021-04-23,day,22,album,spotify:album:5ffogo3K3fYibGWa93IzUe,USUM72105621,https://open.spotify.com/track/0nqhKXDjsyBMvbe...,Moneybagg Yo,3tJoFztHeIJkJWMrx0td2f,"['memphis hip hop', 'rap', 'southern hip hop',..."
3161,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA', 'CL...",1,316169,False,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,1AU6deCmEZI4di2DzzEt0U,False,Baby Grand,9,...,2021-01-29,day,10,album,spotify:album:3Jr2EYny7lPAoB1XPWaxe5,DED622000036,https://open.spotify.com/track/1AU6deCmEZI4di2...,Anna B Savage,6nbtlXRy0S6adYpDVoRdNi,[]
3162,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,234826,False,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,3cJI6VFdyRdriDVwB0sU3Y,False,Avalon (with Francesco Turrisi),20,...,2021-04-09,day,12,album,spotify:album:75qSKqLqEt7zOei7If7Lms,USNO12100014,https://open.spotify.com/track/3cJI6VFdyRdriDV...,Rhiannon Giddens,1EI0NtLHoh9KBziYCeN1vM,"['black americana', 'folk', 'new americana', '..."


In [13]:
# Dropping the 'artist_genre' column in place
track_data.drop('artist_genre', axis=1, inplace=True)
track_data

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,href,id,is_local,name,popularity,...,album.name,album.release_date,album.release_date_precision,album.total_tracks,album.type,album.uri,external_ids.isrc,external_urls.spotify,artist_name,artist_id
0,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,198123,False,https://api.spotify.com/v1/tracks/7hgIaQykdol1...,7hgIaQykdol1sWnj1uqBup,False,CULT CLASSIC,31,...,Cult Classic,2023-11-10,day,6,album,spotify:album:3vw9IZ3YV2T5bqYTYG0IXr,QMRSZ2302147,https://open.spotify.com/track/7hgIaQykdol1sWn...,Holy Wars,2dTOWcCL0cYviin0Uz1lj4
1,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,188808,False,https://api.spotify.com/v1/tracks/4pYiIZn2DKrK...,4pYiIZn2DKrK8MBYuS946R,False,BACKSTABBER,31,...,BACKSTABBER,2024-01-05,day,1,album,spotify:album:1RFijpHOUqGPZ3hdvcm8IM,QZWFH2374284,https://open.spotify.com/track/4pYiIZn2DKrK8MB...,"Ergo, Bria",0AF9HrL08aOaZPsIiO8GmA
2,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,195226,False,https://api.spotify.com/v1/tracks/3xeyZGEEVW8S...,3xeyZGEEVW8SZPmteR9Fw6,False,Love Goes On,31,...,Love Goes On,2023-05-12,day,1,album,spotify:album:30eJbXVjjtYtYtwkt337Sr,QZQAY2345784,https://open.spotify.com/track/3xeyZGEEVW8SZPm...,Kelsy Karter & The Heroines,2mAAO54PkHr3NjdlRpzEDl
3,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,93251,False,https://api.spotify.com/v1/tracks/1mj4y7NHlTq6...,1mj4y7NHlTq6YfTZOeNqOx,False,Tornillo,31,...,Tornillo,2023-11-16,day,1,album,spotify:album:6MSMis63C7wWLVSSmSl92b,USHR22316801,https://open.spotify.com/track/1mj4y7NHlTq6YfT...,Margaritas Podridas,5O9NicFLG2F9Xr7OHxmrb7
4,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,155706,True,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,3OPyobYAM5MgTm35AJV99O,False,you don't like me like that,31,...,you don't like me like that,2023-05-19,day,2,album,spotify:album:35YQAprIaYWKCWtr7iS5UT,USHR22316102,https://open.spotify.com/track/3OPyobYAM5MgTm3...,Zeph,502gYHkFCtLzBIcU4ctPLd
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3159,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,189026,False,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,6ZXH6YfcouRBWanewWxuuz,False,Automate,32,...,Valentine,2021-11-05,day,10,album,spotify:album:0zNWhYDalgisc4uweLIGZJ,USMTD2100282,https://open.spotify.com/track/6ZXH6YfcouRBWan...,Snail Mail,4QkSD9TRUnMtI8Fq1jXJJe
3160,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,180930,True,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,0nqhKXDjsyBMvbeWmgijD0,False,Wockesha,64,...,A Gangsta’s Pain,2021-04-23,day,22,album,spotify:album:5ffogo3K3fYibGWa93IzUe,USUM72105621,https://open.spotify.com/track/0nqhKXDjsyBMvbe...,Moneybagg Yo,3tJoFztHeIJkJWMrx0td2f
3161,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA', 'CL...",1,316169,False,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,1AU6deCmEZI4di2DzzEt0U,False,Baby Grand,9,...,A Common Turn,2021-01-29,day,10,album,spotify:album:3Jr2EYny7lPAoB1XPWaxe5,DED622000036,https://open.spotify.com/track/1AU6deCmEZI4di2...,Anna B Savage,6nbtlXRy0S6adYpDVoRdNi
3162,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,234826,False,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,3cJI6VFdyRdriDVwB0sU3Y,False,Avalon (with Francesco Turrisi),20,...,They're Calling Me Home (with Francesco Turrisi),2021-04-09,day,12,album,spotify:album:75qSKqLqEt7zOei7If7Lms,USNO12100014,https://open.spotify.com/track/3cJI6VFdyRdriDV...,Rhiannon Giddens,1EI0NtLHoh9KBziYCeN1vM


In [14]:
# Checking for duplicates
track_data.duplicated().sum()

118

In [17]:
# Dropping the duplicates
track_data = track_data.drop_duplicates()
track_data

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,href,id,is_local,name,popularity,...,album.name,album.release_date,album.release_date_precision,album.total_tracks,album.type,album.uri,external_ids.isrc,external_urls.spotify,artist_name,artist_id
0,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,198123,False,https://api.spotify.com/v1/tracks/7hgIaQykdol1...,7hgIaQykdol1sWnj1uqBup,False,CULT CLASSIC,31,...,Cult Classic,2023-11-10,day,6,album,spotify:album:3vw9IZ3YV2T5bqYTYG0IXr,QMRSZ2302147,https://open.spotify.com/track/7hgIaQykdol1sWn...,Holy Wars,2dTOWcCL0cYviin0Uz1lj4
1,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,188808,False,https://api.spotify.com/v1/tracks/4pYiIZn2DKrK...,4pYiIZn2DKrK8MBYuS946R,False,BACKSTABBER,31,...,BACKSTABBER,2024-01-05,day,1,album,spotify:album:1RFijpHOUqGPZ3hdvcm8IM,QZWFH2374284,https://open.spotify.com/track/4pYiIZn2DKrK8MB...,"Ergo, Bria",0AF9HrL08aOaZPsIiO8GmA
2,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,195226,False,https://api.spotify.com/v1/tracks/3xeyZGEEVW8S...,3xeyZGEEVW8SZPmteR9Fw6,False,Love Goes On,31,...,Love Goes On,2023-05-12,day,1,album,spotify:album:30eJbXVjjtYtYtwkt337Sr,QZQAY2345784,https://open.spotify.com/track/3xeyZGEEVW8SZPm...,Kelsy Karter & The Heroines,2mAAO54PkHr3NjdlRpzEDl
3,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,93251,False,https://api.spotify.com/v1/tracks/1mj4y7NHlTq6...,1mj4y7NHlTq6YfTZOeNqOx,False,Tornillo,31,...,Tornillo,2023-11-16,day,1,album,spotify:album:6MSMis63C7wWLVSSmSl92b,USHR22316801,https://open.spotify.com/track/1mj4y7NHlTq6YfT...,Margaritas Podridas,5O9NicFLG2F9Xr7OHxmrb7
4,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,155706,True,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,3OPyobYAM5MgTm35AJV99O,False,you don't like me like that,31,...,you don't like me like that,2023-05-19,day,2,album,spotify:album:35YQAprIaYWKCWtr7iS5UT,USHR22316102,https://open.spotify.com/track/3OPyobYAM5MgTm3...,Zeph,502gYHkFCtLzBIcU4ctPLd
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3159,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,189026,False,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,6ZXH6YfcouRBWanewWxuuz,False,Automate,32,...,Valentine,2021-11-05,day,10,album,spotify:album:0zNWhYDalgisc4uweLIGZJ,USMTD2100282,https://open.spotify.com/track/6ZXH6YfcouRBWan...,Snail Mail,4QkSD9TRUnMtI8Fq1jXJJe
3160,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,180930,True,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,0nqhKXDjsyBMvbeWmgijD0,False,Wockesha,64,...,A Gangsta’s Pain,2021-04-23,day,22,album,spotify:album:5ffogo3K3fYibGWa93IzUe,USUM72105621,https://open.spotify.com/track/0nqhKXDjsyBMvbe...,Moneybagg Yo,3tJoFztHeIJkJWMrx0td2f
3161,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA', 'CL...",1,316169,False,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,1AU6deCmEZI4di2DzzEt0U,False,Baby Grand,9,...,A Common Turn,2021-01-29,day,10,album,spotify:album:3Jr2EYny7lPAoB1XPWaxe5,DED622000036,https://open.spotify.com/track/1AU6deCmEZI4di2...,Anna B Savage,6nbtlXRy0S6adYpDVoRdNi
3162,[{'external_urls': {'spotify': 'https://open.s...,"['AR', 'AU', 'AT', 'BE', 'BO', 'BR', 'BG', 'CA...",1,234826,False,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,3cJI6VFdyRdriDVwB0sU3Y,False,Avalon (with Francesco Turrisi),20,...,They're Calling Me Home (with Francesco Turrisi),2021-04-09,day,12,album,spotify:album:75qSKqLqEt7zOei7If7Lms,USNO12100014,https://open.spotify.com/track/3cJI6VFdyRdriDV...,Rhiannon Giddens,1EI0NtLHoh9KBziYCeN1vM


-----------------------------------------------------------------------------------------------------
Before creating the clusters we need a single df with both audio features and song and artist name.

------------------------------------------------------------------------------------------------------

In [18]:
# Merging track_features with specific columns from track_data on 'id'
tracks = pd.merge(track_features, track_data[['id', 'name', 'artist_name']], on='id', how='inner')
tracks

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,name,artist_name
0,0.695,0.700,1,-1.587,0,0.0332,0.10400,0.000000,0.1920,0.461,94.959,audio_features,2ikmBwZKZr0ahGcX4x8qtj,spotify:track:2ikmBwZKZr0ahGcX4x8qtj,https://api.spotify.com/v1/tracks/2ikmBwZKZr0a...,https://api.spotify.com/v1/audio-analysis/2ikm...,183296,4,Baggage,Bishop Briggs
1,0.593,0.741,4,-4.353,0,0.0359,0.02210,0.000000,0.3930,0.460,96.978,audio_features,2MxErftY5S07dFtIdxQOSF,spotify:track:2MxErftY5S07dFtIdxQOSF,https://api.spotify.com/v1/tracks/2MxErftY5S07...,https://api.spotify.com/v1/audio-analysis/2MxE...,220670,4,Love & War,BAYBE
2,0.660,0.765,2,-6.217,1,0.0299,0.12500,0.000956,0.2350,0.681,123.051,audio_features,19meO0ADnoTjRuBMXZCdbs,spotify:track:19meO0ADnoTjRuBMXZCdbs,https://api.spotify.com/v1/tracks/19meO0ADnoTj...,https://api.spotify.com/v1/audio-analysis/19me...,175333,4,Flood Into,Fazerdaze
3,0.577,0.891,0,-4.672,1,0.0359,0.00123,0.000000,0.1140,0.846,144.989,audio_features,3OPyobYAM5MgTm35AJV99O,spotify:track:3OPyobYAM5MgTm35AJV99O,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,https://api.spotify.com/v1/audio-analysis/3OPy...,155707,4,you don't like me like that,Zeph
4,0.531,0.693,6,-5.203,0,0.0374,0.00931,0.000003,0.1190,0.555,157.960,audio_features,4nDfJDZaUVtwOSnGROb2GN,spotify:track:4nDfJDZaUVtwOSnGROb2GN,https://api.spotify.com/v1/tracks/4nDfJDZaUVtw...,https://api.spotify.com/v1/audio-analysis/4nDf...,164453,4,Falling for the Sky,Kailee Morgue
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2850,0.645,0.531,8,-6.239,1,0.0272,0.08520,0.000004,0.1550,0.537,125.986,audio_features,6ZXH6YfcouRBWanewWxuuz,spotify:track:6ZXH6YfcouRBWanewWxuuz,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,https://api.spotify.com/v1/audio-analysis/6ZXH...,189027,3,Automate,Snail Mail
2851,0.582,0.537,0,-9.229,1,0.6190,0.12900,0.000000,0.0749,0.771,82.103,audio_features,0nqhKXDjsyBMvbeWmgijD0,spotify:track:0nqhKXDjsyBMvbeWmgijD0,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,https://api.spotify.com/v1/audio-analysis/0nqh...,180930,4,Wockesha,Moneybagg Yo
2852,0.429,0.181,3,-12.005,1,0.0412,0.73400,0.000008,0.1010,0.259,80.106,audio_features,1AU6deCmEZI4di2DzzEt0U,spotify:track:1AU6deCmEZI4di2DzzEt0U,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,https://api.spotify.com/v1/audio-analysis/1AU6...,316170,4,Baby Grand,Anna B Savage
2853,0.569,0.480,7,-7.533,1,0.0412,0.78100,0.082200,0.1000,0.353,123.700,audio_features,3cJI6VFdyRdriDVwB0sU3Y,spotify:track:3cJI6VFdyRdriDVwB0sU3Y,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,https://api.spotify.com/v1/audio-analysis/3cJI...,234827,5,Avalon (with Francesco Turrisi),Rhiannon Giddens


In [19]:
# Renaming column 'name' to 'song_name'
# Rename the column in place
tracks.rename(columns={'name': 'song_name'}, inplace=True)
tracks

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,song_name,artist_name
0,0.695,0.700,1,-1.587,0,0.0332,0.10400,0.000000,0.1920,0.461,94.959,audio_features,2ikmBwZKZr0ahGcX4x8qtj,spotify:track:2ikmBwZKZr0ahGcX4x8qtj,https://api.spotify.com/v1/tracks/2ikmBwZKZr0a...,https://api.spotify.com/v1/audio-analysis/2ikm...,183296,4,Baggage,Bishop Briggs
1,0.593,0.741,4,-4.353,0,0.0359,0.02210,0.000000,0.3930,0.460,96.978,audio_features,2MxErftY5S07dFtIdxQOSF,spotify:track:2MxErftY5S07dFtIdxQOSF,https://api.spotify.com/v1/tracks/2MxErftY5S07...,https://api.spotify.com/v1/audio-analysis/2MxE...,220670,4,Love & War,BAYBE
2,0.660,0.765,2,-6.217,1,0.0299,0.12500,0.000956,0.2350,0.681,123.051,audio_features,19meO0ADnoTjRuBMXZCdbs,spotify:track:19meO0ADnoTjRuBMXZCdbs,https://api.spotify.com/v1/tracks/19meO0ADnoTj...,https://api.spotify.com/v1/audio-analysis/19me...,175333,4,Flood Into,Fazerdaze
3,0.577,0.891,0,-4.672,1,0.0359,0.00123,0.000000,0.1140,0.846,144.989,audio_features,3OPyobYAM5MgTm35AJV99O,spotify:track:3OPyobYAM5MgTm35AJV99O,https://api.spotify.com/v1/tracks/3OPyobYAM5Mg...,https://api.spotify.com/v1/audio-analysis/3OPy...,155707,4,you don't like me like that,Zeph
4,0.531,0.693,6,-5.203,0,0.0374,0.00931,0.000003,0.1190,0.555,157.960,audio_features,4nDfJDZaUVtwOSnGROb2GN,spotify:track:4nDfJDZaUVtwOSnGROb2GN,https://api.spotify.com/v1/tracks/4nDfJDZaUVtw...,https://api.spotify.com/v1/audio-analysis/4nDf...,164453,4,Falling for the Sky,Kailee Morgue
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2850,0.645,0.531,8,-6.239,1,0.0272,0.08520,0.000004,0.1550,0.537,125.986,audio_features,6ZXH6YfcouRBWanewWxuuz,spotify:track:6ZXH6YfcouRBWanewWxuuz,https://api.spotify.com/v1/tracks/6ZXH6YfcouRB...,https://api.spotify.com/v1/audio-analysis/6ZXH...,189027,3,Automate,Snail Mail
2851,0.582,0.537,0,-9.229,1,0.6190,0.12900,0.000000,0.0749,0.771,82.103,audio_features,0nqhKXDjsyBMvbeWmgijD0,spotify:track:0nqhKXDjsyBMvbeWmgijD0,https://api.spotify.com/v1/tracks/0nqhKXDjsyBM...,https://api.spotify.com/v1/audio-analysis/0nqh...,180930,4,Wockesha,Moneybagg Yo
2852,0.429,0.181,3,-12.005,1,0.0412,0.73400,0.000008,0.1010,0.259,80.106,audio_features,1AU6deCmEZI4di2DzzEt0U,spotify:track:1AU6deCmEZI4di2DzzEt0U,https://api.spotify.com/v1/tracks/1AU6deCmEZI4...,https://api.spotify.com/v1/audio-analysis/1AU6...,316170,4,Baby Grand,Anna B Savage
2853,0.569,0.480,7,-7.533,1,0.0412,0.78100,0.082200,0.1000,0.353,123.700,audio_features,3cJI6VFdyRdriDVwB0sU3Y,spotify:track:3cJI6VFdyRdriDVwB0sU3Y,https://api.spotify.com/v1/tracks/3cJI6VFdyRdr...,https://api.spotify.com/v1/audio-analysis/3cJI...,234827,5,Avalon (with Francesco Turrisi),Rhiannon Giddens


###### 2. Create clusters

----------------------------------------------------------------------------------------------
For the number of clusters we'll choose the number of different music genres present in our tracks' list, 27:
- Nu-Metal
- Alternative Metal
- Progressive Metal
- Black Metal
- Grunge
- Punk
- Pure Pop Punk
- Alternative Rock 
- Blues Rock 
- Mordern Blues Rock
- Hard Rock
- New Alt Rock 
- Mordern Rock 
- Rock Pop 
- Indie
- Pop
- Jazz 
- Mordern Jazz 
- Sad Soul
- Trap
- Hip-Hop 
- Rap 
- Rap Tuga
- Alternative Beats
- Rocktronic
- Techno Rave
- Fierce Femmes

---------------------------------------------------------------------------------------------------

In [20]:
x = tracks[['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness', 
            'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo']].copy()

scaler = StandardScaler()

x_prep = scaler.fit_transform(x)

kmeans = KMeans(n_clusters=27, random_state=42)
kmeans.fit(x_prep)

clusters = kmeans.predict(x_prep)

scaled_df = pd.DataFrame(x_prep, columns=x.columns)
scaled_df['song name'] = tracks['song_name']
scaled_df['artist'] = tracks['artist_name']
scaled_df['cluster'] = clusters
scaled_df

  super()._check_params_vs_input(X, default_n_init=10)


Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,song name,artist,cluster
0,1.266152,-0.551513,-1.186416,1.463626,-1.186276,-0.766921,0.055679,-0.433631,-0.191730,0.183673,-1.119276,Baggage,Bishop Briggs,11
1,0.617467,-0.334003,-0.337702,0.483710,-1.186276,-0.733250,-0.342576,-0.433631,1.044944,0.179241,-1.054183,Love & War,BAYBE,15
2,1.043564,-0.206679,-0.903511,-0.176652,0.842974,-0.808075,0.157796,-0.429990,0.072832,1.158730,-0.213576,Flood Into,Fazerdaze,9
3,0.515712,0.461769,-1.469320,0.370698,0.842974,-0.733250,-0.444060,-0.433631,-0.671634,1.890024,0.493716,you don't like me like that,Zeph,9
4,0.223168,-0.588649,0.228107,0.182579,-1.186276,-0.714543,-0.404770,-0.433619,-0.640871,0.600288,0.911908,Falling for the Sky,Kailee Morgue,25
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2850,0.948169,-1.448082,0.793917,-0.184446,0.842974,-0.841746,-0.035740,-0.433616,-0.419377,0.520511,-0.118950,Automate,Snail Mail,8
2851,0.547511,-1.416251,-1.469320,-1.243718,0.842974,6.538515,0.177246,-0.433631,-0.912201,1.557618,-1.533760,Wockesha,Moneybagg Yo,10
2852,-0.425516,-3.304882,-0.620606,-2.227177,0.842974,-0.667154,3.119179,-0.433600,-0.751618,-0.711607,-1.598145,Baby Grand,Anna B Savage,4
2853,0.464835,-1.718644,0.511012,-0.642874,0.842974,-0.667154,3.347726,-0.120575,-0.757771,-0.294992,-0.192652,Avalon (with Francesco Turrisi),Rhiannon Giddens,26


In [21]:
# analyzing our results, we can start to see some winners
scaled_df.groupby(['cluster', 'artist'], as_index=False).count().sort_values(['cluster', 'key'], ascending=[True, False])[['artist', 'cluster', 'key']].reset_index(drop=True)

Unnamed: 0,artist,cluster,key
0,A Day To Remember,0,5
1,Atreyu,0,3
2,Breaking Benjamin,0,3
3,Escape the Fate,0,3
4,Falling In Reverse,0,3
...,...,...,...
2486,The Red Clay Strays,26,1
2487,The Shins,26,1
2488,The Walters,26,1
2489,Yvonne Fair,26,1


###### 3. Create function to recommend a song

In [22]:
song_name = input('Choose a song: ')
results = sp.search(q=f'track:{song_name}', limit=1)
track_id = results['tracks']['items'][0]['id']
audio_features = sp.audio_features(track_id)

df_ = pd.DataFrame(audio_features)
new_features = df_[x.columns]

scaled_x = scaler.transform(new_features)
cluster = kmeans.predict(scaled_x)

filtered_df = np.array(scaled_df[scaled_df['cluster'] == cluster[0]][x.columns], order="C")
closest, _ = pairwise_distances_argmin_min(scaled_x, filtered_df)

Choose a song: Schism


In [25]:
cluster[0], scaled_df.loc[closest[0]]['song name'], scaled_df.loc[closest[0]]['artist']

(18, 'Shortest Fuse', 'Softcult')

In [26]:
# put everything inside a function
def recommend_song():
    # get song id
    song_name = input('Choose a song: ')
    results = sp.search(q=f'track:{song_name}', limit=1)
    track_id = results['tracks']['items'][0]['id']
    # get song features with the obtained id
    audio_features = sp.audio_features(track_id)
    # create dataframe
    df_ = pd.DataFrame(audio_features)
    new_features = df_[x.columns]
    # scale features
    scaled_x = scaler.transform(new_features)
    # predict cluster
    cluster = kmeans.predict(scaled_x)
    # filter dataset to predicted cluster
    filtered_df = np.array(scaled_df[scaled_df['cluster'] == cluster[0]][x.columns], order="C")
    # get closest song from filtered dataset
    closest, _ = pairwise_distances_argmin_min(scaled_x, filtered_df)
    # return it in a readable way
    print('\n[RECOMMENDED SONG]')
    return ' - '.join([scaled_df.loc[closest]['song name'].values[0], scaled_df.loc[closest]['artist'].values[0]])

In [34]:
recommend_song()

Choose a song: Duster - Black Midi


IndexError: list index out of range

In [39]:
# put everything inside a function
def recommend_song():
    try: 
        # get song id
        song_name = input('Choose a song: ')
        results = sp.search(q=f'track:{song_name}', limit=1)

        # Check if list is not empty before accessing it
        if results['tracks']['items']:
            track_id = results['tracks']['items'][0]['id']
            # get song features with the obtained id
            audio_features = sp.audio_features(track_id)
            # create dataframe
            df_ = pd.DataFrame(audio_features)
            new_features = df_[x.columns]
            # scale features
            scaled_x = scaler.transform(new_features)
            # predict cluster
            cluster = kmeans.predict(scaled_x)
            # filter dataset to predicted cluster
            filtered_df = np.array(scaled_df[scaled_df['cluster'] == cluster[0]][x.columns], order="C")
            # get closest song from filtered dataset
            closest, _ = pairwise_distances_argmin_min(scaled_x, filtered_df)
            # return it in a readable way
            print('\n[RECOMMENDED SONG]')
            return ' - '.join([scaled_df.loc[closest]['song name'].values[0], scaled_df.loc[closest]['artist'].values[0]])
        else:
            print('No results found for the specified song. Please try again with a different song.')
            print('Or try with the following pattern: <song_name> - <artist_name>')

    except IndexError as e:
        print(f"An error occurred: {e}")

In [42]:
recommend_song()

Choose a song: Duster - Black Midi
No results found for the specified song. Please try again with a different song.
Or try with the following pattern: <song_name> - <artist_name>


In [43]:
recommend_song()

Choose a song: Ducter - Black Midi

[RECOMMENDED SONG]


'Tornillo - Margaritas Podridas'