# Spotify Recommendation System

<img src="images/sago.jpg">

Spotify is a prime example of the rise of music streaming services. An app's success depends heavily on the user experience it provides, and for a streaming app the recommendation system is a large part of that experience. Spotify's recommendation system has arguably played a major role in the company's success. In this project, I will walk you through building a simple Spotify-style recommendation system in Python, using k-means clustering and a distance-based similarity search.

In [28]:
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm

warnings.filterwarnings('ignore')
# Show up to 100 columns when displaying a DataFrame
pd.set_option('display.max_columns', 100)

In [29]:
df = pd.read_csv('data/dataset.csv')

In [30]:
df.head()

,Unnamed: 0,track_id,artists,album_name,track_name,popularity,duration_ms,explicit,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature,track_genre
0,0,5SuOikwiRyPMVoIQDJUgSV,Gen Hoshino,Comedy,Comedy,73,230666,False,0.676,0.461,1,-6.746,0,0.143,0.0322,1e-06,0.358,0.715,87.917,4,acoustic
1,1,4qPNDBW1i3p13qLCt0Ki3A,Ben Woodward,Ghost (Acoustic),Ghost - Acoustic,55,149610,False,0.42,0.166,1,-17.235,1,0.0763,0.924,6e-06,0.101,0.267,77.489,4,acoustic
2,2,1iJBSr7s7jYXzM8EGcbK5b,Ingrid Michaelson;ZAYN,To Begin Again,To Begin Again,57,210826,False,0.438,0.359,0,-9.734,1,0.0557,0.21,0.0,0.117,0.12,76.332,4,acoustic
3,3,6lfxq3CG4xtTiEg7opyCyx,Kina Grannis,Crazy Rich Asians (Original Motion Picture Sou...,Can't Help Falling In Love,71,201933,False,0.266,0.0596,0,-18.515,1,0.0363,0.905,7.1e-05,0.132,0.143,181.74,3,acoustic
4,4,5vjLSffimiIP26QG5WcN2K,Chord Overstreet,Hold On,Hold On,82,198853,False,0.618,0.443,2,-9.681,1,0.0526,0.469,0.0,0.0829,0.167,119.949,4,acoustic


In [31]:
df.shape

(114000, 21)

In [32]:
df.isnull().sum()

Unnamed: 0          0
track_id            0
artists             1
album_name          1
track_name          1
popularity          0
duration_ms         0
explicit            0
danceability        0
energy              0
key                 0
loudness            0
mode                0
speechiness         0
acousticness        0
instrumentalness    0
liveness            0
valence             0
tempo               0
time_signature      0
track_genre         0
dtype: int64

In [33]:
df.dropna(inplace=True)

In [34]:
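# Drop identifier and text columns, then correlate the numeric features only
# (album_name and track_genre are still text, hence numeric_only=True)
data = df.drop(columns=['track_id', 'track_name', 'artists'])
data.corr(numeric_only=True)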

,Unnamed: 0,popularity,duration_ms,explicit,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
Unnamed: 0,1.0,0.032146,-0.032738,-0.054735,0.003445,-0.055993,-0.005521,-0.027307,0.00511,-0.084952,0.076837,-0.070285,0.033641,0.053109,-0.025825,-0.021115
popularity,0.032146,1.0,-0.007129,0.044078,0.035444,0.001053,-0.003847,0.05042,-0.013948,-0.04493,-0.025458,-0.095147,-0.005397,-0.040522,0.013212,0.031076
duration_ms,-0.032738,-0.007129,1.0,-0.06527,-0.073435,0.05852,0.008123,-0.003475,-0.035581,-0.062605,-0.10377,0.124364,0.010308,-0.154464,0.024356,0.018229
explicit,-0.054735,0.044078,-0.06527,1.0,0.122506,0.096954,0.004485,0.108587,-0.037216,0.307951,-0.0944,-0.103405,0.032547,-0.003378,-0.002815,0.038387
danceability,0.003445,0.035444,-0.073435,0.122506,1.0,0.134325,0.03647,0.259076,-0.069224,0.108625,-0.171531,-0.185608,-0.13162,0.477347,-0.050448,0.207219
energy,-0.055993,0.001053,0.05852,0.096954,0.134325,1.0,0.048007,0.76169,-0.078365,0.142508,-0.733908,-0.18188,0.184795,0.258937,0.247852,0.187127
key,-0.005521,-0.003847,0.008123,0.004485,0.03647,0.048007,1.0,0.038591,-0.135911,0.020419,-0.040942,-0.006821,-0.001597,0.034099,0.010914,0.015064
loudness,-0.027307,0.05042,-0.003475,0.108587,0.259076,0.76169,0.038591,1.0,-0.041768,0.060826,-0.589804,-0.433478,0.076897,0.279851,0.212447,0.191992
mode,0.00511,-0.013948,-0.035581,-0.037216,-0.069224,-0.078365,-0.135911,-0.041768,1.0,-0.046535,0.095568,-0.049961,0.014004,0.021964,0.000572,-0.02409
speechiness,-0.084952,-0.04493,-0.062605,0.307951,0.108625,0.142508,0.020419,0.060826,-0.046535,1.0,-0.002184,-0.089617,0.205218,0.036637,0.017274,-1.1e-05
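
In the matrix above, the strongest relationships are between energy and loudness (0.76) and between energy and acousticness (-0.73). Reading such pairs off a wide table by eye is error-prone, so the following is a minimal sketch of one way to list them. The helper name `top_correlated_pairs` and the `toy` values are assumptions made up for illustration; they are not part of the notebook or the dataset.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(frame, n=3):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = frame.corr(numeric_only=True)
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (always 1.0) is excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    # Sort by absolute value so strong negative correlations rank high too
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Toy data (made up for illustration, not taken from the dataset)
toy = pd.DataFrame({
    "energy": [0.1, 0.4, 0.6, 0.9],
    "loudness": [-20.0, -12.0, -8.0, -3.0],
    "acousticness": [0.9, 0.7, 0.3, 0.1],
    "key": [1, 7, 0, 5],
})
strongest = top_correlated_pairs(toy, n=2)
print(strongest)
```

On the real data, the same helper could be called as `top_correlated_pairs(data)`.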


In [35]:
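from sklearn.preprocessing import MinMaxScaler

# Select the numeric columns and rescale each one to the [0, 1] range
datatypes = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
normarization = df.select_dtypes(include=datatypes)
normarization = pd.DataFrame(
    MinMaxScaler().fit_transform(normarization),
    columns=normarization.columns,
    index=normarization.index,
)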
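
Min-max scaling maps each column to the range [0, 1] using `(x - min) / (max - min)`, so that features measured in large units (such as `duration_ms`) do not dominate features measured in small units. Note that `MinMaxScaler` only rescales data when `fit_transform` (or `fit` followed by `transform`) is called and the result is kept. The following is a minimal sketch on made-up values; the `toy` frame is an assumption for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy values (made up for illustration) on very different scales
toy = pd.DataFrame({
    "duration_ms": [150000, 200000, 250000],
    "tempo": [80.0, 120.0, 180.0],
})

# Manual min-max scaling: (x - min) / (max - min), column by column
manual = (toy - toy.min()) / (toy.max() - toy.min())

# The same transformation with scikit-learn; fit_transform returns a NumPy
# array, so it is wrapped back into a DataFrame to keep the column names
scaled = pd.DataFrame(
    MinMaxScaler().fit_transform(toy),
    columns=toy.columns,
    index=toy.index,
)
print(scaled)
```

Both approaches should give the same values up to floating-point rounding.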

In [36]:
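from sklearn.cluster import KMeans

# Assign each track to one of 10 clusters and store the label as a new column
kmeans = KMeans(n_clusters=10)
features = kmeans.fit_predict(normarization)
df['features'] = features
# Rescale the cluster labels to [0, 1]; fit_transform expects a 2-D input
df['features'] = MinMaxScaler().fit_transform(df[['features']]).ravel()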
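
`KMeans.fit_predict` returns one integer cluster label per row. To make that behaviour concrete, here is a minimal sketch on six made-up 2-D points. The `points` array and the explicit `n_init` and `random_state` arguments are assumptions added for illustration and reproducibility; the notebook itself uses the defaults.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points (made up for illustration) forming two well-separated groups
points = np.array([
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [0.9, 1.0], [1.0, 0.9], [0.95, 0.95],
])

# n_init and random_state are set explicitly here (an assumption, not part of
# the original notebook) so that repeated runs give the same assignment
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

# Label numbers are arbitrary identifiers: only "same label" versus
# "different label" carries meaning, not the numeric gap between labels
print(labels)
print(model.cluster_centers_)
```

Because the labels are arbitrary identifiers, a distance that subtracts one label from another treats clusters 0 and 9 as further apart than clusters 0 and 1, which may or may not reflect how similar those clusters really are.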

In [44]:
import numpy as np
from tqdm import tqdm

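class Spotify_Recommendation():
    def __init__(self, dataset):
        self.dataset = dataset

    def recommend(self, song_name, amount=1):
        # Case-insensitive match on the track name
        is_match = self.dataset.track_name.str.lower() == song_name.lower()
        if not is_match.any():
            raise ValueError(f"Track not found: {song_name!r}")

        # The first match is the query; every other title is a candidate.
        # copy() avoids writing the distance column into a view of the dataset.
        song = self.dataset[is_match].head(1).values[0]
        rec = self.dataset[~is_match].copy()
        distance = []

        # Text columns are skipped; album_name and track_genre are listed
        # explicitly so that numeric-looking titles are not parsed as numbers
        non_numeric_cols = ['artists', 'track_name', 'track_id', 'album_name', 'track_genre']
        numeric_idx = [i for i, name in enumerate(self.dataset.columns)
                       if name not in non_numeric_cols]

        for songs in tqdm(rec.values):
            # Manhattan (L1) distance: sum of absolute feature differences
            d = 0
            for col in numeric_idx:
                d += np.absolute(float(song[col]) - float(songs[col]))
            distance.append(d)

        rec['distance'] = distance

        # Smallest distance first
        rec = rec.sort_values('distance')

        columns = ['artists', 'track_name']
        return rec[columns].head(amount)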

recommendations = Spotify_Recommendation(df)
recommendations.recommend("Sagopa Yaşlı Planet", 10)


100%|██████████| 113998/113998 [00:02<00:00, 56551.57it/s]


,artists,track_name
112534,No.1,Dünyaya Yazık
112075,Emir Can İğrek,Bıraktım Şaşırmayı
112189,Müzeyyen Senar,Gamzedeyim Deva Bulmam
112188,Ati242;Bedo;Stabil,KAPA GÖZLERİNİ
111755,Bent,Take 15
112530,Mirkelam,Kaprislisin Sevgilim
112209,Tepki,Rüyalar
112300,Serhat Durmus;Zerrin,Yolu Yok (feat. Zerrin)
112162,Birkan Nasuhoğlu,SEVDA
112272,Şanışer,Yazamam Ecele
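
The recommender above ranks tracks by the sum of absolute differences between feature values, which is the Manhattan (L1) distance. The same ranking can be computed without an explicit Python loop by letting pandas broadcast the subtraction. The following is a minimal sketch under stated assumptions: the function name `manhattan_recommend` and the `toy` catalogue are hypothetical and made up for illustration, and the frame is assumed to have `artists` and `track_name` columns like the dataset used here.

```python
import pandas as pd


def manhattan_recommend(frame, song_name, amount=1):
    """Rank tracks by L1 (Manhattan) distance to the named track."""
    is_match = frame["track_name"].str.lower() == song_name.lower()
    if not is_match.any():
        raise ValueError(f"Track not found: {song_name!r}")

    # Numeric and boolean columns only; text columns are ignored
    numeric = frame.select_dtypes(include=["number", "bool"]).astype(float)
    query = numeric[is_match].iloc[0]

    # Broadcasting subtracts the query row from every candidate row at once
    distance = (numeric[~is_match] - query).abs().sum(axis=1)
    nearest = distance.nsmallest(amount).index
    return frame.loc[nearest, ["artists", "track_name"]]


# Toy catalogue (made up for illustration)
toy = pd.DataFrame({
    "artists": ["A", "B", "C", "D"],
    "track_name": ["Alpha", "Beta", "Gamma", "Delta"],
    "energy": [0.50, 0.55, 0.90, 0.10],
    "valence": [0.40, 0.45, 0.80, 0.95],
    "explicit": [False, False, True, False],
})
result = manhattan_recommend(toy, "alpha", amount=2)
print(result)
```

Since every feature contributes its raw difference to the sum, columns with large ranges dominate the ranking unless the features are scaled to a common range first.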
