# Spotify Song Recommender Model & Explainer
_Author: Jonathan Finger_

## Imports

Import pandas and NumPy for data handling, scikit-learn for clustering and cosine similarity, and Matplotlib for plotting.

In [1]:
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt

## Supplemental Functions

In [2]:
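# Lookup helpers. They read the global DataFrame `df`, which is loaded in a later cell.

def trackname_from_index(index):
    """Return the track name stored at the given row index."""
    return df[df.index == index]["track_name"].values.astype(str)[0]

def artist_from_index(index):
    """Return the artist name stored at the given row index."""
    return df[df.index == index]["artist_name"].values.astype(str)[0]

def index_from_trackid(trackid):
    """Return the row index of the first row whose track_id matches."""
    return df[df['track_id'] == trackid].index.values.astype(int)[0]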

## Import Kaggle Data

Because we could not connect the web side of the project to the DS side (time was short and no DS student had Flask training), I used a dataset from Kaggle instead: Spotify audio features from [April 2019](https://www.kaggle.com/tomigelo/spotify-audio-features).

In [3]:
DATA_PATH = "./data/"
df = pd.read_csv(DATA_PATH+'SpotifyAudioFeaturesApril2019.csv', nrows = 80000)

In [4]:
df

| Unnamed: 0 | artist_name | track_id | track_name | acousticness | danceability | duration_ms | energy | instrumentalness | key | liveness | loudness | mode | speechiness | tempo | time_signature | valence | popularity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | YG | 2RM4jf1Xa9zPgMGRDiht8O | Big Bank feat. 2 Chainz, Big Sean, Nicki Minaj | 0.005820 | 0.743 | 238373 | 0.339 | 0.00000 | 1 | 0.0812 | -7.678 | 1 | 0.4090 | 203.927 | 4 | 0.118 | 15 |
| 1 | YG | 1tHDG53xJNGsItRA3vfVgs | BAND DRUM (feat. A$AP Rocky) | 0.024400 | 0.846 | 214800 | 0.557 | 0.00000 | 8 | 0.2860 | -7.259 | 1 | 0.4570 | 159.009 | 4 | 0.371 | 0 |
| 2 | R3HAB | 6Wosx2euFPMT14UXiWudMy | Radio Silence | 0.025000 | 0.603 | 138913 | 0.723 | 0.00000 | 9 | 0.0824 | -5.890 | 0 | 0.0454 | 114.966 | 4 | 0.382 | 56 |
| 3 | Chris Cooq | 3J2Jpw61sO7l6Hc7qdYV91 | Lactose | 0.029400 | 0.800 | 125381 | 0.579 | 0.91200 | 5 | 0.0994 | -12.118 | 0 | 0.0701 | 123.003 | 4 | 0.641 | 0 |
| 4 | Chris Cooq | 2jbYvQCyPgX3CdmAzeVeuS | Same - Original mix | 0.000035 | 0.783 | 124016 | 0.792 | 0.87800 | 7 | 0.0332 | -10.277 | 1 | 0.0661 | 120.047 | 4 | 0.928 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 79995 | Chee | 3xvMcaSAbbhI8nlYklU48G | Stand The Funk | 0.121000 | 0.683 | 245581 | 0.610 | 0.08720 | 2 | 0.1140 | -5.885 | 1 | 0.4710 | 86.218 | 4 | 0.287 | 17 |
| 79996 | Cambatta | 5snuFXP3Jxac1pslG2EQP9 | The Shepherd | 0.310000 | 0.480 | 288117 | 0.833 | 0.00000 | 2 | 0.1290 | -4.552 | 1 | 0.2820 | 113.497 | 5 | 0.233 | 9 |
| 79997 | King Mydas | 4yUGrmNPgf9YBusDnL9VoW | Plugg | 0.175000 | 0.793 | 156004 | 0.625 | 0.00000 | 6 | 0.0622 | -8.078 | 0 | 0.3770 | 74.926 | 4 | 0.399 | 0 |
| 79998 | Elin Engdahl | 4DceGOFFCWUVaigreuo0f6 | Come Clarity | 0.396000 | 0.522 | 205993 | 0.420 | 0.00192 | 0 | 0.0916 | -11.018 | 0 | 0.0334 | 150.003 | 4 | 0.130 | 1 |


## Model preparation

I chose features I felt would most impact someone's choice of song.

In [5]:
features = [
 'danceability',
 'duration_ms',
 'key',
 'liveness',
 'loudness']
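
One thing worth checking with this feature set is scale. In the table above, `duration_ms` is in the hundreds of thousands while `danceability` and `liveness` lie between 0 and 1, so the largest column can dominate a cosine comparison. The sketch below uses the first four rows of that table to compare raw and standardized features. The standardization step is an assumption added for illustration and is not part of the original notebook; the helper `cosine_matrix` is likewise hypothetical.

```python
import numpy as np

def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T

# First four rows of the table above, columns in the order of `features`:
# danceability, duration_ms, key, liveness, loudness
X = np.array([
    [0.743, 238373, 1, 0.0812, -7.678],
    [0.846, 214800, 8, 0.2860, -7.259],
    [0.603, 138913, 9, 0.0824, -5.890],
    [0.800, 125381, 5, 0.0994, -12.118],
])

raw_sim = cosine_matrix(X)

# Assumed preprocessing step (not in the original notebook):
# standardize each column to zero mean and unit variance
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_sim = cosine_matrix(Z)

print(raw_sim.min())     # very close to 1: duration_ms dominates every pair
print(scaled_sim.min())  # negative: the tracks are now distinguishable
```

Standardizing also centers the data, which changes what the cosine measures, so this is one possible choice rather than a required one.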

I use _cosine similarity_ to find songs similar to a sample track. Cosine similarity compares two objects by the angle between the lines drawn from the origin to each object's point in feature space. If two tracks point in nearly the same direction, the angle between them is small and the similarity is close to 1. Dissimilar tracks point in different directions, so the angle between them is larger and the similarity is lower.
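
To make the angle idea concrete, here is a minimal sketch that computes cosine similarity by hand with NumPy. The function `cosine_sim` and the two-dimensional vectors are hypothetical and exist only for this illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical two-feature vectors, for illustration only
u = [1.0, 0.0]
v = [1.0, 1.0]   # 45 degrees away from u
w = [0.0, 1.0]   # 90 degrees away from u
s = [3.0, 0.0]   # same direction as u, three times as long

print(cosine_sim(u, v))  # approximately 0.7071, the cosine of 45 degrees
print(cosine_sim(u, w))  # 0.0, the vectors are perpendicular
print(cosine_sim(u, s))  # 1.0, length does not matter, only direction
```

The last line shows that cosine similarity ignores magnitude: a vector and any positive multiple of it are treated as identical.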

In [6]:
cosine_simil = cosine_similarity(df[features])
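
Calling `cosine_similarity` on the whole frame builds one entry per pair of tracks. For 80,000 rows that is 6.4 billion values, roughly 51 GB if they are stored as 64-bit floats. Since only the row for the test track is used later, a lighter alternative is to pass the query row as the first argument and the full feature matrix as the second. The sketch below shows this on a tiny made-up frame; `toy` and `query_index` are hypothetical names.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Tiny hypothetical stand-in for df[features]; the values are made up
toy = pd.DataFrame({
    "danceability": [0.80, 0.79, 0.20],
    "liveness": [0.10, 0.12, 0.90],
})

query_index = 0  # row position of the track to compare against

# Result has shape (1, n_tracks); take the single row as a 1-D array
scores = cosine_similarity(toy.iloc[[query_index]], toy)[0]

# For comparison: the same values appear in row 0 of the full matrix
full = cosine_similarity(toy)
print(np.allclose(scores, full[query_index]))  # True
```

With this form, memory grows with the number of tracks rather than with its square.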

## Model Test

Selecting a test track: _BAND DRUM (feat. A$AP Rocky)_ by YG.

In [7]:
# Despite its name, test_trackid holds the DataFrame row index of the track, not its Spotify ID
test_trackid = index_from_trackid('1tHDG53xJNGsItRA3vfVgs')

In [8]:
# Pair every row index with that track's similarity to the test track
similar_songs = list(enumerate(cosine_simil[test_trackid]))

In [9]:
# Sort by similarity score, highest first
sorted_similar_songs = sorted(similar_songs, key=lambda x: x[1], reverse=True)
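
The enumerate-and-sort steps can also be expressed with `np.argsort`. The sketch below is one possible variant, not the notebook's method: it removes the query track by its index instead of assuming it sorts first, which matters if another track ties with a score of 1.0. The function `top_k_similar` and the scores are hypothetical.

```python
import numpy as np

def top_k_similar(scores, query_index, k=10):
    """Return row positions of the k highest scores, excluding query_index."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # descending by score
    order = order[order != query_index]         # drop the query track itself
    return order[:k].tolist()

# Hypothetical similarity scores for five tracks; track 1 is the query
example_scores = [0.91, 1.00, 0.35, 0.99, 0.97]
print(top_k_similar(example_scores, query_index=1, k=3))  # [3, 4, 0]
```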

In [11]:
# Print the 10 most similar songs, skipping the first entry (the test track itself)
for rank, (index, _score) in enumerate(sorted_similar_songs[1:11], start=1):
    print(f'No. {rank} "{trackname_from_index(index)}" by {artist_from_index(index)}')

No. 1 "Butterfly" by Jemme
No. 2 "Girlfriend" by Michael Christmas
No. 3 "Circles (feat. Gonzalla)" by Loudan
No. 4 "My Duffle" by Lil Keke
No. 5 "Coquito la Pieza" by Cruz Cafuné
No. 6 "With You" by Spada
No. 7 "What You Do to Me" by KeySoul
No. 8 "Too Sexy (Trap God)" by Cartier Coston
No. 9 "Poesia senza veli" by Ultimo
No. 10 "Flesh N' Blood - Chela's Bad Habit Remix" by Chela
