# KNN Recommender 

👉 K-Nearest Neighbors (KNN) is usually used to fit a model and make predictions, but it can also be used simply to find the closest points in a dataset.  

👨🏻‍🏫 In this recap, we will use a KNN model to create a basic music recommender system.

In [23]:
import pandas as pd

url = 'https://wagon-public-datasets.s3.amazonaws.com/Machine%20Learning%20Datasets/ML_spotify_data.csv'

# Using pandas, load the data from the provided URL
df = pd.read_csv(url)

df.head()

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
0,We're For The Dark - Remastered 2010,['Badfinger'],22,0.678,0.559,0.432,0,3,0.0727,-12.696,0.0334,117.674
1,Sixty Years On - Piano Demo,['Elton John'],25,0.456,0.259,0.368,0,6,0.156,-10.692,0.028,143.783
2,Got to Find Another Way,['The Guess Who'],21,0.433,0.833,0.724,0,0,0.17,-9.803,0.0378,84.341
3,Feelin' Alright - Live At The Fillmore East/1970,['Joe Cocker'],22,0.436,0.87,0.914,0,5,0.855,-6.955,0.061,174.005
4,Caravan - Take 7,['Van Morrison'],23,0.669,0.564,0.412,0,7,0.401,-13.095,0.0679,78.716


🎯 Let's find songs that are "similar" to Queen's legendary *Another One Bites the Dust*.

In [8]:
# Another One Bites the Dust - Queen (slicing keeps a one-row DataFrame)
queen_song = df.iloc[4295:4296]

queen_song

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
4295,Another One Bites The Dust - Live at Wembley '86,['Queen'],29,0.534,0.114,0.984,0,4,0.982,-5.058,0.297,115.991


## 1. Calculating the distances

👇 First, fit the KNN model on the dataset so that it can later compute the distances between observations.  
Since we only care about how similar the songs' features are, no target is needed: the model is fitted on the features alone.

In [10]:
# Keep only the numeric features; names and artists cannot be used in a distance
X = df.drop(columns=['name', 'artists'])
X.head(2)

Unnamed: 0,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
0,22,0.678,0.559,0.432,0,3,0.0727,-12.696,0.0334,117.674
1,25,0.456,0.259,0.368,0,6,0.156,-10.692,0.028,143.783


In [15]:
from sklearn.neighbors import NearestNeighbors

neigh = NearestNeighbors(n_neighbors=10)

neigh.fit(X)

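Check out the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#sklearn.neighbors.KNeighborsRegressor.kneighbors) for the `kneighbors` method. The link points to `KNeighborsRegressor`, but `NearestNeighbors` exposes the same `kneighbors` method.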
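
`NearestNeighbors` uses the Euclidean distance by default, so features on a large scale (here `tempo`, `popularity` or `key`) weigh more than features that live between 0 and 1. Below is a minimal sketch of standardizing the features before fitting. The `toy` table and its values are made up for illustration and are not taken from the Spotify dataset, and whether scaling yields better recommendations here is not something this recap tests.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: tempo is in the hundreds, valence lies in [0, 1]
toy = pd.DataFrame({
    "tempo":   [100.0, 101.0, 104.0, 160.0],
    "valence": [0.10, 0.90, 0.12, 0.50],
})

query = toy.iloc[[0]]  # one-row DataFrame with the same columns as the training data

# Unscaled: the tempo gap decides almost everything
raw_model = NearestNeighbors(n_neighbors=2).fit(toy)
_, raw_idx = raw_model.kneighbors(query)

# Scaled: every feature has mean 0 and standard deviation 1 before distances are computed
scaler = StandardScaler()
toy_scaled = scaler.fit_transform(toy)
scaled_model = NearestNeighbors(n_neighbors=2).fit(toy_scaled)
_, scaled_idx = scaled_model.kneighbors(scaler.transform(query))

print(raw_idx[0])     # row 0 itself, then row 1 (closest tempo)
print(scaled_idx[0])  # row 0 itself, then row 2 (close on both features)
```

The query must go through the same fitted `scaler` as the training data, otherwise the distances are computed between incomparable units.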

## 2. Passing the new point

👇 You can now pass a new point to the fitted model and retrieve its closest points.

In [18]:
# Keep only the numeric features the model was fitted on
queen_song = queen_song.drop(columns=['name', 'artists'])
queen_song

Unnamed: 0,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
4295,29,0.534,0.114,0.984,0,4,0.982,-5.058,0.297,115.991


In [19]:
neigh.kneighbors(queen_song)

(array([[0.        , 2.55269431, 2.56187197, 2.63552967, 2.7231866 ,
         2.8106185 , 3.10731868, 3.26098142, 3.53048079, 3.53614762]]),
 array([[4295, 3488, 2700, 2507, 3586, 2794, 5179, 3047, 1704, 3648]]))
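
`kneighbors` returns a tuple of two arrays: the distances first, then the row positions, each with one row per query point. The sketch below unpacks that tuple on a few made-up 2-D points (the `points` array is illustrative only) and skips the query itself, which comes back first at distance 0 when it is part of the fitted data, as in the output above.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy points, not taken from the Spotify dataset
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])

model = NearestNeighbors(n_neighbors=3).fit(points)

# Query with the first point; the input must be 2-D: (n_queries, n_features)
distances, indices = model.kneighbors(points[[0]])

# Both results have shape (n_queries, n_neighbors); take row 0 for the single query
nearest_distances = distances[0]
nearest_positions = indices[0]

# Drop the first hit (the query itself) to keep only the other points
other_positions = nearest_positions[1:]
```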

In [21]:
df.iloc[3488]  # closest song other than the query itself

name                      Confidence Man
artists         ['The Jeff Healey Band']
popularity                            30
danceability                        0.56
valence                            0.868
energy                             0.927
explicit                               0
key                                    6
liveness                           0.316
loudness                          -5.682
speechiness                       0.0715
tempo                            116.236
Name: 3488, dtype: object
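
To see where the 2.55 returned by `kneighbors` comes from, the distance can be recomputed by hand. This is a quick check that assumes the default Euclidean metric and uses the ten feature values exactly as printed in this recap, so tiny rounding differences are possible.

```python
import numpy as np

# Feature order: popularity, danceability, valence, energy, explicit,
# key, liveness, loudness, speechiness, tempo (values copied from the outputs above)
queen = np.array([29, 0.534, 0.114, 0.984, 0, 4, 0.982, -5.058, 0.297, 115.991])
healey = np.array([30, 0.56, 0.868, 0.927, 0, 6, 0.316, -5.682, 0.0715, 116.236])

# Euclidean distance: square root of the sum of squared feature differences
squared_diffs = (queen - healey) ** 2
distance = np.sqrt(squared_diffs.sum())

print(round(float(distance), 4))  # 2.5527
```

Looking at `squared_diffs`, `key` contributes 4.0 and `popularity` 1.0 to a total of about 6.52, while the large gap in `valence` (0.114 vs 0.868) adds only about 0.57. The two integer-valued features therefore account for most of this distance.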

## 3. Making a playlist!

👇 Make a playlist of 10 songs based on Queen's *Another One Bites the Dust*, sorted by increasing tempo.

In [None]:
df.iloc[4295:4296]  # the seed song, with its name and artists

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
4295,Another One Bites The Dust - Live at Wembley '86,['Queen'],29,0.534,0.114,0.984,0,4,0.982,-5.058,0.297,115.991


In [31]:
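# kneighbors returns (distances, indices); use the indices to look the songs up in df
distances, indices = neigh.kneighbors(queen_song)
df.iloc[indices[0]].sort_values(by='tempo')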

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
3648,Mary Jane - Remastered,['Megadeth'],30,0.429,0.364,0.959,0,2,0.342,-4.789,0.12,113.361
2700,君のハートはマリンブルー,"['オメガトライブ', 'Kiyotaka Sugiyama']",29,0.602,0.624,0.794,0,4,0.413,-5.512,0.0271,113.612
1704,Baba O'Riley - Live At Shepperton,['The Who'],27,0.304,0.412,0.835,0,5,0.857,-7.372,0.0662,114.621
2794,Reaction to Action,['Foreigner'],30,0.631,0.404,0.935,0,2,0.151,-6.459,0.0564,115.687
5179,On Silent Wings,['Tina Turner'],30,0.519,0.518,0.581,0,2,0.0613,-6.9,0.0337,115.851
4295,Another One Bites The Dust - Live at Wembley '86,['Queen'],29,0.534,0.114,0.984,0,4,0.982,-5.058,0.297,115.991
3586,LOVE IN THE FIRST DEGREE ~悪いあなた~ (Remastered 2...,['Wink'],29,0.784,0.757,0.944,0,2,0.234,-6.579,0.0505,116.058
3488,Confidence Man,['The Jeff Healey Band'],30,0.56,0.868,0.927,0,6,0.316,-5.682,0.0715,116.236
2507,Too Much Blood - Remastered,['The Rolling Stones'],30,0.592,0.479,0.909,0,6,0.0571,-5.887,0.0512,116.439
3047,Millionaires Against Hunger,['Red Hot Chili Peppers'],29,0.815,0.549,0.97,1,2,0.0348,-3.384,0.0834,117.264
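
The playlist step can also be wrapped in a small reusable function. This is a sketch rather than part of the original recap: `make_playlist` and its parameters are hypothetical names, and the toy table stands in for the Spotify data.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def make_playlist(songs, feature_columns, seed_position, n_songs=10, sort_by="tempo"):
    """Return the n_songs rows closest to the seed row (seed included), sorted by sort_by."""
    features = songs[feature_columns]
    model = NearestNeighbors(n_neighbors=n_songs).fit(features)
    # iloc with a list keeps a one-row DataFrame, which kneighbors expects (2-D input)
    _, indices = model.kneighbors(features.iloc[[seed_position]])
    return songs.iloc[indices[0]].sort_values(by=sort_by)


# Hypothetical toy data, not taken from the Spotify dataset
toy_songs = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "energy": [0.9, 0.8, 0.1, 0.85, 0.2],
    "tempo": [120.0, 118.0, 60.0, 121.0, 65.0],
})

playlist = make_playlist(toy_songs, ["energy", "tempo"], seed_position=0, n_songs=3)
print(playlist["name"].tolist())  # ['b', 'a', 'd']
```

With the recap's data, the equivalent call would be `make_playlist(df, list(X.columns), 4295)`. As in the table above, the seed song is part of its own playlist, so ten neighbors yield nine new songs.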
