# KNN Recommender 

👉 K-Nearest-Neighbors (KNN) models are usually used to make predictions, but they can also be used simply to find the points in a dataset that are closest to a given point.  

👨🏻‍🏫 In this recap, we will use a KNN model to create a basic music recommender system.

In [3]:
import pandas as pd

url = 'https://wagon-public-datasets.s3.amazonaws.com/Machine%20Learning%20Datasets/ML_spotify_data.csv'

# Using pandas, load the data from the provided URL
df = pd.read_csv(url)
df

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
0,We're For The Dark - Remastered 2010,['Badfinger'],22,0.678,0.559,0.432,0,3,0.0727,-12.696,0.0334,117.674
1,Sixty Years On - Piano Demo,['Elton John'],25,0.456,0.259,0.368,0,6,0.1560,-10.692,0.0280,143.783
2,Got to Find Another Way,['The Guess Who'],21,0.433,0.833,0.724,0,0,0.1700,-9.803,0.0378,84.341
3,Feelin' Alright - Live At The Fillmore East/1970,['Joe Cocker'],22,0.436,0.870,0.914,0,5,0.8550,-6.955,0.0610,174.005
4,Caravan - Take 7,['Van Morrison'],23,0.669,0.564,0.412,0,7,0.4010,-13.095,0.0679,78.716
...,...,...,...,...,...,...,...,...,...,...,...,...
9995,China,"['Anuel AA', 'Daddy Yankee', 'KAROL G', 'Ozuna...",72,0.786,0.608,0.808,0,7,0.0822,-3.702,0.0881,105.029
9996,Halloweenie III: Seven Days,['Ashnikko'],68,0.717,0.734,0.753,0,7,0.1010,-6.020,0.0605,137.936
9997,AYA,['MAMAMOO'],76,0.634,0.637,0.858,0,4,0.2580,-2.226,0.0809,91.688
9998,Darkness,['Eminem'],70,0.671,0.195,0.623,1,2,0.6430,-7.161,0.3080,75.055


🎯 Let's find songs that are "similar" to Queen's legendary *Another One Bites the Dust*.

In [5]:
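# A one-row slice keeps k_song as a DataFrame rather than a Series.
# Note: the output below shows that position 9997 holds "AYA" by MAMAMOO, not the Queen track.
k_song = df.iloc[9997:9998]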

k_song

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9997,AYA,['MAMAMOO'],76,0.634,0.637,0.858,0,4,0.258,-2.226,0.0809,91.688


## 1. Calculating the distances

👇 First, fit the KNN model on the dataset so that it can look up the nearest neighbors of any observation.  
Since we only care about how similar the songs' features are, no target is needed: `NearestNeighbors` is fitted on the features alone.

In [10]:
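from sklearn.neighbors import NearestNeighbors

# Keep only the numeric feature columns (this drops 'name' and 'artists')
X = df.select_dtypes(include=['int', 'float'])

# Unsupervised nearest-neighbor search; returns 5 neighbors per query by default
neigh = NearestNeighbors(n_neighbors=5)
neigh.fit(X)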

Check out the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html#sklearn.neighbors.NearestNeighbors.kneighbors) of the `kneighbors` method.
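
As a minimal sketch of what fitting and querying look like, the example below uses a small made-up table (the names and feature values are hypothetical, not taken from the Spotify dataset). `kneighbors` returns two arrays: the distances to the neighbors, and the positional indices of those neighbors in the fitted data.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy data, for illustration only
toy = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d"],
        "energy": [0.0, 1.0, 0.0, 5.0],
        "tempo": [0.0, 0.0, 3.0, 5.0],
    }
)
toy_X = toy[["energy", "tempo"]]

toy_model = NearestNeighbors(n_neighbors=2)  # Euclidean distance by default
toy_model.fit(toy_X)

# Query with the first row, kept as a one-row DataFrame
toy_distances, toy_indices = toy_model.kneighbors(toy_X.iloc[[0]])
print(toy_distances)  # distances to the 2 closest rows
print(toy_indices)    # their positions in toy_X
```

Here the query row is part of the fitted data, so it is returned as its own closest neighbor.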

## 2. Passing the new point

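👇 You can now pass a new point to the KNN model and find its closest points. Because the song used here is itself part of the fitted data, its first neighbor is the song itself, at distance 0.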

In [15]:
similar = k_song.select_dtypes(include=['int', 'float'])
neigh.kneighbors(similar)

(array([[0.        , 4.91613652, 5.01715033, 5.53793168, 5.68200681]]),
 array([[9997, 9963, 9746, 9841, 9999]]))
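
To see where the first non-zero distance comes from, the sketch below recomputes it by hand. It assumes the default Euclidean metric of `NearestNeighbors` and uses the feature values of rows 9997 and 9963 as they are printed in the output tables of this recap.

```python
import numpy as np

# Feature order: popularity, danceability, valence, energy, explicit,
# key, liveness, loudness, speechiness, tempo
aya = np.array([76, 0.634, 0.637, 0.858, 0, 4, 0.258, -2.226, 0.0809, 91.688])
cool = np.array([72, 0.47, 0.79, 0.785, 0, 5, 0.0931, -4.015, 0.0664, 89.717])

squared_diffs = (aya - cool) ** 2
distance = np.sqrt(squared_diffs.sum())

print(round(distance, 4))      # 4.9161, the second distance in the output above
print(squared_diffs.round(3))  # per-feature contributions to the squared distance
```

Popularity, tempo, and loudness account for most of the squared distance here, because the features are not on a common scale.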

In [17]:
# Element [1] of the result holds the neighbors' positional indices; [0] selects the single query row
df.iloc[neigh.kneighbors(similar)[1][0]]

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9997,AYA,['MAMAMOO'],76,0.634,0.637,0.858,0,4,0.258,-2.226,0.0809,91.688
9963,Cool,['Dua Lipa'],72,0.47,0.79,0.785,0,5,0.0931,-4.015,0.0664,89.717
9746,No Te Vayas,['Camilo'],72,0.766,0.81,0.721,0,6,0.128,-4.46,0.0738,92.001
9841,Boyfriend,['Selena Gomez'],73,0.811,0.346,0.512,0,2,0.0768,-6.381,0.17,92.046
9999,Billetes Azules (with J Balvin),"['KEVVO', 'J Balvin']",74,0.856,0.642,0.721,1,7,0.182,-4.928,0.108,94.991
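
Since the query song is part of the fitted data, it comes back as its own nearest neighbor. One possible way to keep only the other songs is sketched below on hypothetical toy data: ask for one extra neighbor and drop the query's own position. The `recommend` helper and the toy values are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy data, for illustration only
songs = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d", "e"],
        "energy": [0.0, 1.0, 0.0, 5.0, 2.0],
        "tempo": [0.0, 0.0, 3.0, 5.0, 2.0],
    }
)
features = songs[["energy", "tempo"]]
model = NearestNeighbors().fit(features)


def recommend(position, k):
    """Return the k rows closest to the row at `position`, excluding that row."""
    query = features.iloc[[position]]
    # Ask for k + 1 neighbors because the query row matches itself at distance 0
    _, idx = model.kneighbors(query, n_neighbors=k + 1)
    neighbor_positions = [i for i in idx[0] if i != position][:k]
    return songs.iloc[neighbor_positions]


print(recommend(0, 2))
```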


## 3. Making a playlist!

👇 Make a playlist with 10 songs based on Queen's *Another One Bites the Dust*, sorted by increasing tempo.

In [18]:
k_song

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9997,AYA,['MAMAMOO'],76,0.634,0.637,0.858,0,4,0.258,-2.226,0.0809,91.688


In [19]:
from sklearn.neighbors import NearestNeighbors

# Same numeric features as before, this time asking for 10 neighbors
X = df.select_dtypes(include=['int', 'float'])
neigh = NearestNeighbors(n_neighbors=10)
neigh.fit(X)
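
Refitting is not the only option here: `kneighbors` also accepts an `n_neighbors` argument that overrides the constructor value for a single query. A minimal sketch on hypothetical random data (the array below is made up purely for illustration):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical random features, for illustration only
rng = np.random.default_rng(0)
points = rng.random((50, 3))

knn = NearestNeighbors(n_neighbors=5).fit(points)

# Default: uses the n_neighbors set in the constructor
_, five = knn.kneighbors(points[:1])
# Override for this query only, without refitting
_, ten = knn.kneighbors(points[:1], n_neighbors=10)

print(five.shape, ten.shape)
```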

In [20]:
similar = k_song.select_dtypes(include=['int', 'float'])
neigh.kneighbors(similar)

(array([[0.        , 4.91613652, 5.01715033, 5.53793168, 5.68200681,
         6.26892393, 6.31049182, 6.70178065, 7.05635233, 7.2945644 ]]),
 array([[9997, 9963, 9746, 9841, 9999, 9905, 9837, 9464, 9787, 9237]]))

In [24]:
df.iloc[neigh.kneighbors(similar)[1][0]].sort_values(by=['tempo'], ascending=True)

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9905,Keii,['Anuel AA'],73,0.708,0.51,0.797,0,0,0.096,-3.095,0.0391,88.015
9963,Cool,['Dua Lipa'],72,0.47,0.79,0.785,0,5,0.0931,-4.015,0.0664,89.717
9237,No One - Acoustic,['Alicia Keys'],70,0.379,0.433,0.325,0,4,0.0997,-5.862,0.0331,89.798
9997,AYA,['MAMAMOO'],76,0.634,0.637,0.858,0,4,0.258,-2.226,0.0809,91.688
9746,No Te Vayas,['Camilo'],72,0.766,0.81,0.721,0,6,0.128,-4.46,0.0738,92.001
9841,Boyfriend,['Selena Gomez'],73,0.811,0.346,0.512,0,2,0.0768,-6.381,0.17,92.046
9464,No Me Acuerdo,"['Thalia', 'Natti Natasha']",71,0.837,0.748,0.784,0,7,0.0897,-4.531,0.101,94.036
9787,Un Año,"['Sebastian Yatra', 'Reik']",73,0.771,0.535,0.382,0,1,0.104,-6.808,0.0514,94.931
9999,Billetes Azules (with J Balvin),"['KEVVO', 'J Balvin']",74,0.856,0.642,0.721,1,7,0.182,-4.928,0.108,94.991
9837,Animals,['Architects'],72,0.532,0.293,0.759,1,1,0.073,-3.842,0.0319,95.01
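
The playlist steps (find the neighbors, look them up by position, sort by tempo) can be wrapped in a small helper. This is a sketch under assumptions: `make_playlist` is a hypothetical name, and the toy table stands in for the Spotify data.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def make_playlist(data, feature_cols, seed_position, size, sort_by="tempo"):
    """Return the `size` rows nearest to the seed row, sorted by `sort_by`."""
    feats = data[feature_cols]
    finder = NearestNeighbors(n_neighbors=size).fit(feats)
    _, idx = finder.kneighbors(feats.iloc[[seed_position]])
    return data.iloc[idx[0]].sort_values(by=sort_by, ascending=True)


# Hypothetical toy data, for illustration only
tracks = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d", "e"],
        "energy": [0.5, 0.6, 0.1, 0.9, 0.55],
        "tempo": [100.0, 98.0, 60.0, 180.0, 103.0],
    }
)

playlist = make_playlist(tracks, ["energy", "tempo"], seed_position=0, size=3)
print(playlist)
```

As in the notebook cells above, the seed song itself counts as one of the playlist entries.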


In [41]:
# Pick another seed song: the second Dua Lipa row, kept as a one-row DataFrame
new_rules = df[df['artists'] == "['Dua Lipa']"].iloc[[1]]
new_rules

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9277,New Rules,['Dua Lipa'],53,0.762,0.608,0.7,0,9,0.153,-6.021,0.0694,116.073


In [48]:

# Drop the text columns so the query has the same features the model was fitted on
similar = new_rules.drop(columns=['name', 'artists'])
similar

Unnamed: 0,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
9277,53,0.762,0.608,0.7,0,9,0.153,-6.021,0.0694,116.073


In [49]:
# Query once, then look up the neighbors by position and sort them by tempo
distances, indices = neigh.kneighbors(similar)
df.iloc[indices[0]].sort_values(by=['tempo'], ascending=True)

Unnamed: 0,name,artists,popularity,danceability,valence,energy,explicit,key,liveness,loudness,speechiness,tempo
3862,Vattene amore (feat. Amedeo Minghi),"['Mietta', 'Amedeo Minghi']",52,0.562,0.469,0.639,0,8,0.0858,-8.356,0.0298,113.989
9175,Nicotine Dream,['Breakup Shoes'],52,0.525,0.722,0.608,0,8,0.323,-6.03,0.0289,114.868
8840,Caligula,['Ghostemane'],55,0.86,0.142,0.524,1,11,0.0993,-4.698,0.0729,114.946
8974,Realiti,['Grimes'],54,0.638,0.349,0.632,0,9,0.115,-7.589,0.029,114.992
8134,Revolting Children,['Matilda the Musical Original Cast'],51,0.715,0.59,0.703,0,7,0.288,-5.731,0.059,115.911
8623,After the Disco,['Broken Bells'],52,0.729,0.966,0.74,0,9,0.13,-5.444,0.0258,115.984
9026,Go Ahead and Break My Heart (feat. Gwen Stefani),"['Blake Shelton', 'Gwen Stefani']",54,0.618,0.352,0.731,0,7,0.421,-5.181,0.0321,116.008
9277,New Rules,['Dua Lipa'],53,0.762,0.608,0.7,0,9,0.153,-6.021,0.0694,116.073
7784,Farewell To The Fairground,['White Lies'],55,0.527,0.304,0.766,0,9,0.115,-5.327,0.0375,117.994
7954,Tu Angelito,['Chino & Nacho'],53,0.834,0.912,0.868,0,8,0.203,-6.498,0.0415,118.025
