##### Blessed Zhou R205757M HDSC
##### Brandon Tonderai Mutombwa HDSC
##### Primrose Chifamba  HDSC

#### Collaborative filtering

This method uses the preferences and behavior of other users to generate recommendations. Such systems typically group users with similar tastes and then recommend items based on the collective preferences of the users within each group.
Thus, if user x and user y belong to the same preference group and user x likes a particular item, there is a high probability that user y will like it too. This method requires a large initial dataset of user interactions to produce relevant predictions.
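
The idea described above can be sketched on a toy example. This is a minimal illustration only, assuming a hypothetical 3-user by 4-song play-count matrix and cosine similarity between users; the names `plays`, `cosine`, and `best_song` are made up for this sketch and are not part of the notebook.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are songs,
# values are play counts (0 means the user never played the song).
plays = np.array([
    [5, 3, 0, 1],   # user x
    [4, 0, 0, 1],   # user y (the target user)
    [0, 1, 5, 4],   # user z
], dtype=float)

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

target = 1
# Similarity of the target user to every other user (self-similarity set to 0)
sims = np.array([
    cosine(plays[target], plays[u]) if u != target else 0.0
    for u in range(len(plays))
])

# Score each song as the similarity-weighted sum of the other users' play counts
scores = sims @ plays
# Only songs the target user has not played yet are candidates
scores[plays[target] > 0] = -np.inf
best_song = int(np.argmax(scores))
```

Here user y is much closer to user x than to user z, so the unplayed song that user x listened to (column 1) is ranked first.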

**import** libraries

In [1]:
import pandas as pd
import numpy as np
import scipy.stats

# Visualization
import seaborn as sns

# Similarity
from sklearn.metrics.pairwise import cosine_similarity

In [2]:
data = pd.read_csv("spotify_millsongdata.csv")
listen_counts = pd.read_csv("test.csv")

In [3]:
data.head(6)

Unnamed: 0,artist,song,link,text
0,ABBA,Ahe's My Kind Of Girl,/a/abba/ahes+my+kind+of+girl_20598417.html,"Look at her face, it's a wonderful face \r\nA..."
1,ABBA,"Andante, Andante",/a/abba/andante+andante_20002708.html,"Take it easy with me, please \r\nTouch me gen..."
2,ABBA,As Good As New,/a/abba/as+good+as+new_20003033.html,I'll never know why I had to go \r\nWhy I had...
3,ABBA,Bang,/a/abba/bang_20598415.html,Making somebody happy is a question of give an...
4,ABBA,Bang-A-Boomerang,/a/abba/bang+a+boomerang_20002668.html,Making somebody happy is a question of give an...
5,ABBA,Burning My Bridges,/a/abba/burning+my+bridges_20003011.html,"Well, you hoot and you holler and you make me ..."


In [4]:
listen_counts.head(6)

Unnamed: 0,userId,songId,count,sid,uid
0,969cc6fb74e076a68e36a04409cb9d3765757508,SOABRAB12A6D4F7AAF,2,4204,0
1,969cc6fb74e076a68e36a04409cb9d3765757508,SOAOQFD12A6D4FAAA9,1,20861,0
2,969cc6fb74e076a68e36a04409cb9d3765757508,SOBFPJC12A58A7D1AB,9,30308,0
3,969cc6fb74e076a68e36a04409cb9d3765757508,SOBZZDU12A6310D8A3,2,306,0
4,969cc6fb74e076a68e36a04409cb9d3765757508,SOCFYDV12A8C1406E2,1,13869,0
5,969cc6fb74e076a68e36a04409cb9d3765757508,SODCNJX12A6D4F93CB,2,2233,0


In [5]:
data.shape

(57650, 4)

In [6]:
listen_counts.shape

(2952326, 5)

In [7]:
data.isnull().sum()

artist    0
song      0
link      0
text      0
dtype: int64

In [8]:
# Sample 10,000 rows to reduce computation
data = data.sample(10000)
# Preview the frame without the 'link' column, which is not needed.
# The result is not assigned, so `data` itself keeps 'link' and its original index.
data.drop('link', axis=1).reset_index(drop=True)

Unnamed: 0,artist,song,text
0,Bee Gees,Close Another Door,"Many years have passed, it seems, \r\nAnd I a..."
1,Face To Face,Walk Away,don't want to hear what you said \r\ndon't wa...
2,Neil Sedaka,Moon Of Gold,Moon of gold in the sky. \r\nMy loving sweeth...
3,Devo,C'mon,"C'mon, c'mon, c'mon \r\nWhen nothing's funny ..."
4,Fleetwood Mac,Believe Me,Written by christine mcvie. \r\n \r\nYou got...
...,...,...,...
9995,Amy Grant,Christmas Lullaby,Are you far away from home \r\nThis dark and ...
9996,INXS,Show Me (Cherry Baby),"Show me, show me \r\nShow me how \r\nShow me..."
9997,Misfits,Lost In Space,Go! \r\n \r\nOf all the things they taught y...
9998,Miley Cyrus,On My Own,Hey! I ain't looking at you \r\nFor no partic...


In [9]:
# Take a look at one of the lyrics
some_digit = data['text'].values[0]
some_digit

"Many years have passed, it seems,  \r\nAnd I am all alone.  \r\nI've sent the children far away  \r\nTo some obscure unknown.  \r\n  \r\n[Chorus: x2]  \r\nIt's so sad,  \r\nSo sad.  \r\nClose another door.  \r\nListen to my eyes.  \r\nClose another door.  \r\nYou're much too old to work,  \r\nSo won't you run away?  \r\n  \r\nWhen I was young, I used to say  \r\nThat age won't bother me.  \r\nThe life I had was very sad,  \r\nIt all went out to sea  \r\n  \r\n[Chorus]  \r\n  \r\nAnd though the sun is in outside,  \r\nThe rain is in my hair.  \r\nNow my life is lived inside,  \r\n(Now all my life is mystified)  \r\nMy home is in my chair.  \r\n  \r\n[Chorus]  \r\n  \r\nLet me go.  \r\nSend me flowers and  \r\nPut me on a plane.  \r\nI've paid before,  \r\n  \r\nSo I've been told  \r\nAt least I'm alive  \r\n  \r\nI may be old but I've been told at least I'm still alive  \r\nFly me young, fly, and tomorrow, yeah  \r\nGet me up  \r\nI been working so, so very hard\r\n\r\n"

In [10]:
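# Clean the text column: lowercase the lyrics and replace "\n" with a space.
# Note: the first replace() is called without regex=True, so it only matches cells
# equal to the literal pattern string and leaves punctuation and "\r" in place,
# as the output of the next cell shows.
data['text'] = data['text'].str.lower().replace(r'^\w\s', ' ').replace(r'\n', ' ', regex = True)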
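
The cleaning step above chains `Series.replace` calls. The following sketch contrasts that with element-wise regex substitution via `Series.str.replace`, on the assumption that the goal is to strip punctuation and line breaks. The sample string and the pattern `[^\w\s]` are assumptions for illustration, not the notebook's code.

```python
import pandas as pd

# Hypothetical one-row sample in the same style as the lyrics column
lyrics = pd.Series(["Look at her face,\r\nit's a WONDERFUL face!"])

# Without regex=True, Series.replace only swaps cells that equal the
# pattern string exactly, so nothing changes here.
unchanged = lyrics.replace(r'[^\w\s]', ' ')

# Element-wise regex substitution: lowercase, turn punctuation into spaces,
# then collapse runs of whitespace (including "\r" and "\n") into one space.
cleaned = (
    lyrics.str.lower()
    .str.replace(r'[^\w\s]', ' ', regex=True)
    .str.replace(r'\s+', ' ', regex=True)
    .str.strip()
)
```

Applying this variant to `data['text']` would change the tokens seen by the stemmer, so the outputs shown in the following cells would differ.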


In [11]:
# View the cleaned text
X = data['text'].values[0]
X

"many years have passed, it seems,  \r and i am all alone.  \r i've sent the children far away  \r to some obscure unknown.  \r   \r [chorus: x2]  \r it's so sad,  \r so sad.  \r close another door.  \r listen to my eyes.  \r close another door.  \r you're much too old to work,  \r so won't you run away?  \r   \r when i was young, i used to say  \r that age won't bother me.  \r the life i had was very sad,  \r it all went out to sea  \r   \r [chorus]  \r   \r and though the sun is in outside,  \r the rain is in my hair.  \r now my life is lived inside,  \r (now all my life is mystified)  \r my home is in my chair.  \r   \r [chorus]  \r   \r let me go.  \r send me flowers and  \r put me on a plane.  \r i've paid before,  \r   \r so i've been told  \r at least i'm alive  \r   \r i may be old but i've been told at least i'm still alive  \r fly me young, fly, and tomorrow, yeah  \r get me up  \r i been working so, so very hard\r \r "

In [12]:
# Apply the Porter stemmer
import nltk

nltk.download('punkt')

from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()

def tokenization(txt):
    tokens = nltk.word_tokenize(txt)
    stemming = [stemmer.stem(w) for w in tokens]
    return " ".join(stemming)

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\blessy\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [13]:
data['text'] = data['text'].apply(tokenization)

In [14]:
Y = data['text'].values[0]
Y

"mani year have pass , it seem , and i am all alon . i 've sent the children far away to some obscur unknown . [ choru : x2 ] it 's so sad , so sad . close anoth door . listen to my eye . close anoth door . you 're much too old to work , so wo n't you run away ? when i wa young , i use to say that age wo n't bother me . the life i had wa veri sad , it all went out to sea [ choru ] and though the sun is in outsid , the rain is in my hair . now my life is live insid , ( now all my life is mystifi ) my home is in my chair . [ choru ] let me go . send me flower and put me on a plane . i 've paid befor , so i 've been told at least i 'm aliv i may be old but i 've been told at least i 'm still aliv fli me young , fli , and tomorrow , yeah get me up i been work so , so veri hard"

In [15]:
import requests
import spotipy # Package to access the Spotify API
from bs4 import BeautifulSoup
from tqdm import tqdm
from spotipy.oauth2 import SpotifyClientCredentials # Module to authenticate a Spotify User
import time
from ast import literal_eval
import re
from copy import deepcopy
from sklearn.preprocessing import MinMaxScaler, MultiLabelBinarizer
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
import math
import scipy
#from mplsoccer import PyPizza, FontManager # Modules to plot Pizza Charts and download fonts 
#from highlight_text import fig_text, ax_text # Modules to highlight text in visualizations
import matplotlib.patheffects as path_effects
from matplotlib.ticker import MaxNLocator
from matplotlib.colors import LinearSegmentedColormap
from time import sleep
#from lyricsgenius import Genius
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
#from my_secrets import client_id, client_access_token, client_secret # API keys for Genius API
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob
#import keras
#from keras import layers
#from keras import regularizers
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
#import tensorflow as tf
from sklearn.metrics.pairwise import cosine_similarity
#import xgboost as xgb
#from xgboost import XGBClassifier
from sklearn.metrics import mean_squared_error, accuracy_score, roc_curve, roc_auc_score
from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.decomposition import PCA
from sklearn.linear_model import RidgeClassifier
import sklearn
#import xgboost as xgb 
import sklearn.ensemble
from sklearn.model_selection import train_test_split 
 

import warnings
warnings.filterwarnings("ignore")

In [16]:
data.head()

Unnamed: 0,artist,song,link,text
24976,Bee Gees,Close Another Door,/b/bee+gees/close+another+door_20015532.html,"mani year have pass , it seem , and i am all a..."
5821,Face To Face,Walk Away,/f/face+to+face/walk+away_20052318.html,do n't want to hear what you said do n't want ...
45339,Neil Sedaka,Moon Of Gold,/n/neil+sedaka/moon+of+gold_20613350.html,moon of gold in the sky . my love sweetheart s...
29601,Devo,C'mon,/d/devo/cmon_20039776.html,"c'mon , c'mon , c'mon when noth 's funni it ge..."
6076,Fleetwood Mac,Believe Me,/f/fleetwood+mac/believe+me_20054317.html,written by christin mcvie . you got to believ ...


In [17]:
# Randomly sample one listening record per row of the full lyrics dataset (57,650 rows)
song_id= listen_counts.sample(n=57650)

In [18]:
# Reset the index of the sampled DataFrame
song_id.reset_index(drop=True, inplace=True)


In [19]:
data['userId'] =  song_id['userId']
data['songId'] = song_id['songId']
data['count'] = song_id['count']
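
Assigning a column from another DataFrame, as in the cell above, aligns rows by index label rather than by position. The sketch below shows the difference on two small hypothetical frames (`songs` and `listens` are illustrative names, not notebook variables).

```python
import pandas as pd

# Shuffled index labels, similar to what DataFrame.sample() leaves behind
songs = pd.DataFrame({"song": ["a", "b", "c"]}, index=[7, 2, 5])
# Default RangeIndex 0..7, similar to a frame after reset_index(drop=True)
listens = pd.DataFrame({"count": [10, 20, 30, 40, 50, 60, 70, 80]})

# Label-based alignment: the row labelled 7 receives listens.loc[7, "count"], etc.
songs["count"] = listens["count"]
by_label = songs["count"].tolist()

# Positional assignment: drop the index by converting to a NumPy array first
songs["count_pos"] = listens["count"].to_numpy()[: len(songs)]
by_position = songs["count_pos"].tolist()
```

Because the listening records were sampled at random, either form gives an arbitrary pairing of songs and listens; the sketch only makes explicit which rows end up together.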

In [20]:
data.reset_index(drop=True, inplace=True)

In [21]:
print(data.head())

          artist                song  \
0       Bee Gees  Close Another Door   
1   Face To Face           Walk Away   
2    Neil Sedaka        Moon Of Gold   
3           Devo               C'mon   
4  Fleetwood Mac          Believe Me   

                                           link  \
0  /b/bee+gees/close+another+door_20015532.html   
1       /f/face+to+face/walk+away_20052318.html   
2     /n/neil+sedaka/moon+of+gold_20613350.html   
3                    /d/devo/cmon_20039776.html   
4     /f/fleetwood+mac/believe+me_20054317.html   

                                                text  \
0  mani year have pass , it seem , and i am all a...   
1  do n't want to hear what you said do n't want ...   
2  moon of gold in the sky . my love sweetheart s...   
3  c'mon , c'mon , c'mon when noth 's funni it ge...   
4  written by christin mcvie . you got to believ ...   

                                     userId              songId  count  
0  82d497a89ecc7230980d6e731135c7a5ca46cea

In [22]:
# Create the item-user matrix (rows: songs, columns: users)
matrix = data.pivot_table(index='songId', columns='userId', values='count')
matrix.head()

userId,000e5b27c695ccf98c5b89a26ea69fccd8563db9,000f07976f5e0861edc8e8de3dd56a9721038b69,00117127c2e13d97128c45a0084a194ea5c5290e,00154598d7b6c732217db98ca6732a02ceb1dc39,002304b0442afcf39a90b9f904b28ff92e6b5640,0024009aecc53148ae0d3ba056893e5c7864a228,002c70a763a2346a44576163bd9be4046af8b7eb,002efa7dee7aedd34ea26ce399fdd19e49e06569,00415f709a321ea4f1706d5c033b90feb6595b75,00540e39c6aed9d1576149699fbc3357753bd4ce,...,ffc663b7de6e31c0488b414b739fbf2ece8a00f5,ffc9a1010f753da5f0bd856d4c756b073c5ebda6,ffd5dd905bde38a3bcef0c9c878f96e181d6a484,ffdaab327f2fc6b9fa01a4e3e7f41fdd0e468046,ffe047640db7ab39070f80c829614d75001d5b10,ffe2fb3282f44535d797f6690588cc656ffbafad,ffee473b99a051b71cc910711d83af24903c7d79,ffef9c3e59ab44554a9775af5e3b2ac149111bb6,fff83c8596c1519f90fd5c5ed540f2ad93ea7bc5,ffffdc6c89988cd6119067769162948eacf8b670
songId
SOAAEJI12AB0188AB5,,,,,,,,,,,...,,,,,,,,,,
SOAAFAC12A67ADF7EB,,,,,,,,,,,...,,,,,,,,,,
SOAAQXG12AB017E149,,,,,,,,,,,...,,,,,,,,,,
SOAAROC12A6D4FA420,,,,,,,,,,,...,,,,,,,,,,
SOAAUDN12AAF3B599F,,,,,,,,,,,...,,,,,,,,,,
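
To make the shape of this matrix concrete, here is a small sketch of the same `pivot_table` call on hypothetical listening records. Note that `pivot_table` averages duplicate (song, user) pairs by default and leaves unobserved pairs as `NaN`.

```python
import pandas as pd

# Hypothetical listening records; u3 appears twice for song s2
listens = pd.DataFrame({
    "userId": ["u1", "u1", "u2", "u3", "u3"],
    "songId": ["s1", "s2", "s1", "s2", "s2"],
    "count":  [2, 1, 4, 3, 5],
})

# Rows are songs, columns are users; duplicates are averaged (default aggfunc is the mean)
toy_matrix = listens.pivot_table(index="songId", columns="userId", values="count")
```

In the toy result, `toy_matrix.loc["s2", "u3"]` is the mean of 3 and 5, and `toy_matrix.loc["s1", "u3"]` is `NaN` because u3 never played s1.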


In [23]:
# Mean-center each song's counts by subtracting the row (per-song) mean
matrix_norm = matrix.subtract(matrix.mean(axis=1), axis = 0)
matrix_norm.head()

userId,000e5b27c695ccf98c5b89a26ea69fccd8563db9,000f07976f5e0861edc8e8de3dd56a9721038b69,00117127c2e13d97128c45a0084a194ea5c5290e,00154598d7b6c732217db98ca6732a02ceb1dc39,002304b0442afcf39a90b9f904b28ff92e6b5640,0024009aecc53148ae0d3ba056893e5c7864a228,002c70a763a2346a44576163bd9be4046af8b7eb,002efa7dee7aedd34ea26ce399fdd19e49e06569,00415f709a321ea4f1706d5c033b90feb6595b75,00540e39c6aed9d1576149699fbc3357753bd4ce,...,ffc663b7de6e31c0488b414b739fbf2ece8a00f5,ffc9a1010f753da5f0bd856d4c756b073c5ebda6,ffd5dd905bde38a3bcef0c9c878f96e181d6a484,ffdaab327f2fc6b9fa01a4e3e7f41fdd0e468046,ffe047640db7ab39070f80c829614d75001d5b10,ffe2fb3282f44535d797f6690588cc656ffbafad,ffee473b99a051b71cc910711d83af24903c7d79,ffef9c3e59ab44554a9775af5e3b2ac149111bb6,fff83c8596c1519f90fd5c5ed540f2ad93ea7bc5,ffffdc6c89988cd6119067769162948eacf8b670
songId
SOAAEJI12AB0188AB5,,,,,,,,,,,...,,,,,,,,,,
SOAAFAC12A67ADF7EB,,,,,,,,,,,...,,,,,,,,,,
SOAAQXG12AB017E149,,,,,,,,,,,...,,,,,,,,,,
SOAAROC12A6D4FA420,,,,,,,,,,,...,,,,,,,,,,
SOAAUDN12AAF3B599F,,,,,,,,,,,...,,,,,,,,,,
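
The normalization step can be checked by hand on a tiny matrix. The sketch below uses a hypothetical 2-song by 3-user frame; `NaN` entries are skipped when the row mean is computed and stay `NaN` afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical item-user matrix: rows are songs, columns are users
toy = pd.DataFrame(
    {"u1": [2.0, 1.0], "u2": [4.0, np.nan], "u3": [np.nan, 5.0]},
    index=["s1", "s2"],
)

# Per-song mean over the observed counts only (NaN is skipped)
row_means = toy.mean(axis=1)

# Subtract each song's mean from its own row
centered = toy.subtract(row_means, axis=0)
```

Each row of `centered` sums to zero over its observed entries, so a song with a single listener is centered to exactly 0 for that listener.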


In [24]:
# Item similarity matrix using Pearson correlation
item_similarity = matrix_norm.T.corr()
item_similarity.head()

songId,SOAAEJI12AB0188AB5,SOAAFAC12A67ADF7EB,SOAAQXG12AB017E149,SOAAROC12A6D4FA420,SOAAUDN12AAF3B599F,SOAAVUV12AB0186646,SOAAWEE12A6D4FBEC8,SOABDKL12A6701D928,SOABGAJ12AB0184566,SOABHTM12AB0186BF9,...,SOZYSDT12A8C13BFD7,SOZZDGI12A67020F28,SOZZEKW12AB017F704,SOZZQLZ12AB018B9D2,SOZZRHE12A6702165F,SOZZTNF12A8C139916,SOZZWEY12AF72A5E16,SOZZWNY12A6D4F9C49,SOZZWZV12A67AE140F,SOZZXDG12AB0180308
songId
SOAAEJI12AB0188AB5,,,,,,,,,,,...,,,,,,,,,,
SOAAFAC12A67ADF7EB,,,,,,,,,,,...,,,,,,,,,,
SOAAQXG12AB017E149,,,1.0,,,,,,,,...,,,,,,,,,,
SOAAROC12A6D4FA420,,,,1.0,,,,,,,...,,,,,,,,,,
SOAAUDN12AAF3B599F,,,,,,,,,,,...,,,,,,,,,,


In [25]:
# Item similarity matrix using cosine similarity
item_similarity_cosine = cosine_similarity(matrix_norm.fillna(0))
item_similarity_cosine

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
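
The two similarity matrices above are mostly `NaN` and mostly zero, respectively. A small sketch on hypothetical mean-centered data suggests why: Pearson correlation is undefined for songs that share no listeners, while cosine similarity on the zero-filled matrix returns 0 for them.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mean-centered item-user matrix (rows: songs, columns: users)
centered = pd.DataFrame(
    {
        "u1": [-1.0, -2.0, np.nan],
        "u2": [1.0, 2.0, np.nan],
        "u3": [np.nan, np.nan, 0.0],
    },
    index=["s1", "s2", "s3"],
)

# Pearson correlation between songs, using only users both songs share
pearson = centered.T.corr()

# Cosine similarity between songs after replacing missing values with 0
cosine = cosine_similarity(centered.fillna(0))
```

Songs s1 and s2 share two listeners and get a similarity of 1.0 under both measures, whereas s3 shares none, so its Pearson entries are `NaN` and its cosine entries are 0.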

In [26]:
from scipy.sparse import csr_matrix

In [27]:
# convert the dataframe into a pivot table
df_songs_features = data.pivot_table(index='songId', columns='userId', values='count').fillna(0)

# obtain a sparse matrix
mat_songs_features = csr_matrix(df_songs_features.values)
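
As a rough illustration of why a sparse format suits this matrix, the sketch below converts a small, mostly-zero array (hypothetical values) to CSR and inspects how much is actually stored.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical zero-filled item-user matrix: 3 songs x 4 users
dense = np.array([
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
])

sparse = csr_matrix(dense)

# Only the non-zero entries are stored explicitly
stored = sparse.nnz
density = stored / (dense.shape[0] * dense.shape[1])

# Row indexing returns a 1 x n sparse matrix, the shape kneighbors() expects
row = sparse[2]
```

With one listening record per (song, user) pair at most, the real matrix is far sparser than this toy one, so the memory saving is correspondingly larger.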

In [28]:
df_songs_features.head()

userId,000e5b27c695ccf98c5b89a26ea69fccd8563db9,000f07976f5e0861edc8e8de3dd56a9721038b69,00117127c2e13d97128c45a0084a194ea5c5290e,00154598d7b6c732217db98ca6732a02ceb1dc39,002304b0442afcf39a90b9f904b28ff92e6b5640,0024009aecc53148ae0d3ba056893e5c7864a228,002c70a763a2346a44576163bd9be4046af8b7eb,002efa7dee7aedd34ea26ce399fdd19e49e06569,00415f709a321ea4f1706d5c033b90feb6595b75,00540e39c6aed9d1576149699fbc3357753bd4ce,...,ffc663b7de6e31c0488b414b739fbf2ece8a00f5,ffc9a1010f753da5f0bd856d4c756b073c5ebda6,ffd5dd905bde38a3bcef0c9c878f96e181d6a484,ffdaab327f2fc6b9fa01a4e3e7f41fdd0e468046,ffe047640db7ab39070f80c829614d75001d5b10,ffe2fb3282f44535d797f6690588cc656ffbafad,ffee473b99a051b71cc910711d83af24903c7d79,ffef9c3e59ab44554a9775af5e3b2ac149111bb6,fff83c8596c1519f90fd5c5ed540f2ad93ea7bc5,ffffdc6c89988cd6119067769162948eacf8b670
songId
SOAAEJI12AB0188AB5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
SOAAFAC12A67ADF7EB,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
SOAAQXG12AB017E149,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
SOAAROC12A6D4FA420,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
SOAAUDN12AAF3B599F,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [29]:
df_unique_songs = data.drop_duplicates(subset=['songId']).reset_index(drop=True)[['songId', 'song']]

In [30]:
decode_id_song = {
    song: i for i, song in 
    enumerate(list(df_unique_songs.set_index('songId').loc[df_songs_features.index].song))
}
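
The dictionary above maps each song title to its row position in the feature matrix. One caveat worth illustrating: if two rows share a title, the later row overwrites the earlier one, and the reversed mapping then has no entry for the earlier row. The titles below are hypothetical.

```python
# Hypothetical titles listed in matrix row order; "Hello" appears twice
titles = ["Hello", "Yesterday", "Hello"]

# Title -> row index; the second "Hello" overwrites the first
title_to_row = {title: i for i, title in enumerate(titles)}

# Row index -> title, built by reversing the mapping
row_to_title = {i: title for title, i in title_to_row.items()}

# Rows that can no longer be translated back into a title
missing_rows = [i for i in range(len(titles)) if i not in row_to_title]
```

If a nearest-neighbor search returned row 0 here, looking it up in `row_to_title` would raise a `KeyError`, so checking for duplicate titles beforehand may be worthwhile.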

In [32]:
# Pick a user ID
picked_userid = '000e5b27c695ccf98c5b89a26ea69fccd8563db9'

# Pick a song
picked_movie = 'SOAACPJ12A81C21360'

# Songs that the target user has listened to, sorted by normalized count
picked_userid_watched = pd.DataFrame(matrix_norm[picked_userid].dropna(axis=0, how='all')\
                          .sort_values(ascending=False))\
                          .reset_index()\
                          .rename(columns={1:'count'})

picked_userid_watched.head()

Unnamed: 0,songId,000e5b27c695ccf98c5b89a26ea69fccd8563db9
0,SOZEUQZ12A6310E8E3,-0.5


In [33]:
# Merge 'picked_userid_watched' with 'data' on 'songId' to attach song titles
picked_userid_watched_with_titles = pd.merge(picked_userid_watched, data, on='songId', how='left')

# Drop the columns that are not needed for display
picked_userid_watched_with_titles.drop(['artist','link','userId','text'], axis=1, inplace=True)


In [34]:
picked_userid_watched_with_titles

Unnamed: 0,songId,000e5b27c695ccf98c5b89a26ea69fccd8563db9,song,count
0,SOZEUQZ12A6310E8E3,-0.5,Bob That Head,2
1,SOZEUQZ12A6310E8E3,-0.5,No More Sweet Music,1


In [39]:
from sklearn.neighbors import NearestNeighbors

In [42]:
from sklearn.neighbors import NearestNeighbors
from fuzzywuzzy import fuzz
import numpy as np

class Recommender:
    def __init__(self, metric, algorithm, k, data, decode_id_song):
        self.metric = metric
        self.algorithm = algorithm
        self.k = k
        self.data = data
        self.decode_id_song = decode_id_song
        self.model = self._recommender().fit(data)
    
    def make_recommendation(self, new_song, n_recommendations):
        recommended = self._recommend(new_song=new_song, n_recommendations=n_recommendations)
        print("... Done")
        return recommended 
    
    def _recommender(self):
        return NearestNeighbors(metric=self.metric, algorithm=self.algorithm, n_neighbors=self.k, n_jobs=-1)
    
    def _recommend(self, new_song, n_recommendations):
        # Get (row index, distance) pairs for the recommended songs
        recommendations = []
        recommendation_ids = self._get_recommendations(new_song=new_song, n_recommendations=n_recommendations)
        # Build the reverse mapping from row index to song title
        recommendations_map = self._map_indeces_to_song_title(recommendation_ids)
        # Translate the recommended row indices into song titles, keeping their order
        for idx, dist in recommendation_ids:
            recommendations.append(recommendations_map[idx])
        return recommendations
                 
    def _get_recommendations(self, new_song, n_recommendations):
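        # Look up the row index of the closest-matching song title
        recom_song_id = self._fuzzy_matching(song=new_song)
        if recom_song_id is None:
            # No title matched closely enough, so there is nothing to recommend
            return []
        # Start the recommendation process
        print(f"Starting the recommendation process for {new_song} ...")
        # Ask for one extra neighbor because the nearest one is normally the song itself
        distances, indices = self.model.kneighbors(self.data[recom_song_id], n_neighbors=n_recommendations+1)
        # Sort by distance, then reverse and drop the closest entry (farthest neighbor comes first)
        return sorted(list(zip(indices.squeeze().tolist(), distances.squeeze().tolist())), key=lambda x: x[1])[:0:-1]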
    
    def _map_indeces_to_song_title(self, recommendation_ids):
        # Reverse the title -> row index mapping so titles can be looked up by row index
        return {song_id: song_title for song_title, song_id in self.decode_id_song.items()}
    
    def _fuzzy_matching(self, song):
        match_tuple = []
        # Keep every title whose similarity ratio to the query is at least 60
        for title, idx in self.decode_id_song.items():
            ratio = fuzz.ratio(title.lower(), song.lower())
            if ratio >= 60:
                match_tuple.append((title, idx, ratio))
        # Sort the matches by ratio, best match first
        match_tuple = sorted(match_tuple, key=lambda x: x[2])[::-1]
        if not match_tuple:
            print(f"The recommendation system could not find a match for {song}")
            return
        return match_tuple[0][1]
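
The title lookup in `_fuzzy_matching` can be tried in isolation. The sketch below is a stand-in that uses the standard library's `difflib.SequenceMatcher` instead of `fuzzywuzzy`, so the scores are not guaranteed to equal `fuzz.ratio`; the function name `best_title_match` and the small `catalog` are hypothetical.

```python
from difflib import SequenceMatcher

def best_title_match(query, title_to_row, threshold=60):
    """Return the row index of the closest title, or None if no title reaches the threshold."""
    scored = []
    for title, row in title_to_row.items():
        # Similarity on a 0-100 scale, case-insensitive
        ratio = round(100 * SequenceMatcher(None, title.lower(), query.lower()).ratio())
        if ratio >= threshold:
            scored.append((ratio, title, row))
    if not scored:
        return None
    # Highest ratio wins; ties fall back to title order
    return max(scored)[2]

# Hypothetical title -> row index mapping
catalog = {"I Believe In Miracles": 0, "Believe Me": 1, "Walk Away": 2}

exact = best_title_match("i believe in miracles", catalog)
no_match = best_title_match("zzzz", catalog)
```

Returning `None` explicitly makes it easy for the caller to skip the neighbor search when no title is close enough.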

In [43]:
model = Recommender(metric='cosine', algorithm='brute', k=20, data=mat_songs_features, decode_id_song=decode_id_song)

In [44]:
song = 'I believe in miracles'

In [47]:
new_recommendations = model.make_recommendation(new_song=song, n_recommendations=4)

Starting the recommendation process for I believe in miracles ...
... Done


In [48]:
print(f"The recommendations for {song} are:")
print(f"{new_recommendations}")

The recommendations for I believe in miracles are:
['Winds Of War (Invasion)', 'SFTS SFTS', 'Lets Make Love', 'You Talk A Lot']
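
To see what the neighbor search inside the recommender does, here is a minimal sketch on a hypothetical 5-song by 4-user matrix. It uses the same `NearestNeighbors` settings (`metric="cosine"`, `algorithm="brute"`) but returns the neighbors nearest first, which is an assumption about the preferred ordering rather than the notebook's behavior.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical item-user matrix: rows are songs, columns are users
item_user = csr_matrix(np.array([
    [5.0, 0.0, 1.0, 0.0],   # song 0
    [4.0, 0.0, 2.0, 0.0],   # song 1
    [0.0, 3.0, 0.0, 4.0],   # song 2
    [0.0, 2.0, 0.0, 5.0],   # song 3
    [1.0, 1.0, 1.0, 1.0],   # song 4
]))

knn = NearestNeighbors(metric="cosine", algorithm="brute").fit(item_user)

query = 0
# Ask for one extra neighbor because the query song is its own nearest neighbor
distances, indices = knn.kneighbors(item_user[query], n_neighbors=3)

pairs = list(zip(indices.squeeze().tolist(), distances.squeeze().tolist()))
# Drop the query song itself and order the rest by increasing cosine distance
nearest_first = sorted(
    [(i, d) for i, d in pairs if i != query],
    key=lambda p: p[1],
)
neighbor_ids = [i for i, _ in nearest_first]
```

Song 1 shares both of song 0's listeners and comes out closest, followed by song 4, which overlaps only partially.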
