# **Music Recommendation System**

# **Milestone 2**

Now that we have explored the data, let's apply different algorithms to build recommendation systems.

**Note:** We use the reduced version of the data, i.e., the data obtained after applying the cutoffs in Milestone 1.

## **Load the dataset**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


In [None]:
# Loading the dataset that was saved at the end of milestone 1
from google.colab import drive
drive.mount('/content/drive')
df_final = pd.read_csv('drive/My Drive/Colab Notebooks/MIT Case Studies/Capstone Project/df_final.csv')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
df_final.drop(columns='Unnamed: 0', inplace=True)

### **Popularity-Based Recommendation Systems**

For each song, let's compute the number of interactions (rows) and the sum of play counts, and build a popularity-based recommendation system from these two quantities.

In [None]:
# Number of interactions (user-song rows) per song
# Note: despite its name, average_count holds a count, not an average
average_count = df_final.groupby('song_id')['play_count'].count()

# Total number of times each song was played (sum of play_count)
play_freq = df_final.groupby('song_id')['play_count'].sum()

In [None]:
# Making a dataframe with the average_count and play_freq
final_play = pd.DataFrame({'avg_count':average_count, 'play_freq':play_freq})

# Let us see the first five records of the final_play dataset
final_play.head()

song_id,avg_count,play_freq
21,282,515
22,137,222
52,453,888
62,126,257
93,126,222


Now, let's create a function to find the top n songs for a recommendation. Songs are ranked by avg_count (the number of interactions per song), and only songs whose total play count (play_freq) exceeds a minimum threshold are considered.
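The thresholded popularity ranking described above can be sketched on a handful of made-up interactions. The rows and the helper name `top_n_by_listeners` are hypothetical and only mirror the logic of the notebook's function without pandas.

```python
from collections import defaultdict

# Hypothetical toy interactions: (user_id, song_id, play_count)
interactions = [
    (1, "a", 2), (2, "a", 1), (3, "a", 4),
    (1, "b", 5), (2, "b", 5),
    (3, "c", 1),
]

def top_n_by_listeners(rows, n, min_plays):
    """Rank songs by number of listeners, keeping songs whose total plays exceed min_plays."""
    listeners = defaultdict(int)    # rows per song (the notebook's avg_count)
    total_plays = defaultdict(int)  # summed play_count per song (the notebook's play_freq)
    for _, song, plays in rows:
        listeners[song] += 1
        total_plays[song] += plays
    # Keep only songs above the play threshold, then sort by listener count
    eligible = [song for song in total_plays if total_plays[song] > min_plays]
    eligible.sort(key=lambda song: listeners[song], reverse=True)
    return [(song, listeners[song], total_plays[song]) for song in eligible[:n]]

top_songs = top_n_by_listeners(interactions, n=2, min_plays=5)
print(top_songs)
```

Song "c" is filtered out by the threshold, and the remaining songs are ordered by how many users played them rather than by how often they were played.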

In [None]:
# Build the function to find top n songs
def top_n_songs(n, minplays):
  top = final_play.loc[final_play.play_freq > minplays]
  top = top.sort_values(by='avg_count', ascending=False)
  return top[:n]

In [None]:
# Recommend top 10 songs using the function defined above
top_n_songs(10,1000)

song_id,avg_count,play_freq
352,1002,2904
2220,928,2658
8582,838,1898
5531,817,2422
7416,754,2172
1118,724,1513
4152,721,1604
4448,712,1806
1334,688,1798
317,684,2201


### **User User Similarity-Based Collaborative Filtering**

To build the user-user-similarity-based and subsequent models we will use the "surprise" library.
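As a rough illustration of what a user-user similarity measures, the sketch below computes cosine similarity between two users over the songs they have both played. It is a simplified stand-in written for this notebook, not the library's code; the user dictionaries and the `min_support` handling are assumptions chosen to resemble the options used later.

```python
import math

def cosine_on_common(ratings_u, ratings_v, min_support=1):
    """Cosine similarity between two users, restricted to songs both have played."""
    common = ratings_u.keys() & ratings_v.keys()
    # With too few songs in common, the similarity is treated as 0
    if len(common) < min_support:
        return 0.0
    dot = sum(ratings_u[s] * ratings_v[s] for s in common)
    norm_u = math.sqrt(sum(ratings_u[s] ** 2 for s in common))
    norm_v = math.sqrt(sum(ratings_v[s] ** 2 for s in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical play counts keyed by song id
alice = {"s1": 1, "s2": 2, "s3": 5}
bob = {"s1": 2, "s2": 4, "s4": 1}

sim = cosine_on_common(alice, bob)
sim_strict = cosine_on_common(alice, bob, min_support=3)
print(sim, sim_strict)
```

Bob plays the two shared songs exactly twice as often as Alice, so their play-count vectors point in the same direction. Requiring three songs in common removes the pair from consideration.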

In [None]:
# Install the surprise package
!pip install surprise 

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
# Import necessary libraries

# To compute the accuracy of models
from surprise import accuracy

# This class is used to parse a file containing play_counts, data should be in structure - user; item; play_count
from surprise.reader import Reader

# Class for loading datasets
from surprise.dataset import Dataset

# For tuning model hyperparameters
from surprise.model_selection import GridSearchCV

# For splitting the data in train and test dataset
from surprise.model_selection import train_test_split

# For implementing similarity-based recommendation system
from surprise.prediction_algorithms.knns import KNNBasic

# For implementing matrix factorization based recommendation system
from surprise.prediction_algorithms.matrix_factorization import SVD

# For implementing KFold cross-validation
from surprise.model_selection import KFold

# For implementing clustering-based recommendation system
from surprise import CoClustering

In [None]:
from collections import defaultdict

### Some useful functions

Below is the function to calculate precision@k and recall@k, RMSE and F1_Score@k to evaluate the model performance.
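Before the full function, a hand-checkable sketch for a single user may help. It applies the same definitions to five made-up (estimated, true) pairs. The helper name and the data are hypothetical. Note that the notebook's function averages precision and recall over all users before computing the F_1 score.

```python
def precision_recall_f1_at_k(user_ratings, k, threshold):
    """user_ratings: list of (estimated, true) play counts for ONE user."""
    # Rank the user's songs by estimated play count, highest first
    ranked = sorted(user_ratings, key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]

    n_rel = sum(true >= threshold for _, true in ranked)    # relevant songs overall
    n_rec_k = sum(est >= threshold for est, _ in top_k)     # recommended songs in the top k
    n_rel_and_rec_k = sum(est >= threshold and true >= threshold for est, true in top_k)

    precision = n_rel_and_rec_k / n_rec_k if n_rec_k else 0.0
    recall = n_rel_and_rec_k / n_rel if n_rel else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

toy = [(2.4, 3), (1.9, 1), (1.6, 2), (1.2, 2), (0.8, 1)]
precision, recall, f1 = precision_recall_f1_at_k(toy, k=3, threshold=1.5)
print(precision, recall, f1)
```

With k = 3 and a threshold of 1.5, three songs are recommended and two of them are relevant, while one relevant song (estimated at 1.2) is missed, so precision and recall are both 2/3.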

In [None]:
# The function to calculate the RMSE, precision@k, recall@k, and F_1 score
def precision_recall_at_k(model, k = 30, threshold = 1.5):
    """Print RMSE, precision@k, recall@k, and F_1 score averaged over users (uses the global testset)"""

    # First map the predictions to each user.
    user_est_true = defaultdict(list)
    
    # Making predictions on the test data
    predictions=model.test(testset)
    
    for uid, _, true_r, est, _ in predictions:
        user_est_true[uid].append((est, true_r))

    precisions = dict()
    recalls = dict()
    for uid, user_ratings in user_est_true.items():

        # Sort user ratings by estimated value
        user_ratings.sort(key = lambda x : x[0], reverse = True)

        # Number of relevant items
        n_rel = sum((true_r >= threshold) for (_, true_r) in user_ratings)

        # Number of recommended items in top k
        n_rec_k = sum((est >= threshold) for (est, _) in user_ratings[ : k])

        # Number of relevant and recommended items in top k
        n_rel_and_rec_k = sum(((true_r >= threshold) and (est >= threshold))
                              for (est, true_r) in user_ratings[ : k])

        # Precision@K: Proportion of recommended items that are relevant
        # When n_rec_k is 0, Precision is undefined. We here set Precision to 0 when n_rec_k is 0

        precisions[uid] = n_rel_and_rec_k / n_rec_k if n_rec_k != 0 else 0

        # Recall@K: Proportion of relevant items that are recommended
        # When n_rel is 0, Recall is undefined. We here set Recall to 0 when n_rel is 0

        recalls[uid] = n_rel_and_rec_k / n_rel if n_rel != 0 else 0
    
    # Mean of all the predicted precisions are calculated
    precision = round((sum(prec for prec in precisions.values()) / len(precisions)), 3)

    # Mean of all the predicted recalls are calculated
    recall = round((sum(rec for rec in recalls.values()) / len(recalls)), 3)
    
    accuracy.rmse(predictions)

    # Command to print the overall precision
    print('Precision: ', precision)

    # Command to print the overall recall
    print('Recall: ', recall)
    
    # Formula to compute the F-1 score (set to 0 when precision + recall is 0)
    f_1 = round((2 * precision * recall) / (precision + recall), 3) if (precision + recall) != 0 else 0
    print('F_1 score: ', f_1)

  

In [None]:
# Instantiating Reader scale with expected rating scale 
reader = Reader(rating_scale=(0,5)) #use rating scale (0, 5)

# Loading the dataset
data = Dataset.load_from_df(df_final[['user_id','song_id','play_count']], reader) # Take only "user_id","song_id", and "play_count"

# Splitting the data into train and test dataset
trainset, testset = train_test_split(data, test_size=.4, random_state = 42) # Take test_size = 0.4

In [None]:
# Build the default user-user-similarity model
sim_options = {'name': 'cosine',
               'user_based':True}

# KNN algorithm is used to find the most similar users
sim_user_user = KNNBasic(sim_options=sim_options, verbose=True, random_state=1) # Use random_state = 1 

# Train the algorithm on the trainset, and predict play_count for the testset
sim_user_user.fit(trainset)

# Let us compute precision@k, recall@k, and f_1 score with k = 30
precision_recall_at_k(sim_user_user, k=30, threshold=1.5) # Use sim_user_user model

Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 1.3844
Precision:  0.448
Recall:  0.829
F_1 score:  0.582


**Observations and Insights:_________**

This is the baseline user-user similarity model. In the following section we tune its hyperparameters to look for a better model and then compare the performance metrics.

In [None]:
# Predicting play_count for a sample user with a listened song
sim_user_user.predict(uid = 6958, iid = 1671, r_ui = 2, verbose = True) # Use user id 6958 and song_id 1671

user: 6958       item: 1671       r_ui = 2.00   est = 1.65   {'actual_k': 40, 'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=1.6539807631633718, details={'actual_k': 40, 'was_impossible': False})

In [None]:
# Predicting play_count for a sample user with a song not-listened by the user
sim_user_user.predict(uid = 6958, iid = 3232, verbose = True) # Use user_id 6958 and song_id 3232

user: 6958       item: 3232       r_ui = None   est = 1.50   {'actual_k': 40, 'was_impossible': False}


Prediction(uid=6958, iid=3232, r_ui=None, est=1.4980068628531316, details={'actual_k': 40, 'was_impossible': False})

**Observations and Insights:_________**

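Song 1671 meets the recommendation threshold of 1.5: its estimated play count is 1.65, against an actual play count of 2. Song 3232 sits right at the boundary: its estimate is displayed as 1.50, but the unrounded value (1.498) is marginally below 1.5, so it would narrowly miss the threshold.

Both predictions were made with the default number of neighbors (k = 40) for the user.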

Now, let's try to tune the model and see if we can improve the model performance.
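To get a feel for the cost of the search in the next cell, the sketch below enumerates the combinations in that parameter grid. The `expand` helper is a hypothetical illustration of exhaustive enumeration and is not how the library is necessarily implemented internally.

```python
from itertools import product

param_grid = {
    "k": [10, 20, 30],
    "min_k": [3, 6, 9],
    "sim_options": {
        "name": ["cosine", "pearson", "pearson_baseline"],
        "user_based": [True],
        "min_support": [2, 4],
    },
}

def expand(grid):
    """Yield one parameter dict per combination of values in the grid."""
    keys = list(grid)
    value_lists = []
    for key in keys:
        value = grid[key]
        # Nested dicts (such as sim_options) are expanded recursively
        value_lists.append(list(expand(value)) if isinstance(value, dict) else value)
    for combo in product(*value_lists):
        yield dict(zip(keys, combo))

combinations = list(expand(param_grid))
n_folds = 3
n_fits = len(combinations) * n_folds
print(len(combinations), n_fits)
```

Each combination is fitted once per fold, so widening any list in the grid multiplies the number of similarity matrices that have to be computed.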

In [None]:
# Setting up parameter grid to tune the hyperparameters
param_grid = {'k': [10, 20, 30], 'min_k': [3, 6, 9],
              'sim_options': {'name': ["cosine", 'pearson', "pearson_baseline"],
                              'user_based': [True], "min_support": [2, 4]}
              }

# Performing 3-fold cross-validation to tune the hyperparameters
gs = GridSearchCV(KNNBasic, param_grid, measures=['rmse'], cv=3, n_jobs=-1)

# Fitting the data
gs.fit(data) # Use entire data for GridSearch

# Best RMSE score
print(gs.best_score['rmse'])

# Combination of parameters that gave the best RMSE score
print(gs.best_params['rmse'])

1.286246753771276
{'k': 30, 'min_k': 9, 'sim_options': {'name': 'pearson_baseline', 'user_based': True, 'min_support': 2}}


In [None]:
# Train the best model found in above gridsearch
sim_options = {'name': 'pearson_baseline', 
               'user_based': True, 
               'min_support': 2}
sim_user_user_opt = KNNBasic(sim_options=sim_options, k = 30, min_k=9, verbose=True, random_state=1)
sim_user_user_opt.fit(trainset)
precision_recall_at_k(sim_user_user_opt, k=30, threshold=1.5)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
RMSE: 1.2985
Precision:  0.461
Recall:  0.831
F_1 score:  0.593


**Observations and Insights:_________**

1. The optimized model performed better on all the scores (RMSE, Precision, Recall, F_1 Score).

2. The best RMSE in the grid search is obtained with up to 30 neighbours per prediction (k = 30) and a minimum of 9 neighbours (min_k = 9).

3. When the model cannot find even the minimum number of neighbours, it falls back to the global average rating as the predicted rating for the song.
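The roles of k, min_k and the global-average fallback described above can be sketched as a similarity-weighted average. This is a simplified illustration under the assumption that only neighbours with positive similarity contribute; the function name, the neighbour list and the value used for the global mean are made up.

```python
import heapq

def knn_predict(neighbours, k, min_k, global_mean):
    """neighbours: (similarity, play_count) pairs from users who played the target song."""
    # Keep the k most similar neighbours
    nearest = heapq.nlargest(k, neighbours, key=lambda pair: pair[0])
    weighted_sum = 0.0
    sim_sum = 0.0
    actual_k = 0
    for sim, play_count in nearest:
        if sim > 0:  # assumption: non-positive similarities are ignored
            weighted_sum += sim * play_count
            sim_sum += sim
            actual_k += 1
    # Too few usable neighbours: fall back to the global mean
    if actual_k < min_k:
        return global_mean, {"was_impossible": True, "reason": "Not enough neighbors."}
    return weighted_sum / sim_sum, {"actual_k": actual_k, "was_impossible": False}

neighbours = [(0.9, 3), (0.5, 1), (0.1, 5), (-0.2, 4)]
est, info = knn_predict(neighbours, k=3, min_k=2, global_mean=2.017)
fallback, fallback_info = knn_predict(neighbours, k=3, min_k=9, global_mean=2.017)
print(est, info)
print(fallback, fallback_info)
```

The second call shows why a large min_k trades coverage for reliability: with only three usable neighbours, the estimate is no longer personalized.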

In [None]:
# Predict the play count for a user who has listened to the song. Take user_id 6958, song_id 1671 and r_ui = 2
sim_user_user_opt.predict(uid=6958, iid=1671, r_ui=2, verbose=True)

user: 6958       item: 1671       r_ui = 2.00   est = 1.99   {'actual_k': 23, 'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=1.9904297600569851, details={'actual_k': 23, 'was_impossible': False})

In [None]:
# Predict the play count for a song that is not listened to by the user (with user_id 6958)
sim_user_user_opt.predict(uid=6958, iid=3232, verbose=True)

user: 6958       item: 3232       r_ui = None   est = 2.02   {'was_impossible': True, 'reason': 'Not enough neighbors.'}


Prediction(uid=6958, iid=3232, r_ui=None, est=2.0171783532298884, details={'was_impossible': True, 'reason': 'Not enough neighbors.'})

**Observations and Insights:______________**

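1) For user 6958 and song 1671, the predicted rating (1.99) is very close to the actual rating (2), and 23 neighbours were used.

2) For user/item (6958/3232) there is no existing interaction. The model could not find enough neighbours (fewer than min_k = 9), so the prediction is flagged as impossible and the estimate of 2.02 is the fallback to the global average rating rather than a neighbourhood-based prediction. Since 2.02 is above the rating threshold of 1.5, the song would still count as recommended, but the estimate carries no user-specific information.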

**Think About It:** Along with making predictions on listened and unknown songs can we get 5 nearest neighbors (most similar) to a certain song?
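One caveat when reading the neighbours returned in the next cell: `get_neighbors` works with the library's inner ids, not the raw ids from `df_final`, and in a user-based model the neighbours are users rather than songs. The sketch below illustrates the idea of such a mapping with a made-up helper, assuming inner ids are assigned in order of first appearance in the training data. In Surprise, the fitted trainset offers `to_raw_uid` and `to_raw_iid` for converting back.

```python
def build_id_maps(raw_ids):
    """Assign consecutive inner ids to raw ids in order of first appearance."""
    raw_to_inner = {}
    for raw in raw_ids:
        if raw not in raw_to_inner:
            raw_to_inner[raw] = len(raw_to_inner)
    inner_to_raw = {inner: raw for raw, inner in raw_to_inner.items()}
    return raw_to_inner, inner_to_raw

# Hypothetical sequence of raw user ids as they appear in the training rows
raw_user_ids = [6958, 11, 6958, 57, 11]
raw_to_inner, inner_to_raw = build_id_maps(raw_user_ids)

# Translate a (made-up) list of neighbour inner ids back to raw user ids
neighbour_inner_ids = [2, 1]
neighbour_raw_ids = [inner_to_raw[i] for i in neighbour_inner_ids]
print(raw_to_inner, neighbour_raw_ids)
```

Because the train/test split shuffles the rows, inner id 0 is generally not the first user in `df_final`.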

In [None]:
# Five nearest neighbours of inner id 0 (in a user-based model these are users, returned as inner ids)
sim_user_user_opt.get_neighbors(0, 5)

[1550, 798, 502, 2357, 1931]

Below we will be implementing a function where the input parameters are:

- data: A **song** dataset
- user_id: A user-id **against which we want the recommendations**
- top_n: The **number of songs we want to recommend**
- algo: The algorithm we want to use **for predicting the play_count**
- The output of the function is a **set of top_n items** recommended for the given user_id based on the given algorithm
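The interface described above can be exercised without a trained model. In the sketch below, `SongMeanPredictor` is a hypothetical stand-in that only imitates the `predict(uid, iid).est` access pattern, and `recommend` is a pandas-free simplification of the same steps: find the songs the user has not played, score them, and keep the top n.

```python
from collections import namedtuple

Prediction = namedtuple("Prediction", ["uid", "iid", "est"])

class SongMeanPredictor:
    """Hypothetical stand-in for a fitted model: predicts each song's mean play count."""

    def __init__(self, rows):
        totals, counts = {}, {}
        for _, song, plays in rows:
            totals[song] = totals.get(song, 0) + plays
            counts[song] = counts.get(song, 0) + 1
        self.means = {song: totals[song] / counts[song] for song in totals}

    def predict(self, uid, iid):
        return Prediction(uid, iid, self.means[iid])

def recommend(rows, user_id, top_n, algo):
    """Return the top_n (song_id, estimate) pairs among songs the user has not played."""
    heard = {song for user, song, _ in rows if user == user_id}
    candidates = {song for _, song, _ in rows} - heard
    scored = [(song, algo.predict(user_id, song).est) for song in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Made-up rows: (user_id, song_id, play_count)
rows = [(1, 10, 1), (1, 20, 3), (2, 10, 2), (2, 30, 5), (3, 30, 4), (3, 40, 2)]
recs = recommend(rows, user_id=1, top_n=2, algo=SongMeanPredictor(rows))
print(recs)
```

Any object exposing the same `predict` call can be swapped in, which is what lets the notebook reuse one function across the similarity, matrix factorization and clustering models.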

In [None]:
def get_recommendations(data, user_id, top_n, algo):
    
    # Creating an empty list to store the recommended song ids
    recommendations = []
    
    # Creating a user-item interactions matrix
    user_item_interactions_matrix = data.pivot(index = 'user_id', columns = 'song_id', values = 'play_count')
    
    # Extracting the song ids which the user_id has not listened to yet
    non_interacted_songs = user_item_interactions_matrix.loc[user_id][user_item_interactions_matrix.loc[user_id].isnull()].index.tolist()
    
    # Looping through each of the song ids which user_id has not interacted with yet
    for item_id in non_interacted_songs:
        
        # Predicting the play count for each song not yet heard by this user
        est = algo.predict(user_id, item_id).est
        
        # Appending the predicted play count
        recommendations.append((item_id, est))

    # Sorting the predicted play counts in descending order
    recommendations.sort(key = lambda x : x[1], reverse = True)

    return recommendations[:top_n] # Returning the top n songs with the highest predicted play count for this user

In [None]:
# Make top 5 recommendations for user_id 6958 with a similarity-based recommendation engine
recommendations =get_recommendations(df_final, 6958, 5, sim_user_user_opt)

In [None]:
# Building the dataframe for above recommendations with columns "song_id" and "prediction"
pd.DataFrame(recommendations, columns=['song_id', 'prediction'])

,song_id,prediction
0,8324,3.625336
1,614,3.468239
2,4831,3.426938
3,5653,3.42494
4,8831,3.384205


**Observations and Insights:______________**

The predicted ratings for all five songs are above the rating threshold of 1.5, so all of them are recommended to the user. Among these five, the model predicts that song 8324 would be liked most and song 8831 least.

### Correcting the play_counts and Ranking the above songs

In [None]:
def ranking_songs(recommendations, final_rating):
  # Sort the songs based on play counts
  ranked_songs = final_rating.loc[[items[0] for items in recommendations]].sort_values('play_freq', ascending = False)[['play_freq']].reset_index()

  # Merge with the recommended songs to get predicted play_count
  ranked_songs = ranked_songs.merge(pd.DataFrame(recommendations, columns = ['song_id', 'predicted_ratings']), on = 'song_id', how = 'inner')

  # Rank the songs based on corrected play_counts
  ranked_songs['corrected_ratings'] = ranked_songs['predicted_ratings'] - 1 / np.sqrt(ranked_songs['play_freq'])

  # Sort the songs based on corrected play_counts
  ranked_songs = ranked_songs.sort_values('corrected_ratings', ascending=False)
  
  return ranked_songs

**Think About It:** In the above function to correct the predicted play_count a quantity 1/np.sqrt(n) is subtracted. What is the intuition behind it? Is it also possible to add this quantity instead of subtracting?
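One way to build intuition for this question is to compute the correction by hand. The sketch below uses two (play_freq, predicted rating) pairs taken from the user-user results in this notebook. The interpretation in the comments is a suggested reading rather than a definitive answer: the penalty shrinks as play_freq grows, so predictions backed by fewer plays are discounted more.

```python
import math

def corrected_rating(predicted, play_freq):
    """Subtract a penalty that shrinks as the song accumulates more plays."""
    return predicted - 1 / math.sqrt(play_freq)

# (song_id, play_freq, predicted_rating) from the user-user recommendations
rows = [(8324, 537, 3.625336), (614, 2067, 3.468239)]

for song_id, play_freq, predicted in rows:
    penalty = 1 / math.sqrt(play_freq)
    print(song_id, round(penalty, 4), round(corrected_rating(predicted, play_freq), 4))
```

Song 614 has roughly four times as many plays as song 8324, so its penalty is about half as large. Adding the quantity instead would be the optimistic counterpart, giving rarely played songs a boost rather than a discount.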

In [None]:
# Applying the ranking_songs function on the final_play data
ranking_songs(recommendations, final_play)

,song_id,play_freq,predicted_ratings,corrected_ratings
1,8324,537,3.625336,3.582183
0,614,2067,3.468239,3.446243
2,5653,508,3.42494,3.380572
4,4831,354,3.426938,3.373788
3,8831,410,3.384205,3.334819


**Observations and Insights:______________**

The predicted ratings are adjusted to account for play_freq. The correction moved song_id 4831 from 3rd position (based on predicted_ratings) to 4th position (based on corrected_ratings).

### Item Item Similarity-based collaborative filtering recommendation systems 

In [None]:
# Apply the item-item similarity collaborative filtering model with random_state = 1 and evaluate the model performance

# Build default model
sim_options = {'name': 'cosine',
               'user_based':False}

# initialize the model
sim_item_item = KNNBasic(sim_options=sim_options, random_state = 1)

#Train the model
sim_item_item.fit(trainset)

#Printing model accuracy
precision_recall_at_k(sim_item_item, k=30, threshold=1.5)

Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 1.2944
Precision:  0.385
Recall:  0.685
F_1 score:  0.493


**Observations and Insights:______________**

1. The item-item similarity model performed better in terms of RMSE but worse in terms of precision, recall and F_1 score than the baseline user-user model.

In [None]:
# Predicting play count for a sample user_id 6958 and song (with song_id 1671) heard by the user
sim_item_item.predict(uid=6958, iid=1671, r_ui = 2, verbose=True)

user: 6958       item: 1671       r_ui = 2.00   est = 1.35   {'actual_k': 20, 'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=1.3532404188378213, details={'actual_k': 20, 'was_impossible': False})

In [None]:
#Find users that have not interacted with song_id 1671 to answer next question

user_item_interactions_matrix = df_final.pivot(index = 'user_id', columns = 'song_id', values = 'play_count')
non_interacted_user = user_item_interactions_matrix.loc[:, 1671][user_item_interactions_matrix.loc[:,1671].isnull()].index.tolist()
non_interacted_user[:5]

[11, 17, 57, 84, 120]

In [None]:
# Predict the play count for a user that has not listened to the song (with song_id 1671)
sim_item_item.predict(uid=11, iid=1671, verbose=True)

user: 11         item: 1671       r_ui = None   est = 1.34   {'actual_k': 3, 'was_impossible': False}


Prediction(uid=11, iid=1671, r_ui=None, est=1.3391344199837052, details={'actual_k': 3, 'was_impossible': False})

**Observations and Insights:______________**

1. For user_id 6958 and song_id 1671 the prediction is 1.35, which is further from the actual play count (2) than the baseline user-user prediction (1.65). It also falls below the 1.5 threshold, so the model would wrongly treat a song the user actually played twice as not relevant.

2. For user 11, who has not listened to song 1671, the predicted rating is 1.34, so the song would not be recommended to this user under the rating threshold of 1.5.

In [None]:
# Apply grid search for enhancing model performance

# Setting up parameter grid to tune the hyperparameters
param_grid = {'k': [20, 30, 40, 50], 'min_k': [3, 6, 9],
              'sim_options': {'name': ["cosine", 'pearson', "pearson_baseline"],
                              'user_based': [False], "min_support": [2, 4]}
              }

# Performing 3-fold cross-validation to tune the hyperparameters
gs = GridSearchCV(KNNBasic, param_grid=param_grid, measures=['rmse'], cv = 3, n_jobs= -1 )

# Fitting the data
gs.fit(data)

# Find the best RMSE score
print(gs.best_score['rmse'])
# Extract the combination of parameters that gave the best RMSE score
print(gs.best_params['rmse'])

1.2396743129798227
{'k': 20, 'min_k': 6, 'sim_options': {'name': 'pearson_baseline', 'user_based': False, 'min_support': 2}}


In [None]:
# Apply the best model found in the grid search
sim_options = {'name': 'pearson_baseline',
               'user_based':False, 'min_support':2}

sim_item_item_opt = KNNBasic(sim_options=sim_options, k = 20, min_k=6, verbose=True, random_state=1)

#training the model
sim_item_item_opt.fit(trainset)

#Calculating precision Recall

precision_recall_at_k(sim_item_item_opt, k = 30, threshold = 1.5)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
RMSE: 1.2507
Precision:  0.464
Recall:  0.742
F_1 score:  0.571


**Observations and Insights:______________**

1. The optimized item-item model has a lower RMSE than the optimized user-user model (1.2507 vs 1.2985) and a comparable precision (0.464 vs 0.461), but its recall (0.742 vs 0.831) and F_1 score (0.571 vs 0.593) are lower.

In [None]:
# Predict the play_count by a user(user_id 6958) for the song (song_id 1671)
sim_item_item_opt.predict(uid = 6958, iid = 1671, r_ui = 2, verbose = True)


user: 6958       item: 1671       r_ui = 2.00   est = 2.05   {'actual_k': 6, 'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=2.048247339119229, details={'actual_k': 6, 'was_impossible': False})

In [None]:
# Predicting play count for a sample user_id 6958 with song_id 3232 which is not heard by the user
sim_item_item_opt.predict(uid=6958, iid = 3232, verbose = True)

user: 6958       item: 3232       r_ui = None   est = 1.08   {'actual_k': 10, 'was_impossible': False}


Prediction(uid=6958, iid=3232, r_ui=None, est=1.0753932700253295, details={'actual_k': 10, 'was_impossible': False})

**Observations and Insights:______________**

1. The estimated rating for user/item (6958/1671) is 2.05, very close to the actual rating of 2.00, based on 6 neighbours.

2. The estimated rating for user/item (6958/3232) is 1.08, so the song would not be recommended to the user. The prediction is based on 10 neighbours.

In [None]:
# Find five most similar items to the item with inner id 0
sim_item_item_opt.get_neighbors(0, 5)

[215, 365, 397, 97, 425]

In [None]:
# Making top 5 recommendations for user_id 6958 with item_item_similarity-based recommendation engine
recommendations = get_recommendations(df_final, 6958, 5, sim_item_item_opt)

In [None]:
# Building the dataframe for above recommendations with columns "song_id" and "predicted_play_count"
pd.DataFrame(recommendations, columns = ['song_id', 'predictions'])

,song_id,predictions
0,139,2.424359
1,4178,2.4048
2,2672,2.383264
3,3101,2.147891
4,9391,2.122287


In [None]:
# Applying the ranking_songs function
ranking_songs(recommendations, final_play)

,song_id,play_freq,predicted_ratings,corrected_ratings
3,139,277,2.424359,2.364275
2,4178,444,2.4048,2.357343
0,2672,869,2.383264,2.349341
1,9391,490,2.122287,2.077112
4,3101,197,2.147891,2.076643


**Observations and Insights:_________**

1) Based on corrected ratings, song_id 3101 moved down from 4th position to 5th position.

2) The song recommendations from the item-item model are completely different from those of the user-user model.

### Model Based Collaborative Filtering - Matrix Factorization

Model-based Collaborative Filtering is a **personalized recommendation system**: the recommendations are based on the past behavior of the user and do not depend on any additional information. We use **latent features** to find recommendations for each user.
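The way latent features turn into a prediction can be sketched as a global mean plus user and item biases plus a dot product of latent vectors. The function and all numbers below are made up for illustration, and the handling of unknown ids is an assumption modelled on the usual biased matrix factorization setup: an id that was not seen in training contributes nothing, so a completely unknown pair falls back to the global mean.

```python
def mf_estimate(user, item, global_mean, user_bias, item_bias, user_factors, item_factors):
    """Biased matrix factorization estimate with a fallback for unseen ids."""
    est = global_mean
    if user in user_bias:
        est += user_bias[user]
    if item in item_bias:
        est += item_bias[item]
    # The latent interaction term needs both the user and the item to be known
    if user in user_factors and item in item_factors:
        est += sum(p * q for p, q in zip(user_factors[user], item_factors[item]))
    return est

# Hypothetical learned parameters with two latent features
global_mean = 2.0
user_bias = {6958: -0.3}
item_bias = {1671: 0.1}
user_factors = {6958: [0.5, -0.2]}
item_factors = {1671: [0.4, 0.5]}

params = (global_mean, user_bias, item_bias, user_factors, item_factors)
known = mf_estimate(6958, 1671, *params)
unknown = mf_estimate("6958", "1671", *params)
print(known, unknown)
```

The second call passes the ids as strings. Since the parameters are keyed by integers, neither id is found and the estimate is just the global mean, which is worth keeping in mind when calling `predict` with raw ids.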

In [None]:
# Build baseline model using svd

svd = SVD(random_state=1)

# Training the algorithm on the train set
svd.fit(trainset)

# Use the function precision_recall_at_k to compute precision@k, recall@k, F1-Score, and RMSE
precision_recall_at_k(svd)

RMSE: 1.2449
Precision:  0.467
Recall:  0.758
F_1 score:  0.578


In [None]:
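# Making prediction for user (with user_id 6958) to song (with song_id 1671), take r_ui = 2
# Note: the ids are passed as strings here, while df_final stores them as integers, so Surprise
# treats both as unknown and returns the global mean (est = 2.0172 below).
# Passing uid = 6958 and iid = 1671 as integers would give a personalized estimate.
svd.predict("6958", "1671", r_ui = 2, verbose = True)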

user: 6958       item: 1671       r_ui = 2.00   est = 2.02   {'was_impossible': False}


Prediction(uid='6958', iid='1671', r_ui=2, est=2.0171783532298884, details={'was_impossible': False})

In [None]:
# Find users that have not interacted with song_id 3232 to answer the next question

user_item_interactions_matrix = df_final.pivot(index = 'user_id', columns = 'song_id', values = 'play_count')
non_interacted_user = user_item_interactions_matrix.loc[:, 3232][user_item_interactions_matrix.loc[:,3232].isnull()].index.tolist()
non_interacted_user[:5]

[11, 17, 57, 84, 120]

In [None]:
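# Making a prediction for the user who has not listened to the song (song_id 3232)
# Note: as above, the string ids do not match the integer ids in the data, so the
# estimate below is the global mean rather than a personalized prediction.
svd.predict("11", "3232", verbose = True)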

user: 11         item: 3232       r_ui = None   est = 2.02   {'was_impossible': False}


Prediction(uid='11', iid='3232', r_ui=None, est=2.0171783532298884, details={'was_impossible': False})

#### Improving matrix factorization based recommendation system by tuning its hyperparameters

In [None]:
# Set the parameter space to tune
param_grid = {'n_epochs': [10, 20, 30], 'lr_all': [0.001, 0.005, 0.01],
              'reg_all': [0.2, 0.4, 0.6]}

# Perform 3-fold grid-search cross-validation
gs = GridSearchCV(SVD, param_grid=param_grid, measures=['rmse'], cv = 3, n_jobs=-1)

# Fitting data
gs.fit(data)
# Best RMSE score
print(gs.best_score['rmse'])
# Combination of parameters that gave the best RMSE score
print(gs.best_params['rmse'])

1.2201647735455816
{'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.2}


In [None]:
# Building the optimized SVD model using optimal hyperparameters
svd_optimized = SVD(n_epochs = 30, lr_all = 0.01, reg_all = 0.2, random_state = 1)

# Train the algorithm on the train set
svd_optimized.fit(trainset)

# Use the function precision_recall_at_k to compute precision@k, recall@k, F1-Score, and RMSE
precision_recall_at_k(svd_optimized)

RMSE: 1.2224
Precision:  0.468
Recall:  0.772
F_1 score:  0.583


**Observations and Insights:_________**

1. The optimized model performed better than the baseline SVD model.

2. This model performed better on all metrics compared to the optimized item-item model.

3. This model performed better on RMSE compared to the optimized user-user model but worse on F_1 score (0.583 vs 0.593).

In [None]:
# Using the svd_optimized model to predict for user_id 6958 and song_id 1671
svd_optimized.predict(uid = 6958, iid = 1671, r_ui = 2, verbose = True)

user: 6958       item: 1671       r_ui = 2.00   est = 1.29   {'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=1.2857012375506207, details={'was_impossible': False})

In [None]:
# Using the svd_optimized model to predict for user_id 6958 and song_id 3232 (no known play count)
svd_optimized.predict(uid = 6958, iid = 3232, verbose = True)

user: 6958       item: 3232       r_ui = None   est = 1.27   {'was_impossible': False}


Prediction(uid=6958, iid=3232, r_ui=None, est=1.2669219049189524, details={'was_impossible': False})

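**Observations and Insights:_________**

For user 6958 the optimized SVD model estimates 1.29 for song 1671 (actual play count 2) and 1.27 for the unheard song 3232. Both estimates are below the 1.5 threshold, so neither song would be recommended.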

In [None]:
# Getting top 5 recommendations for user_id 6958 using "svd_optimized" algorithm
svd_recommendations = get_recommendations(df_final, 6958, 5, svd_optimized)

In [None]:
# Ranking songs based on above recommendations
ranking_songs(svd_recommendations, final_play)

,song_id,play_freq,predicted_ratings,corrected_ratings
2,7224,531,2.87057,2.827174
3,5653,508,2.561234,2.516866
1,8324,537,2.470953,2.4278
4,4831,354,2.414656,2.361507
0,614,2067,2.33905,2.317055


**Observations and Insights:_________**

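The SVD recommendations share four of their five songs (5653, 8324, 4831 and 614) with the user-user recommendations, although in a different order; only song 7224 is new. None of them overlap with the item-item recommendations.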

### Cluster Based Recommendation System

In **clustering-based recommendation systems**, we explore the **similarities and differences** in people's tastes in songs based on how they rate different songs. We cluster similar users together and recommend songs to a user based on play_counts from other users in the same cluster.
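A co-clustering prediction can be sketched as the average of the user-cluster/item-cluster block, adjusted by how the user and the song deviate from their own cluster averages. The play counts and cluster assignments below are made up, and the formula is the commonly cited co-clustering estimate rather than a copy of the library's code.

```python
from statistics import mean

# Hypothetical play counts keyed by (user, song), with fixed cluster assignments
ratings = {
    ("u1", "s1"): 1, ("u1", "s2"): 3,
    ("u2", "s1"): 2, ("u2", "s3"): 4,
    ("u3", "s2"): 5, ("u3", "s3"): 3,
}
user_cluster = {"u1": 0, "u2": 0, "u3": 1}
item_cluster = {"s1": 0, "s2": 1, "s3": 1}

def cocluster_estimate(user, item):
    """Co-cluster mean plus the user's and the song's offsets from their cluster means."""
    cu, ci = user_cluster[user], item_cluster[item]
    user_mean = mean(r for (u, _), r in ratings.items() if u == user)
    item_mean = mean(r for (_, i), r in ratings.items() if i == item)
    user_cluster_mean = mean(r for (u, _), r in ratings.items() if user_cluster[u] == cu)
    item_cluster_mean = mean(r for (_, i), r in ratings.items() if item_cluster[i] == ci)
    cocluster_mean = mean(
        r for (u, i), r in ratings.items()
        if user_cluster[u] == cu and item_cluster[i] == ci
    )
    return cocluster_mean + (user_mean - user_cluster_mean) + (item_mean - item_cluster_mean)

est = cocluster_estimate("u1", "s3")
print(est)
```

User u1 plays songs less than the rest of their cluster and song s3 is played less than the rest of its cluster, so the estimate ends up below the co-cluster average of 3.5.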

In [None]:
# Make baseline clustering model
cc = CoClustering(random_state=1)
cc.fit(trainset)

precision_recall_at_k(cc, k = 30, threshold = 1.5)

RMSE: 1.2881
Precision:  0.457
Recall:  0.654
F_1 score:  0.538


In [None]:
# Making prediction for user_id 6958 and song_id 1671
cc.predict(uid=6958, iid=1671, r_ui=2, verbose=True)

user: 6958       item: 1671       r_ui = 2.00   est = 0.95   {'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=0.952914534700418, details={'was_impossible': False})

In [None]:
# Making prediction for user (userid 6958) for a song(song_id 3232) not heard by the user
cc.predict(uid=6958, iid=3232, verbose=True)

user: 6958       item: 3232       r_ui = None   est = 1.19   {'was_impossible': False}


Prediction(uid=6958, iid=3232, r_ui=None, est=1.1866726556399483, details={'was_impossible': False})

#### Improving clustering-based recommendation system by tuning its hyper-parameters

In [None]:
# Set the parameter space to tune
param_grid = {'n_cltr_u': [5, 6, 7, 8], 'n_cltr_i': [5, 6, 7, 8], 'n_epochs': [10, 20, 30]}

# Performing grid search cross-validation (cv is not set, so Surprise's default of 5 folds is used)
gscc = GridSearchCV(CoClustering, param_grid=param_grid, measures = ['rmse'], n_jobs=-1 )
# Fitting data
gscc.fit(data)
# Best RMSE score
print(gscc.best_score['rmse'])
# Combination of parameters that gave the best RMSE score
print(gscc.best_params['rmse'])

1.2770197574080915
{'n_cltr_u': 5, 'n_cltr_i': 8, 'n_epochs': 30}


In [None]:
# Train the tuned CoClustering algorithm (no random_state is set here, so results can vary between runs)
cc_opt = CoClustering(n_cltr_u= 5, n_cltr_i= 8, n_epochs=30)
cc_opt.fit(trainset)

precision_recall_at_k(cc_opt, k=30, threshold=1.5)

RMSE: 1.3048
Precision:  0.457
Recall:  0.653
F_1 score:  0.538


**Observations and Insights:_________**

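Tuning did not help here: the tuned CoClustering model has a higher test RMSE than the baseline CoClustering model (1.3048 vs 1.2881), with essentially the same precision, recall and F_1 score. It is also worse than each of the optimized user-user, item-item and SVD models on all four metrics (RMSE, precision, recall, F_1).

Its recall (0.653) is the lowest of all the models trained so far.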


In [None]:
# Using co_clustering_optimized model to recommend for userId 6958 and song_id 1671
cc_opt.predict(uid=6958, iid=1671, r_ui=2, verbose = True)

user: 6958       item: 1671       r_ui = 2.00   est = 1.28   {'was_impossible': False}


Prediction(uid=6958, iid=1671, r_ui=2, est=1.2764992390495908, details={'was_impossible': False})

In [None]:
# Use Co_clustering based optimized model to recommend for userId 6958 and song_id 3232 with unknown baseline rating
cc_opt.predict(uid=6958, iid=3232, verbose=True)

user: 6958       item: 3232       r_ui = None   est = 0.74   {'was_impossible': False}


Prediction(uid=6958, iid=3232, r_ui=None, est=0.7353099200648747, details={'was_impossible': False})

**Observations and Insights:_________**

Model performance is poor overall.
- For the user/item (6958/1671) the model estimated a rating of 1.28, which means the item would not be recommended to the user. However, the actual rating is 2.

- For the user/item (6958/3232) the model predicted a rating of 0.74 for the unheard song, which means the item would not be recommended to the user.

#### Implementing the recommendation algorithm based on optimized CoClustering model

In [None]:
# Getting top 5 recommendations for user_id 6958 using "Co-clustering based optimized" algorithm
clustering_recommendations = get_recommendations(df_final, 6958, 5, cc_opt)

### Correcting the play_count and Ranking the above songs

In [None]:
# Ranking songs based on the above recommendations
ranking_songs(clustering_recommendations, final_play)

,song_id,play_freq,predicted_ratings,corrected_ratings
0,614,2067,4.183777,4.161782
1,617,1215,3.715504,3.686816
2,9096,797,3.63168,3.596259
3,4954,563,3.455905,3.41376
4,8324,537,3.451213,3.40806


**Observations and Insights:_________**

In the ranking shown above, the correction does not change the order of the recommendations. Song 614, which has the highest play_freq (2067) among the recommended songs, also has the highest predicted rating (4.18). Because a co-clustering prediction depends on the average rating of the cluster a song is assigned to, a popular song can still receive a low prediction when its cluster has a low overall rating, which would be a fallout of poor clustering for that song.

### Content Based Recommendation Systems

In [None]:
df_small = df_final.copy()

In [None]:
df_small.shape

(130398, 7)

In [None]:
# Concatenate the "title", "release", "artist_name" columns to create a different column named "text"

df_small['text'] = df_small['title'] + " " + df_small['release'] + " " + df_small['artist_name']

In [None]:
# Keep the columns 'user_id', 'song_id', 'play_count', 'title', 'text' by dropping the others from df_small
df_small.drop(columns = ['release', 'artist_name', 'year'], inplace=True)

# Drop the duplicates from the title column
df_small = df_small.drop_duplicates(subset = 'title')

# Set the title column as the index
df_small.set_index(keys = 'title', inplace=True, drop=True)

# See the first 5 records of the df_small dataset
df_small.head()

title,user_id,song_id,play_count,text
Daisy And Prudence,6958,447,1,Daisy And Prudence Distillation Erin McKeown
The Ballad of Michael Valentine,6958,512,1,The Ballad of Michael Valentine Sawdust The Ki...
I Stand Corrected (Album),6958,549,1,I Stand Corrected (Album) Vampire Weekend Vamp...
They Might Follow You,6958,703,1,They Might Follow You Tiny Vipers Tiny Vipers
Monkey Man,6958,719,1,Monkey Man You Know I'm No Good Amy Winehouse


In [None]:
# check the shape of the df_small table again 

df_small.shape

(561, 4)
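The row count falls from 130,398 to 561 because `drop_duplicates(subset='title')` keeps only the first row for each title. A minimal sketch on made-up data (the rows below are hypothetical, not taken from `df_final`) shows the effect, including that the surviving `user_id` and `play_count` values simply come from whichever row appeared first:

```python
import pandas as pd

# Hypothetical toy interactions: two users played "Song A", one played "Song B"
toy = pd.DataFrame({
    "user_id": [1, 2, 2],
    "title": ["Song A", "Song A", "Song B"],
    "play_count": [3, 5, 1],
})

# Keep the first row seen for each title (the default keep="first")
unique_titles = toy.drop_duplicates(subset="title").set_index("title")
print(unique_titles)
```

After this step the table describes songs rather than user interactions, which is what the content-based approach needs.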

In [None]:
# Create a Series of song titles; its integer index gives each title's row position,
# which matches the row/column position in the similarity matrix built below
indices = pd.Series(df_small.index.tolist())
print(indices)

0                   Daisy And Prudence
1      The Ballad of Michael Valentine
2            I Stand Corrected (Album)
3                They Might Follow You
4                           Monkey Man
                    ...               
556                          Waterfall
557                       Tuesday Moon
558                          Starlight
559      Everything In Its Right Place
560                      The Last Song
Length: 561, dtype: object


In [None]:
! pip install nltk

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
# Importing necessary packages to work with text data
import nltk

# Download punkt library
nltk.download("punkt")

# Download stopwords library
nltk.download("stopwords")

# Download wordnet 
nltk.download("wordnet")

# Import regular expression
import re

# Import word_tokenizer
from nltk import word_tokenize

# Import WordNetLemmatizer
from nltk.stem import WordNetLemmatizer

# Import stopwords
from nltk.corpus import stopwords

# Import CountVectorizer and TfidfVectorizer
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [None]:
# Download 'omw-1.4', as recommended by the error raised while building the TF-IDF vectorizer
nltk.download('omw-1.4')

[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


True

We will create a **function to pre-process the text data:**

In [None]:
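# Build the stopword set and the lemmatizer once, instead of once per token
STOP_WORDS = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

# Function to clean, tokenize, and lemmatize the text
def tokenize(text):

    # Lowercase the text and replace every non-letter character with a space
    text = re.sub(r"[^a-zA-Z]", " ", text.lower())

    # Split the text into word tokens
    tokens = word_tokenize(text)

    # Remove English stopwords
    words = [word for word in tokens if word not in STOP_WORDS]

    # Lemmatize each remaining word
    text_lems = [lemmatizer.lemmatize(word).strip() for word in words]

    return text_lems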

In [None]:
# Create tfidf vectorizer 
tfidf = TfidfVectorizer(tokenizer = tokenize)

# Fit and transform the vectorizer on the text column, then convert the sparse output into a dense array
song_tfidf = tfidf.fit_transform(df_small['text'].values).toarray()

# pd.DataFrame(song_tfidf)
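To make the vectorization step concrete, here is a small sketch on three made-up `text` strings. It uses scikit-learn's default tokenizer rather than the custom `tokenize` function above so that it runs without the NLTK downloads; that substitution is an assumption made for illustration only.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical "text" values in the same title + release + artist style
toy_text = [
    "Learn To Fly Foo Fighters",
    "Everlong Foo Fighters",
    "Monkey Man Amy Winehouse",
]

# Default settings: lowercase, word tokens of 2+ characters, L2-normalized rows
toy_vectorizer = TfidfVectorizer()
toy_tfidf = toy_vectorizer.fit_transform(toy_text).toarray()

print(toy_tfidf.shape)                     # (number of texts, vocabulary size)
print(sorted(toy_vectorizer.vocabulary_))  # the learned vocabulary
print(np.linalg.norm(toy_tfidf, axis=1))   # length of each row vector
```

Each song becomes one row with a weight per vocabulary term. Terms that occur in fewer texts receive higher weights, and two songs that share no terms have no overlapping nonzero columns.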

In [None]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute the pairwise cosine similarity between the TF-IDF vectors of all songs
similar_songs = cosine_similarity(song_tfidf, song_tfidf)

# Let us see the above array
similar_songs

array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])
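Cosine similarity between two vectors is their dot product divided by the product of their lengths, so a song always scores 1 with itself and 0 with any song whose text shares no terms with it. This is why the array above has ones on the diagonal and mostly zeros elsewhere. A minimal NumPy sketch on hypothetical vectors:

```python
import numpy as np

# Hypothetical term-weight vectors for three songs (rows) over three terms (columns)
X = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.0, 2.0],
])

# Divide each row by its length, then take all pairwise dot products
unit_rows = X / np.linalg.norm(X, axis=1, keepdims=True)
toy_similarity = unit_rows @ unit_rows.T

print(np.round(toy_similarity, 3))
```

Songs 0 and 2 share no terms, so their similarity is 0, while songs 0 and 1 share one of their two terms and score 0.5.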

Let's create a function to find most similar songs to recommend for a given song.

In [None]:
# Function that takes in a song title as input and returns the 10 most similar songs
def recommendations(title, similar_songs):

    # Getting the index of the song that matches the title
    idx = indices[indices == title].index[0]

    # Creating a Series with the similarity scores in descending order
    score_series = pd.Series(similar_songs[idx]).sort_values(ascending = False)

    # Removing the input song itself so that it is never recommended
    score_series = score_series[score_series.index != idx]

    # Getting the indexes of the 10 most similar songs
    top_10_indexes = list(score_series.iloc[:10].index)
    print(top_10_indexes)

    # Populating the list with the titles of the best 10 matching songs
    recommended_songs = [indices[i] for i in top_10_indexes]

    return recommended_songs

Recommending 10 songs similar to 'Learn To Fly':

In [None]:
# Make the recommendation for the song with title 'Learn To Fly'
recommendations('Learn To Fly', similar_songs)

[525, 273, 447, 372, 419, 413, 368, 382, 381, 380]


['Everlong',
 'The Pretender',
 'Nothing Better (Album)',
 'From Left To Right',
 'Lifespan Of A Fly',
 'Closer',
 'LDN',
 'Rianna',
 'Eye Of The Tiger',
 "What I've Done (Album Version)"]
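The lookup inside `recommendations` amounts to sorting one row of the similarity matrix and skipping the query song. A sketch of the same idea with NumPy on a hypothetical 4-song matrix (the titles and scores are made up, and `top_k_similar` is an illustrative helper, not part of the notebook):

```python
import numpy as np

toy_titles = ["Song A", "Song B", "Song C", "Song D"]

# Hypothetical symmetric similarity matrix with ones on the diagonal
toy_sim = np.array([
    [1.0, 0.2, 0.7, 0.0],
    [0.2, 1.0, 0.1, 0.5],
    [0.7, 0.1, 1.0, 0.3],
    [0.0, 0.5, 0.3, 1.0],
])

def top_k_similar(title, k):
    idx = toy_titles.index(title)
    scores = toy_sim[idx].copy()
    # Exclude the query song so it cannot recommend itself
    scores[idx] = -np.inf
    # Sort by descending score; a stable sort keeps ties in their original order
    order = np.argsort(-scores, kind="stable")[:k]
    return [toy_titles[i] for i in order]

print(top_k_similar("Song A", 2))
```

When many songs share a similarity of exactly 0 with the query, their order at the tail of the list is arbitrary, so the later entries of a top-10 list may carry little information.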

**Observations and Insights:**

1. The best model is User_User_Similarity_Optimized, for the following reasons:
    - Its F1 score, the evaluation metric, is the highest at 0.593.
    - Its recall is the highest at 0.831.
    - Its precision is the lowest at 0.461; however, the other models have precision in the same range (0.464, 0.468, 0.461).
    - To increase precision, false positives must be lowered. To increase the predictive power of the models, we should consider collecting more features.
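For reference, the F1 score is the harmonic mean of precision and recall, where precision = TP / (TP + FP) and recall = TP / (TP + FN). A quick check that the reported precision and recall reproduce the reported F1 score:

```python
# Precision and recall reported above for User_User_Similarity_Optimized
precision = 0.461
recall = 0.831

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))
```

Because the harmonic mean is pulled toward the smaller of the two values, the model's high recall only partly offsets its lower precision.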

## **Conclusion and Recommendations:** Check Document for the answer to these questions

- **Refined Insights -** What are the most meaningful insights from the data relevant to the problem?

- **Comparison of various techniques and their relative performance -** How do different techniques perform? Which one is performing relatively better? Is there scope to improve the performance further?

- **Proposal for the final solution design -** What model do you propose to be adopted? Why is this the best solution to adopt?