# Recommendation System

In this notebook, I use the well-known MovieLens dataset (`ml-latest-small`) to recommend movies to a user with collaborative filtering.

## Import

In [7]:
import surprise
import pandas as pd

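from surprise import Reader, Dataset
# NOTE: `evaluate` (and `Dataset.split`, used below) belong to the older Surprise API.
# They were deprecated in Surprise 1.0.5 and are absent from current releases,
# where `surprise.model_selection.cross_validate` is the replacement.
from surprise import SVD, NMF, KNNBasic, evaluate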

In [9]:
ratings = pd.read_csv('data/ml-latest-small/ratings.csv')

In [10]:
# To load a dataset from a pandas DataFrame, use Surprise's `Dataset.load_from_df` method.

ratings_dict = {'itemID': list(ratings.movieId),
                'userID': list(ratings.userId),
                'rating': list(ratings.rating)}
df = pd.DataFrame(ratings_dict)

In [12]:
# A reader is still needed but only the rating_scale param is required.
# The Reader class is used to parse a file containing ratings.
reader = Reader(rating_scale=(0.5, 5.0))


In [13]:
# The columns must correspond to user id, item id and ratings (in that order).
data = Dataset.load_from_df(df[['userID', 'itemID', 'rating']], reader)

The cells below show the code and output for three algorithms: **Singular Value Decomposition (SVD)**, **Non-negative Matrix Factorization (NMF)** and **k-Nearest Neighbours (KNN)**.

In [8]:
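# Split data into 5 folds (used by `evaluate` below)
data.split(n_folds=5)

# Build a trainset from all ratings so the models below can be fitted
trainset = data.build_full_trainset()

# Raw user and item ids. The data was loaded from a DataFrame, so the ids keep the
# DataFrame's dtype (integers here); they are strings only when loaded from a file.
uid = 196
iid = 302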
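To make the idea of a 5-fold split concrete, here is a minimal sketch in plain Python. It is not how Surprise implements the split; the helper name `k_fold_indices` and the fixed seed are assumptions made for this illustration only.

```python
import random

def k_fold_indices(n_ratings, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every rating is tested exactly once."""
    idx = list(range(n_ratings))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    for k in range(n_folds):
        test = idx[k::n_folds]                # every n_folds-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test

# Toy example: 10 ratings split into 5 folds
folds = list(k_fold_indices(10, n_folds=5))
for train, test in folds:
    print(len(train), len(test))
```

Each model is trained on four folds and scored on the remaining one, so every rating contributes to exactly one test score.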


## Predict

In [15]:
# get a prediction for specific users and items.
algo_svd = SVD()
algo_svd.fit(trainset)
pred_svd = algo_svd.predict(uid, iid)
print(pred_svd)
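For intuition, Surprise documents its SVD estimate as the global mean plus a user bias, an item bias and the dot product of the user and item factor vectors, clipped to the rating scale. The sketch below reproduces that formula in plain Python. The function name `predict_rating` and every number in it are made up for illustration; they are not values learned by `algo_svd`.

```python
def predict_rating(mu, b_u, b_i, p_u, q_i, low=0.5, high=5.0):
    """Estimate r_ui = mu + b_u + b_i + q_i . p_u, clipped to [low, high]."""
    raw = mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))
    return min(high, max(low, raw))

# Hypothetical parameters for one (user, item) pair
est = predict_rating(mu=3.5, b_u=0.2, b_i=-0.1, p_u=[0.1, -0.3], q_i=[0.4, 0.2])
print(round(est, 2))

# An estimate above the scale is clipped to the maximum rating
clipped = predict_rating(mu=4.9, b_u=0.5, b_i=0.3, p_u=[], q_i=[])
print(clipped)
```

For a user or item that was not in the trainset, the corresponding bias and factors are treated as zero, so the estimate falls back toward the global mean.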

In [16]:
algo_nmf = NMF()
algo_nmf.fit(trainset)
pred_nmf = algo_nmf.predict(uid, iid)
print(pred_nmf)

In [17]:
algo_knn = KNNBasic()
algo_knn.fit(trainset)
pred_knn = algo_knn.predict(uid, iid)
print(pred_knn)

Computing the msd similarity matrix...
Done computing similarity matrix.
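The log above mentions the MSD (mean squared difference) similarity, which `KNNBasic` uses by default. As a rough sketch of the idea, the similarity between two users is 1 / (msd + 1), where msd is the mean squared difference of their ratings on the items both have rated. The function name and the two toy users below are assumptions for illustration, not part of the dataset.

```python
def msd_similarity(ratings_u, ratings_v):
    """Similarity in (0, 1] from the mean squared difference on common items."""
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0                      # no overlap: treat as not similar
    msd = sum((ratings_u[i] - ratings_v[i]) ** 2 for i in common) / len(common)
    return 1.0 / (msd + 1.0)

# Toy users: {item id: rating}
alice = {1: 4.0, 2: 3.0, 3: 5.0}
bob = {1: 5.0, 2: 3.0, 4: 2.0}

sim = msd_similarity(alice, bob)
print(round(sim, 4))
```

Identical ratings on the common items give a similarity of 1, and the value shrinks toward 0 as the ratings diverge.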


## Testing accuracy

In [12]:
# Testing for SVD
evaluate(algo_svd, data, measures=['RMSE'])

Evaluating RMSE of algorithm SVD.

------------
Fold 1
RMSE: 0.9379
------------
Fold 2
RMSE: 0.9476
------------
Fold 3
RMSE: 0.9344
------------
Fold 4
RMSE: 0.9308
------------
Fold 5
RMSE: 0.9330
------------
------------
Mean RMSE: 0.9368
------------
------------


CaseInsensitiveDefaultDict(list,
                           {'rmse': [0.9378937978577937,
                             0.9476473759486812,
                             0.9343927360650216,
                             0.9307767327379257,
                             0.9330463797576408]})
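As a reminder of what these numbers measure, RMSE is the square root of the mean squared difference between predicted and actual ratings. The sketch below defines it in plain Python on two made-up ratings, then checks that averaging the five per-fold values printed above reproduces the reported mean. The helper name `rmse` is an assumption for this example.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equally long rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up predictions versus true ratings
toy = rmse([3.0, 4.0], [4.0, 4.0])
print(round(toy, 4))

# Per-fold RMSE values copied from the SVD output above
svd_fold_rmse = [0.9378937978577937, 0.9476473759486812, 0.9343927360650216,
                 0.9307767327379257, 0.9330463797576408]
mean_rmse = sum(svd_fold_rmse) / len(svd_fold_rmse)
print(round(mean_rmse, 4))  # 0.9368, matching "Mean RMSE" above
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.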

In [11]:
# Testing for NMF
evaluate(algo_nmf, data, measures=['RMSE'])

Evaluating RMSE of algorithm NMF.

------------
Fold 1
RMSE: 0.9649
------------
Fold 2
RMSE: 0.9645
------------
Fold 3
RMSE: 0.9583
------------
Fold 4
RMSE: 0.9570
------------
Fold 5
RMSE: 0.9614
------------
------------
Mean RMSE: 0.9612
------------
------------


CaseInsensitiveDefaultDict(list,
                           {'rmse': [0.9649378015675462,
                             0.9644515438441085,
                             0.9583185167777158,
                             0.9570394997897902,
                             0.9613606967932976]})
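With both sets of per-fold scores available, a quick fold-by-fold comparison shows how SVD and NMF differ on this data. The values below are copied from the two outputs above. Pairing them by position assumes both runs were scored on the same folds, which the notebook does not confirm.

```python
# Per-fold RMSE values copied from the SVD and NMF outputs above
svd_rmse = [0.9378937978577937, 0.9476473759486812, 0.9343927360650216,
            0.9307767327379257, 0.9330463797576408]
nmf_rmse = [0.9649378015675462, 0.9644515438441085, 0.9583185167777158,
            0.9570394997897902, 0.9613606967932976]

# Positive gap means NMF had the higher (worse) error in that fold
gaps = [n - s for s, n in zip(svd_rmse, nmf_rmse)]
svd_wins = sum(g > 0 for g in gaps)
mean_gap = sum(gaps) / len(gaps)

print(f"SVD has the lower RMSE in {svd_wins} of {len(gaps)} folds")
print(f"Mean gap (NMF - SVD): {mean_gap:.4f}")
```

On these runs SVD has the lower error in every fold, with an average gap of roughly 0.02 RMSE.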

In [None]:
# Testing for KNN
evaluate(algo_knn, data, measures=['RMSE'])