![](https://www.ieseg.fr/wp-content/uploads/IESEG-Logo-2012-rgb.jpg)


# Model Evaluation Exercises

# Movielens 100k Data
- u.data     -- The full u data set, 100000 ratings by 943 users on 1682 items.
              Each user has rated at least 20 movies.  Users and items are
              numbered consecutively from 1.  The data is randomly
              ordered. This is a tab separated list of
              user id | item id | rating | timestamp.
              The time stamps are unix seconds since 1/1/1970 UTC.

- u.info     -- The number of users, items, and ratings in the u data set.

- u.item     -- Information about the items (movies); this is a tab separated
              list of
              movie id | movie title | release date | video release date |
              IMDb URL | unknown | Action | Adventure | Animation |
              Children's | Comedy | Crime | Documentary | Drama | Fantasy |
              Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi |
              Thriller | War | Western |
              The last 19 fields are the genres, a 1 indicates the movie
              is of that genre, a 0 indicates it is not; movies can be in
              several genres at once.
              The movie ids are the ones used in the u.data data set.

- u.genre    -- A list of the genres.

- u.user     -- Demographic information about the users; this is a tab
              separated list of
              user id | age | gender | occupation | zip code
              The user ids are the ones used in the u.data data set.

- u.occupation -- A list of the occupations.
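
To make the `u.data` layout concrete, the sketch below parses one tab-separated record and converts its Unix timestamp to a UTC datetime. The sample line uses the values of the first row shown by `train.head()`; `parse_rating_line` is a hypothetical helper written for this illustration, not part of the dataset tooling.

```python
from datetime import datetime, timezone

def parse_rating_line(line):
    """Split one u.data record into typed fields (hypothetical helper)."""
    user, item, rating, ts = line.rstrip("\n").split("\t")
    return {
        "user": int(user),
        "item": int(item),
        "rating": int(rating),
        # Timestamps are Unix seconds since 1970-01-01 UTC.
        "rated_at": datetime.fromtimestamp(int(ts), tz=timezone.utc),
    }

record = parse_rating_line("907\t628\t5\t880158986\n")
print(record["user"], record["item"], record["rating"], record["rated_at"].isoformat())
```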

In [23]:
import numpy as np
import pandas as pd
import surprise
from sklearn.model_selection import train_test_split
from surprise import BaselineOnly, CoClustering, Dataset, KNNBasic, Reader, SVD, accuracy
from surprise.model_selection import GridSearchCV

# Course helper package; note that this `eval` shadows Python's built-in eval()
from IESEGRecSys import eval
from IESEGRecSys.eval import prediction_metrics

In [2]:
# Import user-rating matrix
data = pd.read_csv('u.data', sep='\t', header=None)
data.columns = ['user', 'item', 'rating', 'timestamp']

# train-test split
train, test = train_test_split(data, test_size=0.3, random_state=42)

# reset index
train = train.reset_index(drop=True)
test = test.reset_index(drop=True)

print(data.shape)
print(train.shape)
print(test.shape)

(100000, 4)
(70000, 4)
(30000, 4)


In [3]:
train.head()

|  | user | item | rating | timestamp |
|---|---|---|---|---|
| 0 | 907 | 628 | 5 | 880158986 |
| 1 | 622 | 206 | 1 | 882670899 |
| 2 | 18 | 480 | 4 | 880129595 |
| 3 | 484 | 699 | 4 | 891195773 |
| 4 | 871 | 690 | 3 | 888192315 |


# Exercise 1

- Apply and evaluate the following approaches:
    - user-based
    - item-based
    - matrix factorization
    - co-clustering

In [4]:
#create train and test sets that are exclusively user, item, and rating
trainUIR = train[['user','item','rating']]
testUIR = test[['user','item','rating']]

In [46]:
#set up reader with min rating and max rating arguments
reader = Reader(rating_scale=(1, 5))

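# surprise datasets: full data (used for cross-validation in Exercise 3),
# training set, and test set as a list of (user, item, rating) tuples
# note: this rebinds `data` from the pandas DataFrame to a surprise Dataset,
# so reload u.data before re-running this cell
data = Dataset.load_from_df(data[["user","item","rating"]], reader)
df_trainUIR = Dataset.load_from_df(trainUIR,reader).build_full_trainset()
df_testUIR = list(testUIR.itertuples(index=False, name=None))
# empty frame for collecting scores (not used in the cells below)
evaluationDf = pd.DataFrame(columns=['Model','EvaluationScore']).set_index(['Model'])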

## User-Based KNN

In [7]:
#create option parameter dict
optionsUb = {'name':'cosine','user_based':True}

#initialize user-based KNN model 
KNNbasicUb = KNNBasic(k=20,min_k=5,sim_options=optionsUb,random_state=123)

#fit on training set
KNNbasicUb.fit(df_trainUIR)

#predict test set
predsKNNbasicUb = KNNbasicUb.test(df_testUIR)

#compute rmse (stored under its own name so the imported `accuracy` module is not shadowed)
accuracyUb = surprise.accuracy.rmse(predsKNNbasicUb)
print("KNN user-based basic RMSE of: ",accuracyUb)

Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 1.0251
KNN user-based basic RMSE of:  1.0251119861485698
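
For intuition, here is a minimal pure-Python sketch of what a basic user-based KNN prediction with cosine similarity does: it compares users on the items they both rated, then takes a similarity-weighted average of the nearest neighbours' ratings. The toy ratings and the function names are made up for this illustration; this is not the Surprise implementation, which additionally handles details such as internal ids and minimum support.

```python
import heapq
from math import sqrt

# Toy ratings: {user: {item: rating}} (made-up values for illustration only).
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 5, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}

def cosine_sim(ru, rv):
    """Cosine similarity restricted to the items both users rated."""
    common = ru.keys() & rv.keys()
    if not common:
        return 0.0
    dot = sum(ru[i] * rv[i] for i in common)
    norm_u = sqrt(sum(ru[i] ** 2 for i in common))
    norm_v = sqrt(sum(rv[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_user_based(user, item, k=20, min_k=1):
    """Similarity-weighted mean of the k most similar users' ratings for item."""
    neighbors = [
        (cosine_sim(ratings[user], r_v), r_v[item])
        for v, r_v in ratings.items()
        if v != user and item in r_v
    ]
    top_k = heapq.nlargest(k, neighbors, key=lambda t: t[0])
    usable = [(s, r) for s, r in top_k if s > 0]
    if len(usable) < min_k:
        return None  # too few usable neighbours: no neighbourhood-based estimate
    return sum(s * r for s, r in usable) / sum(s for s, _ in usable)

pred = predict_user_based("u1", "d")
print(round(pred, 3))
```

With `min_k=5`, as in the cell above, a user-item pair with fewer than five positively similar neighbours gets no neighbourhood-based estimate; Surprise flags such predictions as impossible and falls back to a default estimate.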


In [32]:
evalKNNUb = eval.evaluate(prediction=predsKNNbasicUb, topn=5, rating_cutoff=4, excl_impossible=True)
evalKNNUb

Excluded 458 (30000) samples. 29542 remaining ...
Excluded 458 (30000) samples. 29542 remaining ...


|  | value |
|---|---|
| RMSE | 1.015158 |
| MAE | 0.803685 |
| Recall | 0.412776 |
| Precision | 0.763214 |
| F1 | 0.535781 |
| NDCG@5 | 0.88643 |
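
The RMSE here (1.015158) is lower than the 1.0251 printed by `surprise.accuracy.rmse` because 458 samples were excluded first. The sketch below shows RMSE and MAE with and without such an exclusion. It assumes that `excl_impossible=True` drops predictions flagged as impossible; the internals of `eval.evaluate` are not shown in this notebook, and the toy values are made up.

```python
from math import sqrt

# (true_rating, estimate, was_impossible): made-up values for illustration.
preds = [
    (4, 3.5, False),
    (2, 3.0, False),
    (5, 3.6, True),   # flagged: the model fell back to a default estimate
    (3, 3.0, False),
]

def rmse(pairs):
    """Root mean squared error over (true, estimate) pairs."""
    return sqrt(sum((r - e) ** 2 for r, e in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (true, estimate) pairs."""
    return sum(abs(r - e) for r, e in pairs) / len(pairs)

all_pairs = [(r, e) for r, e, _ in preds]
kept_pairs = [(r, e) for r, e, impossible in preds if not impossible]

print("RMSE all :", round(rmse(all_pairs), 4))
print("RMSE kept:", round(rmse(kept_pairs), 4))
print("MAE kept :", round(mae(kept_pairs), 4))
```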


## Item-Based KNN

In [8]:
#create option parameter dict
optionsIb = {'name':'cosine','user_based':False}

#initialize item-based KNN model 
KNNbasicIb = KNNBasic(k=20,min_k=5,sim_options=optionsIb,random_state=123)

#fit on training set
KNNbasicIb.fit(df_trainUIR)

#predict test set
predsKNNbasicIb = KNNbasicIb.test(df_testUIR)

#compute rmse
accuracyIb = surprise.accuracy.rmse(predsKNNbasicIb)
print("KNN item-based basic RMSE of: ",accuracyIb)

Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 1.0602
KNN item-based basic RMSE of:  1.0601934012805978


In [33]:
evalKNNIb = eval.evaluate(prediction=predsKNNbasicIb, topn=5, rating_cutoff=4, excl_impossible=True)
evalKNNIb

Excluded 68 (30000) samples. 29932 remaining ...
Excluded 68 (30000) samples. 29932 remaining ...


|  | value |
|---|---|
| RMSE | 1.058353 |
| MAE | 0.841862 |
| Recall | 0.256061 |
| Precision | 0.761691 |
| F1 | 0.383275 |
| NDCG@5 | 0.862949 |
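
Precision, recall and F1 depend on the `rating_cutoff=4` argument. The sketch below uses one common definition: an item is relevant if its true rating is at or above the cutoff, and recommended if its estimate is. Whether `eval.evaluate` treats the cutoff as inclusive, or averages per user, is an assumption here. The last lines only check that the F1 reported for the item-based KNN is the harmonic mean of the reported precision and recall.

```python
def classification_metrics(pairs, cutoff=4):
    """Precision, recall and F1, treating ratings >= cutoff as relevant."""
    tp = sum(1 for r, e in pairs if r >= cutoff and e >= cutoff)
    fp = sum(1 for r, e in pairs if r < cutoff and e >= cutoff)
    fn = sum(1 for r, e in pairs if r >= cutoff and e < cutoff)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# (true_rating, estimate) pairs: made-up values for illustration.
pairs = [(5, 4.2), (4, 3.7), (4, 4.1), (2, 4.0), (3, 2.9), (5, 3.9)]
precision, recall, f1 = classification_metrics(pairs)
print(precision, recall, f1)

# Consistency check against the item-based KNN values reported above.
p_ib, r_ib = 0.761691, 0.256061
f1_ib = 2 * p_ib * r_ib / (p_ib + r_ib)
print(round(f1_ib, 6))
```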


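On the full test set, the user-based KNN reaches a slightly lower (better) RMSE than the item-based KNN: 1.0251 versus 1.0602. The gap persists after excluding impossible predictions (1.0152 versus 1.0584).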

## ALS Baseline

In [9]:
#set baseline options
bsl_options = {'method': 'als',
               'n_epochs': 5,  #default is 10
               'reg_u': 10,    #regularization parameter for users
               'reg_i': 5      #regularization parameter for items
               }
#initialize the baseline-only algorithm (biases estimated with ALS) and fit to train set
AlsMod = BaselineOnly(bsl_options=bsl_options).fit(df_trainUIR)
#predict
AlsPreds = AlsMod.test(df_testUIR)
#eval
accuracyAls = surprise.accuracy.rmse(AlsPreds)

print("ALS baseline RMSE: ",accuracyAls)

Estimating biases using als...
RMSE: 0.9422
ALS baseline RMSE:  0.942238914461919
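
`BaselineOnly` predicts the global mean plus a user bias and an item bias. A minimal sketch of estimating those biases by alternating least squares follows, using the same `n_epochs`, `reg_u` and `reg_i` values as the cell above. The toy triples and the function name are made up for illustration; this is a simplified reading of the method, not the library code.

```python
from collections import defaultdict

# Toy (user, item, rating) triples; values are made up for illustration.
triples = [
    ("u1", "a", 5), ("u1", "b", 3),
    ("u2", "a", 4), ("u2", "c", 2),
    ("u3", "b", 4), ("u3", "c", 1),
]

def als_baselines(triples, n_epochs=5, reg_u=10, reg_i=5):
    """Estimate global mean, user biases and item biases by alternating updates."""
    mu = sum(r for _, _, r in triples) / len(triples)
    bu = defaultdict(float)
    bi = defaultdict(float)
    by_user = defaultdict(list)
    by_item = defaultdict(list)
    for u, i, r in triples:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(n_epochs):
        # Item biases with user biases held fixed (shrunk towards 0 by reg_i).
        for i, rs in by_item.items():
            bi[i] = sum(r - mu - bu[u] for u, r in rs) / (reg_i + len(rs))
        # User biases with item biases held fixed (shrunk towards 0 by reg_u).
        for u, rs in by_user.items():
            bu[u] = sum(r - mu - bi[i] for i, r in rs) / (reg_u + len(rs))
    return mu, dict(bu), dict(bi)

mu, bu, bi = als_baselines(triples)
estimate = mu + bu["u1"] + bi["c"]  # baseline estimate for an unseen pair
print(round(mu, 3), round(estimate, 3))
```

Because the estimate only needs the global mean and two biases, a baseline prediction exists for every test pair, which is consistent with the evaluation below excluding 0 samples.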


In [34]:
evalAls = eval.evaluate(prediction=AlsPreds, topn=5, rating_cutoff=4, excl_impossible=True)
evalAls

Excluded 0 (30000) samples. 30000 remaining ...
Excluded 0 (30000) samples. 30000 remaining ...


|  | value |
|---|---|
| RMSE | 0.942239 |
| MAE | 0.746035 |
| Recall | 0.33583 |
| Precision | 0.838588 |
| F1 | 0.479596 |
| NDCG@5 | 0.892415 |
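
NDCG@5 rewards placing a user's highest-rated items at the top of the list ranked by estimate. The sketch below shows one common formulation, using the true rating as the gain and a log2 rank discount, computed for a single user. How `eval.evaluate` defines the gain and aggregates across users is not shown in this notebook, so treat this as an assumption-laden illustration.

```python
from math import log2

def dcg(gains):
    """Discounted cumulative gain: gain at rank 1 is undiscounted."""
    return sum(g / log2(rank + 2) for rank, g in enumerate(gains))

def ndcg_at_k(pairs, k=5):
    """pairs: list of (true_rating, estimate) for one user."""
    ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
    by_estimate = [r for r, _ in ranked][:k]
    ideal = sorted((r for r, _ in pairs), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(by_estimate) / ideal_dcg if ideal_dcg else 0.0

# Made-up examples: estimates that rank items perfectly vs. in reverse order.
good_order = [(5, 4.8), (3, 3.1), (1, 2.0)]
bad_order = [(5, 2.0), (3, 3.1), (1, 4.8)]
ndcg_good = ndcg_at_k(good_order)
ndcg_bad = ndcg_at_k(bad_order)
print(ndcg_good, round(ndcg_bad, 4))
```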


## SVD Matrix Factorization

In [10]:
#SVD
#initialize and train SVD model
svdMod = SVD(n_factors=200,n_epochs=25,biased=False,random_state=123).fit(df_trainUIR)
#predict
svdModPreds = svdMod.test(df_testUIR)
#eval
accuracySVD = surprise.accuracy.rmse(svdModPreds)

print("SVD RMSE: ",accuracySVD)

RMSE: 0.9768
SVD RMSE:  0.9768170407696117
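
With `biased=False`, the model predicts a rating as the dot product of a user factor vector and an item factor vector, learned by stochastic gradient descent. A minimal sketch under stated assumptions follows: the toy data, the learning rate, the epoch count and the function names are chosen for this illustration and do not reproduce the library's implementation or defaults.

```python
import random

# Toy (user, item, rating) triples; values are made up for illustration.
triples = [
    ("u1", "a", 5), ("u1", "b", 3),
    ("u2", "a", 4), ("u2", "c", 2),
    ("u3", "b", 4), ("u3", "c", 1),
]

def train_mf(triples, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=123):
    """Unbiased matrix factorization trained with SGD."""
    rng = random.Random(seed)
    p, q = {}, {}
    for u, i, _ in triples:
        if u not in p:
            p[u] = [rng.gauss(0, 0.1) for _ in range(n_factors)]
        if i not in q:
            q[i] = [rng.gauss(0, 0.1) for _ in range(n_factors)]
    for _ in range(n_epochs):
        for u, i, r in triples:
            err = r - sum(pf * qf for pf, qf in zip(p[u], q[i]))
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                # Gradient step on squared error with L2 regularization.
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return p, q

def sse(p, q):
    """Sum of squared training errors."""
    return sum((r - sum(a * b for a, b in zip(p[u], q[i]))) ** 2 for u, i, r in triples)

sse_before = sse(*train_mf(triples, n_epochs=0))  # random initial factors only
sse_after = sse(*train_mf(triples))
print(round(sse_before, 3), round(sse_after, 3))
```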


In [35]:
evalSVD = eval.evaluate(prediction=svdModPreds, topn=5, rating_cutoff=4, excl_impossible=True)
evalSVD

Excluded 60 (30000) samples. 29940 remaining ...
Excluded 60 (30000) samples. 29940 remaining ...


|  | value |
|---|---|
| RMSE | 0.975189 |
| MAE | 0.773254 |
| Recall | 0.294541 |
| Precision | 0.855938 |
| F1 | 0.438267 |
| NDCG@5 | 0.887624 |


## Co-Clustering

In [11]:
#cocluster
#initialize and fit
CoClusterMod = CoClustering(n_cltr_u=10,n_cltr_i=10,n_epochs=50,random_state=123).fit(df_trainUIR)
#predict
clustModPreds = CoClusterMod.test(df_testUIR)
#eval
accuracyClust = surprise.accuracy.rmse(clustModPreds)

print("Cocluster RMSE: ",accuracyClust)

RMSE: 0.9925
Cocluster RMSE:  0.9924563507767619
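
Co-clustering assigns each user and each item to a cluster and predicts a rating as the co-cluster average, adjusted by how the user and the item deviate from their own cluster averages. The sketch below illustrates only that prediction step with hand-fixed cluster assignments; the actual algorithm learns the assignments iteratively. All data, assignments and names are made up, and the global-mean fallback for empty co-clusters is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean

# Toy (user, item, rating) triples and hand-picked cluster assignments.
triples = [
    ("u1", "a", 5), ("u1", "b", 3),
    ("u2", "a", 4), ("u2", "c", 2),
    ("u3", "b", 4), ("u3", "c", 1),
]
user_cluster = {"u1": 0, "u2": 0, "u3": 1}
item_cluster = {"a": 0, "b": 1, "c": 1}

def group_means(key):
    """Average rating per group, where key(user, item) names the group."""
    groups = defaultdict(list)
    for u, i, r in triples:
        groups[key(u, i)].append(r)
    return {k: mean(v) for k, v in groups.items()}

global_mean = mean(r for _, _, r in triples)
user_mean = group_means(lambda u, i: u)
item_mean = group_means(lambda u, i: i)
ucl_mean = group_means(lambda u, i: user_cluster[u])
icl_mean = group_means(lambda u, i: item_cluster[i])
cocl_mean = group_means(lambda u, i: (user_cluster[u], item_cluster[i]))

def predict(u, i):
    """Co-cluster average plus user and item deviations from their clusters."""
    cu, ci = user_cluster[u], item_cluster[i]
    base = cocl_mean.get((cu, ci), global_mean)  # fallback is an assumption
    return base + (user_mean[u] - ucl_mean[cu]) + (item_mean[i] - icl_mean[ci])

pred_u1_c = predict("u1", "c")
print(pred_u1_c)
```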


In [36]:
evalCocluster = eval.evaluate(prediction=clustModPreds, topn=5, rating_cutoff=4, excl_impossible=True)
evalCocluster

Excluded 0 (30000) samples. 30000 remaining ...
Excluded 0 (30000) samples. 30000 remaining ...


|  | value |
|---|---|
| RMSE | 0.992456 |
| MAE | 0.776421 |
| Recall | 0.429892 |
| Precision | 0.796151 |
| F1 | 0.558315 |
| NDCG@5 | 0.883225 |


# Exercise 2

- Display the evaluation results for all approaches in one dataframe

In [44]:
evalKNNUb.rename({"value":"UB KNN"}, axis=1, inplace=True)
evalKNNIb.rename({"value":"IB KNN"}, axis=1, inplace=True)
evalAls.rename({"value":"ALS"}, axis=1, inplace=True)
evalSVD.rename({"value":"SVD"}, axis=1, inplace=True)
evalCocluster.rename({"value":"Cocluster"}, axis=1, inplace=True)

evalDF = pd.concat([evalKNNUb,evalKNNIb,evalAls,evalSVD,evalCocluster],axis=1)

In [45]:
evalDF

|  | UB KNN | IB KNN | ALS | SVD | Cocluster |
|---|---|---|---|---|---|
| RMSE | 1.015158 | 1.058353 | 0.942239 | 0.975189 | 0.992456 |
| MAE | 0.803685 | 0.841862 | 0.746035 | 0.773254 | 0.776421 |
| Recall | 0.412776 | 0.256061 | 0.33583 | 0.294541 | 0.429892 |
| Precision | 0.763214 | 0.761691 | 0.838588 | 0.855938 | 0.796151 |
| F1 | 0.535781 | 0.383275 | 0.479596 | 0.438267 | 0.558315 |
| NDCG@5 | 0.88643 | 0.862949 | 0.892415 | 0.887624 | 0.883225 |
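
To read the comparison at a glance, the sketch below picks the best model per metric from the values in the table, treating RMSE and MAE as lower-is-better and the remaining metrics as higher-is-better. The numbers are copied from the table; the dictionary layout is just one convenient way to hold them outside pandas.

```python
# Values copied from the evaluation table above.
results = {
    "UB KNN":    {"RMSE": 1.015158, "MAE": 0.803685, "Recall": 0.412776,
                  "Precision": 0.763214, "F1": 0.535781, "NDCG@5": 0.88643},
    "IB KNN":    {"RMSE": 1.058353, "MAE": 0.841862, "Recall": 0.256061,
                  "Precision": 0.761691, "F1": 0.383275, "NDCG@5": 0.862949},
    "ALS":       {"RMSE": 0.942239, "MAE": 0.746035, "Recall": 0.33583,
                  "Precision": 0.838588, "F1": 0.479596, "NDCG@5": 0.892415},
    "SVD":       {"RMSE": 0.975189, "MAE": 0.773254, "Recall": 0.294541,
                  "Precision": 0.855938, "F1": 0.438267, "NDCG@5": 0.887624},
    "Cocluster": {"RMSE": 0.992456, "MAE": 0.776421, "Recall": 0.429892,
                  "Precision": 0.796151, "F1": 0.558315, "NDCG@5": 0.883225},
}

lower_is_better = {"RMSE", "MAE"}
best = {}
for metric in ["RMSE", "MAE", "Recall", "Precision", "F1", "NDCG@5"]:
    pick = min if metric in lower_is_better else max
    best[metric] = pick(results, key=lambda model: results[model][metric])

for metric, model in best.items():
    print(f"{metric}: {model}")
```

No single approach wins everywhere: the ALS baseline leads on the error metrics and NDCG@5, SVD on precision, and co-clustering on recall and F1.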


# Exercise 3

- Check the documentation for "surprise.model_selection.GridSearchCV"
- Select the SVD model:
    - Grid search over `factors=[5,10,20]`, `epochs=[10,20]` using 3-Fold cross-validation
- Display the parameters resulting in the lowest RMSE

In [50]:
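# candidate values for the number of latent factors and training epochs
param_grid = {'n_factors':[5,10,20],'n_epochs':[10,20]}

# exhaustive search over the grid, scored by RMSE with 3-fold cross-validation
gridSearchSVD = GridSearchCV(SVD, param_grid, measures=['rmse'], cv=3)
gridSearchSVD.fit(data)

# parameter combination with the lowest mean RMSE, and that RMSE
print(gridSearchSVD.best_params['rmse'])
print(gridSearchSVD.best_score['rmse'])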

{'n_factors': 20, 'n_epochs': 20}
0.941101218879104
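
A grid search evaluates every combination of the listed values, and with `cv=3` each combination is fitted once per fold. The short sketch below enumerates the grid with the standard library to show how many candidates and model fits that implies. It only illustrates the enumeration and does not call Surprise.

```python
from itertools import product

param_grid = {'n_factors': [5, 10, 20], 'n_epochs': [10, 20]}
n_folds = 3

# One dict per parameter combination, in the order the values are listed.
combos = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
n_fits = len(combos) * n_folds

for combo in combos:
    print(combo)
print("combinations:", len(combos), "| model fits:", n_fits)
```

Since the cell above sets no random seed, the reported best score can vary slightly between runs.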
