

# Recommendations with surprise

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Load-the-movielens-100k-dataset-from-disk" data-toc-modified-id="Load-the-movielens-100k-dataset-from-disk-1">Load the movielens-100k dataset from disk</a></span><ul class="toc-item"><li><span><a href="#Instantiate-the-algorithm" data-toc-modified-id="Instantiate-the-algorithm-1.1">Instantiate the algorithm</a></span></li><li><span><a href="#Extract-the-model-parameters" data-toc-modified-id="Extract-the-model-parameters-1.2">Extract the model parameters</a></span></li><li><span><a href="#Evaluate-the-model:" data-toc-modified-id="Evaluate-the-model:-1.3">Evaluate the model:</a></span></li><li><span><a href="#Put-the-predictions-in-a-dataframe" data-toc-modified-id="Put-the-predictions-in-a-dataframe-1.4">Put the predictions in a dataframe</a></span></li><li><span><a href="#Correlations-between-predicted-and-true-ratings" data-toc-modified-id="Correlations-between-predicted-and-true-ratings-1.5">Correlations between predicted and true ratings</a></span></li></ul></li><li><span><a href="#Cross-validation,-train-test-split-and-grid-search" data-toc-modified-id="Cross-validation,-train-test-split-and-grid-search-2">Cross validation, train-test split and grid search</a></span></li><li><span><a href="#Slope-One" data-toc-modified-id="Slope-One-3">Slope One</a></span></li><li><span><a href="#KNN-with-Means" data-toc-modified-id="KNN-with-Means-4">KNN with Means</a></span></li><li><span><a href="#Precision@k-and-Recall@k" data-toc-modified-id="Precision@k-and-Recall@k-5">Precision@k and Recall@k</a></span></li><li><span><a href="#Top-n-predictions" data-toc-modified-id="Top-n-predictions-6">Top-n predictions</a></span><ul class="toc-item"><li><span><a href="#Coverage" data-toc-modified-id="Coverage-6.1">Coverage</a></span></li></ul></li></ul></div>

In this lab we will use the [surprise package](https://surprise.readthedocs.io/en/stable/index.html), a Python library for building and evaluating recommender systems.

`conda install -c conda-forge scikit-surprise`

First we need some data. Download the MovieLens 100K dataset from [GroupLens](https://grouplens.org/datasets/movielens/).
It is a widely used benchmark dataset of movie ratings.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('ggplot')
sns.set(font_scale=1.5)
%config InlineBackend.figure_format = 'retina'
%matplotlib inline

In [2]:
# load surprise
import surprise as sur

## Load the movielens-100k dataset from disk

Surprise can load this dataset in a simplified, already prepared form with `sur.Dataset.load_builtin('ml-100k')`. Since reading and preparing other files is not that straightforward, we will instead practise loading the file from disk.

In [3]:
df_data = pd.read_csv(
    '~/.surprise_data/ml-100k/ml-100k/u.data', sep='\t', header=None)
df_data.columns = ['user_id', 'item_id', 'rating', 'timestamp']
df_data.head()

| | user_id | item_id | rating | timestamp |
|---|---|---|---|---|
| 0 | 196 | 242 | 3 | 881250949 |
| 1 | 186 | 302 | 3 | 891717742 |
| 2 | 22 | 377 | 1 | 878887116 |
| 3 | 244 | 51 | 2 | 880606923 |
| 4 | 166 | 346 | 1 | 886397596 |


In [4]:
df_data.rating.describe()

count    100000.000000
mean          3.529860
std           1.125674
min           1.000000
25%           3.000000
50%           4.000000
75%           4.000000
max           5.000000
Name: rating, dtype: float64

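The `Reader` object tells surprise how to interpret the ratings. When loading from a dataframe, only the rating scale has to be specified; which columns hold users, items and ratings is determined by their order in the dataframe passed to `load_from_df`.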

In [5]:
reader = sur.Reader(rating_scale=(1, 5))

In [6]:
# The columns must correspond to user id, item id and ratings (in that order).
data_1 = sur.Dataset.load_from_df(
    df_data[['user_id', 'item_id', 'rating']], reader)

### Instantiate the algorithm

In [7]:
algo = sur.SVD(random_state=1,
               biased=True,  # isolate biases
               reg_all=0.2,  # use regularisation (the same for all)
               n_epochs=20,  # number of epochs for stochastic gradient descent search
               n_factors=100  # number of factors to retain in SVD
               )

# we have to build a training set from the data
trainset_full = data_1.build_full_trainset()
# fit the model
algo.fit(trainset_full)

# build a test set from the training data itself,
# so the score below is a training error, not a test error
trainsetfull_build = trainset_full.build_testset()
# obtain the predictions
predictions_full = algo.test(trainsetfull_build)
# evaluate the predictions
print(sur.accuracy.rmse(predictions_full, verbose=False))

0.9167802882204997


In [8]:
predictions_full[:5]

[Prediction(uid=196, iid=242, r_ui=3.0, est=3.911793041493128, details={'was_impossible': False}),
 Prediction(uid=196, iid=393, r_ui=4.0, est=3.4204953983043573, details={'was_impossible': False}),
 Prediction(uid=196, iid=381, r_ui=4.0, est=3.5515672295024134, details={'was_impossible': False}),
 Prediction(uid=196, iid=251, r_ui=3.0, est=4.108294470660626, details={'was_impossible': False}),
 Prediction(uid=196, iid=655, r_ui=5.0, est=3.7920116316066803, details={'was_impossible': False})]

### Extract the model parameters

In [9]:
mu = algo.default_prediction()
bu = algo.bu
bi = algo.bi
pu = algo.pu
qi = algo.qi
puqi = pu.dot(qi.T)
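
For reference, the extracted attributes correspond to the terms of the biased SVD prediction rule as given in the surprise documentation: $\mu$ is `mu`, $b_u$ and $b_i$ are entries of `bu` and `bi`, and $p_u$, $q_i$ are rows of `pu` and `qi`. The second line is the regularised squared error minimised by stochastic gradient descent, with $\lambda$ set through `reg_all`. This is a summary for orientation, not a derivation.

```latex
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u

\min_{b, p, q} \sum_{r_{ui} \in R_{\text{train}}}
    \left( r_{ui} - \hat{r}_{ui} \right)^2
    + \lambda \left( b_u^2 + b_i^2 + \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)
```

The entry `puqi[u, i]` computed above is the dot product $q_i^{\top} p_u$ for inner user id `u` and inner item id `i`.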

> Note that surprise internally uses its own (inner) ids for users and items, which differ from the ids in the original data.
> The original ones are called raw ids. The trainset methods `to_inner_uid`, `to_inner_iid`, `to_raw_uid` and `to_raw_iid` translate between the two.

In [10]:
# check that we can reconstruct the predictions using the parameters
i = 10
print(predictions_full[i])
print()
uid = predictions_full[i].uid
iid = predictions_full[i].iid
u_inner = trainset_full.to_inner_uid(uid)
i_inner = trainset_full.to_inner_iid(iid)

pred_calc = mu + bu[u_inner] + bi[i_inner] + puqi[u_inner, i_inner]
print('Results agree:', predictions_full[i].est - pred_calc)

user: 196        item: 580        r_ui = 2.00   est = 3.34   {'was_impossible': False}

Results agree: 0.0
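
The raw-to-inner translation used in the cell above can be pictured with a small sketch. It assumes that inner ids are consecutive integers handed out in order of first appearance; the ratings and the dictionary names are toy values chosen for illustration and are not surprise internals.

```python
# Toy (user, item, rating) triples; the ids are made up for illustration.
toy_ratings = [(196, 242, 3.0), (186, 302, 3.0), (22, 377, 1.0), (196, 302, 4.0)]

raw2inner_user = {}
raw2inner_item = {}
for raw_uid, raw_iid, _ in toy_ratings:
    # setdefault only assigns a new inner id the first time a raw id is seen
    raw2inner_user.setdefault(raw_uid, len(raw2inner_user))
    raw2inner_item.setdefault(raw_iid, len(raw2inner_item))

# the reverse lookup, analogous in spirit to to_raw_uid
inner2raw_user = {inner: raw for raw, inner in raw2inner_user.items()}

print(raw2inner_user)
print(raw2inner_item)
```

Inner ids are what index the rows of `bu`, `bi`, `pu` and `qi`, which is why the reconstruction has to translate the raw ids first.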


### Evaluate the model:

In [11]:
sur.accuracy.rmse(predictions_full)
sur.accuracy.mae(predictions_full);

RMSE: 0.9168
MAE:  0.7305


### Put the predictions in a dataframe

In [12]:
df_pred = pd.DataFrame([(x.r_ui, x.est) for x in predictions_full],
                       columns=['Rating', 'Predicted'])

In [13]:
# reconstruct RMSE (column-wise arithmetic instead of a row-wise apply)
np.sqrt(((df_pred['Rating'] - df_pred['Predicted'])**2).mean())

0.9167802882205027

In [14]:
# reconstruct MAE (column-wise arithmetic instead of a row-wise apply)
(df_pred['Rating'] - df_pred['Predicted']).abs().mean()

0.7305196298517427

### Correlations between predicted and true ratings

In [15]:
df_pred.corr(method='pearson')

| | Rating | Predicted |
|---|---|---|
| Rating | 1.0 | 0.591973 |
| Predicted | 0.591973 | 1.0 |


In [16]:
df_pred.corr(method='spearman')

| | Rating | Predicted |
|---|---|---|
| Rating | 1.0 | 0.576486 |
| Predicted | 0.576486 | 1.0 |


In [17]:
df_pred.corr(method='kendall')

| | Rating | Predicted |
|---|---|---|
| Rating | 1.0 | 0.451749 |
| Predicted | 0.451749 | 1.0 |
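
To see why the coefficients differ, it may help to recall that Spearman's coefficient is Pearson's coefficient applied to ranks. The following is a minimal pure-Python sketch of that relationship; the function names and the toy values are illustrative assumptions, not part of the lab data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the block while the next value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# toy values: predictions increase with the true rating, but not linearly
true_ratings = [1, 2, 3, 4, 5]
predicted = [2.9, 3.0, 3.2, 3.9, 4.8]
print(pearson(true_ratings, predicted), spearman(true_ratings, predicted))
```

In this toy case the ordering is reproduced perfectly, so the rank correlation is 1 while the linear correlation is lower. Ratings take only five distinct values, so ties are frequent in the real data, which is one reason the rank-based coefficients above come out lower.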


## Cross validation, train-test split and grid search

The following example is adapted from the surprise FAQ: https://surprise.readthedocs.io/en/stable/FAQ.html?highlight=raw_ratings

In [18]:
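import random

raw_ratings = data_1.raw_ratings
# random.shuffle draws from Python's random module, so that generator must be seeded;
# numpy's seed is kept because surprise falls back on numpy's global generator
random.seed(1)
np.random.seed(1)
# shuffle ratings if you want
random.shuffle(raw_ratings)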

# A = 90% of the data, B = 10% of the data
threshold = int(.9 * len(raw_ratings))
A_raw_ratings = raw_ratings[:threshold]
B_raw_ratings = raw_ratings[threshold:]

print(len(A_raw_ratings))
print(len(B_raw_ratings))

data_1.raw_ratings = A_raw_ratings  # data is now the set A

90000
10000


In [19]:
data_1

<surprise.dataset.DatasetAutoFolds at 0x1a224e9650>

In [20]:
len(data_1.raw_ratings)

90000

In [21]:
algo = sur.SVD(random_state=1)

In [22]:
cv_results = sur.model_selection.cross_validate(
    algo, data_1, measures=['RMSE', 'MAE'], cv=5)
pd.DataFrame(cv_results)

| | test_rmse | test_mae | fit_time | test_time |
|---|---|---|---|---|
| 0 | 0.951627 | 0.747188 | 5.788004 | 0.209199 |
| 1 | 0.940511 | 0.745558 | 5.462567 | 0.126055 |
| 2 | 0.940788 | 0.740873 | 5.216815 | 0.12385 |
| 3 | 0.938461 | 0.741327 | 4.812452 | 0.22917 |
| 4 | 0.937336 | 0.737476 | 4.945317 | 0.123679 |
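
With `cv=5`, the data is partitioned into five folds and each fold serves as the test set exactly once. Below is a rough sketch of that partitioning on indices only. The helper `kfold_indices` is hypothetical, and it leaves out the shuffling that surprise's splitter applies by default.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs; every index is tested exactly once."""
    # spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // n_splits + (1 if f < n_samples % n_splits else 0)
                  for f in range(n_splits)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(kfold_indices(n_samples=10, n_splits=5))
for train_idx, test_idx in folds:
    print('train:', train_idx, 'test:', test_idx)
```

For the 90000 ratings in set A, this scheme gives 72000 training and 18000 test ratings per fold.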


In [23]:
# Select your best algo with grid search.
print('Grid Search...')
param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}
grid_search = sur.model_selection.GridSearchCV(sur.SVD,
                                               param_grid,
                                               measures=['rmse'],
                                               cv=3,
                                               refit=True)
grid_search.fit(data_1)

algo = grid_search.best_estimator['rmse']

# retrain on the whole set A
trainset = data_1.build_full_trainset()
algo.fit(trainset)

# Compute score on training set
trainset_build = trainset.build_testset()
predictions_train = algo.test(trainset_build)
print('Training score ', end='   ')
sur.accuracy.rmse(predictions_train)

# Compute score on rated test set
testset = data_1.construct_testset(B_raw_ratings)  # testset is now the set B
predictions_test = algo.test(testset)
print('Test score (rated items) ', end=' ')
sur.accuracy.rmse(predictions_test)

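# Compute score on unrated data
# The anti-test set holds all (user, item) pairs without a rating in the training set.
# It has no true ratings: r_ui is filled with the global mean, so this RMSE measures
# the distance of the predictions from that mean, not predictive accuracy.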
no_ratings = trainset.build_anti_testset()
predictions_no_ratings = algo.test(no_ratings)
print('Test score (unrated items) ', end='   ')
sur.accuracy.rmse(predictions_no_ratings, verbose=False)

Grid Search...
Training score    RMSE: 0.8371
Test score (rated items)  RMSE: 0.9478
Test score (unrated items)    

0.5171933412180628
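
The grid search above tries every combination of the values listed in `param_grid`. As an illustration of what that enumeration looks like (not how surprise implements it internally), the candidates can be listed with `itertools.product`:

```python
from itertools import product

param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}

# every combination of the listed values is one candidate model
names = list(param_grid)
candidates = [dict(zip(names, values))
              for values in product(*(param_grid[name] for name in names))]

for params in candidates:
    print(params)
```

Four candidates evaluated with `cv=3` amount to twelve model fits, so the cost grows quickly as more parameters or values are added to the grid.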

In [24]:
print(len(trainset_build), len(testset), len(no_ratings))

90000 10000 1483867
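
The size of the anti-test set follows from simple counting: all user-item pairs among the users and items known to the training set, minus the pairs that already have a rating. A small sketch on toy data follows; `build_anti_testset_sketch` is a hypothetical helper that mimics the idea, not the surprise implementation.

```python
def build_anti_testset_sketch(ratings):
    """ratings: list of (user, item, rating) triples.

    Returns all (user, item, fill) triples for pairs without a rating,
    where fill is the mean of the known ratings.
    """
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    rated = {(u, i) for u, i, _ in ratings}
    global_mean = sum(r for _, _, r in ratings) / len(ratings)
    return [(u, i, global_mean)
            for u in users for i in items if (u, i) not in rated]


# toy data: 3 users, 3 items, 4 known ratings
toy = [('u1', 'a', 4.0), ('u1', 'b', 2.0), ('u2', 'a', 3.0), ('u3', 'c', 5.0)]
anti = build_anti_testset_sketch(toy)
print(len(anti))  # 3 * 3 - 4 unrated pairs
```

The count printed above fits this pattern: 943 × 1669 − 90000 = 1483867. That would correspond to 943 users and 1669 distinct items in set A, which is an inference from the numbers rather than something the notebook prints.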


In [25]:
print(predictions_train[0])
print(predictions_test[0])
print(predictions_no_ratings[0])

user: 538        item: 183        r_ui = 4.00   est = 3.91   {'was_impossible': False}
user: 49         item: 294        r_ui = 1.00   est = 1.98   {'was_impossible': False}
user: 538        item: 302        r_ui = 3.53   est = 4.05   {'was_impossible': False}


In [26]:
# extract model parameters
mu = algo.default_prediction()
print(f'Training set mean: {mu:.6}')
bu = algo.bu
bi = algo.bi
pu = algo.pu
qi = algo.qi
puqi = pu.dot(qi.T)

Training set mean: 3.52949


In [27]:
bu[0]

-0.28292524012727094

In [28]:
# reconstruct predictions
i = 10
print(predictions_train[i])
print()
uid = predictions_train[i].uid
iid = predictions_train[i].iid
u_inner = trainset.to_inner_uid(uid)
i_inner = trainset.to_inner_iid(iid)

pred_calc = mu + bu[u_inner] + bi[i_inner] + puqi[u_inner, i_inner]
print('Results agree:', predictions_train[i].est - pred_calc)

user: 538        item: 196        r_ui = 4.00   est = 3.76   {'was_impossible': False}

Results agree: 0.0


## Slope One

Repeat the same steps (fit, predict, evaluate, cross-validate) with the Slope One model.

In [29]:
algo = sur.SlopeOne()
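
As background for the exercise, here is a minimal sketch of the Slope One prediction rule as described in the surprise documentation: the user's mean rating plus the average deviation between the target item and the items the user has rated. The data layout, the function name and the toy ratings are assumptions made for illustration, and the target item is assumed not to be rated by the user yet.

```python
def slope_one_predict(ratings, user, item):
    """ratings: dict user -> {item: rating}. Predict `user`'s rating of `item`."""
    user_ratings = ratings[user]
    user_mean = sum(user_ratings.values()) / len(user_ratings)

    deviations = []
    for other_item in user_ratings:
        # rating differences over users who rated both items
        diffs = [r[item] - r[other_item] for r in ratings.values()
                 if item in r and other_item in r]
        if diffs:  # only items sharing at least one user with `item` count
            deviations.append(sum(diffs) / len(diffs))

    if not deviations:
        return user_mean
    return user_mean + sum(deviations) / len(deviations)


toy_ratings = {
    'ann': {'a': 5, 'b': 3},
    'bob': {'a': 4, 'b': 2, 'c': 4},
    'cat': {'b': 4, 'c': 5},
}
print(slope_one_predict(toy_ratings, 'ann', 'c'))
```

Slope One has no hyperparameters to tune, so the grid search step has nothing to search over for this model.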

## KNN with Means

Repeat the same steps (fit, predict, evaluate, cross-validate) with the KNN with means model.

In [30]:
algo = sur.KNNWithMeans()
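
As background for the exercise, the following sketch shows the user-based KNN with means prediction rule: the user's mean rating plus a similarity-weighted average of how far the nearest neighbours' ratings of the item lie from their own means. The similarities are assumed to be given (surprise computes them from the ratings), and the function name and toy values are illustrative.

```python
def knn_with_means_predict(ratings, sims, user, item, k=2):
    """ratings: dict user -> {item: rating}; sims: dict (u, v) -> similarity."""
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}

    # candidate neighbours are other users who rated `item`, most similar first
    neighbours = sorted(
        ((sims[(user, v)], v) for v in ratings if v != user and item in ratings[v]),
        reverse=True)[:k]

    # only neighbours with positive similarity contribute
    num = sum(s * (ratings[v][item] - means[v]) for s, v in neighbours if s > 0)
    den = sum(s for s, v in neighbours if s > 0)
    if den == 0:
        return means[user]  # fall back to the user's mean
    return means[user] + num / den


toy_ratings = {
    'ann': {'a': 4, 'b': 2},
    'bob': {'a': 5, 'b': 3, 'c': 4},
    'cat': {'a': 2, 'b': 3, 'c': 1},
}
toy_sims = {('ann', 'bob'): 0.8, ('ann', 'cat'): 0.2}  # made-up similarities
print(knn_with_means_predict(toy_ratings, toy_sims, 'ann', 'c'))
```

Centering on the means corrects for users who rate systematically high or low. Unlike Slope One, this model has parameters worth including in a grid search, such as the number of neighbours `k`.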

## Precision@k and Recall@k

Obtain precision@k and recall@k following the [example](https://surprise.readthedocs.io/en/stable/FAQ.html#how-to-compute-precision-k-and-recall-k).
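
A minimal sketch in the spirit of the linked example, written against plain tuples so that it runs without surprise. An item counts as relevant if its true rating reaches a threshold, and as recommended if it is among the user's top k estimates and its estimate reaches the threshold. The threshold of 3.5, the tuple layout and the convention of returning 0 for undefined ratios are assumptions of this sketch.

```python
from collections import defaultdict


def precision_recall_at_k(predictions, k=10, threshold=3.5):
    """predictions: iterable of (uid, iid, true_rating, estimated_rating)."""
    user_est_true = defaultdict(list)
    for uid, _, true_r, est in predictions:
        user_est_true[uid].append((est, true_r))

    precisions, recalls = {}, {}
    for uid, user_ratings in user_est_true.items():
        # rank this user's items by estimated rating, best first
        user_ratings.sort(key=lambda pair: pair[0], reverse=True)
        top_k = user_ratings[:k]

        n_rel = sum(true_r >= threshold for _, true_r in user_ratings)
        n_rec_k = sum(est >= threshold for est, _ in top_k)
        n_rel_and_rec_k = sum(true_r >= threshold and est >= threshold
                              for est, true_r in top_k)

        # precision@k: share of recommended items that are relevant
        precisions[uid] = n_rel_and_rec_k / n_rec_k if n_rec_k else 0
        # recall@k: share of relevant items that are recommended
        recalls[uid] = n_rel_and_rec_k / n_rel if n_rel else 0
    return precisions, recalls


toy_predictions = [
    ('u1', 'a', 5, 4.5),
    ('u1', 'b', 2, 4.0),
    ('u1', 'c', 4, 3.0),
    ('u1', 'd', 1, 2.0),
]
precisions, recalls = precision_recall_at_k(toy_predictions, k=2)
print(precisions, recalls)
```

With surprise predictions, the input could be built as `[(p.uid, p.iid, p.r_ui, p.est) for p in predictions_test]`.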

## Top-n predictions

Obtain the n top-ranked predictions for each user following the [example](https://surprise.readthedocs.io/en/stable/FAQ.html#how-to-get-the-top-n-recommendations-for-each-user).
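
One possible shape for this, sketched on plain `(uid, iid, est)` tuples rather than surprise objects: group the estimates by user and keep the n largest. The function name mirrors the linked example, while the tuple layout and the toy values are assumptions of this sketch.

```python
from collections import defaultdict
import heapq


def get_top_n(predictions, n=10):
    """predictions: iterable of (uid, iid, est). Returns uid -> [(iid, est), ...]."""
    per_user = defaultdict(list)
    for uid, iid, est in predictions:
        per_user[uid].append((iid, est))
    # keep the n highest estimates per user, best first
    return {uid: heapq.nlargest(n, items, key=lambda pair: pair[1])
            for uid, items in per_user.items()}


toy_predictions = [
    ('u1', 'a', 4.5), ('u1', 'b', 3.0), ('u1', 'c', 4.0),
    ('u2', 'a', 2.0), ('u2', 'd', 4.8),
]
top_n = get_top_n(toy_predictions, n=2)
print(top_n)
```

For recommendations, the predictions would typically come from the anti-test set, since those are the items a user has not rated yet.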

### Coverage

The coverage is the fraction of all available items which appear at least once in any recommended list of items.
Calculate the coverage of the top-ranked recommendations.
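
Under the definition above, coverage reduces to a set computation. The sketch below assumes the recommendations are stored as a dictionary mapping each user to a list of `(item, estimate)` pairs; the names and toy values are illustrative.

```python
def coverage(top_n, all_items):
    """Fraction of all_items that appear in at least one user's list."""
    catalogue = set(all_items)
    recommended = {iid for recs in top_n.values() for iid, _ in recs}
    return len(recommended & catalogue) / len(catalogue)


toy_top_n = {
    'u1': [('a', 4.5), ('b', 4.0)],
    'u2': [('a', 4.8), ('c', 3.9)],
}
toy_items = ['a', 'b', 'c', 'd', 'e']
print(coverage(toy_top_n, toy_items))
```

A low coverage indicates that the recommender concentrates on a small part of the catalogue, for instance a handful of popular items.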