In [0]:
# Install the Surprise library (published on PyPI as scikit-surprise).
!pip install scikit-surprise

# Recommendation System

In this lab, we will use [Surprise](http://surpriselib.com/), an easy-to-use Python scikit for building recommender systems. It includes several commonly used algorithms, including neighborhood-based [collaborative filtering](https://surprise.readthedocs.io/en/stable/knn_inspired.html) and [matrix factorization-based algorithms](https://surprise.readthedocs.io/en/stable/matrix_factorization.html).

In [0]:
# Prediction algorithms
from surprise.prediction_algorithms.matrix_factorization import SVD
from surprise.prediction_algorithms.knns import KNNBasic, KNNWithMeans, KNNBaseline

# Data loading and evaluation utilities
from surprise import Dataset, accuracy
from surprise.model_selection import cross_validate, train_test_split, GridSearchCV

## Load data from the Surprise package

First, we load the MovieLens 100K dataset, one of the built-in datasets that Surprise can download for us. The data will be saved in the `.surprise_data` folder in your home directory.

In [0]:
# Load the MovieLens 100K dataset (downloaded on first use).
data = Dataset.load_builtin('ml-100k')

# Randomly split the ratings: 80% for training, 20% held out for testing.
trainset, testset = train_test_split(data, test_size=0.20)
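
To make the meaning of `test_size=0.20` concrete, here is a minimal pure-Python sketch of a random hold-out split over `(user, item, rating)` triples. It is an illustration only, not Surprise's implementation; `random_split` and `toy_ratings` are hypothetical names, and the toy ratings are made up rather than taken from MovieLens.

```python
import random

def random_split(ratings, test_size=0.20, seed=0):
    """Shuffle (user, item, rating) triples and hold out a fraction for testing."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_size * len(shuffled)))
    # The first n_test shuffled triples form the test set, the rest the train set.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy ratings: 5 users x 4 items = 20 triples.
toy_ratings = [
    (f"u{u}", f"i{i}", float((u + i) % 5 + 1))
    for u in range(5)
    for i in range(4)
]

toy_train, toy_test = random_split(toy_ratings, test_size=0.20)
print(len(toy_train), len(toy_test))
```

Because the split is random, the metrics reported later in this lab vary slightly from run to run. Surprise's `train_test_split` accepts a `random_state` argument if a reproducible split is needed.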

## Collaborative Filtering

First, we will apply different flavors of neighborhood-based (k-NN) collaborative filtering to this data.

### The basic collaborative filtering algorithm
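
The idea behind the basic algorithm is that a user's unknown rating for an item is estimated as the similarity-weighted average of the ratings that the most similar users gave to that item. The sketch below illustrates this rule in pure Python under simplifying assumptions: `knn_basic_predict`, the toy `ratings`, and the hand-picked `sims` values are all hypothetical, and the similarities are given directly instead of being computed from the data as Surprise does.

```python
def knn_basic_predict(user, item, ratings, sims, k=40):
    """Predict a rating as the similarity-weighted mean of neighbors' ratings.

    ratings: dict mapping user -> {item: rating}
    sims:    dict mapping (user, other_user) -> similarity (assumed given)
    """
    # Other users who rated `item`, paired with their similarity to `user`.
    candidates = [
        (sims.get((user, other), 0.0), user_ratings[item])
        for other, user_ratings in ratings.items()
        if other != user and item in user_ratings
    ]
    # Keep the k most similar neighbors, ignoring non-positive similarities.
    neighbors = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    neighbors = [(s, r) for s, r in neighbors if s > 0]
    if not neighbors:
        return None  # no usable neighbor in this sketch
    return sum(s * r for s, r in neighbors) / sum(s for s, _ in neighbors)

# Hypothetical toy data: alice has not rated "m2" yet.
ratings = {
    "alice": {"m1": 5.0},
    "bob": {"m1": 4.0, "m2": 2.0},
    "carol": {"m1": 1.0, "m2": 5.0},
}
sims = {("alice", "bob"): 0.9, ("alice", "carol"): 0.1}

basic_estimate = knn_basic_predict("alice", "m2", ratings, sims)
print(basic_estimate)
```

Since bob is far more similar to alice than carol is, the estimate lands close to bob's rating of 2 rather than carol's rating of 5.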

In [0]:
# Use the basic k-NN collaborative filtering algorithm.
# See https://surprise.readthedocs.io/en/stable/knn_inspired.html for more details.
cf = KNNBasic()

# Train the algorithm on the trainset.
cf.fit(trainset)

# Predict ratings for the testset.
predictions = cf.test(testset)

# Then compute RMSE and MAE on the test predictions.
accuracy.rmse(predictions)
accuracy.mae(predictions)
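
The two metrics reported above are the root mean squared error (RMSE) and the mean absolute error (MAE) between true and estimated ratings. As a sanity check on what they measure, here is a small pure-Python sketch; `rmse`, `mae`, and `toy_pairs` are hypothetical names, and the rating pairs are made up for illustration.

```python
import math

def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    return math.sqrt(sum((true - est) ** 2 for true, est in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (true_rating, estimated_rating) pairs."""
    return sum(abs(true - est) for true, est in pairs) / len(pairs)

# Hypothetical (true, estimated) rating pairs.
toy_pairs = [(4.0, 3.0), (2.0, 2.0), (5.0, 3.0), (1.0, 2.0)]

toy_rmse = rmse(toy_pairs)
toy_mae = mae(toy_pairs)
print(toy_rmse, toy_mae)
```

RMSE penalizes large errors more heavily than MAE, so it is never smaller than MAE on the same predictions. With Surprise, the same pairs could be built from the test results as `[(p.r_ui, p.est) for p in predictions]`.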

### The basic collaborative filtering algorithm with user mean ratings

In [0]:
# Use the basic k-NN collaborative filtering algorithm, taking into account the mean rating of each user.
# See https://surprise.readthedocs.io/en/stable/knn_inspired.html for more details.
cf_mean = KNNWithMeans()

# Train the algorithm on the trainset.
cf_mean.fit(trainset)

# Predict ratings for the testset.
predictions = cf_mean.test(testset)

# Then compute RMSE and MAE on the test predictions.
accuracy.rmse(predictions)
accuracy.mae(predictions)
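
Taking user means into account corrects for the fact that some users rate generously and others harshly: neighbors contribute their deviation from their own mean rating, and the weighted deviation is added to the target user's mean. The following is a minimal sketch of that idea, not Surprise's code; `knn_with_means_predict`, `toy_user_ratings`, and `toy_sims` are hypothetical, and the similarities are assumed to be given.

```python
def knn_with_means_predict(user, item, ratings, sims, k=40):
    """Predict a rating from neighbors' mean-centered ratings.

    ratings: dict mapping user -> {item: rating}
    sims:    dict mapping (user, other_user) -> similarity (assumed given)
    """
    # Mean rating of every user over the items they rated.
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    # Neighbors' deviations from their own mean for `item`.
    candidates = [
        (sims.get((user, other), 0.0), user_ratings[item] - means[other])
        for other, user_ratings in ratings.items()
        if other != user and item in user_ratings
    ]
    neighbors = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    neighbors = [(s, d) for s, d in neighbors if s > 0]
    if not neighbors:
        return means[user]  # fall back to the user's own mean
    offset = sum(s * d for s, d in neighbors) / sum(s for s, _ in neighbors)
    return means[user] + offset

# Hypothetical toy data: alice (mean 4.0) has not rated "m2" yet.
toy_user_ratings = {
    "alice": {"m1": 5.0, "m3": 3.0},
    "bob": {"m1": 4.0, "m2": 2.0},
    "carol": {"m1": 1.0, "m2": 5.0},
}
toy_sims = {("alice", "bob"): 0.9, ("alice", "carol"): 0.1}

means_estimate = knn_with_means_predict("alice", "m2", toy_user_ratings, toy_sims)
print(means_estimate)
```

Here bob rated `m2` one point below his own average, so the estimate for alice ends up somewhat below her average of 4.0.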

## Matrix Factorization

Next, we will apply matrix factorization to this data.
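
In the matrix factorization model described in the Surprise documentation linked above, a rating is predicted as a global mean plus a user bias, an item bias, and the dot product of a user factor vector and an item factor vector, with the parameters fitted by stochastic gradient descent. The sketch below shows that scheme in pure Python on made-up data. `train_biased_mf`, `toy_triples`, and the hyperparameter values are illustrative assumptions, not Surprise's implementation or defaults.

```python
import math
import random

def train_biased_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit r_hat(u, i) = mu + b_u + b_i + p_u . q_i by SGD; return a predict function."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = {u: 0.0 for u in users}                       # user biases
    bi = {i: 0.0 for i in items}                       # item biases
    # Latent factors start as small random values.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(pf * qf for pf, qf in zip(p[u], q[i]))

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            # Regularized gradient steps on the squared error.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return predict

# Hypothetical toy ratings as (user, item, rating) triples.
toy_triples = [
    ("a", "x", 5.0), ("a", "y", 3.0),
    ("b", "x", 4.0), ("b", "z", 1.0),
    ("c", "y", 2.0), ("c", "z", 1.0),
]

mf_predict = train_biased_mf(toy_triples)
toy_mean = sum(r for _, _, r in toy_triples) / len(toy_triples)
baseline_rmse = math.sqrt(
    sum((r - toy_mean) ** 2 for _, _, r in toy_triples) / len(toy_triples)
)
trained_rmse = math.sqrt(
    sum((r - mf_predict(u, i)) ** 2 for u, i, r in toy_triples) / len(toy_triples)
)
print(baseline_rmse, trained_rmse)
```

The `n_factors` argument sets the length of each user and item vector, which is the setting varied in the two cells that follow. Note that the errors printed by this sketch are training errors on six ratings, so they say nothing about how the model generalizes.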

In [0]:
# Use the SVD matrix factorization algorithm with 20 latent factors.
svd = SVD(n_factors=20)

# Train the algorithm on the trainset.
svd.fit(trainset)

# Predict ratings for the testset.
predictions = svd.test(testset)

# Then compute RMSE and MAE on the test predictions.
accuracy.rmse(predictions)
accuracy.mae(predictions)

In [0]:
# Repeat the experiment with 50 latent factors to compare against the 20-factor model.
svd = SVD(n_factors=50)

# Train the algorithm on the trainset.
svd.fit(trainset)

# Predict ratings for the testset.
predictions = svd.test(testset)

# Then compute RMSE and MAE on the test predictions.
accuracy.rmse(predictions)
accuracy.mae(predictions)

# End of Lab: Recommendation System