# Building a recommender system

Using the Surprise library (https://github.com/NicolasHug/Surprise)
and its built-in MovieLens 100K dataset (https://grouplens.org/datasets/movielens/)

The dataset contains 100,000 ratings (1-5) from 943 users on 1,682 movies.

The features are explicit (ratings, etc.), not implicit (wishlists, etc.).
The library supports collaborative filtering (CF) only, not content-based methods.
Therefore there is no need to know the characteristics of items or users; only the crowd's behaviour is needed.

<pre>
Recommender system types:
1-Content-based filtering
2-Collaborative filtering
    a) Model-based
        Neural Nets, Matrix Factorization, SVD
    b) Memory-based
        (!) User-based
        (!!) Item-based
3-Hybrid filtering
</pre>
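
To make the memory-based, user-based branch of the list above concrete, here is a minimal sketch in plain Python. The toy ratings, the user and item names, and the helper functions `cosine_sim` and `predict_user_based` are all hypothetical and not part of Surprise. It only illustrates the general idea: predict a rating as a similarity-weighted average of what other users gave the same item.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}

def cosine_sim(a, b):
    """Cosine similarity computed over the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_user_based(user, item):
    """Similarity-weighted average of the other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_sim(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None

# alice has not rated m4; bob (more similar to alice) gave it 4, carol gave it 2
estimate = predict_user_based("alice", "m4")
print(estimate)
```

The estimate lands between the two neighbours' ratings and closer to bob's, because bob's past ratings resemble alice's more than carol's do. An item-based variant would compare item columns instead of user rows.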

In [3]:
from surprise import Dataset
from surprise.model_selection import cross_validate
from surprise import accuracy
from surprise.model_selection import train_test_split
from surprise.model_selection import GridSearchCV
from surprise import SVD


# Load the movielens-100k dataset (download it if needed).
data = Dataset.load_builtin('ml-100k')

In [4]:
# Find the best hyperparameter values for SVD with a grid search (3-fold cross-validation)
param_grid = {'n_factors': [50, 100],
              'lr_all': [0.003, 0.004, 0.005],
              'reg_all': [0.04, 0.4, 0.6]}
gs = GridSearchCV(SVD, param_grid, measures=['rmse', 'mae'], cv=3)

gs.fit(data)

# Best (lowest) mean RMSE across the folds, and the parameters that achieved it
print(gs.best_score['rmse'])
print(gs.best_params['rmse'])

0.9392085892450517
{'n_factors': 100, 'lr_all': 0.005, 'reg_all': 0.04}
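
To get a feel for how much work the grid search above does, the sketch below enumerates the parameter combinations with `itertools.product`. It does not call Surprise; it only illustrates that an exhaustive grid search tries every combination, each evaluated once per cross-validation fold. The variable names `combos` and `cv` are chosen here for illustration.

```python
from itertools import product

# Same grid and fold count as in the grid-search cell
param_grid = {'n_factors': [50, 100],
              'lr_all': [0.003, 0.004, 0.005],
              'reg_all': [0.04, 0.4, 0.6]}
cv = 3

# Cartesian product of all value lists, one dict per combination
names = list(param_grid)
combos = [dict(zip(names, values)) for values in product(*param_grid.values())]

print(len(combos))       # 18 parameter combinations (2 * 3 * 3)
print(len(combos) * cv)  # 54 model fits across all folds
print(combos[0])         # {'n_factors': 50, 'lr_all': 0.003, 'reg_all': 0.04}
```

The best parameters reported above sit on the edge of the grid (largest `n_factors` and `lr_all`, smallest `reg_all`), so widening the grid in those directions may be worth trying, at the cost of more fits.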


# Split the data into training and test sets and build the model

In [5]:
trainset, testset = train_test_split(data, test_size=.25)

# Use the SVD algorithm with the best parameters found above. It learns latent factors
# for users and items, approximating the rating matrix as the product of two smaller matrices.
algo = SVD(n_factors=100, lr_all=0.005, reg_all=0.04)

algo.fit(trainset)
predictions = algo.test(testset)

accuracy.rmse(predictions)

RMSE: 0.9369


0.9369290780020654
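
The RMSE reported above is the square root of the mean squared difference between true and estimated ratings. A minimal hand-rolled version, using made-up (true, estimated) pairs rather than the predictions from this run, could look like this; the function name `rmse` and the data are illustrative only:

```python
from math import sqrt

# Hypothetical (true rating, estimated rating) pairs, not taken from the run above
pairs = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5), (3.0, 3.0)]

def rmse(pairs):
    """Root mean squared error over (true, estimate) pairs."""
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return sqrt(sum(squared_errors) / len(squared_errors))

score = rmse(pairs)
print(round(score, 4))  # 0.6124
```

Read this way, an RMSE of about 0.94 on a 1-5 scale means the estimates are typically a little under one star away from the true rating, with larger misses weighted more heavily than small ones.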

In [6]:
# Use the trained model to predict a single rating
uid = str(160)  # raw user id (raw ids are strings in this dataset)
iid = str(207)  # raw item id

# Get the predicted rating of user 160 for item 207.
pred = algo.predict(uid, iid, verbose=True)

user: 160        item: 207        r_ui = None   est = 4.30   {'was_impossible': False}
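
In the output above, `r_ui = None` simply means no true rating was passed to `predict`, and `est` is the model's estimate. As far as the Surprise documentation describes it, the SVD estimate is the global mean plus a user bias, an item bias, and the dot product of the user's and item's latent factor vectors, clipped to the rating scale. The sketch below reproduces that rule with made-up numbers; the function `svd_estimate` and all values are hypothetical and are not read from the fitted `algo`.

```python
# All numbers are made up for illustration; they are not the fitted model's values.
global_mean = 3.5                 # mu: mean of all training ratings
user_bias = 0.3                   # b_u: how much this user rates above average
item_bias = 0.4                   # b_i: how much this item is rated above average
user_factors = [0.5, -0.2, 0.1]   # p_u: latent factors of the user
item_factors = [0.4, 0.1, 0.3]    # q_i: latent factors of the item

def svd_estimate(mu, b_u, b_i, p_u, q_i, low=1.0, high=5.0):
    """Compute mu + b_u + b_i + q_i . p_u, clipped to the rating scale."""
    dot = sum(p * q for p, q in zip(p_u, q_i))
    raw = mu + b_u + b_i + dot
    return min(high, max(low, raw))

est = svd_estimate(global_mean, user_bias, item_bias, user_factors, item_factors)
print(round(est, 2))  # 4.41
```

With `n_factors=100`, the real model uses vectors of length 100 instead of 3, learned by stochastic gradient descent with the learning rate and regularization chosen in the grid search.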
