<a href="https://colab.research.google.com/github/wolframalexa/FrequentistML/blob/master/recommendation_nmf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [23]:
# The NMF project is to select a dataset and use out-of-the-box scikit-learn to build a recommendation system.
# Note: make sure that the dataset is appropriate for an NMF-based recommender system.
# Use your system to output some recommendations for a user or two.
# (This notebook uses the scikit-surprise package, imported as `surprise`, for the NMF model.)

In [24]:
# import packages
import random

import numpy as np
from surprise import NMF, Dataset
from surprise.model_selection import GridSearchCV

In [25]:
# use the built-in MovieLens 100k dataset
data = Dataset.load_builtin('ml-100k')

# set seeds for reproducibility; Surprise draws from NumPy's global RNG, so seed it too
random.seed(200)
np.random.seed(200)

In [26]:
# grid-search the NMF user and item regularization terms with 3-fold cross-validation
reg_params = [0.04, 0.06, 0.08]
param_grid = {'reg_pu': reg_params, 'reg_qi': reg_params}
gs = GridSearchCV(NMF, param_grid, measures=['mse'], cv=3)
gs.fit(data)
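
To make the size of this search concrete, the sketch below enumerates the same grid using only the standard library. It does not call Surprise; the names `combos` and `n_fits` are illustrative, and it assumes that every parameter combination is trained and scored once per cross-validation fold.

```python
from itertools import product

reg_params = [0.04, 0.06, 0.08]
param_grid = {'reg_pu': reg_params, 'reg_qi': reg_params}
cv = 3

# Cartesian product of all parameter values, one dict per combination
names = list(param_grid)
combos = [dict(zip(names, values)) for values in product(*param_grid.values())]

# assumption: each combination is fit once per fold
n_fits = len(combos) * cv

print(len(combos), "combinations,", n_fits, "model fits")
```

The search cost grows multiplicatively with each added parameter, which is one reason to keep the grid small for a dataset of this size.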

In [27]:
print("The best score was", gs.best_score['mse'])
print("The best parameters were:", gs.best_params['mse'])

The best score was 0.9312201763729083
The best parameters were: {'reg_pu': 0.08, 'reg_qi': 0.08}


In [28]:
# fit model on best parameter combination
best_model = gs.best_estimator['mse']
best_model.fit(data.build_full_trainset())


<surprise.prediction_algorithms.matrix_factorization.NMF at 0x7fe3b69bf550>

In [29]:
# predict ratings for a few (user, item) pairs; r_ui is the user's true rating, passed as a number
best_model.predict('3', '335', r_ui=1)

Prediction(uid='3', iid='335', r_ui=1, est=1.707660946260052, details={'was_impossible': False})

In [30]:
best_model.predict('200', '673', r_ui=5)

Prediction(uid='200', iid='673', r_ui=5, est=4.128051248984277, details={'was_impossible': False})

In [31]:
best_model.predict('200', '222', r_ui=5)

Prediction(uid='200', iid='222', r_ui=5, est=4.421435031486878, details={'was_impossible': False})
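
The cells above score individual (user, item) pairs. Turning such scores into a recommendation list means ranking candidate items by predicted rating and keeping the top few. The sketch below shows only that ranking step; `top_n`, `toy_predict`, and the item IDs `'a'`, `'b'`, `'c'` with their scores are hypothetical stand-ins, not part of Surprise or the dataset.

```python
def top_n(predict_rating, user_id, candidate_items, n=5):
    """Return the n candidate items with the highest predicted rating."""
    scored = [(item_id, predict_rating(user_id, item_id)) for item_id in candidate_items]
    # highest predicted rating first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# stand-in predictor with made-up scores, used only to demonstrate the ranking
toy_scores = {('200', 'a'): 3.2, ('200', 'b'): 4.6, ('200', 'c'): 4.1}

def toy_predict(user_id, item_id):
    return toy_scores[(user_id, item_id)]

print(top_n(toy_predict, '200', ['a', 'b', 'c'], n=2))
```

With the fitted model, `predict_rating` could plausibly be `lambda uid, iid: best_model.predict(uid, iid).est`, with the candidates restricted to items the user has not yet rated.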

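This recommendation system predicts a user's ratings fairly closely: the best cross-validated MSE is 0.93, which corresponds to an RMSE of about 0.97 on the 1-5 rating scale, and the three sample predictions fall within about one star of the true ratings (1.71 vs. 1, 4.13 vs. 5, and 4.42 vs. 5). Because the final model was fit on the full dataset, these sample ratings were likely seen during training, so the cross-validated MSE is the more reliable figure. Ranking a user's unrated items by predicted rating therefore has a reasonable chance of delivering recommendations that the user will enjoy.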
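
As a quick check on these figures, the arithmetic can be reproduced with the standard library. This is a small sketch that copies the MSE and the three predictions from the cell outputs above; the variable names are illustrative.

```python
import math

mse = 0.9312201763729083  # best cross-validated MSE reported above
rmse = math.sqrt(mse)     # same units as the ratings (stars)

# (true rating, predicted rating) for the three sample predictions above
samples = [
    (1, 1.707660946260052),
    (5, 4.128051248984277),
    (5, 4.421435031486878),
]
abs_errors = [abs(true - est) for true, est in samples]

print(f"RMSE: {rmse:.3f}")
print("absolute errors:", [round(e, 2) for e in abs_errors])
```

Reporting the RMSE alongside the MSE makes the error easier to interpret, since it is expressed in the same units as the ratings themselves.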