In [1]:
import pandas as pd

from src.metrics import map_score, mrr_score, ndcg_score, rmse_score
from src.models.alternating_least_squares import ALSRecommender
from src.utils import train_test_split, to_user_movie_matrix, make_binary_matrix, RatingMatrix

Let's load the datasets with user information, movie information, and users' ratings of movies.

Then we split the ratings into training and test subsets by timestamp.

In [2]:
users = pd.read_table("../data/users.dat", sep="::", names=['UserID', 'Gender', 'Age', 'Occupation', 'Zip-code'], engine='python')

movies = pd.read_table("../data/movies.dat", sep="::", names=['MovieID', 'Title', 'Genres'], engine='python', encoding='latin1')

ratings = pd.read_table("../data/ratings.dat", sep="::", names=['UserID', 'MovieID', 'Rating', 'Timestamp'], engine='python')
ratings['Timestamp'] = pd.to_datetime(ratings['Timestamp'], unit='s')

ratings = ratings[ratings['MovieID'].isin(movies['MovieID'])]

train_ratings, test_ratings = train_test_split(ratings, 'Timestamp')
user_movie_train = to_user_movie_matrix(train_ratings)
user_movie_test = to_user_movie_matrix(test_ratings)
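
For readers without access to `src.utils`, a timestamp-based split can be sketched as below. This is only an illustration under assumptions: `split_by_time` and its `test_fraction` parameter are hypothetical names, and the project's own `train_test_split` may choose the cut-off differently.

```python
import pandas as pd


def split_by_time(df, time_col, test_fraction=0.2):
    """Hypothetical helper: the most recent `test_fraction` of rows become the test set."""
    ordered = df.sort_values(time_col, kind="stable")
    n_test = int(round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered.iloc[:cut], ordered.iloc[cut:]


# Toy ratings table with the same column names as the ratings file above.
toy = pd.DataFrame({
    "UserID": [1, 1, 2, 2, 3],
    "MovieID": [10, 20, 10, 30, 20],
    "Rating": [5, 3, 4, 2, 1],
    "Timestamp": pd.to_datetime([100, 400, 200, 500, 300], unit="s"),
})

train, test = split_by_time(toy, "Timestamp")
print(len(train), len(test))
```

Splitting by time rather than at random means every test rating is at least as recent as every training rating, which is closer to how a recommender is used in practice.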

Now we train our recommender model. As its name suggests, `ALSRecommender` uses alternating least squares (ALS), a collaborative-filtering approach rather than a content-based one: the user-movie rating matrix is factorized into latent user factors and latent movie factors, and a rating is predicted from the corresponding pair of factor vectors instead of from movie features.

In [3]:
model = ALSRecommender()
model.train(user_movie_train, 10)

y_pred = model.predict(make_binary_matrix(user_movie_test.get_rating_matrix()))
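
To make the idea behind ALS concrete, here is a minimal NumPy sketch of regularized matrix factorization fitted by alternating least squares. It is not the code in `src.models.alternating_least_squares`; the function name `als_factorize` and the parameters `n_factors`, `n_iters` and `reg` are assumptions made for illustration.

```python
import numpy as np


def als_factorize(R, mask, n_factors=2, n_iters=10, reg=0.1, seed=0):
    """Approximate R by U @ V.T, fitting only the entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    ridge = reg * np.eye(n_factors)
    for _ in range(n_iters):
        # Hold V fixed: each user's factors solve a small ridge regression.
        for u in range(n_users):
            Vu = V[mask[u]]
            U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ R[u, mask[u]])
        # Hold U fixed: each item's factors solve a small ridge regression.
        for i in range(n_items):
            Ui = U[mask[:, i]]
            V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ R[mask[:, i], i])
    return U, V


# Toy rating matrix; 0 marks a missing rating in this sketch.
R = np.array([[5, 4, 0, 1],
              [4, 0, 1, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0

U, V = als_factorize(R, mask)
pred = U @ V.T
train_rmse = np.sqrt(np.mean((pred[mask] - R[mask]) ** 2))
print(train_rmse)
```

Each half-step is an ordinary least-squares problem with a closed-form solution, which is why the method alternates between the two factor matrices instead of optimizing both at once.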

Given the predicted ratings and test dataset, we are going to evaluate our model by four metrics:
* mean average precision (MAP)
* mean reciprocal rank (MRR)
* normalized discounted cumulative gain (NDCG)
* root mean squared error (RMSE)

In [4]:
# Ground-truth test ratings, restricted to the users we have predictions for.
y_true = RatingMatrix(user_movie_test.get_rating_matrix()[y_pred.get_users()])

map_score_value = map_score(y_true, y_pred, top=10)
mrr_score_value = mrr_score(y_true, y_pred, top=10)
ndcg_score_value = ndcg_score(y_true, y_pred, top=10)
rmse_score_value = rmse_score(y_true, y_pred)

print(f'MAP: {map_score_value}')
print(f'MRR: {mrr_score_value}')
print(f'NDCG: {ndcg_score_value}')
print(f'RMSE: {rmse_score_value}')

MAP: 0.19645168170058033
MRR: 0.47359939864345146
NDCG: 0.2863324829777592
RMSE: 3.2470191994977236


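MAP is the mean, over users, of the average precision of each top-10 list: precision is computed at every position that holds a relevant item, and those values are averaged. A MAP of 0.196 is therefore a position-sensitive measure of ranking quality, not simply the share of relevant items among the top 10.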
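
As an illustration of how such a score is built, the sketch below computes average precision at k for one ranked list and averages it over users. The function names are hypothetical, and normalizing by `min(k, number of relevant items)` is one common convention; `map_score` in `src.metrics` may normalize differently.

```python
def average_precision_at_k(ranked_items, relevant_items, k=10):
    """Average of precision@i over the positions i <= k that hold a relevant item."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked_items[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    # Assumed convention: normalize by the best achievable number of hits.
    return precision_sum / min(k, len(relevant))


def mean_average_precision(rankings, relevants, k=10):
    """Mean of the per-user average precision values."""
    scores = [average_precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


# Hits at positions 1 and 3: (1/1 + 2/3) / 2
ap = average_precision_at_k(["A", "B", "C", "D"], {"A", "C"}, k=4)
map_value = mean_average_precision([["A", "B", "C", "D"], ["X", "Y"]],
                                   [{"A", "C"}, {"Y"}], k=4)
print(ap, map_value)
```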

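An MRR of 0.474 is the average, over users, of the reciprocal rank of the first relevant item. Since 1/0.474 ≈ 2.1, this corresponds to a harmonic-mean rank of about 2.1, i.e. the first relevant item typically appears around the 2nd position in the recommendation list.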
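
The small sketch below shows how a mean reciprocal rank is computed and why it should not be read as the inverse of the average rank. The helper `reciprocal_rank` is a hypothetical name, and returning 0 when no relevant item is in the top k is an assumed convention.

```python
def reciprocal_rank(ranked_items, relevant_items, k=10):
    """1 / position of the first relevant item in the top k, or 0.0 if there is none."""
    relevant = set(relevant_items)
    for position, item in enumerate(ranked_items[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


ranking = ["a", "b", "c", "d"]
# Three toy users whose first relevant item sits at rank 1, 2 and 4.
relevant_per_user = [{"a"}, {"b"}, {"d"}]

rr = [reciprocal_rank(ranking, rel) for rel in relevant_per_user]
mrr = sum(rr) / len(rr)          # (1 + 1/2 + 1/4) / 3
mean_rank = (1 + 2 + 4) / 3      # arithmetic mean of the first-hit ranks
print(mrr, 1 / mean_rank)
```

Here the MRR is about 0.583 while the inverse of the mean rank is about 0.429, so the two readings differ even on three users.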

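NDCG measures the quality of the recommendations by taking the positions of the relevant items into account: items ranked higher contribute more to the score. The score is normalized by that of the ideal ordering, so 1 corresponds to a perfect ranking; our value is 0.286.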
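
One common formulation of NDCG, with linear gain and a log2 position discount, can be sketched as follows. The function names are hypothetical and `ndcg_score` in `src.metrics` may use a different gain or discount.

```python
import numpy as np


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) over the top k positions."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))


def ndcg_at_k(relevances, k=10):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance of the items in the order the model ranked them.
predicted_order = [3, 2, 0, 1]
ndcg = ndcg_at_k(predicted_order, k=4)
print(ndcg)
```

Swapping the last two items would give the ideal order and an NDCG of exactly 1, which shows how mistakes lower in the list cost less than mistakes near the top.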

RMSE measures the difference between the predicted and actual ratings; in our case it is about 3.25 (see the output above), which is higher than the baseline.
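
For reference, RMSE over the observed ratings can be sketched in a few lines of NumPy. The function `rmse` and its `mask` argument are illustrative assumptions; how `rmse_score` in `src.metrics` treats unrated entries is not shown in this notebook.

```python
import numpy as np


def rmse(y_true, y_pred, mask=None):
    """Root mean squared error, optionally restricted to entries where mask is True."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if mask is None:
        mask = np.ones_like(y_true, dtype=bool)
    diff = (y_true - y_pred)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy example; 0 marks a missing rating in this sketch.
actual = np.array([[4, 0], [3, 5]], dtype=float)
predicted = np.array([[3, 2], [3, 2]], dtype=float)

masked_value = rmse(actual, predicted, mask=actual > 0)   # errors 1, 0, 3
unmasked_value = rmse(actual, predicted)                  # also counts the missing entry
print(masked_value, unmasked_value)
```

Whether missing entries are masked out changes the result, so it is worth checking which convention a metric implementation follows before comparing RMSE values across models.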