# Lab 10 - Recommender Systems Exercises

## Task 1: Filtering

A collaborative filtering system does not have to return the model's top recommendations as-is; it can also take user preferences into account. For this optional exercise, your task is to augment or modify the `get_top_recommendations` function to improve the recommendations returned for a given user.

In [1]:
# Do not modify this cell

!pip install scikit-surprise --upgrade

import pandas as pd
import numpy as np
from surprise import Dataset
from surprise import Reader
from surprise import KNNBasic

# Load each file from the local data folder; fall back to the hosted copy if it is missing
try:
    ratings = pd.read_csv('../data/movie_ratings.csv')
except FileNotFoundError:
    ratings = pd.read_csv('https://raw.githubusercontent.com/GUC-DM/W2021-Berlin/main/data/movie_ratings.csv')

try:
    movies_db = pd.read_csv('../data/movies_db.csv')
except FileNotFoundError:
    movies_db = pd.read_csv('https://raw.githubusercontent.com/GUC-DM/W2021-Berlin/main/data/movies_db.csv')

# We'll set the TMDB ID as the index for quick indexing by ID
movies_db = movies_db.set_index('tmdbId')


# The Reader class is used to parse a file containing ratings
# Since we already loaded it as a dataframe, we only need to set the rating_scale parameter.
reader = Reader(rating_scale=(0.5, 5))

# The columns must correspond to user id, item id and ratings (in that order).
data = Dataset.load_from_df(ratings[['userId', 'tmdbId', 'rating']], reader)

sim_options_user = {
    'name': 'cosine', # there are other options as well, including pearson
    'user_based': True  # compute similarities between users
}

user_knn_model = KNNBasic(k=40, min_k=1, sim_options=sim_options_user)

# Builds a training set from the entire dataset (no splitting is done)
# Needed to use the models for recommendations
trainset = data.build_full_trainset()

# Fit the model to the training set
user_knn_model.fit(trainset)

Computing the cosine similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNBasic at 0x7f22922d9700>

**Augment/Modify the function below to improve the recommendations returned to the user**

**Hint**: Consider how you could filter the recommendations returned by the model using the attributes available in the `movies_db` dataset. You may also test with different user IDs.

In [2]:
def get_top_recommendations(user_id, n=10):
    # Get the IDs of movies that the user has already rated
    rated_movies = ratings.loc[ratings['userId'] == user_id, 'tmdbId']
    
    # Get the IDs of movies that were not yet rated by the user
    # Note: ~ negates the boolean mask element-wise (bitwise NOT)
    movies_to_predict = movies_db[~movies_db.index.isin(rated_movies)].index

    # Setup dataframe to use for building and sorting the movie rating predictions for the user
    user_predictions = pd.DataFrame(movies_to_predict)

    # Predict the user's rating for each of the movies that were not previously rated
    user_predictions['predicted_rating'] = user_predictions['tmdbId'].apply(lambda movie_id: user_knn_model.predict(user_id, movie_id).est)

    # Return the top n recommendations based on the predicted score (and merge with movies_db to see movie title, genre, etc.)
    return user_predictions.merge(movies_db.reset_index()).nlargest(n, 'predicted_rating')

In [3]:
# Movies the given user (id: 1) has already rated
ratings.loc[ratings['userId'] == 1].merge(movies_db.reset_index()).sort_values('rating', ascending=False)

Unnamed: 0,userId,tmdbId,rating,imdb_id,title,overview,original_language,vote_average,vote_count,release_year,genre_1,genre_2
4,1,11216,4.0,tt0095765,Cinema Paradiso,"A filmmaker recalls his childhood, when he fel...",it,8.2,834.0,1988.0,Drama,Romance
13,1,97,4.0,tt0084827,Tron,As Kevin Flynn searches for proof that he inve...,en,6.6,717.0,1982.0,Science Fiction,Action
12,1,1051,4.0,tt0067116,The French Connection,Tough narcotics detective 'Popeye' Doyle is in...,en,7.4,435.0,1971.0,Action,Crime
8,1,6114,3.5,tt0103874,Dracula,When Dracula leaves the captive Jonathan Harke...,en,7.1,1087.0,1992.0,Romance,Horror
19,1,11072,3.0,tt0071230,Blazing Saddles,A town – where everyone seems to be named John...,en,7.2,619.0,1974.0,Western,Comedy
1,1,11360,3.0,tt0033563,Dumbo,Dumbo is a baby elephant born with oversized e...,en,6.8,1206.0,1941.0,Animation,Family
2,1,819,3.0,tt0117665,Sleepers,Two gangsters seek revenge on the state jail w...,en,7.3,729.0,1996.0,Crime,Drama
14,1,8393,3.0,tt0080801,The Gods Must Be Crazy,Misery is brought to a small group of Sho in t...,en,7.1,251.0,1980.0,Action,Comedy
17,1,9426,2.5,tt0091064,The Fly,When Seth Brundle makes a huge scientific and ...,en,7.1,1038.0,1986.0,Horror,Science Fiction
0,1,9909,2.5,tt0112792,Dangerous Minds,Former Marine Louanne Johnson lands a gig teac...,en,6.4,249.0,1995.0,Drama,Crime


In [4]:
get_top_recommendations(1)

Unnamed: 0,tmdbId,predicted_rating,imdb_id,title,overview,original_language,vote_average,vote_count,release_year,genre_1,genre_2
49,49133,5.0,tt0110299,Lamerica,"Fiore, an Italian conman, arrives in post Comm...",it,7.7,11.0,1994.0,Drama,Foreign
160,48787,5.0,tt0110604,Mute Witness,"Billy is mute, but it hasn't kept her from bec...",en,6.4,36.0,1995.0,Thriller,Foreign
268,30304,5.0,tt0114129,Picture Bride,"Riyo, an orphaned 17-year old, sails from Yoko...",en,7.4,5.0,1995.0,Drama,History
276,159185,5.0,tt0110769,"Red Firecracker, Green Firecracker",A woman inherits her father's fireworks factor...,zh,7.0,2.0,1994.0,Drama,
596,753,5.0,tt0062952,Faces,An old married man leaves his wife for a young...,en,7.1,36.0,1968.0,Drama,
632,85778,5.0,tt0110480,Maya Lin: A Strong Clear Vision,A film about the work of the artist most famou...,en,0.0,0.0,1995.0,Documentary,
636,22621,5.0,tt0113280,Heavy,Victor is a cook who works in a greasy bar/res...,en,7.7,11.0,1995.0,Drama,Romance
686,48144,5.0,tt0111424,The Day the Sun Turned Cold,,zh,7.0,2.0,1994.0,,
705,11985,5.0,tt0109066,Vive L'Amour,The film focuses on three city folks who unkno...,zh,7.3,16.0,1994.0,Drama,
781,23114,5.0,tt0027893,Little Lord Fauntleroy,An American boy turns out to be the heir of a ...,en,6.6,13.0,1936.0,Drama,Family
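
In the output above, every recommended movie has a predicted rating of 5.0 and a very low `vote_count` (between 0 and 36), while the movies user 1 actually rated have vote counts in the hundreds. One possible way to act on the hint is to post-filter the predictions by attributes from `movies_db` before taking the top `n`. The following is only a minimal sketch under stated assumptions: the helper name `filter_recommendations`, its parameters, and the threshold `min_votes=10` are hypothetical choices, not part of the lab. It expects a dataframe shaped like the merged predictions built inside `get_top_recommendations`, and the sample rows are copied from the output above.

```python
import pandas as pd

def filter_recommendations(predictions, n=10, min_votes=10, languages=None):
    """Hypothetical helper: filter predicted ratings by movie attributes, then take the top n."""
    # Keep only movies with at least min_votes votes (threshold is an assumption)
    kept = predictions[predictions['vote_count'] >= min_votes]
    # Optionally restrict to a set of original languages
    if languages is not None:
        kept = kept[kept['original_language'].isin(languages)]
    # Rank by predicted rating, using vote_average to break ties
    return kept.sort_values(['predicted_rating', 'vote_average'], ascending=False).head(n)

# Sample rows taken from the get_top_recommendations(1) output above
sample = pd.DataFrame({
    'title': ['Lamerica', 'Picture Bride', 'Red Firecracker, Green Firecracker',
              'Faces', 'Maya Lin: A Strong Clear Vision'],
    'original_language': ['it', 'en', 'zh', 'en', 'en'],
    'vote_average': [7.7, 7.4, 7.0, 7.1, 0.0],
    'vote_count': [11.0, 5.0, 2.0, 36.0, 0.0],
    'predicted_rating': [5.0, 5.0, 5.0, 5.0, 5.0],
})

top = filter_recommendations(sample, n=3)
print(top[['title', 'vote_count', 'vote_average']])
```

Inside `get_top_recommendations`, the same filter could be applied to the merged dataframe in place of the direct `nlargest` call. Other attributes such as `release_year` or `genre_1` could be used in the same way, for example by preferring genres the user has rated highly.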


## Task 2: Hybrid Recommender System

Create a hybrid recommender system by combining the collaborative and content-based recommender systems shown in the lab.

_Extra: take it a step further and incorporate the demographic filtering system into the results._
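
One common way to combine two recommenders is a weighted blend of their scores. The sketch below is an illustration, not the lab's reference solution. It assumes each recommender has already produced a score per movie (for example, predicted ratings from the collaborative model and similarity scores from the content-based one). Because the two scales differ, each set of scores is min-max scaled to [0, 1] before averaging. The function names, the weight `alpha`, and the toy scores for items `A`, `B`, and `C` are all made up for illustration.

```python
def min_max_scale(scores):
    """Rescale a dict of item -> score to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no item is preferred over another
        return {item: 0.0 for item in scores}
    return {item: (value - lo) / (hi - lo) for item, value in scores.items()}

def hybrid_scores(cf_scores, content_scores, alpha=0.5):
    """Blend two score dicts; alpha is the weight given to the collaborative scores."""
    cf = min_max_scale(cf_scores)
    content = min_max_scale(content_scores)
    # Only items scored by both recommenders can be blended
    common = cf.keys() & content.keys()
    return {item: alpha * cf[item] + (1 - alpha) * content[item] for item in common}

# Toy inputs (made-up values for illustration only)
cf_scores = {'A': 4.5, 'B': 3.0, 'C': 0.5}       # predicted ratings on a 0.5-5 scale
content_scores = {'A': 0.2, 'B': 0.9, 'C': 0.1}  # similarity scores

combined = hybrid_scores(cf_scores, content_scores, alpha=0.5)
ranked = sorted(combined, key=combined.get, reverse=True)
print(ranked)
```

Setting `alpha` closer to 1 favors the collaborative predictions, and closer to 0 favors content similarity. A demographic score, if available, could be added as a third weighted term in the same way.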