# DSAIT4335 Recommender Systems
# Assignment 1: Content-based Recommendation

In this assignment, you will build a content-based recommendation model using different text processing methods and then apply it to a public dataset. The dataset is **MovieLens100K**, a movie recommendation dataset collected by GroupLens: https://grouplens.org/datasets/movielens/100k/.

By the end of this assignment, you will:
1. Understand the fundamental principles of content-based recommender systems
2. Develop a feature extraction method (BERT)
3. Build user and item profiles from content features
4. Perform both rating prediction and top-k recommendation tasks
5. Evaluate content-based methods to understand their strengths/limitations

# Instructions

The MovieLens100K dataset is already split into 80% training and 20% test sets. Along with the training and test sets, movie metadata is provided as content information.

**Expected file structure** for this assignment:   
   
   ```
   Assignment1/
   ├── training.txt
   ├── test.txt
   ├── movies.txt
   └── hw1.ipynb
   ```

**Note:** Be sure to run all cells in each section sequentially, so that intermediate variables and packages are properly carried over to subsequent cells.

**Submission:** Answer all the questions in this Jupyter notebook. Submit the notebook (with your answers included) to Brightspace. Rename the file to your name: firstname-lastname.ipynb.

# Setup

Import necessary libraries/packages.

In [2]:
# !pip install transformers torch  # For BERT

# You can refer to https://huggingface.co/docs/transformers/en/model_doc/bert for the various versions of the pre-trained BERT model

# Check if transformers & torch are in conda list
!conda list | grep -E 'transformers|torch'

pytorch-mutex             1.0                        cuda    pytorch-nightly
sentence-transformers     3.3.1              pyhd8ed1ab_0    conda-forge
torch                     2.6.0                    pypi_0    pypi
torch-tb-profiler         0.4.3                    pypi_0    pypi
torchvision               0.21.0                   pypi_0    pypi
transformers              4.51.0.dev0              pypi_0    pypi


In [4]:
# For BERT embeddings (install: pip install transformers torch)
print("Check the status of BERT installation:")

try:
    from transformers import AutoTokenizer, AutoModel
    import torch
    BERT_AVAILABLE = True
    print("BERT libraries loaded successfully!")
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print(f"Using device: {device}")
except ImportError:
    BERT_AVAILABLE = False
    print("BERT libraries not available. Install with: pip install transformers torch")

Check the status of BERT installation:
BERT libraries loaded successfully!
Using device: cpu


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tqdm import tqdm
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances
from sklearn.preprocessing import StandardScaler, MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
import re
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

print("Basic libraries imported successfully!")

# 1) MovieLens100K dataset

Before building a content-based recommender, we need to perform exploratory data analysis to understand the data and its content features.

Preliminary: Take a look at the following dataframes to see what the training, test, and movie data files contain.

In [None]:
# loading the training set and test set
columns_name=['user_id','item_id','rating','timestamp']
train_data = pd.read_csv('training.txt', sep='\t', names=columns_name)
test_data = pd.read_csv('test.txt', sep='\t', names=columns_name)

print(f'The training data:')
display(train_data[['user_id','item_id','rating']].head())
print(f'The shape of the training data: {train_data.shape}')
print('--------------------------------')
print(f'The test data:')
display(test_data[['user_id','item_id','rating']].head())
print(f'The shape of the test data: {test_data.shape}')
# print(test_data.shape)

In [None]:
movies = pd.read_csv('movies.txt',names=['item_id','title','genres','description'],sep='\t')
movies.head()

### Question 1: How many users, items, and ratings are in the training set?

In [None]:
def get_data_stats(train_data):
    """
    Perform basic statistical analysis of the MovieLens100K dataset.
    """
    if train_data is None:
        print("Please load the MovieLens dataset first!")
        return
    
    n_users = 0
    n_movies = 0
    n_ratings = 0
    
    ############# Your code here ############
    
    #########################################
    
    return n_users, n_movies, n_ratings

n_users, n_movies, n_ratings = get_data_stats(train_data)
print("Dataset Analysis")
print("=" * 30)
print(f"Number of Users: {n_users:,}")
print(f"Number of Movies: {n_movies:,}")
print(f"Number of Actual Ratings: {n_ratings:,}")
print("-" * 40)

### Question 2: What is the sparsity of the data?
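
As a small illustration of the definition (not a reference solution), sparsity is commonly taken as one minus the fraction of observed user-item pairs. The sketch below uses a hypothetical toy rating list; the names `toy_ratings` and `toy_sparsity` are invented for this example, and it assumes each (user, item) pair appears at most once.

```python
# Toy example: 3 users, 4 items, 5 observed ratings (hypothetical data).
toy_ratings = [
    (1, 1, 5), (1, 3, 3),
    (2, 2, 4),
    (3, 1, 2), (3, 4, 1),
]

def toy_sparsity(ratings):
    """Fraction of empty cells in the user-item matrix implied by `ratings`."""
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    n_cells = len(users) * len(items)      # 3 * 4 = 12 possible pairs
    return 1.0 - len(ratings) / n_cells    # 1 - 5/12

print(toy_sparsity(toy_ratings))
```

Here 5 of the 12 possible cells are filled, so 7/12 of the matrix is missing.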

In [None]:
def get_UIM_sparsity(train_data):
    # Implement the function that returns the fraction of missing entries in the user-item rating matrix. 
    # You can call get_data_stats(train_data) function for getting the necessary variables.
    
    sparsity = 0.0

    ############# Your code here ############

    #########################################

    return sparsity

sparsity = get_UIM_sparsity(train_data)
print("Sparsity of the data is {}".format(sparsity))

### Question 3: Create the histogram of movie title length. Set the number of bins to 20.

In [None]:
def hist_title_length(movies):
    # Given the movies dataframe, implement the function that generates a histogram of movie title lengths. 
    # Hint: in the histogram, the x-axis shows the title length, and the y-axis shows the number of movies with the corresponding length.

    ############# Your code here ############
    
    #########################################
    
hist_title_length(movies)

Discuss your observations.

### Question 4: Create the histogram of movie description length. Set the number of bins to 20.

In [None]:
def hist_description_length(movies):
    # Given the movies dataframe, implement the function that generates a histogram of movie description lengths. 
    # Hint: in the histogram, the x-axis shows the description length, and the y-axis shows the number of movies with the corresponding length.

    ############# Your code here ############
    
    #########################################
    
hist_description_length(movies)

Discuss your observations.

# 2) Deriving content representation with BERT

BERT (Bidirectional Encoder Representations from Transformers) provides rich, contextual embeddings that capture semantic meaning. See [Devlin, Jacob, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." NAACL-HLT 2019](https://aclanthology.org/N19-1423.pdf) for more details.
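
One common way to get a single vector per text from a BERT-style encoder is to keep the final hidden state at the first token position, which is what the expression `outputs.last_hidden_state[:, 0, :]` in `create_bert_embeddings` does. The sketch below only illustrates that slicing on a made-up NumPy array standing in for a model output; the shapes and values are hypothetical and no model is loaded.

```python
import numpy as np

# Hypothetical stand-in for a model output: 2 texts, 4 tokens each, hidden size 3.
batch_size, seq_len, hidden_dim = 2, 4, 3
last_hidden_state = np.arange(
    batch_size * seq_len * hidden_dim, dtype=float
).reshape(batch_size, seq_len, hidden_dim)

# Keep only the vector at position 0 of every sequence (the [CLS] position).
cls_embeddings = last_hidden_state[:, 0, :]

print(cls_embeddings.shape)  # (2, 3): one vector of size hidden_dim per text
print(cls_embeddings)
```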

In [14]:
def create_bert_embeddings(content):
    """
    Generate BERT embeddings for movie content.

    Args:
        content: Content of items

    Returns:
        numpy.ndarray: BERT embeddings matrix
    """
    if not BERT_AVAILABLE:
        print("BERT libraries not available. Install with: pip install transformers torch")
        return None

    if content is None:
        return None

    if isinstance(content, pd.Series):
        content = content.fillna("").astype(str).tolist()
    elif isinstance(content, np.ndarray):
        content = content.astype(str).tolist()

    model_name = 'distilbert-base-uncased'

    print(f"Loading BERT model: {model_name}")

    # Load tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    # Set device (GPU if available)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    model.eval()

    print(f"Using device: {device}")

    # Generate embeddings in batches
    batch_size = 32  # Adjust based on available memory
    emb = []

    for i in range(0, len(content), batch_size):
        if i % (batch_size * 10) == 0:
            n_batches = (len(content) + batch_size - 1) // batch_size  # ceiling division
            print(f"Processing batch {i//batch_size + 1}/{n_batches}")

        batch_texts = content[i:i + batch_size]

        # Tokenize batch
        inputs = tokenizer(
            batch_texts,
            padding=True,
            truncation=True,
            max_length=512,
            return_tensors='pt'
        )

        # Move to device
        inputs = {k: v.to(device) for k, v in inputs.items()}

        # Generate embeddings
        with torch.no_grad():
            outputs = model(**inputs)

            # Use [CLS] token embedding (first token)
            batch_embeddings = outputs.last_hidden_state[:, 0, :].cpu().numpy()
            emb.extend(batch_embeddings)

    emb = np.array(emb)

    print(f"BERT embeddings generated: {emb.shape}")
    print(f"Embedding dimension: {emb.shape[1]}")

    return emb

You can get the content representation using BERT as follows:

In [None]:
# Sample content
sample_content = ['aa','ab','ac','bc']

# Generate BERT embeddings (this may take several minutes)
print("Generating BERT embeddings...")
bert_embeddings = create_bert_embeddings(sample_content)

if bert_embeddings is not None:
    print("BERT embeddings created successfully!")
else:
    print("BERT embeddings not available. Continuing with TF-IDF only.")

### Question 5: Derive the representation of items for three types of content: 
    1) title + genres 
    2) description 
    3) title + genres + description
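
A minimal sketch of how text columns can be joined before embedding, using a hypothetical two-row dataframe (`toy_movies` is not part of the assignment data). It assumes a column may contain missing values; whether `movies.txt` actually does is not stated here. Filling them first matters because a string plus a missing value is itself missing in pandas.

```python
import pandas as pd

# Hypothetical two-row stand-in for the movies dataframe.
toy_movies = pd.DataFrame({
    "title": ["Toy Story (1995)", "GoldenEye (1995)"],
    "genres": ["Animation Comedy", None],
    "description": ["Toys come to life.", "A spy thriller."],
})

# Replace missing values with empty strings before concatenating.
cols = ["title", "genres", "description"]
filled = toy_movies[cols].fillna("")

# Join with single spaces and trim any leftover whitespace.
title_genres = (filled["title"] + " " + filled["genres"]).str.strip()
full_text = (title_genres + " " + filled["description"]).str.strip()

print(full_text.tolist())
```

The resulting Series can then be passed to `create_bert_embeddings` in the same way as the sample content above.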

In [None]:
# Implement code to derive the content representation for title and genres. Concatenate the two fields as: title + ' ' + genres
item_emb_titlegenres = None
############# Your code here ############

#########################################

# Implement code to derive the content representation for description.
item_emb_description = None
############# Your code here ############

#########################################

# Implement code to derive the content representation for title, genres, and description. Concatenate the three fields as: title + ' ' + genres + ' ' + description
item_emb_full = None
############# Your code here ############

#########################################

### Question 6: What is the embedding of item_id=100?

In [None]:
def get_item_emb(item_id, content_type):
    # Implement the function that given content type (title+genres, description, or title+genres+description) returns the embedding derived for the corresponding item_id. 
    # Hint1: keep in mind that item_id in the data starts from 1, but the embedding variable is indexed from 0, e.g., item_id 100 corresponds to index 99 in the embedding variable.
    # Hint2: use if-else conditions to return the embedding for the requested content types.
    # Hint3: use the global variables (embeddings) already computed in previous cells.

    emb = None
    
    ############# Your code here ############
    
    #########################################

    return emb
    
item_id = 100
print('Embedding representation for content type = title+genres:')
print(get_item_emb(item_id, 'title_genres'))
print('------------------------------------------')
print('Embedding representation for content type = description:')
print(get_item_emb(item_id, 'description'))
print('------------------------------------------')
print('Embedding representation for content type = full:')
print(get_item_emb(item_id, 'full'))
print('------------------------------------------')

# 3) User profile construction

### Question 7: What are the embeddings and ratings of interacted items by user_id=100?

In [None]:
def get_interacted_items_embs_rating(train_data, user_id, content_type):
    # Implement the function that, given a content type (title+genres, description, or title+genres+description), returns the embeddings and ratings of the items the given user interacted with. 
    # Hint1: use train_data to retrieve the item_ids that the target user (user_id=100 in this example) interacted with, then pass these item_ids to the previously implemented function to retrieve the embeddings, and collect the corresponding ratings.

    embs, ratings = [], []
    
    ############# Your code here ############
    
    #########################################

    return embs, ratings
    
user_id = 100
print('Embeddings and ratings of interacted items by user_id=100 for content type = title+genres:')
print(get_interacted_items_embs_rating(train_data, user_id, 'title_genres'))
print('------------------------------------------')
print('Embeddings and ratings of interacted items by user_id=100 for content type = description:')
print(get_interacted_items_embs_rating(train_data, user_id, 'description'))
print('------------------------------------------')
print('Embeddings and ratings of interacted items by user_id=100 for content type = full:')
print(get_interacted_items_embs_rating(train_data, user_id, 'full'))
print('------------------------------------------')

### Question 8: Derive the representation for user_id=100 using the following aggregation methods:
1. **avg:** Average representation of the interacted items 
2. **weighted_avg:** Weighted average representation of the interacted items, using the rating values as weights
3. **avg_pos:** Average representation of the positively interacted items (ratings >= 4)
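
To make the three aggregation methods concrete, here is a toy sketch on hypothetical 2-dimensional embeddings. Normalising the weighted average by the sum of ratings is one reasonable reading of "weighted average", not something the assignment text prescribes. Note also that `avg_pos` is undefined for a user with no rating of 4 or higher, so a real implementation would need some fallback.

```python
import numpy as np

# Hypothetical embeddings of three items a user rated (dimension 2) and the ratings.
item_embs = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [1.0, 1.0]])
ratings = np.array([5.0, 1.0, 4.0])

# avg: plain mean over all interacted items.
avg = item_embs.mean(axis=0)

# weighted_avg: ratings act as weights, normalised by their sum (an assumption).
weighted_avg = (ratings[:, None] * item_embs).sum(axis=0) / ratings.sum()

# avg_pos: mean over items with rating >= 4 only.
pos_mask = ratings >= 4
avg_pos = item_embs[pos_mask].mean(axis=0)

print(avg, weighted_avg, avg_pos)
```

In this toy case the low-rated second item pulls `avg` toward the second dimension, while `weighted_avg` and `avg_pos` reduce or remove its influence.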

In [None]:
def get_user_emb(train_data, user_id, content_type, aggregation_method):
    # Implement the function that, given a content type (title+genres, description, or title+genres+description) and an aggregation method (avg, weighted_avg, avg_pos), returns the representation of a user. 
    # Hint1: use the previously implemented functions for retrieving the ratings and representations of the items a user interacted with.

    emb = []
    
    ############# Your code here ############
    
    #########################################

    return emb
    
user_id = 100
content_type, aggregation_method = 'full', 'avg' # alternatives are content_type={title_genres,description,full} and aggregation_method={avg,weighted_avg,avg_pos}
print('Embeddings of user_id=100 for content type '+content_type+' by aggregation method '+aggregation_method+':')
print(get_user_emb(train_data, user_id, content_type, aggregation_method))

# 4) Content-based recommendation

Predict the rating for a user-item pair. Use the dot product between the user and item embeddings to predict the score.

### Question 9: What is the predicted score for user_id=100 and item_id=266?

**Note:** The predicted score might not be in the interval [1,5]. In the next part, after predicting the ratings for all user-item pairs, the predictions will be normalized.
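
A toy illustration of the dot-product score, with hypothetical vectors chosen so the arithmetic is easy to follow. It also shows why the raw score need not fall inside the rating scale.

```python
import numpy as np

user_emb = np.array([0.5, 0.25])   # hypothetical user profile
item_emb = np.array([8.0, 16.0])   # hypothetical item embedding

# Dot product: 0.5 * 8.0 + 0.25 * 16.0
score = float(np.dot(user_emb, item_emb))
print(score)  # 8.0, which lies outside [1, 5]
```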

In [None]:
def get_user_item_prediction(train_data, user_id, item_id, content_type, aggregation_method):
    # Implement the function that given content type and aggregation method returns the predicted rating for a user-item pair. 
    # Hint1: use the previously implemented functions for retrieving the embeddings and then compute the dot product of user and item embeddings.

    pred_rating = 0.0
    
    ############# Your code here ############
    
    #########################################

    return pred_rating
    
user_id, item_id = 100, 266
content_type, aggregation_method = 'full', 'avg' # alternatives are content_type={title_genres,description,full} and aggregation_method={avg,weighted_avg,avg_pos}
print('Predicted score for user_id=100 and item_id=266 for content type '+content_type+' and aggregation method '+aggregation_method+':')
print(get_user_item_prediction(train_data, user_id, item_id, content_type, aggregation_method))

# 5) Metrics

For this part, refer to the lecture on "Evaluation of Recommender Systems" where different metrics are described.

### Question 10: Implement MAE, MSE, and RMSE for rating prediction task.
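
As a hand-checkable illustration of the three error metrics, the sketch below works through a hypothetical three-rating example in plain Python. It is meant as a sanity check for your own functions, not as their implementation.

```python
import math

actual = [4, 3, 5]       # hypothetical true ratings
predicted = [3, 3, 3]    # hypothetical predictions

# Per-rating errors: [1, 0, 2]
errors = [a - p for a, p in zip(actual, predicted)]

mae = sum(abs(e) for e in errors) / len(errors)   # (1 + 0 + 2) / 3
mse = sum(e * e for e in errors) / len(errors)    # (1 + 0 + 4) / 3
rmse = math.sqrt(mse)                             # square root of the MSE

print(mae, mse, rmse)
```

Because the errors are squared before averaging, the single error of 2 contributes four times as much to MSE as the error of 1, which is why RMSE exceeds MAE here.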

In [57]:
def MAE(actual_ratings, pred_ratings):
    # Implement a function that computes MAE error between actual ratings and predicted ratings. 
    # Note that actual_ratings and pred_ratings are lists.
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def MSE(actual_ratings, pred_ratings):
    # Implement a function that computes MSE error between actual ratings and predicted ratings. 
    # Note that actual_ratings and pred_ratings are lists.
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def RMSE(actual_ratings, pred_ratings):
    # Implement a function that computes RMSE error between actual ratings and predicted ratings. 
    # Note that actual_ratings and pred_ratings are lists.
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

### Question 11: Implement Precision, Recall, NDCG, MRR, and MAP for ranking task.
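
The sketch below computes the per-user quantities behind these metrics for a single hypothetical user, under binary relevance. Conventions differ between references, especially for the normaliser of average precision and for the NDCG gain, so follow the lecture's definitions where they differ. The helper name `toy_ranking_metrics` is invented here, and it assumes a non-empty relevant set and a non-empty recommendation list. MRR and MAP are then the means of the reciprocal rank and the average precision over all users.

```python
import math

def toy_ranking_metrics(relevant, recommended):
    """Binary-relevance metrics for ONE user; k is len(recommended)."""
    k = len(recommended)
    hits = [1 if item in relevant else 0 for item in recommended]
    n_hits = sum(hits)

    precision = n_hits / k
    recall = n_hits / len(relevant)

    # Reciprocal rank of the first relevant item (0 if none is found).
    rr = next((1.0 / (idx + 1) for idx, h in enumerate(hits) if h), 0.0)

    # Average precision: precision at each hit position, normalised by
    # min(number of relevant items, k) (one of several conventions).
    running, ap_sum = 0, 0.0
    for rank, h in enumerate(hits, start=1):
        if h:
            running += 1
            ap_sum += running / rank
    ap = ap_sum / min(len(relevant), k)

    # NDCG with gain 1 for a hit and a log2(rank + 1) discount.
    dcg = sum(h / math.log2(rank + 1) for rank, h in enumerate(hits, start=1))
    n_ideal = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, n_ideal + 1))
    ndcg = dcg / idcg

    return precision, recall, rr, ap, ndcg

# Hypothetical user: 3 relevant items, 4 recommendations, hits at ranks 2 and 4.
print(toy_ranking_metrics({20, 40, 50}, [10, 20, 30, 40]))
```

For this user, precision is 2/4, recall is 2/3, the reciprocal rank is 1/2, and the average precision is (1/2 + 2/4) / 3 = 1/3.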

In [None]:
def Precision(ground_truth, rec_list):
    # Implement a function that computes Precision across ground truth data and recommendation list generated for each user. 
    # Note that ground_truth and rec_list contain the list of items for all users, i.e., they are 2-dimensional (one list of items per user).
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def Recall(ground_truth, rec_list):
    # Implement a function that computes Recall across ground truth data and recommendation list generated for each user. 
    # Note that ground_truth and rec_list contain the list of items for all users, i.e., they are 2-dimensional (one list of items per user).
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def NDCG(ground_truth, rec_list):
    # Implement a function that computes NDCG across ground truth data and recommendation list generated for each user. 
    # Note that ground_truth and rec_list contain the list of items for all users, i.e., they are 2-dimensional (one list of items per user).
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def MRR(ground_truth, rec_list):
    # Implement a function that computes MRR across ground truth data and recommendation list generated for each user. 
    # Note that ground_truth and rec_list contain the list of items for all users, i.e., they are 2-dimensional (one list of items per user).
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

def MAP(ground_truth, rec_list):
    # Implement a function that computes MAP across ground truth data and recommendation list generated for each user. 
    # Note that ground_truth and rec_list contain the list of items for all users, i.e., they are 2-dimensional (one list of items per user).
    
    result = 0.0
    
    ############# Your code here ############
    
    #########################################

    return result

# 6) Evaluation of content-based recommender for rating prediction task

### Question 12: Predict the ratings for all user-item pairs in test set and compute MAE, MSE, and RMSE. Discuss your observations.
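
The rescaling step used in this question, `1 + (pred - min_val) * (4 / (max_val - min_val))`, is a min-max mapping onto [1,5]. The sketch below applies it to hypothetical raw scores. The helper name `rescale_to_rating_range` and the midpoint fallback for identical scores are assumptions made for this example only.

```python
def rescale_to_rating_range(preds, low=1.0, high=5.0):
    """Min-max map raw scores into [low, high]."""
    min_val, max_val = min(preds), max(preds)
    if max_val == min_val:
        # All scores identical: the formula would divide by zero, so fall
        # back to the midpoint (an arbitrary choice for this sketch).
        return [(low + high) / 2.0 for _ in preds]
    scale = (high - low) / (max_val - min_val)
    return [low + (p - min_val) * scale for p in preds]

raw_scores = [2.0, 4.0, 10.0]   # hypothetical dot-product scores
print(rescale_to_rating_range(raw_scores))  # [1.0, 2.0, 5.0]
```

The smallest score always maps to 1 and the largest to 5, so the result depends on the set of predictions being rescaled together.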

In [None]:
def evaluate_rating_prediction(train_data, test_data, content_type, aggregation_method):
    # Implement a function that first computes the representation of users and then predicts the rating for each user-item pair. Finally, call the implemented metrics to measure the error.
    # Hint: the reason for pre-computing the representation of all users is to speed up the experiments and to avoid too many unnecessary computations.
    # Note: Make sure to map the predicted scores into the [1,5] interval. The predictions from the content-based model are not necessarily on the 5-star rating scale.

    # here we compute the representation of users
    users_emb = []
    users = list(train_data['user_id'].unique())
    for user in tqdm(users):
        users_emb.append(get_user_emb(train_data, user, content_type, aggregation_method))

    print('Computing the representation of users is done!')

    actual_ratings, pred_ratings = [], []
    
    ############# Your code here ############
    
    #########################################

    # Given predicted ratings, map them into [1,5] interval using -> 1 + (pred - min_val) * (4 / (max_val - min_val))
    ############# Your code here ############
    
    #########################################

    mae_value, mse_value, rmse_value = 0.0, 0.0, 0.0

    # compute the metrics: MAE, MSE, RMSE 
    ############# Your code here ############
    
    #########################################

    return mae_value, mse_value, rmse_value

print('Performance of content-based recommender for content type = title+genres and aggregation method = avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'title_genres', 'avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'description', 'avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'full', 'avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = title+genres and aggregation method = weighted_avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'title_genres', 'weighted_avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = weighted_avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'description', 'weighted_avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = weighted_avg:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'full', 'weighted_avg')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = title+genres and aggregation method = avg_pos:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'title_genres', 'avg_pos')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = avg_pos:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'description', 'avg_pos')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = avg_pos:')
mae_value, mse_value, rmse_value = evaluate_rating_prediction(train_data, test_data, 'full', 'avg_pos')
print('MAE='+str(round(mae_value,5))+', MSE='+str(round(mse_value,5))+', RMSE='+str(round(rmse_value,5)))
print('------------------------------------------')

Discuss your observations.

### Question 13: Generate recommendation list of size 10 for each user and compute Precision, Recall, NDCG, MRR, and MAP. Discuss your observations.
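
One way to pick the top-k unseen items from a score vector is to mask the seen items and sort. The sketch below uses hypothetical scores and 0-based item positions (recall that item_id 100 sits at index 99 in the embedding variables). The helper `top_k_unseen` is invented for illustration and assumes at least k unseen items exist.

```python
import numpy as np

def top_k_unseen(scores, seen_idx, k):
    """Return the positions of the k highest-scoring items not in `seen_idx`."""
    masked = np.array(scores, dtype=float)        # np.array makes a copy
    seen = np.array(sorted(seen_idx), dtype=int)
    masked[seen] = -np.inf                        # push seen items to the bottom
    order = np.argsort(-masked, kind="stable")    # descending by score
    return order[:k].tolist()

scores = [0.2, 0.9, 0.5, 0.7, 0.1]   # hypothetical scores for positions 0..4
seen = {1}                           # the user already rated the item at position 1
print(top_k_unseen(scores, seen, k=2))  # [3, 2]
```

The item at position 1 has the highest score but is excluded because the user has already interacted with it.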

In [None]:
def evaluate_ranking(train_data, test_data, content_type, aggregation_method):
    # Named evaluate_ranking so that it does not overwrite evaluate_rating_prediction from Question 12.
    # Implement a function that first computes the representation of users and then predicts the relevance score of all items for each user. Next, return the 10 unseen items with the highest predicted relevance score for each user as the recommendation list. Finally, call the implemented metrics to measure the accuracy of the recommendations.
    # Hint: the reason for pre-computing the representation of all users is to speed up the experiments and to avoid too many unnecessary computations.
    # Hint: no normalization is needed.

    # here we compute the representation of users
    users_emb = []
    users = list(train_data['user_id'].unique())
    for user in users:
        users_emb.append(get_user_emb(train_data, user, content_type, aggregation_method))

    ground_truth, rec_list = [], []
    
    ############# Your code here ############
    
    #########################################

    precision_value, recall_value, ndcg_value, mrr_value, map_value = 0.0, 0.0, 0.0, 0.0, 0.0

    # compute the metrics: Precision, Recall, NDCG, MRR, MAP
    ############# Your code here ############
    
    #########################################

    return precision_value, recall_value, ndcg_value, mrr_value, map_value

print('Performance of content-based recommender for content type = title+genres and aggregation method = avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'title_genres', 'avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'description', 'avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'full', 'avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = title+genres and aggregation method = weighted_avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'title_genres', 'weighted_avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = weighted_avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'description', 'weighted_avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = weighted_avg:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'full', 'weighted_avg')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = title+genres and aggregation method = avg_pos:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'title_genres', 'avg_pos')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = description and aggregation method = avg_pos:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'description', 'avg_pos')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')
print('Performance of content-based recommender for content type = full and aggregation method = avg_pos:')
precision_value, recall_value, ndcg_value, mrr_value, map_value = evaluate_ranking(train_data, test_data, 'full', 'avg_pos')
print('Precision='+str(round(precision_value,5))+', Recall='+str(round(recall_value,5))+', NDCG='+str(round(ndcg_value,5))+', MRR='+str(round(mrr_value,5))+', MAP='+str(round(map_value,5)))
print('------------------------------------------')

Discuss your observations.

# Extended experiments

### Question 14: Implement a baseline for the rating prediction task that returns the average rating of the target item as the predicted rating. 

For example, for a user u and item i, the prediction is the average of ratings given to i in training data.
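
A toy sketch of the item-average idea on hypothetical data. The assignment does not say what to predict for a test item that never appears in the training data; falling back to the global mean rating, as done here, is only one possible choice.

```python
import pandas as pd

# Hypothetical miniature training and test sets.
toy_train = pd.DataFrame({
    "user_id": [1, 2, 3, 1],
    "item_id": [10, 10, 10, 20],
    "rating":  [4, 5, 3, 2],
})
toy_test = pd.DataFrame({"user_id": [2, 3], "item_id": [20, 30]})

# Average training rating per item: item 10 -> 4.0, item 20 -> 2.0.
item_mean = toy_train.groupby("item_id")["rating"].mean()

# Fallback for items unseen in training (an assumption): the global mean, 3.5.
global_mean = toy_train["rating"].mean()

preds = toy_test["item_id"].map(item_mean).fillna(global_mean)
print(preds.tolist())  # [2.0, 3.5]
```

The prediction ignores the user entirely, which is what makes it a useful non-personalized reference point.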

#### Evaluate the performance of this baseline in terms of MAE, MSE, RMSE, and compare it with the results in Question 12. 

In [92]:
def evaluate_item_avg_baseline(train_data, test_data):
    # Hint: no normalization is needed.
    
    actual_ratings, pred_ratings = [], []
    
    ############# Your code here ############
    
    #########################################

    mae_value, mse_value, rmse_value = 0.0, 0.0, 0.0

    # compute the metrics: MAE, MSE, RMSE 
    ############# Your code here ############
    
    #########################################

    return mae_value, mse_value, rmse_value

Discuss your observations:

### Question 15: Implement the following baselines for ranking task:

**Random:** Randomly recommend 10 unseen items (items the target user has not interacted with in the training data) to each user.

**Popular:** Recommend the 10 most popular items that the target user has not yet interacted with. The most popular items are those rated by the largest number of users in the training data.
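
Both baselines can be illustrated on a handful of hypothetical interactions. The helper names, the string item ids, and the fixed seed below are invented for this sketch. Popularity is measured as the number of training interactions per item, which equals the number of users who rated it as long as each user rates an item at most once.

```python
import random
from collections import Counter

# Hypothetical (user_id, item_id) interactions from a training set.
toy_interactions = [(1, "a"), (2, "a"), (3, "a"), (1, "b"), (2, "b"), (3, "c"), (2, "d")]

def seen_items(interactions, user_id):
    """Items the given user interacted with in the training data."""
    return {item for user, item in interactions if user == user_id}

def toy_popular_recs(interactions, user_id, k):
    """Top-k most-rated items the given user has not interacted with."""
    popularity = Counter(item for _, item in interactions)
    seen = seen_items(interactions, user_id)
    ranked = [item for item, _ in popularity.most_common() if item not in seen]
    return ranked[:k]

def toy_random_recs(interactions, user_id, k, seed=0):
    """Up to k items sampled uniformly from the user's unseen items."""
    all_items = {item for _, item in interactions}
    unseen = sorted(all_items - seen_items(interactions, user_id))
    rng = random.Random(seed)   # fixed seed only to make the sketch repeatable
    return rng.sample(unseen, min(k, len(unseen)))

print(toy_popular_recs(toy_interactions, user_id=3, k=2))  # ['b', 'd']
print(toy_random_recs(toy_interactions, user_id=3, k=2))
```

User 3 has already seen items "a" and "c", so the most popular item overall ("a") is skipped and the popular list starts at "b".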

#### Evaluate the performance of these baselines in terms of Precision, Recall, NDCG, MRR, and MAP and compare them with the results in Question 13. 

In [None]:
def evaluate_random_baseline(train_data, test_data):
    # Hint: no normalization is needed.
    
    ground_truth, rec_list = [], []
    
    ############# Your code here ############
    
    #########################################

    precision_value, recall_value, ndcg_value, mrr_value, map_value = 0.0, 0.0, 0.0, 0.0, 0.0

    # compute the metrics: Precision, Recall, NDCG, MRR, MAP 
    ############# Your code here ############
    
    #########################################

    return precision_value, recall_value, ndcg_value, mrr_value, map_value

In [94]:
def evaluate_popular_baseline(train_data, test_data):
    # Hint: no normalization is needed.
    
    ground_truth, rec_list = [], []
    
    ############# Your code here ############
    
    #########################################

    precision_value, recall_value, ndcg_value, mrr_value, map_value = 0.0, 0.0, 0.0, 0.0, 0.0

    # compute the metrics: Precision, Recall, NDCG, MRR, MAP
    ############# Your code here ############
    
    #########################################

    return precision_value, recall_value, ndcg_value, mrr_value, map_value

Discuss your observations: