# Evaluation Metrics for Recommender Systems

## Error-based Metrics

### Root Mean Square Error (RMSE)
RMSE is one of the most popular metrics for evaluating the accuracy of predicted ratings in recommender systems. It emphasizes larger errors by squaring them before taking the mean, making it particularly sensitive to outliers.

For $N$ predictions, where $x_i$ is the actual rating and $\hat{x}_i$ is the predicted rating, RMSE is:

$$
RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \hat{x}_i)^2}
$$
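
The formula can be checked by hand on a tiny example. The sketch below is a minimal pure-Python version; the helper name `rmse` and the four ratings are made up for illustration and are not part of the exercise dataset.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each error, average the squares, then take the square root
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative ratings: the errors are 1, 0, 1 and -2
actual = [4, 3, 5, 2]
predicted = [3, 3, 4, 4]
print(rmse(actual, predicted))  # sqrt(6 / 4) = sqrt(1.5)
```

The single error of size 2 contributes 4 of the 6 units of squared error, which is the sensitivity to large errors described above.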


### Mean Absolute Error (MAE)
MAE measures the average magnitude of errors without considering their direction. Unlike RMSE, it weights every error linearly, so a single large error influences it less.

$$
MAE = \frac{1}{N}\sum_{i=1}^{N}|x_i - \hat{x}_i|
$$
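
As a rough illustration, MAE can be computed the same way on a handful of ratings. The helper name `mae` and the data are assumptions made for this sketch.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the absolute errors; the sign of each error is discarded
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative ratings: the absolute errors are 1, 0, 1 and 2
actual = [4, 3, 5, 2]
predicted = [3, 3, 4, 4]
print(mae(actual, predicted))  # (1 + 0 + 1 + 2) / 4 = 1.0
```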

## Ranking-based Metrics

### Precision
Precision@k measures the proportion of relevant items among the top-k recommended items:

$$
Precision@k = \frac{\text{number of relevant items @k}}{\text{total number of recommended items @k}}
$$
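
A minimal sketch of this definition, assuming the recommendations arrive as a ranked list and the relevant items as a set. The function name `precision_at_k` and the item labels are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0  # nothing was recommended
    hits = sum(1 for item in top_k if item in relevant)
    # The denominator is the number of items actually recommended in the top k
    return hits / len(top_k)


recommended = ["A", "B", "C", "D", "E"]  # ranked list, best first
relevant = {"A", "C", "F"}               # items the user actually liked
print(precision_at_k(recommended, relevant, k=3))  # A and C are hits: 2 / 3
```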

### Recall
Recall@k measures the proportion of all relevant items that appear in the top-k recommendations:

$$
Recall@k = \frac{\text{number of relevant items @k}}{\text{total number of relevant items}}
$$
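
Recall differs from precision only in its denominator, as the following sketch shows. The function name `recall_at_k` and the item labels are assumptions made for this example.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0  # no relevant items to recover
    hits = sum(1 for item in recommended[:k] if item in relevant)
    # The denominator is the total number of relevant items, not k
    return hits / len(relevant)


recommended = ["A", "B", "C", "D", "E"]  # ranked list, best first
relevant = {"A", "C", "F", "G"}          # items the user actually liked
print(recall_at_k(recommended, relevant, k=3))  # 2 of 4 relevant items: 0.5
```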

### Normalized Discounted Cumulative Gain (NDCG)
NDCG measures the quality of a ranking by considering both the relevance and the position of each recommendation. Each item's gain is discounted logarithmically by its position, so highly relevant items that appear lower in the list contribute less to the score.

$$
DCG@k = \sum_{i=1}^k \frac{2^{rel_i} - 1}{\log_2(i + 1)}
$$

$$
NDCG@k = \frac{DCG@k}{IDCG@k}
$$

where $rel_i$ is the relevance score of the item at position $i$, and $IDCG@k$ is the $DCG@k$ of the ideal ranking (the items sorted by decreasing relevance).
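
The two formulas can be sketched in a few lines. This version assumes the ideal ranking is obtained by re-sorting the relevance scores of the recommended list itself; a full evaluation might instead build the ideal ranking from all known relevant items. The function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """DCG@k with the exponential gain (2^rel - 1) and a log2 position discount."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """NDCG@k; the ideal ranking is assumed to be the same scores sorted descending."""
    idcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if idcg == 0:
        return 0.0  # no relevant items at all
    return dcg_at_k(relevances, k) / idcg


# Relevance scores in the order the items were recommended
print(ndcg_at_k([3, 2, 1], k=3))  # already ideally ordered: 1.0
print(ndcg_at_k([1, 2, 3], k=3))  # best item ranked last: below 1.0
```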

## Coverage and Diversity Metrics

### Catalog Coverage
Measures the percentage of items that the system is able to recommend:

$$
Coverage = \frac{\text{number of items that can be recommended}}{\text{total number of items}} \times 100\%
$$
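
One way to estimate this in practice is to count the distinct items that appear in at least one user's recommendation list. That approximation, the function name `catalog_coverage`, and the sample data are assumptions of this sketch.

```python
def catalog_coverage(recommendation_lists, catalog):
    """Percentage of catalog items that appear in at least one recommendation list."""
    catalog = set(catalog)
    if not catalog:
        raise ValueError("catalog must not be empty")
    # Union of all recommended items, restricted to items in the catalog
    recommended = set().union(*recommendation_lists) & catalog
    return 100.0 * len(recommended) / len(catalog)


recommendation_lists = [["A", "B"], ["B", "C"]]  # one list per user
catalog = ["A", "B", "C", "D", "E"]
print(catalog_coverage(recommendation_lists, catalog))  # 3 of 5 items: 60.0
```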

### User Coverage
Measures the percentage of users for whom the system can make recommendations:

$$
\text{User Coverage} = \frac{\text{number of users who receive recommendations}}{\text{total number of users}} \times 100\%
$$
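
A minimal sketch, assuming recommendations are stored as a dictionary from user ID to a list of items, and that a user with an empty list counts as not covered. The function name `user_coverage` and the sample data are hypothetical.

```python
def user_coverage(recommendations_by_user, all_users):
    """Percentage of users who receive at least one recommendation."""
    all_users = set(all_users)
    if not all_users:
        raise ValueError("all_users must not be empty")
    # A user is covered when their recommendation list is non-empty
    covered = {
        user
        for user, recs in recommendations_by_user.items()
        if recs and user in all_users
    }
    return 100.0 * len(covered) / len(all_users)


recommendations_by_user = {"u1": ["A"], "u2": [], "u3": ["B", "C"]}
all_users = ["u1", "u2", "u3", "u4"]
print(user_coverage(recommendations_by_user, all_users))  # u1 and u3: 50.0
```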

In [None]:
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.metrics.pairwise import _______  # Exercise: Import cosine_similarity
from sklearn.decomposition import _______ # Exercise: Import NMF

In [None]:
ratings_df = pd.read_csv('data-1m/ratings.csv', 
                         sep='\t',
                         encoding='latin-1',
                         engine='python',
                         index_col=0)

In [None]:
# Take a random sample
sample_size = 100000
ratings_sample = ratings_df.sample(n=sample_size, random_state=42)

# Split the data 
train_data, test_data = train_test_split(
    ratings_sample, 
    test_size=____, # Exercise: Fill in test size
    random_state=42
)

# Create rating matrix for training
rating_matrix = train_data.pivot(
    index='_______', # Exercise: Fill in index
    columns='_______', # Exercise: Fill in columns
    values='_______' # Exercise: Fill in values
)

# Fill NaN with mean rating for each movie
movie_means = rating_matrix._______() # Exercise: Compute mean
rating_matrix = rating_matrix.fillna(_______) # Exercise: Fill NaN with movie_means

In [None]:
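def predict_rating_useruser(user_id, movie_id, rating_matrix, user_sim, n_neighbors=5):
    """Predict a rating as the similarity-weighted average of the ratings given
    to movie_id by the n_neighbors users most similar to user_id.
    Returns None when no prediction can be made."""
    if user_id not in rating_matrix.index or movie_id not in rating_matrix.columns:
        return None

    # Row of the similarity matrix for the target user
    user_idx = rating_matrix.index.get_loc(user_id)
    sim_scores = user_sim[user_idx]

    # Users with a rating for this movie. Note: once NaNs have been filled with
    # movie means, every entry is > 0, so this mask selects all users,
    # including user_id itself.
    movie_ratings = rating_matrix[movie_id]
    rated_mask = movie_ratings > 0

    if not rated_mask.any():
        return None

    sim_users = sim_scores[rated_mask]
    ratings = movie_ratings[rated_mask]

    # Positions of the n_neighbors highest similarities (argsort is ascending)
    top_indices = np.argsort(sim_users)[-n_neighbors:]
    weights = sim_users[top_indices]

    # np.average raises ZeroDivisionError when the weights sum to zero
    if weights.sum() == 0:
        return None

    pred = np.average(ratings.iloc[top_indices], weights=weights)
    return pred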

In [None]:
# User-User CF
user_sim = _______(_______) # Exercise: Calculate cosine similarity

# NMF
nmf = NMF(
    n_components=15, # Exercise: Experiment with different values
    init='nndsvd',
    solver='cd',
    random_state=42
)

# Fit NMF model
W = nmf._______(_______) # Exercise: Fit the model
H = nmf._______ # Exercise: Get components_


In [None]:
print("Making predictions...")
uu_predictions = []
nmf_predictions = []
actuals = []

for _, row in test_data.iterrows():
    user_id = row['user_id']
    movie_id = row['movie_id']
    actual_rating = row['rating']
    
    # Skip if user or movie not in training data
    if user_id not in rating_matrix.index or movie_id not in rating_matrix.columns:
        continue
        
    # User-User CF prediction
    uu_pred = predict_rating_useruser(user_id, movie_id, rating_matrix, user_sim)
    
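    # NMF prediction: computed only when User-User CF also produced one,
    # so both models are scored on the same test pairs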
    if uu_pred is not None:
        user_idx = rating_matrix.index.get_loc(user_id)
        movie_idx = rating_matrix.columns.get_loc(movie_id)
        
        # Compute NMF prediction
        nmf_pred = W[user_idx].dot(H[:, movie_idx])
        
        # Clip to valid rating range
        nmf_pred = np.clip(nmf_pred, 1, 5)
        
        uu_predictions.append(uu_pred)
        nmf_predictions.append(nmf_pred)
        actuals.append(actual_rating)

In [None]:
# Calculate RMSE
uu_rmse = np.sqrt(_______(_______, _______)) # Exercise: Calculate RMSE
nmf_rmse = np.sqrt(_______(_______, _______))  # Exercise: Calculate RMSE