# Model-Based recommender system
## Plan
* Dataset of WineID, UserID, Rating
* Train using MF (SVD)
* Evaluate

### Singular Value Decomposition (SVD) algorithm
#### https://surprise.readthedocs.io/en/stable/matrix_factorization.html
SVD is a matrix factorization algorithm (equivalent to Probabilistic Matrix Factorization when biases are not used). It minimizes the regularized squared error with straightforward stochastic gradient descent.
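
To make the description concrete, here is a minimal NumPy sketch of one SGD pass for biased matrix factorization, following the update rules given in the Surprise documentation linked above. The names `sgd_epoch` and `rmse`, the toy ratings, and the two-factor size are assumptions for illustration; this is not Surprise's actual implementation. The defaults `lr=0.005` and `reg=0.02` mirror Surprise's documented `lr_all` and `reg_all`.

```python
import numpy as np

def sgd_epoch(ratings, mu, bu, bi, P, Q, lr=0.005, reg=0.02):
    """Run one SGD pass over (user, item, rating) triples, updating in place."""
    for u, i, r in ratings:
        # Prediction: global mean + user bias + item bias + factor dot product
        err = r - (mu + bu[u] + bi[i] + Q[i] @ P[u])
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        p_old = P[u].copy()  # keep pre-update user factors for the item step
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * p_old - reg * Q[i])

def rmse(ratings, mu, bu, bi, P, Q):
    """Training RMSE of the current parameters."""
    errs = [r - (mu + bu[u] + bi[i] + Q[i] @ P[u]) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Hypothetical toy data: (user index, item index, rating)
toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 2.0), (2, 1, 4.0), (2, 2, 1.0)]
rng = np.random.default_rng(0)
mu = float(np.mean([r for _, _, r in toy]))
bu, bi = np.zeros(3), np.zeros(3)
P, Q = rng.normal(0, 0.1, (3, 2)), rng.normal(0, 0.1, (3, 2))

rmse_before = rmse(toy, mu, bu, bi, P, Q)
for _ in range(200):
    sgd_epoch(toy, mu, bu, bi, P, Q)
rmse_after = rmse(toy, mu, bu, bi, P, Q)
print(rmse_before, rmse_after)
```

On this toy data the training RMSE drops as the biases and factors adapt; on real data the number of epochs and factors would need tuning.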

Pros:
- Has high accuracy
- Scales well on large data
- Dimensionality reduction

Cons:
- Doesn't handle the cold-start problem without additional modifications (such as hybrid models, or pre-training clustering that groups similar users and items)
- Hard to interpret due to latent factors
- SVD doesn't use any metadata

Why use SVD?
1. Large dataset (train set of ~16.9 million rows)
2. No user metadata
3. Handles only the warm-warm case; it is not designed for any of the cold-start scenarios.

However, the RMSE scores still look good here, because in cold-start cases the Surprise SVD model falls back to the global mean (plus the user or item bias when one side is known) for unseen users/items.
Cold start could be addressed with a hybrid model (CF + content-based).
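
The fallback can be sketched in a few lines. The Surprise documentation notes that for an unknown user the bias and factors are assumed to be zero, and likewise for an unknown item. The function `predict_with_fallback`, the dict-based parameters, and all numeric values below are hypothetical and only illustrate that rule; the clipping to the rating scale is assumed to mirror the default behaviour of Surprise's `predict`.

```python
def predict_with_fallback(u, i, mu, bu, bi, P, Q, low=1.0, high=5.0):
    """Biased-MF prediction where unknown users/items contribute nothing."""
    est = mu
    if u in bu:
        est += bu[u]          # user bias only if the user was seen in training
    if i in bi:
        est += bi[i]          # item bias only if the item was seen in training
    if u in P and i in Q:
        est += sum(p * q for p, q in zip(P[u], Q[i]))  # latent-factor term
    # Clip to the rating scale (assumption: same range as Reader(rating_scale=(1, 5)))
    return min(high, max(low, est))

# Hypothetical learned parameters for one user and one wine
mu = 3.8
bu = {"u1": 0.2}
bi = {"w1": -0.3}
P = {"u1": [0.5, -0.1]}
Q = {"w1": [0.4, 0.2]}

warm_warm = predict_with_fallback("u1", "w1", mu, bu, bi, P, Q)
cold_user = predict_with_fallback("new_user", "w1", mu, bu, bi, P, Q)
cold_item = predict_with_fallback("u1", "new_wine", mu, bu, bi, P, Q)
cold_cold = predict_with_fallback("new_user", "new_wine", mu, bu, bi, P, Q)
print(warm_warm, cold_user, cold_item, cold_cold)
```

In the cold-cold case every prediction collapses to the same value, which is why an error metric alone says little about ranking quality there.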


In [1]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.metrics import mean_squared_error, mean_absolute_error, root_mean_squared_error
from surprise import SVD, Dataset, Reader



In [None]:
# Data loading
# Forward slashes work on both Windows and POSIX systems
base_path = '../../Evaluation'

train = pd.read_csv(f'{base_path}/trainset.csv', usecols=['UserID', 'WineID', 'Rating'])
test_uwarm_iwarm = pd.read_csv(f'{base_path}/testset_warm_user_warm_item.csv', usecols=['RatingID', 'UserID', 'WineID', 'Rating'])
test_uwarm_icold = pd.read_csv(f'{base_path}/testset_warm_user_cold_item.csv', usecols=['RatingID', 'UserID', 'WineID', 'Rating'])
test_ucold_iwarm = pd.read_csv(f'{base_path}/testset_cold_user_warm_item.csv', usecols=['RatingID', 'UserID', 'WineID', 'Rating'])
test_ucold_icold = pd.read_csv(f'{base_path}/testset_cold_user_cold_item.csv', usecols=['RatingID', 'UserID', 'WineID', 'Rating'])


In [12]:
print('Train set shape:', train.shape)
print('Test set shape (warm user, warm item):', test_uwarm_iwarm.shape)
print('Test set shape (warm user, cold item):', test_uwarm_icold.shape)
print('Test set shape (cold user, warm item):', test_ucold_iwarm.shape)
print('Test set shape (cold user, cold item):', test_ucold_icold.shape)

Train set shape: (16917894, 3)
Test set shape (warm user, warm item): (2036778, 4)
Test set shape (warm user, cold item): (35456, 4)
Test set shape (cold user, warm item): (506800, 4)
Test set shape (cold user, cold item): (16504, 4)
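
A quick sanity check for the split names is sketched below. It assumes that "warm" means the user or wine also appears in the train set, which the file names suggest but the notebook does not define; `warm_fractions` and the toy frames are hypothetical.

```python
import pandas as pd

def warm_fractions(train, test):
    """Return the share of test rows whose user / wine also appears in train."""
    return {
        "warm_users": float(test["UserID"].isin(train["UserID"]).mean()),
        "warm_items": float(test["WineID"].isin(train["WineID"]).mean()),
    }

# Hypothetical toy frames with the same columns as the real data
toy_train = pd.DataFrame({"UserID": [1, 1, 2], "WineID": [10, 11, 10], "Rating": [4.0, 3.5, 5.0]})
toy_warm_cold = pd.DataFrame({"UserID": [1, 2], "WineID": [99, 98], "Rating": [4.0, 2.5]})

fractions = warm_fractions(toy_train, toy_warm_cold)
print(fractions)
```

Under that assumption, a warm-user/cold-item split should give a `warm_users` share of 1.0 and a `warm_items` share of 0.0.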


## SVD using Surprise library

In [None]:
# Here we use only raw data since Surprise handles the encoding internally

# For train: convert to Surprise format
reader = Reader(rating_scale=(1, 5))
train_data = Dataset.load_from_df(train[['UserID', 'WineID', 'Rating']], reader)
train_surprise = train_data.build_full_trainset()

# For test: convert to Surprise format
test_uwarm_iwarm_surprise = list(test_uwarm_iwarm[['UserID', 'WineID', 'Rating']].itertuples(index=False, name=None))
test_uwarm_icold_surprise = list(test_uwarm_icold[['UserID', 'WineID', 'Rating']].itertuples(index=False, name=None))
test_ucold_iwarm_surprise = list(test_ucold_iwarm[['UserID', 'WineID', 'Rating']].itertuples(index=False, name=None))
test_ucold_icold_surprise = list(test_ucold_icold[['UserID', 'WineID', 'Rating']].itertuples(index=False, name=None))

In [None]:
# Train SVD
model = SVD()
model.fit(train_surprise)

# Predict ratings for test sets
pred_uwarm_iwarm = model.test(test_uwarm_iwarm_surprise)
pred_uwarm_icold = model.test(test_uwarm_icold_surprise)
pred_ucold_iwarm = model.test(test_ucold_iwarm_surprise)
pred_ucold_icold = model.test(test_ucold_icold_surprise)

# Convert predicted ratings to list
pred_uwarm_iwarm = [pred.est for pred in pred_uwarm_iwarm]
pred_uwarm_icold = [pred.est for pred in pred_uwarm_icold]
pred_ucold_iwarm = [pred.est for pred in pred_ucold_iwarm]
pred_ucold_icold = [pred.est for pred in pred_ucold_icold]

# Save predictions to dataframe 
test_uwarm_iwarm['Prediction'] = pred_uwarm_iwarm
test_uwarm_icold['Prediction'] = pred_uwarm_icold
test_ucold_iwarm['Prediction'] = pred_ucold_iwarm
test_ucold_icold['Prediction'] = pred_ucold_icold

# Write predictions to CSV [RatingID, Prediction]
test_uwarm_iwarm.to_csv(f'{base_path}/svd/svd_warm_user_warm_item.csv', index=False, columns=['RatingID', 'Prediction'], header=['RatingID', 'Rating'])
test_uwarm_icold.to_csv(f'{base_path}/svd/svd_warm_user_cold_item.csv', index=False, columns=['RatingID', 'Prediction'], header=['RatingID', 'Rating'])
test_ucold_iwarm.to_csv(f'{base_path}/svd/svd_cold_user_warm_item.csv', index=False, columns=['RatingID', 'Prediction'], header=['RatingID', 'Rating'])
test_ucold_icold.to_csv(f'{base_path}/svd/svd_cold_user_cold_item.csv', index=False, columns=['RatingID', 'Prediction'], header=['RatingID', 'Rating'])

# Evaluate MSE
mse_uwarm_iwarm = mean_squared_error(test_uwarm_iwarm['Rating'], pred_uwarm_iwarm)
mse_uwarm_icold = mean_squared_error(test_uwarm_icold['Rating'], pred_uwarm_icold)
mse_ucold_iwarm = mean_squared_error(test_ucold_iwarm['Rating'], pred_ucold_iwarm)
mse_ucold_icold = mean_squared_error(test_ucold_icold['Rating'], pred_ucold_icold)

# Evaluate RMSE
rmse_uwarm_iwarm = root_mean_squared_error(test_uwarm_iwarm['Rating'], pred_uwarm_iwarm)
rmse_uwarm_icold = root_mean_squared_error(test_uwarm_icold['Rating'], pred_uwarm_icold)
rmse_ucold_iwarm = root_mean_squared_error(test_ucold_iwarm['Rating'], pred_ucold_iwarm)
rmse_ucold_icold = root_mean_squared_error(test_ucold_icold['Rating'], pred_ucold_icold)


# Evaluate MAE
mae_uwarm_iwarm = mean_absolute_error(test_uwarm_iwarm['Rating'], pred_uwarm_iwarm)
mae_uwarm_icold = mean_absolute_error(test_uwarm_icold['Rating'], pred_uwarm_icold)
mae_ucold_iwarm = mean_absolute_error(test_ucold_iwarm['Rating'], pred_ucold_iwarm)
mae_ucold_icold = mean_absolute_error(test_ucold_icold['Rating'], pred_ucold_icold)


# Print results
print('SVD Results:')
print('MSE (warm user, warm item):', mse_uwarm_iwarm)
print('RMSE (warm user, warm item):', rmse_uwarm_iwarm)
print('MAE (warm user, warm item):', mae_uwarm_iwarm)
print('-' * 50)
print('MSE (warm user, cold item):', mse_uwarm_icold)
print('RMSE (warm user, cold item):', rmse_uwarm_icold)
print('MAE (warm user, cold item):', mae_uwarm_icold)
print('-' * 50)
print('MSE (cold user, warm item):', mse_ucold_iwarm)
print('RMSE (cold user, warm item):', rmse_ucold_iwarm)
print('MAE (cold user, warm item):', mae_ucold_iwarm)
print('-' * 50)
print('MSE (cold user, cold item):', mse_ucold_icold)
print('RMSE (cold user, cold item):', rmse_ucold_icold)
print('MAE (cold user, cold item):', mae_ucold_icold)


SVD Results:
MSE (warm user, warm item): 0.3198472927282139
RMSE (warm user, warm item): 0.5655504334082098
MAE (warm user, warm item): 0.412036745658953
--------------------------------------------------
MSE (warm user, cold item): 0.41533766974870867
RMSE (warm user, cold item): 0.644466965599253
MAE (warm user, cold item): 0.47336789039746074
--------------------------------------------------
MSE (cold user, warm item): 0.44334311057386555
RMSE (cold user, warm item): 0.6658401539212437
MAE (cold user, warm item): 0.49323154327054913
--------------------------------------------------
MSE (cold user, cold item): 0.5983717312186697
RMSE (cold user, cold item): 0.7735449122182045
MAE (cold user, cold item): 0.574148821209833


## Evaluate top-k

MSE/RMSE/MAE are not very descriptive in cold-start cases, so the predictions are also evaluated with top-k ranking metrics.
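
One way to see why error metrics fall short here: a model that predicts the same value for every row still gets a finite RMSE (the standard deviation of the ratings around that value) while giving no basis for ordering items. The ratings below are hypothetical.

```python
import numpy as np

true_ratings = np.array([5.0, 4.5, 4.0, 3.0, 2.0, 1.0])  # hypothetical ratings
constant_pred = np.full_like(true_ratings, true_ratings.mean())

# RMSE of the constant predictor equals the standard deviation of the ratings
const_rmse = float(np.sqrt(np.mean((true_ratings - constant_pred) ** 2)))

# Only one distinct score, so the predictions carry no ranking information
n_distinct_scores = len(np.unique(constant_pred))
print(const_rmse, n_distinct_scores)
```

With a single distinct score, any per-user ranking is decided by tie-breaking rather than by the model, so ranking metrics for such rows should be read with care.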

In [4]:
# Load predictions (forward slashes match the paths used when writing the files)
pred_uwarm_iwarm = pd.read_csv(f'{base_path}/svd/svd_warm_user_warm_item.csv')
pred_uwarm_icold = pd.read_csv(f'{base_path}/svd/svd_warm_user_cold_item.csv')
pred_ucold_iwarm = pd.read_csv(f'{base_path}/svd/svd_cold_user_warm_item.csv')
pred_ucold_icold = pd.read_csv(f'{base_path}/svd/svd_cold_user_cold_item.csv')

# Merge predictions with test sets; the predicted rating becomes 'Rating_pred'
pred_uwarm_iwarm = pd.merge(test_uwarm_iwarm, pred_uwarm_iwarm, on='RatingID', how='inner', suffixes=('', '_pred'))
pred_uwarm_icold = pd.merge(test_uwarm_icold, pred_uwarm_icold, on='RatingID', how='inner', suffixes=('', '_pred'))
pred_ucold_iwarm = pd.merge(test_ucold_iwarm, pred_ucold_iwarm, on='RatingID', how='inner', suffixes=('', '_pred'))
pred_ucold_icold = pd.merge(test_ucold_icold, pred_ucold_icold, on='RatingID', how='inner', suffixes=('', '_pred'))


In [5]:
# Create Rank and Rank_pred columns

# Warm user, warm item
pred_uwarm_iwarm["Rank"] = pred_uwarm_iwarm.groupby("UserID")["Rating"].rank(method="first", ascending=False)
pred_uwarm_iwarm["Rank_pred"] = pred_uwarm_iwarm.groupby("UserID")["Rating_pred"].rank(method="first", ascending=False)
# Warm user, cold item
pred_uwarm_icold["Rank"] = pred_uwarm_icold.groupby("UserID")["Rating"].rank(method="first", ascending=False)
pred_uwarm_icold["Rank_pred"] = pred_uwarm_icold.groupby("UserID")["Rating_pred"].rank(method="first", ascending=False)
# Cold user, warm item
pred_ucold_iwarm["Rank"] = pred_ucold_iwarm.groupby("UserID")["Rating"].rank(method="first", ascending=False)
pred_ucold_iwarm["Rank_pred"] = pred_ucold_iwarm.groupby("UserID")["Rating_pred"].rank(method="first", ascending=False)
# Cold user, cold item
pred_ucold_icold["Rank"] = pred_ucold_icold.groupby("UserID")["Rating"].rank(method="first", ascending=False)
pred_ucold_icold["Rank_pred"] = pred_ucold_icold.groupby("UserID")["Rating_pred"].rank(method="first", ascending=False)

# Calculate Relevance: 1 if the true rating is at least 3.5, else 0
pred_uwarm_iwarm["Relevance"] = (pred_uwarm_iwarm["Rating"] >= 3.5).astype(int)
pred_uwarm_icold["Relevance"] = (pred_uwarm_icold["Rating"] >= 3.5).astype(int)
pred_ucold_iwarm["Relevance"] = (pred_ucold_iwarm["Rating"] >= 3.5).astype(int)
pred_ucold_icold["Relevance"] = (pred_ucold_icold["Rating"] >= 3.5).astype(int)



In [11]:

def evaluate_topk_fast(df, k=10):
    # Sort so each user's highest-ranked predictions come first
    df = df.sort_values(['UserID', 'Rank_pred'], ascending=[True, True])

    # Keep only the top-k predicted rows per user
    df['row_number'] = df.groupby('UserID').cumcount()
    topk_df = df[df['row_number'] < k].copy()

    # Precision@k
    precision = topk_df['Relevance'].groupby(topk_df['UserID']).mean().mean()

    # Recall@k
    relevant_per_user = df.groupby('UserID')['Relevance'].sum()
    hits_per_user = topk_df.groupby('UserID')['Relevance'].sum()
    recall = (hits_per_user / relevant_per_user).fillna(0).mean()

    # HitRate@k
    hits = (hits_per_user > 0).astype(int)
    hit_rate = hits.mean()

    # MAP@k: mean of precision at each hit position, normalized by the
    # number of hits inside the top-k (not by the user's total relevant items)
    def map_at_k_per_user(x):
        rels = x['Relevance'].values
        precisions = [(rels[:i + 1].sum() / (i + 1)) for i in range(len(rels)) if rels[i]]
        return np.mean(precisions) if precisions else 0
    mapk = topk_df.groupby('UserID').apply(map_at_k_per_user, include_groups=False).mean()

    # nDCG@k: note that the ideal ordering is built from the top-k rows only,
    # so IDCG ignores relevant items that were ranked below k
    def dcg(rels):
        # Discounted cumulative gain with a log2(position + 1) discount
        return np.sum(rels / np.log2(np.arange(2, len(rels) + 2)))
    def ndcg_per_user(x):
        dcg_val = dcg(x['Relevance'].values)
        ideal = x.sort_values('Relevance', ascending=False).head(k)
        idcg_val = dcg(ideal['Relevance'].values)
        return dcg_val / idcg_val if idcg_val > 0 else 0
    ndcg = topk_df.groupby('UserID').apply(ndcg_per_user, include_groups=False).mean()

    return {
        f'Precision@{k}': precision,
        f'Recall@{k}': recall,
        f'HitRate@{k}': hit_rate,
        f'MAP@{k}': mapk,
        f'nDCG@{k}': ndcg
    }
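
To make the metric definitions easier to check by hand, here is a small worked example for a single hypothetical user in pure Python. It follows the same conventions as `evaluate_topk_fast` (average precision normalized by the hits inside the top-k) and additionally compares two ways of building the ideal DCG: from the retrieved top-k rows only, as the function does, and from the user's full list. The relevance flags are made up.

```python
import math

def dcg(rels):
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))

# Relevance flags for one hypothetical user, ordered by predicted rank
ranked_rels = [1, 0, 1, 1, 0]
k = 3
top = ranked_rels[:k]

precision_at_k = sum(top) / len(top)
recall_at_k = sum(top) / sum(ranked_rels)

# Precision at each hit position inside the top-k, then averaged
hit_precisions = [sum(top[: i + 1]) / (i + 1) for i, rel in enumerate(top) if rel]
ap_at_k = sum(hit_precisions) / len(hit_precisions) if hit_precisions else 0.0

# Ideal DCG from the retrieved top-k only vs. from the user's full list
ndcg_topk_ideal = dcg(top) / dcg(sorted(top, reverse=True))
ndcg_full_ideal = dcg(top) / dcg(sorted(ranked_rels, reverse=True)[:k])

print(precision_at_k, recall_at_k, ap_at_k, ndcg_topk_ideal, ndcg_full_ideal)
```

Here the top-k-only ideal gives an nDCG of about 0.92, while the full-list ideal gives about 0.70, because the third relevant item sits just outside the top 3. Which convention is appropriate depends on what the metric is meant to reward.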


In [12]:
# Run evaluation
ks = [10, 20, 50, 100]
# Evaluate for different k values
for k in ks:
    results = {}
    results['warm user, warm item'] = evaluate_topk_fast(pred_uwarm_iwarm, k=k)
    results['warm user, cold item'] = evaluate_topk_fast(pred_uwarm_icold, k=k)
    results['cold user, warm item'] = evaluate_topk_fast(pred_ucold_iwarm, k=k)
    results['cold user, cold item'] = evaluate_topk_fast(pred_ucold_icold, k=k)

    # Print evaluation results
    for case, metrics in results.items():
        print(f"Evaluation on {case} at top {k}:")
        for metric, value in metrics.items():
            print(f"{metric}: {value:.4f}")
        print('-' * 50)


Evaluation on warm user, warm item at top 10:
Precision@10: 0.8819
Recall@10: 0.9238
HitRate@10: 0.9526
MAP@10: 0.9261
nDCG@10: 0.9362
--------------------------------------------------
Evaluation on warm user, cold item at top 10:
Precision@10: 0.8389
Recall@10: 0.8605
HitRate@10: 0.8605
MAP@10: 0.8479
nDCG@10: 0.8515
--------------------------------------------------
Evaluation on cold user, warm item at top 10:
Precision@10: 0.8192
Recall@10: 0.9233
HitRate@10: 0.9929
MAP@10: 0.9164
nDCG@10: 0.9489
--------------------------------------------------
Evaluation on cold user, cold item at top 10:
Precision@10: 0.7089
Recall@10: 0.7210
HitRate@10: 0.7210
MAP@10: 0.7139
nDCG@10: 0.7159
--------------------------------------------------
Evaluation on warm user, warm item at top 20:
Precision@20: 0.8795
Recall@20: 0.9437
HitRate@20: 0.9526
MAP@20: 0.9253
nDCG@20: 0.9360
--------------------------------------------------
Evaluation on warm user, cold item at top 20:
Precision@20: 0.8389
Rec