# Task 2: Movie Recommender System
## Machine Learning with Matrix Data for Recommender Systems

**Author:** HazelTChikara  
**Date:** November 23, 2025

This notebook implements and evaluates three recommender system algorithms:
1. Probabilistic Matrix Factorization (PMF)
2. User-based Collaborative Filtering
3. Item-based Collaborative Filtering
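
As a rough illustration of the first algorithm, the sketch below fits a bias-free matrix factorization by stochastic gradient descent, which is the model PMF describes (a rating is approximated by the dot product of a user vector and an item vector). The function names `fit_pmf` and `train_rmse`, the hyperparameters, and the toy ratings are assumptions made for this example only; the notebook itself relies on Surprise rather than this code.

```python
import numpy as np

def fit_pmf(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
            n_epochs=500, seed=0):
    """Fit bias-free latent factors by SGD on regularized squared error."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0.0, 0.1, (n_users, n_factors))  # user factors
    Q = rng.normal(0.0, 0.1, (n_items, n_factors))  # item factors
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error for this rating
            p_old = P[u].copy()            # keep old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def train_rmse(ratings, P, Q):
    """RMSE of the factor model on the given (user, item, rating) triples."""
    errors = [r - P[u] @ Q[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))

# Hypothetical toy data: (user index, item index, rating)
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

P0, Q0 = fit_pmf(toy_ratings, 3, 3, n_epochs=0)  # untrained factors
P, Q = fit_pmf(toy_ratings, 3, 3)                # trained factors
print(f"RMSE before training: {train_rmse(toy_ratings, P0, Q0):.3f}")
print(f"RMSE after training:  {train_rmse(toy_ratings, P, Q):.3f}")
```

The two collaborative filtering methods differ only in whether similarities are computed between users or between items.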

## Setup and Imports

In [None]:
# Install required libraries (uncomment if needed)
# !pip install pandas numpy matplotlib seaborn scikit-surprise scipy

### Important: Installing scikit-surprise

**Python 3.13 Compatibility Issue:**
The `scikit-surprise` library currently has compatibility issues with Python 3.13.

**Recommended Solutions:**

1. **Use Conda (Recommended):**
   ```bash
   conda install -c conda-forge scikit-surprise
   ```

2. **Use Python 3.9-3.12:**
   - Create a virtual environment with Python 3.9, 3.10, 3.11, or 3.12
   - Then install: `pip install scikit-surprise`

3. **Try terminal installation:**
   ```bash
   pip install scikit-surprise
   ```

4. **Alternative for Python 3.13:**
   ```bash
   # Install from git (development version)
   pip install git+https://github.com/NicolasHug/Surprise.git
   ```

**Check your Python version first:**
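
A minimal sketch of such a check, assuming the 3.9 to 3.12 range suggested above is the one to test against; the helper name `surprise_supported` is made up for this example.

```python
import sys

def surprise_supported(version_info=sys.version_info):
    """Return True if the interpreter falls in the 3.9-3.12 range suggested above."""
    return (3, 9) <= tuple(version_info[:2]) <= (3, 12)

print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
if not surprise_supported():
    print("Consider conda or a Python 3.9-3.12 environment before installing scikit-surprise.")
```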

In [None]:
%pip install -q scikit-surprise

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from surprise import Dataset, Reader, SVD, KNNBasic
from surprise.model_selection import cross_validate
from collections import defaultdict
import warnings
warnings.filterwarnings('ignore')

# Set plotting style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("Libraries imported successfully!")

## Load and Explore Data

In [None]:
# Load the ratings data
print("Loading data from: ratings_small.csv")

df = pd.read_csv('ratings_small.csv')
print(f"Dataset shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")
print("\nFirst few rows:")
df.head()

In [None]:
# Dataset statistics
print("Dataset Statistics:")
print(f"Number of users: {df['userId'].nunique()}")
print(f"Number of movies: {df['movieId'].nunique()}")
print(f"Number of ratings: {len(df)}")
print(f"Rating range: [{df['rating'].min()}, {df['rating'].max()}]")
print(f"Average rating: {df['rating'].mean():.2f}")
print("\nRating distribution:")
df['rating'].value_counts().sort_index()

In [None]:
# Visualize rating distribution
plt.figure(figsize=(10, 5))
df['rating'].hist(bins=10, edgecolor='black', alpha=0.7)
plt.xlabel('Rating', fontsize=12)
plt.ylabel('Frequency', fontsize=12)
plt.title('Distribution of Movie Ratings', fontsize=14, fontweight='bold')
plt.grid(alpha=0.3)
plt.show()

In [None]:
# Create Surprise dataset
reader = Reader(rating_scale=(0.5, 5.0))
data = Dataset.load_from_df(df[['userId', 'movieId', 'rating']], reader)
print("Surprise dataset created successfully!")

## Task 2c: 5-Fold Cross-Validation Comparison

Compute average MAE and RMSE for:
1. Probabilistic Matrix Factorization (PMF)
2. User-based Collaborative Filtering
3. Item-based Collaborative Filtering
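
For reference, the two error measures can be written out directly. The sketch below computes them with NumPy on a handful of made-up ratings; the helper names `mae` and `rmse` and the sample values are assumptions for illustration, while the notebook takes both measures from Surprise's `cross_validate`.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical true and predicted ratings
true_ratings = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print(f"MAE:  {mae(true_ratings, predicted):.3f}")   # 0.625
print(f"RMSE: {rmse(true_ratings, predicted):.3f}")  # 0.750
```

RMSE is never smaller than MAE on the same predictions, and the gap widens when a few errors are much larger than the rest.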

In [None]:
results = {}

print("="*80)
print("TASK 2C: 5-Fold Cross-Validation Comparison")
print("="*80)

In [None]:
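# 1. Probabilistic Matrix Factorization (PMF)
# Surprise's SVD reduces to PMF only when the bias terms are disabled (biased=False);
# the default SVD() is the biased variant.
print("\n1. Probabilistic Matrix Factorization (SVD, biased=False)...")
pmf = SVD(biased=False)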
pmf_results = cross_validate(pmf, data, measures=['MAE', 'RMSE'], cv=5, verbose=True)
results['PMF'] = {
    'MAE': pmf_results['test_mae'].mean(),
    'RMSE': pmf_results['test_rmse'].mean(),
    'MAE_std': pmf_results['test_mae'].std(),
    'RMSE_std': pmf_results['test_rmse'].std()
}
print(f"MAE: {results['PMF']['MAE']:.4f} ± {results['PMF']['MAE_std']:.4f}")
print(f"RMSE: {results['PMF']['RMSE']:.4f} ± {results['PMF']['RMSE_std']:.4f}")

In [None]:
# 2. User-based Collaborative Filtering
print("\n2. User-based Collaborative Filtering...")
user_cf = KNNBasic(sim_options={'name': 'cosine', 'user_based': True})
user_results = cross_validate(user_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=True)
results['User-based CF'] = {
    'MAE': user_results['test_mae'].mean(),
    'RMSE': user_results['test_rmse'].mean(),
    'MAE_std': user_results['test_mae'].std(),
    'RMSE_std': user_results['test_rmse'].std()
}
print(f"MAE: {results['User-based CF']['MAE']:.4f} ± {results['User-based CF']['MAE_std']:.4f}")
print(f"RMSE: {results['User-based CF']['RMSE']:.4f} ± {results['User-based CF']['RMSE_std']:.4f}")

In [None]:
# 3. Item-based Collaborative Filtering
print("\n3. Item-based Collaborative Filtering...")
item_cf = KNNBasic(sim_options={'name': 'cosine', 'user_based': False})
item_results = cross_validate(item_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=True)
results['Item-based CF'] = {
    'MAE': item_results['test_mae'].mean(),
    'RMSE': item_results['test_rmse'].mean(),
    'MAE_std': item_results['test_mae'].std(),
    'RMSE_std': item_results['test_rmse'].std()
}
print(f"MAE: {results['Item-based CF']['MAE']:.4f} ± {results['Item-based CF']['MAE_std']:.4f}")
print(f"RMSE: {results['Item-based CF']['RMSE']:.4f} ± {results['Item-based CF']['RMSE_std']:.4f}")

In [None]:
# Summary table
print("\n" + "-"*80)
print("SUMMARY OF RESULTS:")
print("-"*80)
results_df = pd.DataFrame(results).T
results_df[['MAE', 'RMSE']]

## Task 2d: Model Comparison and Best Model Selection

In [None]:
# Find best model
comparison_df = pd.DataFrame(results).T
best_mae_model = comparison_df['MAE'].idxmin()
best_rmse_model = comparison_df['RMSE'].idxmin()

print(f"Best Model by MAE: {best_mae_model} (MAE = {comparison_df.loc[best_mae_model, 'MAE']:.4f})")
print(f"Best Model by RMSE: {best_rmse_model} (RMSE = {comparison_df.loc[best_rmse_model, 'RMSE']:.4f})")

In [None]:
# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

algorithms = list(results.keys())
mae_values = [results[algo]['MAE'] for algo in algorithms]
mae_std = [results[algo]['MAE_std'] for algo in algorithms]
rmse_values = [results[algo]['RMSE'] for algo in algorithms]
rmse_std = [results[algo]['RMSE_std'] for algo in algorithms]

# MAE comparison
axes[0].bar(algorithms, mae_values, yerr=mae_std, capsize=5, alpha=0.7, 
            color=['#1f77b4', '#ff7f0e', '#2ca02c'])
axes[0].set_ylabel('MAE', fontsize=12)
axes[0].set_title('Mean Absolute Error (MAE) Comparison', fontsize=14, fontweight='bold')
axes[0].set_ylim(bottom=0)
axes[0].tick_params(axis='x', rotation=15)
for i, (v, s) in enumerate(zip(mae_values, mae_std)):
    axes[0].text(i, v + s + 0.01, f'{v:.4f}', ha='center', va='bottom', fontsize=10)

# RMSE comparison
axes[1].bar(algorithms, rmse_values, yerr=rmse_std, capsize=5, alpha=0.7, 
            color=['#1f77b4', '#ff7f0e', '#2ca02c'])
axes[1].set_ylabel('RMSE', fontsize=12)
axes[1].set_title('Root Mean Squared Error (RMSE) Comparison', fontsize=14, fontweight='bold')
axes[1].set_ylim(bottom=0)
axes[1].tick_params(axis='x', rotation=15)
for i, (v, s) in enumerate(zip(rmse_values, rmse_std)):
    axes[1].text(i, v + s + 0.01, f'{v:.4f}', ha='center', va='bottom', fontsize=10)

plt.tight_layout()
plt.savefig('task_2d_model_comparison.png', dpi=300, bbox_inches='tight')
plt.show()
print("Plot saved as 'task_2d_model_comparison.png'")

## Task 2e: Impact of Similarity Metrics

Examine how cosine, MSD, and Pearson similarities impact CF performance.
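
To make the difference between the three measures concrete, the sketch below evaluates simplified versions of them on two users' ratings for the same three movies. The function names and the sample vectors are assumptions for this example; the formulas follow the usual definitions (with MSD similarity taken as `1 / (msd + 1)`), and Surprise's own implementations additionally handle details such as `min_support` that are left out here.

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine of the angle between the two raw rating vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def msd_sim(u, v):
    """Similarity derived from the mean squared difference of the ratings."""
    return float(1.0 / (np.mean((u - v) ** 2) + 1.0))

def pearson_sim(u, v):
    """Cosine similarity after subtracting each vector's mean."""
    uc, vc = u - u.mean(), v - v.mean()
    denom = np.linalg.norm(uc) * np.linalg.norm(vc)
    return float(uc @ vc / denom) if denom > 0 else 0.0

# Hypothetical co-rated movies: user_b rates like user_a, but one star lower
user_a = np.array([5.0, 3.0, 4.0])
user_b = np.array([4.0, 2.0, 3.0])

print(f"cosine:  {cosine_sim(user_a, user_b):.4f}")
print(f"msd:     {msd_sim(user_a, user_b):.4f}")
print(f"pearson: {pearson_sim(user_a, user_b):.4f}")
```

In this example Pearson ignores the constant one-star offset between the two users, MSD penalizes it, and cosine falls in between, which is one reason the metrics can rank neighbors differently.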

In [None]:
similarities = ['cosine', 'msd', 'pearson']
user_sim_results = defaultdict(dict)
item_sim_results = defaultdict(dict)

for sim in similarities:
    print(f"\n--- Testing similarity: {sim.upper()} ---")
    
    # User-based CF
    print(f"User-based CF with {sim}...")
    user_cf = KNNBasic(sim_options={'name': sim, 'user_based': True})
    user_cv = cross_validate(user_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=False)
    user_sim_results[sim]['MAE'] = user_cv['test_mae'].mean()
    user_sim_results[sim]['RMSE'] = user_cv['test_rmse'].mean()
    user_sim_results[sim]['MAE_std'] = user_cv['test_mae'].std()
    user_sim_results[sim]['RMSE_std'] = user_cv['test_rmse'].std()
    print(f"  MAE: {user_sim_results[sim]['MAE']:.4f}, RMSE: {user_sim_results[sim]['RMSE']:.4f}")
    
    # Item-based CF
    print(f"Item-based CF with {sim}...")
    item_cf = KNNBasic(sim_options={'name': sim, 'user_based': False})
    item_cv = cross_validate(item_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=False)
    item_sim_results[sim]['MAE'] = item_cv['test_mae'].mean()
    item_sim_results[sim]['RMSE'] = item_cv['test_rmse'].mean()
    item_sim_results[sim]['MAE_std'] = item_cv['test_mae'].std()
    item_sim_results[sim]['RMSE_std'] = item_cv['test_rmse'].std()
    print(f"  MAE: {item_sim_results[sim]['MAE']:.4f}, RMSE: {item_sim_results[sim]['RMSE']:.4f}")

In [None]:
# Summary tables
print("\nUser-based CF:")
user_sim_df = pd.DataFrame(user_sim_results).T
display(user_sim_df[['MAE', 'RMSE']])

print("\nItem-based CF:")
item_sim_df = pd.DataFrame(item_sim_results).T
display(item_sim_df[['MAE', 'RMSE']])

In [None]:
# Visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
x_pos = np.arange(len(similarities))
width = 0.35

user_mae = [user_sim_results[sim]['MAE'] for sim in similarities]
item_mae = [item_sim_results[sim]['MAE'] for sim in similarities]
user_mae_std = [user_sim_results[sim]['MAE_std'] for sim in similarities]
item_mae_std = [item_sim_results[sim]['MAE_std'] for sim in similarities]
user_rmse = [user_sim_results[sim]['RMSE'] for sim in similarities]
item_rmse = [item_sim_results[sim]['RMSE'] for sim in similarities]
user_rmse_std = [user_sim_results[sim]['RMSE_std'] for sim in similarities]
item_rmse_std = [item_sim_results[sim]['RMSE_std'] for sim in similarities]

# MAE comparison
axes[0, 0].bar(x_pos - width/2, user_mae, width, yerr=user_mae_std, label='User-based CF', 
               capsize=5, alpha=0.8, color='#1f77b4')
axes[0, 0].bar(x_pos + width/2, item_mae, width, yerr=item_mae_std, label='Item-based CF', 
               capsize=5, alpha=0.8, color='#ff7f0e')
axes[0, 0].set_ylabel('MAE', fontsize=12)
axes[0, 0].set_title('MAE: User-based vs Item-based CF', fontsize=13, fontweight='bold')
axes[0, 0].set_xticks(x_pos)
axes[0, 0].set_xticklabels([s.upper() for s in similarities])
axes[0, 0].legend()
axes[0, 0].grid(axis='y', alpha=0.3)

# RMSE comparison
axes[0, 1].bar(x_pos - width/2, user_rmse, width, yerr=user_rmse_std, label='User-based CF', 
               capsize=5, alpha=0.8, color='#1f77b4')
axes[0, 1].bar(x_pos + width/2, item_rmse, width, yerr=item_rmse_std, label='Item-based CF', 
               capsize=5, alpha=0.8, color='#ff7f0e')
axes[0, 1].set_ylabel('RMSE', fontsize=12)
axes[0, 1].set_title('RMSE: User-based vs Item-based CF', fontsize=13, fontweight='bold')
axes[0, 1].set_xticks(x_pos)
axes[0, 1].set_xticklabels([s.upper() for s in similarities])
axes[0, 1].legend()
axes[0, 1].grid(axis='y', alpha=0.3)

# User-based CF trends
axes[1, 0].plot(similarities, user_mae, 'o-', label='MAE', linewidth=2, markersize=8, color='#2ca02c')
axes[1, 0].plot(similarities, user_rmse, 's-', label='RMSE', linewidth=2, markersize=8, color='#d62728')
axes[1, 0].set_xlabel('Similarity Metric', fontsize=12)
axes[1, 0].set_ylabel('Error', fontsize=12)
axes[1, 0].set_title('User-based CF: Similarity Impact', fontsize=13, fontweight='bold')
axes[1, 0].legend()
axes[1, 0].grid(alpha=0.3)

# Item-based CF trends
axes[1, 1].plot(similarities, item_mae, 'o-', label='MAE', linewidth=2, markersize=8, color='#2ca02c')
axes[1, 1].plot(similarities, item_rmse, 's-', label='RMSE', linewidth=2, markersize=8, color='#d62728')
axes[1, 1].set_xlabel('Similarity Metric', fontsize=12)
axes[1, 1].set_ylabel('Error', fontsize=12)
axes[1, 1].set_title('Item-based CF: Similarity Impact', fontsize=13, fontweight='bold')
axes[1, 1].legend()
axes[1, 1].grid(alpha=0.3)

plt.tight_layout()
plt.savefig('task_2e_similarity_metrics.png', dpi=300, bbox_inches='tight')
plt.show()
print("Plot saved as 'task_2e_similarity_metrics.png'")

In [None]:
# Analysis
user_best_mae = min(similarities, key=lambda s: user_sim_results[s]['MAE'])
user_best_rmse = min(similarities, key=lambda s: user_sim_results[s]['RMSE'])
item_best_mae = min(similarities, key=lambda s: item_sim_results[s]['MAE'])
item_best_rmse = min(similarities, key=lambda s: item_sim_results[s]['RMSE'])

print(f"Best similarity for User-based CF: {user_best_mae.upper()} (MAE), {user_best_rmse.upper()} (RMSE)")
print(f"Best similarity for Item-based CF: {item_best_mae.upper()} (MAE), {item_best_rmse.upper()} (RMSE)")

if user_best_mae == item_best_mae and user_best_rmse == item_best_rmse:
    print("\nThe impact of similarity metrics is CONSISTENT between User-based and Item-based CF.")
else:
    print("\nThe impact of similarity metrics is NOT CONSISTENT between User-based and Item-based CF.")

## Task 2f: Impact of Number of Neighbors

Examine how different k values affect CF performance.
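
The role of k can be seen in a small sketch of the neighborhood prediction that `KNNBasic` is built on: a similarity-weighted average over the k most similar neighbors who rated the item. The helper `knn_predict` and the sample similarities and ratings are assumptions for illustration, and the sketch assumes positive similarities.

```python
import numpy as np

def knn_predict(sims, neighbor_ratings, k):
    """Similarity-weighted average of the k most similar neighbors' ratings."""
    sims = np.asarray(sims, float)
    neighbor_ratings = np.asarray(neighbor_ratings, float)
    top = np.argsort(-sims)[:k]            # indices of the k largest similarities
    weights = sims[top]
    if weights.sum() <= 0:
        raise ValueError("no neighbors with positive similarity")
    return float(weights @ neighbor_ratings[top] / weights.sum())

# Hypothetical neighbors of one user for one movie
sims = [0.9, 0.8, 0.3, 0.1]
neighbor_ratings = [5.0, 4.0, 2.0, 1.0]

for k in (1, 2, 4):
    print(f"k={k}: predicted rating = {knn_predict(sims, neighbor_ratings, k):.3f}")
```

A small k leans on a few close neighbors and can be noisy, while a large k pulls in weakly similar neighbors and moves the prediction toward an overall average.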

In [None]:
k_values = [5, 10, 20, 30, 40, 50, 60, 70, 80]
user_neighbor_results = defaultdict(dict)
item_neighbor_results = defaultdict(dict)

for k in k_values:
    print(f"\n--- Testing k = {k} ---")
    
    # User-based CF
    print(f"User-based CF with k={k}...")
    user_cf = KNNBasic(k=k, sim_options={'name': 'cosine', 'user_based': True})
    user_cv = cross_validate(user_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=False)
    user_neighbor_results[k]['MAE'] = user_cv['test_mae'].mean()
    user_neighbor_results[k]['RMSE'] = user_cv['test_rmse'].mean()
    user_neighbor_results[k]['MAE_std'] = user_cv['test_mae'].std()
    user_neighbor_results[k]['RMSE_std'] = user_cv['test_rmse'].std()
    print(f"  MAE: {user_neighbor_results[k]['MAE']:.4f}, RMSE: {user_neighbor_results[k]['RMSE']:.4f}")
    
    # Item-based CF
    print(f"Item-based CF with k={k}...")
    item_cf = KNNBasic(k=k, sim_options={'name': 'cosine', 'user_based': False})
    item_cv = cross_validate(item_cf, data, measures=['MAE', 'RMSE'], cv=5, verbose=False)
    item_neighbor_results[k]['MAE'] = item_cv['test_mae'].mean()
    item_neighbor_results[k]['RMSE'] = item_cv['test_rmse'].mean()
    item_neighbor_results[k]['MAE_std'] = item_cv['test_mae'].std()
    item_neighbor_results[k]['RMSE_std'] = item_cv['test_rmse'].std()
    print(f"  MAE: {item_neighbor_results[k]['MAE']:.4f}, RMSE: {item_neighbor_results[k]['RMSE']:.4f}")

In [None]:
# Summary tables
print("\nUser-based CF:")
user_neighbor_df = pd.DataFrame(user_neighbor_results).T
display(user_neighbor_df[['MAE', 'RMSE']])

print("\nItem-based CF:")
item_neighbor_df = pd.DataFrame(item_neighbor_results).T
display(item_neighbor_df[['MAE', 'RMSE']])

In [None]:
# Visualization
fig, axes = plt.subplots(1, 2, figsize=(15, 6))

user_mae = [user_neighbor_results[k]['MAE'] for k in k_values]
user_rmse = [user_neighbor_results[k]['RMSE'] for k in k_values]
item_mae = [item_neighbor_results[k]['MAE'] for k in k_values]
item_rmse = [item_neighbor_results[k]['RMSE'] for k in k_values]

# User-based CF
axes[0].plot(k_values, user_mae, 'o-', label='MAE', linewidth=2, markersize=8, color='#2ca02c')
axes[0].plot(k_values, user_rmse, 's-', label='RMSE', linewidth=2, markersize=8, color='#d62728')
axes[0].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[0].set_ylabel('Error', fontsize=12)
axes[0].set_title('User-based CF: Impact of k', fontsize=13, fontweight='bold')
axes[0].legend(fontsize=11)
axes[0].grid(alpha=0.3)
axes[0].set_xticks(k_values)

# Item-based CF
axes[1].plot(k_values, item_mae, 'o-', label='MAE', linewidth=2, markersize=8, color='#2ca02c')
axes[1].plot(k_values, item_rmse, 's-', label='RMSE', linewidth=2, markersize=8, color='#d62728')
axes[1].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[1].set_ylabel('Error', fontsize=12)
axes[1].set_title('Item-based CF: Impact of k', fontsize=13, fontweight='bold')
axes[1].legend(fontsize=11)
axes[1].grid(alpha=0.3)
axes[1].set_xticks(k_values)

plt.tight_layout()
plt.savefig('task_2f_neighbor_impact.png', dpi=300, bbox_inches='tight')
plt.show()
print("Plot saved as 'task_2f_neighbor_impact.png'")

## Task 2g: Identifying Best K Value

Find the optimal number of neighbors for both CF methods.

In [None]:
# Find best k for each method based on RMSE
user_best_k = min(k_values, key=lambda k: user_neighbor_results[k]['RMSE'])
item_best_k = min(k_values, key=lambda k: item_neighbor_results[k]['RMSE'])

print("Best K based on RMSE:")
print(f"User-based CF: k = {user_best_k} (RMSE = {user_neighbor_results[user_best_k]['RMSE']:.4f})")
print(f"Item-based CF: k = {item_best_k} (RMSE = {item_neighbor_results[item_best_k]['RMSE']:.4f})")

# Also show best k based on MAE
user_best_k_mae = min(k_values, key=lambda k: user_neighbor_results[k]['MAE'])
item_best_k_mae = min(k_values, key=lambda k: item_neighbor_results[k]['MAE'])

print("\nBest K based on MAE:")
print(f"User-based CF: k = {user_best_k_mae} (MAE = {user_neighbor_results[user_best_k_mae]['MAE']:.4f})")
print(f"Item-based CF: k = {item_best_k_mae} (MAE = {item_neighbor_results[item_best_k_mae]['MAE']:.4f})")

In [None]:
# Visualization with best k highlighted
fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# RMSE plots
axes[0, 0].plot(k_values, user_rmse, 'o-', linewidth=2, markersize=8, color='#1f77b4', label='User-based CF')
axes[0, 0].plot([user_best_k], [user_neighbor_results[user_best_k]['RMSE']], 'r*', 
                 markersize=20, label=f'Best k={user_best_k}')
axes[0, 0].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[0, 0].set_ylabel('RMSE', fontsize=12)
axes[0, 0].set_title('User-based CF: RMSE vs k', fontsize=13, fontweight='bold')
axes[0, 0].legend(fontsize=11)
axes[0, 0].grid(alpha=0.3)
axes[0, 0].set_xticks(k_values)

axes[0, 1].plot(k_values, item_rmse, 'o-', linewidth=2, markersize=8, color='#ff7f0e', label='Item-based CF')
axes[0, 1].plot([item_best_k], [item_neighbor_results[item_best_k]['RMSE']], 'r*', 
                 markersize=20, label=f'Best k={item_best_k}')
axes[0, 1].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[0, 1].set_ylabel('RMSE', fontsize=12)
axes[0, 1].set_title('Item-based CF: RMSE vs k', fontsize=13, fontweight='bold')
axes[0, 1].legend(fontsize=11)
axes[0, 1].grid(alpha=0.3)
axes[0, 1].set_xticks(k_values)

# MAE plots
axes[1, 0].plot(k_values, user_mae, 'o-', linewidth=2, markersize=8, color='#1f77b4', label='User-based CF')
axes[1, 0].plot([user_best_k_mae], [user_neighbor_results[user_best_k_mae]['MAE']], 'r*', 
                 markersize=20, label=f'Best k={user_best_k_mae}')
axes[1, 0].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[1, 0].set_ylabel('MAE', fontsize=12)
axes[1, 0].set_title('User-based CF: MAE vs k', fontsize=13, fontweight='bold')
axes[1, 0].legend(fontsize=11)
axes[1, 0].grid(alpha=0.3)
axes[1, 0].set_xticks(k_values)

axes[1, 1].plot(k_values, item_mae, 'o-', linewidth=2, markersize=8, color='#ff7f0e', label='Item-based CF')
axes[1, 1].plot([item_best_k_mae], [item_neighbor_results[item_best_k_mae]['MAE']], 'r*', 
                 markersize=20, label=f'Best k={item_best_k_mae}')
axes[1, 1].set_xlabel('Number of Neighbors (k)', fontsize=12)
axes[1, 1].set_ylabel('MAE', fontsize=12)
axes[1, 1].set_title('Item-based CF: MAE vs k', fontsize=13, fontweight='bold')
axes[1, 1].legend(fontsize=11)
axes[1, 1].grid(alpha=0.3)
axes[1, 1].set_xticks(k_values)

plt.tight_layout()
plt.savefig('task_2g_best_k.png', dpi=300, bbox_inches='tight')
plt.show()
print("Plot saved as 'task_2g_best_k.png'")

In [None]:
# Conclusion
print("\n" + "="*80)
print("CONCLUSION:")
print("="*80)
if user_best_k == item_best_k:
    print(f"The best k is THE SAME for both methods: k = {user_best_k}")
    print("This suggests that both user-based and item-based CF benefit from")
    print("the same neighborhood size in this dataset.")
else:
    print("The best k is DIFFERENT for each method:")
    print(f"  - User-based CF: k = {user_best_k}")
    print(f"  - Item-based CF: k = {item_best_k}")
    print("This suggests that user-based and item-based CF have different")
    print("optimal neighborhood sizes for this dataset.")
print("="*80)

## Summary

This notebook has completed all tasks:
- **Task 2c**: Evaluated PMF, User-based CF, and Item-based CF with 5-fold CV
- **Task 2d**: Compared models and identified the best performer
- **Task 2e**: Analyzed impact of similarity metrics (cosine, MSD, Pearson)
- **Task 2f**: Examined how k affects performance
- **Task 2g**: Identified optimal k values for both CF methods

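The comparison plots for Tasks 2d through 2g have been saved as 300-dpi PNG files; the numeric results are displayed in the notebook but are not written to disk.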