In this lesson, you will learn how to build a recommendation system using the Surprise library in Python. Recommendation systems are algorithms designed to suggest relevant items to users based on available information about users, items, and their interactions. They have become essential in many online platforms including e-commerce (Amazon), streaming services (Netflix, Spotify), and social media (Facebook, Twitter).

We'll follow a comprehensive step-by-step process to:

1. Load and prepare data.
2. Train recommendation models using different algorithms.
3. Tune a recommendation model using gridsearch.
4. Make predictions for specific users and items.
5. Evaluate the models' performance using appropriate metrics.
6. Implement a function to generate personalized recommendations.

In [6]:
# Import required modules
from surprise import Dataset  # For loading and handling datasets
from surprise import Reader   # For parsing custom datasets
from surprise import SVD      # Singular Value Decomposition algorithm
from surprise import KNNBasic, KNNWithMeans, KNNWithZScore  # K-Nearest Neighbors algorithms
from surprise import NMF      # Non-negative Matrix Factorization algorithm
from surprise import BaselineOnly  # Basic algorithm using baselines
from surprise.model_selection import train_test_split  # For splitting data
from surprise.model_selection import cross_validate    # For cross-validation
from surprise.model_selection import GridSearchCV      # For hyperparameter tuning
from surprise import accuracy  # For computing prediction accuracy metrics

#### Step 1: Loading the Data ####

For this lesson, we'll use the Surprise built-in MovieLens 100k dataset, which contains 100,000 ratings (1-5) from 943 users on 1,682 movies. This is a popular benchmark dataset for recommendation systems.

In [7]:
# Load the built-in MovieLens dataset
data = Dataset.load_builtin('ml-100k')

In [None]:
# loading a locally saved dataset

# Define the format of your custom file
file_path = 'your_ratings.csv'

# Create a reader object specifying the rating scale
reader = Reader(line_format='user item rating timestamp', sep=',', rating_scale=(1, 5))

# Load the data from your file
data = Dataset.load_from_file(file_path, reader)

# If your data is in a pandas DataFrame
import pandas as pd

# Sample DataFrame structure
df = pd.DataFrame({
    'user': [1, 1, 2, 2],
    'item': [101, 102, 101, 103],
    'rating': [4, 3, 5, 2]
})

# Create a reader object
reader = Reader(rating_scale=(1, 5))

# Load data from DataFrame using column names
data = Dataset.load_from_df(df[['user', 'item', 'rating']], reader)

#### Step 2: Splitting the Data ####

Surprise uses a special masking procedure to hide known ratings from the data in order to create a train and test set. These masked ratings can then be compared to their predicted ratings from the model in order to evaluate the recommendation system.

In [8]:
# Split the dataset into train and test sets (75% training, 25% testing)
trainset, testset = train_test_split(data, test_size=0.25, random_state=42)

#### Step 3: Selecting and Training a Model

*Note:* we will use the full data set to cross validate as Surprise train_test_split cannot work with cross validation given its unique method of splitting data. We will use our trainset and testset to evaluate the final model.

In [9]:
# Define a list of algorithms to compare
algorithms = [
    SVD(),
    KNNBasic(sim_options={'user_based': True}),  # User-based collaborative filtering
    KNNBasic(sim_options={'user_based': False}), # Item-based collaborative filtering
    KNNWithMeans(sim_options={'user_based': True}),
    NMF(),
    BaselineOnly()
]

# Evaluate each algorithm using cross-validation
results = {}
for algo in algorithms:
    algo_name = algo.__class__.__name__
    sim_option = ''
    
    # Add user/item based info for KNN algorithms
    if algo_name.startswith('KNN'):
        user_based = algo.sim_options.get('user_based', True)
        sim_option = 'User-based' if user_based else 'Item-based'
        algo_name = f"{algo_name} ({sim_option})"
    
    # Run 5-fold cross-validation
    cv_results = cross_validate(algo, data, measures=['RMSE', 'MAE'], 
                               cv=5, verbose=False)
    
    # Store results
    results[algo_name] = {
        'RMSE': cv_results['test_rmse'].mean(),
        'MAE': cv_results['test_mae'].mean()
    }

# Print comparison table
print("\nAlgorithm Comparison:")
print("-" * 60)
print(f"{'Algorithm':<30} {'RMSE':<15} {'MAE':<15}")
print("-" * 60)
for algo_name, metrics in results.items():
    print(f"{algo_name:<30} {metrics['RMSE']:<15.4f} {metrics['MAE']:<15.4f}")

Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computi

**Algorithm Descriptions**

- SVD (Singular Value Decomposition): A matrix factorization technique that decomposes the user-item rating matrix into lower-dimensional user and item factors.
- KNNBasic (K-Nearest Neighbors Basic):
    - User-based: Predicts ratings based on the ratings of similar users
    - Item-based: Predicts ratings based on the user's ratings of similar items
- KNNWithMeans: Similar to KNNBasic, but takes into account the mean ratings of each user when calculating similarities.
- NMF (Non-negative Matrix Factorization): A matrix factorization method where all factors must be non-negative, which can lead to more interpretable factors.
- BaselineOnly: A simple algorithm that predicts the baseline estimate for a given user and item, which consists of global average, user bias, and item bias.

**Algorithm Strengths and Weaknesses**

- SVD typically provides good accuracy and can handle large datasets efficiently.
- KNN methods are more interpretable (recommendations can be explained by showing similar users/items).
- NMF might provide more interpretable latent factors than SVD.
- BaselineOnly serves as a good baseline to compare against more sophisticated algorithms.

The best algorithm depends on the specific dataset and application. Comparing multiple algorithms allows us to select the one that performs best for our particular use case.

#### Step 4: Hyperparameter Tuning ####

The SVD model had the lowest RMSE and MAE so we will proceed with that model as the one to tweak via hyperparameter tuning. 

Key Hyperparameters:

*n_factors*: Number of latent factors (the dimensionality of the user and item vectors).
- More factors increase ability to capture patterns but increase computational cost and may lead to overfitting

*n_epochs*: Number of iterations of SGD (Stochastic Gradient Descent).
- More epochs may lead to better convergence but increase the risk of overfitting.

*lr_all*: Learning rate for all parameters.
- Lower learning rates may result in more stable convergence but require more epochs.

*reg_all*: Regularization term for all parameters.
- Higher regularization values help prevent overfitting but might reduce the model's ability to capture patterns.

In [11]:
# Define parameter grid for SVD
param_grid = {
   'n_factors': [50, 100, 150],
   'n_epochs': [10, 20, 30],
   'lr_all': [0.002, 0.005, 0.01],
   'reg_all': [0.02, 0.1, 0.5],
   'random_state': [42]
}

# Perform grid search
gs = GridSearchCV(SVD, param_grid, measures=['rmse', 'mae'], cv=5)
gs.fit(data)

# Print best parameters
print("\nBest RMSE Parameters:")
print(gs.best_params['rmse'])
print(f"Best RMSE Score: {gs.best_score['rmse']:.4f}")

print("\nBest MAE Parameters:")
print(gs.best_params['mae'])
print(f"Best MAE Score: {gs.best_score['mae']:.4f}")

# Train the model with the best parameters
best_algo = SVD(**gs.best_params['mae'])
best_algo.fit(trainset)


Best RMSE Parameters:
{'n_factors': 150, 'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.1, 'random_state': 42}
Best RMSE Score: 0.9123

Best MAE Parameters:
{'n_factors': 150, 'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.1, 'random_state': 42}
Best MAE Score: 0.7215


<surprise.prediction_algorithms.matrix_factorization.SVD at 0x20cdb0aeca0>

#### Step 5: Make Predictions and Evaluate ####

Each prediction contains:

- uid: The user identifier
- iid: The item identifier
- r_ui: The true rating given by the user
- est: The estimated rating predicted by the algorithm
- The difference between true and estimated ratings indicates the prediction error

The goal is to minimize this error across all predictions. A smaller error indicates a better-performing recommendation system.

In [12]:
# Make predictions on the testset
predictions = best_algo.test(testset)

# Look at the first few predictions
for pred in predictions[:3]:
    print(f"User: {pred.uid}, Item: {pred.iid}, "
          f"Actual Rating: {pred.r_ui:.2f}, Predicted Rating: {pred.est:.2f}, "
          f"Error: {pred.r_ui - pred.est:.2f}")

User: 391, Item: 591, Actual Rating: 4.00, Predicted Rating: 3.53, Error: 0.47
User: 181, Item: 1291, Actual Rating: 1.00, Predicted Rating: 1.54, Error: -0.54
User: 637, Item: 268, Actual Rating: 2.00, Predicted Rating: 2.80, Error: -0.80


In [13]:
# Compute RMSE and MAE on the testset
rmse = accuracy.rmse(predictions)
mae = accuracy.mae(predictions)

RMSE: 0.9176
MAE:  0.7244


On average this recommendation model's rating predictions are off by about 0.72 of a rating level. We could continue to tune the model given the results of our first gridsearch to try and improve further, but for the sake of time and this lesson we will stop here.

#### Step 6: Make Specific Recommendations ####

To provide practical recommendations, we want to suggest items a user hasn't interacted with yet which are predicted to be highly rated by that user.

In [14]:
# Create a full training set that includes all users
full_trainset = data.build_full_trainset()

# Fit model to all users and data
best_algo.fit(full_trainset)

def get_top_n_recommendations(algo, data, user_id, n=10):
    """
    Generate top-N recommendations for a specific user
    
    Parameters:
    -----------
    algo : surprise.prediction_algorithms
        Trained algorithm
    data : surprise.Trainset
        The full training dataset
    user_id : str
        The user ID for whom to generate recommendations
    n : int, default=10
        Number of recommendations to generate
        
    Returns:
    --------
    list of tuples
        (item_id, predicted_rating) sorted by predicted rating in descending order
    """
    # Get a list of all items
    all_item_ids = data.all_items()
    
    # Convert raw user ID to inner ID used by the trainset
    try:
        inner_user_id = data.to_inner_uid(user_id)
    except ValueError:
        print(f"User {user_id} doesn't exist in the training set")
        return []
    
    # Get items rated by this user
    user_items = [j for (j, _) in data.ur[inner_user_id]]
    
    # Items not rated by the user
    unrated_items = [item_id for item_id in all_item_ids if item_id not in user_items]
    
    # Predict ratings for unrated items
    predictions = []
    for item_id in unrated_items:
        # Convert inner item ID back to raw ID for prediction
        raw_item_id = data.to_raw_iid(item_id)
        # Get prediction
        pred = algo.predict(user_id, raw_item_id)
        predictions.append((raw_item_id, pred.est))
    
    # Sort predictions by estimated rating
    predictions.sort(key=lambda x: x[1], reverse=True)
    
    # Return top n recommendations
    return predictions[:n]

# Example usage
user_id = '196'  # Choose a user ID from the dataset
top_recommendations = get_top_n_recommendations(algo, trainset, user_id, n=5)

print(f"Top 5 movie recommendations for user {user_id}:")
for movie_id, predicted_rating in top_recommendations:
    print(f"Movie ID: {movie_id}, Predicted Rating: {predicted_rating:.2f}")

Top 5 movie recommendations for user 196:
Movie ID: 408, Predicted Rating: 4.57
Movie ID: 318, Predicted Rating: 4.54
Movie ID: 169, Predicted Rating: 4.54
Movie ID: 483, Predicted Rating: 4.52
Movie ID: 64, Predicted Rating: 4.52


This function works by:

1. Identifying all items not yet rated by the user.
2. Predicting how the user would rate each unrated item.
3. Sorting these items by their predicted ratings.
4. Returning the top-N items with the highest predicted ratings.

This approach ensures we're recommending new items the user hasn't seen yet, rather than items they've already rated. The conversion between inner IDs (used by Surprise internally) and raw IDs (the original identifiers) is a crucial step when working with the Surprise library. The returned Movie IDâ€™s can then be connected with meta data to provide contextual information about the movie (title, genre, etc).