# Ensemble

### Scope
In this notebook we combine multiple models to decrease the variance in the final predictions, with the aim of improving the $\text{RMSE}$ score on Kaggle's leaderboard.

### About the ensemble model
The ensemble model is a weighted average of the predictions of the top-$K$ models trained during the hyperparameter tuning of the LightGCNPlus model.
Mathematically, the ensemble model is defined as follows:
$$
\text{Ensemble}(x) = \sum_{i=1}^{K} w_i \cdot \text{Model}_i(x)
$$
where $w_i$ is the weight of the $i$-th model and $\text{Model}_i(x)$ is the prediction of the $i$-th model on the input $x$.
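
As a minimal sketch of the formula above, the weighted average reduces to a single matrix-vector product when the $K$ prediction vectors are stacked row-wise. The prediction values and weights below are made up for illustration and do not come from the trained models.

```python
import numpy as np

# Hypothetical predictions of K = 3 models on 4 inputs (illustrative values only).
predictions = np.array([
    [3.0, 4.0, 2.0, 5.0],  # Model_1(x)
    [3.5, 3.5, 2.5, 4.5],  # Model_2(x)
    [2.5, 4.5, 1.5, 5.0],  # Model_3(x)
])

# Hypothetical weights w_i; non-negative and summing to 1.
weights = np.array([0.5, 0.3, 0.2])

# Ensemble(x) = sum_i w_i * Model_i(x), computed for all inputs at once.
ensemble = weights @ predictions
print(ensemble)
```

The product `weights @ predictions` is equivalent to `np.sum(weights[:, None] * predictions, axis=0)`, the form used in the code cell later in this notebook.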

### Tuning 
To find the optimal weights for the ensemble model, we split the original validation set into two subsets: a fit subset and a test subset. We use the fit subset to search over weight combinations of the models and the test subset to evaluate each resulting combination.
We then select the combination that performs best on the test subset.
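
The procedure can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the notebook's actual pipeline: the "models" are simulated as the true ratings plus Gaussian noise, the split is a plain random permutation, and all names (`preds`, `fit_idx`, `test_idx`, `mse`) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic stand-in data: true ratings and K noisy "model" predictions.
n, K = 1000, 4
true = rng.uniform(1, 5, size=n)
noise_scales = np.array([0.5, 0.8, 1.0, 1.5])
preds = true + rng.normal(size=(K, n)) * noise_scales[:, None]

# Split the indices into a fit half and a held-out test half.
idx = rng.permutation(n)
fit_idx, test_idx = idx[: n // 2], idx[n // 2:]

def mse(weights, predictions, targets):
    """Mean squared error of the weighted-average ensemble."""
    return np.mean((weights @ predictions - targets) ** 2)

# Start from equal weights; constrain weights to [0, 1] and to sum to 1.
w0 = np.full(K, 1.0 / K)
result = minimize(
    mse,
    w0,
    args=(preds[:, fit_idx], true[fit_idx]),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * K,
    constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
)
weights = result.x

# Compare equal and fitted weights on the fit half, then score on the test half.
fit_mse_equal = mse(w0, preds[:, fit_idx], true[fit_idx])
fit_mse_opt = mse(weights, preds[:, fit_idx], true[fit_idx])
test_rmse = np.sqrt(mse(weights, preds[:, test_idx], true[test_idx]))
print(f"fit MSE (equal weights):  {fit_mse_equal:.4f}")
print(f"fit MSE (fitted weights): {fit_mse_opt:.4f}")
print(f"test RMSE:                {test_rmse:.4f}")
```

Scoring on a half that played no part in fitting the weights gives a less optimistic estimate than reusing the fit half, which is the reason for the split.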

### Results
Because ensembling radically improves the $\text{RMSE}$ score on Kaggle's leaderboard, the ensemble model is our final model.

In [20]:
# load validation set from "../data/model_state/val_df.csv"

import pandas as pd
import numpy as np

val_df = pd.read_csv("../data/model_state/val_df.csv")

users = val_df["val_users"].values
items = val_df["val_items"].values
ratings = val_df["val_ratings"].values

print(type(users), users.shape)

<class 'numpy.ndarray'> (12001,)


In [25]:
import numpy as np
from sklearn.metrics import mean_squared_error
from scipy.optimize import minimize
import torch
from models import LightGCNPlus, load_best_val_model
from postprocess import create_submission_matrix, load_means_stds, to_submission_format
from load import load_submission_users_items

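# Split the validation data into a fit half (the *_val arrays, used to fit the
# ensemble weights) and a held-out test half (the *_test arrays, used to evaluate them)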
from sklearn.model_selection import train_test_split
TEST_SIZE = 0.5
users_val, users_test, items_val, items_test, ratings_val, ratings_test = train_test_split(users, items, ratings, test_size=TEST_SIZE)

DEVICE = torch.device('mps')

# Model IDs and configurations
model_configs = [
    (28, 4, (5, ), 1),
    (28, 4, (6, ), 1),
    (28, 5, (6, 1),1),
    (28, 5, (6, ), 1),
    (28, 6, (7, ), 1),
    (28, 8, (9, ), 1),
    (32, 9, (10, ), 1),
    (34, 4, (5, ), 1),
    (32, 9, (10, 1), 1),
    (32, 8, (9,), 1),
]

def load_and_predict(model_class, config_id, users, items):
    ID = f"{config_id[0]}_{config_id[1]}_{str(config_id[2])}_{config_id[3]}"
    model = load_best_val_model(model_class, ID)
    raw_pred_ratings = model.get_ratings(users, items).detach().cpu().numpy()
    raw_submission_matrix = create_submission_matrix(raw_pred_ratings, users, items)
    pred_ratings = raw_submission_matrix[users, items]
    pred_ratings = np.clip(pred_ratings, 1, 5)
    return pred_ratings

# Load and predict with each model for validation and test sets
pred_ratings_list_val = [load_and_predict(LightGCNPlus, config, users_val, items_val) for config in model_configs]
pred_ratings_list_test = [load_and_predict(LightGCNPlus, config, users_test, items_test) for config in model_configs]

# Define the objective function for optimization
def objective(weights, predictions, true_ratings):
    ensemble_preds = np.sum(weights[:, None] * predictions, axis=0)
    mse = mean_squared_error(true_ratings, ensemble_preds)
    return mse

# Initial weights (equal weights)
initial_weights = np.ones(len(pred_ratings_list_val)) / len(pred_ratings_list_val)

# Bounds for the weights (they should be between 0 and 1)
bounds = [(0, 1) for _ in range(len(pred_ratings_list_val))]

# Constraints (weights should sum to 1)
constraints = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}

# Find the best weights using the validation set
result = minimize(objective, initial_weights, args=(np.array(pred_ratings_list_val), ratings_val), bounds=bounds, constraints=constraints, method='SLSQP')

best_weights = result.x
print(f"Best weights found: {best_weights}")

# Evaluate the best weights on the test set
ensemble_pred_ratings_test = np.sum(best_weights[:, None] * np.array(pred_ratings_list_test), axis=0)
mse_test = mean_squared_error(ratings_test, ensemble_pred_ratings_test)
print(f"Test MSE: {mse_test}")

# Load submission users and items
submission_users, submission_items = load_submission_users_items()

# Load and predict with each model for the submission set
pred_ratings_list_submission = [load_and_predict(LightGCNPlus, config, submission_users, submission_items) for config in model_configs]

# Get final ensemble predictions for submission
ensemble_pred_ratings_submission = np.sum(best_weights[:, None] * np.array(pred_ratings_list_submission), axis=0)
ensemble_pred_ratings_submission = np.clip(ensemble_pred_ratings_submission, 1, 5)

# Create the submission dataframe
submission = to_submission_format(submission_users, submission_items, ensemble_pred_ratings_submission)

# Save the submission to a CSV file
submission.to_csv('../data/submission_data/submission.csv', index=False)
print("Submission saved to '../data/submission_data/submission.csv'")

Best weights found: [1.89452618e-01 1.50103317e-01 0.00000000e+00 2.85901454e-01
 0.00000000e+00 7.32854166e-02 3.90312782e-18 3.00632528e-01
 6.93889390e-18 6.24665968e-04]
Test MSE: 0.9273070748163846
Submission saved to '../data/submission_data/submission.csv'
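
The cell above reports the held-out error as an MSE, while the leaderboard metric named in the Scope section is the RMSE. A small sketch of the conversion, using the MSE value copied from the output above (the variable names are only illustrative):

```python
import math

# Test MSE copied from the cell output above.
test_mse = 0.9273070748163846

# RMSE is the square root of the MSE.
test_rmse = math.sqrt(test_mse)
print(f"Test RMSE: {test_rmse:.3f}")  # Test RMSE: 0.963
```

This value is measured on the held-out half of the validation set, so it need not match the leaderboard score exactly.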
