## School project - 5MLRE
This notebook was written for a school project whose goal is to build an anime recommendation system. The subject and the questions are available in the appendix.

The group members who participated in this project are:
- AMIMI Lamine
- BEZIN Théo
- LECOMTE Alexis
- PAWLOWSKI Maxence

### Main index
1. Data analysis
2. **Collaborative filtering (you are here)**
3. Content-based filtering
4. _Appendix_

# 2 - Collaborative filtering
In the previous notebook, we loaded, cleaned and studied the [MyAnimeList](https://myanimelist.net/) datasets. Now that we know them better, we will start building the recommendation system using collaborative filtering. Collaborative filtering is a technique that selects items a user might like based on feedback from similar users. There are two sub-techniques: user-based collaborative filtering and item-based collaborative filtering.

### Index
<ol type="A">
  <li>Notebook initialization</li>
  <li>Collaborative filtering: unfiltered training</li>
  <li>Collaborative filtering: filtered training</li>
  <li>Getting the Top-N</li>
  <li>Conclusion of the collaborative filtering</li>
</ol>

## A - Notebook initialization
### A.1 - Imports

In [10]:
# OS and filesystem
import os
import sys
from pathlib import Path
import random

# Data
import pandas
from matplotlib import pyplot
import matplotx

# Model processing
import surprise

# Console output
from colorama import Style

# Misc.
from ast import literal_eval

# Local files
sys.path.append(os.path.join(os.pardir, os.pardir))
import helpers

### A.2 - Package initialization

In [2]:
pyplot.rcParams.update(pyplot.rcParamsDefault)
pyplot.style.use(matplotx.styles.dracula)  # Set the matplotlib style

### A.3 - Constants

In [3]:
# Filesystem paths
PARENT_FOLDER = Path.cwd()
DATA_FOLDER = (PARENT_FOLDER / ".." / ".." / "data").resolve()
MODELS_FOLDER = (PARENT_FOLDER / ".." / ".." / "models").resolve()
TEMP_FOLDER = (PARENT_FOLDER / ".." / ".." / "temp").resolve()

# Plots
FIG_SIZE = (12, 7)

# Misc.
RANDOM_STATE = 2077

### A.4 - Datasets loading

In [8]:
# data_reader = surprise.Reader(line_format="user item rating", sep=",", rating_scale=(-1, 10), skip_lines=1)
# data = surprise.Dataset.load_from_file(file_path=(DATA_FOLDER / "rating2.csv"), reader=data_reader)

# Load a smaller sample of the dataset instead of the 8M rows
data = pandas.read_csv((DATA_FOLDER / "rating.csv"), dtype={"user_id": str, "anime_id": str})
# data = data[data["rating"] >= 0]
data = data.head(n=125_000)  # We use `head` instead of `sample`.

data_reader = surprise.Reader(rating_scale=(-1, 10))
data = surprise.Dataset.load_from_df(df=data[["user_id", "anime_id", "rating"]], reader=data_reader)

In [4]:
data_anime = pandas.read_csv(DATA_FOLDER / "anime_cleaned.csv", converters={"genre_split": literal_eval})

In [6]:
rankings = pandas.Series(data=data_anime["rank_num_ratings"].values, index=data_anime["anime_id"]).to_dict()

In [7]:
evaluator = helpers.ml.ModelEvaluator(dataset=data, rankings=rankings, models_folder=MODELS_FOLDER, seed=RANDOM_STATE)

Constructing sets. This can take a while...
   > Building train/test sets...
   > Building LeaveOneOut sets...
   > Building full sets...
   > Preparing the similarities model...
Estimating biases using als...
Computing the cosine similarity matrix...
Done computing similarity matrix.


## B - Collaborative filtering: unfiltered training
As explained earlier, collaborative filtering makes personalized recommendations by analyzing a user's past preferences or behaviors and comparing them to those of similar users. There are two sub-techniques: user-based collaborative filtering and item-based collaborative filtering.

- User-based: finds users who resemble the target user in terms of preferences, liked items and navigation behavior.
- Item-based: finds items similar to those the user has previously interacted with.

User-based filtering is more relevant for entertainment-related items, as it recommends items that other users with similar preferences have liked. Preferences carry many nuances, and we are dealing with individual tastes here. Item-based recommendations are better suited to online shops, which recommend products based on their characteristics.

In our case, user-based filtering should give better results, but in this notebook we will test our models with both methods.

### B.1 - Slope One
Slope One is a lightweight collaborative filtering algorithm. It computes the average difference between the ratings of each pair of items, across the users who rated both, and uses these differences to predict the rating a user would give to an unseen item.
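To make the idea concrete, here is a minimal pure-Python sketch of the basic (unweighted) Slope One scheme. It is not the Surprise implementation used below; the function name `slope_one_predict` and the toy ratings are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` with the basic Slope One scheme.

    `ratings` maps user -> {item: rating} (hypothetical toy structure).
    """
    diffs = defaultdict(float)  # sum of (target - other item) rating differences
    counts = defaultdict(int)   # number of users who rated both items
    for other_ratings in ratings.values():
        if target not in other_ratings:
            continue
        for item, value in other_ratings.items():
            if item != target:
                diffs[item] += other_ratings[target] - value
                counts[item] += 1

    # For each item the user rated: own rating + average deviation to the target
    estimates = [
        rating + diffs[item] / counts[item]
        for item, rating in ratings[user].items()
        if counts[item] > 0
    ]
    return sum(estimates) / len(estimates) if estimates else None


toy = {
    "alice": {"A": 5.0, "B": 3.0, "C": 2.0},
    "bob": {"A": 3.0, "B": 4.0},
    "carol": {"B": 2.0, "C": 5.0},
}
prediction = slope_one_predict(toy, "bob", "C")
print(prediction)
```

On this toy data, item C is rated 3 points lower than A on average and 1 point higher than B, so the two estimates for bob are 0 and 5, which average to 2.5.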

In [8]:
evaluator.run_model(name="Slope One", model=surprise.SlopeOne, hyper_params=None, measure_key="rmse", override=False)

Testing "Slope One".
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {}
RMSE: 2.223912
MAE: 1.432369

Hit rate: 0.404858%
Hit rate per rating value:
Rating	Hit rate
10.0	2.136752%
Cumulative hit rate (min_rating=3.0): 0.488281%
Average reciprocal hit rank: 0.0020917678812415654
User coverage (num_users=1235, min_rating=3.0): 100.000000%
Diversity: 0.666667
Novelty: 3281.476531

Testing of the "Slope One" model successfully completed in 0:10:14.751439.
Grid search: N/A
Training and testing: 0:00:04.693023
Top-N building: 0:10:06.570479


### B.2 - KNN Basic
KNN Basic (K-Nearest Neighbors) is another collaborative filtering algorithm used for recommendation systems. It finds the "K" most similar items or users according to a similarity metric, then predicts the user's rating for the target item as the similarity-weighted average of the neighbors' ratings.
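The weighted average at the heart of this algorithm can be sketched in a few lines. This is a simplified illustration, not Surprise's code: the function `knn_basic_predict` and its `(similarity, rating)` input format are assumptions, while `k` and `min_k` mirror the hyperparameters searched below.

```python
def knn_basic_predict(neighbors, k=40, min_k=1):
    """Similarity-weighted average of the ratings of the k nearest neighbors.

    `neighbors` is a list of (similarity, rating) pairs for the users (or
    items) that have a rating for the target (hypothetical input format).
    """
    # Keep positively correlated neighbors only, most similar first
    top = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    if not top or len(top) < min_k:
        return None  # not enough neighbors: the prediction is impossible
    total_similarity = sum(sim for sim, _ in top)
    return sum(sim * rating for sim, rating in top) / total_similarity


prediction = knn_basic_predict([(0.9, 8.0), (0.1, 4.0), (0.5, 6.0)], k=2)
print(round(prediction, 3))
```

With `k=2`, only the neighbors with similarities 0.9 and 0.5 are kept, so the prediction leans towards the rating of the closest neighbor.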

In [9]:
evaluator.run_model(
    name="KNN Basic",
    model=surprise.KNNBasic,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN Basic".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 1, 'sim_options': {'name': 'pearson_baseline', 'user_based': False}}
RMSE: 2.229013
MAE: 1.382239

Hit rate: 0.323887%
Hit rate per rating value:
Rating	Hit rate
-1.0	

### B.3 - KNN With Means
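KNN With Means is a variant of the KNN Basic algorithm. This time, the algorithm takes the mean rating of each user (or item) into account: neighbor ratings are centered on their own mean before the weighted average is computed, and the mean of the target user (or item) is added back to the result.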
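As a sketch, and assuming the user-based formulation documented for Surprise's `KNNWithMeans`, the prediction can be written as follows. Here $N_i^k(u)$ denotes the (at most) $k$ nearest neighbors of user $u$ that rated item $i$, $\mu_u$ is the mean rating of user $u$, and $\text{sim}$ is the chosen similarity measure:

$$
\hat{r}_{ui} = \mu_u + \frac{\sum_{v \in N_i^k(u)} \text{sim}(u, v) \cdot (r_{vi} - \mu_v)}{\sum_{v \in N_i^k(u)} \text{sim}(u, v)}
$$

The item-based version swaps the roles of users and items.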

In [10]:
evaluator.run_model(
    name="KNN With Means",
    model=surprise.KNNWithMeans,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN With Means".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 5, 'sim_options': {'name': 'pearson_baseline', 'user_based': True}}
RMSE: 2.160327
MAE: 1.351401

Hit rate: 2.348178%
Hit rate per rating value:
Rating	Hit rate
7

### B.4 - KNN With Z-Score
KNN With Z-Score is another variant of the KNN algorithm that takes into account both the mean ratings and the standard deviations of users or items. In addition to the previous steps, each neighbor rating is converted to a Z-Score by subtracting the neighbor's mean rating and dividing the result by its standard deviation. The weighted average of these Z-Scores is then rescaled with the mean and standard deviation of the target user or item. This normalizes the ratings for both rating tendencies and variability, which can improve the accuracy of predictions and recommendations.
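The normalization step can be illustrated with a short pure-Python sketch. The function `z_score_predict` and its input format are hypothetical, and the handling of neighbors with zero variance (they are skipped here) is an assumption that may differ from Surprise.

```python
from statistics import mean, pstdev


def z_score_predict(user_ratings, neighbors):
    """Predict a rating from z-score-normalized neighbor ratings.

    `user_ratings`: all ratings already given by the target user.
    `neighbors`: list of (similarity, rating for the target item,
    all ratings of that neighbor) tuples (hypothetical input format).
    """
    mu_u, sigma_u = mean(user_ratings), pstdev(user_ratings)
    weighted_z, total_similarity = 0.0, 0.0
    for sim, rating, all_ratings in neighbors:
        sigma_v = pstdev(all_ratings)
        if sim <= 0 or sigma_v == 0:
            continue  # assumption: ignore neighbors without usable signal
        weighted_z += sim * (rating - mean(all_ratings)) / sigma_v
        total_similarity += sim
    if total_similarity == 0:
        return mu_u  # fall back to the user's own mean
    # Rescale the average z-score to the target user's own rating scale
    return mu_u + sigma_u * weighted_z / total_similarity


prediction = z_score_predict(
    user_ratings=[6.0, 8.0],
    neighbors=[(1.0, 10.0, [6.0, 10.0])],
)
print(prediction)
```

In this example the neighbor rated the item one standard deviation above their own mean, so the prediction is one standard deviation above the target user's mean.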

In [11]:
evaluator.run_model(
    name="KNN With Z-Score",
    model=surprise.KNNWithZScore,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN With Z-Score".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 3, 'sim_options': {'name': 'pearson_baseline', 'user_based': True}}
RMSE: 2.155113
MAE: 1.320680

Hit rate: 1.457490%
Hit rate per rating value:
Rating	Hit rate

### B.5 - KNN Baseline
KNN Baseline is another variant of the KNN Basic algorithm. Instead of centering the ratings on a mean, it centers them on a baseline estimate that combines the global mean rating with a bias for each user and each item. These biases are estimated beforehand, here with Alternating Least Squares (`als`), so the neighbors only have to explain the deviation from this baseline.
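For reference, Surprise documents `KNNBaseline` as a KNN variant that takes a baseline rating into account. A sketch of the user-based prediction, where $\mu$ is the global mean rating, $b_u$ and $b_i$ are the user and item biases, and $N_i^k(u)$ is the set of (at most) $k$ nearest neighbors of user $u$ that rated item $i$:

$$
b_{ui} = \mu + b_u + b_i
\qquad
\hat{r}_{ui} = b_{ui} + \frac{\sum_{v \in N_i^k(u)} \text{sim}(u, v) \cdot (r_{vi} - b_{vi})}{\sum_{v \in N_i^k(u)} \text{sim}(u, v)}
$$

The `bsl_options` searched in the cell below control how the biases $b_u$ and $b_i$ are fitted.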

In [12]:
evaluator.run_model(
    name="KNN Baseline",
    model=surprise.KNNBaseline,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        },
        "bsl_options": {
            "method": ["als"],
            "n_epochs": [5, 10, 15],
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN Baseline".
Running GridSearchCV...


KeyboardInterrupt: 

### B.6 - Non-negative Matrix Factorization
Non-negative Matrix Factorization (NMF) approximates a non-negative matrix*¹* of data as the product of two smaller non-negative matrices. In a recommendation system, these smaller matrices hold latent factors for the users and the items. Because every factor is non-negative, the factors are easier to interpret, and their product gives the predicted ratings.

*1: A non-negative matrix is a matrix where all elements are greater than or equal to zero.*
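As a sketch, following the formulation documented for Surprise's `NMF`, each user $u$ gets a factor vector $p_u$ and each item $i$ a factor vector $q_i$, both of length `n_factors` and constrained to stay non-negative. The predicted rating is their dot product:

$$
\hat{r}_{ui} = q_i^{\top} p_u, \qquad p_u \geq 0, \; q_i \geq 0
$$

With `biased=True`, the global mean and the user and item biases are added to this product: $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u$.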

In [13]:
evaluator.run_model(
    name="Non-negative Matrix Factorization",
    model=surprise.NMF,
    hyper_params={
        "n_factors": [5, 15, 25],
        "n_epochs": [25, 50, 75],
        "biased": [True, False]
    },
    measure_key="rmse",
    override=False
)

Testing "Non-negative Matrix Factorization".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'n_factors': 5, 'n_epochs': 75, 'biased': True}
RMSE: 2.541973
MAE: 1.684747

Hit rate: 1.376518%
Hit rate per rating value:
Rating	Hit rate
-1.0	0.961538%
6.0	2.941176%
7.0	1.047120%
8.0	1.181102%
9.0	1.293103%
10.0	2.136752%
Cumulative hit rate (min_rating=3.0): 1.464844%
Average reciprocal hit rank: 0.0049411991517254675
User coverage (num_users=1235, min_rating=3.0): 100.000000%
Diversity: 0.667507
Novelty: 1032.217333

Testing of the "Non-negative Matrix

### B.7 - Co-clustering
The goal of Co-clustering is to group similar rows and columns of a matrix simultaneously, so that patterns become more apparent. This method is useful when working on datasets with complex row and column relationships. Here, users and items are assigned to clusters with an optimization method similar to k-means, and the predictions are derived from the average ratings of these clusters and of the co-clusters they form.
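As a sketch, following the formulation documented for Surprise's `CoClustering`, the prediction combines three averages: $\overline{C_{ui}}$, the average rating of the co-cluster of user $u$ and item $i$; $\overline{C_u}$, the average rating of the cluster of $u$; and $\overline{C_i}$, the average rating of the cluster of $i$. With $\mu_u$ and $\mu_i$ the mean ratings of the user and of the item:

$$
\hat{r}_{ui} = \overline{C_{ui}} + (\mu_u - \overline{C_u}) + (\mu_i - \overline{C_i})
$$

The hyperparameters `n_cltr_u` and `n_cltr_i` searched below set the number of user and item clusters.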

In [14]:
evaluator.run_model(
    name="Co-clustering",
    model=surprise.CoClustering,
    hyper_params={
        "n_cltr_u": [1, 3, 5],
        "n_cltr_i": [1, 3, 5],
        "n_epochs": [10, 20, 30],
    },
    measure_key="rmse",
    override=False
)

Testing "Co-clustering".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'n_cltr_u': 1, 'n_cltr_i': 1, 'n_epochs': 10}
RMSE: 2.312769
MAE: 1.504430

Hit rate: 2.105263%
Hit rate per rating value:
Rating	Hit rate
6.0	1.470588%
7.0	0.523560%
8.0	2.362205%
9.0	2.155172%
10.0	5.555556%
Cumulative hit rate (min_rating=3.0): 2.539062%
Average reciprocal hit rank: 0.006970953023584602
User coverage (num_users=1235, min_rating=3.0): 93.441296%
Diversity: 0.977778
Novelty: 2389.190310

Testing of the "Co-clustering" model successfully completed in 0:13:19.172

### B.8 - Comparing performance
Before we compare all the previous results, let's define a few important terms that we need to understand to properly compare the performance of our models.

<u>Machine Learning metrics:</u>
- Root Mean Squared Error (RMSE): the square root of the average squared difference between the predicted and the true ratings. Large errors are penalized more heavily. The lower, the better.
- Mean Absolute Error (MAE): the average absolute difference between the predicted and the true ratings. The lower, the better.
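
The difference between the two accuracy metrics is easier to see on a small example. The sketch below uses hypothetical `(predicted, actual)` pairs and plain Python rather than the `surprise.accuracy` module.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


# Hypothetical predictions: errors of 2, 0 and 4 rating points
pairs = [(8.0, 10.0), (6.0, 6.0), (3.0, 7.0)]
toy_rmse = rmse(pairs)
toy_mae = mae(pairs)
print(f"RMSE={toy_rmse:.3f} MAE={toy_mae:.3f}")
```

The single 4-point error pulls the RMSE above the MAE, which is why a model can rank differently on the two metrics.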

<u>Recommendation systems metrics:</u>
- Hit rate: the proportion of users for whom the item held out by the leave-one-out split appears in their top-N, expressed in percent.
- Hit rate per rating value: the hit rate calculated independently for each rating value of the held-out item (from -1 to 10 in our case).
- Cumulative hit rate: the hit rate calculated only on held-out items whose actual rating reaches a given threshold (`min_rating=3.0` here).
- Average reciprocal hit rank (ARHR): the average over users of the reciprocal rank of the held-out item in the top-N. A hit near the top of the list counts more than a hit near the bottom.
- User coverage: the proportion of users for whom the system is able to make recommendations.
- Diversity: the variety or dissimilarity of the items recommended to users.
- Novelty: the degree to which recommended items are new or unexpected to the user.
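
As an illustration, the hit rate and the ARHR can be sketched as follows, assuming a leave-one-out protocol as the evaluator's logs suggest. The real computation lives in the project's `helpers` module, which is not shown here, so the function names and data structures below are hypothetical.

```python
def hit_rate(top_n, left_out):
    """Share of users whose held-out item appears in their top-N list."""
    hits = 0
    for user, held_out_item in left_out.items():
        recommended = [item for item, _ in top_n.get(user, [])]
        if held_out_item in recommended:
            hits += 1
    return hits / len(left_out)


def average_reciprocal_hit_rank(top_n, left_out):
    """Like the hit rate, but a hit at rank r only counts for 1 / r."""
    total = 0.0
    for user, held_out_item in left_out.items():
        for rank, (item, _) in enumerate(top_n.get(user, []), start=1):
            if item == held_out_item:
                total += 1.0 / rank
                break
    return total / len(left_out)


# Hypothetical toy data: user -> [(item, estimated rating), ...]
toy_top_n = {"u1": [("A", 9.5), ("B", 9.0)], "u2": [("C", 8.0), ("D", 7.5)]}
toy_left_out = {"u1": "B", "u2": "E"}  # one held-out item per user

toy_hit_rate = hit_rate(toy_top_n, toy_left_out)
toy_arhr = average_reciprocal_hit_rank(toy_top_n, toy_left_out)
print(toy_hit_rate, toy_arhr)
```

Only `u1` gets a hit, at rank 2, so the hit rate is 0.5 while the ARHR is 0.25.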

We can now compare our models. We will look at which model is best for each metric, and then conclude on the overall best model.

The model with the most interesting metric values is KNN Baseline, thanks to its good performance on RMSE, MAE and hit rate. It has a reasonable training time of seven seconds. Even though the grid search takes three hours, we already have the best parameters for this model, so we don't need to run the grid search again. On the other hand, KNN With Means shows an interesting performance for ARHR, with a training time of three seconds. KNN Basic is automatically excluded because 35% of its predictions were impossible (the user or the item was unknown to it).

Overall, based on the metrics and results we have at this stage, KNN Baseline appears to be the most efficient model.

## C - Collaborative filtering: filtered training
We will now run our previous experiment again, but this time using a filtered dataset.

### C.1 - Filtering the dataset
We only keep users with more than two hundred ratings and items with more than two hundred ratings.

In [6]:
data_filtered = pandas.read_csv((DATA_FOLDER / "rating.csv"), dtype={"user_id": str, "anime_id": str})

# Calculate the data filters
ratings_per_users = data_filtered.value_counts(subset="user_id").reset_index(name="count")
ratings_per_anime = data_filtered.value_counts(subset="anime_id").reset_index(name="count")

# Apply the filters
min_n_ratings = 200
data_filtered = data_filtered[data_filtered["user_id"].isin(ratings_per_users[ratings_per_users["count"] > min_n_ratings]["user_id"])]
data_filtered = data_filtered[data_filtered["anime_id"].isin(ratings_per_anime[ratings_per_anime["count"] > min_n_ratings]["anime_id"])]

# Save the file
data_filtered.to_csv((DATA_FOLDER / "rating_filtered_min_rating_200.csv"), index=False, mode="w")

We can now load the filtered dataset.

In [7]:
data_filtered = pandas.read_csv((DATA_FOLDER / "rating_filtered_min_rating_200.csv"), dtype={"user_id": str, "anime_id": str})
data_filtered = data_filtered.head(n=125_000)

data_reader = surprise.Reader(rating_scale=(-1, 10))
data_filtered = surprise.Dataset.load_from_df(df=data_filtered[["user_id", "anime_id", "rating"]], reader=data_reader)

And we initialize a new evaluator.

In [8]:
evaluator_filtered = helpers.ml.ModelEvaluator(dataset=data_filtered, rankings=rankings, models_folder=MODELS_FOLDER, seed=RANDOM_STATE)

Constructing sets. This can take a while...
   > Building train/test sets...
   > Building LeaveOneOut sets...
   > Building full sets...
   > Preparing the similarities model...
Estimating biases using als...
Computing the cosine similarity matrix...
Done computing similarity matrix.


We will now redefine our previous models and train them a second time with this new dataset.

### C.2 - Slope One

In [9]:
evaluator_filtered.run_model(name="Slope One - Filtered", model=surprise.SlopeOne, hyper_params=None, measure_key="rmse", override=False)

Testing "Slope One - Filtered".
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {}
RMSE: 1.945212
MAE: 1.246553

Hit rate: 0.000000%
Hit rate per rating value:
Rating	Hit rate
Cumulative hit rate (min_rating=3.0): 0.000000%
Average reciprocal hit rank: 0
User coverage (num_users=379, min_rating=3.0): 98.153034%
Diversity: 0.888889
Novelty: 2299.656091

Testing of the "Slope One - Filtered" model successfully completed in 0:05:39.533346.
Grid search: N/A
Training and testing: 0:00:05.747381
Top-N building: 0:05:29.728253


### C.3 - KNN Basic

In [10]:
evaluator_filtered.run_model(
    name="KNN Basic - Filtered",
    model=surprise.KNNBasic,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN Basic - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 1, 'sim_options': {'name': 'pearson_baseline', 'user_based': False}}
RMSE: 1.928885
MAE: 1.176739

Hit rate: 0.791557%
Hit rate per rating value:
Rating	Hit

### C.4 - KNN With Means

In [11]:
evaluator_filtered.run_model(
    name="KNN With Means - Filtered",
    model=surprise.KNNWithMeans,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN With Means - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 3, 'sim_options': {'name': 'pearson_baseline', 'user_based': True}}
RMSE: 1.870636
MAE: 1.156339

Hit rate: 0.263852%
Hit rate per rating value:
Rating

### C.5 - KNN With Z-Score

In [12]:
evaluator_filtered.run_model(
    name="KNN With Z-Score - Filtered",
    model=surprise.KNNWithZScore,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN With Z-Score - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 40, 'min_k': 3, 'sim_options': {'name': 'pearson_baseline', 'user_based': True}}
RMSE: 1.868933
MAE: 1.136648

Hit rate: 1.055409%
Hit rate per rating value:
Rati

### C.6 - KNN Baseline

In [13]:
evaluator_filtered.run_model(
    name="KNN Baseline - Filtered",
    model=surprise.KNNBaseline,
    hyper_params={
        "k": [20, 40, 60],
        "min_k": [1, 2, 3, 5],
        "sim_options": {
            "name": ["cosine", "msd", "pearson", "pearson_baseline"],
            "user_based": [True, False]
        },
        "bsl_options": {
            "method": ["als"],
            "n_epochs": [5, 10, 15],
        }
    },
    measure_key="rmse",
    override=False
)

Testing "KNN Baseline - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Building the top-N...
   > Fitting on the LOOCV...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
   > Fitting on the full set...
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'k': 20, 'min_k': 5, 'sim_options': {'name': 'pearson_baseline', 'user_based': False}, 'bsl_options': {'method': 'als', 'n_epochs': 15}}
RMSE: 1.830114
MAE: 1.100667

Hit rate: 2.3

### C.7 - Non-negative Matrix Factorization

In [14]:
evaluator_filtered.run_model(
    name="Non-negative Matrix Factorization - Filtered",
    model=surprise.NMF,
    hyper_params={
        "n_factors": [5, 15, 25],
        "n_epochs": [25, 50, 75],
        "biased": [True, False]
    },
    measure_key="rmse",
    override=False
)

Testing "Non-negative Matrix Factorization - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'n_factors': 5, 'n_epochs': 25, 'biased': True}
RMSE: 2.150025
MAE: 1.405496

Hit rate: 0.263852%
Hit rate per rating value:
Rating	Hit rate
8.0	1.111111%
Cumulative hit rate (min_rating=3.0): 0.335570%
Average reciprocal hit rank: 0.0002638522427440633
User coverage (num_users=379, min_rating=3.0): 96.042216%
Diversity: 0.059527
Novelty: 759.867948

Testing of the "Non-negative Matrix Factorization - Filtered" model successfully completed in 0:05:

### C.8 - Co-clustering

In [15]:
evaluator_filtered.run_model(
    name="Co-clustering - Filtered",
    model=surprise.CoClustering,
    hyper_params={
        "n_cltr_u": [1, 3, 5],
        "n_cltr_i": [1, 3, 5],
        "n_epochs": [10, 20, 30],
    },
    measure_key="rmse",
    override=False
)

Testing "Co-clustering - Filtered".
Running GridSearchCV...
Computing metrics...
Calculating the accuracy (RMSE, MAE)...
Building the top-N...
   > Fitting on the LOOCV...
   > Fitting on the full set...
Built top-N for each user (n=10, min_rating=3.0)
Built top-N for each user (n=10, min_rating=3.0)

Best params: {'n_cltr_u': 1, 'n_cltr_i': 1, 'n_epochs': 10}
RMSE: 2.034271
MAE: 1.326865

Hit rate: 0.263852%
Hit rate per rating value:
Rating	Hit rate
9.0	1.612903%
Cumulative hit rate (min_rating=3.0): 0.335570%
Average reciprocal hit rank: 0.00032981530343007914
User coverage (num_users=379, min_rating=3.0): 100.000000%
Diversity: 0.933333
Novelty: 1976.632591

Testing of the "Co-clustering - Filtered" model successfully completed in 0:11:14.688600.
Grid search: 0:10:55.369773


### C.9 - Comparing performance
The models show results similar to the previous run on unfiltered data. However, if we look closely at the metrics, most of the recommendation scores (hit rate, ARHR) are worse than before, even though RMSE and MAE improved. We already had a really low hit rate, and the filtered dataset doesn't help in this situation.

However, KNN Baseline is still the best model.

## E - ???
We didn't expect results as low as the ones we got in this notebook. Most of our previous tests with the MovieLens dataset gave much better results, even with the simplest model. We tried several things to improve the scores and none of them worked, but we will discuss them anyway.

The first runs used the original dataset. A major portion of this set consists of `-1`, a rating that indicates that the user gave no rating. This is still the method used in this notebook, and you can see that it doesn't work very well.

Maybe the negative ratings cause trouble during model training? The models could treat these ratings as very bad items that mustn't be recommended, and then recommend items with a better average rating that do not really fit what the user likes. For the next runs, we decided to filter out these ratings. We trained the models once more and, once again, they were pretty bad at giving relevant predictions. It was even worse than before.

Keeping the negative ratings gives a better result, but the trained models do not give predictions that fit the user's needs. Removing them reduces the amount of data, and some users are even entirely excluded from the prediction system. Why not try both at once? For the last runs, we kept every row but replaced the `-1` with the median of the range `(1, 10)`. The reasoning is that if a user didn't give a rating, the anime was neither bad enough nor good enough to be rated. Surprisingly, this was the worst of the three experiments: most models got a hit rate lower than one percent.

We're not really sure why we got those scores and, unfortunately, we ran out of time and ideas. For now, our best model is KNN Baseline, with a hit rate of three percent.
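
The replacement used in that last experiment can be sketched as follows. This is a plain-Python illustration on hypothetical toy ratings, not the code that produced those runs.

```python
from statistics import median

NO_RATING = -1  # value used in the dataset when the user gave no rating
neutral_rating = median(range(1, 11))  # midpoint of the 1-10 rating scale

# Hypothetical toy ratings for one user
toy_ratings = [8, NO_RATING, 10, NO_RATING, 3]
imputed = [neutral_rating if r == NO_RATING else r for r in toy_ratings]
print(imputed)
```

Every unrated entry becomes a neutral 5.5, so no row is lost, but the model now sees many identical mid-scale ratings.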

## D - Getting the Top-N
The final step is to display the top-N of a user. We start by loading the previously saved top-N of our best model. To compare our models a bit further, we load the best model of both the unfiltered and the filtered dataset.

In [6]:
top_n_knn_baseline = helpers.ml.Model.load_top_n(filepath=(MODELS_FOLDER / "KNN_Baseline__topN-full.pkl"))
top_n_filtered_knn_baseline = helpers.ml.Model.load_top_n(filepath=(MODELS_FOLDER / "KNN_Baseline_-_Filtered__topN-full.pkl"))

We then pick a random user from our dataset.

In [19]:
random_user_id = int(random.choice(list(set([r[0] for r in data.raw_ratings]))))
random_user_id

933

We define a function that builds a human-readable table from the top-N.

In [12]:
def get_top_n_of(user_id: int, top_n: dict[int, list], items_df: pandas.DataFrame, auto_print: bool = False) -> pandas.DataFrame:
    """ Returns the formatted top-N recommendation for a specific user. """
    user_top = []

    for top_item_id, estimated_rating, _ in top_n[user_id]:
        item = items_df[items_df["anime_id"] == top_item_id].iloc[0]
        user_top.append({
            "Name": item["name"],
            "Genre": item["genre"],
            "Num. ratings": item["num_ratings"],
            "Mean ratings": item["rating"],
            "User estimated rating": estimated_rating
        })

    return pandas.DataFrame(data=user_top)

In [20]:
print(f"{Style.BRIGHT}Top-N: unfiltered dataset{Style.RESET_ALL}")
get_top_n_of(user_id=random_user_id, top_n=top_n_knn_baseline, items_df=data_anime, auto_print=False)

Top-N: unfiltered dataset


| | Name | Genre | Num. ratings | Mean ratings | User estimated rating |
|---|---|---|---|---|---|
| 0 | Hunter x Hunter (2011) | Action, Adventure, Shounen, Super Power | 7477 | 9.13 | 10.0 |
| 1 | Gintama | Action, Comedy, Historical, Parody, Samurai, S... | 4264 | 9.04 | 9.786675 |
| 2 | Howl no Ugoku Shiro | Adventure, Drama, Fantasy, Romance | 14560 | 8.74 | 9.775792 |
| 3 | Monster | Drama, Horror, Mystery, Police, Psychological,... | 4079 | 8.72 | 9.774643 |
| 4 | Cowboy Bebop | Action, Adventure, Comedy, Drama, Sci-Fi, Space | 13449 | 8.82 | 9.762582 |
| 5 | Boku dake ga Inai Machi | Mystery, Psychological, Seinen, Supernatural | 7991 | 8.65 | 9.741796 |
| 6 | Shinsekai yori | Drama, Horror, Mystery, Sci-Fi, Supernatural | 5485 | 8.53 | 9.733932 |
| 7 | Gyakkyou Burai Kaiji: Ultimate Survivor | Game, Psychological, Seinen, Thriller | 2653 | 8.33 | 9.729091 |
| 8 | Gintama° | Action, Comedy, Historical, Parody, Samurai, S... | 1188 | 9.25 | 9.630482 |
| 9 | Kuroko no Basket 2nd Season | Comedy, School, Shounen, Sports | 6819 | 8.58 | 9.587015 |


In [21]:
print(f"{Style.BRIGHT}Top-N: filtered dataset{Style.RESET_ALL}")
get_top_n_of(user_id=random_user_id, top_n=top_n_filtered_knn_baseline, items_df=data_anime, auto_print=False)

Top-N: filtered dataset


KeyError: 933

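The first table is the top-10 of the randomly selected user on the unfiltered dataset. The `KeyError: 933` on the filtered dataset indicates that this user has no entry in the filtered top-N, presumably because the user is not part of the filtered sample.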

## E - Conclusion of the collaborative filtering
**TODO: Add text**