# Benchmark with the MovieLens dataset
This notebook does not aim to produce comprehensive benchmarking results across multiple datasets. Rather, it evaluates the recommender algorithms in this repository (SVD, LightGCN, Transformer, and our algorithm) on MovieLens.

* Datasets
  * [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/).
  * [MovieLens 1M](https://grouplens.org/datasets/movielens/1m/).

* Data split
  * TODO
  

* Evaluation metrics
  * Ranking metrics:
    * Precision@k.
    * Recall@k.
    * Normalized discounted cumulative gain@k (NDCG@k).
    * Mean average precision (MAP).
  * Rating metrics:
    * Root mean squared error (RMSE).
    * Mean absolute error (MAE).
    * R squared.
    * Explained variance.
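
The ranking metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the evaluation code used by this repository: relevance is binary, the recommended list contains no duplicates, and average precision is normalized by `min(k, number of relevant items)`, which is one common convention among several. The function names and the toy item IDs are hypothetical.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_precision_at_k(recommended, relevant, k):
    """Mean of precision values at each rank where a relevant item occurs.

    Assumption: normalized by min(k, len(relevant)).
    """
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    denom = min(k, len(relevant))
    return score / denom if denom else 0.0


# Toy data for a single user (made up for illustration).
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}
k = 5

print("Precision@k:", precision_at_k(recommended, relevant, k))
print("Recall@k:", recall_at_k(recommended, relevant, k))
print("nDCG@k:", ndcg_at_k(recommended, relevant, k))
print("AP@k:", average_precision_at_k(recommended, relevant, k))
```

MAP is then the mean of the per-user average precision values; the other metrics are likewise averaged over users.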

In [1]:
!pip install torch



In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import warnings
warnings.filterwarnings("ignore")
import logging
logging.basicConfig(level=logging.ERROR)

import numpy as np
import pandas as pd

from train_and_evaluate import svd_model_train_and_evaluate, lgcn_model_train_and_evaluate

In [4]:
def generate_summary(data, algo, k, rating_metrics, ranking_metrics):
    """Build one result row for the benchmark table.

    Pass None for rating_metrics or ranking_metrics when a model does not
    produce that kind of metric; the missing columns are filled with NaN.
    """
    summary = {"Data": data, "Algo": algo, "K": k}
    if rating_metrics is None:
        rating_metrics = {
            "RMSE": np.nan,
            "MAE": np.nan,
            "R2": np.nan,
            "Explained Variance": np.nan,
        }
    if ranking_metrics is None:
        ranking_metrics = {
            "MAP": np.nan,
            "nDCG@k": np.nan,
            "Precision@k": np.nan,
            "Recall@k": np.nan,
        }
    summary.update(rating_metrics)
    summary.update(ranking_metrics)
    return summary
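
As a usage sketch, the snippet below shows what `generate_summary` returns when a model reports ranking metrics only. The function is repeated in condensed form so the snippet runs on its own; the algorithm name `example-ranker` and the metric values are placeholders, not benchmark results.

```python
import math

import numpy as np

RATING_KEYS = ["RMSE", "MAE", "R2", "Explained Variance"]
RANKING_KEYS = ["MAP", "nDCG@k", "Precision@k", "Recall@k"]


def generate_summary(data, algo, k, rating_metrics, ranking_metrics):
    # Condensed copy of the function above: None means "fill with NaN".
    summary = {"Data": data, "Algo": algo, "K": k}
    if rating_metrics is None:
        rating_metrics = dict.fromkeys(RATING_KEYS, np.nan)
    if ranking_metrics is None:
        ranking_metrics = dict.fromkeys(RANKING_KEYS, np.nan)
    summary.update(rating_metrics)
    summary.update(ranking_metrics)
    return summary


# Placeholder values for a hypothetical ranking-only model.
ranking_only = {"MAP": 0.1, "nDCG@k": 0.2, "Precision@k": 0.3, "Recall@k": 0.4}
row = generate_summary("100k", "example-ranker", 10, None, ranking_only)

# The rating columns are present but NaN, so every row has the same keys.
print(row)
print(math.isnan(row["RMSE"]))
```

Filling the missing metrics with NaN keeps every row aligned with the same set of columns, so models that report only one metric family can share a results table with those that report both.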

In [6]:
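def benchmark_recommenders():
    """Train and evaluate every algorithm on every dataset size; return one row per run."""
    cols = ["Data", "Algo", "K", "RMSE", "MAE", "R2", "Explained Variance", "MAP", "nDCG@k", "Precision@k", "Recall@k"]
    df_results = pd.DataFrame(columns=cols)
    sizes = ["100k"]
    algos = ["svd", "lgcn"]
    # Map each algorithm name to its train-and-evaluate function.
    models = {"svd": svd_model_train_and_evaluate, "lgcn": lgcn_model_train_and_evaluate}
    for size in sizes:
        for algo in algos:
            # Each function takes the dataset size and returns (rating metrics, ranking metrics).
            ratings, rankings = models[algo](size)
            summary = generate_summary(size, algo, 10, ratings, rankings)
            # Append the summary as a new row with a 1-based index.
            df_results.loc[df_results.shape[0] + 1] = summary
    return df_results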



In [9]:
df_results = benchmark_recommenders()
df_results

   user  item  rating  timestamp
0   196   242       3  881250949
1   186   302       3  891717742
2    22   377       1  878887116
3   244    51       2  880606923
4   166   346       1  886397596


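|   | Data | Algo | K | RMSE | MAE | R2 | Explained Variance | MAP | nDCG@k | Precision@k | Recall@k |
|---|------|------|---|------|-----|----|--------------------|-----|--------|-------------|----------|
| 1 | 100k | svd | 10 | 0.948037 | 0.745077 | 0.289146 | 0.289196 | 0.014355 | 0.104999 | 0.096603 | 0.034853 |
| 2 | 100k | lgcn | 10 | 3.372332 | 3.181291 | -8.035794 | 0.005029 | 0.0 | 0.0 | 0.0 | 0.0 |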
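
R2 and explained variance can diverge sharply, as they do in the lgcn row above. The sketch below uses made-up ratings (not the benchmark data) to show one way this happens: adding a constant offset to the predictions lowers R2 but leaves explained variance unchanged, because explained variance ignores the mean of the residuals. The helper `rating_metrics` is hypothetical and follows the usual textbook definitions; whether an offset is what drives the lgcn numbers is not established here.

```python
import numpy as np


def rating_metrics(y_true, y_pred):
    """Compute RMSE, MAE, R2 and explained variance from paired ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = np.mean(residual ** 2)
    total_var = np.var(y_true)
    return {
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(residual))),
        # R2 compares the full squared error with the variance of the targets.
        "R2": float(1.0 - mse / total_var),
        # Explained variance uses the variance of the residuals, so a
        # constant bias in the predictions does not affect it.
        "Explained Variance": float(1.0 - np.var(residual) / total_var),
    }


# Made-up ratings and predictions for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.5, 2.5, 3.5, 4.5]

calibrated = rating_metrics(y_true, y_pred)
# Same predictions shifted down by a constant of 3 rating points.
shifted = rating_metrics(y_true, [p - 3.0 for p in y_pred])

print("calibrated:", calibrated)
print("shifted:   ", shifted)
```

In this toy case the shifted predictions keep the same explained variance while R2 turns negative, so a gap between the two columns is a cue to check whether the predicted scores are on the same scale as the ratings.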
