# Experiment Result Reproduction

This notebook provides a single, centralized procedure for reproducing the experiments presented in the project report.

* Datasets
  * [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/)

* Data split
  * 80% of users have 75% of their ratings in the train set and the remaining 25% in the test set
  * 20% of users have 25% of their ratings in the train set and the remaining 75% in the test set (cold-start users)
  * Every user and item node is guaranteed to appear in the graph built from the train set

* Evaluation metrics
  * Precision@k
  * Recall@k
  * Normalized discounted cumulative gain (NDCG@k)
  * Mean average precision (MAP)
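
The per-user split described in the list above can be sketched as follows. This is a minimal illustration and not the code in `utils.data_split`: the function name `split_per_user`, its parameters, the random choice of cold-start users, and the way items missing from the train set are moved back into it are all assumptions. Only the `user` and `item` column names are taken from the notebook.

```python
import numpy as np
import pandas as pd


def split_per_user(ratings, cold_user_frac=0.2, warm_train_frac=0.75,
                   cold_train_frac=0.25, seed=0):
    """Hypothetical per-user split with a cold-start subset of users."""
    rng = np.random.default_rng(seed)
    users = ratings["user"].unique()
    n_cold = int(round(cold_user_frac * len(users)))
    # Assumption: cold-start users are drawn uniformly at random.
    cold_users = set(rng.choice(users, size=n_cold, replace=False).tolist())

    train_parts, test_parts = [], []
    for user, group in ratings.groupby("user"):
        frac = cold_train_frac if user in cold_users else warm_train_frac
        # Keep at least one rating in train so every user node exists there.
        n_train = max(1, int(round(frac * len(group))))
        shuffled = group.sample(frac=1.0, random_state=int(rng.integers(1 << 31)))
        train_parts.append(shuffled.iloc[:n_train])
        test_parts.append(shuffled.iloc[n_train:])

    train = pd.concat(train_parts, ignore_index=True)
    test = pd.concat(test_parts, ignore_index=True)

    # Assumption: items that ended up only in test are moved into train,
    # so every item node also appears in the train graph.
    unseen = ~test["item"].isin(train["item"])
    train = pd.concat([train, test[unseen]], ignore_index=True)
    test = test[~unseen].reset_index(drop=True)
    return train, test, cold_users
```

Calling `split_per_user(data)` on a frame with `user`, `item`, and `rating` columns returns the train frame, the test frame, and the set of users treated as cold-start. Moving unseen items into the train set slightly raises the train share for the affected users, which is one possible trade-off for keeping the graph complete.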

In [13]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [14]:
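import logging
import warnings

# Silence library warnings and low-severity log output to keep the cells readable.
warnings.filterwarnings("ignore")
logging.basicConfig(level=logging.ERROR)

import numpy as np
import pandas as pd
from utils.dataloader import DataLoader
from utils.data_split import train_test_split
# Explicit imports instead of `import *` make the origin of each trainer clear.
from train_and_evaluate import (
    svd_model_train_and_evaluate,
    lgcn_model_train_and_evaluate_1layer,
    lgcn_model_train_and_evaluate_3layer,
    lgcn_model_train_and_evaluate_2_1layer,
    lgcn_model_train_and_evaluate_2_3layer,
)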

In [15]:
def generate_summary(data, algo, k, rating_metrics, ranking_metrics):
    """Build one result row; a metric group passed as None is filled with NaN."""
    summary = {"Data": data, "Algo": algo, "K": k}
    if rating_metrics is None:
        rating_metrics = {
            "RMSE": np.nan,
            "MAE": np.nan,
            "R2": np.nan,
            "Explained Variance": np.nan,
        }
    if ranking_metrics is None:
        ranking_metrics = {
            "MAP": np.nan,
            "nDCG@k": np.nan,
            "Precision@k": np.nan,
            "Recall@k": np.nan,
        }
    summary.update(rating_metrics)
    summary.update(ranking_metrics)
    return summary

In [16]:
def benchmark_recommenders():
    """Train and evaluate every model on each dataset size and collect the metrics."""
    cols = ["Data", "Algo", "K", "nDCG@k", "Precision@k", "Recall@k"]
    df_results = pd.DataFrame(columns=cols)
    sizes = ["100k"]
    top_k = 10  # cutoff recorded in the "K" column
    # Maps each algorithm's display name to its train-and-evaluate function.
    models = {
        "SVD": svd_model_train_and_evaluate,
        "Light GCN_l1": lgcn_model_train_and_evaluate_1layer,
        "Light GCN_l3": lgcn_model_train_and_evaluate_3layer,
        "Ours_l1": lgcn_model_train_and_evaluate_2_1layer,
        "Ours_l3": lgcn_model_train_and_evaluate_2_3layer,
    }
    for size in sizes:
        movie_data = DataLoader(size=size)

        # Load the rating data and keep only the columns the models need.
        data = movie_data.load_ratings()
        data = data[["user", "item", "rating"]]
        train_list, test_list = train_test_split(data)

        for algo, run_model in models.items():
            ratings, rankings = run_model(train_set=train_list, test_set=test_list)
            summary = generate_summary(size, algo, top_k, ratings, rankings)
            # Append the summary as the next row (the index starts at 1).
            df_results.loc[df_results.shape[0] + 1] = summary
    return df_results
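
Since the real training functions can take several minutes each, it may help to check the result-collection logic with stand-in models first. The sketch below assumes a hypothetical `stub_model` that returns the same `(rating_metrics, ranking_metrics)` pair shape as the functions used above, and a hypothetical `collect_results` helper that gathers rows in a list before building the frame. Neither name exists in the project code.

```python
import pandas as pd


def stub_model(train_set, test_set):
    """Hypothetical stand-in for a *_train_and_evaluate function."""
    # Placeholder values, not real results.
    ranking_metrics = {"MAP": 0.2, "nDCG@k": 0.5, "Precision@k": 0.4, "Recall@k": 0.3}
    return None, ranking_metrics


def collect_results(models, train_set, test_set, size="100k", k=10):
    """Run every model once and return one row per algorithm."""
    cols = ["Data", "Algo", "K", "nDCG@k", "Precision@k", "Recall@k"]
    rows = []
    for algo, run_model in models.items():
        _, rankings = run_model(train_set=train_set, test_set=test_set)
        rows.append({"Data": size, "Algo": algo, "K": k, **rankings})
    # `columns=cols` keeps only the listed keys, so "MAP" is left out here.
    return pd.DataFrame(rows, columns=cols)


dry_run = collect_results({"A": stub_model, "B": stub_model},
                          train_set=None, test_set=None)
```

Building the frame once from a list of dictionaries avoids growing it row by row, which is a design choice rather than a requirement for a benchmark of this size.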



In [17]:
df_results = benchmark_recommenders()
df_results

100%|██████████| 1000/1000 [15:39<00:00,  1.06it/s]
100%|██████████| 1000/1000 [14:22<00:00,  1.16it/s]
100%|██████████| 1000/1000 [09:34<00:00,  1.74it/s]
100%|██████████| 1000/1000 [09:09<00:00,  1.82it/s]


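|   | Data | Algo | K | nDCG@k | Precision@k | Recall@k |
|---|------|------|---|--------|-------------|----------|
| 1 | 100k | SVD | 10 | 0.107541 | 0.097667 | 0.034944 |
| 2 | 100k | Light GCN_l1 | 10 | 0.438631 | 0.3772 | 0.208013 |
| 3 | 100k | Light GCN_l3 | 10 | 0.398978 | 0.347296 | 0.184681 |
| 4 | 100k | Ours_l1 | 10 | 0.3382 | 0.293849 | 0.153594 |
| 5 | 100k | Ours_l3 | 10 | 0.342988 | 0.303181 | 0.155875 |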
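
For reference, the three ranking columns in the table can be computed for a single user as sketched below, using binary relevance (an item counts as relevant if it appears in that user's test set). The function names are hypothetical, and the conventions chosen here (precision divided by `k`, a log2 discount for nDCG) are common ones that may differ from what `train_and_evaluate` uses. Reported values are typically averages of these per-user scores.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Share of the relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: two of the top three recommendations are relevant.
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"a", "c", "e"}
scores = (
    precision_at_k(ranked_items, relevant_items, 3),
    recall_at_k(ranked_items, relevant_items, 3),
    ndcg_at_k(ranked_items, relevant_items, 3),
)
```

In the toy example, precision and recall coincide because the cutoff `k` equals the number of relevant items. nDCG is lower than it would be for a perfect list because the second hit sits at rank 3 and one relevant item is missing from the top three.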
