# Summarize the different split approaches for all datasets

- [ ] Read in all datasets
- [ ] Bring into one table
- [ ] Format the output

In [1]:
# imports
import pandas as pd
import helpers_summarize

## What is the average rank of each approach, and its standard deviation, with **time-sorted** users?

In [2]:
# load data
df = helpers_summarize.load_approach_tables(path='../../results/tables/approaches/sorted_users')

# summarize and print
results = helpers_summarize.prepare_results(df)
results

Unnamed: 0,average_rank,average_rank_std
time_cut,2.29,1.5
bl_user_based_last,3.29,1.7
bl_user_based_all,3.57,2.37
user_cut,3.57,1.72
average_user,3.86,0.69
user_wise,4.33,2.07
bl_assessment_based_last,6.86,0.69
bl_assessment_based_all,7.71,0.49
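
The internals of `helpers_summarize.prepare_results` are not shown here. As a rough sketch of how an average rank and its standard deviation could be derived, assuming one score per approach and dataset where lower is better (the scores, dataset names, and the ranking direction below are hypothetical, not the values behind the table above):

```python
import pandas as pd

# Hypothetical scores: rows are datasets, columns are approaches (lower is better).
scores = pd.DataFrame(
    {
        "time_cut": [0.10, 0.20, 0.15],
        "user_cut": [0.20, 0.10, 0.30],
        "user_wise": [0.30, 0.30, 0.20],
    },
    index=["dataset_a", "dataset_b", "dataset_c"],
)

# Rank the approaches within each dataset (1 = best); ties share the average rank.
ranks = scores.rank(axis=1, method="average")

# Aggregate over datasets: mean rank and its sample standard deviation.
summary = pd.DataFrame(
    {
        "average_rank": ranks.mean(axis=0).round(2),
        "average_rank_std": ranks.std(axis=0).round(2),
    }
).sort_values("average_rank")
print(summary)
```

A small standard deviation then means that an approach holds a similar rank on every dataset, while a large one means its rank varies strongly between datasets.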


The approaches that do not take users into account lead to an overestimation of the classifier's performance on the test set. That is, performance is overestimated if we allow the same users to be present in both the train and the test set.
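
To make the difference concrete, the following minimal sketch contrasts a split that ignores users with one that keeps each user on a single side. The data frame, the column names `user_id` and `value`, and the split sizes are hypothetical and only serve as an illustration; this is not the pipeline used for the results above.

```python
import numpy as np
import pandas as pd

# Hypothetical assessments: 5 users with 4 assessments each.
data = pd.DataFrame({"user_id": np.repeat(np.arange(5), 4), "value": np.arange(20)})

rng = np.random.default_rng(0)

# Assessment-based split: rows are drawn at random, ignoring who produced them.
test_mask = np.zeros(len(data), dtype=bool)
test_mask[rng.choice(len(data), size=10, replace=False)] = True
shared_users = set(data.loc[test_mask, "user_id"]) & set(data.loc[~test_mask, "user_id"])

# User-based split: every user ends up entirely in train or entirely in test.
test_users = rng.choice(data["user_id"].unique(), size=2, replace=False)
in_test = data["user_id"].isin(test_users)
shared_users_grouped = set(data.loc[in_test, "user_id"]) & set(data.loc[~in_test, "user_id"])

# Number of users present on both sides for each kind of split.
print(len(shared_users), len(shared_users_grouped))
```

In the assessment-based split at least one user contributes rows to both sides, so the classifier is tested on users it has already seen; the user-based split has no such overlap by construction.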

## What is the average rank of each approach, and its standard deviation, with **randomly-drawn** users?

Method: The whole ML pipeline for 9 datasets with 8 approaches each was repeated 5 times. Each time, a different seed was used to randomly draw the train and test users.
The overall question is: Do the approach rankings change if users are moved from the test set to the train set and vice versa?
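
How the users were drawn is not part of this notebook. A seeded draw could look roughly like the sketch below, where the helper `draw_test_users`, the test fraction of 0.2, and the 100 user ids are all assumptions made for illustration:

```python
import numpy as np

def draw_test_users(user_ids, seed, test_fraction=0.2):
    """Hypothetical helper: pick test users at random, reproducibly for a given seed."""
    rng = np.random.default_rng(seed)
    users = np.unique(user_ids)
    n_test = max(1, int(round(test_fraction * len(users))))
    return set(rng.choice(users, size=n_test, replace=False).tolist())

# Hypothetical user ids; the seeds are the ones used in this notebook.
user_ids = np.arange(100)
seeds = [1962, 1964, 1991, 1994, 2023]
draws = {seed: draw_test_users(user_ids, seed) for seed in seeds}

for seed, test_users in draws.items():
    print(seed, len(test_users))
```

Fixing the seed makes each repetition reproducible, while changing it assigns a different set of users to the test side.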

In [3]:
seeds = [1962, 1964, 1991, 1994, 2023]
results_random = pd.DataFrame()
for seed in seeds:
    # load data
    df = helpers_summarize.load_approach_tables(path=f'../../results/tables/approaches/random_users/seed_{seed}')

    # summarize and print
    res = helpers_summarize.prepare_results(df)
    
    results_random[f'seed_{seed}'] = res['average_rank'] 

# aggregate over the seed columns only, so that the mean column does not enter the std
seed_cols = [f'seed_{seed}' for seed in seeds]
results_random['mean_ranking'] = results_random[seed_cols].mean(axis=1)
results_random['std_ranking'] = results_random[seed_cols].std(axis=1)

results_random

Unnamed: 0,seed_1962,seed_1964,seed_1991,seed_1994,seed_2023,mean_ranking,std_ranking
time_cut,1.57,1.57,1.43,1.86,1.43,1.572,0.175556
user_cut,3.14,3.14,3.14,2.86,3.0,3.056,0.12522
bl_user_based_last,3.29,3.43,3.43,3.29,3.86,3.46,0.234307
average_user,3.43,3.71,3.86,3.71,2.86,3.514,0.397278
bl_user_based_all,4.29,4.29,4.57,4.29,4.71,4.43,0.19799
user_wise,5.67,5.0,4.5,5.17,5.17,5.102,0.419726
bl_assessment_based_last,6.86,6.57,6.86,6.86,6.86,6.802,0.129692
bl_assessment_based_all,7.43,7.86,7.71,7.57,7.71,7.656,0.162727
