# A Brief Analysis of Singular and Array Runs
*Sean Steinle*

This notebook loads the results of the singular and array runs, then compares the number of results and the mean metrics for each.

In [31]:
#imports
import os
import sys
import pandas as pd
from numpy import mean

In [27]:
def load_results(results_dir: str) -> pd.DataFrame:
    """Load every .csv file in results_dir into a single dataframe."""
    dfs = []
    for results_file in os.listdir(results_dir):
        if not results_file.endswith(".csv"):
            continue  # skip anything that is not a csv
        # os.path.join works whether or not results_dir ends with a slash
        dfs.append(pd.read_csv(os.path.join(results_dir, results_file)))
    if not dfs:
        # raise instead of exiting with status 0, so the failure is visible in the notebook
        raise FileNotFoundError(f"no .csv files found in {results_dir}")
    # pd.concat also covers the single-file case
    return pd.concat(dfs)
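The tables below show an `Unnamed: 0` column whose values restart partway through the array results. A minimal sketch of one likely cause, assuming the result files were written with `DataFrame.to_csv` and its default `index=True` (the writing code is not shown in this notebook, so this is an assumption): each file's row index is saved as an unnamed first column, and `pd.concat` keeps every file's own index rather than renumbering. The helper `demo_concat` and the values in it are synthetic placeholders, not real results.

```python
import tempfile
from pathlib import Path

import pandas as pd


def demo_concat(n_files=2, rows_per_file=3):
    """Write small synthetic result files, then read and concatenate them."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            part = pd.DataFrame({
                "model_index": [0] * rows_per_file,
                "rmspe": [1.0] * rows_per_file,  # placeholder values, not real results
                "tau": [1.2] * rows_per_file,
            })
            # Default index=True writes the row index as an unnamed first column.
            part.to_csv(Path(tmp) / f"part_{i}.csv")
        dfs = [pd.read_csv(p) for p in sorted(Path(tmp).glob("*.csv"))]
    # No ignore_index, so each file keeps its own 0..n-1 index.
    return pd.concat(dfs)


demo = demo_concat()
print(demo.columns.tolist())        # ['Unnamed: 0', 'model_index', 'rmspe', 'tau']
print(demo["Unnamed: 0"].tolist())  # [0, 1, 2, 0, 1, 2]
```

If that assumption holds, passing `index_col=0` to `pd.read_csv` or `ignore_index=True` to `pd.concat` would give a single clean index, at the cost of changing how the tables below are displayed.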

In [28]:
single_df = load_results("outputs/singular_runs/")
array_df = load_results("outputs/array_runs/")

In [29]:
single_df

| Unnamed: 0 | model_index | rmspe | tau |
|---|---|---|---|
| 0 | 0 | 0.964382 | 1.244099 |
| 1 | 0 | 1.112011 | 1.250929 |
| 2 | 0 | 0.942991 | 1.233294 |
| 3 | 0 | 1.000811 | 1.209918 |
| 4 | 0 | 0.901193 | 1.241517 |
| ... | ... | ... | ... |
| 1495 | 2 | 1.039830 | 1.269568 |
| 1496 | 2 | 1.045319 | 1.253914 |
| 1497 | 2 | 1.058284 | 1.241548 |
| 1498 | 2 | 1.075280 | 1.265534 |


For the singular runs, we have 10 folds × 50 repetitions = 500 runs per model. With 3 models, that gives 1500 (rmspe, tau) pairs.

In [30]:
array_df

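| Unnamed: 0 | model_index | rmspe | tau |
|---|---|---|---|
| 0 | 0 | 0.951515 | 1.224623 |
| 1 | 0 | 0.960172 | 1.208036 |
| 2 | 0 | 0.949753 | 1.228807 |
| 3 | 0 | 0.970651 | 1.235272 |
| 4 | 0 | 1.068974 | 1.233524 |
| ... | ... | ... | ... |
| 355 | 2 | 1.047568 | 1.251275 |
| 356 | 2 | 1.082346 | 1.262900 |
| 357 | 2 | 1.015826 | 1.262833 |
| 358 | 2 | 1.046323 | 1.258919 |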


For the array runs, we have 10 folds × 12 inner repetitions = 120 runs per model for each outer repetition. With 10 outer repetitions, that is 1200 runs per model, and with 3 models it gives 3600 (rmspe, tau) pairs.
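The row counts for both designs can be checked rather than worked out by hand. The following is a minimal sketch: `expected_pairs` and `check_counts` are hypothetical helpers introduced here for illustration, and they assume each row of a results dataframe is one (rmspe, tau) pair with a `model_index` column, as in the tables above.

```python
import pandas as pd


def expected_pairs(folds, inner_reps, models, outer_reps=1):
    """Number of (rmspe, tau) rows implied by a run design."""
    return folds * inner_reps * outer_reps * models


def check_counts(df, expected_total, n_models=3):
    """Raise if the dataframe does not hold the expected number of rows per model."""
    per_model = df.groupby("model_index").size()
    if len(df) != expected_total:
        raise ValueError(f"expected {expected_total} rows, found {len(df)}")
    if not (per_model == expected_total // n_models).all():
        raise ValueError(f"uneven rows per model: {per_model.to_dict()}")
    return per_model


singular_expected = expected_pairs(folds=10, inner_reps=50, models=3)
array_expected = expected_pairs(folds=10, inner_reps=12, models=3, outer_reps=10)
print(singular_expected, array_expected)  # 1500 3600
```

With the dataframes loaded earlier, `check_counts(single_df, singular_expected)` and `check_counts(array_df, array_expected)` would confirm the counts, or raise if a results file is missing or incomplete.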

In [34]:
print(f"Singular RMSPE: {mean(single_df['rmspe'])}\tSingular Tau: {mean(single_df['tau'])}")
print(f"Array RMSPE: {mean(array_df['rmspe'])}\tArray Tau: {mean(array_df['tau'])}")

Singular RMSPE: 0.9958933571836872	Singular Tau: 1.2404949454792338
Array RMSPE: 0.9958694653130755	Array Tau: 1.2403465511859193


Looks like we get very similar results: the mean RMSPE values differ by about 2.4 × 10^-5, and the mean tau values differ by about 1.5 × 10^-4.
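To put a number on "similar", the gaps between the printed means can be computed directly. This is a small sketch: the two gap calculations use the values printed above, while `compare_means` is a hypothetical helper that would do the same for any two results dataframes with `rmspe` and `tau` columns.

```python
import pandas as pd


def compare_means(a, b, metrics=("rmspe", "tau")):
    """Per-metric means of two result frames and their absolute difference."""
    cols = list(metrics)
    out = pd.DataFrame({"singular": a[cols].mean(), "array": b[cols].mean()})
    out["abs_diff"] = (out["singular"] - out["array"]).abs()
    return out


# Gaps between the means printed in the cell above
rmspe_gap = abs(0.9958933571836872 - 0.9958694653130755)
tau_gap = abs(1.2404949454792338 - 1.2403465511859193)
print(f"{rmspe_gap:.1e} {tau_gap:.1e}")  # 2.4e-05 1.5e-04
```

Calling `compare_means(single_df, array_df)` would give the same comparison as a table. Note that this only compares overall means; it says nothing about the spread of the results or about differences within individual models.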