# Results - Classification of simulated events
This notebook is the primary source of plots and tables for the classification part of the thesis.
Generating them in one place keeps every table and figure as standardized as possible (and who has the time to update
90 tables one by one anyway?).

## Questions
* Descriptive statistics
    - Should descriptive statistics of the simulated data be included?\
    If so, how much? And should it be included for each fold in the k-fold cross-validation?
* Classification results
    - Should the results be broken down by event type (single, double, close double)?\
    This seems reasonable to include, since it would confirm the assumption that close doubles are the
    most difficult event type to classify correctly in simulated data.
    The random state is included in the experiment config, so the fold indices should be simple to reproduce.
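
A breakdown of this kind could be sketched as below. This is only an illustration: the toy labels, the encoding (0 for single events, 1 for double events) and the helper name `accuracy_by_event_type` are assumptions, and the real notebook would use the per-fold validation indices and model predictions instead.

```python
from collections import defaultdict


def accuracy_by_event_type(event_types, y_true, y_pred):
    """Return {event_type: accuracy} for three parallel sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for kind, truth, pred in zip(event_types, y_true, y_pred):
        total[kind] += 1
        correct[kind] += int(truth == pred)
    return {kind: correct[kind] / total[kind] for kind in total}


# Hypothetical toy data (assumed encoding: 0 = single event, 1 = double event)
event_types = ["single", "single", "double", "double", "close double", "close double"]
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
per_type = accuracy_by_event_type(event_types, y_true, y_pred)
```

If close doubles really are the hardest case, their entry in `per_type` should come out clearly lower than the other two on the real data.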


## TODO
* Add dense network experiment
* Add CNN experiment
* Add pretrained experiment
* Combine the aggregated stats (means, stds etc.) into one df.
* Output aggregated df to tex in thesis repo
* Implement reproducing the validation indices for each fold based on the random seed from config
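
The last item could look roughly like the sketch below. It assumes the folds were created with scikit-learn's `StratifiedKFold` with shuffling enabled; the splitter actually used by the experiments, the seed value and the helper name `validation_indices` are assumptions here. The indices only match the original folds if the same splitter, seed and sample order are used.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def validation_indices(labels, n_splits, seed):
    """Return the validation index array of every fold for a given seed."""
    splitter = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Only the labels determine the split, so a placeholder stands in for the features
    placeholder = np.zeros(len(labels))
    return [val_idx for _, val_idx in splitter.split(placeholder, labels)]


labels = np.array([0, 1] * 10)  # hypothetical balanced labels
folds_a = validation_indices(labels, n_splits=5, seed=42)
folds_b = validation_indices(labels, n_splits=5, seed=42)
```

Calling the helper twice with the same seed gives identical folds, which is the property the TODO item relies on.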

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
from master_scripts.data_functions import get_git_root
from master_scripts.analysis_functions import load_experiment, experiment_metrics_to_df
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Pre-processed simulated data - no additional modifications
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
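
As a reminder of what min-max normalization does, here is a minimal sketch. It scales by the global minimum and maximum of the whole array; whether the thesis pipeline scales globally or per image is not stated here, so that choice, the helper name and the random toy batch are all assumptions.

```python
import numpy as np


def min_max_normalize(images):
    """Scale an array to [0, 1] using its global minimum and maximum."""
    images = np.asarray(images, dtype=float)
    lo, hi = images.min(), images.max()
    if hi == lo:
        # A constant array has no range to scale; map it to zeros
        return np.zeros_like(images)
    return (images - lo) / (hi - lo)


# Hypothetical batch of two 16x16 detector images
rng = np.random.default_rng(0)
batch = rng.uniform(0.0, 500.0, size=(2, 16, 16))
normalized = min_max_normalize(batch)
```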

## Logistic regression

In [2]:
# Load logistic regression experiment
log_ex_id = "f0f36fe4060f"
log_ex = load_experiment(log_ex_id)
#log_model = tf.keras.models.load_model(repo_root + "models/" + log_ex_id + ".h5")

log_metrics = experiment_metrics_to_df(log_ex)
display(log_metrics)
# Mean and sample standard deviation of each metric across the folds
display(log_metrics.agg(['mean', 'std']).applymap('{:.3f}'.format))

| fold | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| fold_0 | 0.732661 | 0.732133 | 0.465324 | 0.832972 | 139580 | 50431 | 51158 | 138831 |
| fold_1 | 0.732082 | 0.730995 | 0.464178 | 0.832152 | 139863 | 50149 | 51660 | 138328 |
| fold_2 | 0.7322 | 0.731559 | 0.464405 | 0.832358 | 139572 | 50440 | 51324 | 138664 |
| fold_3 | 0.733416 | 0.733301 | 0.466832 | 0.833482 | 139431 | 50581 | 50721 | 139267 |
| fold_4 | 0.732866 | 0.731449 | 0.465757 | 0.83187 | 140247 | 49765 | 51746 | 138242 |


| statistic | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| mean | 0.733 | 0.732 | 0.465 | 0.833 | 139738.6 | 50273.2 | 51321.8 | 138666.4 |
| std | 0.001 | 0.001 | 0.001 | 0.001 | 324.605 | 324.483 | 412.953 | 413.052 |
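
The accuracy, F1 and MCC columns follow directly from the confusion-matrix counts, which gives a quick sanity check of the table. The sketch below recomputes them for fold_0 of the logistic regression experiment using the standard binary definitions; the helper name is an assumption. ROC AUC cannot be recovered this way, since it needs the predicted scores and not just the thresholded counts.

```python
import math


def metrics_from_confusion(tn, fp, fn, tp):
    """Accuracy, F1 (positive class) and Matthews correlation from raw counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f1, mcc


# Counts taken from the fold_0 row of the logistic regression results
accuracy, f1, mcc = metrics_from_confusion(tn=139580, fp=50431, fn=51158, tp=138831)
```

The recomputed values should agree with the reported 0.732661, 0.732133 and 0.465324 up to rounding.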


## Small dense network

In [3]:
# Load small dense network experiment
dense_ex_id = "907f81f926e3"
dense_ex = load_experiment(dense_ex_id)
#dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5")

dense_metrics = experiment_metrics_to_df(dense_ex)
display(dense_metrics)
# Mean and sample standard deviation of each metric across the folds
display(dense_metrics.agg(['mean', 'std']).applymap('{:.3f}'.format))

| fold | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| fold_0 | 0.931289 | 0.927141 | 0.868215 | 0.964995 | 187764 | 2247 | 23863 | 166126 |
| fold_1 | 0.937963 | 0.934603 | 0.880576 | 0.966941 | 187975 | 2037 | 21537 | 168451 |
| fold_2 | 0.941774 | 0.9389 | 0.887471 | 0.9687 | 187873 | 2139 | 19987 | 170001 |
| fold_3 | 0.943555 | 0.940726 | 0.89117 | 0.969604 | 188346 | 1666 | 19783 | 170205 |
| fold_4 | 0.943608 | 0.940808 | 0.891203 | 0.9694 | 188274 | 1738 | 19691 | 170297 |


| statistic | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| mean | 0.94 | 0.936 | 0.884 | 0.968 | 188046.4 | 1965.4 | 20972.2 | 169016.0 |
| std | 0.005 | 0.006 | 0.01 | 0.002 | 253.218 | 252.939 | 1781.994 | 1781.589 |


## Small CNN

In [6]:
# Load small CNN experiment
cnn_ex_id = "660ec692e6d1"
cnn_ex = load_experiment(cnn_ex_id)
#cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5")

cnn_metrics = experiment_metrics_to_df(cnn_ex)
display(cnn_metrics)
# Mean and sample standard deviation of each metric across the folds
display(cnn_metrics.agg(['mean', 'std']).applymap('{:.3f}'.format))

| fold | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| fold_0 | 0.951995 | 0.94973 | 0.907673 | 0.983271 | 189440 | 571 | 17671 | 172318 |
| fold_1 | 0.955653 | 0.953743 | 0.914418 | 0.985114 | 189418 | 594 | 16258 | 173730 |
| fold_2 | 0.960218 | 0.958736 | 0.922812 | 0.985986 | 189266 | 746 | 14371 | 175617 |
| fold_3 | 0.962647 | 0.961319 | 0.927475 | 0.987063 | 189426 | 586 | 13608 | 176380 |
| fold_4 | 0.964474 | 0.963274 | 0.930928 | 0.987501 | 189456 | 556 | 12944 | 177044 |


| statistic | accuracy_score | f1_score | matthews_corrcoef | roc_auc_score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|
| mean | 0.959 | 0.957 | 0.921 | 0.986 | 189401.2 | 610.6 | 14970.4 | 175017.8 |
| std | 0.005 | 0.006 | 0.01 | 0.002 | 76.949 | 77.077 | 1954.026 | 1953.68 |
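
One possible way to combine the aggregated statistics of the three models into a single DataFrame is `pd.concat` with a dict of per-model summaries. The sketch below hard-codes the mean and std rows reported in this notebook so that it runs on its own; in the notebook itself the dict values would presumably be `log_metrics.agg(['mean', 'std'])` and its dense and CNN counterparts. The model keys and index level names are assumptions.

```python
import pandas as pd

metric_columns = ["accuracy_score", "f1_score", "matthews_corrcoef", "roc_auc_score"]


def summary(mean_row, std_row):
    """Build a two-row (mean, std) summary frame for one model."""
    return pd.DataFrame([mean_row, std_row], index=["mean", "std"], columns=metric_columns)


# Values copied from the aggregated outputs of the three experiments above
summaries = {
    "logistic_regression": summary([0.733, 0.732, 0.465, 0.833], [0.001, 0.001, 0.001, 0.001]),
    "small_dense": summary([0.94, 0.936, 0.884, 0.968], [0.005, 0.006, 0.01, 0.002]),
    "small_cnn": summary([0.959, 0.957, 0.921, 0.986], [0.005, 0.006, 0.01, 0.002]),
}

# The dict keys become the outer level of a (model, statistic) MultiIndex
combined = pd.concat(summaries, names=["model", "statistic"])
```

From here, `combined.to_latex()` would be one route to the tex output mentioned in the TODO list.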


## Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art (SOTA) network, VGG,
which was trained on the ImageNet database.

Because our detector images (16x16) are much smaller than the images the VGG network is
designed for, we cannot use all layers in the VGG network. The limitation stems from max-pooling,
which halves the image size (16x16 becomes 8x8 at the first such layer) each time the input passes through a
pooling layer. At some point the input is too small to pass through the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
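
The shrinking can be made concrete with a little arithmetic. The sketch below assumes the standard VGG layout of five 2x2 max-pooling layers with stride 2 and size-preserving convolutions in between, and compares the usual 224x224 ImageNet input with a 16x16 detector image. The helper name is an assumption, and the sketch does not say where the network is actually cut in the experiments.

```python
def sizes_after_pooling(input_size, n_pool_layers, pool=2, stride=2):
    """Spatial size after each of n_pool_layers unpadded max-pooling layers."""
    sizes = []
    size = input_size
    for _ in range(n_pool_layers):
        # Output size of an unpadded pooling window; 0 means the input is too small
        size = (size - pool) // stride + 1 if size >= pool else 0
        sizes.append(size)
    return sizes


vgg_sized = sizes_after_pooling(224, 5)      # typical ImageNet input
detector_sized = sizes_after_pooling(16, 5)  # 16x16 detector image
```

Under these assumptions a 16x16 input is down to a single pixel after the fourth pooling layer, so the fifth has nothing left to pool.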

In [5]:
# Load pretrained VGG experiment
# The experiment ID is still empty, so load_experiment fails until it is filled in
pretrained_ex_id = ""
pretrained_ex = load_experiment(pretrained_ex_id)
#pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5")

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
display(pretrained_metrics)
#display(pretrained_metrics.agg(['mean', 'std']).applymap('{:.3f}'.format))

FileNotFoundError: [Errno 2] No such file or directory: '/home/geir/git/master_analysis/experiments/.json'