**Results - Regression of simulated events**

This notebook is the primary source of plots and tables for the regression part of the thesis,
with the goal of keeping every table and figure as standardized as possible. (And who has the time to update
90 tables one by one anyway?)

**Questions:**
* Descriptive statistics
    - Should descriptive statistics of the simulated data be included?\
    If so, how much? And should it be included for each fold in the k-fold cross-validation?
* Classification results
    - Breakdown of results based on event type (single, double, close double)?
    Reasonable to include in order to confirm the assumption that close doubles are the
    most difficult event type to classify correctly in simulated data.
    The random state is included, so it should be simple to reproduce the indices.


**TODO**
* Implement reproducing the validation indices for each fold based on the random seed from config
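
A possible starting point for this TODO is sketched below. It assumes the folds were originally created with scikit-learn's `KFold` using `shuffle=True` and a seed stored in the experiment config; the function name `fold_validation_indices` and the seed value are hypothetical, and the sketch only reproduces the original indices if the training code split the data the same way.

```python
import numpy as np
from sklearn.model_selection import KFold


def fold_validation_indices(n_samples, n_splits=5, seed=42):
    """Return the validation indices of each fold for a given seed.

    Assumes the folds were originally created with a shuffled KFold
    using the same n_splits and random_state.
    """
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    placeholder = np.zeros(n_samples)  # KFold only needs the number of samples
    return [val_idx for _, val_idx in kfold.split(placeholder)]


# Hypothetical usage: the seed would come from the experiment config.
folds = fold_validation_indices(n_samples=100, n_splits=5, seed=123)
```

Calling the function twice with the same seed yields identical folds, which is what makes the per-fold breakdown by event type reproducible.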

**Handy links**
* [matplotlib-plots to latex](https://timodenk.com/blog/exporting-matplotlib-plots-to-latex/)
* [Robert's thesis df output](https://github.com/ATTPC/VAE-event-classification/blob/master/src/make_classification_table.py)

In [149]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
from master_scripts.data_functions import get_git_root, normalize_image_data, event_indices, normalize_position_data
from master_scripts.analysis_functions import load_experiment, experiment_metrics_to_df
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
import tensorflow as tf
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

THESIS_PATH = "../../../master_thesis/"

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [150]:
# Load test set and normalize
repo_root = get_git_root()
test_images = np.load(repo_root + "data/simulated/test/" + "images_test.npy")
test_images = normalize_image_data(test_images)
test_positions = np.load(repo_root + "data/simulated/test/" + "positions_test.npy") 
test_energies = np.load(repo_root + "data/simulated/test/" + "energies_test.npy") 
test_labels = np.load(repo_root + "data/simulated/test/" + "labels_test.npy") 

# Set up indices for position and energy data
# s = single, d = double, c = close double
s_idx, d_idx, c_idx = event_indices(test_positions)

In [151]:
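def regression_metrics(model, x_val, y_val, name):
    """ Calculates regression metrics on the validation data.
    
    :param model: trained model exposing a predict() method
    :param x_val: normalized detector images
    :param y_val: target values
    :param name: row label for the returned DataFrame
    :return: single-row DataFrame with r2_score, mse, rmse and mae
    """

    y_pred = model.predict(x_val)

    mse = mean_squared_error(y_val, y_pred)
    metrics = {}
    metrics['r2_score'] = r2_score(y_val, y_pred)
    metrics['mse'] = mse
    # RMSE as the square root of the output-averaged MSE; the
    # `squared=False` argument is deprecated in newer scikit-learn releases.
    metrics['rmse'] = np.sqrt(mse)
    metrics['mae'] = mean_absolute_error(y_val, y_pred)
    
    df = pd.DataFrame.from_dict(data={name: metrics}, orient='index')
    return df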

# Pre-processed simulated data - no additional modifications
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
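
As a reminder of what min-max normalization does, here is a minimal sketch. It is a generic version that scales by the global minimum and maximum of the array; the actual `normalize_image_data` from `master_scripts` is not shown in this notebook and may differ (for example by normalizing per image), so the helper name `min_max_normalize` is purely illustrative.

```python
import numpy as np


def min_max_normalize(images):
    """Scale an array to [0, 1] using its global minimum and maximum."""
    images = np.asarray(images, dtype=np.float64)
    lo, hi = images.min(), images.max()
    if hi == lo:
        # Constant input: avoid division by zero and return zeros.
        return np.zeros_like(images)
    return (images - lo) / (hi - lo)


demo = np.array([[0.0, 5.0], [10.0, 20.0]])
normalized = min_max_normalize(demo)
```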

## Single events

### Positions

#### Linear Regression

In [152]:
# Load linear regression experiment
lin_ex_id = "225ca879103d"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [153]:
# Load dense regression experiment
dense_ex_id = "a3716bc3648a"
dense_ex = load_experiment(dense_ex_id)

# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [154]:
# Load cnn regression experiment
cnn_ex_id = "1cac590bf1fe"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained state-of-the-art (SOTA) network,
VGG, which was trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the input size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling,
which halves the image size (e.g. from 16x16 to 8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
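
The size argument above can be checked with a little arithmetic. The sketch below assumes the standard VGG16 layout of five 2x2 max-pooling layers with stride 2 and no padding; the helper `sizes_after_pooling` is illustrative, and the exact layer at which the network was cut for the thesis is not stated in this notebook.

```python
def sizes_after_pooling(input_size, n_pools, pool=2, stride=2):
    """Spatial size after each successive max-pooling layer (no padding)."""
    sizes = []
    size = input_size
    for _ in range(n_pools):
        size = (size - pool) // stride + 1 if size >= pool else 0
        sizes.append(size)
        if size == 0:
            break
    return sizes


# VGG16 has five 2x2 max-pooling layers with stride 2.
print(sizes_after_pooling(224, 5))  # [112, 56, 28, 14, 7]
print(sizes_after_pooling(16, 5))   # [8, 4, 2, 1, 0]
```

With a 16x16 input the feature map is down to 1x1 after the fourth pooling layer, so the fifth pooling layer has nothing left to pool.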

In [155]:
# Load pretrained regression experiment
pretrained_ex_id = "d53a2353251f"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [156]:
# Load custom regression experiment
custom_ex_id = "f29da7bbd96f"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2 score.
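
To make the error measure concrete, the sketch below aggregates per-fold metrics into a mean and a standard deviation, using the same `agg` pattern as the cells above. The numbers are made up for illustration and are not thesis results.

```python
import pandas as pd

# Made-up per-fold scores, purely for illustration (not thesis results).
fold_metrics = pd.DataFrame(
    {
        "r2_score": [0.90, 0.92, 0.91, 0.89, 0.93],
        "mse": [0.010, 0.008, 0.009, 0.011, 0.007],
    }
)

# Row "mean" is the central value, row "std" the error estimate.
# pandas uses the sample standard deviation (ddof=1) by default.
summary = fold_metrics.agg(["mean", "std"])
print(summary)
```

Note that with only five folds the choice of `ddof` matters: the sample standard deviation is noticeably larger than the population one.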

In [157]:
all_means_single_pos = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_single_pos = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_pos)
display(all_std_single_pos)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.799903 | 0.014037 | 0.118478 | 0.088706 |
| Dense | 0.990907 | 0.000638 | 0.025254 | 0.016035 |
| CNN | 0.997171 | 0.000198 | 0.014088 | 0.008098 |
| Pretrained | 0.884081 | 0.008133 | 0.090186 | 0.056804 |
| Custom | 0.999312 | 4.8e-05 | 0.006948 | 0.003588 |


| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.003664 | 0.000246 | 0.00104 | 0.001673 |
| Dense | 0.000819 | 5.7e-05 | 0.001069 | 0.000869 |
| CNN | 0.00024 | 1.7e-05 | 0.00058 | 0.000573 |
| Pretrained | 0.006864 | 0.000484 | 0.002813 | 0.002656 |
| Custom | 0.000234 | 1.6e-05 | 0.001034 | 0.000251 |


In [158]:
rows = all_test_single_pos.index
r2_str_array_single_pos = np.zeros((1, all_test_single_pos.shape[0]), dtype=object)
for i in range(all_test_single_pos.shape[0]):
    r2_str_array_single_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos["r2_score"].iloc[i], all_test_single_pos["r2_score"].iloc[i])
        
r2_df_single_pos = pd.DataFrame(r2_str_array_single_pos, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_position_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on single events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-position-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_single_pos.to_latex(fp, escape=False, caption=caption, label=label, index=False)
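
Since the same table cell format is reused for every results table, it could be factored into a small helper. The sketch below is one way to do that; the name `format_with_error` is hypothetical, and it assumes the thesis preamble provides `\num` (siunitx) and `\underset` (amsmath).

```python
def format_with_error(value, error):
    r"""Build a LaTeX cell with the value on top and its error underneath.

    Assumes the thesis preamble loads siunitx (for \num) and amsmath
    (for \underset).
    """
    return r"$\underset{{\num{{+- {:.3e} }}}}{{\num{{ {:.3g} }}}}$".format(error, value)


cell = format_with_error(0.799903, 0.003664)
print(cell)
```

Keep in mind that the `.3g` format drops trailing zeros, so a value such as 0.799903 is rendered as `0.8` rather than `0.800`.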


### Energy

#### Linear regression

In [159]:
# Load linear regression experiment
lin_ex_id = "87e8f4558d97"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)
print(lin_ex['experiment_name'])

generate_results_energies_single_linreg


#### Small dense network

In [160]:
# Load dense regression experiment
dense_ex_id = "4cab676db128"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [161]:
# Load cnn regression experiment
cnn_ex_id = "3a91fd0e74b5"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [162]:
# Load pretrained regression experiment
pretrained_ex_id = "ea8d88850f6e"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [163]:
# Load custom regression experiment
custom_ex_id = "3d45e6694b1d"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [164]:
all_means_single_energy = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_single_energy = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_energy)
display(all_std_single_energy)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.935729 | 0.005379 | 0.073342 | 0.055065 |
| Dense | 0.937884 | 0.005199 | 0.072101 | 0.053853 |
| CNN | 0.937058 | 0.005268 | 0.072579 | 0.054378 |
| Pretrained | 0.892654 | 0.008984 | 0.094784 | 0.076291 |
| Custom | 0.943565 | 0.004723 | 0.068725 | 0.050626 |


| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.036908 | 0.00307 | 0.019433 | 0.020329 |
| Dense | 0.033643 | 0.002798 | 0.018293 | 0.019395 |
| CNN | 0.032822 | 0.002731 | 0.018183 | 0.019009 |
| Pretrained | 0.019447 | 0.001616 | 0.010731 | 0.011578 |
| Custom | 0.03103 | 0.002588 | 0.017358 | 0.018655 |


In [165]:
rows = all_test_single_energy.index
r2_str_array_single_energy = np.zeros((1, all_test_single_energy.shape[0]), dtype=object)
for i in range(all_test_single_energy.shape[0]):
    r2_str_array_single_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy["r2_score"].iloc[i], all_test_single_energy["r2_score"].iloc[i])
        
r2_df_single_energy = pd.DataFrame(r2_str_array_single_energy, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_energy_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on single events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-energy-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_single_energy.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [166]:
# Load linear regression experiment
lin_ex_id = "7b74b3cfc586"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [167]:
# Load dense regression experiment
dense_ex_id = "ef55911e49d1"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [168]:
# Load cnn regression experiment
cnn_ex_id = "cc2654aea019"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained state-of-the-art (SOTA) network,
VGG, which was trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the input size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling,
which halves the image size (e.g. from 16x16 to 8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.

In [169]:
# Load pretrained regression experiment
pretrained_ex_id = "3c0d1b7bd0ac"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [170]:
# Load custom regression experiment
custom_ex_id = "468fefa67787"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2 score.

In [171]:
all_means_double_pos = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_double_pos = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_pos)
display(all_std_double_pos)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.364356 | 0.044569 | 0.211113 | 0.168144 |
| Dense | 0.47076 | 0.037108 | 0.192635 | 0.156702 |
| CNN | 0.472672 | 0.036974 | 0.192286 | 0.156766 |
| Pretrained | 0.36981 | 0.044186 | 0.210205 | 0.166236 |
| Custom | 0.489352 | 0.035804 | 0.18922 | 0.153881 |


| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.005796 | 0.000432 | 0.001021 | 0.001438 |
| Dense | 0.001809 | 0.000106 | 0.000276 | 0.000276 |
| CNN | 0.002394 | 0.00016 | 0.000415 | 0.00053 |
| Pretrained | 0.010787 | 0.000732 | 0.001727 | 0.001599 |
| Custom | 0.000681 | 5.9e-05 | 0.000157 | 0.000204 |


In [172]:
rows = all_test_double_pos.index
r2_str_array_double_pos = np.zeros((1, all_test_double_pos.shape[0]), dtype=object)
for i in range(all_test_double_pos.shape[0]):
    r2_str_array_double_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos["r2_score"].iloc[i], all_test_double_pos["r2_score"].iloc[i])
        
r2_df_double_pos = pd.DataFrame(r2_str_array_double_pos, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_position_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on double events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_double_pos.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [173]:
# Load linear regression experiment
lin_ex_id = "6e600e08e8af"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [174]:
# Load dense regression experiment
dense_ex_id = "96cd3707d131"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [175]:
# Load cnn regression experiment
cnn_ex_id = "f41605cb58b4"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [176]:
# Load pretrained regression experiment
pretrained_ex_id = "9f33b3fc7fff"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [177]:
# Load custom regression experiment
custom_ex_id = "6bab88fbd66f"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [178]:
all_means_double_energy = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_double_energy = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_energy)
display(all_std_double_energy)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.428787 | 0.047638 | 0.218261 | 0.177981 |
| Dense | 0.430478 | 0.047498 | 0.217941 | 0.177844 |
| CNN | 0.433277 | 0.047265 | 0.217405 | 0.177395 |
| Pretrained | 0.425453 | 0.047919 | 0.218903 | 0.176519 |
| Custom | 0.491116 | 0.042445 | 0.206021 | 0.167904 |


| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.065006 | 0.005467 | 0.012013 | 0.009753 |
| Dense | 0.066321 | 0.005576 | 0.012252 | 0.009961 |
| CNN | 0.050267 | 0.004236 | 0.009484 | 0.007782 |
| Pretrained | 0.053075 | 0.004462 | 0.00974 | 0.006903 |
| Custom | 0.032255 | 0.002703 | 0.006222 | 0.005192 |


In [179]:
rows = all_test_double_energy.index
r2_str_array_double_energy = np.zeros((1, all_test_double_energy.shape[0]), dtype=object)
for i in range(all_test_double_energy.shape[0]):
    r2_str_array_double_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy["r2_score"].iloc[i], all_test_double_energy["r2_score"].iloc[i])
        
r2_df_double_energy = pd.DataFrame(r2_str_array_double_energy, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on double events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_double_energy.to_latex(fp, escape=False, caption=caption, label=label, index=False)


# Pre-processed simulated data - Pixel modified
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
Additionally, the top and bottom rows of pixels have been set to 0, and
one pixel inside the detector is permanently 0 (which idx again?).
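
The pixel modification described above can be sketched as follows. This is an illustration under assumptions, not the code used to produce the training data: `apply_pixel_modification` is a hypothetical helper, the image layout `(N, 16, 16, 1)` is assumed, and the dead pixel index `(3, 13)` is taken from commented-out code elsewhere in this notebook, so it may not be the index actually used.

```python
import numpy as np


def apply_pixel_modification(images, dead_pixel=(3, 13)):
    """Return a copy with the top and bottom rows and one inner pixel set to 0.

    `dead_pixel` is an assumption (taken from commented-out notebook code).
    """
    modified = images.copy()
    modified[:, 0, :] = 0        # top row of every image
    modified[:, -1, :] = 0       # bottom row of every image
    row, col = dead_pixel
    modified[:, row, col] = 0    # single dead pixel inside the detector
    return modified


# Toy batch of two 16x16 single-channel images filled with ones
toy_images = np.ones((2, 16, 16, 1))
toy_modified = apply_pixel_modification(toy_images)
print(toy_modified[0, :, :, 0].sum())
```

Working on a copy leaves the unmodified images available for comparison.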

## Single events

### Positions

#### Linear Regression

In [180]:
# Load linear regression experiment
lin_ex_id = "d65ec088580a"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [181]:
# Load dense network regression experiment
dense_ex_id = "2218dcb0de80"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [182]:
# Load CNN regression experiment
cnn_ex_id = "3a70de184f3c"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art (SOTA) network,
VGG, trained on the ImageNet database.

Because our detector images (16x16) are much smaller than the inputs the VGG network is
designed for, we cannot use all layers in the VGG network. Each max-pooling layer halves
the image size (16x16 becomes 8x8 after the first one), so at some point the input is too
small to pass through the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
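
The size argument above is simple arithmetic, sketched below. The helper `sizes_after_pooling` is hypothetical and only counts halvings; it does not show where the network in this notebook was actually cut. The standard VGG16 architecture has five max-pooling layers, one at the end of each convolutional block.

```python
def sizes_after_pooling(input_size, n_pool_layers, pool=2):
    """List the spatial size after each 2x2 max-pooling layer (stride 2)."""
    sizes = []
    size = input_size
    for _ in range(n_pool_layers):
        size = size // pool    # integer halving of the image side
        sizes.append(size)
    return sizes


# A 16x16 input passed through the five pooling layers of VGG16
vgg_sizes = sizes_after_pooling(16, 5)
usable_pool_layers = sum(1 for s in vgg_sizes if s >= 1)
print(vgg_sizes, usable_pool_layers)
```

A size of 0 means the layer cannot be applied, so a 16x16 input survives at most four of the five pooling layers.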

In [183]:
# Load pretrained (VGG) regression experiment
pretrained_ex_id = "b5223ba6beaa"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [184]:
# Load custom regression experiment
custom_ex_id = "379bca43b134"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2 score (`r2_score`) for each model.
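
As a minimal sketch of how such an error estimate is obtained, the per-fold metrics can be aggregated to a mean and a standard deviation. The values below are made up for illustration and are not results from this notebook. Note that pandas `std` is the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical per-fold metrics for one model (illustrative values only)
fold_metrics = pd.DataFrame(
    {
        "r2_score": [0.98, 0.97, 0.99, 0.98, 0.98],
        "mse": [0.0010, 0.0012, 0.0008, 0.0010, 0.0010],
    }
)

# String aggregator names give rows labelled 'mean' and 'std'
summary = fold_metrics.agg(["mean", "std"])
print(summary)
```

With only five folds the standard deviation is itself a rough estimate, so small differences between models should be read with care.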

In [185]:
all_means_single_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_single_pos_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_pos_pmod)
display(all_std_single_pos_pmod)

Model,r2_score,mse,rmse,mae
Linear,0.776153,0.015702,0.125308,0.094354
Dense,0.987185,0.000899,0.02998,0.019358
CNN,0.987682,0.000863,0.029382,0.018213
Pretrained,0.87259,0.00894,0.094553,0.061185
Custom,0.997204,0.000196,0.014002,0.006549


Model,r2_score,mse,rmse,mae
Linear,0.002737,0.0002,0.000845,0.001267
Dense,0.00068,4.7e-05,0.000827,0.000669
CNN,0.001015,7.1e-05,0.001954,0.001231
Pretrained,0.017231,0.001193,0.006636,0.007076
Custom,0.000211,1.5e-05,0.00093,0.000557
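
The three summary tables above are assembled with the same pattern each time. A possible way to reduce the repetition is sketched below; `collect_rows` and the toy frames are hypothetical and only mimic the layout of `lin_means` and `dense_means` (row 0 holds the mean, row 1 the standard deviation).

```python
import pandas as pd

METRIC_COLUMNS = ["r2_score", "mse", "rmse", "mae"]


def collect_rows(frames, row, columns=METRIC_COLUMNS):
    """Stack row number `row` of each frame into one table indexed by model name.

    `frames` maps a display name (e.g. "Linear") to a metrics DataFrame.
    """
    return pd.DataFrame(
        {name: frame.iloc[row][columns] for name, frame in frames.items()}
    ).T


# Toy stand-ins for lin_means and dense_means (illustrative values only)
toy_lin = pd.DataFrame(
    {"r2_score": [0.7, 0.01], "mse": [0.02, 0.001], "rmse": [0.14, 0.002], "mae": [0.1, 0.003]},
    index=["lin_mean", "lin_std"],
)
toy_dense = toy_lin * 2
toy_dense.index = ["dense_mean", "dense_std"]

toy_frames = {"Linear": toy_lin, "Dense": toy_dense}
toy_means = collect_rows(toy_frames, row=0)
toy_stds = collect_rows(toy_frames, row=1)
print(toy_means)
```

Because the model names are the dictionary keys, no separate `rename` step is needed.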


In [186]:
rows = all_test_single_pos_pmod.index
r2_str_array_single_pos_pmod = np.zeros((1, all_test_single_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_single_pos_pmod.shape[0]):
    r2_str_array_single_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_pmod["r2_score"].iloc[i], all_test_single_pos_pmod["r2_score"].iloc[i])
        
r2_df_single_pos_pmod = pd.DataFrame(r2_str_array_single_pos_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_position_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None = no truncation; -1 is deprecated
    r2_df_single_pos_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [187]:
# Load linear regression experiment
lin_ex_id = "7dfe302a7c09"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [188]:
# Load dense regression experiment
dense_ex_id = "2dbd6c697bc5"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN
This model is very sensitive to pixel modifications.
It performs similarly to the other models if the test data is pixel-modified as well.
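
A model that is thrown off by its input can end up with a negative R2 score, which means it does worse than always predicting the mean of the targets. The sketch below illustrates this with toy numbers only; `r2` is a plain re-implementation of the usual definition and is not the metric code used in this notebook.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


targets = [0.0, 1.0, 2.0, 3.0]
mean_baseline = [1.5] * 4         # always predict the target mean
shifted = [2.0, 3.0, 4.0, 5.0]    # correct trend, constant offset of 2

r2_baseline = r2(targets, mean_baseline)
r2_shifted = r2(targets, shifted)
print(r2_baseline, r2_shifted)
```

The shifted predictions follow the targets perfectly in shape, yet the constant offset alone is enough to push R2 below zero.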

In [189]:
# Load cnn regression experiment
cnn_ex_id = "fb0685871cf3"
#cnn_ex_id = "fb0685871cf3"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
#tmp_images = test_images.copy()
#tmp_images[:, 3, 13] = 0
#tmp_images[:, 0, :] = 0
#tmp_images[:, 15, :] = 0
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)
#display(cnn_test)

#### Pretrained - VGG16 

In [190]:
# Load pretrained (VGG16) regression experiment
pretrained_ex_id = "8aa9f731b693"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [191]:
# Load custom regression experiment
custom_ex_id = "02c59a04c095"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [192]:
all_means_single_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)

all_test_single_energy_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_energy_pmod)
display(all_std_single_energy_pmod)

Model,r2_score,mse,rmse,mae
Linear,0.738587,0.021878,0.147912,0.122523
Dense,0.754008,0.020587,0.143483,0.120516
CNN,-0.127648,0.094375,0.307205,0.205946
Pretrained,0.72836,0.022734,0.150778,0.125227
Custom,0.733101,0.022337,0.149457,0.123951


Model,r2_score,mse,rmse,mae
Linear,0.024595,0.002042,0.016237,0.01877
Dense,0.022607,0.001877,0.016135,0.019852
CNN,0.024045,0.001997,0.017148,0.021458
Pretrained,0.014177,0.001176,0.008603,0.0113
Custom,0.028649,0.002379,0.0195,0.023297


In [193]:
rows = all_test_single_energy_pmod.index
r2_str_array_single_energy_pmod = np.zeros((1, all_test_single_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_single_energy_pmod.shape[0]):
    r2_str_array_single_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_pmod["r2_score"].iloc[i], all_test_single_energy_pmod["r2_score"].iloc[i])
        
r2_df_single_energy_pmod = pd.DataFrame(r2_str_array_single_energy_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_energy_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None = no truncation; -1 is deprecated
    r2_df_single_energy_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [194]:
# Load linear regression experiment
lin_ex_id = "2c62e711e234"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [195]:
# Load dense network regression experiment
dense_ex_id = "4cea43be5aa4"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [196]:
# Load CNN regression experiment
cnn_ex_id = "7960fa803199"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art (SOTA) network,
VGG, trained on the ImageNet database.

Because our detector images (16x16) are much smaller than the inputs the VGG network is
designed for, we cannot use all layers in the VGG network. Each max-pooling layer halves
the image size (16x16 becomes 8x8 after the first one), so at some point the input is too
small to pass through the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
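
Besides the image size, the channel count also differs: ImageNet images have three colour channels, while the detector images have one. The cell below handles this by concatenating the test images with themselves along the last axis. A minimal sketch of that step on toy data, assuming the images are stored as `(N, 16, 16, 1)`:

```python
import numpy as np

# Toy batch: four 16x16 images with a single channel (random values)
toy_images = np.random.default_rng(0).random((4, 16, 16, 1))

# Repeat the single channel three times along the last axis
toy_rgb = np.concatenate((toy_images, toy_images, toy_images), axis=-1)
print(toy_rgb.shape)
```

All three channels carry identical values, so no information is added; the step only matches the input shape the pretrained layers expect.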

In [197]:
# Load pretrained (VGG) regression experiment
pretrained_ex_id = "4f70fd9e6d8a"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [198]:
# Load custom regression experiment
custom_ex_id = "98ea91d193ba"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2 score (`r2_score`) for each model.

In [199]:
all_means_double_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_double_pos_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_pos_pmod)
display(all_std_double_pos_pmod)

Model,r2_score,mse,rmse,mae
Linear,0.365183,0.044511,0.210976,0.169567
Dense,0.465533,0.037475,0.193583,0.157434
CNN,0.363243,0.044646,0.211296,0.167319
Pretrained,0.342786,0.046082,0.214667,0.170133
Custom,0.488469,0.035866,0.189384,0.154176


Model,r2_score,mse,rmse,mae
Linear,0.00068,6.1e-05,0.000145,0.000445
Dense,0.000866,8.7e-05,0.000225,0.000225
CNN,0.001841,0.000137,0.000357,0.000637
Pretrained,0.013968,0.000985,0.002309,0.002105
Custom,0.000269,3.1e-05,8.2e-05,0.000142


In [200]:
rows = all_test_double_pos_pmod.index
r2_str_array_double_pos_pmod = np.zeros((1, all_test_double_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_double_pos_pmod.shape[0]):
    r2_str_array_double_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_pmod["r2_score"].iloc[i], all_test_double_pos_pmod["r2_score"].iloc[i])
        
r2_df_double_pos_pmod = pd.DataFrame(r2_str_array_double_pos_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_position_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on double events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None = no truncation; -1 is deprecated
    r2_df_double_pos_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [201]:
# Load linear regression experiment
lin_ex_id = "fcc62faf0d97"
lin_ex = load_experiment(lin_ex_id)
# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [202]:
# Load dense regression experiment
dense_ex_id = "0c1eb0cbcceb"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [203]:
# Load cnn regression experiment
cnn_ex_id = "85a088b1c550"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [204]:
# Load pretrained (VGG16) regression experiment
pretrained_ex_id = "e9484282c396"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [205]:
# Load custom regression experiment
custom_ex_id = "a7714c38fd74"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [206]:
all_means_double_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_double_energy_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_energy_pmod)
display(all_std_double_energy_pmod)

Model,r2_score,mse,rmse,mae
Linear,0.486663,0.042817,0.206922,0.168505
Dense,0.489836,0.042551,0.20628,0.168059
CNN,0.282451,0.059847,0.244637,0.192272
Pretrained,0.455371,0.045429,0.213141,0.173201
Custom,0.46642,0.044504,0.210959,0.171638


Model,r2_score,mse,rmse,mae
Linear,0.003155,0.00026,0.000627,0.000566
Dense,0.002571,0.000218,0.000528,0.000481
CNN,0.00312,0.000271,0.000654,0.00062
Pretrained,0.010319,0.000869,0.002023,0.001074
Custom,0.002746,0.000244,0.000591,0.000525


In [207]:
rows = all_test_double_energy_pmod.index
r2_str_array_double_energy_pmod = np.zeros((1, all_test_double_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_double_energy_pmod.shape[0]):
    r2_str_array_double_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_pmod["r2_score"].iloc[i], all_test_double_energy_pmod["r2_score"].iloc[i])
        
r2_df_double_energy_pmod = pd.DataFrame(r2_str_array_double_energy_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on double events in simulated data with specific pixels
set to zero, using multiple models. Error estimates are the standard deviation in results from k-fold 
cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None = no truncation; -1 is deprecated
    r2_df_double_energy_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


# Pre-processed simulated data - Pixel modified and imbalanced
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
Additionally, the top and bottom rows of pixels have been set to 0, and
one pixel inside the detector is permanently 0 (which idx again?).

This dataset has also been purposefully imbalanced to mimic the properties of experimental data
where doubles in space are expected to be rare.
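
How the imbalance was introduced is not shown in this notebook. One simple way to build such a dataset, given here only as a sketch under assumptions, is to keep every single event and a small random fraction of the double events. The helper `imbalance_indices`, the boolean mask `is_double` and the fraction `0.1` are all hypothetical.

```python
import numpy as np


def imbalance_indices(is_double, keep_fraction, seed=0):
    """Keep all single events and a random `keep_fraction` of the double events."""
    rng = np.random.default_rng(seed)
    single_idx = np.flatnonzero(~is_double)
    double_idx = np.flatnonzero(is_double)
    n_keep = int(round(keep_fraction * double_idx.size))
    kept_doubles = rng.choice(double_idx, size=n_keep, replace=False)
    # Sort so the selected events keep their original order
    return np.sort(np.concatenate((single_idx, kept_doubles)))


# Toy labels: 50 singles followed by 50 doubles
toy_is_double = np.arange(100) >= 50
toy_idx = imbalance_indices(toy_is_double, keep_fraction=0.1)
print(toy_idx.size, toy_is_double[toy_idx].sum())
```

Fixing the seed makes the selection reproducible, in line with the random state stored in the experiment config.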

## Single events

### Positions

#### Linear Regression

In [208]:
# Load linear regression experiment
lin_ex_id = "78f01912d908"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [209]:
# Load dense network regression experiment
dense_ex_id = "af61fe608db1"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [210]:
# Load CNN regression experiment
cnn_ex_id = "e2f24a47f2f3"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art (SOTA) network
that was trained on the ImageNet database.

Our detector images (16x16) are much smaller than the images the VGG network is
designed for, so we cannot use all of its layers. Each max-pooling layer halves the image
size (16x16 becomes 8x8 after the first one), so at some point the input is too small to pass
through the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
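The size argument can be checked with a little arithmetic. The sketch below assumes 2x2 max-pooling with stride 2 and the five pooling stages of the standard VGG16 architecture; `feature_map_sizes` is a hypothetical helper, and the sketch does not say where the network was actually cut.

```python
def feature_map_sizes(input_size, n_pools):
    """Spatial side length after each 2x2, stride-2 max-pooling layer."""
    sizes = [input_size]
    for _ in range(n_pools):
        sizes.append(sizes[-1] // 2)
    return sizes

# Standard VGG16 input versus our detector images, through five pooling stages.
vgg_input = feature_map_sizes(224, 5)
detector = feature_map_sizes(16, 5)

# Number of pooling stages that still leave at least one pixel.
usable_pools = sum(1 for size in detector[1:] if size >= 1)
print(vgg_input, detector, usable_pools)
```

For a 16x16 input the feature map shrinks to 1x1 after four pooling stages, so a fifth stage has nothing left to pool.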

In [211]:
# Load pretrained (VGG) experiment
pretrained_ex_id = "a7340b9e74ad"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [212]:
# Load custom regression experiment
custom_ex_id = "33fa607a199b"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the cross-validation folds as an error measure, and report the R2 score on the test set.
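As a small illustration of the error measure, the sketch below computes the mean and the sample standard deviation over five folds. The scores are made-up numbers for illustration only, and it assumes the sample standard deviation (dividing by K - 1), which is the default for pandas' `std`.

```python
from statistics import mean, stdev

# Hypothetical validation R2 scores from K=5 folds (illustrative numbers only).
fold_r2 = [0.981, 0.984, 0.979, 0.986, 0.983]

fold_mean = mean(fold_r2)
fold_std = stdev(fold_r2)  # sample standard deviation, divides by K - 1
print(f"{fold_mean:.4f} +- {fold_std:.4f}")
```

With only five folds the standard deviation is a rough spread estimate, which is worth keeping in mind when comparing models whose scores are close.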

In [213]:
all_means_single_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_single_pos_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_pos_imbalanced)
display(all_std_single_pos_imbalanced)

model,r2_score,mse,rmse,mae
Linear,0.776153,0.015702,0.125308,0.094354
Dense,0.987195,0.000898,0.029968,0.01935
CNN,0.987634,0.000867,0.029439,0.018582
Pretrained,0.87259,0.00894,0.094553,0.061185
Custom,0.998461,0.000108,0.010385,0.00673


model,r2_score,mse,rmse,mae
Linear,0.002737,0.0002,0.000845,0.001267
Dense,0.000677,4.7e-05,0.000841,0.000677
CNN,0.001016,7.1e-05,0.001987,0.001181
Pretrained,0.017231,0.001193,0.006636,0.007076
Custom,0.000479,3.3e-05,0.002032,0.00112


In [214]:
rows = all_test_single_pos_imbalanced.index
r2_str_array_single_pos_imbalanced = np.zeros((1, all_test_single_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_pos_imbalanced.shape[0]):
    r2_str_array_single_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_imbalanced["r2_score"].iloc[i], all_test_single_pos_imbalanced["r2_score"].iloc[i])
        
r2_df_single_pos_imbalanced = pd.DataFrame(r2_str_array_single_pos_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_position_pixelmod_r2.tex"
caption = """
Test set R2-scores for regression of positions of origin, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_single_pos_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)
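The format string used for the table cells is easier to read in isolation. The sketch below wraps it in a hypothetical helper, `format_r2_cell`, and applies it to the Linear row from the tables above. The resulting LaTeX relies on `\underset` from amsmath and `\num` from siunitx being loaded in the thesis preamble.

```python
def format_r2_cell(value, std):
    """Format a score with its error estimate underneath, for one LaTeX table cell."""
    return r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(std, value)

# Test-set R2 and fold standard deviation for the Linear model, taken from the tables above.
cell = format_r2_cell(0.776153, 0.002737)
print(cell)  # $\underset{\num{+- 2.737e-03 }  }{\num{ 0.776 } }$
```

The doubled braces are `str.format` escapes for literal `{` and `}`, so only `{:.3e}` and `{:.3g}` are replaced by numbers.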


### Energy

#### Linear regression

In [215]:
# Load linear regression experiment
lin_ex_id = "9f256a4990c0"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [216]:
# Load dense regression experiment
dense_ex_id = "29b1f98a4879"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [217]:
# Load cnn regression experiment
cnn_ex_id = "8422f85d6ff6"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [218]:
# Load pretrained (VGG16) experiment
pretrained_ex_id = "73de75db91e4"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [219]:
# Load custom regression experiment
custom_ex_id = "0071c04bef42"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [220]:
all_means_single_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_test_single_energy_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_single_energy_imbalanced)
display(all_std_single_energy_imbalanced)

model,r2_score,mse,rmse,mae
Linear,0.738587,0.021878,0.147912,0.122523
Dense,0.752892,0.020681,0.143808,0.120834
CNN,-0.146074,0.095917,0.309705,0.207102
Pretrained,0.728359,0.022734,0.150778,0.125228
Custom,0.723089,0.023175,0.152234,0.126858


model,r2_score,mse,rmse,mae
Linear,0.024595,0.002042,0.016237,0.01877
Dense,0.022443,0.001864,0.016051,0.019754
CNN,0.025265,0.002098,0.017808,0.022031
Pretrained,0.014176,0.001176,0.008603,0.0113
Custom,0.02866,0.00238,0.019523,0.023197
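The CNN row in the test table has a negative R2 score. Since R2 = 1 - SS_res / SS_tot, this happens whenever a model does worse than always predicting the mean of the targets. The sketch below shows this on toy numbers. `regression_scores` is a hypothetical pure-Python stand-in and not the `regression_metrics` function used in this notebook.

```python
from math import sqrt

def regression_scores(y_true, y_pred):
    """Compute R2, MSE, RMSE and MAE for two equally long lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    y_mean = sum(y_true) / n
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)  # spread of targets around their mean
    r2 = 1.0 - (mse * n) / ss_tot                    # mse * n is the residual sum of squares
    return {"r2_score": r2, "mse": mse, "rmse": sqrt(mse), "mae": mae}

y_true = [0.2, 0.4, 0.6, 0.8]
mean_only = regression_scores(y_true, [0.5] * 4)  # always predict the mean: R2 is 0
biased = regression_scores(y_true, [0.9] * 4)     # constant but off-centre: R2 is negative
print(mean_only["r2_score"], biased["r2_score"])
```

A negative score therefore does not indicate a bug in the metric; it means the model is less useful on the test set than a constant prediction.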


In [221]:
rows = all_test_single_energy_imbalanced.index
r2_str_array_single_energy_imbalanced = np.zeros((1, all_test_single_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_energy_imbalanced.shape[0]):
    r2_str_array_single_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_imbalanced["r2_score"].iloc[i], all_test_single_energy_imbalanced["r2_score"].iloc[i])
        
r2_df_single_energy_imbalanced = pd.DataFrame(r2_str_array_single_energy_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_energy_pixelmod_r2.tex"
caption = """
Test set R2-scores for regression of energy values, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_single_energy_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [222]:
# Load linear regression experiment
lin_ex_id = "e3f840121ced"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [223]:
# Load dense network experiment
dense_ex_id = "44de4c962f6c"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [224]:
# Load CNN experiment
cnn_ex_id = "7cb4c91d34d3"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art (SOTA) network
that was trained on the ImageNet database.

Our detector images (16x16) are much smaller than the images the VGG network is
designed for, so we cannot use all of its layers. Each max-pooling layer halves the image
size (16x16 becomes 8x8 after the first one), so at some point the input is too small to pass
through the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.

In [225]:
# Load pretrained (VGG) experiment
pretrained_ex_id = "5230ffcd7119"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [226]:
# Load custom regression experiment
custom_ex_id = "1a1fd5dff9ae"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the cross-validation folds as an error measure, and report the R2 score on the test set.

In [227]:
all_means_double_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)

all_test_double_pos_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_pos_imbalanced)
display(all_std_double_pos_imbalanced)

model,r2_score,mse,rmse,mae
Linear,0.357213,0.04507,0.212296,0.17303
Dense,0.451351,0.038469,0.196135,0.159546
CNN,0.439809,0.039278,0.198187,0.16007
Pretrained,0.333471,0.046735,0.216182,0.172373
Custom,0.224464,0.054377,0.233189,0.171166


model,r2_score,mse,rmse,mae
Linear,0.007864,0.000744,0.001757,0.001093
Dense,0.011331,0.000915,0.002326,0.001181
CNN,0.007664,0.000695,0.001787,0.001045
Pretrained,0.01506,0.001092,0.002497,0.002845
Custom,0.178066,0.012485,0.039951,0.031918


In [228]:
rows = all_test_double_pos_imbalanced.index
r2_str_array_double_pos_imbalanced = np.zeros((1, all_test_double_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_pos_imbalanced.shape[0]):
    r2_str_array_double_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_imbalanced["r2_score"].iloc[i], all_test_double_pos_imbalanced["r2_score"].iloc[i])
        
r2_df_double_pos_imbalanced = pd.DataFrame(r2_str_array_double_pos_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_position_pixelmod_r2.tex"
caption = """
Test set R2-scores for regression of positions of origin, on double events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_double_pos_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [229]:
# Load linear regression experiment
lin_ex_id = "fa1bac5bbad7"
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [230]:
# Load dense regression experiment
dense_ex_id = "a603f7b7d717"
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [231]:
# Load cnn regression experiment
cnn_ex_id = "aae44d283ef0"
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [232]:
# Load pretrained (VGG16) experiment
pretrained_ex_id = "4f5d0b4bd0ef"
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [233]:
# Load custom regression experiment
custom_ex_id = "c227bd3fd86a"
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [234]:
all_means_double_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)

all_test_double_energy_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
display(all_test_double_energy_imbalanced)
display(all_std_double_energy_imbalanced)

model,r2_score,mse,rmse,mae
Linear,0.410545,0.049174,0.221752,0.180758
Dense,0.431934,0.047386,0.217682,0.177563
CNN,0.118513,0.073545,0.271192,0.215984
Pretrained,0.397889,0.050229,0.224118,0.183895
Custom,0.257733,0.061915,0.248827,0.194733


model,r2_score,mse,rmse,mae
Linear,0.046081,0.003812,0.008771,0.007047
Dense,0.045895,0.003796,0.008731,0.007063
CNN,0.046143,0.003815,0.008755,0.007099
Pretrained,0.030523,0.002576,0.005792,0.003576
Custom,0.059499,0.004956,0.01156,0.01128


In [235]:
rows = all_test_double_energy_imbalanced.index
r2_str_array_double_energy_imbalanced = np.zeros((1, all_test_double_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_energy_imbalanced.shape[0]):
    r2_str_array_double_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_imbalanced["r2_score"].iloc[i], all_test_double_energy_imbalanced["r2_score"].iloc[i])
        
r2_df_double_energy_imbalanced = pd.DataFrame(r2_str_array_double_energy_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_pixelmod_r2.tex"
caption = """
Test set R2-scores for regression of energy values, on double events in simulated data with specific pixels
set to zero, using multiple models. Error estimates are the standard deviation in results from k-fold 
cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    r2_df_double_energy_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


# Combined tables

In [239]:
df_pos = pd.concat(
    [
        r2_df_single_pos.rename({0:"Single (a)"}),
        r2_df_single_pos_pmod.rename({0:"Single (b)"}),
        r2_df_single_pos_imbalanced.rename({0:"Single (c)"}),
        r2_df_double_pos.rename({0:"Double (a)"}),
        r2_df_double_pos_pmod.rename({0:"Double (b)"}),
        r2_df_double_pos_imbalanced.rename({0:"Double (c)"}),
    ],
)
#display(df_pos)

df_energy = pd.concat(
    [
        r2_df_single_energy.rename({0:"Single (a)"}),
        r2_df_single_energy_pmod.rename({0:"Single (b)"}),
        r2_df_single_energy_imbalanced.rename({0:"Single (c)"}),
        r2_df_double_energy.rename({0:"Double (a)"}),
        r2_df_double_energy_pmod.rename({0:"Double (b)"}),
        r2_df_double_energy_imbalanced.rename({0:"Double (c)"}),
    ],
)
#display(df_energy)
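The combined tables rely on each `r2_df_*` frame having a single row with index label `0`, which is renamed before stacking. A minimal sketch of that pattern, using hypothetical placeholder frames instead of the real formatted strings:

```python
import pandas as pd

models = ["Linear", "Dense", "CNN"]

# Hypothetical one-row frames standing in for the per-dataset r2_df_* tables.
df_a = pd.DataFrame([["a1", "a2", "a3"]], columns=models)
df_b = pd.DataFrame([["b1", "b2", "b3"]], columns=models)

# rename() with a plain dict acts on the index, so row 0 gets a descriptive label.
combined = pd.concat(
    [
        df_a.rename({0: "Single (a)"}),
        df_b.rename({0: "Single (b)"}),
    ]
)
print(combined)
```

Because the row labels carry the dataset variant, the combined frame can be written with `index=True` and needs no extra column to tell the rows apart.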

In [240]:
# Output position df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_positions_r2.tex"
caption = """
Test set R2-scores for regression of positions of origin on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-positions-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    df_pos.to_latex(fp, escape=False, caption=caption, label=label, index=True)

In [241]:
# Output energy df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_energies_r2.tex"
caption = """
Test set R2-scores for regression of energies on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-energies-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)
    df_energy.to_latex(fp, escape=False, caption=caption, label=label, index=True)