**Results - Regression of simulated events**

This notebook is the primary source of plots and tables for the regression part of the thesis, 
with the goal of keeping every table and figure as standardized as possible. (And who has the time to update
90 tables one by one anyway?)

**Questions:**
* Descriptive statistics
    - Should descriptive statistics of the simulated data be included?\
    If so, how much? And should it be included for each fold in the k-fold cross-validation?
* Classification results
    - Breakdown of results based on event type? Single, double, close double?
    Reasonable to include in order to confirm the assumption that close doubles are the
    most difficult event type to classify correctly in simulated data.
    The random state is stored, so it should be simple to reproduce the indices.


**TODO**
* Implement reproducing the validation indices for each fold based on the random seed from config
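
A minimal sketch of how the TODO above could be approached. It assumes the folds were created with scikit-learn's `KFold` using `shuffle=True` and the seed stored in the experiment config; the helper name `validation_indices` and its arguments are hypothetical and not part of `master_scripts`.

```python
import numpy as np
from sklearn.model_selection import KFold


def validation_indices(n_samples, n_folds=5, random_seed=0):
    """Return one array of validation indices per fold.

    Assumption: training used KFold(shuffle=True, random_state=seed).
    If another splitter was used, these indices will not match.
    """
    kfold = KFold(n_splits=n_folds, shuffle=True, random_state=random_seed)
    placeholder = np.zeros(n_samples)  # KFold only needs the sample count
    return [val_idx for _, val_idx in kfold.split(placeholder)]


# The same seed gives the same folds on every call.
folds_a = validation_indices(100, n_folds=5, random_seed=42)
folds_b = validation_indices(100, n_folds=5, random_seed=42)
```

With the per-fold indices recovered, `event_indices` could then be applied to the matching subset of positions to break results down by event type.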

**Handy links**
* [matplotlib-plots to latex](https://timodenk.com/blog/exporting-matplotlib-plots-to-latex/)
* [Robert's thesis df output](https://github.com/ATTPC/VAE-event-classification/blob/master/src/make_classification_table.py)

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
from master_scripts.data_functions import get_git_root, normalize_image_data, event_indices, normalize_position_data
from master_scripts.analysis_functions import load_experiment, experiment_metrics_to_df
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
import tensorflow as tf
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

THESIS_PATH = "../../../master_thesis/"

In [2]:
# Load test set and normalize
repo_root = get_git_root()
test_images = np.load(repo_root + "data/simulated/test/" + "images_test.npy")
test_images = normalize_image_data(test_images)
test_positions = np.load(repo_root + "data/simulated/test/" + "positions_test.npy") 
test_energies = np.load(repo_root + "data/simulated/test/" + "energies_test.npy") 
test_labels = np.load(repo_root + "data/simulated/test/" + "labels_test.npy") 

# Set up indices for position and energy data
# s = single, d = double, c = close double
s_idx, d_idx, c_idx = event_indices(test_positions)

In [3]:
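def regression_metrics(model, x_val, y_val, name):
    """Calculate regression metrics on the validation data.

    :param model: trained model exposing a predict() method
    :param x_val: normalized detector images
    :param y_val: target values
    :param name: row label used in the returned DataFrame
    :return: single-row DataFrame with r2_score, mse, rmse and mae
    """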

    y_pred = model.predict(x_val)
    
    metrics = {}
    metrics['r2_score'] = r2_score(y_val, y_pred)
    metrics['mse'] = mean_squared_error(y_val, y_pred)
    metrics['rmse'] = mean_squared_error(y_val, y_pred, squared=False)
    metrics['mae'] = mean_absolute_error(y_val, y_pred)
    
    df = pd.DataFrame.from_dict(data={name: metrics}, orient='index')
    return df

In [4]:
# Experiment IDs
experiments_nomod = {
    'linreg_pos_single': "6f482ad9fe9c",
    'dense_pos_single': "cfbd0c21511a",
    'cnn_pos_single': "4b6620824337",
    'pretrained_pos_single': "9d2a595aaf29",
    'custom_pos_single': "5f3792c8f1a0",
    'linreg_pos_double': "6aa4a4c86271",
    'dense_pos_double': "4e1e812b5ecc",
    'cnn_pos_double': "897838d20a59",
    'pretrained_pos_double': "f0c8443c4f1f",
    'custom_pos_double': "9e0e52034147",
    'linreg_energy_single': "08ae31d8e295",
    'dense_energy_single': "a1796d7f5a79",
    'cnn_energy_single': "266870a6918c",
    'pretrained_energy_single': "db783df32018",
    'custom_energy_single': "1df05215dd0f",
    'linreg_energy_double': "316536f29c50",
    'dense_energy_double': "fddd96776642",
    'cnn_energy_double': "783c1c5d243a",
    'pretrained_energy_double': "a342ac515982",
    'custom_energy_double': "fe9206498a0c",
}
experiments_pixelmod = {
    'linreg_pos_single': "68bf60283e44",
    'dense_pos_single': "755662cc4968",
    'cnn_pos_single': "f049229882a6",
    'pretrained_pos_single': "67884378eed5",
    'custom_pos_single': "d22d8cf06af5",
    'linreg_pos_double': "acfb12d8e5c9",
    'dense_pos_double': "7ec5ca894adc",
    'cnn_pos_double': "336783ab5d3a",
    'pretrained_pos_double': "a6adc3dda8ea",
    'custom_pos_double': "57a4d0c4e961",
    'linreg_energy_single': "110752ceb2dc",
    'dense_energy_single': "1ad5251dad3a",
    'cnn_energy_single': "0620841ee4f8",
    'pretrained_energy_single': "57482c0ae0df",
    'custom_energy_single': "685eb9c5ce0d",
    'linreg_energy_double': "dd9ebe869015",
    'dense_energy_double': "db3151b952e9",
    'cnn_energy_double': "037a360e93b3",
    'pretrained_energy_double': "505fe4eb821b",
    'custom_energy_double': "f5645df3dc7e",
}

experiments_imbalanced = {
    'linreg_pos_single': "2253faab0aad",
    'dense_pos_single': "9d6687876aed",
    'cnn_pos_single': "75a5b5dcb248",
    'pretrained_pos_single': "62e99a102779",
    'custom_pos_single': "ebb4e144b648",
    'linreg_pos_double': "5397fe597104",
    'dense_pos_double': "70321a55db34",
    'cnn_pos_double': "5ee6765687b2",
    'pretrained_pos_double': "7cc270cd9f65",
    'custom_pos_double': "0455e97a0fa9",
    'linreg_energy_single': "2b5f8f9116e6",
    'dense_energy_single': "4c8bff08c890",
    'cnn_energy_single': "f2578e244e47",
    'pretrained_energy_single': "6d74c09833d0",
    'custom_energy_single': "d535f64740b8",
    'linreg_energy_double': "368a0a3775d1",
    'dense_energy_double': "74d82fe449a6",
    'cnn_energy_double': "5c11ddfe6ca9",
    'pretrained_energy_double': "7a80d3b25ea8",
    'custom_energy_double': "0a93befd27dc",
}

# Pre-processed simulated data - no additional modifications
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
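
As an illustration of the min-max normalization mentioned above, here is a small NumPy sketch. It is not the implementation of `normalize_image_data`; the function name `min_max_normalize` and the choice to normalize over the whole array (rather than per image) are assumptions.

```python
import numpy as np


def min_max_normalize(images):
    """Scale values to [0, 1] using the global minimum and maximum."""
    images = np.asarray(images, dtype=np.float64)
    lo, hi = images.min(), images.max()
    if hi == lo:
        # Constant input: avoid division by zero and return zeros.
        return np.zeros_like(images)
    return (images - lo) / (hi - lo)


demo = np.array([[0.0, 5.0], [10.0, 20.0]])
normalized = min_max_normalize(demo)
```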

## Single events

### Positions

#### Linear Regression

In [5]:
# Load linear regression experiment
lin_ex_id = experiments_nomod['linreg_pos_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [6]:
# Load dense regression experiment
dense_ex_id = experiments_nomod['dense_pos_single']
dense_ex = load_experiment(dense_ex_id)

# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [7]:
# Load CNN regression experiment
cnn_ex_id = experiments_nomod['cnn_pos_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained SOTA network,
VGG, which was trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling
which effectively reduces the image size to half (8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
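
The size argument above can be checked with a few lines of arithmetic. The sketch below assumes 2x2 max-pooling with stride 2 (which halves each spatial dimension, rounding down) and uses the five pooling layers of the standard VGG16 architecture; the function name is hypothetical, and the actual cut point used for the trained model is not stated in this notebook.

```python
def sizes_after_pooling(input_size, n_pool_layers):
    """Spatial size after each successive 2x2, stride-2 max-pooling layer."""
    sizes = []
    size = input_size
    for _ in range(n_pool_layers):
        size = size // 2  # each pooling layer halves the size (floor)
        sizes.append(size)
    return sizes


# A 16x16 detector image passed through VGG16's five pooling layers.
sizes = sizes_after_pooling(16, 5)
# Number of pooling layers that still leave at least one pixel.
usable = sum(1 for s in sizes if s >= 1)
```

Under these assumptions the image shrinks as 16 → 8 → 4 → 2 → 1, and the fifth pooling layer would leave nothing to process.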

In [8]:
# Load pretrained regression experiment
pretrained_ex_id = experiments_nomod['pretrained_pos_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [9]:
# Load custom regression experiment
custom_ex_id = experiments_nomod['custom_pos_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

In [10]:
display(custom_test)

model,r2_score,mse,rmse,mae
custom_test,0.999115,6.2e-05,0.00788,0.00391


#### Output
We use the standard deviation across the cross-validation folds as an error measure, and report the regression metrics (R2, MSE, RMSE and MAE) on the test set.
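
A small sketch of how a per-fold spread is obtained from a table of fold metrics. The numbers are made up for illustration, and the column layout is only assumed to resemble the output of `experiment_metrics_to_df`.

```python
import pandas as pd

# Hypothetical metrics for K=5 folds (illustrative values only).
fold_metrics = pd.DataFrame(
    {
        "r2_score": [0.90, 0.92, 0.94, 0.91, 0.93],
        "mse": [0.010, 0.012, 0.008, 0.011, 0.009],
    }
)

# One row with the mean over folds and one with the standard deviation.
# Note that pandas' "std" is the sample standard deviation (ddof=1).
summary = fold_metrics.agg(["mean", "std"])
r2_mean = summary.loc["mean", "r2_score"]
r2_std = summary.loc["std", "r2_score"]
```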

In [11]:
all_means_single_pos = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_pos['mse'] = np.sqrt(all_std_single_pos['mse'])*16*3 # scale to mm
all_test_single_pos = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_pos['mse'] = np.sqrt(all_test_single_pos['mse'])*16*3 # scale to mm
display(all_test_single_pos)
display(all_std_single_pos)

model,r2_score,mse,rmse,mae
Linear,0.80027,5.681745,0.11837,0.088629
Dense,0.9885,1.363318,0.028402,0.017665
CNN,0.996974,0.699328,0.014569,0.008565
Pretrained,0.997177,0.675548,0.014074,0.007462
Custom,0.999115,0.378219,0.00788,0.00391


model,r2_score,mse,rmse,mae
Linear,0.002889,0.642736,0.000758,0.001557
Dense,0.000701,0.33821,0.000874,0.000513
CNN,0.000921,0.384314,0.001936,0.001699
Pretrained,0.222923,5.990298,0.077535,0.064882
Custom,0.000137,0.147711,0.000628,0.000347
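
Note that the `mse` column in the two position tables is overwritten with `np.sqrt(mse)*16*3` before display, so it holds a root-mean-square error in rescaled units rather than a raw MSE. The sketch below reproduces that conversion for a single value. Reading the factors as 16 pixels of 3 mm each is an assumption based on the 16x16 image size and the "scale to mm" comments elsewhere in the notebook; the constant and function names are hypothetical.

```python
import math

N_PIXELS = 16      # detector image width in pixels
MM_PER_PIXEL = 3   # assumption: physical size of one pixel in mm


def normalized_mse_to_rmse_mm(mse):
    """Convert an MSE on normalized positions to an RMSE in mm."""
    return math.sqrt(mse) * N_PIXELS * MM_PER_PIXEL


# The Custom model's test RMSE of 0.00788 (normalized units) in mm.
custom_rmse_mm = normalized_mse_to_rmse_mm(0.00788 ** 2)
```

This gives roughly 0.378, in line with the value shown for the Custom model in the `mse` column of the test table.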


In [12]:
rows = all_test_single_pos.index
r2_str_array_single_pos = np.zeros((1, all_test_single_pos.shape[0]), dtype=object)
for i in range(all_test_single_pos.shape[0]):
    r2_str_array_single_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos["r2_score"][i], all_test_single_pos["r2_score"][i])

mse_str_array_single_pos = np.zeros((1, all_test_single_pos.shape[0]), dtype=object)
for i in range(all_test_single_pos.shape[0]):
    mse_str_array_single_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos["mse"][i], all_test_single_pos["mse"][i])
mse_df_single_pos = pd.DataFrame(mse_str_array_single_pos, columns=rows)

r2_df_single_pos = pd.DataFrame(r2_str_array_single_pos, columns=rows)


#section_path = "chapters/results/figures/"
#fname = THESIS_PATH + section_path + "regression_simulated_single_position_r2.tex"
#caption = """
#Mean R2-scores for regression of positions of origin, on single events in simulated data, using multiple models. 
#Error estimates are the standard deviation in results from k-fold cross-validation 
#with $K=5$ folds.
#"""
#label = "tab:regression-simulated-single-position-r2"
#with open(fname, "w") as fp:
#    pd.set_option('display.max_colwidth', None)
#    r2_df_single_pos.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [13]:
# Load linear regression experiment
lin_ex_id = experiments_nomod['linreg_energy_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)
print(lin_ex['experiment_name'])

results_energies_single_linreg


#### Small dense network

In [14]:
# Load dense regression experiment
dense_ex_id = experiments_nomod['dense_energy_single']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [15]:
# Load cnn regression experiment 
cnn_ex_id = experiments_nomod['cnn_energy_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [16]:
# Load pretrained regression experiment
pretrained_ex_id = experiments_nomod['pretrained_energy_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [17]:
# Load custom regression experiment 
custom_ex_id = experiments_nomod['custom_energy_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [18]:
all_means_single_energy = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_energy['mse'] = np.sqrt(all_std_single_energy['mse'])
all_test_single_energy = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_energy['mse'] = np.sqrt(all_test_single_energy['mse'])
display(all_test_single_energy)
display(all_std_single_energy)

model,r2_score,mse,rmse,mae
Linear,0.931644,0.075636,0.075636,0.057601
Dense,0.934257,0.074177,0.074177,0.056113
CNN,0.937209,0.072492,0.072492,0.054455
Pretrained,0.925566,0.078927,0.078927,0.061256
Custom,0.944194,0.068341,0.068341,0.050701


model,r2_score,mse,rmse,mae
Linear,0.03334,0.052758,0.018014,0.018714
Dense,0.036228,0.054989,0.019965,0.021548
CNN,0.040879,0.058412,0.021705,0.023078
Pretrained,0.03761,0.056027,0.020388,0.021444
Custom,0.029973,0.05001,0.017306,0.018958


In [19]:
rows = all_test_single_energy.index
r2_str_array_single_energy = np.zeros((1, all_test_single_energy.shape[0]), dtype=object)
for i in range(all_test_single_energy.shape[0]):
    r2_str_array_single_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy["r2_score"][i], all_test_single_energy["r2_score"][i])
    
mse_str_array_single_energy = np.zeros((1, all_test_single_energy.shape[0]), dtype=object)
for i in range(all_test_single_energy.shape[0]):
    mse_str_array_single_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy["mse"][i], all_test_single_energy["mse"][i])
mse_df_single_energy = pd.DataFrame(mse_str_array_single_energy, columns=rows)
        
r2_df_single_energy = pd.DataFrame(r2_str_array_single_energy, columns=rows)

#section_path = "chapters/results/figures/"
#fname = THESIS_PATH + section_path + "regression_simulated_single_energy_r2.tex"
#caption = """
#Mean R2-scores for regression of energy values, on single events in simulated data, using multiple models. 
#Error estimates are the standard deviation in results from k-fold cross-validation 
#with $K=5$ folds.
#"""
#label = "tab:regression-simulated-single-energy-r2"
#with open(fname, "w") as fp:
#    pd.set_option('display.max_colwidth', None)
#    r2_df_single_energy.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [20]:
# Load linear regression experiment
#lin_ex_id = "7b74b3cfc586"
lin_ex_id = experiments_nomod['linreg_pos_double']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [21]:
# Load dense regression experiment
#dense_ex_id = "ef55911e49d1"
dense_ex_id = experiments_nomod['dense_pos_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [22]:
# Load CNN regression experiment
#cnn_ex_id = "cc2654aea019"
cnn_ex_id = experiments_nomod['cnn_pos_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained SOTA network,
VGG, which was trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling
which effectively reduces the image size to half (8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.

In [23]:
# Load pretrained regression experiment
#pretrained_ex_id = "3c0d1b7bd0ac"
pretrained_ex_id = experiments_nomod['pretrained_pos_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [24]:
# Load custom regression experiment
#custom_ex_id = "468fefa67787"
custom_ex_id = experiments_nomod['custom_pos_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the cross-validation folds as an error measure, and report the regression metrics (R2, MSE, RMSE and MAE) on the test set.

In [25]:
all_means_double_pos = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_pos['mse'] = np.sqrt(all_std_double_pos['mse'])*16*3 # scale to mm
all_test_double_pos = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_pos['mse'] = np.sqrt(all_test_double_pos['mse'])*16*3 #scale to mm
display(all_test_double_pos)
display(all_std_double_pos)

model,r2_score,mse,rmse,mae
Linear,0.369641,10.091215,0.210234,0.169158
Dense,0.456381,9.371153,0.195232,0.158392
CNN,0.470586,9.247995,0.192667,0.157797
Pretrained,0.290192,10.706412,0.22305,0.184016
Custom,0.4929,9.050965,0.188562,0.153711


model,r2_score,mse,rmse,mae
Linear,0.003766,0.774587,0.000617,0.001528
Dense,0.005601,0.950576,0.001014,0.00082
CNN,0.001603,0.506852,0.00029,0.000504
Pretrained,0.15522,5.001838,0.024097,0.02108
Custom,0.000347,0.334215,0.000129,7e-05


In [26]:
rows = all_test_double_pos.index
r2_str_array_double_pos = np.zeros((1, all_test_double_pos.shape[0]), dtype=object)
for i in range(all_test_double_pos.shape[0]):
    r2_str_array_double_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos["r2_score"][i], all_test_double_pos["r2_score"][i])
    
mse_str_array_double_pos = np.zeros((1, all_test_double_pos.shape[0]), dtype=object)
for i in range(all_test_double_pos.shape[0]):
    mse_str_array_double_pos[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos["mse"][i], all_test_double_pos["mse"][i])
mse_df_double_pos = pd.DataFrame(mse_str_array_double_pos, columns=rows)
        
r2_df_double_pos = pd.DataFrame(r2_str_array_double_pos, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_position_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on double events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is deprecated
    r2_df_double_pos.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [27]:
# Load linear regression experiment 
#lin_ex_id = "6e600e08e8af"
lin_ex_id = experiments_nomod['linreg_energy_double']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [28]:
# Load dense regression experiment 
#dense_ex_id = "96cd3707d131"
dense_ex_id = experiments_nomod['dense_energy_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [29]:
# Load cnn regression experiment 
cnn_ex_id = experiments_nomod['cnn_energy_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [30]:
# Load pretrained (VGG16) regression experiment
pretrained_ex_id = experiments_nomod['pretrained_energy_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [31]:
# Load custom regression experiment 
custom_ex_id = experiments_nomod['custom_energy_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [32]:
all_means_double_energy = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_energy['mse'] = np.sqrt(all_std_double_energy['mse'])  # square root of the fold std of MSE (column name stays 'mse')
all_test_double_energy = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_energy['mse'] = np.sqrt(all_test_double_energy['mse'])  # 'mse' column now holds the RMSE
display(all_test_double_energy)
display(all_std_double_energy)

Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.489724,0.206303,0.206303,0.168101
Dense,0.490413,0.206164,0.206164,0.167996
CNN,0.487926,0.206667,0.206667,0.168482
Pretrained,0.488561,0.206536,0.206536,0.168432
Custom,0.491342,0.205975,0.205975,0.167865


Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.033493,0.052863,0.006383,0.00514
Dense,0.030838,0.050742,0.005896,0.004747
CNN,0.0413,0.058619,0.007806,0.006288
Pretrained,0.031376,0.051164,0.005979,0.004867
Custom,0.036177,0.054964,0.006925,0.005727


In [33]:
rows = all_test_double_energy.index
r2_str_array_double_energy = np.zeros((1, all_test_double_energy.shape[0]), dtype=object)
for i in range(all_test_double_energy.shape[0]):
    # Positional access with .iloc; the frames are indexed by model name
    r2_str_array_double_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy["r2_score"].iloc[i], all_test_double_energy["r2_score"].iloc[i])

mse_str_array_double_energy = np.zeros((1, all_test_double_energy.shape[0]), dtype=object)
for i in range(all_test_double_energy.shape[0]):
    mse_str_array_double_energy[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy["mse"].iloc[i], all_test_double_energy["mse"].iloc[i])
mse_df_double_energy = pd.DataFrame(mse_str_array_double_energy, columns=rows)
        
r2_df_double_energy = pd.DataFrame(r2_str_array_double_energy, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on double events in simulated data, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    r2_df_double_energy.to_latex(fp, escape=False, caption=caption, label=label, index=False)
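
The string formatting used in the cell above can be wrapped in a small helper so that every table uses the same template. This is a minimal sketch, assuming the thesis preamble loads `siunitx` (for `\num`) and `amsmath` (for `\underset`); `format_with_error` is a hypothetical name and not part of `master_scripts`.

```python
def format_with_error(value, error):
    """Return a LaTeX math string with `value` on top and `+- error` set beneath it."""
    # Same template as the notebook cells: error in scientific notation,
    # value with three significant digits.
    return r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(error, value)


# Illustrative numbers only, not results from the thesis.
cell = format_with_error(0.5, 0.25)
print(cell)
```

With such a helper, each per-model loop reduces to one call per table entry.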


# Pre-processed simulated data - Pixel modified
These are the basic metrics for all models trained on simulated data.
The basic pre-processing consists of formatting and min-max normalization.
In addition, the top and bottom rows of pixels are set to 0, and
one pixel inside the detector is permanently set to 0 (which idx again?).
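
The pixel modification described above could look roughly like the sketch below. It assumes images of shape `(n_samples, 16, 16, 1)`; the function name `apply_pixel_mod` and the dead-pixel index `(8, 8)` are placeholders, since the actual index is not stated in this notebook.

```python
import numpy as np


def apply_pixel_mod(images, dead_pixel=(8, 8)):
    """Zero the top and bottom pixel rows and one interior pixel in each image.

    `images` has shape (n_samples, 16, 16, 1). `dead_pixel` is a placeholder
    (row, col); the index used for the thesis data is not given here.
    """
    modified = images.copy()      # leave the caller's array untouched
    modified[:, 0, :, :] = 0      # top row
    modified[:, -1, :, :] = 0     # bottom row
    row, col = dead_pixel
    modified[:, row, col, :] = 0  # single dead pixel inside the detector
    return modified


demo = apply_pixel_mod(np.ones((2, 16, 16, 1)))
print(int(demo[0, :, :, 0].sum()))  # 256 - 16 - 16 - 1 = 223 pixels remain
```

The same function could be applied to the test images when the goal is to evaluate the models on data that matches their training distribution.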

## Single events

### Positions

#### Linear Regression

In [34]:
# Load linear regression experiment
lin_ex_id = experiments_pixelmod['linreg_pos_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [35]:
# Load dense regression experiment
dense_ex_id = experiments_pixelmod['dense_pos_single']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [36]:
# Load cnn regression experiment
cnn_ex_id = experiments_pixelmod['cnn_pos_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art network (VGG),
trained on the ImageNet database.

Because our detector images (16x16) are much smaller than the input the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling,
which halves the image size (16x16 becomes 8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
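
The size argument can be checked with a few lines of arithmetic. The sketch below assumes 2x2 max-pooling with stride 2 and no padding, as in VGG16, which contains five such layers; `sizes_after_pooling` is a hypothetical helper, and the output does not by itself say where the network was cut for the thesis.

```python
def sizes_after_pooling(input_size, n_pools, pool=2):
    """Spatial side length after each 2x2 max-pooling layer (stride 2, no padding)."""
    sizes = []
    size = input_size
    for _ in range(n_pools):
        size = size // pool
        sizes.append(size)
    return sizes


# VGG16 contains five max-pooling layers.
print(sizes_after_pooling(224, 5))  # ImageNet-sized input: [112, 56, 28, 14, 7]
print(sizes_after_pooling(16, 5))   # detector image:       [8, 4, 2, 1, 0]
```

For a 16x16 input the side length reaches 1 after the fourth pooling layer, so a fifth pooling layer has nothing left to pool.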

In [37]:
# Load pretrained (VGG) regression experiment
pretrained_ex_id = experiments_pixelmod['pretrained_pos_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [38]:
# Load custom regression experiment
custom_ex_id = experiments_pixelmod['custom_pos_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2-score for each model.
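
As a small illustration of this error measure, the sketch below computes the mean and the sample standard deviation over five fold scores using only the standard library. The numbers are made up for the example and are not thesis results.

```python
from statistics import mean, stdev

# Made-up R2 scores for five folds, for illustration only.
fold_r2 = [0.90, 0.92, 0.91, 0.89, 0.93]

r2_mean = mean(fold_r2)
r2_std = stdev(fold_r2)  # sample standard deviation (n - 1 in the denominator)
print(f"{r2_mean:.3f} +- {r2_std:.3f}")
```

`stdev` uses the same n - 1 normalization as the pandas default for `std`, so the two should agree on the same fold scores.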

In [39]:
all_means_single_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_pos_pmod['mse'] = np.sqrt(all_std_single_pos_pmod['mse'])*16*3  # square root of the fold std of MSE, scaled to mm (column name stays 'mse')
all_test_single_pos_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_pos_pmod['mse'] = np.sqrt(all_test_single_pos_pmod['mse'])*16*3  # 'mse' column now holds the RMSE, scaled to mm
display(all_test_single_pos_pmod)
display(all_std_single_pos_pmod)

Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.780824,5.951921,0.123998,0.093689
Dense,0.982026,1.703934,0.035499,0.022587
CNN,0.979772,1.806899,0.037644,0.024653
Pretrained,0.996908,0.706858,0.014726,0.008297
Custom,0.994937,0.904439,0.018842,0.007491


Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.002749,0.682407,0.000853,0.0013
Dense,0.001406,0.479092,0.001631,0.000838
CNN,0.00111,0.42214,0.002108,0.001108
Pretrained,0.000451,0.269395,0.001127,0.00136
Custom,0.00096,0.393849,0.002986,0.000715


In [40]:
rows = all_test_single_pos_pmod.index
r2_str_array_single_pos_pmod = np.zeros((1, all_test_single_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_single_pos_pmod.shape[0]):
    # Positional access with .iloc; the frames are indexed by model name
    r2_str_array_single_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_pmod["r2_score"].iloc[i], all_test_single_pos_pmod["r2_score"].iloc[i])

mse_str_array_single_pos_pmod = np.zeros((1, all_test_single_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_single_pos_pmod.shape[0]):
    mse_str_array_single_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_pmod["mse"].iloc[i], all_test_single_pos_pmod["mse"].iloc[i])
mse_df_single_pos_pmod = pd.DataFrame(mse_str_array_single_pos_pmod, columns=rows)
        
r2_df_single_pos_pmod = pd.DataFrame(r2_str_array_single_pos_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_position_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    r2_df_single_pos_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [41]:
# Load linear regression experiment
lin_ex_id = experiments_pixelmod['linreg_energy_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [42]:
# Load dense regression experiment
dense_ex_id = experiments_pixelmod['dense_energy_single']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN
The small CNN is very sensitive to the pixel modifications.
It performs similarly to the other models if the same pixel modifications are also applied to the test data.

In [43]:
# Load cnn regression experiment
cnn_ex_id = experiments_pixelmod['cnn_energy_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)
#display(cnn_test)

#### Pretrained - VGG16 

In [44]:
# Load pretrained (VGG16) regression experiment
pretrained_ex_id = experiments_pixelmod['pretrained_energy_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [45]:
# Load custom regression experiment
custom_ex_id = experiments_pixelmod['custom_energy_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [46]:
all_means_single_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_energy_pmod['mse'] = np.sqrt(all_std_single_energy_pmod['mse'])  # square root of the fold std of MSE (column name stays 'mse')
all_test_single_energy_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_energy_pmod['mse'] = np.sqrt(all_test_single_energy_pmod['mse'])  # 'mse' column now holds the RMSE
display(all_test_single_energy_pmod)
display(all_std_single_energy_pmod)

Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.767872,0.139381,0.139381,0.115187
Dense,0.744545,0.146217,0.146217,0.122245
CNN,0.479845,0.208645,0.208645,0.169052
Pretrained,0.78126,0.135302,0.135302,0.113526
Custom,0.751786,0.14413,0.14413,0.120671


Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.024595,0.045187,0.016237,0.01877
Dense,0.022225,0.042958,0.015898,0.019227
CNN,0.02575,0.046243,0.017808,0.021295
Pretrained,0.019476,0.040212,0.014395,0.017851
Custom,0.031672,0.05129,0.021167,0.024938


In [47]:
rows = all_test_single_energy_pmod.index
r2_str_array_single_energy_pmod = np.zeros((1, all_test_single_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_single_energy_pmod.shape[0]):
    # Positional access with .iloc; the frames are indexed by model name
    r2_str_array_single_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_pmod["r2_score"].iloc[i], all_test_single_energy_pmod["r2_score"].iloc[i])
mse_str_array_single_energy_pmod = np.zeros((1, all_test_single_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_single_energy_pmod.shape[0]):
    mse_str_array_single_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_pmod["mse"].iloc[i], all_test_single_energy_pmod["mse"].iloc[i])
mse_df_single_energy_pmod = pd.DataFrame(mse_str_array_single_energy_pmod, columns=rows)
        
r2_df_single_energy_pmod = pd.DataFrame(r2_str_array_single_energy_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_single_energy_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on single events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    r2_df_single_energy_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [48]:
# Load linear regression experiment
lin_ex_id = experiments_pixelmod['linreg_pos_double']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [49]:
# Load dense regression experiment
dense_ex_id = experiments_pixelmod['dense_pos_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [50]:
# Load cnn regression experiment
cnn_ex_id = experiments_pixelmod['cnn_pos_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional performance baseline, we include a pretrained state-of-the-art network (VGG),
trained on the ImageNet database.

Because our detector images (16x16) are much smaller than the input the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling,
which halves the image size (16x16 becomes 8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
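
A second practical detail is the channel count: VGG with ImageNet weights expects three input channels, while the detector images have one. The notebook handles this by concatenating each image with itself along the last axis. The sketch below shows that step on dummy data, assuming images of shape `(n_samples, 16, 16, 1)`; the array names are placeholders.

```python
import numpy as np

# Two dummy single-channel 16x16 images, shape (n_samples, 16, 16, 1).
gray = np.random.default_rng(0).random((2, 16, 16, 1))

# Repeat the single channel three times along the last axis.
rgb_like = np.concatenate((gray, gray, gray), axis=-1)

print(rgb_like.shape)  # (2, 16, 16, 3)
```

`np.repeat(gray, 3, axis=-1)` gives the same array and avoids writing the input three times.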

In [51]:
# Load pretrained (VGG) regression experiment
pretrained_ex_id = experiments_pixelmod['pretrained_pos_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [52]:
# Load custom regression experiment
custom_ex_id = experiments_pixelmod['custom_pos_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2-score for each model.
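
For the position tables, the notebook converts the MSE on normalized positions to an RMSE in mm using the factor `16*3`. The sketch below spells out that conversion. Reading the factor as 16 pixels with a 3 mm pitch is an assumption based on the `#scale to mm` comment, and `rmse_in_mm` is a hypothetical helper name.

```python
import math

N_PIXELS = 16    # detector side length in pixels
PIXEL_MM = 3.0   # assumed pixel pitch in mm


def rmse_in_mm(mse_normalized):
    """Convert an MSE on [0, 1]-normalized positions to an RMSE in mm."""
    return math.sqrt(mse_normalized) * N_PIXELS * PIXEL_MM


# Illustrative value only: an MSE of 0.0004 is an RMSE of 0.02 in normalized units.
print(round(rmse_in_mm(0.0004), 2))
```

Taking the square root before scaling matters, because the MSE carries squared units while the detector dimensions are linear.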

In [53]:
all_means_double_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_pos_pmod['mse'] = np.sqrt(all_std_double_pos_pmod['mse'])*16*3  # square root of the fold std of MSE, scaled to mm (column name stays 'mse')
all_test_double_pos_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_pos_pmod['mse'] = np.sqrt(all_test_double_pos_pmod['mse'])*16*3  # 'mse' column now holds the RMSE, scaled to mm
display(all_test_double_pos_pmod)
display(all_std_double_pos_pmod)

Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.363524,10.140017,0.21125,0.169904
Dense,0.457978,9.357444,0.194947,0.158731
CNN,0.435052,9.55331,0.199027,0.160244
Pretrained,0.289023,10.71516,0.223232,0.183692
Custom,0.488727,9.088124,0.189336,0.154539


Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.000681,0.374785,0.000145,0.000445
Dense,0.003431,0.734363,0.000606,0.000404
CNN,0.001835,0.501365,0.000284,0.000394
Pretrained,0.155021,4.998659,0.024073,0.02111
Custom,0.000287,0.249135,7.2e-05,6.2e-05


In [54]:
rows = all_test_double_pos_pmod.index
r2_str_array_double_pos_pmod = np.zeros((1, all_test_double_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_double_pos_pmod.shape[0]):
    # Positional access with .iloc; the frames are indexed by model name
    r2_str_array_double_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_pmod["r2_score"].iloc[i], all_test_double_pos_pmod["r2_score"].iloc[i])

mse_str_array_double_pos_pmod = np.zeros((1, all_test_double_pos_pmod.shape[0]), dtype=object)
for i in range(all_test_double_pos_pmod.shape[0]):
    mse_str_array_double_pos_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_pmod["mse"].iloc[i], all_test_double_pos_pmod["mse"].iloc[i])
mse_df_double_pos_pmod = pd.DataFrame(mse_str_array_double_pos_pmod, columns=rows)
        
r2_df_double_pos_pmod = pd.DataFrame(r2_str_array_double_pos_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_position_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on double events in simulated data with specific pixels
set to zero, using multiple models. 
Error estimates are the standard deviation in results from k-fold cross-validation 
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    r2_df_double_pos_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [55]:
# Load linear regression experiment
lin_ex_id = experiments_pixelmod['linreg_energy_double']
lin_ex = load_experiment(lin_ex_id)
# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [56]:
# Load dense regression experiment
dense_ex_id = experiments_pixelmod['dense_energy_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [57]:
# Load cnn regression experiment
cnn_ex_id = experiments_pixelmod['cnn_energy_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [58]:
# Load pretrained regression experiment
pretrained_ex_id = experiments_pixelmod['pretrained_energy_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [59]:
# Load custom regression experiment
custom_ex_id = experiments_pixelmod['custom_energy_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [60]:
all_means_double_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy_pmod = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_energy_pmod['mse'] = np.sqrt(all_std_double_energy_pmod['mse'])
all_test_double_energy_pmod = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_energy_pmod['mse'] = np.sqrt(all_test_double_energy_pmod['mse'])
display(all_test_double_energy_pmod)
display(all_std_double_energy_pmod)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.484912 | 0.207275 | 0.207275 | 0.16882 |
| Dense | 0.486881 | 0.206877 | 0.206877 | 0.168517 |
| CNN | 0.478423 | 0.208573 | 0.208573 | 0.169819 |
| Pretrained | 0.488723 | 0.206504 | 0.206504 | 0.168374 |
| Custom | 0.463511 | 0.211532 | 0.211532 | 0.172174 |

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.003157 | 0.016133 | 0.000628 | 0.000566 |
| Dense | 0.002347 | 0.014207 | 0.000488 | 0.00044 |
| CNN | 0.007096 | 0.023925 | 0.001374 | 0.001231 |
| Pretrained | 0.004508 | 0.019918 | 0.000958 | 0.000847 |
| Custom | 0.003659 | 0.016841 | 0.000685 | 0.000656 |


In [61]:
rows = all_test_double_energy_pmod.index
r2_str_array_double_energy_pmod = np.zeros((1, all_test_double_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_double_energy_pmod.shape[0]):
    r2_str_array_double_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_pmod["r2_score"].iloc[i], all_test_double_energy_pmod["r2_score"].iloc[i])
    
mse_str_array_double_energy_pmod = np.zeros((1, all_test_double_energy_pmod.shape[0]), dtype=object)
for i in range(all_test_double_energy_pmod.shape[0]):
    mse_str_array_double_energy_pmod[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_pmod["mse"].iloc[i], all_test_double_energy_pmod["mse"].iloc[i])
mse_df_double_energy_pmod = pd.DataFrame(mse_str_array_double_energy_pmod, columns=rows)
        
r2_df_double_energy_pmod = pd.DataFrame(r2_str_array_double_energy_pmod, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on double events in simulated data with specific pixels
set to zero, using multiple models. Error estimates are the standard deviation in results from k-fold 
cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is no longer accepted
    r2_df_double_energy_pmod.to_latex(fp, escape=False, caption=caption, label=label, index=False)


# Pre-processed simulated data - Pixel modified and imbalanced
These are the basic metrics for all the models trained on simulated data.
The basic pre-processing includes formatting and min-max normalization.
Additionally, the top and bottom rows of pixels have been set to 0, and
one pixel inside the detector is permanently 0 (which idx again?).

This dataset has also been purposefully imbalanced to mimic the properties of experimental data
where doubles in space are expected to be rare.
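
As a rough illustration of the two modifications described above, the sketch below zeroes the top and bottom pixel rows plus one interior pixel, and then subsamples one class to create an imbalance. The function names, the `dead_pixel` index `(8, 8)`, the `keep_fraction` value and the image shape `(N, 16, 16, 1)` are assumptions made for illustration; they are not taken from the scripts that produced the dataset.

```python
import numpy as np


def zero_dead_pixels(images, dead_pixel=(8, 8)):
    """Return a copy with the top and bottom rows and one interior pixel set to 0.

    `dead_pixel` is a hypothetical (row, col); the actual index is not stated here.
    """
    modified = images.copy()
    modified[:, 0, :, :] = 0    # top row
    modified[:, -1, :, :] = 0   # bottom row
    modified[:, dead_pixel[0], dead_pixel[1], :] = 0
    return modified


def imbalance(labels, keep_fraction=0.1, minority_label=1, seed=0):
    """Return sorted indices keeping all majority samples and a random
    fraction of the samples labelled `minority_label`."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(labels == minority_label)
    majority = np.flatnonzero(labels != minority_label)
    n_keep = int(round(keep_fraction * minority.size))
    kept = rng.choice(minority, size=n_keep, replace=False)
    return np.sort(np.concatenate([majority, kept]))


# Synthetic stand-in data: 100 images, half of them in the class to be made rare.
images = np.random.default_rng(1).random((100, 16, 16, 1))
labels = np.repeat([0, 1], 50)

modified = zero_dead_pixels(images)
idx = imbalance(labels, keep_fraction=0.1)
```

Indexing `modified[idx]` and `labels[idx]` then gives a dataset where the chosen class makes up only a small share of the samples.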

## Single events

### Positions

#### Linear Regression

In [62]:
# Load linear regression experiment 
lin_ex_id = experiments_imbalanced['linreg_pos_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [63]:
# Load dense regression experiment
#dense_ex_id = "af61fe608db1"
dense_ex_id = experiments_imbalanced['dense_pos_single']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [64]:
# Load cnn regression experiment
cnn_ex_id = experiments_imbalanced['cnn_pos_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained SOTA network (VGG16),
trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling
which effectively reduces the image size to half (8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
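
The shrinking described above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 2x2 max-pooling with stride 2 and no padding, as in the standard VGG16 architecture with its five pooling layers; `sizes_after_pooling` is a helper introduced only for this illustration.

```python
def sizes_after_pooling(size, n_pools, pool=2):
    """Spatial size after each of `n_pools` max-pooling layers (stride == pool, no padding)."""
    sizes = []
    for _ in range(n_pools):
        size = size // pool
        sizes.append(size)
    return sizes


# The standard VGG16 has five 2x2 max-pooling layers.
imagenet_sizes = sizes_after_pooling(224, 5)
detector_sizes = sizes_after_pooling(16, 5)

# Number of pooling layers a 16x16 image survives with at least one pixel left.
usable_pools = sum(s >= 1 for s in detector_sizes)

print(imagenet_sizes)  # [112, 56, 28, 14, 7]
print(detector_sizes)  # [8, 4, 2, 1, 0]
print(usable_pools)    # 4
```

Under these assumptions a 16x16 detector image is reduced to a single pixel after four pooling layers, so the fifth has nothing left to pool.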

In [65]:
# Load pretrained regression experiment
#pretrained_ex_id = "a7340b9e74ad"
pretrained_ex_id = experiments_imbalanced['pretrained_pos_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [66]:
# Load custom regression experiment
#custom_ex_id = "33fa607a199b"
custom_ex_id = experiments_imbalanced['custom_pos_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], normalize_position_data(test_positions[s_idx])[:,:2], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2-score evaluated on the test set.
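
A minimal sketch of how the fold-wise error estimate is obtained, using made-up per-fold values rather than results from any experiment. The column names mirror the ones used in this notebook; the numbers are hypothetical.

```python
import pandas as pd

# Hypothetical metrics from K=5 folds (one row per fold).
fold_metrics = pd.DataFrame(
    {
        "r2_score": [0.98, 0.97, 0.99, 0.98, 0.98],
        "mae": [0.022, 0.024, 0.021, 0.023, 0.022],
    }
)

# One row with the mean and one with the standard deviation across folds.
# Note that pandas uses the sample standard deviation (ddof=1).
summary = fold_metrics.agg(["mean", "std"])
print(summary)
```

The `std` row plays the role of the error estimate, while the `mean` row summarizes the cross-validation performance.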

In [67]:
all_means_single_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_pos_imbalanced['mse'] = np.sqrt(all_std_single_pos_imbalanced['mse'])*16*3 #scale to mm
all_test_single_pos_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_pos_imbalanced['mse'] = np.sqrt(all_test_single_pos_imbalanced['mse'])*16*3 #scale to mm
display(all_test_single_pos_imbalanced)
display(all_std_single_pos_imbalanced)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.780824 | 5.951921 | 0.123998 | 0.093689 |
| Dense | 0.982026 | 1.70394 | 0.035499 | 0.022588 |
| CNN | 0.980206 | 1.787418 | 0.037238 | 0.024505 |
| Pretrained | 0.996768 | 0.722726 | 0.015057 | 0.008724 |
| Custom | 0.993477 | 1.02668 | 0.021389 | 0.007832 |

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.002749 | 0.682408 | 0.000853 | 0.0013 |
| Dense | 0.001424 | 0.482023 | 0.00165 | 0.000853 |
| CNN | 0.00108 | 0.416431 | 0.002056 | 0.00114 |
| Pretrained | 0.000493 | 0.281714 | 0.001222 | 0.001434 |
| Custom | 0.001639 | 0.514709 | 0.00452 | 0.00094 |


In [68]:
rows = all_test_single_pos_imbalanced.index
r2_str_array_single_pos_imbalanced = np.zeros((1, all_test_single_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_pos_imbalanced.shape[0]):
    r2_str_array_single_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_imbalanced["r2_score"].iloc[i], all_test_single_pos_imbalanced["r2_score"].iloc[i])
mse_str_array_single_pos_imbalanced = np.zeros((1, all_test_single_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_pos_imbalanced.shape[0]):
    mse_str_array_single_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_pos_imbalanced["mse"].iloc[i], all_test_single_pos_imbalanced["mse"].iloc[i])
mse_df_single_pos_imbalanced = pd.DataFrame(mse_str_array_single_pos_imbalanced, columns=rows)
        
r2_df_single_pos_imbalanced = pd.DataFrame(r2_str_array_single_pos_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
# Separate file and label from the pixel-modified tables
fname = THESIS_PATH + section_path + "regression_simulated_single_position_imbalanced_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on single events in imbalanced simulated data
with specific pixels set to zero, using multiple models.
Error estimates are the standard deviation in results from k-fold cross-validation
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-position-imbalanced-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is no longer accepted
    r2_df_single_pos_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [69]:
# Load linear regression experiment 
lin_ex_id = experiments_imbalanced['linreg_energy_single']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [70]:
# Load dense regression experiment 
dense_ex_id = experiments_imbalanced['dense_energy_single']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[s_idx], test_energies[s_idx,0], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [71]:
# Load cnn regression experiment 
cnn_ex_id = experiments_imbalanced['cnn_energy_single']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[s_idx], test_energies[s_idx,0], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [72]:
# Load pretrained regression experiment
pretrained_ex_id = experiments_imbalanced['pretrained_energy_single']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[s_idx], test_energies[s_idx,0], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [73]:
# Load custom regression experiment 
custom_ex_id = experiments_imbalanced['custom_energy_single']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[s_idx], test_energies[s_idx,0], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [74]:
all_means_single_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_single_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_single_energy_imbalanced['mse'] = np.sqrt(all_std_single_energy_imbalanced['mse'])
all_test_single_energy_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_single_energy_imbalanced['mse'] = np.sqrt(all_test_single_energy_imbalanced['mse'])
display(all_test_single_energy_imbalanced)
display(all_std_single_energy_imbalanced)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.767872 | 0.139381 | 0.139381 | 0.115187 |
| Dense | 0.744545 | 0.146217 | 0.146217 | 0.122245 |
| CNN | 0.432472 | 0.217939 | 0.217939 | 0.183872 |
| Pretrained | 0.781421 | 0.135253 | 0.135253 | 0.113448 |
| Custom | 0.724011 | 0.15198 | 0.15198 | 0.126578 |

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.024595 | 0.045187 | 0.016237 | 0.01877 |
| Dense | 0.022227 | 0.04296 | 0.015907 | 0.019251 |
| CNN | 0.025216 | 0.045761 | 0.017508 | 0.021094 |
| Pretrained | 0.01955 | 0.040289 | 0.01441 | 0.017824 |
| Custom | 0.029563 | 0.049552 | 0.020111 | 0.023985 |


In [75]:
rows = all_test_single_energy_imbalanced.index
r2_str_array_single_energy_imbalanced = np.zeros((1, all_test_single_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_energy_imbalanced.shape[0]):
    r2_str_array_single_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_imbalanced["r2_score"].iloc[i], all_test_single_energy_imbalanced["r2_score"].iloc[i])
    
mse_str_array_single_energy_imbalanced = np.zeros((1, all_test_single_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_single_energy_imbalanced.shape[0]):
    mse_str_array_single_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_single_energy_imbalanced["mse"].iloc[i], all_test_single_energy_imbalanced["mse"].iloc[i])
mse_df_single_energy_imbalanced = pd.DataFrame(mse_str_array_single_energy_imbalanced, columns=rows)
        
r2_df_single_energy_imbalanced = pd.DataFrame(r2_str_array_single_energy_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
# Separate file and label from the pixel-modified tables
fname = THESIS_PATH + section_path + "regression_simulated_single_energy_imbalanced_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on single events in imbalanced simulated data
with specific pixels set to zero, using multiple models.
Error estimates are the standard deviation in results from k-fold cross-validation
with $K=5$ folds.
"""
label = "tab:regression-simulated-single-energy-imbalanced-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is no longer accepted
    r2_df_single_energy_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


## Double events

### Positions

#### Linear Regression

In [76]:
# Load linear regression experiment 
lin_ex_id = experiments_imbalanced['linreg_pos_double']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "lin_test")
del lin_model #No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [77]:
# Load dense regression experiment
dense_ex_id = experiments_imbalanced['dense_pos_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], normalize_position_data(test_positions[d_idx]), "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [78]:
# Load cnn regression experiment

cnn_ex_id = experiments_imbalanced['cnn_pos_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG
As an additional baseline for performance, we include a pretrained SOTA network (VGG16),
trained on the ImageNet database.

Due to the size of our detector images (16x16) compared with the size the VGG network is
designed for, we cannot use all layers in the VGG network. This stems from the use of max-pooling
which effectively reduces the image size to half (8x8) each time the input is passed through such a
layer. At some point our input is too small to pass through to the rest of the network.
We therefore cut the network at the point where this becomes an issue.
Alternatively, one could possibly keep the depth but remove max-pooling layers.
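
Besides the image size, the number of channels also has to match: networks pretrained on ImageNet take three-channel (RGB) input, while the detector images have a single channel. The cells in this notebook handle this by stacking the same image three times along the last axis. The sketch below shows the idea on random stand-in data with the assumed shape `(N, 16, 16, 1)`.

```python
import numpy as np

# Random stand-in for single-channel detector images, shape (N, 16, 16, 1).
gray = np.random.default_rng(0).random((4, 16, 16, 1))

# Stack the same image three times along the channel axis, as done for the pretrained model.
rgb_like = np.concatenate((gray, gray, gray), axis=-1)

# np.repeat along the channel axis gives the same result.
repeated = np.repeat(gray, 3, axis=-1)

print(rgb_like.shape)                      # (4, 16, 16, 3)
print(np.array_equal(rgb_like, repeated))  # True
```

Since all three channels are identical, no information is added; the stacking only makes the input shape compatible with the pretrained weights.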

In [79]:
# Load pretrained regression experiment
pretrained_ex_id = experiments_imbalanced['pretrained_pos_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], normalize_position_data(test_positions[d_idx]), "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [80]:
# Load custom regression experiment 
custom_ex_id = experiments_imbalanced['custom_pos_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], normalize_position_data(test_positions[d_idx]), "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output
We use the standard deviation across the folds as an error measure, and report the R2-score evaluated on the test set.
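
To make the table entries easier to read, here is a small sketch of how one score and its error estimate are combined into a single LaTeX cell, using the same format string as the cells in this notebook. The helper name `latex_cell` and the example numbers are illustrative only. The resulting string assumes that `\underset` (from `amsmath`) and `\num` (from `siunitx`) are available in the thesis preamble.

```python
def latex_cell(value, std):
    """Format a value with its error estimate typeset underneath it."""
    return r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(std, value)


cell = latex_cell(0.478479, 0.004187)
print(cell)  # $\underset{\num{+- 4.187e-03 }  }{\num{ 0.478 } }$
```

The doubled braces in the format string are escapes for literal braces, so only `{:.3e}` and `{:.3g}` are replaced by numbers.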

In [81]:
all_means_double_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_pos_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_pos_imbalanced['mse'] = np.sqrt(all_std_double_pos_imbalanced['mse'])*16*3 #scale to mm
all_test_double_pos_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_pos_imbalanced['mse'] = np.sqrt(all_test_double_pos_imbalanced['mse'])*16*3 #scale to mm
display(all_test_double_pos_imbalanced)
display(all_std_double_pos_imbalanced)

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.356528 | 10.195619 | 0.212409 | 0.173276 |
| Dense | 0.417451 | 9.701021 | 0.202105 | 0.163032 |
| CNN | 0.442153 | 9.493067 | 0.197772 | 0.159013 |
| Pretrained | -0.924294 | 17.63447 | 0.367385 | 0.287218 |
| Custom | 0.478479 | 9.178777 | 0.191225 | 0.155669 |

| Model | r2_score | mse | rmse | mae |
|---|---|---|---|---|
| Linear | 0.007768 | 1.299084 | 0.001729 | 0.001072 |
| Dense | 0.009456 | 1.345486 | 0.001978 | 0.001481 |
| CNN | 0.002507 | 0.881085 | 0.000858 | 0.000923 |
| Pretrained | 0.845202 | 11.709055 | 0.073263 | 0.065802 |
| Custom | 0.004187 | 1.032605 | 0.001215 | 0.000882 |


In [82]:
rows = all_test_double_pos_imbalanced.index
r2_str_array_double_pos_imbalanced = np.zeros((1, all_test_double_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_pos_imbalanced.shape[0]):
    r2_str_array_double_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_imbalanced["r2_score"].iloc[i], all_test_double_pos_imbalanced["r2_score"].iloc[i])
    
mse_str_array_double_pos_imbalanced = np.zeros((1, all_test_double_pos_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_pos_imbalanced.shape[0]):
    mse_str_array_double_pos_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_pos_imbalanced["mse"].iloc[i], all_test_double_pos_imbalanced["mse"].iloc[i])
mse_df_double_pos_imbalanced = pd.DataFrame(mse_str_array_double_pos_imbalanced, columns=rows)
        
r2_df_double_pos_imbalanced = pd.DataFrame(r2_str_array_double_pos_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
# Separate file and label from the pixel-modified tables
fname = THESIS_PATH + section_path + "regression_simulated_double_position_imbalanced_r2.tex"
caption = """
Mean R2-scores for regression of positions of origin, on double events in imbalanced simulated data
with specific pixels set to zero, using multiple models.
Error estimates are the standard deviation in results from k-fold cross-validation
with $K=5$ folds.
"""
label = "tab:regression-simulated-double-position-imbalanced-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is no longer accepted
    r2_df_double_pos_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


### Energy

#### Linear regression

In [83]:
# Load linear regression experiment 
lin_ex_id = experiments_imbalanced['linreg_energy_double']
lin_ex = load_experiment(lin_ex_id)

# Load model and predict
lin_model = tf.keras.models.load_model(repo_root + "models/" + lin_ex_id + ".h5", compile=False)
lin_test = regression_metrics(lin_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "lin_test")
del lin_model  # No longer needed, clear memory just in case.

lin_metrics = experiment_metrics_to_df(lin_ex)
#display(lin_metrics)
lin_means = lin_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
lin_means = lin_means.rename(index={'mean': 'lin_mean', 'std': 'lin_std'})
#display(lin_means)

#### Small dense network

In [84]:
# Load dense regression experiment
dense_ex_id = experiments_imbalanced['dense_energy_double']
dense_ex = load_experiment(dense_ex_id)
# Load model and predict
dense_model = tf.keras.models.load_model(repo_root + "models/" + dense_ex_id + ".h5", compile=False)
dense_test = regression_metrics(dense_model, test_images.reshape(test_images.shape[0], 256)[d_idx], test_energies[d_idx], "dense_test")
del dense_model

dense_metrics = experiment_metrics_to_df(dense_ex)
#display(dense_metrics)
dense_means = dense_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
dense_means = dense_means.rename(index={'mean': 'dense_mean', 'std': 'dense_std'})
#display(dense_means)

#### Small CNN

In [85]:
# Load cnn regression experiment
cnn_ex_id = experiments_imbalanced['cnn_energy_double']
cnn_ex = load_experiment(cnn_ex_id)
# Load model and predict
cnn_model = tf.keras.models.load_model(repo_root + "models/" + cnn_ex_id + ".h5", compile=False)
cnn_test = regression_metrics(cnn_model, test_images[d_idx], test_energies[d_idx], "cnn_test")
del cnn_model

cnn_metrics = experiment_metrics_to_df(cnn_ex)
#display(cnn_metrics)
cnn_means = cnn_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
cnn_means = cnn_means.rename(index={'mean': 'cnn_mean', 'std': 'cnn_std'})
#display(cnn_means)

#### Pretrained - VGG16 

In [86]:
# Load pretrained (VGG16) regression experiment
pretrained_ex_id = experiments_imbalanced['pretrained_energy_double']
pretrained_ex = load_experiment(pretrained_ex_id)
# Load model and predict
pretrained_model = tf.keras.models.load_model(repo_root + "models/" + pretrained_ex_id + ".h5", compile=False)
pretrained_test = regression_metrics(pretrained_model, np.concatenate((test_images, test_images, test_images), axis=-1)[d_idx], test_energies[d_idx], "pretrained_test")
del pretrained_model

pretrained_metrics = experiment_metrics_to_df(pretrained_ex)
#display(pretrained_metrics)
pretrained_means = pretrained_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
pretrained_means = pretrained_means.rename(index={'mean': 'pretrained_mean', 'std': 'pretrained_std'})
#display(pretrained_means)

#### Custom model

In [87]:
# Load custom regression experiment 
custom_ex_id = experiments_imbalanced['custom_energy_double']
custom_ex = load_experiment(custom_ex_id)
# Load model and predict
custom_model = tf.keras.models.load_model(repo_root + "models/" + custom_ex_id + ".h5", compile=False)
custom_test = regression_metrics(custom_model, test_images[d_idx], test_energies[d_idx], "custom_test")
del custom_model

custom_metrics = experiment_metrics_to_df(custom_ex)
#display(custom_metrics)
custom_means = custom_metrics.agg([np.mean, np.std])#.applymap('{:.3f}'.format)
custom_means = custom_means.rename(index={'mean': 'custom_mean', 'std': 'custom_std'})
#display(custom_means)

#### Output

In [88]:
all_means_double_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_mean': 'Linear',
        'dense_mean': 'Dense',
        'cnn_mean': 'CNN',
        'pretrained_mean': 'Pretrained',
        'custom_mean': 'Custom',
    }
)

all_std_double_energy_imbalanced = pd.DataFrame(
    [
        lin_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        dense_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
        custom_means.iloc[1][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_std': 'Linear',
        'dense_std': 'Dense',
        'cnn_std': 'CNN',
        'pretrained_std': 'Pretrained',
        'custom_std': 'Custom',
    }
)
all_std_double_energy_imbalanced['mse'] = np.sqrt(all_std_double_energy_imbalanced['mse'])
all_test_double_energy_imbalanced = pd.DataFrame(
    [
        lin_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        dense_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        cnn_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        pretrained_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
        custom_test.iloc[0][['r2_score', 'mse', 'rmse', 'mae']],
    ]
).rename(
    index={
        'lin_test': 'Linear',
        'dense_test': 'Dense',
        'cnn_test': 'CNN',
        'pretrained_test': 'Pretrained',
        'custom_test': 'Custom',
    }
)
all_test_double_energy_imbalanced['mse'] = np.sqrt(all_test_double_energy_imbalanced['mse'])
display(all_test_double_energy_imbalanced)
display(all_std_double_energy_imbalanced)

Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.43421,0.217239,0.217239,0.17722
Dense,0.422217,0.219541,0.219541,0.178922
CNN,0.445965,0.21498,0.21498,0.175488
Pretrained,0.417041,0.220515,0.220515,0.182564
Custom,0.401321,0.223483,0.223483,0.182139


Unnamed: 0,r2_score,mse,rmse,mae
Linear,0.046108,0.061761,0.00877,0.007056
Dense,0.045831,0.061566,0.008696,0.006983
CNN,0.045541,0.061337,0.00863,0.006929
Pretrained,0.038678,0.056449,0.007198,0.003539
Custom,0.048022,0.063068,0.009137,0.007514
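
The cell above takes the square root of the `mse` column in both the test table and the standard deviation table. For the spread, this gives the square root of the standard deviation of the MSE, which is not the same number as the standard deviation of the per-fold RMSE (compare the `mse` and `rmse` columns in the second table). A small sketch of the difference, using made-up per-fold values and only the standard library:

```python
import math
import statistics

# Hypothetical per-fold MSE values (K=5), for illustration only.
fold_mse = [0.040, 0.045, 0.050, 0.055, 0.060]

# What the cell above computes for the 'mse' column of the std table:
# the square root of the sample standard deviation of the MSE.
sqrt_of_std = math.sqrt(statistics.stdev(fold_mse))

# Alternative: convert each fold to RMSE first, then take the spread.
fold_rmse = [math.sqrt(m) for m in fold_mse]
std_of_rmse = statistics.stdev(fold_rmse)

print(f"sqrt(std(mse)) = {sqrt_of_std:.4f}")
print(f"std(rmse)      = {std_of_rmse:.4f}")
```

Which of the two belongs in the table is a choice for the thesis. The sketch only shows that they are different quantities.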


In [89]:
rows = all_test_double_energy_imbalanced.index
r2_str_array_double_energy_imbalanced = np.zeros((1, all_test_double_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_energy_imbalanced.shape[0]):
    r2_str_array_double_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_imbalanced["r2_score"].iloc[i], all_test_double_energy_imbalanced["r2_score"].iloc[i])
    
mse_str_array_double_energy_imbalanced = np.zeros((1, all_test_double_energy_imbalanced.shape[0]), dtype=object)
for i in range(all_test_double_energy_imbalanced.shape[0]):
    mse_str_array_double_energy_imbalanced[0, i] = r"$\underset{{\num{{+- {:.3e} }}  }}{{\num{{ {:.3g} }} }}$".format(
        all_std_double_energy_imbalanced["mse"].iloc[i], all_test_double_energy_imbalanced["mse"].iloc[i])
mse_df_double_energy_imbalanced = pd.DataFrame(mse_str_array_double_energy_imbalanced, columns=rows)
        
r2_df_double_energy_imbalanced = pd.DataFrame(r2_str_array_double_energy_imbalanced, columns=rows)

section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_double_energy_pixelmod_r2.tex"
caption = """
Mean R2-scores for regression of energy values, on double events in simulated data with specific pixels
set to zero, using multiple models. Error estimates are the standard deviation in results from k-fold 
cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-double-energy-pixelmod-r2"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    r2_df_double_energy_imbalanced.to_latex(fp, escape=False, caption=caption, label=label, index=False)


# Combined tables

In [90]:
df_pos = pd.concat(
    [
        r2_df_single_pos.rename({0:"Single (a)"}),
        r2_df_single_pos_pmod.rename({0:"Single (b)"}),
        r2_df_single_pos_imbalanced.rename({0:"Single (c)"}),
        r2_df_double_pos.rename({0:"Double (a)"}),
        r2_df_double_pos_pmod.rename({0:"Double (b)"}),
        r2_df_double_pos_imbalanced.rename({0:"Double (c)"}),
    ],
)
#display(df_pos)

df_energy = pd.concat(
    [
        r2_df_single_energy.rename({0:"Single (a)"}),
        r2_df_single_energy_pmod.rename({0:"Single (b)"}),
        r2_df_single_energy_imbalanced.rename({0:"Single (c)"}),
        r2_df_double_energy.rename({0:"Double (a)"}),
        r2_df_double_energy_pmod.rename({0:"Double (b)"}),
        r2_df_double_energy_imbalanced.rename({0:"Double (c)"}),
    ],
)
#display(df_energy)

In [91]:
df_pos_mse = pd.concat(
    [
        mse_df_single_pos.rename({0:"Single (a) [mm]"}),
        mse_df_single_pos_pmod.rename({0:"Single (b) [mm]"}),
        mse_df_single_pos_imbalanced.rename({0:"Single (c) [mm]"}),
        mse_df_double_pos.rename({0:"Double (a) [mm]"}),
        mse_df_double_pos_pmod.rename({0:"Double (b) [mm]"}),
        mse_df_double_pos_imbalanced.rename({0:"Double (c) [mm]"}),
    ],
)

df_energy_mse = pd.concat(
    [
        mse_df_single_energy.rename({0:"Single (a) [MeV]"}),
        mse_df_single_energy_pmod.rename({0:"Single (b) [MeV]"}),
        mse_df_single_energy_imbalanced.rename({0:"Single (c) [MeV]"}),
        mse_df_double_energy.rename({0:"Double (a) [MeV]"}),
        mse_df_double_energy_pmod.rename({0:"Double (b) [MeV]"}),
        mse_df_double_energy_imbalanced.rename({0:"Double (c) [MeV]"}),
    ],
)

In [92]:
# Save the results so we don't need to re-do ever ever never ever again
all_test_single_pos.to_pickle("all_test_single_pos.pkl")
all_test_single_pos_pmod.to_pickle("all_test_single_pos_pmod.pkl")
all_test_single_pos_imbalanced.to_pickle("all_test_single_pos_imbalanced.pkl")
all_test_double_pos.to_pickle("all_test_double_pos.pkl")
all_test_double_pos_pmod.to_pickle("all_test_double_pos_pmod.pkl")
all_test_double_pos_imbalanced.to_pickle("all_test_double_pos_imbalanced.pkl")
all_test_single_energy.to_pickle("all_test_single_energy.pkl")
all_test_single_energy_pmod.to_pickle("all_test_single_energy_pmod.pkl")
all_test_single_energy_imbalanced.to_pickle("all_test_single_energy_imbalanced.pkl")
all_test_double_energy.to_pickle("all_test_double_energy.pkl")
all_test_double_energy_pmod.to_pickle("all_test_double_energy_pmod.pkl")
all_test_double_energy_imbalanced.to_pickle("all_test_double_energy_imbalanced.pkl")
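
Twelve hand-typed filenames are easy to get wrong. One option is to generate the names from the three parts they are made of and save in a loop. The sketch below assumes the result tables have been collected in a dict keyed by variable name; `result_stems` and `save_results` are hypothetical helpers, not existing functions in `master_scripts`.

```python
from itertools import product

EVENT_TYPES = ("single", "double")
TARGETS = ("pos", "energy")
VARIANTS = ("", "_pmod", "_imbalanced")


def result_stems():
    """Return the twelve variable/file stems, e.g. 'all_test_double_energy_pmod'."""
    return [
        f"all_test_{event}_{target}{variant}"
        for event, target, variant in product(EVENT_TYPES, TARGETS, VARIANTS)
    ]


def save_results(frames):
    """Pickle every table in `frames` (stem -> DataFrame) as '<stem>.pkl'."""
    for stem in result_stems():
        frames[stem].to_pickle(stem + ".pkl")


stems = result_stems()
print(len(stems), "unique" if len(set(stems)) == len(stems) else "duplicates")
```

In the notebook this could be called as `save_results({"all_test_single_pos": all_test_single_pos, ...})`, so each file name always matches the table it holds.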


In [93]:
display(df_pos_mse)
display(df_energy_mse)

Unnamed: 0,Linear,Dense,CNN,Pretrained,Custom
Single (a) [mm],$\underset{\num{+- 6.427e-01 } }{\num{ 5.68 } }$,$\underset{\num{+- 3.382e-01 } }{\num{ 1.36 } }$,$\underset{\num{+- 3.843e-01 } }{\num{ 0.699 } }$,$\underset{\num{+- 5.990e+00 } }{\num{ 0.676 } }$,$\underset{\num{+- 1.477e-01 } }{\num{ 0.378 } }$
Single (b) [mm],$\underset{\num{+- 6.824e-01 } }{\num{ 5.95 } }$,$\underset{\num{+- 4.791e-01 } }{\num{ 1.7 } }$,$\underset{\num{+- 4.221e-01 } }{\num{ 1.81 } }$,$\underset{\num{+- 2.694e-01 } }{\num{ 0.707 } }$,$\underset{\num{+- 3.938e-01 } }{\num{ 0.904 } }$
Single (c) [mm],$\underset{\num{+- 6.824e-01 } }{\num{ 5.95 } }$,$\underset{\num{+- 4.820e-01 } }{\num{ 1.7 } }$,$\underset{\num{+- 4.164e-01 } }{\num{ 1.79 } }$,$\underset{\num{+- 2.817e-01 } }{\num{ 0.723 } }$,$\underset{\num{+- 5.147e-01 } }{\num{ 1.03 } }$
Double (a) [mm],$\underset{\num{+- 7.746e-01 } }{\num{ 10.1 } }$,$\underset{\num{+- 9.506e-01 } }{\num{ 9.37 } }$,$\underset{\num{+- 5.069e-01 } }{\num{ 9.25 } }$,$\underset{\num{+- 5.002e+00 } }{\num{ 10.7 } }$,$\underset{\num{+- 3.342e-01 } }{\num{ 9.05 } }$
Double (b) [mm],$\underset{\num{+- 3.748e-01 } }{\num{ 10.1 } }$,$\underset{\num{+- 7.344e-01 } }{\num{ 9.36 } }$,$\underset{\num{+- 5.014e-01 } }{\num{ 9.55 } }$,$\underset{\num{+- 4.999e+00 } }{\num{ 10.7 } }$,$\underset{\num{+- 2.491e-01 } }{\num{ 9.09 } }$
Double (c) [mm],$\underset{\num{+- 1.299e+00 } }{\num{ 10.2 } }$,$\underset{\num{+- 1.345e+00 } }{\num{ 9.7 } }$,$\underset{\num{+- 8.811e-01 } }{\num{ 9.49 } }$,$\underset{\num{+- 1.171e+01 } }{\num{ 17.6 } }$,$\underset{\num{+- 1.033e+00 } }{\num{ 9.18 } }$


Unnamed: 0,Linear,Dense,CNN,Pretrained,Custom
Single (a) [MeV],$\underset{\num{+- 5.276e-02 } }{\num{ 0.0756 } }$,$\underset{\num{+- 5.499e-02 } }{\num{ 0.0742 } }$,$\underset{\num{+- 5.841e-02 } }{\num{ 0.0725 } }$,$\underset{\num{+- 5.603e-02 } }{\num{ 0.0789 } }$,$\underset{\num{+- 5.001e-02 } }{\num{ 0.0683 } }$
Single (b) [MeV],$\underset{\num{+- 4.519e-02 } }{\num{ 0.139 } }$,$\underset{\num{+- 4.296e-02 } }{\num{ 0.146 } }$,$\underset{\num{+- 4.624e-02 } }{\num{ 0.209 } }$,$\underset{\num{+- 4.021e-02 } }{\num{ 0.135 } }$,$\underset{\num{+- 5.129e-02 } }{\num{ 0.144 } }$
Single (c) [MeV],$\underset{\num{+- 4.519e-02 } }{\num{ 0.139 } }$,$\underset{\num{+- 4.296e-02 } }{\num{ 0.146 } }$,$\underset{\num{+- 4.576e-02 } }{\num{ 0.218 } }$,$\underset{\num{+- 4.029e-02 } }{\num{ 0.135 } }$,$\underset{\num{+- 4.955e-02 } }{\num{ 0.152 } }$
Double (a) [MeV],$\underset{\num{+- 5.286e-02 } }{\num{ 0.206 } }$,$\underset{\num{+- 5.074e-02 } }{\num{ 0.206 } }$,$\underset{\num{+- 5.862e-02 } }{\num{ 0.207 } }$,$\underset{\num{+- 5.116e-02 } }{\num{ 0.207 } }$,$\underset{\num{+- 5.496e-02 } }{\num{ 0.206 } }$
Double (b) [MeV],$\underset{\num{+- 1.613e-02 } }{\num{ 0.207 } }$,$\underset{\num{+- 1.421e-02 } }{\num{ 0.207 } }$,$\underset{\num{+- 2.393e-02 } }{\num{ 0.209 } }$,$\underset{\num{+- 1.992e-02 } }{\num{ 0.207 } }$,$\underset{\num{+- 1.684e-02 } }{\num{ 0.212 } }$
Double (c) [MeV],$\underset{\num{+- 6.176e-02 } }{\num{ 0.217 } }$,$\underset{\num{+- 6.157e-02 } }{\num{ 0.22 } }$,$\underset{\num{+- 6.134e-02 } }{\num{ 0.215 } }$,$\underset{\num{+- 5.645e-02 } }{\num{ 0.221 } }$,$\underset{\num{+- 6.307e-02 } }{\num{ 0.223 } }$


In [94]:
# Output position mse df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_positions_mse.tex"
caption = """
Test set Root Mean Squared Error (RMSE) for regression of positions of origin on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-positions-mse"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    df_pos_mse.to_latex(fp, escape=False, caption=caption, label=label, index=True)

In [98]:
# Output energy mse df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_energies_mse.tex"
caption = """
Test set Root Mean Squared Error (RMSE) for regression of energies on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-energies-mse"
with open(fname, "w") as fp:
    pd.set_option('display.max_colwidth', None)  # None disables truncation (pandas >= 1.0)
    df_energy_mse.to_latex(fp, escape=False, caption=caption, label=label, index=True)

In [96]:
# Output position df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_positions_r2.tex"
caption = """
Test set R2-scores for regression of positions of origin on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-positions-r2"
#with open(fname, "w") as fp:
    #pd.set_option('display.max_colwidth', -1)
    #df_pos.to_latex(fp, escape=False, caption=caption, label=label, index=True)

In [97]:
# Output energy df
section_path = "chapters/results/figures/"
fname = THESIS_PATH + section_path + "regression_simulated_all_energies_r2.tex"
caption = """
Test set R2-scores for regression of energies on simulated data, with models trained on data with: 
a) no modifications, b) specific pixels set to zero to mimic experimental data, and c) imbalanced dataset
in addition to modifications in b) to further mimic experimental data. Error estimates are the standard deviation 
in results from validation data in k-fold cross-validation with $K=5$ folds.
"""
label = "tab:regression-simulated-all-energies-r2"
#with open(fname, "w") as fp:
    #pd.set_option('display.max_colwidth', -1)
    #df_energy.to_latex(fp, escape=False, caption=caption, label=label, index=True)
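
Each output cell above types the file name and the LaTeX label separately, although the label is the file name with underscores replaced by hyphens and a `tab:` prefix. A minimal sketch that derives both from one slug, so they cannot drift apart, is given below. `table_target` is a hypothetical helper, and the two path constants repeat the values used earlier in this notebook.

```python
THESIS_PATH = "../../../master_thesis/"
SECTION_PATH = "chapters/results/figures/"


def table_target(slug, thesis_path=THESIS_PATH, section_path=SECTION_PATH):
    """Return (fname, label) for a table, both derived from the same slug."""
    fname = thesis_path + section_path + slug + ".tex"
    label = "tab:" + slug.replace("_", "-")
    return fname, label


fname, label = table_target("regression_simulated_all_energies_mse")
print(fname)
print(label)
```

The returned pair can be passed to the existing `open(fname, "w")` and `to_latex(..., label=label)` calls unchanged.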