## TRT Experiments:

Each experiment summarized below pairs a hypothesis with a table describing the dataset collection(s) used to test it. In each column, 'All' indicates that the comparison is made across the specified variable, '1' indicates that the test is repeated multiple times, each time using a single value of that variable, and 'Ref' indicates that only the single reference execution is used. 'Indep.' marks variables for which tests are performed independently for each value, and 'Indep. & Compare' marks variables for which, following independent evaluation, a two-sample test compares the values against one another. To guard against false-positive inflation due to multiple comparisons, the mean and standard deviation of the test statistic, along with the p-value, will be reported for each test.

In total, following this structure, 9 tests will be performed for each of two pipelines and two perturbation modes, resulting in 9 × 2 × 2 = 36 one-sample tests. For each of these 9 tests, 2 two-sample tests will be performed, comparing the pipelines within each perturbation mode, resulting in an additional 9 × 2 = 18 tests and bringing the total number of tests to 54.
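
The test count above can be checked by enumerating the combinations directly. The sketch below is illustrative only: the experiment labels come from the tables in this section, while the pipeline names (`det`, `prob`) and perturbation-mode names (`Pipeline`, `Inputs`) are assumed to match the values used later in the notebook.

```python
from itertools import product

# Experiment labels as listed in the hypothesis tables
experiments = ["1.1", "1.2", "1.3", "1.4", "1.5", "2.6", "2.7", "2.8", "3.9"]
# Assumed names for the two pipelines and the two perturbation modes
pipelines = ["det", "prob"]
perturbations = ["Pipeline", "Inputs"]

# One-sample tests: every experiment, for every pipeline and perturbation mode
one_sample = list(product(experiments, pipelines, perturbations))
# Two-sample tests: pipelines compared once per experiment and perturbation mode
two_sample = list(product(experiments, perturbations))

n_one_sample = len(one_sample)
n_two_sample = len(two_sample)
n_total = n_one_sample + n_two_sample

print(n_one_sample, n_two_sample, n_total)
```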

### Hypothesis 1: Individual Variation

$H_0$: Individuals are not more different from one another than they are from themselves<br>
$H_A$: Individuals are distinct across {sessions, directions, simulations}<br>
Expected result: Reject $H_0$ in favor of $H_A$

Collections:

|  #  | Participants | Sessions | Directions | Simulations | Pipelines        | Perturbation     |
|:---:|:-------------|:---------|:-----------|:------------|:-----------------|:-----------------|
| 1.1 | All          | All      | 1          | All         | Indep. & Compare | Indep.           |
| 1.2 | All          | All      | 1          | Ref         | Indep. & Compare | Indep.           |
| 1.3 | All          | 1        | All        | All         | Indep. & Compare | Indep.           |
| 1.4 | All          | 1        | All        | Ref         | Indep. & Compare | Indep.           |
| 1.5 | All          | 1        | 1          | All         | Indep. & Compare | Indep.           |
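
To make Hypothesis 1 concrete, the sketch below estimates how often two observations of the same individual are closer to each other than to an observation of a different individual. It assumes that the `discrim` statistic reported later follows this common definition of discriminability. The notebook itself does not spell out the estimator, so the function name, the Euclidean distance, and the toy data are all illustrative assumptions.

```python
from math import dist

def discriminability(observations, labels):
    """Fraction of comparisons in which a within-individual distance is
    strictly smaller than a between-individual distance measured from
    the same observation (hypothetical helper, not the notebook's code)."""
    n = len(observations)
    hits = 0
    total = 0
    for i in range(n):
        for j in range(n):
            # Keep only pairs of distinct observations of the same individual
            if i == j or labels[i] != labels[j]:
                continue
            d_within = dist(observations[i], observations[j])
            for k in range(n):
                # Compare against observations of every other individual
                if labels[k] == labels[i]:
                    continue
                total += 1
                if d_within < dist(observations[i], observations[k]):
                    hits += 1
    return hits / total if total else float("nan")

# Toy data: two individuals ('a' and 'b') with two observations each
separated = discriminability([(0, 0), (0, 1), (10, 10), (10, 11)],
                             ["a", "a", "b", "b"])
overlapping = discriminability([(0,), (10,), (1,), (11,)],
                               ["a", "a", "b", "b"])
print(separated, overlapping)
```

A value near 1 means individuals are well separated, which is what rejecting $H_0$ would correspond to under this assumed definition.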


### Hypothesis 2: Session Variation

$H_0$: Sessions are not different from one another<br>
$H_A$: Sessions are distinct from one another across {directions, simulations}<br>
Expected result: Fail to reject $H_0$

Collections:

|  #  | Participants | Sessions | Directions | Simulations | Pipelines        | Perturbation     |
|:---:|:-------------|:---------|:-----------|:------------|:-----------------|:-----------------|
| 2.6 | 1            | All      | All        | All         | Indep. & Compare | Indep.           |
| 2.7 | 1            | All      | All        | Ref         | Indep. & Compare | Indep.           |
| 2.8 | 1            | All      | 1          | All         | Indep. & Compare | Indep.           |


### Hypothesis 3: Direction Variation

$H_0$: Odd- and Even-direction sets are not different from one another<br>
$H_A$: Direction subsets are distinct from one another<br>
Expected result: Fail to reject $H_0$

Collections:

|  #  | Participants | Sessions | Directions | Simulations | Pipelines        | Perturbation     |
|:---:|:-------------|:---------|:-----------|:------------|:-----------------|:-----------------|
| 3.9 | 1            | 1        | All        | All         | Indep. & Compare | Indep.           |


In [1]:
import pandas as pd
import numpy as np
import os.path as op
import os

# import plotly.offline as off
# import plotly.figure_factory as ff
# import plotly.graph_objects as go
# import plotly.express as px
# from plotly.subplots import make_subplots
# from plotly.offline import init_notebook_mode, iplot

# init_notebook_mode(connected=False)

In [2]:
bp = '/data/RocklandSample/derivatives/paper1/'
exp = 'figures'
bpp = op.join(bp, exp)
# Create the output directory for figures if it does not already exist
os.makedirs(bpp, exist_ok=True)

# Load the discriminability results
df = pd.read_csv(op.join(bp, 'connectomes_mp_discrim.csv'))

In [3]:
df_diff = df[df['pipeline'] == 'det-prob']
df_nodiff = df[df['pipeline'] != 'det-prob']
df = df_nodiff

In [5]:
# Aggregate discriminability over every instrumentation/test/pipeline combination
mn_list = []
for inst in df['instrumentation'].unique():
    for test in df['test'].unique():
        for pipe in df['pipeline'].unique():
            tmpdf = df.query('pipeline == "{0}" and '.format(pipe) +
                             'test == "{0}" and '.format(test) +
                             'instrumentation == "{0}"'.format(inst))
            # Take the hypothesis label from the first matching row
            hyp = tmpdf['hypothesis'].values[0]
            mn_list += [
                {"hypothesis": hyp,
                 "test": test,
                 "pipeline": pipe,
                 "instrumentation": inst,
                 "discrim_mean": np.mean(tmpdf['discrim'].values),
                 # Standard error of the mean: population std / sqrt(n)
                 "discrim_stde": np.std(tmpdf['discrim'].values)/np.sqrt(len(tmpdf)),
                 "p_mean": np.mean(tmpdf['p-value'])}
            ]

df_mn = pd.DataFrame(mn_list)
# df_mn
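
The nested loop above can also be expressed as a single `groupby`. The sketch below shows one possible equivalent on a small, made-up frame. The toy values are invented for illustration and are not results from the study; the column names mirror those used in the notebook.

```python
import numpy as np
import pandas as pd

# Made-up rows with the same column names as the notebook's DataFrame
toy = pd.DataFrame({
    "hypothesis": [1, 1, 1, 1],
    "test": ["1.1", "1.1", "1.1", "1.1"],
    "pipeline": ["det", "det", "prob", "prob"],
    "instrumentation": ["Inputs"] * 4,
    "discrim": [0.8, 0.6, 0.9, 0.7],
    "p-value": [0.01, 0.03, 0.02, 0.04],
})

keys = ["instrumentation", "test", "pipeline"]
summary = (
    toy.groupby(keys, sort=False)
       .agg(hypothesis=("hypothesis", "first"),
            discrim_mean=("discrim", "mean"),
            # ddof=0 matches np.std's default, as used in the loop
            discrim_stde=("discrim", lambda s: s.std(ddof=0) / np.sqrt(len(s))),
            p_mean=("p-value", "mean"))
       .reset_index()
)
print(summary)
```

Unlike the loop, `groupby` only visits combinations that actually occur in the data, so it does not fail on an empty combination. Row order may differ from the loop's, so sort explicitly if later cells rely on it.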

In [87]:
# fig2 = px.bar(df_mn, x='test', y='discrim_mean', error_y='discrim_stde', color='pipeline',
#               facet_col='instrumentation', facet_row='hypothesis', range_y=[0, 1.1],
#               barmode='group')
# 
# fig2.update_yaxes(range=[0, 1.1])
# fig2.update_xaxes(matches=None)

In [6]:
cols = [('', 'Participants'), ('', 'Sessions'), ('', 'Directions'), ('', 'Simulations'),
        ('Pipeline', 'Discrim. (Det)'), ('1', 'Discrim. (Prob)'),
        ('Inputs', 'Discrim. (Det)'), ('2', 'Discrim. (Prob)'),]
cols = pd.MultiIndex.from_tuples(cols, names=['', ''])

idx = [1.1, 1.2, 1.3, 1.4, 1.5, 2.6, 2.7, 2.8, 3.9]
idx = pd.Index(idx, name="Experiment")

vals = [['All']*5 + [1]*4,
        ['All']*2 + [1]*3 + ['All']*3 + [1],
        [1]*2 + ['All']*2 + [1] + ['All']*2 + [1, 'All'],
        ['All', 'Ref', 'All', 'Ref', 'All', 'All', 'Ref', 'All', 'All']] +\
        [[None]*9,
         [None]*9]*2

tmp = pd.DataFrame(list(map(list, zip(*vals))), columns=cols, index=idx)
# tmp

In [7]:
# Raw string so that "\p" is not treated as an (invalid) escape sequence
frmtr = r"$ {0:.2f} \pm {1:.2f} $"
combos_i = [('Pipeline', 'det'), ('Pipeline', 'prob'), ('Inputs', 'det'), ('Inputs', 'prob')]
combos_o = [('Pipeline', 'Discrim. (Det)'), ('1', 'Discrim. (Prob)'), ('Inputs', 'Discrim. (Det)'), ('2', 'Discrim. (Prob)')]
# Fill each output column with "mean ± standard error" for the matching rows;
# iterate with zip so the `idx` index defined earlier is not overwritten
for ci, co in zip(combos_i, combos_o):
    dft = df_mn.query('instrumentation == "{0}" and pipeline == "{1}"'.format(*ci))
    tmp[co] = [frmtr.format(*t) for t in zip(dft['discrim_mean'], dft['discrim_stde'])]
tmp

| Experiment | Participants | Sessions | Directions | Simulations | Pipeline: Discrim. (Det) | Pipeline: Discrim. (Prob) | Inputs: Discrim. (Det) | Inputs: Discrim. (Prob) |
|:----------:|:-------------|:---------|:-----------|:------------|:-------------------------|:--------------------------|:-----------------------|:------------------------|
| 1.1        | All          | All      | 1          | All         | $ 0.82 \pm 0.00 $        | $ 0.82 \pm 0.00 $         | $ 0.77 \pm 0.00 $      | $ 0.75 \pm 0.00 $       |
| 1.2        | All          | All      | 1          | Ref         | $ 0.64 \pm 0.00 $        | $ 0.65 \pm 0.00 $         | $ 0.64 \pm 0.00 $      | $ 0.65 \pm 0.00 $       |
| 1.3        | All          | 1        | All        | All         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.93 \pm 0.02 $      | $ 0.90 \pm 0.01 $       |
| 1.4        | All          | 1        | All        | Ref         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 1.00 \pm 0.00 $      | $ 1.00 \pm 0.00 $       |
| 1.5        | All          | 1        | 1          | All         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.94 \pm 0.01 $      | $ 0.90 \pm 0.01 $       |
| 2.6        | 1            | All      | All        | All         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.88 \pm 0.02 $      | $ 0.85 \pm 0.02 $       |
| 2.7        | 1            | All      | All        | Ref         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.88 \pm 0.02 $      | $ 0.85 \pm 0.02 $       |
| 2.8        | 1            | All      | 1          | All         | $ 1.00 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.89 \pm 0.02 $      | $ 0.84 \pm 0.02 $       |
| 3.9        | 1            | 1        | All        | All         | $ 0.99 \pm 0.00 $        | $ 1.00 \pm 0.00 $         | $ 0.71 \pm 0.01 $      | $ 0.61 \pm 0.01 $       |


In [8]:
tmp.to_latex()

'\\begin{tabular}{lllllllll}\n\\toprule\n{} &           Pipeline &                  1 &             Inputs &                  2 \\\\\n{} & Participants & Sessions & Directions & Simulations &     Discrim. (Det) &    Discrim. (Prob) &     Discrim. (Det) &    Discrim. (Prob) \\\\\nExperiment &              &          &            &             &                    &                    &                    &                    \\\\\n\\midrule\n1.1        &          All &      All &          1 &         All &  \\$ 0.82 \\textbackslash pm 0.00 \\$ &  \\$ 0.82 \\textbackslash pm 0.00 \\$ &  \\$ 0.77 \\textbackslash pm 0.00 \\$ &  \\$ 0.75 \\textbackslash pm 0.00 \\$ \\\\\n1.2        &          All &      All &          1 &         Ref &  \\$ 0.64 \\textbackslash pm 0.00 \\$ &  \\$ 0.65 \\textbackslash pm 0.00 \\$ &  \\$ 0.64 \\textbackslash pm 0.00 \\$ &  \\$ 0.65 \\textbackslash pm 0.00 \\$ \\\\\n1.3        &          All &        1 &        All &         All &  \\$ 1.00 \\textbackslash pm 