In [11]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import nibabel as nib
import glob
import os
from pyhere import here
import sys
import statsmodels.api as sm

from nilearn.glm.first_level import FirstLevelModel
from nilearn.plotting import plot_stat_map, view_img, plot_glass_brain
from nilearn.image import concat_imgs, smooth_img, mean_img, math_img, resample_to_img
from nilearn.glm.second_level import SecondLevelModel
from nilearn.glm import threshold_stats_img
from nilearn.glm.second_level import non_parametric_inference

In [8]:
sys.path.append(str(here('01_level1')))
from level1_utils import make_level1_design_matrix, make_contrasts
from utils import get_from_sidecar, get_model_regs

## How can l2 z-maps look very constrained while l3 `randomise` shows massive swaths of activation?

- How can l3 t values be larger than the largest t values in the individual l2 mean maps?
    - L2 mean maps are generated by `nilearn.glm.second_level.SecondLevelModel.compute_contrast` with argument `output_type="z_score"`
    - L3 t values are computed using FSL's `randomise`  
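
As a point of reference for the question above, here is a minimal sketch (simulated numbers, not the study data) of a one-sample t statistic at a single voxel. The names `subject_values` and `group_t` are hypothetical. It assumes the group test behaves like an ordinary one-sample t-test, in which case the group t depends on how consistent the inputs are across subjects, so it is not bounded by the largest input value.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 20

# Hypothetical per-subject values at one voxel: modest in size but consistent.
subject_values = rng.normal(loc=2.0, scale=0.5, size=n_subjects)

# One-sample t statistic: mean divided by its standard error.
mean = subject_values.mean()
sem = subject_values.std(ddof=1) / np.sqrt(n_subjects)
group_t = mean / sem

print(f"largest subject value: {subject_values.max():.2f}")
print(f"group t statistic:     {group_t:.2f}")
```

If this assumption holds, a group t that exceeds every input value is not by itself evidence of a bug, although it does not rule out a mismatch in the input statistic.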

Previous hypotheses:

- Is it because `randomise` uses TFCE-enhanced statistics instead of raw t stats?  
    - No. When I ran it voxelwise with `tfce=False` and checked the min and max values in `randomise_results.outputs.tstat_files`, they were the same as the outputs using TFCE.  

- Is it because of smoothing?
    - No. Smoothing the l2 images only made the values smaller.

Current hypothesis:

- Statistic mismatch between levels of analysis.
    - FSL's `randomise` expects `cope` images. These should be beta weights (contrasts of parameter estimates). What `output_type` do these correspond to in `nilearn` terminology? --> `output_type = 'effect_size'`. Note that `'stat'` is the test statistic, not the beta weight.
    - What is the correct input type for group analyses in `nilearn`? --> Can be either contrast maps (i.e. `output_type = "effect_size"`) or `FirstLevelModel` objects.


## `compute_contrast` output types

Based on the [`Contrast` class](https://github.com/nilearn/nilearn/blob/2fd6665664f27dfbf1f10aeeba660adc9b560361/nilearn/glm/contrasts.py#L143) in `nilearn.glm.contrasts`

- `effect_size`: parameter estimate from the GLM defined as [`effect = np.dot(matrix, self.theta)`](https://github.com/nilearn/nilearn/blob/2fd6665664f27dfbf1f10aeeba660adc9b560361/nilearn/glm/model.py#L213)
- `effect_variance`: variance of the parameter estimate from the GLM. The model code computes its square root (the standard deviation) as [`sd = np.sqrt(self.vcov(matrix=matrix, dispersion=dispersion))`](https://github.com/nilearn/nilearn/blob/2fd6665664f27dfbf1f10aeeba660adc9b560361/nilearn/glm/model.py#L213)
- `stat`: decision statistic (a t-value for t contrasts) [`stat = (self.effect - baseline) / np.sqrt(self.variance)`](https://github.com/nilearn/nilearn/blob/2fd6665664f27dfbf1f10aeeba660adc9b560361/nilearn/glm/contrasts.py#L223) comparing the parameter estimate to the null hypothesis (i.e. is it different from 0?)
- `p_value`: p value for the decision statistic [`p_values = scipy.stats.t.sf(self.stat_, self.dof)`](https://github.com/nilearn/nilearn/blob/2fd6665664f27dfbf1f10aeeba660adc9b560361/nilearn/glm/contrasts.py#L275)
- `z_score`: z scores for the given p values, i.e. the standard normal quantiles with the same tail probability. Very similar to `stat` when the degrees of freedom are large.
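
To make the relationships between these output types concrete, here is a minimal sketch on a toy one-voxel regression using plain `numpy` and `scipy` rather than `nilearn` itself. The data, the design matrix `X`, and the contrast vector `c` are made up for illustration, and the z conversion via `scipy.stats.norm.isf` is an assumption about how the p value maps to a z score, not a copy of the `nilearn` code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50

# Toy design: intercept plus one regressor (hypothetical data).
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof  # residual variance estimate

c = np.array([0.0, 1.0])  # contrast selecting the second regressor

effect_size = c @ beta                                        # analogous to a cope
effect_variance = sigma2 * (c @ np.linalg.inv(X.T @ X) @ c)   # analogous to a varcope
stat = effect_size / np.sqrt(effect_variance)                 # t statistic
p_value = stats.t.sf(stat, dof)                               # one-sided p value
z_score = stats.norm.isf(p_value)                             # z with the same tail probability

print(effect_size, effect_variance, stat, p_value, z_score)
```

In this sketch only `effect_size` is in the units of the data. `stat` and `z_score` are already normalized by the variance, which is why they are not interchangeable as inputs to a higher-level model.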