# Generate FWE Corrected R-Squared Map 
- Uses maximum statistic correction to control the family-wise error (FWE) rate.
- Notes on controlling for covariates in a regression:
    - Adding covariates to a regression will 'control' for them, but will almost always increase the R-squared.
    - To 'remove' a nuisance covariate instead, regress it OUT of the covariate of interest.
        - Your regressor then becomes the residuals from the regression cov_1 ~ nuisance_cov1.
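
To make the residual idea concrete, here is a minimal sketch of regressing one nuisance covariate out of a covariate of interest with a simple OLS fit. The helper `residualize` and the toy values are hypothetical and are not part of `calvin_utils`.

```python
from statistics import mean

def residualize(y, x):
    """Return the residuals of the simple OLS fit y ~ 1 + x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical toy values: cov_1 is the covariate of interest,
# nuisance_cov1 is the covariate to remove from it.
nuisance_cov1 = [1.0, 2.0, 3.0, 4.0, 5.0]
cov_1 = [2.1, 3.9, 6.2, 7.8, 10.1]

cov_1_residual = residualize(cov_1, nuisance_cov1)

# With an intercept in the fit, the residuals sum to zero and have zero
# covariance with the nuisance covariate.
nuisance_mean = mean(nuisance_cov1)
cross = sum(r * (x - nuisance_mean) for r, x in zip(cov_1_residual, nuisance_cov1))
```

The residuals carry only the part of `cov_1` that the nuisance covariate cannot explain linearly, which is why they can stand in as the regressor.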

Import Niftis
- The subject IDs here are EXPECTED to be IDENTICAL to the subject IDs used as column names in the covariate DF below.
- Column labels are subject IDs; rows are voxels.
- The imported DataFrame is expected to ultimately have the form:

|        |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 |  10 | ... |  40 |  41 |  42 |  43 |  45 |  46 |  47 |  48 |  49 |  50 |
|----------|------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|------------|-----|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Voxel 1     | 3          | 4         | 7         | 2         | 2         | 2         | 9         | 4         | 7         | 5          | ... | 5           | 2           | 7           | 7           | 3           | 8           | 8           | 1           | 1           | 3           |
| ...      | ...         | ...        | ...         | ...         | ...         | ...         | ...         | ...         | ...         | ...          | ... | ...           | ...           | ...           | ...           | ...           | ...           | ...           | ...           | ...           | ...           |
| Voxel N     | 2          | 1         | 0         | 1         | 3         | 4         | 9         | 5         | 8         | 6          | ... | 6           | 3           | 8           | 8           | 4           | 9           | 9           | 2           | 2           | 4           |

In [5]:
import_path ='/Volumes/Expansion/datasets/adni/neuroimaging/all_patients_atrophy_seeds'
file_target = '*/ses-01/unthresholded_tissue_segment_z_scores/sub-*_cerebrospinal_fluid.nii*'

In [6]:
from calvin_utils.file_utils.import_functions import GiiNiiFileImport
giinii = GiiNiiFileImport(import_path=import_path, file_column=None, file_pattern=file_target)
nimg_df = giinii.run()
nimg_df

Attempting to import from: /Volumes/Expansion/datasets/adni/neuroimaging/all_patients_atrophy_seeds/*/ses-01/unthresholded_tissue_segment_z_scores/sub-*_cerebrospinal_fluid.nii*
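
Because the import is driven by a glob pattern, it can help to count the matches before building the DataFrame. The sketch below uses only the standard library and a throwaway directory tree. The helper `find_matches` and the demo file names are hypothetical; the pattern mirrors `file_target`.

```python
import glob
import os
import tempfile

def find_matches(import_path, file_pattern):
    """Return the sorted list of files matching import_path/file_pattern."""
    return sorted(glob.glob(os.path.join(import_path, file_pattern)))

# Build a temporary tree that mimics the layout assumed by file_target.
# The subject IDs and file names are made up for illustration.
demo_root = tempfile.mkdtemp()
for sub in ("002_S_0295", "002_S_0413"):
    seg_dir = os.path.join(demo_root, sub, "ses-01", "unthresholded_tissue_segment_z_scores")
    os.makedirs(seg_dir)
    open(os.path.join(seg_dir, f"sub-{sub}_cerebrospinal_fluid.nii.gz"), "w").close()

demo_pattern = "*/ses-01/unthresholded_tissue_segment_z_scores/sub-*_cerebrospinal_fluid.nii*"
matched = find_matches(demo_root, demo_pattern)
print(f"{len(matched)} file(s) matched")

# A pattern that matches nothing returns an empty list rather than raising
unmatched = find_matches(demo_root, "*/ses-02/*.nii*")
```

If the count is zero for the real `import_path` (for example, when an external volume is not mounted), every downstream step will operate on an empty DataFrame.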


Fix names

In [7]:
pre = 'sub-'
post = '_cerebrospinal'

In [8]:
nimg_df = GiiNiiFileImport.splice_colnames(nimg_df, pre, post)
nimg_df
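
Judging from `pre` and `post`, the renaming step is meant to reduce each column name to the subject ID that sits between them. The following plain-Python sketch illustrates that idea only. It is not the `splice_colnames` implementation, and `splice_name` and the example names are hypothetical.

```python
def splice_name(name, pre, post):
    """Keep the text after the first `pre` and before the next `post`."""
    start = name.find(pre)
    start = start + len(pre) if start != -1 else 0
    end = name.find(post, start)
    end = end if end != -1 else len(name)
    return name[start:end]

# Hypothetical column names, modeled on the file pattern used above
raw_names = [
    "sub-002_S_0295_cerebrospinal_fluid.nii.gz",
    "sub-002_S_0413_cerebrospinal_fluid.nii.gz",
]
clean_names = [splice_name(n, "sub-", "_cerebrospinal") for n in raw_names]
print(clean_names)
```

The cleaned names are what must match the subject IDs in the covariate table.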

Import Covariates

**The CSV is expected to be in this format**
- The sub column is the ID column. Its contents MUST match the subject IDs derived from the neuroimaging file names above.
```
+-----+----------------------------+--------------+--------------+--------------+
| sub | Nifti_File_Path            | Covariate_1  | Covariate_2  | Covariate_3  |
+-----+----------------------------+--------------+--------------+--------------+
| 1   | /path/to/file1.nii.gz      | 0.5          | 1.2          | 3.4          |
| 2   | /path/to/file2.nii.gz      | 0.7          | 1.4          | 3.1          |
| 3   | /path/to/file3.nii.gz      | 0.6          | 1.5          | 3.5          |
| 4   | /path/to/file4.nii.gz      | 0.9          | 1.1          | 3.2          |
| ... | ...                        | ...          | ...          | ...          |
+-----+----------------------------+--------------+--------------+--------------+
```

In [9]:
# Specify the path to your CSV file containing NIFTI paths
input_csv_path = '/Users/cu135/Partners HealthCare Dropbox/Calvin Howard/studies/atrophy_seeds_2023/metadata/updated_with_all_adni/master_dx_updated_fix_composite.csv'
sheet = None  # 'master_list_proper_subjects'

In [10]:
# Specify where you want to save your results to
out_dir = '/Users/cu135/Partners HealthCare Dropbox/Calvin Howard/studies/arc/analyses/age_corr_to_atrophy/spearman'

In [11]:
from calvin_utils.permutation_analysis_utils.statsmodels_palm import CalvinStatsmodelsPalm
# Instantiate the CalvinStatsmodelsPalm class
cal_palm = CalvinStatsmodelsPalm(input_csv_path=input_csv_path, output_dir=out_dir, sheet=sheet)
# Read the CSV into a DataFrame and display it
data_df = cal_palm.read_and_display_data()
data_df

Unnamed: 0,Unnamed__0,subid,Age,Male,Female,CSF_Cerebellum,CSF_Subcortex,CSF_MTL,CSF_Occipital,CSF_Frontal,...,DIAGNOSIS_M12,DIAGNOSIS_BL_Str,DIAGNOSIS_CURRENT_Str,DIAGNOSIS_M12_Str,CSFGM_Cerebellum,CSFGM_MTL,CSFGM_Occipital,CSFGM_Frontal,CSFGM_Parietal,CSFGM_Temporal
0,0,002_S_0295,84.898630,1.0,0.0,-26319.38152,373.297842,-746.688684,-24081.964640,-32607.592110,...,1.0,Normal,Normal,Normal,-42080.47385,-985.079574,-49948.08397,-112828.36320,-65321.06433,-81695.87251
1,1,002_S_0413,76.397260,0.0,1.0,-13670.88871,5009.504947,-449.838607,-8185.340726,-13903.977170,...,1.0,Normal,Normal,Normal,-31877.73761,-702.013806,-32065.48546,-86785.29415,-49161.42864,-58706.57030
2,2,002_S_0559,79.410959,1.0,0.0,-38702.13792,152.392985,-754.706246,-25738.506400,-34449.367040,...,1.0,Normal,Normal,Normal,-59414.86617,-904.053254,-53596.47719,-105570.10190,-81075.10079,-72750.12881
3,3,002_S_0619,77.512329,1.0,0.0,-35472.21632,-23876.106190,-1947.061673,-51088.102680,-81475.108430,...,3.0,Alzheimer,Alzheimer,Alzheimer,-53709.90879,-2423.897611,-73659.93022,-154724.05630,-108539.66600,-121871.63710
4,4,002_S_0685,89.698630,0.0,1.0,-31092.93296,-22643.911490,-1014.786011,-32806.816640,-18280.083970,...,1.0,Normal,Normal,Normal,-51590.80915,-1294.271075,-62190.93704,-99100.38722,-102677.31680,-86601.74343
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1383,1383,941_S_4365,80.410959,1.0,0.0,-38529.96216,-15404.929680,-662.636580,-45763.001790,-51966.135180,...,1.0,Normal,Alzheimer,Normal,-57889.69365,-835.677343,-72888.69964,-129545.70420,-96902.61918,-92846.76271
1384,1384,941_S_4376,76.594521,0.0,1.0,-17949.11453,-104.528538,-236.251734,-30861.230910,-12256.220060,...,1.0,Normal,Normal,Normal,-35263.89451,-633.850720,-61941.53743,-95745.37551,-78969.66905,-67336.23936
1385,1385,941_S_4377,69.421918,0.0,1.0,-15992.36193,8601.241440,-63.774856,-17403.780770,-5480.603967,...,2.0,MCI,MCI,MCI,-37613.22065,-382.094253,-48520.90317,-83404.44082,-60909.52233,-52907.83773
1386,1386,941_S_4420,81.493151,1.0,0.0,-16826.37025,-551.877244,-240.304833,-19856.342990,-43372.373360,...,2.0,MCI,MCI,MCI,-34028.22866,-420.608521,-45134.76002,-119449.11290,-72759.32267,-77463.29364


**Preprocess Your Data**

**Handle NANs**
- Set drop_nans=True if you would like to remove NaNs from the data
- Provide a column name or a list of column names to remove NaNs from
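
The internals of `drop_nans_from_columns` are not shown in this notebook. As a rough equivalent in plain pandas, dropping rows with NaNs in selected columns only might look like the sketch below. The toy table is invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy covariate table; values are invented for illustration
toy_df = pd.DataFrame({
    "subid": ["s1", "s2", "s3"],
    "Age": [70.5, np.nan, 81.2],
    "Q4": [np.nan, 2.0, 3.0],
})

# Drop rows with a NaN in 'Age' only; the NaN in 'Q4' for s1 is tolerated
cleaned = toy_df.dropna(subset=["Age"]).reset_index(drop=True)
print(cleaned)
```

Restricting the drop to the columns you will actually model avoids discarding subjects because of missing values in unrelated columns.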

In [12]:
data_df.columns

Index(['Unnamed__0', 'subid', 'Age', 'Male', 'Female', 'CSF_Cerebellum',
       'CSF_Subcortex', 'CSF_MTL', 'CSF_Occipital', 'CSF_Frontal',
       'CSF_Parietal', 'CSF_Temporal', 'GM_Cerebellum', 'GM_Subcortex',
       'GM_MTL', 'GM_Occipital', 'GM_Frontal', 'GM_Parietal', 'GM_Temporal',
       'WM_Cerebellum', 'WM_Subcortex', 'WM_MTL', 'WM_Occipital', 'WM_Frontal',
       'WM_Parietal', 'WM_Temporal', 'Visual_Frontal', 'Visual_Parietal',
       'Visual_Occipital', 'Visual_Temporal', 'Visual_Cerebellum',
       'Visual_Subcortex', 'Visual_MTL', 'Diagnosis', 'Sex', 'Cohort',
       'CTh_Cerebellum', 'CTh_MTL', 'CTh_Occipital', 'CTh_Frontal',
       'CTh_Parietal', 'CTh_Temporal', 'CTh_Subcortex', 'Q4', 'TOTAL11',
       'TOTALMOD', 'DX_BASELINE', 'DX_M12', 'DIAGNOSIS_BL',
       'DIAGNOSIS_CURRENT', 'DIAGNOSIS_M12', 'DIAGNOSIS_BL_Str',
       'DIAGNOSIS_CURRENT_Str', 'DIAGNOSIS_M12_Str', 'CSFGM_Cerebellum',
       'CSFGM_MTL', 'CSFGM_Occipital', 'CSFGM_Frontal', 'CSFGM_Parietal',
      

In [13]:
drop_list = ['Age']

In [14]:
data_df = cal_palm.drop_nans_from_columns(columns_to_drop_from=drop_list)
display(data_df)

Unnamed: 0,Unnamed__0,subid,Age,Male,Female,CSF_Cerebellum,CSF_Subcortex,CSF_MTL,CSF_Occipital,CSF_Frontal,...,DIAGNOSIS_M12,DIAGNOSIS_BL_Str,DIAGNOSIS_CURRENT_Str,DIAGNOSIS_M12_Str,CSFGM_Cerebellum,CSFGM_MTL,CSFGM_Occipital,CSFGM_Frontal,CSFGM_Parietal,CSFGM_Temporal
0,0,002_S_0295,84.898630,1.0,0.0,-26319.38152,373.297842,-746.688684,-24081.964640,-32607.592110,...,1.0,Normal,Normal,Normal,-42080.47385,-985.079574,-49948.08397,-112828.36320,-65321.06433,-81695.87251
1,1,002_S_0413,76.397260,0.0,1.0,-13670.88871,5009.504947,-449.838607,-8185.340726,-13903.977170,...,1.0,Normal,Normal,Normal,-31877.73761,-702.013806,-32065.48546,-86785.29415,-49161.42864,-58706.57030
2,2,002_S_0559,79.410959,1.0,0.0,-38702.13792,152.392985,-754.706246,-25738.506400,-34449.367040,...,1.0,Normal,Normal,Normal,-59414.86617,-904.053254,-53596.47719,-105570.10190,-81075.10079,-72750.12881
3,3,002_S_0619,77.512329,1.0,0.0,-35472.21632,-23876.106190,-1947.061673,-51088.102680,-81475.108430,...,3.0,Alzheimer,Alzheimer,Alzheimer,-53709.90879,-2423.897611,-73659.93022,-154724.05630,-108539.66600,-121871.63710
4,4,002_S_0685,89.698630,0.0,1.0,-31092.93296,-22643.911490,-1014.786011,-32806.816640,-18280.083970,...,1.0,Normal,Normal,Normal,-51590.80915,-1294.271075,-62190.93704,-99100.38722,-102677.31680,-86601.74343
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1383,1383,941_S_4365,80.410959,1.0,0.0,-38529.96216,-15404.929680,-662.636580,-45763.001790,-51966.135180,...,1.0,Normal,Alzheimer,Normal,-57889.69365,-835.677343,-72888.69964,-129545.70420,-96902.61918,-92846.76271
1384,1384,941_S_4376,76.594521,0.0,1.0,-17949.11453,-104.528538,-236.251734,-30861.230910,-12256.220060,...,1.0,Normal,Normal,Normal,-35263.89451,-633.850720,-61941.53743,-95745.37551,-78969.66905,-67336.23936
1385,1385,941_S_4377,69.421918,0.0,1.0,-15992.36193,8601.241440,-63.774856,-17403.780770,-5480.603967,...,2.0,MCI,MCI,MCI,-37613.22065,-382.094253,-48520.90317,-83404.44082,-60909.52233,-52907.83773
1386,1386,941_S_4420,81.493151,1.0,0.0,-16826.37025,-551.877244,-240.304833,-19856.342990,-43372.373360,...,2.0,MCI,MCI,MCI,-34028.22866,-420.608521,-45134.76002,-119449.11290,-72759.32267,-77463.29364


**Drop Row Based on Value of Column**

Define the column, condition, and value for dropping rows
- column = 'your_column_name'
- condition = 'above'  # Options: 'equal', 'above', 'below', 'not'

Set the parameters for dropping rows

In [15]:
# column = 'City'  # The column you'd like to evaluate
# condition = 'not'  # The condition to check ('equal', 'above', 'below', 'not')
# value = 'Toronto' # The value to compare against

In [16]:
# data_df, other_df = cal_palm.drop_rows_based_on_value(column, condition, value)
# data_df

In [17]:
# data_df['subject'] = data_df['subject'].str[4:]
# data_df

Regress out a Covariate

In [18]:
data_df.columns

Index(['Unnamed__0', 'subid', 'Age', 'Male', 'Female', 'CSF_Cerebellum',
       'CSF_Subcortex', 'CSF_MTL', 'CSF_Occipital', 'CSF_Frontal',
       'CSF_Parietal', 'CSF_Temporal', 'GM_Cerebellum', 'GM_Subcortex',
       'GM_MTL', 'GM_Occipital', 'GM_Frontal', 'GM_Parietal', 'GM_Temporal',
       'WM_Cerebellum', 'WM_Subcortex', 'WM_MTL', 'WM_Occipital', 'WM_Frontal',
       'WM_Parietal', 'WM_Temporal', 'Visual_Frontal', 'Visual_Parietal',
       'Visual_Occipital', 'Visual_Temporal', 'Visual_Cerebellum',
       'Visual_Subcortex', 'Visual_MTL', 'Diagnosis', 'Sex', 'Cohort',
       'CTh_Cerebellum', 'CTh_MTL', 'CTh_Occipital', 'CTh_Frontal',
       'CTh_Parietal', 'CTh_Temporal', 'CTh_Subcortex', 'Q4', 'TOTAL11',
       'TOTALMOD', 'DX_BASELINE', 'DX_M12', 'DIAGNOSIS_BL',
       'DIAGNOSIS_CURRENT', 'DIAGNOSIS_M12', 'DIAGNOSIS_BL_Str',
       'DIAGNOSIS_CURRENT_Str', 'DIAGNOSIS_M12_Str', 'CSFGM_Cerebellum',
       'CSFGM_MTL', 'CSFGM_Occipital', 'CSFGM_Frontal', 'CSFGM_Parietal',
      

In [19]:
nimg_df.columns

Index([], dtype='object')

Regress values out of a Clinical Variable

In [20]:
from calvin_utils.statistical_utils.regression_utils import RegressOutCovariates
# Use this code block to regress out covariates. It is generally better to just include them as covariates in the model.
dependent_variable_list = ['Age']
regressors = ['DIAGNOSIS_CURRENT']

data_df, adjusted_dep_vars_list = RegressOutCovariates.run(df=data_df, dependent_variable_list=dependent_variable_list, covariates_list=regressors)
print(adjusted_dep_vars_list)


 Formula for Age: 
 Age ~ DIAGNOSIS_CURRENT
['Age_residual']


Regress Values out of the Neuroimaging Variable

In [21]:
# pending

**Standardize Data**
- List the columns you do NOT want to standardize

In [22]:
# # Remove anything you don't want to standardize
# cols_not_to_standardize = ['Age',  'Subiculum_Connectivity_T']

In [23]:
# data_df = cal_palm.standardize_columns(cols_not_to_standardize)
# data_df
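
For reference, z-scoring every numeric column except an exclusion list can be sketched in plain pandas as follows. This is an illustration, not the `standardize_columns` implementation. The helper name, the toy values, and the choice of the population standard deviation (`ddof=0`) are assumptions.

```python
import pandas as pd

def standardize_except(df, cols_not_to_standardize):
    """Z-score numeric columns, leaving the excluded columns untouched."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    cols = [c for c in numeric if c not in cols_not_to_standardize]
    # ddof=0 (population SD) is an assumption; ddof=1 is equally defensible
    out[cols] = (out[cols] - out[cols].mean()) / out[cols].std(ddof=0)
    return out

# Toy table; values are invented for illustration
toy = pd.DataFrame({
    "subid": ["s1", "s2", "s3", "s4"],
    "Age": [70.0, 75.0, 80.0, 85.0],
    "CSF_MTL": [-700.0, -450.0, -750.0, -1900.0],
})
z = standardize_except(toy, cols_not_to_standardize=["Age"])
print(z)
```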

Choose Rows to Keep

In [24]:
print(data_df.columns)

Index(['Unnamed__0', 'subid', 'Age', 'Male', 'Female', 'CSF_Cerebellum',
       'CSF_Subcortex', 'CSF_MTL', 'CSF_Occipital', 'CSF_Frontal',
       'CSF_Parietal', 'CSF_Temporal', 'GM_Cerebellum', 'GM_Subcortex',
       'GM_MTL', 'GM_Occipital', 'GM_Frontal', 'GM_Parietal', 'GM_Temporal',
       'WM_Cerebellum', 'WM_Subcortex', 'WM_MTL', 'WM_Occipital', 'WM_Frontal',
       'WM_Parietal', 'WM_Temporal', 'Visual_Frontal', 'Visual_Parietal',
       'Visual_Occipital', 'Visual_Temporal', 'Visual_Cerebellum',
       'Visual_Subcortex', 'Visual_MTL', 'Diagnosis', 'Sex', 'Cohort',
       'CTh_Cerebellum', 'CTh_MTL', 'CTh_Occipital', 'CTh_Frontal',
       'CTh_Parietal', 'CTh_Temporal', 'CTh_Subcortex', 'Q4', 'TOTAL11',
       'TOTALMOD', 'DX_BASELINE', 'DX_M12', 'DIAGNOSIS_BL',
       'DIAGNOSIS_CURRENT', 'DIAGNOSIS_M12', 'DIAGNOSIS_BL_Str',
       'DIAGNOSIS_CURRENT_Str', 'DIAGNOSIS_M12_Str', 'CSFGM_Cerebellum',
       'CSFGM_MTL', 'CSFGM_Occipital', 'CSFGM_Frontal', 'CSFGM_Parietal',
      

In [25]:
col_to_keep_list = ['Age', 'subid']

- The final DF is EXPECTED to have subject IDs which are IDENTICAL to the subject IDs that go in the neuroimaging DF column names above
- There should be only 1 variable, stored as a single row

|        |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 |  10 | ... |  40 |  41 |  42 |  43 |  45 |  46 |  47 |  48 |  49 |  50 |
|----------|------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|------------|-----|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
| Indep. Var.    | 3          | 4         | 7         | 2         | 2         | 2         | 9         | 4         | 7         | 5          | ... | 5           | 2           | 7           | 7           | 3           | 8           | 8           | 1           | 1           | 3           |

In [26]:
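# Keep only the chosen columns, then transpose so each subject becomes a column
data_df = data_df.loc[:, col_to_keep_list]
data_df = data_df.T
# Use the subject IDs as column labels and drop the now-redundant 'subid' row
data_df.columns = data_df.loc['subid']
data_df = data_df.drop('subid')
# Drop subjects (columns) with a missing value
data_df.dropna(inplace=True, axis=1)
data_df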

subid,002_S_0295,002_S_0413,002_S_0559,002_S_0619,002_S_0685,002_S_0729,002_S_0782,002_S_0816,002_S_0938,002_S_0954,...,941_S_4066,941_S_4100,941_S_4187,941_S_4255,941_S_4292,941_S_4365,941_S_4376,941_S_4377,941_S_4420,941_S_4764
Age,84.89863,76.39726,79.410959,77.512329,89.69863,65.224658,81.70411,70.838356,82.276712,69.452055,...,78.775342,78.621918,62.079452,72.517808,70.99726,80.410959,76.594521,69.421918,81.493151,82.783562
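
Since both DataFrames must share subject IDs as column labels, a quick alignment check before the permutation step can catch mismatches early. In this run `nimg_df.columns` printed as an empty Index, so a check of this kind would have stopped the notebook here. The toy frames below are stand-ins, not the real data.

```python
import pandas as pd

# Toy stand-ins for nimg_df (voxels x subjects) and data_df (1 x subjects)
toy_nimg = pd.DataFrame({"002_S_0295": [0.1, 0.2], "002_S_0413": [0.3, 0.4]})
toy_cov = pd.DataFrame({"002_S_0413": [76.4], "002_S_0559": [79.4]}, index=["Age"])

shared = toy_nimg.columns.intersection(toy_cov.columns)
if shared.empty:
    raise ValueError("No subject IDs are shared between the imaging and covariate DataFrames")

# Restrict both tables to the shared subjects, in the same column order
toy_nimg_aligned = toy_nimg.loc[:, shared]
toy_cov_aligned = toy_cov.loc[:, shared]
print(f"{len(shared)} shared subject(s): {list(shared)}")
```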


Is there a particular mask you want to use?
- MUST match the resolution of voxelwise data being analyzed. 
- If you set None, the voxelwise data will be used for thresholding. 
    - Values below mask_threshold (float) will be set to 0. 
- Warning: bad masking may result in failed experiments. Erroneous voxels outside the brain will influence the correction. 

In [27]:
mask_path = '/Users/cu135/hires_backdrops/MNI/MNI152_T1_2mm_brain_mask.nii'
mask_threshold = 0
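
The resolution requirement amounts to the mask having exactly one entry per voxel row of the imaging DataFrame. A minimal NumPy sketch of that check is shown below. The helper `voxels_in_mask` is hypothetical, and keeping voxels whose mask value is strictly greater than `mask_threshold` is an assumption about how the threshold is applied.

```python
import numpy as np

def voxels_in_mask(voxel_data, mask_volume, mask_threshold=0.0):
    """Return a boolean vector marking rows of voxel_data that fall inside the mask."""
    n_voxels = voxel_data.shape[0]
    if mask_volume.size != n_voxels:
        raise ValueError(f"Mask has {mask_volume.size} voxels, data has {n_voxels}")
    # Assumption: a voxel is kept when its mask value exceeds the threshold
    return mask_volume.ravel() > mask_threshold

# Toy 2 x 3 x 2 volume (12 voxels) with half of the voxels inside the mask
toy_mask = np.zeros((2, 3, 2))
toy_mask[0, :, :] = 1
toy_data = np.arange(12 * 4, dtype=float).reshape(12, 4)  # 12 voxels x 4 subjects

in_brain = voxels_in_mask(toy_data, toy_mask, mask_threshold=0)
masked_data = toy_data[in_brain]
print(masked_data.shape)
```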

Correlation method
- spearman or pearson

In [28]:
method = 'spearman'
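
As a reminder of how the two options differ: Spearman correlation is Pearson correlation computed on ranks, so it responds to monotonic rather than strictly linear relationships. The pure-Python sketch below uses invented toy values, and its function names are not taken from `calvin_utils`.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

age = [60, 65, 70, 75, 80]
signal = [1.0, 1.1, 1.3, 2.0, 9.0]  # monotonic in age, but far from linear

r_pearson = pearson(age, signal)
r_spearman = spearman(age, signal)
```

Here the Spearman coefficient reflects the perfectly monotonic relationship, while the Pearson coefficient is pulled down by the nonlinearity.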

Choose Max Stat Correction Method
- None | pseudo_var_smooth | var_smooth

In [29]:
max_stat_method = 'pseudo_var_smooth'
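
For orientation, the generic maximum-statistic permutation idea can be sketched as below. This is a textbook-style illustration in pure Python. It is not the `CalvinFWEMap` implementation and does not reproduce the `pseudo_var_smooth` or `var_smooth` variants, whose internals are not shown in this notebook. All names and toy data are hypothetical.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def max_stat_fwe(voxels, covariate, n_permutations=200, seed=0):
    """Return observed |r| per voxel and FWE-corrected p-values.

    voxels: list of per-voxel lists, one value per subject.
    """
    rng = random.Random(seed)
    observed = [abs(pearson(v, covariate)) for v in voxels]
    shuffled = list(covariate)
    max_null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the subject-to-covariate pairing
        # Record only the largest statistic across all voxels for this permutation
        max_null.append(max(abs(pearson(v, shuffled)) for v in voxels))
    # Corrected p: share of permutation maxima reaching the observed value.
    # The +1 terms count the observed labeling as one of the permutations.
    p_fwe = [(1 + sum(m >= obs for m in max_null)) / (1 + n_permutations)
             for obs in observed]
    return observed, p_fwe

# Toy data: 3 voxels x 12 subjects; only the first voxel tracks the covariate
toy_rng = random.Random(1)
covariate = [float(i) for i in range(12)]
voxels = [
    [2.0 * c + 1.0 for c in covariate],        # voxel tied to the covariate
    [toy_rng.gauss(0, 1) for _ in covariate],  # noise voxel
    [toy_rng.gauss(0, 1) for _ in covariate],  # noise voxel
]
observed, p_fwe = max_stat_fwe(voxels, covariate)
```

Comparing every voxel against the distribution of the permutation-wise maximum is what controls the family-wise error rate across the whole map.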

In [30]:
from calvin_utils.permutation_analysis_utils.correlation_fwe import CalvinFWEMap
calvin_fwe = CalvinFWEMap(neuroimaging_dataframe=nimg_df, 
                          variable_dataframe=data_df, 
                          mask_threshold=mask_threshold, 
                          mask_path=mask_path, out_dir=out_dir, 
                          method=method, max_stat_method=max_stat_method, vectorize=True)

calvin_fwe.permutation_test_r_map(n_permutations=1, debug=False)

ValueError: Shape of passed values is (902629, 1), indices imply (0, 1)

Visualize the FWE Corrected Image

In [None]:
calvin_fwe.corrected_img

Visualize the P-Values (FWE Corrected) Used to Correct the Above

In [None]:
calvin_fwe.p_img

Visualize the Uncorrected Image

In [None]:
calvin_fwe.uncorrected_img