# Run FSL Palm

### Authors: Calvin Howard, Alexander Cohen, Christopher Lin, William Drew.

#### Last updated: July 6, 2023

Use this to run/test a statistical model (e.g., regression or T-tests) on lesion network maps (or lesions alone!) using PALM, potentially taking into account specific covariates of interest and/or nuisance regressors.

Notes:
- To best use this notebook, you should be familiar with GLM design and contrast matrix design. See this webpage to get started:
[FSL's GLM page](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/GLM)
- This notebook is a combination of the old PALM Notebooks and Christopher's palm_analysis notebooks (does the same thing) and requires the NIMLAB Python 3 environment as a kernel. Directions are on the [NIMLAB software_env README.md](https://github.com/nimlab/software_env)
- You will need a csv file that provides the paths to your fcMaps, usually created from the [Preprocessing](https://github.com/nimlab/templates/blob/master/py3_notebooks/1_Preprocessing_LesionQA_fcSurfBcbLqtGen_nimtrack.ipynb) notebook.
- Christopher wrote up a really nice description of how to modify code to set up your design matrix within the notebook here: [PALM-Analysis](https://github.com/nimlab/documentation/wiki/PALM-experimental-designs)
- I would also recommend reviewing Fred's [slides](https://github.com/nimlab/documentation/blob/master/presentations/presentation_palm_multidataset_analyses_labmeeting_13.4.2020.pdf) from his May 2020 lab meeting for details on the difference between random-effects and fixed-effects analyses and on the use of exchangeability blocks.

# 01 - Import CSV with All Data
**The CSV is expected to be in this format**
- ID and absolute paths to niftis are critical
```
+-----+----------------------------+--------------+--------------+--------------+
| ID  | Nifti_File_Path            | Covariate_1  | Covariate_2  | Covariate_3  |
+-----+----------------------------+--------------+--------------+--------------+
| 1   | /path/to/file1.nii.gz      | 0.5          | 1.2          | 3.4          |
| 2   | /path/to/file2.nii.gz      | 0.7          | 1.4          | 3.1          |
| 3   | /path/to/file3.nii.gz      | 0.6          | 1.5          | 3.5          |
| 4   | /path/to/file4.nii.gz      | 0.9          | 1.1          | 3.2          |
| ... | ...                        | ...          | ...          | ...          |
+-----+----------------------------+--------------+--------------+--------------+
```
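Before handing the CSV to the notebook, it can help to confirm that the critical columns are present and that every path is absolute. The following is a minimal sketch using only the standard library. The column names `ID` and `Nifti_File_Path` are taken from the example table above and are assumptions about your file; `check_rows` is a hypothetical helper, not part of `calvin_utils`.

```python
import csv
import io
import os

# Column names copied from the example table; adjust them to match your CSV.
REQUIRED_COLUMNS = ("ID", "Nifti_File_Path")


def check_rows(csv_text):
    """Return a list of (ID, problem) tuples for rows with unusable paths."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    problems = []
    for row in reader:
        path = row["Nifti_File_Path"]
        if not os.path.isabs(path):
            problems.append((row["ID"], "path is not absolute"))
        elif not os.path.exists(path):
            problems.append((row["ID"], "file not found"))
    return problems


# Illustrative data only: the second row deliberately uses a relative path.
example = """ID,Nifti_File_Path,Covariate_1
1,/path/to/file1.nii.gz,0.5
2,relative/file2.nii.gz,0.7
"""
problems = check_rows(example)
for subject_id, problem in problems:
    print(subject_id, problem)
```

To check a real file, read its contents (for example with `open(input_csv_path).read()`) and pass the text to `check_rows`.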

In [19]:
# Specify the path to your CSV file containing NIFTI paths
input_csv_path = '/Users/cu135/Dropbox (Partners HealthCare)/studies/ccm_memory/metadata/file_paths_and_clinical_data.csv'

In [20]:
# Specify where you want to save your results to
out_dir = '/Users/cu135/Dropbox (Partners HealthCare)/studies/ccm_memory/neuroimaging/derivatives/palm'

In [21]:
from calvin_utils.permutation_analysis_utils.palm_utils import CalvinPalm

# Instantiate the CalvinPalm class
cal_palm = CalvinPalm(input_csv_path=input_csv_path, output_dir=out_dir)

# Read the CSV into a DataFrame
data_df = cal_palm.read_data()

# Display data_df
display(data_df)


Unnamed: 0,AvgR_Fz,Avg_RFz_Local,Intercept,dataset,Memory,Spatial_Memory,Verbal_Memory,Episodic_Memory,Semantic,Subjective_Memory,Objective_Memory,Control
0,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,1,1,1,0,1,0,0,1,0
1,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,1,1,0,1,1,0,0,1,0
2,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,2,1,0,1,1,0,0,1,0
3,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
4,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
5,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
6,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
7,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
8,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0
9,/PHShome/cu135/studies/ccm_memory/neuroimaging...,/Users/cu135/Dropbox (Partners HealthCare)/stu...,1,3,1,0,1,1,0,0,1,0


# 02 - Define Your Design Matrix

This is the explanatory variable half of your regression formula
_______________________________________________________
Create Design Matrix: Use the create_design_matrix method. You can provide a list of formula variables which correspond to column names in your dataframe.

- Example: `design_matrix = cal_palm.create_design_matrix(formula_vars=["var1", "var2", "var1*var2"], data_df=data_df)`
- To include interaction terms, use * between variables, like "var1*var2".
- By default, an intercept will be added unless you set intercept=False
- **don't explicitly add the 'intercept' column. I'll do it for you.**
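To make the `*` notation concrete, here is a minimal sketch of how an interaction term can be expanded into a product column. It is an illustration only, not the implementation of `create_design_matrix`: `build_design_matrix` is a hypothetical helper, the underscore-joined column names are an assumption based on the column names this notebook displays, and it does not add any lower-order terms the real method may generate.

```python
from math import prod


def build_design_matrix(columns, formula_vars, intercept=True):
    """Expand terms such as "a*b" into elementwise product columns."""
    n_rows = len(next(iter(columns.values())))
    design = {}
    if intercept:
        # The intercept is a constant column of ones.
        design["Intercept"] = [1.0] * n_rows
    for term in formula_vars:
        factors = [name.strip() for name in term.split("*")]
        # Multiply the factor columns row by row.
        design["_".join(factors)] = [
            float(prod(columns[f][i] for f in factors)) for i in range(n_rows)
        ]
    return design


# Toy values for illustration only.
columns = {"Memory": [1, 1, 1, 1], "dataset": [1, 1, 2, 3]}
design = build_design_matrix(
    columns, ["Memory", "dataset", "dataset*dataset*Memory"]
)
for name, values in design.items():
    print(name, values)
```

Note that multiplying a numerically coded `dataset` column treats the dataset label as a continuous quantity; whether that is appropriate depends on your design.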

In [22]:
data_df.columns

Index(['AvgR_Fz', 'Avg_RFz_Local', 'Intercept', 'dataset', 'Memory',
       'Spatial_Memory', 'Verbal_Memory', 'Episodic_Memory', 'Semantic',
       'Subjective_Memory', 'Objective_Memory', 'Control'],
      dtype='object')

Use the above printout to copy-paste relevant covariates into your explanatory_variable_list

In [23]:
explanatory_variable_list = ['Memory', 'dataset', 'dataset*dataset*Memory']

In [24]:
design_matrix = cal_palm.create_design_matrix(formula_vars=explanatory_variable_list, data_df=data_df, intercept=True)
design_matrix

id,Intercept,Memory,dataset,dataset_Memory,dataset_dataset_Memory
0,1.0,1.0,1.0,1.0,1.0
1,1.0,1.0,1.0,1.0,1.0
2,1.0,1.0,2.0,2.0,4.0
3,1.0,1.0,3.0,3.0,9.0
4,1.0,1.0,3.0,3.0,9.0
5,1.0,1.0,3.0,3.0,9.0
6,1.0,1.0,3.0,3.0,9.0
7,1.0,1.0,3.0,3.0,9.0
8,1.0,1.0,3.0,3.0,9.0
9,1.0,1.0,3.0,3.0,9.0


# 03 - Define Your Dependent Variable

Typically, this is just your nifti files. However, I have added functionality that allows you to set other values as your dependent variable, or to create interactions. Simply define a formula of the form

dependent_variable = var1

or

dependent_variable = var1*var2

and I will generate your dependent variable nifti files for PALM

In [25]:
data_df.columns

Index(['AvgR_Fz', 'Avg_RFz_Local', 'Intercept', 'dataset', 'Memory',
       'Spatial_Memory', 'Verbal_Memory', 'Episodic_Memory', 'Semantic',
       'Subjective_Memory', 'Objective_Memory', 'Control'],
      dtype='object')

Use the above printout to select the column which will be your dependent variable.

Functionality is not yet implemented to allow interactions in creating your columns.

In [26]:
dependent_variable = 'Avg_RFz_Local'

In [27]:
dv = cal_palm.set_dependent_variable(dependent_variable)
dv

id
0     /Users/cu135/Dropbox (Partners HealthCare)/stu...
1     /Users/cu135/Dropbox (Partners HealthCare)/stu...
2     /Users/cu135/Dropbox (Partners HealthCare)/stu...
3     /Users/cu135/Dropbox (Partners HealthCare)/stu...
4     /Users/cu135/Dropbox (Partners HealthCare)/stu...
5     /Users/cu135/Dropbox (Partners HealthCare)/stu...
6     /Users/cu135/Dropbox (Partners HealthCare)/stu...
7     /Users/cu135/Dropbox (Partners HealthCare)/stu...
8     /Users/cu135/Dropbox (Partners HealthCare)/stu...
9     /Users/cu135/Dropbox (Partners HealthCare)/stu...
10    /Users/cu135/Dropbox (Partners HealthCare)/stu...
11    /Users/cu135/Dropbox (Partners HealthCare)/stu...
12    /Users/cu135/Dropbox (Partners HealthCare)/stu...
13    /Users/cu135/Dropbox (Partners HealthCare)/stu...
14    /Users/cu135/Dropbox (Partners HealthCare)/stu...
15    /Users/cu135/Dropbox (Partners HealthCare)/stu...
16    /Users/cu135/Dropbox (Partners HealthCare)/stu...
17    /Users/cu135/Dropbox (Partners HealthCa

# 04 - Convert Your DV To a 4D Nifti for PALM to Accept

Set `absval=True` if you would like PALM to run on the absolute values of your dependent variable.
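Conceptually, this step stacks one 3D volume per subject along a fourth axis and optionally takes absolute values. The sketch below illustrates that idea with NumPy arrays standing in for loaded images. It is an assumption about what the step amounts to, not the code inside `generate_4d_dependent_variable_nifti`, and `stack_to_4d` is a hypothetical helper; reading and writing NIfTI files is omitted.

```python
import numpy as np


def stack_to_4d(volumes, absval=False):
    """Stack equally shaped 3D arrays along a new last axis (one per subject)."""
    stacked = np.stack(volumes, axis=-1)
    # Optionally discard the sign of every voxel value.
    return np.abs(stacked) if absval else stacked


# Three random toy "subjects" on a 4 x 5 x 6 grid, for illustration only.
rng = np.random.default_rng(0)
volumes = [rng.standard_normal((4, 5, 6)) for _ in range(3)]
dv_4d = stack_to_4d(volumes, absval=True)
print(dv_4d.shape)
```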

In [28]:
dv_nifti_file_path = cal_palm.generate_4d_dependent_variable_nifti(dv, absval=False)

# 05 - Generate Contrasts

Generate a Contrast Matrix
- This is different from the contrast matrices used in cell-means regressions such as in PALM, but it is much more powerful. 



For more information on contrast matrices, please refer to this: https://cran.r-project.org/web/packages/codingMatrices/vignettes/codingMatrices.pdf

Generally, these drastically affect the results of an ANOVA. However, they are merely a nuisance for a regression.
In essence, they assess the coefficients of a given model.

________________________________________________________________
A coding matrix (a contrast matrix if it sums to zero) is simply a way of defining which coefficients to evaluate and how to evaluate them. 
If a coefficient is set to 1 and everything else is set to zero, we are taking the mean of that coefficient and assessing whether it significantly
deviates from zero, i.e., we are checking whether it had a significant impact on the ability to predict the dependent variable.
If one coefficient is set to 1, another to -1, and the others to 0, we are assessing how the means of the two coefficients deviate from each other. 
If several coefficients are 1 and several others are -1, we are assessing how the group-level means of the two sets of coefficients deviate from each other.
If one group of coefficients is 1, one group is -1, and one group is 0, we are only assessing how the +1 and -1 groups differ in their means. 

1: This value indicates that the corresponding variable's coefficient in the model is included in the contrast. It means you are interested in estimating the effect of that variable.

0: This value indicates that the corresponding variable's coefficient in the model is not included in the contrast. It means you are not interested in estimating the effect of that variable.

-1: This value indicates that the corresponding variable's coefficient in the model is included in the contrast, but with an opposite sign. It means you are interested in estimating the negative effect of that variable.

----------------------------------------------------------------
The contrast matrix is typically a matrix with dimensions (number of contrasts) x (number of regression coefficients). Each row of the contrast matrix represents a contrast or comparison you want to test.

For example, let's say you have the following regression coefficients in your model:

Intercept, Age, connectivity, Age_interaction_connectivity

A contrast matrix has dimensions [n_contrasts, n_predictors], where each row is one contrast.

If you want to test the hypothesis that the effect of Age is significant, you can set up a contrast matrix with a row that specifies this contrast (actually an averaging vector):
```
[0,1,0,0]. This is an averaging vector because it sums to 1
```
This contrast will test the coefficient corresponding to the Age variable against zero.


If you want to test the hypothesis that the effect of Age is different from the effect of connectivity, you can set up a contrast matrix with a single row:
```
[0,1,-1,0]. This is a contrast because it sums to 0
```

Thus, if you want to see if any given effect is significant compared to the intercept (average), you can use the following contrast matrix:
```
[1,0,0,0]
[-1,1,0,0]
[-1,0,1,0]
[-1,0,0,1] actually a coding matrix of averaging vectors
```

The first row tests the intercept against zero. Each subsequent row tests whether one coefficient (Age, connectivity, or their interaction) differs from the intercept.
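The row vectors discussed above can be checked numerically. This minimal sketch classifies a row by its sum and computes the quantity the row tests, which is the weighted sum of the coefficients. The coefficient values are made up for illustration, and `classify_row` and `contrast_estimate` are hypothetical helpers rather than part of PALM or `calvin_utils`.

```python
def classify_row(row, tol=1e-9):
    """Label a row by its sum: 0 is a contrast, 1 is an averaging vector."""
    total = sum(row)
    if abs(total) < tol:
        return "contrast"
    if abs(total - 1.0) < tol:
        return "averaging vector"
    return "other"


def contrast_estimate(row, betas):
    """Weighted sum of coefficients; this is the value tested against zero."""
    return sum(c * b for c, b in zip(row, betas))


# Hypothetical coefficients, ordered as:
# Intercept, Age, connectivity, Age_interaction_connectivity
betas = [2.0, 0.5, -0.25, 0.1]

rows = {
    "Age vs zero": [0, 1, 0, 0],
    "Age vs connectivity": [0, 1, -1, 0],
    "Age vs intercept": [-1, 1, 0, 0],
}
for label, row in rows.items():
    print(label, classify_row(row), contrast_estimate(row, betas))
```

The estimate alone says nothing about significance; PALM builds a permutation distribution of the corresponding test statistic to decide that.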
_____
You can define any number of contrasts in the contrast matrix to test different hypotheses or comparisons of interest in your regression analysis.

It's important to note that the specific contrasts you choose depend on your research questions and hypotheses. You should carefully consider the comparisons you want to make and design the contrast matrix accordingly.

- Examples:
    - [Two Sample T-Test](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/GLM#Two-Group_Difference_.28Two-Sample_Unpaired_T-Test.29)
    - [One Sample with Covariate](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/GLM#Single-Group_Average_with_Additional_Covariate)

In [35]:
# Generate the basic contrast matrix and display it
contrast_matrix, _ = cal_palm.generate_basic_contrast_matrix(design_matrix)
print('\n Copy the below matrix for manipulation:')
print(contrast_matrix)
# If you want to modify the contrast_matrix, do it here
# contrast_matrix = 

This is a basic contrast matrix set up to evaluate the significance of each variable.
Copy it into a cell below and edit it for more control over your analysis.
   Intercept  Memory  dataset  dataset_Memory  dataset_dataset_Memory
0        1.0     0.0      0.0             0.0                     0.0
1       -1.0     1.0      0.0             0.0                     0.0
2       -1.0     0.0      1.0             0.0                     0.0
3       -1.0     0.0      0.0             1.0                     0.0
4       -1.0     0.0      0.0             0.0                     1.0

 Copy the below matrix for manipulation:
[[ 1.  0.  0.  0.  0.]
 [-1.  1.  0.  0.  0.]
 [-1.  0.  1.  0.  0.]
 [-1.  0.  0.  1.  0.]
 [-1.  0.  0.  0.  1.]]
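The printed matrix follows a simple pattern: an identity matrix whose first column is set to -1 in every row after the first. The sketch below rebuilds that pattern with NumPy and then shows one possible manual edit. It mirrors the printout only and is not the implementation of `generate_basic_contrast_matrix`; `basic_contrast_matrix` and `memory_only` are hypothetical names.

```python
import numpy as np


def basic_contrast_matrix(n_regressors):
    """Identity matrix with -1 in the intercept column of rows 1 and later."""
    matrix = np.eye(n_regressors)
    matrix[1:, 0] = -1.0
    return matrix


# Five regressors: Intercept, Memory, dataset, dataset_Memory,
# dataset_dataset_Memory.
sketch_matrix = basic_contrast_matrix(5)
print(sketch_matrix)

# Example of a manual edit: a single row that tests the Memory
# coefficient alone against zero.
memory_only = np.array([[0.0, 1.0, 0.0, 0.0, 0.0]])
print(memory_only)
```

Any edited matrix needs one column per design-matrix regressor, in the same order as the design matrix columns.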


In [36]:
contrast_matrix = cal_palm.finalize_contrast_matrix(design_matrix=design_matrix, 
                                                    contrast_matrix=contrast_matrix) 
contrast_matrix

# 06 - Define Exchangeability Blocks (Optional)

Optional - Exchangeability Blocks
- This is optional and for when you are doing a 'meta-analysis' of multiple data types, e.g. strokes and DBS sites
- This is a column of integers that can usually be generated from the dataset names. Details on the [PALM website](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/PALM/ExchangeabilityBlocks)
- To use this, pass `eb=eb_matrix` to the `calvins_call_palm` call below.

In [38]:
### This is just an example, you will have to edit to adapt to your data, 
### but it should be integers, starting with 1,2,3....

# coding_key = {"Prosopagnosia_w_Yeo1000": 1,
#              "Corbetta_Lesions": 1,
#              "DBS_dataset": 2
#              }

# eb_matrix = data_df['dataset'].replace(coding_key)
# display(eb_matrix)

# 07 - Are You Doing a 1 Sample T-Test?
- set one_sample_ttest=True if you are
- This is generally a test you run when you have 1 dataset and you are trying to figure out if it significantly deviates from zero. 
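For intuition, a one-sample permutation test can be carried out by randomly flipping the sign of each observation, which is valid when errors are independent and symmetric around zero. The sketch below does this exhaustively for a handful of values. It assumes that `ise_flag` in the call further down corresponds to PALM's `-ise` option; `sign_flip_pvalue` is a hypothetical helper working on plain numbers, not on images.

```python
from itertools import product


def sign_flip_pvalue(values):
    """Two-sided p-value for mean != 0 using every possible sign flip."""
    n = len(values)
    observed = abs(sum(values) / n)
    count = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        flipped_mean = abs(sum(s * v for s, v in zip(signs, values)) / n)
        # Count flips whose mean is at least as extreme as the observed one.
        if flipped_mean >= observed - 1e-12:
            count += 1
        total += 1
    return count / total


# Toy values for a single voxel across six subjects (illustration only).
p_value = sign_flip_pvalue([0.9, 1.1, 0.7, 1.3, 0.8, 1.0])
print(p_value)
```

With many subjects the number of sign flips grows as 2^n, so in practice a random subset of flips is drawn, which is what the permutation count controls.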

In [39]:
one_sample_ttest=True

# 08 - Call PALM

In [40]:
number_of_permutations=10000
cluster_username = "cu135"
cluster_email = "choward12@bwh.harvard.edu"

Set additional arguments here

In [41]:
# Call the PALM method
from calvin_utils.permutation_analysis_utils.palm_utils import CalvinPalmSubmitter

cal_palm = CalvinPalmSubmitter()
cal_palm.calvins_call_palm(
             dv_nifti_file_path=dv_nifti_file_path,
             design_matrix=design_matrix,
             contrast_matrix=contrast_matrix,
             working_directory=None,
             output_directory=out_dir,
             iterations=number_of_permutations,
             accel="tail",
             voxelwise_evs=None,
             eb=None,
             mask="",
             save_1p=True,
             logp=False,
             tfce=False,
             ise_flag=one_sample_ttest,
             two_tailed_flag=True,
             corrcon_flag=False,
             fdr_flag=False,
             cluster_name="erisone",
             username=cluster_username,
             cluster_email=cluster_email,
             queue="normal",
             cores="1",
             memory="6000",
             dryrun=False,
             job_name="fsl_palm",
             job_time="",
             num_nodes="",
             num_tasks="",
             x11_forwarding="",
             service_class="",
             debug=False,
             extra=""
)

FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/fsl/bin/Text2Vest'
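The `FileNotFoundError` above indicates that the FSL tool `Text2Vest` was not found at `/usr/local/fsl/bin` on the machine that ran this cell. A quick pre-flight check such as the following sketch can catch this before submitting. It assumes FSL tools are either on your `PATH` or under `$FSLDIR/bin`; `find_fsl_tool` is a hypothetical helper, not part of `calvin_utils`.

```python
import os
import shutil


def find_fsl_tool(name="Text2Vest"):
    """Return the full path to an FSL tool, or None if it cannot be found."""
    found = shutil.which(name)
    if found:
        return found
    # Fall back to $FSLDIR/bin if the FSLDIR environment variable is set.
    fsldir = os.environ.get("FSLDIR")
    if fsldir:
        candidate = os.path.join(fsldir, "bin", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


tool_path = find_fsl_tool("Text2Vest")
if tool_path is None:
    print("Text2Vest not found; check your FSL installation and PATH.")
else:
    print("Text2Vest found at", tool_path)
```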

PALM is now running. Enjoy.

-- Calvin