# Application of Deep Knockoffs for Functional Magnetic Resonance Imaging to Generate Surrogate Data

This notebook walks through the pipeline for building knockoffs from fMRI images. The knockoffs serve as surrogate data for non-parametric tests, which yield a Statistical Parametric Map (SPM) of the brain.

One thing to note before starting:
* The fMRI images were obtained from the Brain Connectome Project and have already been preprocessed.

## Structure
The process has three main parts:
1. Fitting the **General Linear Model (GLM)** to the data: this is the classical method for obtaining the SPM. It returns the fitted beta values of $$ y = X\beta + \varepsilon $$ where $y$ is the fMRI timecourse, $X$ is the design matrix of the experiment, and $\varepsilon$ is the residual noise.


2. **Generating Knockoffs**: given the data, the algorithm trains a generator (called a "machine" in the code) that produces surrogate timecourses.
    Three methods are available:
    * Gaussian Knockoffs
    * Low Rank Knockoffs
    * Deep Knockoffs
    
    
3. Performing **Non-Parametric Tests**: the GLM is applied to the generated surrogate data to obtain surrogate beta values, which are then used to threshold the true betas. The test can be:
    * Uncorrected Non-Parametric Test
    * Corrected Non-Parametric Test
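
The difference between the two tests in step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's implementation: the function names `nonparametric_thresholds` and `threshold_betas`, the two-sided percentile rule, the max-statistic correction, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def nonparametric_thresholds(null_betas, alpha=0.05):
    """Compute two-sided thresholds from surrogate betas.

    null_betas: array of shape (n_surrogates, n_regions), betas fitted on surrogates.
    """
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    # Uncorrected: a separate percentile interval for every region
    unc_low, unc_high = np.percentile(null_betas, [lo, hi], axis=0)
    # Corrected (assumed max-statistic approach): one interval built from the
    # most extreme beta across regions in each surrogate
    cor_low = np.percentile(null_betas.min(axis=1), lo)
    cor_high = np.percentile(null_betas.max(axis=1), hi)
    return (unc_low, unc_high), (cor_low, cor_high)

def threshold_betas(true_betas, low, high):
    """Keep betas outside the null interval [low, high]; set the rest to zero."""
    significant = (true_betas < low) | (true_betas > high)
    return np.where(significant, true_betas, 0.0)

# Synthetic example: 100 surrogates, 379 regions, 5 strongly active regions
rng = np.random.default_rng(0)
null_betas = rng.standard_normal((100, 379))
true_betas = rng.standard_normal(379)
true_betas[:5] = 10.0

(unc_low, unc_high), (cor_low, cor_high) = nonparametric_thresholds(null_betas)
uncorrected = threshold_betas(true_betas, unc_low, unc_high)
corrected = threshold_betas(true_betas, cor_low, cor_high)
print(np.count_nonzero(uncorrected), np.count_nonzero(corrected))
```

Under this construction the corrected interval is never narrower than any per-region interval, so the corrected map can only keep a subset of the regions that survive the uncorrected test.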
    



## 1. GLM
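
As a rough illustration of what the fit computes, the sketch below solves $y = X\beta$ by ordinary least squares for every region at once. It is an assumption-laden toy, not the code behind `glm.run()`: the function `fit_glm`, the boxcar design, and the synthetic timecourses are hypothetical, and the convolution step reported in the log is omitted for brevity. Only the array shapes are taken from the MOTOR output.

```python
import numpy as np

def fit_glm(timecourses, design):
    """Ordinary least squares fit of y = X @ beta for every region.

    timecourses: (n_regions, n_timepoints)
    design:      (n_timepoints, n_regressors)
    returns:     betas of shape (n_regions, n_regressors)
    """
    betas, *_ = np.linalg.lstsq(design, timecourses.T, rcond=None)
    return betas.T

rng = np.random.default_rng(0)
n_regions, n_timepoints = 379, 284  # shapes reported for the MOTOR task

# Hypothetical design matrix: one on/off task regressor plus an intercept
boxcar = (np.arange(n_timepoints) // 20) % 2
design = np.column_stack([boxcar, np.ones(n_timepoints)]).astype(float)

# Synthetic timecourses generated from known betas plus a little noise
true_betas = rng.standard_normal((n_regions, 2))
noise = 0.1 * rng.standard_normal((n_regions, n_timepoints))
timecourses = true_betas @ design.T + noise

betas = fit_glm(timecourses, design)
print(betas.shape)  # (379, 2)
```

Because the data are simulated from known betas, the recovered values can be compared directly with `true_betas` to check the fit.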

In [7]:
from implementation import glm
glm.run()

 MOTOR 
Loading data for task MOTOR...
Loaded Data - Shape: (100, 379, 284)
Loaded Task Paradigms - Shape: (100, 284)
Computing GLM for task MOTOR...
Separating conditions...
Done!
Convolving...
Done!
Fitting GLM for 100 subjects and 379 regions...
Done!
Saving activations and beta values for task MOTOR...
 GAMBLING 
Loading data for task GAMBLING...
Loaded Data - Shape: (100, 379, 253)
Loaded Task Paradigms - Shape: (100, 253)
Computing GLM for task GAMBLING...
Separating conditions...
Done!
Convolving...
Done!
Fitting GLM for 100 subjects and 379 regions...
Done!
Saving activations and beta values for task GAMBLING...
 RELATIONAL 
Loading data for task RELATIONAL...
Loaded Data - Shape: (100, 379, 232)
Loaded Task Paradigms - Shape: (100, 232)
Computing GLM for task RELATIONAL...
Separating conditions...
Done!
Convolving...
Done!
Fitting GLM for 100 subjects and 379 regions...
Done!
Saving activations and beta values for task RELATIONAL...
 SOCIAL 
Loading data for task SOCIAL...
Loa

## 2. Building Knockoffs

In [8]:
task = 'MOTOR'
subject = 1

### a) Gaussian Knockoffs
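
The cell in this section calls `pre_process(max_corr=.3)` to cluster the data and avoid correlations. One plausible reading of that step is sketched below: regions are grouped so that any two regions in different groups have an absolute correlation below `max_corr`. The function `cluster_by_correlation` and the toy data are hypothetical, and the actual `pre_process` method may work differently.

```python
import numpy as np

def cluster_by_correlation(X, max_corr=0.3):
    """Group the columns of X so that columns in different groups
    have |correlation| < max_corr.

    X: (n_samples, n_features). Returns one integer label per feature.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    labels = np.arange(corr.shape[0])
    # Merge the groups of every pair whose |correlation| reaches the threshold
    for i, j in zip(*np.nonzero(np.triu(corr >= max_corr, k=1))):
        li, lj = labels[i], labels[j]
        if li != lj:
            labels[labels == lj] = li
    # Relabel the groups as 0 .. n_groups - 1
    return np.unique(labels, return_inverse=True)[1]

# Toy data: two pairs of nearly identical signals and two independent ones
rng = np.random.default_rng(0)
n = 2000
a, b, c, d = rng.standard_normal((4, n))
X_toy = np.column_stack([
    a, a + 0.1 * rng.standard_normal(n),
    b, b + 0.1 * rng.standard_normal(n),
    c, d,
])

labels = cluster_by_correlation(X_toy, max_corr=0.3)
print(labels)
```

Keeping one representative signal per group (for example the group mean) would then give a set of weakly correlated inputs for the knockoff generator.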

In [None]:
from implementation import knockoff_class  # assumed to live next to `glm`; adjust if it is elsewhere

gaussian = knockoff_class.GaussianKnockOff(task, subject)

# Pre-process the data: cluster to avoid correlations
gaussian.pre_process(max_corr=.3)

# Train the machine that builds the knockoffs
gaussian.fit()

# Later pipeline steps, kept commented out; `d` is not defined in this cell
# and presumably stands for a knockoff object such as `gaussian`
# d.load_machine()  # load a previously fitted machine (Deep Knockoffs only)
# d.diagnostics()
# data_deep = d.transform()  # create knockoffs for a specific subject (101 x regions)
# ko_betas = d.statistic(data_deep, save=True)  # run the GLM on the knockoffs
# uncorrected_betas, corrected_betas = d.threshold(ko_betas, save=True)  # threshold with non-parametric tests
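
For intuition about what a Gaussian knockoff generator produces, the sketch below samples knockoffs with the standard equicorrelated model-X construction: with $D = sI$ and $2\Sigma - D$ positive definite, the knockoff is drawn from $\tilde{X} \mid X \sim \mathcal{N}(X - (X - \mu)\Sigma^{-1}D,\; 2D - D\Sigma^{-1}D)$. This is a textbook-style illustration, not the code inside `GaussianKnockOff`: the function `sample_gaussian_knockoffs`, the `shrink` factor, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def sample_gaussian_knockoffs(X, rng, shrink=0.99):
    """Sample equicorrelated Gaussian knockoffs for the rows of X.

    X: (n_samples, n_features), assumed roughly multivariate normal with
    n_samples > n_features so the sample covariance is invertible.
    Returns the knockoff matrix and the scalar s used in D = s * I.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Equicorrelated choice: s strictly below 2 * lambda_min(Sigma)
    lam_min = np.linalg.eigvalsh(Sigma)[0]
    s = shrink * min(2.0 * lam_min, Sigma.diagonal().min())
    Sigma_inv_D = s * np.linalg.inv(Sigma)            # Sigma^{-1} D
    cond_mean = X - (X - mu) @ Sigma_inv_D
    cond_cov = 2.0 * s * np.eye(p) - s * Sigma_inv_D  # 2D - D Sigma^{-1} D
    L = np.linalg.cholesky(cond_cov)
    return cond_mean + rng.standard_normal((n, p)) @ L.T, s

# Synthetic Gaussian data with a random, well-conditioned covariance
rng = np.random.default_rng(0)
n, p = 20000, 5
A = rng.standard_normal((p, p))
X = rng.multivariate_normal(np.zeros(p), A @ A.T / p + np.eye(p), size=n)

Xk, s = sample_gaussian_knockoffs(X, rng)

# Knockoff property: cov(Xk) matches Sigma and cov(X, Xk) matches Sigma - D
joint = np.cov(np.hstack([X, Xk]), rowvar=False)
Sigma_hat, cross, Sigma_k = joint[:p, :p], joint[:p, p:], joint[p:, p:]
print(np.abs(Sigma_k - Sigma_hat).max())
print(np.abs(cross - (Sigma_hat - s * np.eye(p))).max())
```

Both printed deviations should be small, showing that the knockoffs reproduce the covariance of the data while each knockoff column is less correlated with its own original than the original is with itself.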