<a href="https://colab.research.google.com/github/tchaase/cVAE_autism/blob/main/code/cVAE_autism.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Contrastive Variational Autoencoder for the ABIDE Data Set

Author - Tobias Haase

## Imports

First, I import the modules that I will use throughout this notebook.



In [None]:
import torch  # The main PyTorch library for tensor computations and neural network operations

import torch.nn as nn  # Provides various neural network layers and functionalities
import torch.nn.functional as F  # Provides functional interfaces to common operations (e.g., activation functions)
import torch.optim as optim  # Contains various optimization algorithms (e.g., SGD, Adam)

import torchvision  # A PyTorch library for computer vision tasks
import torchvision.transforms as transforms  # Provides common image transformations (e.g., resizing, normalization)
from torchvision.transforms import ToTensor  # Transforms PIL images to tensors
from torch.utils.data import Dataset, DataLoader  # Provides tools for creating custom datasets and data loaders

import numpy as np  # NumPy library for numerical computations and array operations
import matplotlib  # Matplotlib library for data visualization
import matplotlib.pyplot as plt  # Matplotlib's pyplot module for creating plots
from tqdm import tqdm  # Progress bar library for tracking iterations


Next, let's load the data using nilearn's `fetch_abide_pcp` function. This function fetches data that was already preprocessed by the [Preprocessed Connectomes Project](http://preprocessed-connectomes-project.org/index.html) (PCP). Within this project, the data was preprocessed with four different pipelines.
  

>Due to the controversies surrounding bandpass filtering and global signal regression, four different preprocessing strategies were performed with each pipeline: all combinations of with and without filtering and with and without global signal correction.
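
To make the four strategies from the quote concrete, here is a minimal sketch that enumerates every combination of filtering and global signal correction. The helper `strategy_grid` is hypothetical and only for illustration; the keys `band_pass_filtering` and `global_signal_regression` are chosen to match the keyword arguments of nilearn's `fetch_abide_pcp`, and the labels follow the strategy names used by the PCP as far as I can tell.

```python
from itertools import product


def strategy_grid():
    """Hypothetical helper: map each strategy label to fetch keyword arguments."""
    grid = {}
    # Two binary choices (filtering, global signal regression) give four strategies.
    for band_pass, gsr in product([True, False], repeat=2):
        label = ("filt" if band_pass else "nofilt") + "_" + ("global" if gsr else "noglobal")
        grid[label] = {
            "band_pass_filtering": band_pass,
            "global_signal_regression": gsr,
        }
    return grid


for label, kwargs in strategy_grid().items():
    print(label, kwargs)
```

Each dictionary could then be unpacked into the fetch call to request the corresponding variant of the data.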

So, the first question to answer is which preprocessing pipeline I should use. Let's go over them one by one. For each pipeline, I list which data the preprocessing covers and the key features that set it apart from the others. Under dependencies, I mostly list the software each pipeline relied on during preprocessing, not what is required to load the preprocessed data!

1. [Connectome Computation System](http://preprocessed-connectomes-project.org/abide/ccs.html):
    * Preprocessing Steps: CCS involves the usual preprocessing steps, in which both the structural and functional data are preprocessed.
    * Key Features: This pipeline integrates FSL and FreeSurfer. It is implemented primarily in Bash, with parts written in various other programming languages.
    * Dependencies: This pipeline depends on FSL (skull stripping, normalization, etc.), FreeSurfer (e.g. anatomical segmentation, surface reconstruction), and AFNI (various preprocessing tools come from here).
2. [Configurable Pipeline for the Analysis of Connectomes](http://preprocessed-connectomes-project.org/abide/cpac.html):
    * Preprocessing Steps: CPAC incorporates a range of preprocessing steps for both structural and functional data. This includes motion correction, slice timing correction, spatial normalization, intensity normalization, nuisance signal regression, and band-pass filtering.
    * Key Features: Most importantly, CPAC offers a high level of configurability, as the name suggests. This lets users choose among several processing options based on their study requirements. It provides various quality control measures and outputs, including preprocessed functional connectivity matrices!
    * Dependencies: CPAC is primarily implemented in Python and relies on various libraries and tools such as Nipype, FSL, ANTs, and AFNI.
3. [Data Processing Assistant for Resting-State fMRI](http://preprocessed-connectomes-project.org/abide/dparsf.html):
    * Preprocessing Steps: DPARSF focuses on resting-state functional MRI data and includes standard preprocessing steps such as slice timing correction, realignment (motion correction), spatial normalization, smoothing, and nuisance signal regression.
    * Key Features: DPARSF provides a graphical user interface. There is a certain level of configurability, as output options can be chosen.
    * Dependencies: DPARSF is implemented in MATLAB and requires the SPM (Statistical Parametric Mapping) toolbox for some of the preprocessing steps.
4. [Neuroimaging Analysis Kit](http://preprocessed-connectomes-project.org/abide/niak.html):
    * Preprocessing Steps: NIAK allows customization of preprocessing steps, including motion correction, slice timing correction, spatial normalization, smoothing, and nuisance signal regression. It also offers quality control measures.
    * Key Features: NIAK provides a flexible and versatile pipeline for functional and structural MRI data. It offers a command-line interface and the ability to select specific processing options based on the research requirements.
    * Dependencies: NIAK is primarily implemented in MATLAB and relies on various external software packages such as FSL, ANTs, and AFNI for specific preprocessing steps.

My sources for this information are both the website and ChatGPT.

It seems to me that I can stick with the default pipeline of `fetch_abide_pcp` for now, which is **cpac**.

Importantly, quality control was already performed for this data, and I will only load the data that has gone through the quality control successfully.
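
The choices made so far (the cpac pipeline, quality-checked data only) can be written down explicitly rather than left to defaults. The following is only a sketch: `abide_fetch_kwargs` and `VALID_PIPELINES` are hypothetical names introduced here, while `pipeline`, `quality_checked`, `n_subjects`, and `data_dir` are keyword arguments of nilearn's `fetch_abide_pcp`. Nothing is downloaded by this snippet.

```python
# Pipeline identifiers accepted by nilearn's fetch_abide_pcp
VALID_PIPELINES = ("cpac", "ccs", "dparsf", "niak")


def abide_fetch_kwargs(pipeline="cpac", n_subjects=1, data_dir="./data"):
    """Hypothetical helper: collect the fetch arguments in one place."""
    if pipeline not in VALID_PIPELINES:
        raise ValueError(f"Unknown pipeline {pipeline!r}; expected one of {VALID_PIPELINES}")
    return {
        "data_dir": data_dir,
        "n_subjects": n_subjects,
        "pipeline": pipeline,
        "quality_checked": True,  # keep only subjects that passed quality control
    }


fetch_kwargs = abide_fetch_kwargs()
print(fetch_kwargs)
# The actual download would then be: datasets.fetch_abide_pcp(**fetch_kwargs)
```

Spelling the arguments out makes it easy to switch to another pipeline later by changing a single value.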

For now, I am just loading one participant.

In [None]:
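from nilearn import datasets  # nilearn is not imported in the cell above
abide = datasets.fetch_abide_pcp(data_dir="./data", n_subjects=1)  # keep the result for later use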