# fMRIPrep pre-processing and post-processing - connectivity matrices extraction using a parcellation

This notebook is intended to give a brief overview of a pipeline to:

1) Download data from Beluga, an HPC  
2) Verify the data with bids-validator using a Docker container  
3) Run the pre-processing with fMRIPrep using a Docker container  
4) Create and run a post-processing pipeline in Nilearn to obtain connectivity matrices

I tend to be quite verbose in the descriptions below. Please skip to each section's tl;dr to simply get the code and a brief description.

## Background

The data used in this tutorial is anonymized data from the Prevent-AD Cohort. Data from the Prevent-AD is open and available at this [address](https://portal.conp.ca/dataset?id=projects/preventad-open). Unfortunately, the images from the cohort on the CONP are only available in .mnc format. While .mnc images can be converted to .nii format, they cannot be converted to BIDS at this time.

As such, this tutorial uses closed data from the Prevent-AD cohort, which was readily available in BIDS format. As the data is not open, the participant ID was anonymized and the data cannot be shared to reproduce the results.

However, the tools used and the tutorial should be applicable to any bids-validated datasets, which is why we run the bids-validator first. 

## 1) Downloading data from Beluga

In this step, we simply download the data from Beluga. The exact code is not shared as it would leak the actual participant ID. However, I share below the command that would be used to download the data. 

```scp -r user@beluga.computecanada.ca:/path/to/directory/to/copy /directory/local/computer/to/copy```

In this ```scp``` command, the ```-r``` argument serves to copy whole directories. It is important to put ```:``` right after the address of the HPC (i.e. after ```user@beluga.computecanada.ca```). The path after the ```:``` refers to the path on the supercomputer that needs to be copied to the local computer. This path is followed by a white space and then by the path of the folder on the local computer where the folder from the HPC should be copied.

Should you want to copy a folder from your computer to the HPC (e.g. after the pre-processing), you simply need to invert the order of the paths.

```scp -r /directory/local/computer/to/copy/to/HPC user@beluga.computecanada.ca:/path/to/remote/directory/```
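If you prefer to launch the transfer from Python, the two commands above can be assembled as lists of arguments. This is only a minimal sketch: the helper name ```build_scp_command``` is hypothetical, the paths are the same placeholders as above, and nothing is executed here.

```python
import shlex


def build_scp_command(source, destination, recursive=True):
    """Return the scp command as a list of arguments (nothing is executed)."""
    command = ["scp"]
    if recursive:
        command.append("-r")  # copy whole directories
    command += [source, destination]
    return command


# Placeholder paths, mirroring the ones used in the commands above.
remote = "user@beluga.computecanada.ca:/path/to/directory/to/copy"
local = "/directory/local/computer/to/copy"

download = build_scp_command(remote, local)  # HPC -> local computer
upload = build_scp_command(local, remote)    # local computer -> HPC

print(shlex.join(download))
print(shlex.join(upload))
```

To actually run the transfer, the list could be handed to ```subprocess.run(download, check=True)```.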

## 2) Verify BIDS validity with bids-validator

Once the data is downloaded, and in our case, anonymized, we are ready to validate our bids! You can find everything there is to know about BIDS [here](https://bids-specification.readthedocs.io/en/stable/). Note that the actual bids-specification version might differ slightly from the version available within the Docker container.

While it might seem redundant to run the bids-validator on our local computer (since fMRIPrep also runs the bids-validator before launching the pre-processing), it becomes particularly useful when we need to launch jobs on, say, remote HPCs. There, we might not receive a notice right away that our pre-processing failed because, for example, our data is not in BIDS format. Further, it is good practice for learning how to use containers. Note, however, that there are several ways to download and use bids-validator as described [here](https://pypi.org/project/bids-validator/), including a [web browser](http://bids-standard.github.io/bids-validator/) version where you can select the dataset and verify it there.

For the Docker container, you first need to install [Docker](https://hub.docker.com/editions/community/docker-ce-desktop-mac/). The hyperlink shows instructions for Mac, but Docker is available for Windows and Linux. Once you have gone through the instructions, your Docker version should be ready to go! 

To use the bids-validator, we first need to ```pull``` the Docker image of the bids-validator. In short, by pulling, we are "installing" the software on our computer, without actually installing it directly on our system. This might not make a ton of sense, but for now, you can think of it as accessing another computer, that is not yours, which only contains the software necessary to run bids-validator.

The first time you run the command below, Docker will ```pull``` the Docker image for you and run the analyses right away. Note that I split the code below in 4 lines for readability using back slashes, but you can run this command in a single line too.

In [15]:
#docker run -ti --rm \
#-v /path/to/data:/data:ro \
#bids/validator \
#/data

Let's unpack the command. 

The first line: Calls Docker and tells it to run in interactive mode with a terminal attached (```-ti```), i.e. once we run Docker, we will be "warped" inside the container and the output will be displayed on the terminal as the software runs. We also use the ```--rm``` option, which removes the container once it exits. This ensures that stopped containers do not pile up on our computer after each run. It is basically like throwing away a cardboard box once you have taken out what you needed, instead of stacking empty boxes in a corner.

The second line: This is called a "mount" and is set in Docker using the ```-v``` argument. It tells Docker where on our computer it can fetch the data we want the software inside the container to analyze. When using the ```-v``` argument, we need to tell it: 1) where to find the data on our computer, 2) what to call this path in the container and 3) whether or not Docker has permission to modify the files in this folder (in this case, ```ro``` stands for read-only). To summarize, this mount tells Docker that the data is on our computer at a certain path (i.e. ```/path/to/data```). Then, it tells Docker that inside the container, we should refer to it as ```/data```. Finally, we tell Docker that it can't modify this data: it is read-only.

The third line: This is straightforward. It is simply the name of the Docker image, and therefore of the program, that Docker needs to run.

The fourth line: This is an argument given to the program 'bids-validator'. In this case, the program looks for a BIDS dataset within the path you gave it.
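The four parts described above can also be written down as a list of arguments in Python, which makes each piece easy to inspect. This is a minimal sketch, not part of the original pipeline: the helper name ```build_validator_command``` is hypothetical and the command is only printed, not run.

```python
import shlex


def build_validator_command(bids_dir, image="bids/validator"):
    """Assemble the four parts of the `docker run` call described above."""
    return [
        "docker", "run", "-ti", "--rm",  # line 1: run Docker interactively
        "-v", f"{bids_dir}:/data:ro",    # line 2: read-only mount of the dataset
        image,                           # line 3: the image to run
        "/data",                         # line 4: argument given to bids-validator
    ]


command = build_validator_command("/path/to/data")
print(shlex.join(command))
```

The list could be passed to ```subprocess.run```. When the command is launched from a script rather than from a terminal, the ```-t``` flag may need to be dropped, since no terminal is attached.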

Once run, the command will open in your terminal in 'interactive' mode. Here is the command as I ran it on my dataset:

In [16]:
#docker run -ti --rm \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/sourcedata:/data:ro \
#bids/validator \
#/data

Running this command gives us an output that looks a little bit like this:

*[Image: output of the bids-validator in the terminal]*

In red, we have errors: These are things that will be problematic when trying to run the bids apps (in our case, fMRIPrep). Full disclosure, in Prevent-AD, it seems that the field 'TaskName' is missing from our .json files. As such, bids-validator will throw an error. However, fMRIPrep still ran in our case.

In blue: we have references that the bids-validator recommends checking to get more information on the error. Note that these links do not always work, as they are auto-generated, so a Google search is much better.

In greenish/yellow: we have warnings. These warnings are 'recommendations' that the BIDS specification asks for. However, they are not essential for the code to run properly.

Once we have verified our BIDS dataset and corrected the errors, we can run the validator again to ensure that the dataset is completely BIDS-compliant. Then, we are ready for our pre-processing!
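Since the missing 'TaskName' field was the error we ran into, here is a minimal sketch of how the affected .json files could be listed before correcting them. The function name ```sidecars_missing_task_name``` is hypothetical. The sketch only looks inside each file ending in ```_bold.json``` and ignores any value that would be inherited from a .json file higher up in the dataset.

```python
import json
from pathlib import Path


def sidecars_missing_task_name(bids_root):
    """Return the BOLD .json sidecars under bids_root that lack 'TaskName'."""
    missing = []
    for sidecar in sorted(Path(bids_root).rglob("*_bold.json")):
        with open(sidecar) as handle:
            metadata = json.load(handle)
        if "TaskName" not in metadata:
            missing.append(sidecar)
    return missing


# Example call (placeholder path):
# for path in sidecars_missing_task_name("/path/to/data"):
#     print(f"TaskName is missing in {path}")
```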

### tl;dr - bids-validator

In [17]:
#docker run -ti --rm \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/sourcedata:/data:ro \
#bids/validator \
#/data

#########
# Change the mount path (the part before :/data:ro) to the location of your own dataset.

## 3) Run fMRIPrep pre-processing

As we now know how to use Docker containers, this next part should not be too difficult. To run fMRIPrep, a single command line is necessary. However, I would recommend editing this command in a text editor first so that you can make sure that all the necessary arguments are there. The goal here is not to describe what fMRIPrep **does** in terms of pre-processing. The [fMRIPrep documentation](https://fmriprep.readthedocs.io/en/stable/), though quite long, is quite thorough. The goal is to describe the correct command(s) to go through the pre-processing.

fMRIPrep gives a few options to download and use the software, but their recommended method is to use a Docker container and use their Python script wrapper to simplify the command complexity (i.e. you do not need to call Docker as the script will do it for you). However, running the Docker container directly will give you a lot more options to fine-tune your analyses. Here is an example of a command from the [fMRIPrep documentation](https://fmriprep.readthedocs.io/en/stable/docker.html):

In [18]:
#docker run -ti --rm \
#-v path/to/data:/data:ro \
#-v path/to/output:/out \
#poldracklab/fmriprep:<latest-version> \
#/data /out/out \
#participant

Let's unpack the command.

<ins>**The first line:**</ins> Simply calls Docker in interactive mode and cleans the environment, as we have seen with the bids-validator.

<ins>**The second line:**</ins> A mount telling fMRIPrep where to get the data on our computer and telling it that it can't modify these files in their original folders (with the ```:ro``` option). We will call it ```/data``` in the container.

<ins>**The third line:**</ins> A mount telling fMRIPrep where to send back the pre-processed data on our computer. We will call it ```/out/``` in the container.

<ins>**The fourth line:**</ins> Calls fMRIPrep. The next lines will be arguments that we give directly to fMRIPrep to specify what and how we want to process our data.

<ins>**The fifth line:**</ins> We tell fMRIPrep 2 things: 1) the data to analyze is in the ```/data``` folder, which we defined with a mount before, and 2) the output is to be stored in the ```/out``` folder, which we also defined in a mount. The second ```out``` simply creates a separate folder for fMRIPrep inside the output folder.

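<ins>**The sixth line:**</ins> We tell fMRIPrep the analysis level. With ```participant```, each participant is pre-processed independently. Which participants to process is chosen with a separate argument (```--participant-label```); without it, all the participants in the dataset are processed.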
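Because the paths given to fMRIPrep are paths inside the container, it can help to check where they end up on your own computer. Here is a minimal sketch of that translation. The function name ```container_to_host``` is hypothetical, and the mounts are the placeholder ones from the command above (written as absolute paths).

```python
from pathlib import PurePosixPath


def container_to_host(container_path, mounts):
    """Translate a path seen inside the container into the matching host path.

    `mounts` maps host directories to container directories, like the
    `-v host:container` arguments of `docker run`.
    """
    path = PurePosixPath(container_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(container_dir)
        except ValueError:
            continue  # this mount does not contain the path
        return str(PurePosixPath(host_dir) / relative)
    raise ValueError(f"{container_path} is not inside any mount")


mounts = {"/path/to/data": "/data", "/path/to/output": "/out"}
print(container_to_host("/out/out", mounts))  # prints /path/to/output/out
```

In other words, the ```/out/out``` argument of the fifth line corresponds to an ```out``` folder inside the output folder on the local computer.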

------------

These are the basic arguments that fMRIPrep needs to pre-process the data. The full list of arguments that can be used can be found [here](https://fmriprep.readthedocs.io/en/stable/usage.html).

In the case of the pre-processing I did for the current tutorial, I ran the following command:

In [19]:
#docker run -it --rm \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/sourcedata:/data:ro \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/derivative:/out \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/fs_license/license.txt:/opt/freesurfer/license.txt \
#-v /Users/stong3/Desktop/test_fmriprep_PAD/work_dir:/work \
#poldracklab/fmriprep:latest \
#/data /out/fmriprep \
#participant \
#--participant-label sub-00001 \
#-w /work \
#--low-mem \
#--output-spaces T1w \
#--write-graph \
#--fs-license-file /opt/freesurfer/license.txt

So what changed?

<ins>**The first, second and third lines:**</ins> This is unchanged from the basic structure, i.e., we run Docker, and give an input and output mount. 

<ins>**The fourth line:**</ins> This mount tells Docker where to find a FreeSurfer license on our computer. As part of the fMRIPrep processing, FreeSurfer is run on all available anatomical images to reconstruct surfaces and, in part, to help register the functional images to the anatomical image. To do this, fMRIPrep needs access to a license that authorizes users to use FreeSurfer. The license is free and can be obtained [here](https://surfer.nmr.mgh.harvard.edu/registration.html).

<ins>**The fifth line:**</ins> This mount is to make it a bit easier on your computer to process the data. It points to a work directory where the intermediate outputs of the pre-processing are stored during processing so that fMRIPrep doesn't keep all of it in the computer's memory. We will tell fMRIPrep to use this work directory a little bit later (see the tenth line).

<ins>**The sixth and seventh lines:**</ins> These are the basic fMRIPrep arguments where we start the software, where we tell it where to find the data, where to output the pre-processed files.

<ins>**The eighth and ninth lines:**</ins> This is where we define the participants we want to process. We first give the ```participant``` analysis level, followed by ```--participant-label``` and the labels of the participants we want to process. In our case, we simply want subject 00001.

<ins>**The tenth line:**</ins> This defines the work directory where to put the intermediate files. See the description of the **fifth line** above.

<ins>**The eleventh line:**</ins> This tells fMRIPrep that our computer does not have a lot of RAM, and to go a bit easier on it. This is particularly relevant when running fMRIPrep on a local computer.

<ins>**The twelfth line:**</ins> We now choose the space in which we want our processed images to be output. This depends on the type of analyses we are interested in. Normally, the final images are spatially registered to an MNI template. In our case, since the analyses were done with the intention of using the output for fMRI fingerprinting, I chose to register in T1w space (i.e. in the space of the participant's own T1w scan). You can ask fMRIPrep to give outputs in several spaces at once by listing them after ```--output-spaces```.

<ins>**The final line:**</ins> This tells fMRIPrep where to find the FreeSurfer license, as described above. Since we mounted the file containing the FreeSurfer license, we need to tell fMRIPrep where the license is. **Be careful:** Here, the path does not refer to the path on your computer, **but to the path inside the container that you mounted earlier in line 4.** It is also worth noting that existing FreeSurfer outputs (from version 6.0.0 onwards) can be fed to fMRIPrep without running the whole recon-all pipeline.

Do not forget:

1) Bash is very sensitive to spaces, so make sure everything is written exactly. In particular, the back slash must be the very last character of each line: a space after it breaks the command.  
2) Make sure that there are no typos. I would advise copy/pasting the paths directly from your terminal using ```pwd``` to make sure that you are in the correct place.
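A quick way to catch typos in the mounts before launching a 14-hour job is to check that every path on the left-hand side of a ```-v``` argument exists. This is a minimal sketch under that assumption: the function name ```missing_host_paths``` is hypothetical and the paths below are placeholders, not the ones used in this tutorial.

```python
from pathlib import Path


def missing_host_paths(host_paths):
    """Return the host paths that do not exist on this computer."""
    return [path for path in host_paths if not Path(path).exists()]


# Placeholder host-side paths for the four mounts of the command above.
host_paths = [
    "/path/to/sourcedata",
    "/path/to/derivatives",
    "/path/to/fs_license/license.txt",
    "/path/to/work_dir",
]

for path in missing_host_paths(host_paths):
    print(f"Not found, check for typos: {path}")
```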

-------
Should everything go right, your terminal should soon change and start processing the data. A lot of text should scroll regularly in your terminal. It is worth noting that in a couple of cases, depending on your computer, certain processing steps might appear "jammed" in the same command for a couple of hours (particularly ```resume recon-all```, the FreeSurfer command). For information, here are the specs of the computer used to run the code above:

macOS Mojave
Version 10.14.6
iMac (Retina 5K, 27-inch, Late 2015)
3.2 GHz Intel Core i5
8 GB 1867 MHz DDR3
AMD Radeon R9 M380 2 GB

Running fMRIPrep on this computer, for a single subject, 2 visits, took 14h, allocating 5.67GB of RAM to the processing. 

**Of note:** Docker by default allows only 4GB of RAM to be allocated to programs running in containers. This can be changed in Docker > Preferences > Resources > Memory. I allowed Docker to use up to 6GB on the computer.

---------

Ok! Now, we have run fMRIPrep and we have a message in the terminal indicating that fMRIPrep proceeded successfully. Great! As a reminder, here are the files we started with before the run:

And now, here's our output:

Yikes! That is SO many files. So what do we need to keep? That depends on what you want to do. 

In our case, we want to extract functional connectivity matrices from the BOLD images. We also want the pre-processed T1w image so that we can plot the brain activity on a nice anatomical image. Depending on the study/plotting decision, you can definitely use the FreeSurfer outputs. Please note that, by default, the anatomical images from the different sessions **are merged together in the fMRIPrep pipeline to allow for co-registration of BOLD signal on an averaged structural image**. To my knowledge, and according to the app's developers, [there is currently no way to tell fMRIPrep to produce FreeSurfer outputs for each session](https://github.com/poldracklab/fmriprep/issues/993). Their goal was more to provide a strong registration for the BOLD signal, not to allow analyses using structural variables. As such, if the FreeSurfer outputs for each visit are of capital importance, then either:


1) Pre-processing with fMRIPrep should be run for 1 session at a time. In that case, the registration and FreeSurfer outputs are given for a single visit. At this time however, I am not sure if this solution works and it would need to be tried.  
2) FreeSurfer can be run separately before fMRIPrep and fMRIPrep can reuse the outputs. In that case, FreeSurfer version 6.0.0 needs to be used for the outputs to be recognized by fMRIPrep, and the FreeSurfer outputs need to be in the same output folder as the fMRIPrep outputs.

So ultimately, these are the files that we will use in the pipeline or that are otherwise useful:

**```dataset_description.json```**: Gives the version of the bids-validator used, the version of fMRIPrep used and how to acknowledge the pipeline in a paper.

**```CITATION.md```**: fMRIPrep generates an exact description of the steps included in the pre-processing and is intended to be copy-pasted in a scientific article using fMRIPrep. Note that depending on how you ran the pre-processing, this citation will change. 

**```anat```**: This folder contains the 'averaged' MRI files. In our case, we want the 'desc-preproc' files.

**```func```**: These folders are generated for each session used. They include many files, but of interest, we need the ```.tsv``` file containing the fMRI confounds for each subject at each session. We also need the fMRI file with 'desc-preproc'.
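To avoid searching through all these files by hand, the two files of interest in each ```func``` folder can be located with a small helper. This is a minimal sketch: the function name ```find_fmriprep_func_files``` is hypothetical, and it assumes the file names produced by the run above (```..._space-T1w_desc-preproc_bold.nii.gz``` and ```..._desc-confounds_regressors.tsv```). Other fMRIPrep versions may name the confounds file differently.

```python
from pathlib import Path


def find_fmriprep_func_files(fmriprep_dir, subject):
    """Pair each pre-processed BOLD file with its confounds .tsv file."""
    pairs = []
    subject_dir = Path(fmriprep_dir) / f"sub-{subject}"
    pattern = "ses-*/func/*_desc-preproc_bold.nii.gz"
    for bold in sorted(subject_dir.glob(pattern)):
        # The confounds file shares the file name up to the `space-` part.
        prefix = bold.name.split("_space-")[0]
        confounds = bold.with_name(f"{prefix}_desc-confounds_regressors.tsv")
        # None signals that no confounds file was found for this BOLD file.
        pairs.append((bold, confounds if confounds.exists() else None))
    return pairs


# Example call (placeholder path):
# for bold, confounds in find_fmriprep_func_files("/path/to/fmriprep", "00001"):
#     print(bold.name, "->", confounds)
```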

## 4) Connectivity matrices extraction

As a disclaimer, the following Nilearn pipeline is not complete and does not yet scrub time series for volumes with high framewise displacement. This will soon be added. The script used to extract matrices is detailed here, but is also available in GitHub as a standalone script.

The goal of this section is to take the pre-processed images, extract the time series using a defined brain parcellation, regress the confounds out of these time series, and compute connectivity matrices from them. The script then saves the connectivity matrices as .csv files that can be used for later analyses and as .png images that can be used for summary QC/visualization.

This script is based on this [Nilearn Example](https://nilearn.github.io/auto_examples/03_connectivity/plot_signal_extraction.html#sphx-glr-auto-examples-03-connectivity-plot-signal-extraction-py)
Part of this script (and future updates) takes inspiration from: https://github.com/brain-modelling-group/fmripop. 
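As mentioned in the disclaimer, scrubbing is not part of the script yet. As a first step, here is a minimal sketch of how volumes with high framewise displacement could be flagged from the fMRIPrep confounds file, which contains a ```framewise_displacement``` column with ```n/a``` for the first volume. The function name ```high_motion_volumes``` is hypothetical, and the 0.5 mm threshold is an assumption to adapt to your study, not a value used in this tutorial.

```python
import csv
import io


def high_motion_volumes(confounds_tsv, threshold=0.5):
    """Return the indices of volumes whose framewise displacement exceeds threshold.

    `confounds_tsv` is an open file (or file-like object) laid out like the
    tab-separated confounds file of fMRIPrep.
    """
    flagged = []
    reader = csv.DictReader(confounds_tsv, delimiter="\t")
    for index, row in enumerate(reader):
        value = row["framewise_displacement"]
        if value == "n/a":  # the first volume has no previous volume
            continue
        if float(value) > threshold:
            flagged.append(index)
    return flagged


# Tiny made-up example, not real data.
example = io.StringIO(
    "framewise_displacement\tcsf\n"
    "n/a\t1.0\n"
    "0.12\t1.1\n"
    "0.73\t0.9\n"
    "0.08\t1.0\n"
)
print(high_motion_volumes(example))  # prints [2]
```

This only identifies the volumes. Removing them from the time series is a separate step that this sketch does not cover.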

### 4.0 - Generating a virtual environment to run the analyses

First, I generated a Python virtual environment to ensure reproducibility. This [page](https://anbasile.github.io/posts/2017-06-25-jupyter-venv/) briefly explains how to create a Python virtual environment, and set it as an iPython kernel. This allows you to reuse the exact method in your own Jupyter notebook. This GitHub includes a ```requirements.txt``` file that can be used when setting your environment. Here are the basic commands to reproduce in your terminal:

In [20]:
#python -m venv projectname
#source projectname/bin/activate
#pip install -r requirements.txt #Including the ipykernel package
#ipython kernel install --user --name=projectname

#On Jupyter, you should now be able to change the Kernel to the environment specified.

### 4.1 - Importing the modules necessary for analyses

As a personal preference, I like to import all packages at the top, in order of first appearance in the code and grouped by package. Then, if needed, I refer back to them later in the code with a comment if there is confusion. Here are the packages we need and their usage:

In [21]:
#Nilearn packages:
from nilearn import datasets #For atlases
from nilearn import plotting #To plot brain images
from nilearn.input_data import NiftiLabelsMasker #To mask the data
from nilearn.connectome import ConnectivityMeasure #To compute the connectivity matrices

#Various packages
import os #To create directories 
import pandas as pd #For dataframe manipulation (e.g. confound file)
import numpy as np #To work with Numpy arrays (e.g. checking the shape of the matrix)
from matplotlib import pyplot as plt #Used to bypass a bug where the figures wouldn't close in Nilearn in my script

### 4.2 - Importing the brain atlas

For this tutorial, I chose to use the [Schaefer (2018)](https://github.com/ThomasYeoLab/CBIG/blob/master/stable_projects/brain_parcellation/Schaefer2018_LocalGlobal/Parcellations/Updates/Update_20190916_README.md) atlas. Note that not all atlases are configured in the same manner, so refer to the specific atlas you intend to use in Nilearn's help page [here](https://nilearn.github.io/modules/reference.html#module-nilearn.datasets)

The code below imports the atlas. We first extract the 'maps' (i.e. the .nii image representing the atlas regions) and then we extract the labels (i.e. what each region is) from the downloaded dictionary.

Note that the original 400 labels used by default in the atlas and the 300 labels did not work in the script. The length of the labels and the length of the matrix did not match, which did not allow the code to work. I haven't had the time to explore the issue further.

In [22]:
atlas_schaefer = datasets.fetch_atlas_schaefer_2018(n_rois=200)
atlas_filename_schaefer = atlas_schaefer.maps 
labels_schaefer = atlas_schaefer.labels 
print(f'The atlas is located at {atlas_filename_schaefer}')
#print(labels_schaefer) #Prints the array of the labels, if needed
#print(len(labels_schaefer)) #Prints the length of the labels, if needed
#plotting.plot_roi(atlas_filename_schaefer) #Plotting the regions included in the atlas, if needed.

The atlas is located at /Users/stong3/nilearn_data/schaefer_2018/Schaefer2018_200Parcels_7Networks_order_FSLMNI152_1mm.nii.gz


### 4.3 - Extraction preparation

Before we can proceed to the extraction, we need to set a few things.

1) We need to set the location of our subjects (i.e. where Python can find our images and the confound files)
2) Set a location for Python to output our derivatives (i.e. connectivity matrices)
3) Set the variables for extraction (subjects, session, confounds, kind of matrix, atlas)
4) Start the loops for extraction

To note, in future versions, the lists of variables might be replaced by argument parsers.
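As a rough idea of what such an argument parser could look like, here is a minimal sketch using ```argparse``` from the standard library. The function name ```build_parser``` and the option names (```--subjects```, ```--sessions```, ```--kinds```, ```--atlas```) are hypothetical and are not part of the current script.

```python
import argparse


def build_parser():
    """Build a parser for the variables that are currently hard-coded lists."""
    parser = argparse.ArgumentParser(
        description="Extract connectivity matrices from fMRIPrep outputs"
    )
    parser.add_argument("--subjects", nargs="+", required=True,
                        help="Subject IDs without the 'sub-' prefix")
    parser.add_argument("--sessions", nargs="+", required=True,
                        help="Session labels without the 'ses-' prefix")
    parser.add_argument("--kinds", nargs="+",
                        default=["correlation", "partial correlation"],
                        help="Kinds of connectivity matrices to compute")
    parser.add_argument("--atlas", default="schaefer",
                        help="Name of the atlas, used in the output file names")
    return parser


# Parsing an explicit list here, so that the example runs without a command line.
args = build_parser().parse_args(
    ["--subjects", "00001", "--sessions", "BL00A", "FU12A"]
)
print(args.subjects, args.sessions, args.kinds, args.atlas)
```

In a standalone script, calling ```parse_args()``` without arguments would read the values from the command line instead.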

In [23]:
# 1)
subjects_location = '/Users/stong3/Desktop/test_fmriprep_PAD/derivatives/fmriprep/fmriprep/'

# 2)
connectivity_matrices_dir = '/Users/stong3/Desktop/test_fmriprep_PAD/derivatives/connectivity_matrices/'

#This part creates the root directory inside of the derivatives folder, specified by the BIDS convention. The path above should reflect where you want the directory to go.
#Even if the intermediate path (derivatives) is not created yet, the function 'makedirs' creates it for you.

#The function below tests whether the path already exists. If not, it creates it for you.
if not os.path.exists(connectivity_matrices_dir):
    os.makedirs(connectivity_matrices_dir)
    print(f'Created directory:{connectivity_matrices_dir}')
else:
    print(f'Directory {connectivity_matrices_dir} already exists. No directory is created.')
    

#3)
subject_list = ['00001']
session_list = ['BL00A', 'FU12A']
list_confounds = ['csf', 'white_matter', 'global_signal', 'trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z', 'cosine00', 'cosine01', 'cosine02']
kind_connectivity = ['correlation', 'partial correlation']
atlas = 'schaefer'

print('------------------------------------')
print('Description of the post-processing: ')
print(f'    Subjects to process are with the following IDs : {subject_list}')
print(f'    Sessions to process are the following : {session_list}')
print(f'    The confounds included to generate the matrices are : {list_confounds}')
print(f'    The kind of correlation matrices to be generated are: {kind_connectivity}')
print(f'    All procedures will be done with the {atlas} atlas/map.')
print('------------------------------------')

Directory /Users/stong3/Desktop/test_fmriprep_PAD/derivatives/connectivity_matrices/ already exists. No directory is created.
------------------------------------
Description of the post-processing: 
    Subjects to process are with the following IDs : ['00001']
    Sessions to process are the following : ['BL00A', 'FU12A']
    The confounds included to generate the matrices are : ['csf', 'white_matter', 'global_signal', 'trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z', 'cosine00', 'cosine01', 'cosine02']
    The kind of correlation matrices to be generated are: ['correlation', 'partial correlation']
    All procedures will be done with the schaefer atlas/map.
------------------------------------


Let's start the loops! We will create 3 nested loops. 

The first will iterate over all subjects in the list we gave our script. 

The second will, for every subject, iterate over all the sessions that are available. In this section, we will set the location of our files. Then, we will use Pandas to reduce the raw confound regressor files to only the confounds we want. Finally, we will create our final time_series object, which will have our confounds regressed out.

The final loop will create connectivity matrices for all the kinds that are specified, for every session, within every subject. In this last loop, we will save the connectivity matrices as .csv files and as images. Here we go!
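Before running the full extraction, it can be useful to list which output directories the three nested loops will produce. Here is a minimal sketch using ```itertools.product```. The function name ```output_directories``` and the root path are hypothetical, and this only enumerates the combinations without doing any of the processing.

```python
from itertools import product
from pathlib import Path


def output_directories(root, subjects, sessions, kinds):
    """Yield one output directory per subject, session and kind of matrix."""
    for subject, session, kind in product(subjects, sessions, kinds):
        yield Path(root) / f"sub-{subject}" / f"ses-{session}" / f"kind-{kind}"


directories = list(output_directories(
    "/path/to/derivatives/connectivity_matrices",
    ["00001"],
    ["BL00A", "FU12A"],
    ["correlation", "partial correlation"],
))

print(len(directories))  # 1 subject x 2 sessions x 2 kinds = 4
for directory in directories:
    print(directory)
```

The actual script keeps the loops nested rather than flattened like this, because the time series are extracted once per session and then reused for every kind of matrix.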

In [24]:
for subject in subject_list:
    for session in session_list:
        print(f'Starting connectivity matrices extraction for subject {subject} for session {session}')
        print('------------------------------------')
        print(' ')
        
        #First, we need to import the paths where the fMRI pre-processed and confounds files are stored. They follow BIDS convention and depend on the path provided
        #for the subject location. After fetching the paths, we print them.
        print('Fetching the paths for the fMRI files and the confound files...')
        pre_processed_fmri_file = f'{subjects_location}sub-{subject}/ses-{session}/func/sub-{subject}_ses-{session}_task-rest_run-1_space-T1w_desc-preproc_bold.nii.gz'
        full_confound_file_fmriprep = f'{subjects_location}sub-{subject}/ses-{session}/func/sub-{subject}_ses-{session}_task-rest_run-1_desc-confounds_regressors.tsv'

        print(f'The path for the fmri file is: {pre_processed_fmri_file}')
        print(f'The path for the full confound file is: {full_confound_file_fmriprep}')
        print('------------------------------------')
        print(' ')

        #The confound file from fMRIPrep is huge and needs to be cleaned. We import it with Pandas to keep column names and select them more easily.
        print('We now import the confound file using Pandas:')
        confounds = pd.read_csv(full_confound_file_fmriprep, delimiter = '\t')
        print(f'Confounds selected for this extraction were: {list_confounds}')
        #print(confounds.head())
        final_confounds = confounds[list_confounds]
        #print(final_confounds.head())

        #We need to convert the dataframe type of Pandas to a Numpy array so it is readable by Nilearn.
        print('Conversion to Numpy array:')
        confounds_np = final_confounds.to_numpy()
        print('Done!')
        print('------------------------------------')
        

        #We are now ready to extract the time series using our atlas mask
        #from nilearn.input_data import NiftiLabelsMasker #Already imported in the top of the script
        print('Creating masker using ', atlas_filename_schaefer)

        masker = NiftiLabelsMasker(labels_img=atlas_filename_schaefer, standardize=True, verbose=5)
        time_series = masker.fit_transform(pre_processed_fmri_file, confounds=confounds_np)
        print('The shape of our time series is: ', np.shape(time_series))

        #We are now ready to extract the connectivity matrix using Nilearn functions.
        #This part of the script will:
        ## 1) Create a loop to extract matrices according to the kind of matrix wanted (e.g. correlation, partial correlation, etc.)
        ## 2) Check that the matrix shape matches the length of the atlas labels
        ## 3) Create a Pandas dataframe with the labels as columns and indices
        ## 4) Create a directory where we can save the matrix ('bids-like')
        ## 5) Export the dataframe to a .csv file 
        ## 6) Use Nilearn functions to plot the matrix with labels
        ## 7) Save the image in the same directory as the .csv file.

        ### 1)
        print('Extracting connectivity matrices for the following kinds: ', kind_connectivity)
        for kind in kind_connectivity:
            correlation_measure = ConnectivityMeasure(kind = kind)
            correlation_matrix = correlation_measure.fit_transform([time_series])[0]
            print('The shape of the correlation matrix is: ', np.shape(correlation_matrix))
        ### 2)
            print('Testing if the shape of the matrix and the length of the labels are matching:')
            #The matrix is square, so the number of labels must equal its number of rows (and columns).
            if len(labels_schaefer) not in np.shape(correlation_matrix):
                raise ValueError('The length of the labels does not match the shape of the correlation_matrix')
            print('The shape of the matrix and labels are matching.')
        ### 3)
            print('Creating a Pandas dataframe with labels as index')
            subject_connectivity_matrix = pd.DataFrame(data=correlation_matrix, index=labels_schaefer, columns=labels_schaefer)
            #print(subject_connectivity_matrix.head())
        ### 4)
            print('Creating a directory to save the computed correlation matrices')
            dir_matrices_derivatives = f'{connectivity_matrices_dir}sub-{subject}/ses-{session}/kind-{kind}/'
            if not os.path.exists(dir_matrices_derivatives):
                os.makedirs(dir_matrices_derivatives)
                print(f'Created directory:{dir_matrices_derivatives}')
            else:
                print(f'Directory {dir_matrices_derivatives} already exists. None is created.')

        ### 5)
            #Using an f string, we give the new directory
            print(f'Saving dataframe to a .csv file in : {dir_matrices_derivatives}')
            subject_connectivity_matrix.to_csv(f'{dir_matrices_derivatives}sub-{subject}_ses-{session}_atlas-{atlas}_kind-{kind}_connectivity_matrix.csv')

        ### 6)
            #from nilearn import plotting
            ## This one plots the matrix without reordering the clusters of fMRI activation and fitting labels
            display = plotting.plot_matrix(correlation_matrix)
            display.figure.savefig(f'{dir_matrices_derivatives}sub-{subject}_ses-{session}_atlas-{atlas}_kind-{kind}_connectivity_matrix_no_labels_not_ordered.png')  
            plt.close()

            ## This one plots the matrix, reorders the clusters and places the labels on the figure
            display1 = plotting.plot_matrix(correlation_matrix, figure=(30, 30), labels=labels_schaefer, vmax=0.8, vmin=-0.8, reorder=True)
            display1.figure.savefig(f'{dir_matrices_derivatives}sub-{subject}_ses-{session}_atlas-{atlas}_kind-{kind}_connectivity_matrix_labels_ordered.png')  
            plt.close()

Starting connectivity matrices extraction for subject 00001 for session BL00A
------------------------------------
 
Fetching the paths for the fMRI files and the confound files...
The path for the fmri file is: /Users/stong3/Desktop/test_fmriprep_PAD/derivatives/fmriprep/fmriprep/sub-00001/ses-BL00A/func/sub-00001_ses-BL00A_task-rest_run-1_space-T1w_desc-preproc_bold.nii.gz
The path for the full confound file is: /Users/stong3/Desktop/test_fmriprep_PAD/derivatives/fmriprep/fmriprep/sub-00001/ses-BL00A/func/sub-00001_ses-BL00A_task-rest_run-1_desc-confounds_regressors.tsv
------------------------------------
 
We now import the confound file using Pandas:
Confounds selected for this extraction were: ['csf', 'white_matter', 'global_signal', 'trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z', 'cosine00', 'cosine01', 'cosine02']
Conversion to Numpy array:
Done!
------------------------------------
Creating masker using  /Users/stong3/nilearn_data/schaefer_2018/Schaefer2018_200Parc

This code therefore creates 3 outputs, in 'bids-like' format, per session, per kind of matrix, for each subject. The resulting files should look something like this:

## Conclusion

In conclusion, this project aimed to start from un-processed data in BIDS format 