## Resting State Fingerprinting Analysis

In this project you will explore interindividual variance through the lens of identifiability. In other words: are resting-state network connectivity patterns specific to individuals? This sort of question can be answered with techniques that fall under the category of **Fingerprinting**, which ask the following question:

$$\text{Given a connectivity pattern, can we reliably link it to an individual?}$$


In this notebook we'll use HCP data from the S900 release to perform a fingerprinting analysis on resting-state connectivity networks. Specifically, we will follow the methodology used in the paper:


[Functional connectome fingerprinting: Identifying individuals based on patterns of brain connectivity. Nat Neurosci. 2015 Nov; 18(11): 1664–1671](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5008686/)

Furthermore, we will assess the data using different parcellation schemes to demonstrate the effect of parcellation type on the identifiability of individuals.

***

## A quick primer: Connectivity Matrix

Before we talk about how fingerprinting can be done, let's quickly go over **connectivity matrices**. A connectivity matrix in fMRI analysis describes the **connectivity** (read: correlation) between every pair of nodes in some fMRI data. Specifically, a node can be any of the following:

- A vertex on a surface
- A parcel from parcellated fMRI data (represented by the mean time series of that parcel)
    - The parcel can be derived from either surface-based or volume-based fMRI data
- A voxel from volume-based fMRI data

Under ideal circumstances you would use a surface-based parcellation of fMRI data, where each parcel's time series is the average over a set of vertices that "belong together".
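
As a minimal sketch of the idea, the snippet below builds a connectivity matrix from a nodes-by-timepoints array using Pearson correlation. The array `toy_timeseries` is made-up random data and the function name `connectivity_matrix` is only illustrative; real data would come from your parcellated time series.

```python
import numpy as np

def connectivity_matrix(timeseries):
    """Pearson correlation between every pair of rows (nodes)."""
    # np.corrcoef treats each row as one variable by default,
    # so a (n_nodes, n_timepoints) array gives a (n_nodes, n_nodes) matrix.
    return np.corrcoef(timeseries)

# Made-up example: 5 nodes, 200 timepoints of random noise
rng = np.random.default_rng(0)
toy_timeseries = rng.standard_normal((5, 200))

conn = connectivity_matrix(toy_timeseries)
print(conn.shape)  # (5, 5)
```

The result is symmetric with ones on the diagonal, since every node correlates perfectly with itself.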

***

## How to compute the identifiability of subject resting state networks (fingerprinting)?

Once you have full connectivity matrices for the data, you can start computing fingerprinting/identifiability scores. The rationale is that, given connectivity matrices for a group of individuals across two fMRI sessions, each individual's connectivity matrix should be **more *similar* to their own matrix from the other session than to anyone else's**.

A natural question that arises from the above statement is: *how do we express the similarity of functional connectivity matrices*? Here's the authors' approach:

Take a full connectivity matrix:

<img src="https://nilearn.github.io/_images/sphx_glr_plot_signal_extraction_001.png" width="70%"/>


The method used in Finn et al. (2015) converts this into a *feature vector* (a list of features describing a single participant) by taking the upper triangle of the connectivity matrix. The lower triangle is redundant because, as shown above, the matrix is symmetric (mirrored) along the diagonal. The values are then normalized with a Fisher $r$-to-$Z$ transformation, $Z = \operatorname{arctanh}(r)$, so that the features are approximately normally distributed across subjects.
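
The vectorization and transformation described above can be sketched as follows. This assumes the diagonal is excluded (self-correlations are always 1, and the Fisher transform of 1 is infinite); the function name `connectivity_to_features` and the tiny matrix `toy_conn` are illustrative only.

```python
import numpy as np

def connectivity_to_features(conn):
    """Upper triangle (without the diagonal) of a connectivity matrix, Fisher-transformed."""
    # k=1 starts one step above the diagonal, so self-correlations are skipped
    rows, cols = np.triu_indices(conn.shape[0], k=1)
    edges = conn[rows, cols]
    # Fisher r-to-Z transformation is the inverse hyperbolic tangent
    return np.arctanh(edges)

# Made-up 3-node connectivity matrix
toy_conn = np.array([[ 1.0, 0.5, -0.2],
                     [ 0.5, 1.0,  0.1],
                     [-0.2, 0.1,  1.0]])

features = connectivity_to_features(toy_conn)
print(features.shape)  # (3,)
```

For a matrix with $n$ nodes this yields $n(n-1)/2$ features, one per unique edge.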

Finally, we can compute similarities on these new *features* by correlating the feature vector of subject $i$ with that of subject $j$. To get a measure of identifiability for subject $i$, we correlate their session 1 features with every subject's session 2 features (including their own!). A successful identification occurs when the highest correlation is the *within-subject* one rather than an *across-subject* one. The accuracy score of this *fingerprinting analysis* is the proportion of subjects that are successfully identified (within-subject correlation $\gt$ every across-subject correlation).
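
A minimal sketch of this identification step is shown below, assuming two arrays of shape `(n_subjects, n_features)` whose rows are in the same subject order. The function name `identification_accuracy` and the synthetic data are hypothetical; the toy sessions are built as a shared subject-specific pattern plus independent noise.

```python
import numpy as np

def identification_accuracy(session1, session2):
    """Fraction of subjects whose best session-2 match is themselves."""
    n = session1.shape[0]
    # np.corrcoef stacks both inputs as rows, giving a (2n, 2n) matrix;
    # the top-right block holds session1-vs-session2 correlations.
    similarity = np.corrcoef(session1, session2)[:n, n:]
    # For each session-1 subject, pick the most similar session-2 subject
    predicted = similarity.argmax(axis=1)
    accuracy = np.mean(predicted == np.arange(n))
    return accuracy, similarity

# Synthetic data: each subject has a stable pattern plus session noise
rng = np.random.default_rng(0)
n_subjects, n_features = 10, 300
subject_patterns = rng.standard_normal((n_subjects, n_features))
session1 = subject_patterns + 0.5 * rng.standard_normal((n_subjects, n_features))
session2 = subject_patterns + 0.5 * rng.standard_normal((n_subjects, n_features))

accuracy, similarity = identification_accuracy(session1, session2)
print(accuracy)
```

Swapping the two arguments runs the identification in the opposite direction (session 2 matched against session 1), which need not give exactly the same accuracy.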


### NOTE

This is not the only way of calculating identifiability - different methods exist that yield different perspectives, and some do a better job than others.

Note that part of the fingerprinting process produces similarity scores between individuals, which may offer useful insights into how individuals relate to each other. Do individuals with "similar fingerprints" share similar cognitive/behavioural profiles? Do psychiatric conditions interact with fingerprinting reliability?

***

## The Project Task
***

In this component of the project we'll start with a fingerprinting analysis on the HCP S900 dataset. Your first goal is to reproduce the analysis from the paper linked above. To help guide your progress, we've broken the analysis down into key steps. We recommend working in groups and trying to solve programming and analysis issues with each other before approaching the BrainHack instructors.

### Task Breakdown


#### Beginner Project: Volume-based fingerprinting analysis

1. Load a single subject's volume-parcellation (Shen atlas) meants CSV
    - The meants data contains one time series per row, where each row corresponds to a parcel
2. Compute a connectivity matrix for this parcellation
3. Pull the upper-triangular part of the connectivity matrix and vectorize it into a 1-D array
4. Perform a Fisher $r$-to-$Z$ transformation
5. Repeat for all other subjects so that you have N 1-D arrays, where N is the number of available subjects
6. Run the fingerprinting analysis and compute an accuracy score
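
Steps 1 to 5 can be sketched roughly as below. This assumes each meants CSV is comma-separated with no header row and one parcel per row; check your files before relying on that. The file names and the helper `features_from_meants` are hypothetical, and the demo writes small synthetic CSVs to a temporary directory so that it runs without the real data.

```python
import os
import tempfile
import numpy as np

def features_from_meants(csv_path):
    """Load one meants CSV and return its Fisher-transformed edge vector."""
    # Assumption: no header, comma-separated, rows = parcels, columns = timepoints
    timeseries = np.loadtxt(csv_path, delimiter=",")
    conn = np.corrcoef(timeseries)
    # Upper triangle without the diagonal, flattened to 1-D
    rows, cols = np.triu_indices(conn.shape[0], k=1)
    return np.arctanh(conn[rows, cols])

# Demo with synthetic files standing in for real meants CSVs
rng = np.random.default_rng(0)
with tempfile.TemporaryDirectory() as tmpdir:
    csv_paths = []
    for subject in range(3):
        path = os.path.join(tmpdir, f"sub-{subject}_meants.csv")  # hypothetical name
        np.savetxt(path, rng.standard_normal((4, 50)), delimiter=",")
        csv_paths.append(path)
    # One row per subject, one column per edge
    feature_matrix = np.vstack([features_from_meants(p) for p in csv_paths])

print(feature_matrix.shape)  # (3, 6)
```

Building one such matrix per session gives the two inputs needed for step 6.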

*Bonus Tasks*

- Which edges (connections) contributed most to your fingerprinting accuracy score? Look to the paper for ideas on how to tackle this question
- Try other volume-based parcellation schemes. What effects do parcellations have on fingerprinting? Is there something you can say about parcellation structure that relates to fingerprinting accuracy? 
    - HINT: Read the papers that generated the parcellations. What sort of information went into each parcellation scheme?
    - HINT: The paper provides a possible explanation, does this align with your observations?
    
#### Extension: Surface-based Fingerprinting Analysis

We can also perform fingerprinting using surface-based data! Surface-based data is thought to better represent local BOLD activations, especially in regions where a single voxel may span a sulcal gap.

***
For more information check out some surface-based papers! Here's one that might be interesting:

[Ciftify: A framework for surface-based analysis of legacy MR acquisitions](https://www.sciencedirect.com/science/article/abs/pii/S1053811919303714)

***

Perform the same analysis as in the volume-based fingerprinting task, this time using the surface-based meants CSVs. Can you draw the same conclusions as you did from the volume-based analysis?

#### Additional Advanced Tasks

Now that we've developed a fingerprinting algorithm, let's get back to thinking about our similarity scores and the *features* that we've calculated for our data. Can we relate these features to behavioural scores or cognitive measures from the dataset?

Several methods may be useful to look at:

- Regression-type analysis of FC vectors to cognitive/behavioural scores (see paper for inspiration)
- Replicate the leave-one-out cross-validation algorithm used in the paper
- Try extending this idea of fingerprinting to *Cortical Gradients* - check out the Starter Project Gradients Repo for a start!
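
As a starting point for the regression and leave-one-out ideas above, here is a simplified sketch. It is not the paper's exact procedure: it selects edges by thresholding their correlation with the behavioural score (an arbitrary `r_threshold`, positive correlations only), sums the selected edges per subject, and fits a straight line. All names and the synthetic data are hypothetical.

```python
import numpy as np

def loo_predict(features, behaviour, r_threshold=0.3):
    """Leave-one-out prediction of a behavioural score from edge features."""
    n = features.shape[0]
    predictions = np.full(n, np.nan)
    for left_out in range(n):
        train = np.arange(n) != left_out
        x_train, y_train = features[train], behaviour[train]
        # Correlate every edge with behaviour using training subjects only
        xc = x_train - x_train.mean(axis=0)
        yc = y_train - y_train.mean()
        edge_r = (xc * yc[:, None]).sum(axis=0) / (
            np.sqrt((xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
        )
        selected = edge_r > r_threshold
        if not selected.any():
            continue  # no edge passed the threshold; prediction stays NaN
        # Summarise each subject by the sum of their selected edges
        train_strength = x_train[:, selected].sum(axis=1)
        slope, intercept = np.polyfit(train_strength, y_train, 1)
        test_strength = features[left_out, selected].sum()
        predictions[left_out] = slope * test_strength + intercept
    return predictions

# Synthetic data: behaviour depends on the first 3 of 20 edges, plus noise
rng = np.random.default_rng(0)
features = rng.standard_normal((60, 20))
behaviour = features[:, :3].sum(axis=1) + 0.3 * rng.standard_normal(60)

predictions = loo_predict(features, behaviour)
print(np.corrcoef(predictions, behaviour)[0, 1])
```

Edge selection happens inside the loop so that the left-out subject never influences which edges are used to predict their own score.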