# Impact of confound removal strategies on functional connectivity generated from fMRIPrep preprocessed data

Authors

## Introduction

- The existing literature on confounds has compared denoising strategies extensively
- Minimal preprocessing pipelines are popular:
    - Pros: freedom to experiment with the impact of various factors during method development
    - Cons: end users whose research is less method-focused find it difficult to do the downstream analysis correctly
- fMRIPrep provides a minimal preprocessing pipeline and outputs a large set of confound variables alongside the minimally processed data
- It's difficult to navigate the confounds and implement a sensible subset of variables in downstream analysis
- We provide a solution and a benchmark using fMRIPrep outputs on various modern connectomes

## Methods

- Dataset: Pixar movie-watching developmental dataset
- fMRIPrep version
- Nilearn code implementation
- Atlases used
- Metric:
    - correlation of connectivity edges with mean framewise displacement (FD)
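
The metric above, the correlation between each connectivity edge and mean FD across subjects, can be sketched in a few lines. This is a minimal illustration only, assuming connectomes are stored with one subject per row and one edge per column. `qcfc_sketch` is a hypothetical name and is not the `qcfc` function imported from `metrics` in the Results section; the toy data are random and not taken from the dataset.

```python
import numpy as np


def qcfc_sketch(connectomes, mean_fd):
    """Pearson correlation between each edge and mean FD across subjects.

    connectomes: array of shape (n_subjects, n_edges)
    mean_fd: array of shape (n_subjects,)
    """
    connectomes = np.asarray(connectomes, dtype=float)
    mean_fd = np.asarray(mean_fd, dtype=float)
    # Centre both variables across subjects.
    edges_centred = connectomes - connectomes.mean(axis=0)
    fd_centred = mean_fd - mean_fd.mean()
    # Cross-product over the product of norms gives Pearson's r per edge.
    numerator = fd_centred @ edges_centred
    denominator = np.linalg.norm(edges_centred, axis=0) * np.linalg.norm(fd_centred)
    return numerator / denominator


# Toy data (not from the dataset): 20 subjects, 6 edges.
rng = np.random.default_rng(0)
toy_fd = rng.uniform(0.05, 0.5, size=20)
toy_edges = rng.normal(size=(20, 6))
toy_edges[:, 0] = 2.0 * toy_fd + 0.1  # edge 0 is fully driven by motion
toy_qcfc = qcfc_sketch(toy_edges, toy_fd)
```

An edge whose strength is a linear function of motion, like edge 0 in the toy data, yields a correlation of 1. A denoising strategy that works well should push these correlations toward zero.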

## Results

In [1]:
import tarfile
import io
from pathlib import Path
import pandas as pd

from metrics import qcfc, compute_pairwise_distance

In [9]:
# Define the input and output paths, relative to the parent of the working directory
OUTPUT = "inputs/interim"
INPUT = "inputs/dataset-ds000288.tar.gz"
CENTROIDS = "inputs/atlas/schaefer20187networks/Schaefer2018_400Parcels_7Networks_order_FSLMNI152_2mm.Centroid_RAS.csv"
output = Path.cwd().parents[0] / OUTPUT
input_connectomes = Path.cwd().parents[0] / INPUT
input_centroids = Path.cwd().parents[0] / CENTROIDS
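
The centroid file is presumably meant for `compute_pairwise_distance`. The sketch below assumes the goal is the Euclidean distance between every pair of parcel centroids, one value per edge. `pairwise_distance_sketch` is a hypothetical stand-in, not the function from `metrics`, and the coordinates are made up rather than read from the atlas file.

```python
import numpy as np


def pairwise_distance_sketch(coords):
    """Euclidean distance between every pair of parcel centroids.

    coords: array of shape (n_parcels, 3)
    Returns one distance per unique pair (upper triangle, diagonal excluded).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcast to an (n_parcels, n_parcels, 3) array of coordinate differences.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rows, cols = np.triu_indices(coords.shape[0], k=1)
    return dist[rows, cols]


# Hypothetical centroids in millimetres, not taken from the atlas file.
toy_centroids = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 12.0]])
toy_distances = pairwise_distance_sketch(toy_centroids)
```

The order of the returned pairs has to match the order in which the connectome edges were vectorized. That ordering is an assumption here and should be checked against the code that produced the connectomes.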

In [14]:
with tarfile.open(input_connectomes, 'r:gz') as tar:
    movement = tar.extractfile("dataset-ds000288/dataset-ds000288_desc-movement_phenotype.tsv").read()
    movement = pd.read_csv(io.BytesIO(movement),
                            sep='\t', index_col=0, header=0, encoding='utf8')

In [10]:
with tarfile.open(input_connectomes, 'r:gz') as tar:
    # find the strategies we need to iterate through.
    benchmark_strategies = []
    for member in tar.getmembers():
        filename = member.name.split('/')[-1]
        if "data.tsv" in filename:
            strategy = filename.split("desc-")[-1].split("_data")[0]
            benchmark_strategies.append(strategy)

    for strategy_name in benchmark_strategies:
        print(strategy_name)
        connectome = tar.extractfile(f"dataset-ds000288/atlas-schaefer7networks/dataset-ds000288_atlas-schaefer7networks_nroi-400_desc-{strategy_name}_data.tsv").read()
        dataset_connectomes = pd.read_csv(io.BytesIO(connectome), sep='\t', index_col=0, header=0)

scrubbing
compcor
simple
simple+gsr
raw
aroma+gsr
aroma
scrubbing+gsr

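The strategy names listed above are parsed from the file names in the archive. The sketch below isolates that parsing step on a few example names that follow the pattern used in the cell above. The second atlas name is hypothetical and only shows that a set avoids duplicate strategies if the archive were to hold several atlases.

```python
def strategy_from_filename(filename):
    """Return the text between 'desc-' and '_data' in a connectome file name."""
    return filename.split("desc-")[-1].split("_data")[0]


example_names = [
    "dataset-ds000288_atlas-schaefer7networks_nroi-400_desc-simple+gsr_data.tsv",
    "dataset-ds000288_atlas-schaefer7networks_nroi-400_desc-scrubbing_data.tsv",
    # Hypothetical second atlas, used only to illustrate de-duplication.
    "dataset-ds000288_atlas-otheratlas_nroi-100_desc-simple+gsr_data.tsv",
    # The phenotype file does not contain "data.tsv" and is skipped.
    "dataset-ds000288_desc-movement_phenotype.tsv",
]

# Sorting the set gives a reproducible order, unlike the archive order.
strategies = sorted(
    {strategy_from_filename(name) for name in example_names if "data.tsv" in name}
)
```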

In [16]:
from nilearn.connectome import ConnectivityMeasure

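In [22]:
# ConnectivityMeasure.fit() requires time series (X), which the archive does not
# contain, so rebuild the matrices directly from the vectorized connectomes.
# Assumption: each row is a lower triangle without the diagonal (400 ROIs).
import numpy as np
from nilearn.connectome import vec_to_sym_matrix

vectors = dataset_connectomes.values
# vec_to_sym_matrix multiplies the diagonal by sqrt(2), hence the division.
diagonal = np.ones((vectors.shape[0], 400)) / np.sqrt(2)
connectome_matrices = vec_to_sym_matrix(vectors, diagonal=diagonal)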

In [18]:
dataset_connectomes.values

array([[ 4.95035476e-01,  4.50003888e-01,  3.55539269e-01, ...,
         4.04366245e-01,  1.97208727e-01,  3.51785879e-01],
       [ 5.51107507e-01,  5.24035590e-01,  4.04788253e-01, ...,
         3.88681385e-01,  2.03812205e-01,  3.52432545e-01],
       [ 5.34794228e-01,  2.66094344e-01,  2.74350480e-01, ...,
         5.73319661e-01,  2.16616869e-01,  5.83886877e-01],
       ...,
       [            nan,             nan,             nan, ...,
                    nan,             nan,             nan],
       [-9.29242750e-17,  5.33294170e-01, -9.36010482e-17, ...,
         3.64463243e-17,  1.04829695e-32, -9.57597658e-33],
       [ 4.28087791e-01,  6.10174002e-01,  3.28547191e-01, ...,
         4.68958214e-01,  3.72808304e-01,  4.53677173e-01]])
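
The array above contains at least one row made entirely of `nan`, and any correlation with mean FD computed across subjects would propagate those missing values. One possible way to handle this, shown here as a sketch on toy data, is to drop subjects with missing edges from both the connectomes and the motion measure before computing the metric. Whether the original analysis excludes such subjects or handles them differently is not stated in the source.

```python
import numpy as np

# Toy connectomes (not real data): 4 subjects, 3 edges; the third subject is all NaN.
toy = np.array([
    [0.4, 0.5, 0.3],
    [0.5, 0.2, 0.6],
    [np.nan, np.nan, np.nan],
    [0.4, 0.6, 0.3],
])
toy_fd = np.array([0.10, 0.25, 0.80, 0.15])

# Keep only subjects whose connectome has no missing edge,
# and apply the same mask to the motion measure so rows stay aligned.
valid = ~np.isnan(toy).any(axis=1)
clean_connectomes = toy[valid]
clean_fd = toy_fd[valid]
```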

## Conclusions

## References