Towards 3D Deep Learning for neuropsychiatry: predicting Autism diagnosis using an interpretable Deep Learning pipeline applied to minimally processed structural MRI data, Melanie Garcia, Clare Kelly. medRxiv 2022.10.18.22281196; doi: https://doi.org/10.1101/2022.10.18.22281196

GitHub: https://github.com/garciaml/Autism-3D-CNN-brain-sMRI?tab=readme-ov-file

In [1]:
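# Create a virtual environment and register it as a Jupyter kernel.
# Each "!" command runs in its own subshell, so "source .../activate" would not persist;
# call the environment's own interpreter instead.
!python3 -m venv ../pretrainedresnet
!../pretrainedresnet/bin/python -m pip install ipykernel
!../pretrainedresnet/bin/python -m ipykernel install --user --name=pretrainedresnet --display-name "Python (pretrainedresnet)"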

Installed kernelspec pretrainedresnet in /home/ejh2wy/.local/share/jupyter/kernels/pretrainedresnet


In [10]:
# Switch to the "Python (pretrainedresnet)" kernel before running this cell.
# %pip installs into the environment of the active kernel.
#%pip install -r requirements.txt  # install the repository requirements in the new environment
#%pip install torchio
#%pip install monai
#%pip install tensorboard
%pip install torchsummary

Defaulting to user installation because normal site-packages is not writeable
Collecting torchsummary
  Downloading torchsummary-1.5.1-py3-none-any.whl.metadata (296 bytes)
Downloading torchsummary-1.5.1-py3-none-any.whl (2.8 kB)
Installing collected packages: torchsummary
Successfully installed torchsummary-1.5.1


# Data Preprocessing

Structure your data according to the BIDS specification (https://bids.neuroimaging.io/). Your participants.tsv file should contain:

1. a column called participant_id holding the subject ID (matching the folder names in the BIDS dataset: sub-<participant_id>).
2. a column called label holding the binary target variable (here: 0 = no diagnosis, 1 = Autism).
3. a column called dataset indicating the split in which each participant's data is used; the code supports three values: train, val, and test.
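
The three requirements above can be checked before any image processing starts. The following is a minimal sketch using only the standard library; the function name `validate_participants` and the example rows are illustrative assumptions, not part of the repository.

```python
import csv
import io

REQUIRED_COLUMNS = {"participant_id", "label", "dataset"}
ALLOWED_LABELS = {"0", "1"}
ALLOWED_DATASETS = {"train", "val", "test"}


def validate_participants(tsv_text):
    """Return a list of human-readable problems found in a participants.tsv."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["label"] not in ALLOWED_LABELS:
            problems.append(f"line {line_no}: label {row['label']!r} is not 0 or 1")
        if row["dataset"] not in ALLOWED_DATASETS:
            problems.append(f"line {line_no}: dataset {row['dataset']!r} is not train/val/test")
    return problems


# Illustrative content: the second data row uses an unsupported split name.
example = "participant_id\tlabel\tdataset\n29333\t1\ttrain\n29423\t0\tvalidation\n"
problems = validate_participants(example)
print(problems)
```

To check a real file, pass the result of `open(path).read()` for the participants.tsv of your BIDS folder.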

In [None]:
# Create ABIDEII_BIDS folder in BIDS structure from ABIDEII_Raw

In [None]:
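# Clone the GitHub repository linked above next to this folder and install its requirements.
# Paths are given explicitly because "!cd" does not persist between notebook commands.
#!git clone https://github.com/garciaml/Autism-3D-CNN-brain-sMRI.git ../Autism-3D-CNN-brain-sMRI
#!pip install -r ../Autism-3D-CNN-brain-sMRI/requirements.txt
#!python ../Autism-3D-CNN-brain-sMRI/preprocessing_bids.py ABIDEII_BIDS ABIDEII_BIDS_PROCESSED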

In [18]:
import pandas as pd

# Load metadata files
df1 = pd.read_csv('../ABIDEII_Raw/ABIDEII_Composite_Phenotypic.csv')
df2 = pd.read_csv('../ABIDEII_Raw/ABIDEII_Long_Composite_Phenotypic.csv')

# Subset columns needed
df1 = df1[["SUB_ID", "DX_GROUP", "SITE_ID"]]
df2 = df2[["SUB_ID", "DX_GROUP", "SITE_ID"]]

# Combine metadata files
participants = pd.concat([df1, df2], ignore_index=True)

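# Recode labels. ABIDE codes DX_GROUP as 1 = Autism, 2 = Control,
# while the pipeline expects label 1 = Autism, 0 = no diagnosis, so only 2 -> 0 changes.
participants["DX_GROUP"] = participants["DX_GROUP"].replace({2: 0})

# Remove subject 29547 and every participant from site ABIDEII-IU_1
# (.copy() avoids a SettingWithCopyWarning in the assignments below)
participants = participants[
    ~((participants["SUB_ID"] == 29547) | (participants["SITE_ID"] == "ABIDEII-IU_1"))
].copy()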

# Train/test/val split using vectorized operations
participants["dataset"] = "train"
participants.loc[
    participants["SITE_ID"].isin(["ABIDEII-UPSM_Long", "ABIDEII-UCLA_Long", "ABIDEII-IP_1"]),
    "dataset",
] = "test"
participants.loc[
    participants["SITE_ID"].isin(["ABIDEII-BNI_1", "ABIDEII-USM_1"]),
    "dataset",
] = "val"

# Rename columns to match
participants.rename(columns={"SUB_ID": "participant_id", "DX_GROUP": "label"}, inplace=True)

# Remove site id column
participants.drop('SITE_ID', axis=1, inplace=True)

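# Export participants.tsv (create the output folder first if it does not exist yet)
import os
os.makedirs('../ABIDEII_reorganized', exist_ok=True)
participants.to_csv('../ABIDEII_reorganized/participants.tsv', sep='\t', index=False)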
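
Because the split above is assigned by acquisition site, a split can end up with very few participants of one class. The sketch below counts participants per (dataset, label) pair and flags splits in which a class is absent. The helper names and the example rows are made up for illustration and are not ABIDE II data.

```python
from collections import Counter


def split_summary(rows):
    """Count participants per (dataset, label) pair."""
    return Counter((row["dataset"], row["label"]) for row in rows)


def splits_missing_a_class(rows, labels=(0, 1)):
    """Return the splits in which at least one label never occurs."""
    counts = split_summary(rows)
    datasets = sorted({row["dataset"] for row in rows})
    return [d for d in datasets if any(counts[(d, label)] == 0 for label in labels)]


# Made-up rows for illustration only; these are not ABIDE II participants.
rows = [
    {"participant_id": 1, "label": 0, "dataset": "train"},
    {"participant_id": 2, "label": 1, "dataset": "train"},
    {"participant_id": 3, "label": 1, "dataset": "val"},
    {"participant_id": 4, "label": 0, "dataset": "test"},
    {"participant_id": 5, "label": 1, "dataset": "test"},
]

summary = split_summary(rows)
for (dataset, label), n in sorted(summary.items()):
    print(f"{dataset:5s} label={label}: {n}")
incomplete = splits_missing_a_class(rows)
print("splits missing a class:", incomplete)
```

In the notebook, `participants.to_dict("records")` would supply `rows`; `participants.groupby(["dataset", "label"]).size()` gives the same counts directly in pandas.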


In [32]:
import os
import shutil
import glob

# Define the original root and new destination directory
root_dir = "../ABIDEII_Raw"
new_root_dir = "../ABIDEII_reorganized"  # New directory to store the copied files

# Find all *_T1w.nii.gz files in the original structure
# (the pattern contains no "**", so glob's recursive flag is not needed)
t1w_files = glob.glob(os.path.join(root_dir, "ABIDEII-*/derivatives/fmriprep/sub-*/ses-1/anat/*_T1w.nii.gz"))
print(len(t1w_files))

960


In [33]:
from collections import Counter
import os

# Extract filenames (without paths)
filenames = [os.path.basename(f) for f in t1w_files]

# Count duplicates
duplicates = [item for item, count in Counter(filenames).items() if count > 1]

# Print the number of duplicates
print(f"Number of duplicate files: {len(duplicates)}")

Number of duplicate files: 0
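
The check above compares full file names, which include acquisition and run entities. The reorganization step, however, writes one file per subject (`sub-<id>_T1w.nii.gz`), so two files belonging to the same subject would overwrite each other even though their names differ. A minimal sketch of a subject-level check follows; the helper names and the example paths (including the second run) are hypothetical.

```python
import re
from collections import defaultdict

SUBJECT_PATTERN = re.compile(r"sub-[A-Za-z0-9]+")


def files_per_subject(paths):
    """Group file paths by the first 'sub-<id>' entity found in each path."""
    by_subject = defaultdict(list)
    for path in paths:
        match = SUBJECT_PATTERN.search(path)
        if match:
            by_subject[match.group(0)].append(path)
    return by_subject


def colliding_subjects(paths):
    """Return only the subjects that own more than one file."""
    return {s: files for s, files in files_per_subject(paths).items() if len(files) > 1}


# Hypothetical paths: sub-29333 has two runs, sub-29423 has one.
example_paths = [
    "raw/site/derivatives/fmriprep/sub-29333/ses-1/anat/sub-29333_ses-1_run-1_T1w.nii.gz",
    "raw/site/derivatives/fmriprep/sub-29333/ses-1/anat/sub-29333_ses-1_run-2_T1w.nii.gz",
    "raw/site/derivatives/fmriprep/sub-29423/ses-1/anat/sub-29423_ses-1_run-1_T1w.nii.gz",
]
collisions = colliding_subjects(example_paths)
print(sorted(collisions))
```

Calling `colliding_subjects(t1w_files)` in the notebook would list any subject whose copy target is shared by several source files.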


In [None]:
# Reorganize the file system

In [34]:
# Process each file
for file_path in t1w_files:
    # Extract subject ID from the path
    parts = file_path.split(os.sep)
    sub_index = parts.index("fmriprep") + 1  # Index of the subject folder
    subject_id = parts[sub_index]

    # Define new directory and filename
    new_subject_dir = os.path.join(new_root_dir, subject_id, "anat")
    new_filename = f"{subject_id}_T1w.nii.gz"
    new_file_path = os.path.join(new_subject_dir, new_filename)

    # Create directories if they don't exist
    os.makedirs(new_subject_dir, exist_ok=True)

    # Copy the file instead of moving
    shutil.copy2(file_path, new_file_path)
    #print(f"Copied: {file_path} → {new_file_path}")

print("Reorganization complete! Files copied to:", new_root_dir)

Copied: ../ABIDEII_Raw/ABIDEII-KKI_1/derivatives/fmriprep/sub-29333/ses-1/anat/sub-29333_ses-1_acq-rc8chan_run-1_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz → ../ABIDEII_reorganized/sub-29333/anat/sub-29333_T1w.nii.gz
Copied: ../ABIDEII_Raw/ABIDEII-KKI_1/derivatives/fmriprep/sub-29423/ses-1/anat/sub-29423_ses-1_acq-rc32chan_run-1_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz → ../ABIDEII_reorganized/sub-29423/anat/sub-29423_T1w.nii.gz
Copied: ../ABIDEII_Raw/ABIDEII-KKI_1/derivatives/fmriprep/sub-29362/ses-1/anat/sub-29362_ses-1_acq-rc8chan_run-1_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz → ../ABIDEII_reorganized/sub-29362/anat/sub-29362_T1w.nii.gz
Copied: ../ABIDEII_Raw/ABIDEII-KKI_1/derivatives/fmriprep/sub-29430/ses-1/anat/sub-29430_ses-1_acq-rc32chan_run-1_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz → ../ABIDEII_reorganized/sub-29430/anat/sub-29430_T1w.nii.gz
Copied: ../ABIDEII_Raw/ABIDEII-KKI_1/derivatives/fmriprep/sub-29308/ses-1/anat/sub-29308_ses-1_acq-rc8
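
After copying, participants.tsv may still list participants for whom no T1w image was found, since it is built from the phenotypic files rather than from the images. The sketch below reports such participants, assuming the `sub-<id>/anat/sub-<id>_T1w.nii.gz` layout produced by the loop above. The function name `missing_t1w` is an illustrative choice and the demo runs on a temporary folder with empty placeholder files.

```python
import os
import tempfile


def missing_t1w(bids_root, participant_ids):
    """Return the participant IDs that have no T1w file under bids_root."""
    missing = []
    for pid in participant_ids:
        subject = f"sub-{pid}"
        expected = os.path.join(bids_root, subject, "anat", f"{subject}_T1w.nii.gz")
        if not os.path.isfile(expected):
            missing.append(pid)
    return missing


# Demo on a temporary folder: only sub-29333 gets an (empty) placeholder image.
with tempfile.TemporaryDirectory() as root:
    anat_dir = os.path.join(root, "sub-29333", "anat")
    os.makedirs(anat_dir)
    open(os.path.join(anat_dir, "sub-29333_T1w.nii.gz"), "wb").close()
    result = missing_t1w(root, [29333, 29423])
print(result)
```

In the notebook, `missing_t1w(new_root_dir, participants["participant_id"])` would return the rows to drop from participants.tsv before preprocessing.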

In [20]:
# Preprocessing
!python ../Autism-3D-CNN-brain-sMRI/preprocessing_bids.py '../ABIDEII_reorganized' '../ABIDEII_preprocessed'

In [23]:
# Model Training
# Make sure the pre-trained weights are in place before running.
# The leading "!" is required: without it the notebook parses the line as Python
# and raises "SyntaxError: invalid decimal literal".
!python ../Autism-3D-CNN-brain-sMRI/train_medicalnet.py '../ABIDEII_reorganized' '../ABIDEII_preprocessed' 'outputs/Resnet50' '../Autism-3D-CNN-brain-sMRI/resnet_training'
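
Training runs are long, so it can help to confirm that every path handed to the training script exists before launching it. This is a minimal sketch; `missing_paths` is an illustrative helper, and the `pretrained_weights` folder name in the demo is hypothetical, since the notebook does not state where the weights live.

```python
import os
import tempfile


def missing_paths(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Demo on a temporary folder: one directory exists, the other does not.
with tempfile.TemporaryDirectory() as root:
    present = os.path.join(root, "ABIDEII_preprocessed")
    os.makedirs(present)
    absent = os.path.join(root, "pretrained_weights")  # hypothetical folder name
    result = missing_paths([present, absent])
    only_absent_reported = result == [absent]
print(only_absent_reported)
```

In the notebook, the list would hold the script path and the input directories used in the training command; an empty result means the command can at least find its inputs.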