# Neuroimaging Preprocessing Tutorial

This tutorial demonstrates neuroimaging preprocessing using **DeepPrep**, a computationally efficient, scalable, and robust preprocessing pipeline empowered by deep learning and workflow managers.

**Official Documentation**: https://deepprep.readthedocs.io/en/latest/installation.html

### Why DeepPrep?

- **Fast**: Reported to run about 11x faster than fMRIPrep
- **Deep Learning-Powered**: Uses state-of-the-art deep learning algorithms:
  - **FastSurfer**: Brain tissue segmentation
  - **FastCSR**: Cortical surface reconstruction
  - **SUGAR**: Cortical surface registration
  - **SynthMorph**: Volumetric spatial normalization
- **Highly Scalable**: Supports local workstations, HPC clusters, and cloud computing environments
- **Robust**: The authors report a 100% success rate when processing their clinical samples

### Supported Data Types

- Anatomical MRI (T1w, T2w, FLAIR)
- Functional MRI (BOLD)
- CIFTI format output

### Citation

If you use DeepPrep in your research, please cite:

Ren, J.*, An, N.*, Lin, C., et al. (2025). DeepPrep: an accelerated, scalable and robust pipeline for neuroimaging preprocessing empowered by deep learning. *Nature Methods*. https://doi.org/10.1038/s41592-025-02599-1

## 1: Installation

This tutorial requires two environments:

1. **Python Environment** (for this notebook)
   - Used for data format conversion (ADNI → BIDS)
   - Required packages: `pydicom`, `nibabel`, `numpy`

2. **DeepPrep Docker** (for brain imaging preprocessing)
   - The main preprocessing pipeline
   - Runs independently in a container
   
Let's set up both environments step by step.

### 1.1 System Requirements

Before installing DeepPrep, ensure your system meets the following requirements:

#### Hardware Requirements

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| CPU | 4 cores | 8+ cores |
| RAM | 12GB + Swap space | 32GB+ |
| Disk Space | 20GB | 100GB+ |
| GPU (Optional) | 10GB+ VRAM | NVIDIA GPU with CUDA 11.8+ |

#### Software Requirements

- **Operating System**: Ubuntu 20.04 or newer (Linux-based systems)
- **Container Platform**: Docker or Singularity
- **GPU Drivers** (if using GPU): NVIDIA Driver 520.61.05+ and CUDA 11.8+

### 1.2 Check Your System

Let's verify your system meets the requirements:

In [None]:
# Check Docker installation
!docker --version

In [None]:
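# Check available disk space where Docker stores images by default
!df -h /var/lib/docker 2>/dev/null || echo "/var/lib/docker not found (Docker may not be installed yet)"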

In [None]:
# Check home directory space
!df -h ~

In [None]:
# Check CPU information
!lscpu | grep -E "^CPU\(s\)|Model name"

In [None]:
# Check available RAM
!free -h

In [None]:
# Check GPU (if available)
!nvidia-smi 2>/dev/null || echo "No NVIDIA GPU detected (CPU-only mode will be used)"
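
The shell checks above print raw numbers that still have to be compared with the hardware table by eye. As a minimal sketch, the comparison can be scripted with the standard library. The helper names (`read_total_ram_gb`, `check_requirements`) are hypothetical, the thresholds are copied from the "Minimum" column, and reading `/proc/meminfo` assumes a Linux host.

```python
import os
import shutil

# Minimum values copied from the hardware requirements table
MIN_CPU_CORES = 4
MIN_RAM_GB = 12
MIN_DISK_GB = 20


def read_total_ram_gb(meminfo_path="/proc/meminfo"):
    """Return total RAM in GiB from a Linux meminfo file, or None if unavailable."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    kib = int(line.split()[1])  # the kernel reports this value in kB
                    return kib / 1024**2
    except OSError:
        pass
    return None


def check_requirements(cores, ram_gb, disk_gb):
    """Compare measured values with the minimums; unknown values count as failing."""
    return {
        "cpu": cores is not None and cores >= MIN_CPU_CORES,
        "ram": ram_gb is not None and ram_gb >= MIN_RAM_GB,
        "disk": disk_gb is not None and disk_gb >= MIN_DISK_GB,
    }


cores = os.cpu_count()
ram_gb = read_total_ram_gb()
# Free space on the filesystem that holds the current working directory
disk_gb = shutil.disk_usage(os.getcwd()).free / 1024**3

for name, ok in check_requirements(cores, ram_gb, disk_gb).items():
    print(f"{name}: {'OK' if ok else 'BELOW MINIMUM'}")
```

A machine sold with 12 GB of RAM usually reports slightly less than 12 GiB in `MemTotal`, so a borderline "BELOW MINIMUM" for RAM is worth checking by hand against `free -h`.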

### 1.3 Install Docker (If Not Already Installed)

If Docker is not installed, you can install it using the following commands in your terminal:

```bash
# Update package index
sudo apt-get update

# Install required packages
sudo apt-get install -y \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

# Add Docker's official GPG key
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Set up the repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Add your user to the docker group (to run Docker without sudo)
sudo usermod -aG docker $USER

# Apply the new group membership (or logout and login)
newgrp docker
```

**Note**: After adding yourself to the docker group, you may need to log out and log back in for the changes to take effect.

Let's check if Docker is already installed:

In [None]:
# Check if Docker is installed
!which docker && echo "✓ Docker is already installed" || echo "✗ Docker not found - please install using the commands above"

# Check Docker service status (systemd-based systems; no sudo needed)
!systemctl is-active docker 2>/dev/null || echo "Docker service is not active, or systemd is unavailable"

### 1.4 Pull DeepPrep Docker Image

DeepPrep is distributed as a Docker image. This tutorial uses **pbfslab/deepprep:25.1.0**; check the official documentation for newer releases.

**Important Notes**:
- The Docker image is approximately **10-15GB** in size
- Download time depends on your internet connection
- Make sure you have sufficient disk space

In [None]:
# Pull the DeepPrep Docker image
# If you're in the docker group, run:
!docker pull pbfslab/deepprep:25.1.0

# If you need sudo privileges, run this in terminal:
# sudo docker pull pbfslab/deepprep:25.1.0

### 1.5 Verify Installation

Let's verify that the DeepPrep image was successfully downloaded:

In [None]:
# List DeepPrep Docker images
!docker images pbfslab/deepprep

You should see output similar to:

```
REPOSITORY           TAG       IMAGE ID       CREATED        SIZE
pbfslab/deepprep     25.1.0    xxxxxxxxxxxx   x weeks ago    xx.xGB
```
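
If you prefer to check for the image from Python rather than reading the table, the following sketch wraps `docker images -q`, which prints only image IDs. The function names (`parse_image_ids`, `image_is_present`) are hypothetical helpers for this tutorial, not part of Docker or DeepPrep.

```python
import subprocess


def parse_image_ids(stdout):
    """Split the output of `docker images -q` into a list of image IDs."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def image_is_present(image, run=subprocess.run):
    """Return True if the local Docker daemon lists at least one ID for `image`."""
    try:
        result = run(
            ["docker", "images", "-q", image],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The docker executable is not on PATH
        return False
    return result.returncode == 0 and bool(parse_image_ids(result.stdout))
```

Calling `image_is_present("pbfslab/deepprep:25.1.0")` returns `False` when Docker is missing, when the daemon cannot be reached, or when the image has not been pulled, so a `False` result does not say which of these applies.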

### 1.6 Preparation

Before running DeepPrep, you need to prepare the following:

#### 1.6.1 FreeSurfer License

DeepPrep requires a **FreeSurfer license** to run. This is a free license that you can obtain from the FreeSurfer website.

**Steps to obtain FreeSurfer license:**

1. Visit the FreeSurfer registration page: https://surfer.nmr.mgh.harvard.edu/registration.html
2. Fill out the registration form with your information
3. You will receive an email with the `license.txt` file
4. Save the license file to a known location on your computer

**Recommended location**: `$HOME/freesurfer/license.txt`

In [None]:
import os

# Create directory for FreeSurfer license
freesurfer_dir = os.path.expanduser("~/freesurfer")
os.makedirs(freesurfer_dir, exist_ok=True)

print(f"FreeSurfer directory created at: {freesurfer_dir}")
print(f"Please save your license.txt file to: {freesurfer_dir}/license.txt")

In [None]:
# Check if FreeSurfer license exists
import os

license_path = os.path.expanduser("~/freesurfer/license.txt")

if os.path.exists(license_path):
    print(f"✓ FreeSurfer license found at: {license_path}")
    print("\nLicense content:")
    with open(license_path, 'r') as f:
        print(f.read())
else:
    print(f"✗ FreeSurfer license NOT found at: {license_path}")
    print("\nPlease:")
    print("1. Obtain a license from: https://surfer.nmr.mgh.harvard.edu/registration.html")
    print(f"2. Save it to: {license_path}")

#### 1.6.2 Convert ADNI Data to BIDS Format

DeepPrep requires input data in **BIDS (Brain Imaging Data Structure)** format. Our dataset is from ADNI and contains three modalities:
- **MRI** (T1-weighted structural images)
- **AV45 PET** (Amyloid PET imaging)
- **FDG PET** (Glucose metabolism PET imaging)

**Current ADNI data structure:**
```
data/002_S_0295/
├── MPR__GradWarp__B1_Correction__N3/              # MRI T1w
│   └── 2009-05-22_07_00_57.0/
│       └── I150176/*.nii
├── AV45_Coreg,_Avg,_Standardized_Image_and_Voxel_Size/  # AV45 PET
│   └── 2011-06-10_16_22_23.0/
│       └── I240520/*.dcm
└── Coreg,_Avg,_Standardized_Image_and_Voxel_Size/       # FDG PET
    └── 2011-06-09_08_23_48.0/
        └── I240517/*.dcm
```

**Target BIDS structure:**
```
bids_dataset/
├── dataset_description.json
├── participants.tsv
└── sub-0020295/
    ├── anat/
    │   └── sub-0020295_T1w.nii.gz
    └── pet/
        ├── sub-0020295_trc-AV45_pet.nii.gz
        ├── sub-0020295_trc-AV45_pet.json
        ├── sub-0020295_trc-FDG_pet.nii.gz
        └── sub-0020295_trc-FDG_pet.json
```

We'll write a Python script to perform this conversion automatically.
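
The mapping from an ADNI folder name to a BIDS subject label (`002_S_0295` to `sub-0020295` in the trees above) is the step most likely to go wrong silently, because BIDS labels may only contain letters and digits. The sketch below validates the name before converting it. The pattern `NNN_S_NNNN` is an assumption based on the example subject, and `adni_to_bids_label` is a hypothetical helper name.

```python
import re

# Assumed ADNI subject ID layout: 3-digit site, "_S_", 4-digit subject number
ADNI_ID_PATTERN = re.compile(r"^(\d{3})_S_(\d{4})$")


def adni_to_bids_label(adni_id):
    """Convert an ADNI subject ID such as '002_S_0295' to 'sub-0020295'."""
    match = ADNI_ID_PATTERN.match(adni_id)
    if match is None:
        raise ValueError(f"Unexpected ADNI subject ID: {adni_id!r}")
    site, subject = match.groups()
    # BIDS labels are alphanumeric only, so the underscores and 'S' are dropped
    return f"sub-{site}{subject}"


print(adni_to_bids_label("002_S_0295"))
```

Raising on an unexpected name is a deliberate choice: a plain string replacement would accept any folder name and could produce colliding or invalid subject labels.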

In [None]:
# Install pydicom, NumPy, nibabel, and the optional pixel data handlers for compressed DICOM.
# Run this cell if reading pixel data fails, e.g. with
# "ImportError: NumPy is required when converting pixel data to an ndarray"

import sys
!{sys.executable} -m pip install -q pydicom numpy nibabel pillow pylibjpeg pylibjpeg-libjpeg pylibjpeg-openjpeg python-gdcm

# Test the installation
try:
    import pydicom
    import numpy as np
    print("✓ pydicom and numpy imported successfully")
    
    # Test pixel data handler
    import glob
    dcm_files = glob.glob('data/002_S_0295/*/2011*/I*/*.dcm')
    if dcm_files:
        ds = pydicom.dcmread(dcm_files[0])
        pixel_data = ds.pixel_array
        print(f"✓ Pixel data access working - shape: {pixel_data.shape}")
    else:
        print("⚠ No DICOM files found for testing")
        
except Exception as e:
    print(f"✗ Error: {e}")
    print("\nIf packages were just installed, restart the kernel and run this cell again.")

In [None]:
import os
import json
import glob

# Import required libraries with proper error handling
try:
    import pydicom
    import numpy as np
    import nibabel as nib
    print("✓ All required libraries imported successfully")
except ImportError as e:
    print(f"✗ Import error: {e}")
    print("Please install required packages:")
    print("  pip install pydicom numpy nibabel pillow")
    raise

def convert_adni_to_bids(adni_data_dir, bids_output_dir):
    """
    Convert ADNI data to BIDS format
    
    Parameters:
    - adni_data_dir: Path to ADNI data (e.g., 'data/002_S_0295')
    - bids_output_dir: Output BIDS directory (e.g., 'bids_dataset')
    """
    
    # Extract subject ID from directory name (002_S_0295 -> 0020295)
    subject_id = os.path.basename(adni_data_dir).replace('_', '').replace('S', '')
    bids_subject_id = f"sub-{subject_id}"
    
    # Create BIDS directory structure
    subject_dir = os.path.join(bids_output_dir, bids_subject_id)
    anat_dir = os.path.join(subject_dir, 'anat')
    pet_dir = os.path.join(subject_dir, 'pet')
    
    os.makedirs(anat_dir, exist_ok=True)
    os.makedirs(pet_dir, exist_ok=True)
    
    print(f"Converting subject: {bids_subject_id}")
    print(f"Output directory: {subject_dir}")
    
    # 1. Convert MRI (T1w)
    mri_pattern = os.path.join(adni_data_dir, 'MPR__GradWarp__B1_Correction__N3', '*', '*', '*.nii')
    mri_files = sorted(glob.glob(mri_pattern))
    
    if mri_files:
        mri_file = mri_files[0]  # Take the first match (sorted for reproducibility)
        output_t1w = os.path.join(anat_dir, f'{bids_subject_id}_T1w.nii.gz')
        
        # Load and save as compressed NIfTI
        img = nib.load(mri_file)
        nib.save(img, output_t1w)
        print(f"✓ Converted MRI: {os.path.basename(mri_file)} -> {os.path.basename(output_t1w)}")
    else:
        print("✗ No MRI data found")
    
    # 2. Convert AV45 PET
    av45_pattern = os.path.join(adni_data_dir, 'AV45_Coreg,_Avg,_Standardized_Image_and_Voxel_Size', '*', '*', '*.dcm')
    av45_files = glob.glob(av45_pattern)
    
    if av45_files:
        output_av45 = os.path.join(pet_dir, f'{bids_subject_id}_trc-AV45_pet.nii.gz')
        output_av45_json = os.path.join(pet_dir, f'{bids_subject_id}_trc-AV45_pet.json')
        
        # Convert DICOM series to NIfTI
        convert_dicom_to_nifti(av45_files, output_av45)
        
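        # Create JSON sidecar (placeholder values: replace them with the
        # scanner and injection details from the DICOM headers or ADNI records)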
        pet_metadata = {
            "Manufacturer": "Siemens",
            "TracerName": "AV45",
            "TracerRadionuclide": "F18",
            "InjectedRadioactivity": 370,
            "InjectedRadioactivityUnits": "MBq",
            "ModeOfAdministration": "bolus"
        }
        with open(output_av45_json, 'w') as f:
            json.dump(pet_metadata, f, indent=2)
        
        print(f"✓ Converted AV45 PET: {len(av45_files)} DICOM files -> {os.path.basename(output_av45)}")
    else:
        print("✗ No AV45 PET data found")
    
    # 3. Convert FDG PET
    fdg_pattern = os.path.join(adni_data_dir, 'Coreg,_Avg,_Standardized_Image_and_Voxel_Size', '*', '*', '*.dcm')
    fdg_files = glob.glob(fdg_pattern)
    
    if fdg_files:
        output_fdg = os.path.join(pet_dir, f'{bids_subject_id}_trc-FDG_pet.nii.gz')
        output_fdg_json = os.path.join(pet_dir, f'{bids_subject_id}_trc-FDG_pet.json')
        
        # Convert DICOM series to NIfTI
        convert_dicom_to_nifti(fdg_files, output_fdg)
        
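        # Create JSON sidecar (placeholder values: replace them with the
        # scanner and injection details from the DICOM headers or ADNI records)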
        pet_metadata = {
            "Manufacturer": "Siemens",
            "TracerName": "FDG",
            "TracerRadionuclide": "F18",
            "InjectedRadioactivity": 370,
            "InjectedRadioactivityUnits": "MBq",
            "ModeOfAdministration": "bolus"
        }
        with open(output_fdg_json, 'w') as f:
            json.dump(pet_metadata, f, indent=2)
        
        print(f"✓ Converted FDG PET: {len(fdg_files)} DICOM files -> {os.path.basename(output_fdg)}")
    else:
        print("✗ No FDG PET data found")
    
    return subject_dir


def convert_dicom_to_nifti(dicom_files, output_path):
    """Convert DICOM series to NIfTI format"""
    
    # Sort DICOM files by instance number (files without a usable one sort first)
    def get_instance_number(filepath):
        try:
            ds = pydicom.dcmread(filepath, stop_before_pixels=True)
            return int(getattr(ds, 'InstanceNumber', 0) or 0)
        except Exception:
            return 0
    
    dicom_files_sorted = sorted(dicom_files, key=get_instance_number)
    
    # Read all slices and extract pixel data
    pixel_arrays = []
    
    print(f"  Reading {len(dicom_files_sorted)} DICOM slices...")
    
    for filepath in dicom_files_sorted:
        try:
            # Read DICOM file
            ds = pydicom.dcmread(filepath, force=True)
            
            # Skip files that carry no image data
            if 'PixelData' not in ds:
                print(f"    Warning: No pixel data in {os.path.basename(filepath)}")
                continue
            
            # Convert stored values to real-world values; PET slices often
            # carry their own RescaleSlope/RescaleIntercept
            slope = float(getattr(ds, 'RescaleSlope', 1.0))
            intercept = float(getattr(ds, 'RescaleIntercept', 0.0))
            pixel_arrays.append(ds.pixel_array.astype(np.float32) * slope + intercept)
                
        except Exception as e:
            print(f"    Error reading {os.path.basename(filepath)}: {e}")
            continue
    
    if not pixel_arrays:
        raise ValueError("No valid pixel data found in DICOM files")
    
    # Stack all slices into 3D volume
    img3d = np.stack(pixel_arrays, axis=-1)
    print(f"  Created 3D volume with shape: {img3d.shape}")
    
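    # Create NIfTI image. NOTE: the identity affine below discards voxel size,
    # orientation, and position. Compute the affine from the DICOM headers
    # (ImageOrientationPatient, ImagePositionPatient, PixelSpacing) or use a
    # dedicated converter such as dcm2niix before any spatial analysis
    affine = np.eye(4)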
    nifti_img = nib.Nifti1Image(img3d, affine)
    
    # Save as compressed NIfTI
    nib.save(nifti_img, output_path)
    print(f"  Saved to {output_path}")


def create_bids_metadata(bids_dir):
    """Create BIDS dataset_description.json and participants.tsv"""
    
    # Create dataset_description.json
    dataset_desc = {
        "Name": "ADNI PET-MRI Dataset",
        "BIDSVersion": "1.8.0",
        "DatasetType": "raw",
        "Authors": ["ADNI"]
    }
    
    desc_file = os.path.join(bids_dir, 'dataset_description.json')
    with open(desc_file, 'w') as f:
        json.dump(dataset_desc, f, indent=2)
    
    print(f"✓ Created {desc_file}")
    
    # Create participants.tsv
    participants_file = os.path.join(bids_dir, 'participants.tsv')
    with open(participants_file, 'w') as f:
        f.write("participant_id\tage\tsex\n")
        # Add participant info here (can be extracted from ADNI metadata)
    
    print(f"✓ Created {participants_file}")


# Example usage
print("BIDS Conversion Script Ready!")
print("\nTo convert your data, run:")
print("  convert_adni_to_bids('data/002_S_0295', 'bids_dataset')")
print("  create_bids_metadata('bids_dataset')")
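
`create_bids_metadata` writes only the header row of `participants.tsv`. As a minimal sketch of filling in the rows, the function below writes one line per participant and uses `n/a` for values that are not known, which is how BIDS tabular files mark missing data. The name `write_participants_tsv` and the dictionary layout are assumptions for illustration; the demo writes to a temporary directory so it does not touch `bids_dataset`.

```python
import csv
import os
import tempfile

COLUMNS = ("participant_id", "age", "sex")


def write_participants_tsv(path, participants, columns=COLUMNS):
    """Write a tab-separated participants file, using 'n/a' for missing values."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(columns)
        for person in participants:
            writer.writerow([person.get(col, "n/a") for col in columns])


# Demo in a temporary directory; age and sex are unknown here, so they become 'n/a'
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "participants.tsv")
    write_participants_tsv(demo_path, [{"participant_id": "sub-0020295"}])
    with open(demo_path) as f:
        demo_text = f.read()

print(demo_text)
```

Real age and sex values would come from the ADNI demographic tables, which this tutorial does not load.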

In [None]:
# Execute the conversion
adni_data_path = 'data/002_S_0295'
bids_output_path = 'bids_dataset'

# Convert the ADNI data to BIDS format
convert_adni_to_bids(adni_data_path, bids_output_path)

# Create BIDS metadata files
create_bids_metadata(bids_output_path)

print("\n" + "="*50)
print("BIDS Conversion Complete!")
print("="*50)
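
The PET volumes written above use an identity affine, so they carry no voxel size or orientation. The sketch below shows one way to build the affine from standard DICOM attributes, following the usual DICOM pixel-to-patient mapping and then flipping the first two axes to go from DICOM's LPS+ convention to NIfTI's RAS+. It assumes a single-frame series stacked as `(rows, columns, slices)` with slices ordered by position; `affine_from_dicom` is a hypothetical helper, not part of pydicom or nibabel.

```python
import numpy as np


def affine_from_dicom(orientation, first_position, last_position,
                      pixel_spacing, n_slices, slice_thickness=1.0):
    """Build a RAS+ affine for an array indexed as (row, column, slice).

    orientation     -- ImageOrientationPatient (6 values)
    first_position  -- ImagePositionPatient of the first slice
    last_position   -- ImagePositionPatient of the last slice
    pixel_spacing   -- PixelSpacing as (row spacing, column spacing) in mm
    """
    row_cosine = np.asarray(orientation[:3], dtype=float)  # direction of increasing column index
    col_cosine = np.asarray(orientation[3:], dtype=float)  # direction of increasing row index
    first = np.asarray(first_position, dtype=float)
    last = np.asarray(last_position, dtype=float)
    row_spacing, col_spacing = (float(v) for v in pixel_spacing)

    if n_slices > 1:
        # Per-slice step derived from the first and last slice positions
        slice_step = (last - first) / (n_slices - 1)
    else:
        # Single slice: fall back to the slice normal times the thickness
        slice_step = np.cross(row_cosine, col_cosine) * slice_thickness

    affine_lps = np.eye(4)
    affine_lps[:3, 0] = col_cosine * row_spacing  # array axis 0 walks down the rows
    affine_lps[:3, 1] = row_cosine * col_spacing  # array axis 1 walks along the columns
    affine_lps[:3, 2] = slice_step
    affine_lps[:3, 3] = first

    # DICOM patient coordinates are LPS+; NIfTI expects RAS+
    return np.diag([-1.0, -1.0, 1.0, 1.0]) @ affine_lps


# Example: axial series, 2.0 mm row spacing, 1.5 mm column spacing, 10 slices
example = affine_from_dicom(
    orientation=[1, 0, 0, 0, 1, 0],
    first_position=[-100.0, -120.0, -50.0],
    last_position=[-100.0, -120.0, 40.0],
    pixel_spacing=[2.0, 1.5],
    n_slices=10,
)
print(example)
```

The result could replace `np.eye(4)` in `convert_dicom_to_nifti`, provided the slices are sorted by position rather than only by instance number. For production use, a dedicated converter such as dcm2niix handles many edge cases this sketch ignores (gantry tilt, multi-frame files, irregular slice spacing).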

## 2: Preprocessing

DeepPrep performs the following anatomical preprocessing steps:

| Step | Process | What It Does |
|------|---------|--------------|
| **1** | **Motion Correction** | Align and average multiple T1w scans |
| **2** | **Segmentation** (FastSurfer) | Divide brain into 95 regions |
| **3** | **Skull Stripping** | Remove skull and non-brain tissue |
| **4** | **Bias Correction** | Fix brightness inconsistencies |
| **5** | **Surface Reconstruction** (FastCSR) | Build 3D cortical surface models |
| **6** | **Spherical Projection** | Inflate surfaces to spheres |
| **7** | **Surface Registration** (SUGAR) | Align to standard template |
| **8** | **Parcellation** | Label brain regions by atlas |
| **9** | **Volume Mapping** | Project surface labels to volume |
| **10** | **Morphometry** | Extract cortical thickness, area, volume |

### Which Steps Do You Need?

Different research goals require different preprocessing steps:

**Basic volumetric analysis** (brain volume, voxel-based morphometry):
- Steps 1-4

**ROI analysis** (hippocampus volume, subcortical structures):
- Steps 1-4 (segmentation included)

**Surface analysis** (cortical thickness, surface area):
- Steps 1-7

**Group comparison** (patients vs. controls):
- Steps 1-8

**Full analysis** (multimodal neuroimaging):
- All steps 1-10
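
The recommendations above can be captured in a small lookup so that an analysis script can state which steps it depends on. This is only a restatement of the table and list in this section; the goal keys and the `steps_for` helper are hypothetical, and nothing here is a DeepPrep option for running a subset of steps.

```python
# Step names copied from the preprocessing table in this section
STEP_NAMES = {
    1: "Motion Correction",
    2: "Segmentation",
    3: "Skull Stripping",
    4: "Bias Correction",
    5: "Surface Reconstruction",
    6: "Spherical Projection",
    7: "Surface Registration",
    8: "Parcellation",
    9: "Volume Mapping",
    10: "Morphometry",
}

# Last required step for each research goal listed above
LAST_STEP_BY_GOAL = {
    "volumetric": 4,
    "roi": 4,
    "surface": 7,
    "group": 8,
    "full": 10,
}


def steps_for(goal):
    """Return the names of steps 1..N needed for the given goal."""
    last = LAST_STEP_BY_GOAL[goal]
    return [STEP_NAMES[i] for i in range(1, last + 1)]


print(steps_for("surface"))
```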


In [None]:
# Run DeepPrep Preprocessing
#
# OPTION 1: Use the provided shell script (Recommended)
# In your terminal, run:
#   ./run_deepprep.sh
#
# OPTION 2: Copy and paste the command below into your terminal

import os
bids_dir = os.path.abspath('bids_dataset')
output_dir = os.path.abspath('deepprep_output')
work_dir = os.path.abspath('deepprep_work')
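license_file = os.path.expanduser('~/freesurfer/license.txt')

# Create the output and work directories up front; otherwise Docker creates
# missing bind-mount paths owned by root
os.makedirs(output_dir, exist_ok=True)
os.makedirs(work_dir, exist_ok=True)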

print("="*70)
print("OPTION 1 (Recommended): Run the shell script")
print("="*70)
print("./run_deepprep.sh")
print()
print("="*70)
print("OPTION 2: Copy and run this command in your terminal:")
print("="*70)
print()
print("sudo docker run --gpus all --rm \\")
print(f"    -v {bids_dir}:/bids:ro \\")
print(f"    -v {output_dir}:/output \\")
print(f"    -v {work_dir}:/work \\")
print(f"    -v {license_file}:/opt/freesurfer/license.txt:ro \\")
print("    pbfslab/deepprep:25.1.0 \\")
print("    /bids /output participant \\")
print("    --participant_label sub-0020295 \\")
print("    --anat_only \\")
print("    --skip_bids_validation \\")
print("    --fs_license_file /opt/freesurfer/license.txt \\")
print("    --device auto")
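
Printing the command works for copy and paste, but paths with spaces break it. A hedged alternative is to build the arguments as a list, which can be quoted for the shell or passed to `subprocess.run` directly. The function `build_deepprep_command` is a hypothetical helper that mirrors the command in this section; it also passes the mounted license path via `--fs_license_file`, as in DeepPrep's Docker usage documentation. The example paths are placeholders.

```python
import shlex


def build_deepprep_command(bids_dir, output_dir, work_dir, license_file,
                           participant, image="pbfslab/deepprep:25.1.0"):
    """Return the docker argument list for an anatomical-only DeepPrep run."""
    license_in_container = "/opt/freesurfer/license.txt"
    return [
        "docker", "run", "--gpus", "all", "--rm",
        "-v", f"{bids_dir}:/bids:ro",
        "-v", f"{output_dir}:/output",
        "-v", f"{work_dir}:/work",
        "-v", f"{license_file}:{license_in_container}:ro",
        image,
        "/bids", "/output", "participant",
        "--participant_label", participant,
        "--anat_only",
        "--skip_bids_validation",
        "--fs_license_file", license_in_container,
        "--device", "auto",
    ]


# Placeholder paths for illustration only
cmd = build_deepprep_command(
    "/data/bids_dataset",
    "/data/deepprep_output",
    "/data/deepprep_work",
    "/home/user/freesurfer/license.txt",
    "sub-0020295",
)
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

To launch from Python instead, pass the list to `subprocess.run(cmd, check=True)`. If your user is not in the `docker` group, prepend `"sudo"` to the list or run the printed command with `sudo` in a terminal.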

In [None]:
# Visualize preprocessing results
import nibabel as nib
import matplotlib.pyplot as plt

# Load images. The Recon outputs are resampled to FreeSurfer's conformed space,
# so their middle slice may not show exactly the same anatomy as the original scan
images = {
    'Original T1w': nib.load('bids_dataset/sub-0020295/anat/sub-0020295_T1w.nii.gz').get_fdata(),
    'Skull Stripped': nib.load('deepprep_output/Recon/sub-0020295/mri/brainmask.mgz').get_fdata(),
    'Bias Corrected': nib.load('deepprep_output/Recon/sub-0020295/mri/norm.mgz').get_fdata(),
}

# Show middle slice comparison
fig, axes = plt.subplots(1, len(images), figsize=(15, 5))

for ax, (title, data) in zip(axes, images.items()):
    ax.imshow(data[:, :, data.shape[2] // 2].T, cmap='gray', origin='lower')
    ax.set_title(title, fontsize=14)
    ax.axis('off')

plt.tight_layout()
plt.show()
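
A visual check can be complemented with one number: the volume covered by the skull-stripped image. The sketch below counts non-zero voxels and multiplies by the voxel volume. `mask_volume_ml` is a hypothetical helper, and treating every non-zero voxel of `brainmask.mgz` as brain is a simplifying assumption; the demo uses a synthetic array so it runs without any DeepPrep output.

```python
import numpy as np


def mask_volume_ml(volume, voxel_sizes_mm):
    """Return the volume of all non-zero voxels in millilitres."""
    n_voxels = int(np.count_nonzero(volume))
    voxel_mm3 = float(np.prod(voxel_sizes_mm))
    return n_voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Synthetic example: a 5x5x5 block of non-zero voxels, each 1 x 1 x 2 mm
demo = np.zeros((10, 10, 10))
demo[2:7, 2:7, 2:7] = 1.0
demo_volume = mask_volume_ml(demo, (1.0, 1.0, 2.0))
print(f"{demo_volume:.3f} mL")
```

With real data, load the image with `nib.load(...)` and pass `img.get_fdata()` together with `img.header.get_zooms()[:3]`. A value far outside the range you expect for an adult brain is a hint that skull stripping needs a closer look.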