# QSIParc Basics: Parcellation and Regional Analysis

This notebook demonstrates how to run QSIParc through the `voxelops` runner (`run_qsiparc`) to parcellate QSIRecon outputs and compute region-wise diffusion metrics.

## Overview

QSIParc performs:
- Parcellation using the `parcellate` tool
- Regional quantification of diffusion metrics
- Atlas registration and labeling
- Generation of region-wise statistics
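
To make "regional quantification" concrete, the sketch below averages a metric over the voxels that share an atlas label. It only illustrates the idea on a toy array and is not QSIParc's implementation; the function name `regional_means` and the convention that label 0 is background are assumptions.

```python
import numpy as np


def regional_means(labels, metric):
    """Mean of `metric` within each positive integer label (0 is treated as background)."""
    labels = np.asarray(labels).ravel()
    metric = np.asarray(metric, dtype=float).ravel()
    # Keep only labelled voxels that have a finite metric value.
    keep = (labels > 0) & np.isfinite(metric)
    labels = labels[keep].astype(int)
    metric = metric[keep]
    counts = np.bincount(labels)                  # number of voxels per label
    sums = np.bincount(labels, weights=metric)    # metric total per label
    return {int(r): float(sums[r] / counts[r]) for r in np.flatnonzero(counts)}


# Toy 2x3 "volume": label 0 is background, labels 1 and 2 are regions.
toy_labels = [[0, 1, 1], [2, 2, 2]]
toy_fa = [[0.9, 0.2, 0.4], [0.5, 0.6, 0.7]]
print(regional_means(toy_labels, toy_fa))
```

The background voxel (FA 0.9) is excluded, so each region's mean reflects only its own voxels.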

## Prerequisites

- Docker installed and running
- QSIRecon reconstructed data
- FreeSurfer license file
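
A quick pre-flight check can catch missing prerequisites before a long run starts. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper, not part of `voxelops`, and it only confirms that a `docker` executable is on the PATH, not that the daemon is running.

```python
import shutil
from pathlib import Path


def check_prerequisites(qsirecon_dir, fs_license, docker_cmd="docker"):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which(docker_cmd) is None:
        problems.append(f"'{docker_cmd}' executable not found on PATH")
    if not Path(qsirecon_dir).is_dir():
        problems.append(f"QSIRecon directory not found: {qsirecon_dir}")
    if not Path(fs_license).is_file():
        problems.append(f"FreeSurfer license not found: {fs_license}")
    return problems
```

Calling `check_prerequisites(qsirecon_dir, fs_license)` once the paths are defined and printing any returned messages gives an early, readable failure instead of a container error.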

## Setup

In [2]:
from pathlib import Path
from voxelops import (
    run_qsiparc,
    QSIParcInputs,
    QSIParcDefaults,
)
import json
import pandas as pd

## Define Paths

In [3]:
# Input paths
qsirecon_dir = Path("/media/storage/yalab-dev/qsiprep_test/qsirecon_output/")
participant = "01"
fs_license = Path("/home/galkepler/misc/freesurfer/license.txt")

# Output paths (optional)
output_dir = Path("/media/storage/yalab-dev/qsiprep_tes/qsiparc")
work_dir = Path("/media/storage/yalab-dev/qsiprep_test/work/qsiparc/")

## Basic Usage

### Option 1: Use Default Configuration

In [5]:
# Create inputs
inputs = QSIParcInputs(
    qsirecon_dir=qsirecon_dir,
    participant=participant,
    output_dir=output_dir,
    # work_dir=work_dir,
)

# Run with defaults
result = run_qsiparc(
    inputs,
    fs_license=fs_license,
)

print(f"Success: {result['success']}")
print(f"Duration: {result['duration_human']}")


Running qsiparc for participant 01
Command: docker run --rm -v /media/storage/yalab-dev/qsiprep_test/qsirecon_output:/input:ro -v /media/storage/yalab-dev/qsiprep_tes/qsiparc:/output pennlinc/qsiparc:latest /input /output --participant-label=01 --atlas schaefer100 --atlas schaefer200

Execution log saved: /media/storage/yalab-dev/qsiprep_tes/logs/qsiparc_01_20260201_134616.json


ProcedureExecutionError: qsiparc failed: qsiparc failed with exit code 125

Stderr (last 1000 chars):
Unable to find image 'pennlinc/qsiparc:latest' locally
docker: Error response from daemon: pull access denied for pennlinc/qsiparc, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
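
Exit code 125 in the output above comes from Docker rather than from QSIParc: `docker run` reserves 125 for failures of Docker itself, which here was a denied image pull. The sketch below maps the reserved codes to plain-language explanations; the helper name `explain_exit_code` is hypothetical and not part of `voxelops`.

```python
# Exit codes that `docker run` reserves for its own failures (Docker CLI reference).
DOCKER_RUN_EXIT_CODES = {
    125: "Docker itself failed (for example a daemon error or an image that could not be pulled)",
    126: "the container command was found but could not be invoked",
    127: "the container command was not found",
}


def explain_exit_code(code):
    """Describe an exit status returned by `docker run`."""
    if code == 0:
        return "success"
    # Any other value is the exit status of the tool running inside the container.
    return DOCKER_RUN_EXIT_CODES.get(code, f"the containerized tool exited with status {code}")


print(explain_exit_code(125))
```

Because the failure happened before the container started, the fix lies in the image reference (the `docker_image` setting) or registry access, not in the QSIParc inputs.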


### Option 2: Override Resources

In [None]:
# Run with specific resources
result = run_qsiparc(
    inputs,
    fs_license=fs_license,
    nprocs=16,
    mem_gb=32,
)

print(f"Success: {result['success']}")

### Option 3: Custom Configuration

In [None]:
# Create custom configuration
config = QSIParcDefaults(
    nprocs=12,
    mem_gb=24,
    skip_bids_validation=False,
    fs_license=fs_license,
    docker_image="pennlinc/qsiparc:latest",
)

result = run_qsiparc(inputs, config)

print(f"Success: {result['success']}")

## Inspect Execution Record

In [None]:
print("Execution Details:")
print(f"  Tool: {result['tool']}")
print(f"  Participant: {result['participant']}")
print(f"  Duration: {result['duration_human']}")
print(f"  Success: {result['success']}")

print("\nConfiguration Used:")
config_used = result["config"]
print(f"  Cores: {config_used.nprocs}")
print(f"  Memory: {config_used.mem_gb}GB")
print(f"  Docker image: {config_used.docker_image}")

## Check Expected Outputs

In [None]:
outputs = result["expected_outputs"]

print("Expected Output Locations:")
print(f"  QSIParc directory: {outputs.qsiparc_dir}")
print(f"  Participant directory: {outputs.participant_dir}")
print(f"  Work directory: {outputs.work_dir}")

# Verify outputs
print("\nOutput Validation:")
print(f"  Participant dir exists: {outputs.participant_dir.exists()}")

## Explore Parcellation Outputs

In [None]:
if outputs.participant_dir.exists():
    print(f"Parcellation outputs for {participant}:\n")

    # List all files
    parcellation_maps = []
    statistics = []
    other = []

    for f in outputs.participant_dir.rglob("*"):
        if f.is_file():
            # Path.suffix is ".gz" for "x.nii.gz", so match on the full file name
            if f.name.endswith(".nii.gz"):
                parcellation_maps.append(f)
            elif f.suffix in [".csv", ".tsv"]:
                statistics.append(f)
            else:
                other.append(f)

    print(f"Parcellation Maps ({len(parcellation_maps)}):")
    for f in sorted(parcellation_maps):
        print(f"  {f.name}")

    print(f"\nStatistics Files ({len(statistics)}):")
    for f in sorted(statistics):
        print(f"  {f.name}")

    if other:
        print(f"\nOther Files ({len(other)}):")
        for f in sorted(other)[:5]:  # Show first 5
            print(f"  {f.name}")
else:
    print("Participant directory not found")
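
Compressed NIfTI files have a two-part extension, and `Path.suffix` returns only the last part (`.gz`), so file-type checks are more reliable on the full file name. A minimal sketch, with `classify_output` as a hypothetical helper name:

```python
from pathlib import Path


def classify_output(path):
    """Classify a file as 'image', 'table', or 'other' from its full name."""
    name = Path(path).name.lower()
    # str.endswith accepts a tuple, which handles multi-part extensions cleanly.
    if name.endswith((".nii", ".nii.gz")):
        return "image"
    if name.endswith((".csv", ".tsv")):
        return "table"
    return "other"


print(Path("atlas.nii.gz").suffix)    # .gz
print(Path("atlas.nii.gz").suffixes)  # ['.nii', '.gz']
print(classify_output("atlas.nii.gz"))  # image
```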

## Load and Analyze Regional Statistics

In [None]:
# Find statistics files
stats_files = list(outputs.participant_dir.rglob("*.csv"))

if stats_files:
    # Load first statistics file
    stats_file = stats_files[0]
    print(f"Loading: {stats_file.name}\n")

    # Load data
    stats_df = pd.read_csv(stats_file)

    print(f"Shape: {stats_df.shape}")
    print(f"Columns: {list(stats_df.columns)}")
    print("\nFirst few rows:")
    display(stats_df.head())

    # Summary statistics
    print("\nSummary Statistics:")
    display(stats_df.describe())
else:
    print("No statistics files found")
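
The file listing earlier counts both `.csv` and `.tsv` files as statistics, while the loader above reads only `.csv`. If the outputs turn out to include tab-separated tables, a delimiter-aware reader such as the sketch below could be used; `load_stats_table` is a hypothetical helper, and whether QSIParc writes `.tsv` files at all is an assumption to verify against the actual output.

```python
from pathlib import Path

import pandas as pd


def load_stats_table(path):
    """Read a .csv or .tsv table, choosing the delimiter from the file extension."""
    path = Path(path)
    sep = "\t" if path.suffix.lower() == ".tsv" else ","
    return pd.read_csv(path, sep=sep)
```

To collect both kinds of file, one option is `[p for p in outputs.participant_dir.rglob("*") if p.suffix in {".csv", ".tsv"}]`.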

## Visualize Regional Metrics

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

if stats_files and "stats_df" in locals():
    # Assume there's a 'region' and some metric columns
    # Adjust based on actual column names

    # Example: Plot distribution of FA values across regions
    if "FA" in stats_df.columns:
        plt.figure(figsize=(12, 6))

        # Histogram
        plt.subplot(1, 2, 1)
        plt.hist(stats_df["FA"].dropna(), bins=30, edgecolor="black")
        plt.xlabel("Fractional Anisotropy (FA)")
        plt.ylabel("Number of Regions")
        plt.title("Distribution of FA Across Regions")
        plt.grid(alpha=0.3)

        # Box plot
        plt.subplot(1, 2, 2)
        plt.boxplot(stats_df["FA"].dropna())
        plt.ylabel("Fractional Anisotropy (FA)")
        plt.title("FA Distribution Summary")
        plt.grid(alpha=0.3)

        plt.tight_layout()
        plt.show()

        print("\nFA Statistics:")
        print(f"  Mean: {stats_df['FA'].mean():.3f}")
        print(f"  Std: {stats_df['FA'].std():.3f}")
        print(f"  Min: {stats_df['FA'].min():.3f}")
        print(f"  Max: {stats_df['FA'].max():.3f}")

    # If there are multiple metrics, create correlation matrix
    numeric_cols = stats_df.select_dtypes(include="number").columns
    if len(numeric_cols) > 1:
        plt.figure(figsize=(10, 8))
        correlation = stats_df[numeric_cols].corr()
        sns.heatmap(correlation, annot=True, fmt=".2f", cmap="coolwarm", center=0)
        plt.title("Correlation Matrix of Diffusion Metrics")
        plt.tight_layout()
        plt.show()
else:
    print("No data available for visualization")

## Batch Processing

In [None]:
# Get list of participants from QSIRecon directory
participant_dirs = sorted(qsirecon_dir.glob("sub-*"))
participants = [d.name.replace("sub-", "") for d in participant_dirs if d.is_dir()]

print(f"Found {len(participants)} participants: {participants}\n")

config = QSIParcDefaults(
    nprocs=12,
    mem_gb=24,
    fs_license=fs_license,
)

results = []

for participant in participants:
    print(f"Processing participant {participant}...")

    inputs = QSIParcInputs(
        qsirecon_dir=qsirecon_dir,
        participant=participant,
    )

    try:
        result = run_qsiparc(inputs, config)
        results.append(result)
        print(f"  ✓ Success in {result['duration_human']}\n")
    except Exception as e:
        print(f"  ✗ Failed: {e}\n")
        results.append(
            {
                "participant": participant,
                "success": False,
                "error": str(e),
            }
        )

# Summary
successful = sum(1 for r in results if r.get("success"))
print(f"\n{'='*60}")
print(f"Processed {len(results)} participants:")
print(f"  ✓ Successful: {successful}")
print(f"  ✗ Failed: {len(results) - successful}")
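
The batch results can also be written to disk for later review. Execution records may hold objects that are not JSON-serializable (such as the configuration), so this sketch keeps only plain fields that appear in this notebook. `summarize_results` is a hypothetical helper, and the example records are stand-ins rather than real run output.

```python
import json


def summarize_results(results):
    """Keep only plain, JSON-friendly fields from each execution record."""
    fields = ("participant", "success", "duration_human", "error")
    return [{k: r[k] for k in fields if k in r} for r in results]


# Stand-in records shaped like the `results` list built in the loop above.
example = [
    {"participant": "01", "success": True, "duration_human": "12m 3s", "config": object()},
    {"participant": "02", "success": False, "error": "exit code 125"},
]
print(json.dumps(summarize_results(example), indent=2))
```

In the notebook, `summarize_results(results)` could be passed to `json.dump` with a file opened under `output_dir`.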

## Combine Statistics Across Participants

In [None]:
# Collect statistics from all successful runs
all_stats = []

for result in results:
    if result.get("success"):
        participant = result["participant"]
        outputs = result["expected_outputs"]

        # Find statistics files
        stats_files = list(outputs.participant_dir.rglob("*.csv"))

        for stats_file in stats_files:
            df = pd.read_csv(stats_file)
            df["participant"] = participant
            all_stats.append(df)

if all_stats:
    # Combine all dataframes
    combined_stats = pd.concat(all_stats, ignore_index=True)

    print(f"Combined statistics: {combined_stats.shape}")
    print("\nFirst few rows:")
    display(combined_stats.head())

    # Save to file
    output_file = output_dir / "combined_regional_statistics.csv"
    output_file.parent.mkdir(parents=True, exist_ok=True)
    combined_stats.to_csv(output_file, index=False)
    print(f"\nSaved combined statistics to: {output_file}")
else:
    print("No statistics to combine")
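
Once the tables are combined, a per-region summary across participants is a natural first analysis. The sketch below assumes the combined table has a `region` column and an `FA` column, matching the assumption made in the visualization cell; the real column names depend on the QSIParc output and may differ.

```python
import pandas as pd


def summarize_by_region(combined, region_col="region", metric_col="FA"):
    """Mean, standard deviation, and count of one metric per region."""
    missing = {region_col, metric_col} - set(combined.columns)
    if missing:
        raise KeyError(f"Missing columns: {sorted(missing)}")
    return (
        combined.groupby(region_col)[metric_col]
        .agg(["mean", "std", "count"])
        .reset_index()
    )


# Toy table standing in for `combined_stats`.
toy = pd.DataFrame(
    {
        "participant": ["01", "01", "02", "02"],
        "region": ["A", "B", "A", "B"],
        "FA": [0.4, 0.3, 0.6, 0.5],
    }
)
print(summarize_by_region(toy))
```

Because the loop above concatenates every CSV found for a participant, rows from different atlases may share region names; filtering to a single atlas before grouping avoids mixing them.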

## Error Handling

In [None]:
from voxelops.exceptions import (
    ProcedureExecutionError,
    InputValidationError,
)

try:
    result = run_qsiparc(
        inputs,
        fs_license=fs_license,
    )
    print(f"Success: {result['success']}")

except InputValidationError as e:
    print(f"Input validation failed: {e}")
    print("Common issues:")
    print("  - QSIRecon directory doesn't exist")
    print("  - Participant not found in QSIRecon output")

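except ProcedureExecutionError as e:
    print(f"Execution failed: {e}")
    # `result` is not assigned when run_qsiparc raises, so point to the printed log path instead
    print("Check the 'Execution log saved:' path printed during the run")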

except Exception as e:
    print(f"Unexpected error: {e}")

## Next Steps

After parcellation:

1. Analyze regional statistics across subjects
2. Compare metrics between groups (e.g., patients vs controls)
3. Correlate with behavioral or clinical measures
4. Visualize parcellation overlays on structural images
5. Export to statistical analysis software (R, SPSS, etc.)

## Tips

- **Atlas choice**: Different atlases provide different regional granularity; the run logged above, for example, requested both `schaefer100` and `schaefer200`
- **Quality control**: Check parcellation overlays visually
- **Statistics**: Export to CSV for easy analysis in R, Python, or SPSS
- **Batch processing**: Process all subjects with the same configuration for consistency
- **Version control**: Pin Docker image versions for reproducibility; a `latest` tag, as used in the examples above, can point to different images over time
- **Data organization**: Maintain a consistent directory structure across subjects
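
The pinning tip can be enforced with a small check on the image reference before a batch starts. This is a sketch of one reasonable rule (a digest, or any tag other than `latest`, counts as pinned); `is_pinned` is a hypothetical helper and the example tags are illustrative only.

```python
def is_pinned(image):
    """True if the image reference uses a digest or a tag other than 'latest'."""
    if "@" in image:
        return True  # digest reference such as repo@sha256:...
    # Look only at the last path component so a registry port (host:5000/...) is not mistaken for a tag.
    last = image.rsplit("/", 1)[-1]
    if ":" not in last:
        return False  # no tag: Docker falls back to 'latest'
    return last.rsplit(":", 1)[1] != "latest"


print(is_pinned("pennlinc/qsiparc:latest"))  # False
print(is_pinned("example/tool:1.2.3"))       # True
```

An image flagged as unpinned can be replaced with a specific tag or digest in `QSIParcDefaults(docker_image=...)`.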