# QSIPrep Validation - Comprehensive Guide

This notebook shows how to validate inputs before, and outputs after, a QSIPrep diffusion MRI preprocessing run.

## What QSIPrepValidator Checks

### Pre-validation (Before Running)
1. **BIDS directory exists** - `/data/bids` must exist
2. **Participant exists** - `/data/bids/sub-{participant}` must exist
3. **DWI files exist** - Pattern: `dwi/*_dwi.nii.gz` (minimum 1)
4. **B-value files exist** - Pattern: `dwi/*_dwi.bval` (minimum 1)
5. **B-vector files exist** - Pattern: `dwi/*_dwi.bvec` (minimum 1)
6. **T1w anatomical exists** - Pattern: `anat/*_T1w.nii.gz` (minimum 1)
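
The checks listed above amount to directory tests and glob patterns. The block below is a minimal standard-library sketch of that idea, not the `QSIPrepValidator` implementation: `REQUIRED_PATTERNS` and `check_participant` are hypothetical names, the patterns are applied directly under the participant directory, and session subdirectories are ignored.

```python
import tempfile
from pathlib import Path

# Hypothetical mirror of the file-pattern checks listed above.
REQUIRED_PATTERNS = {
    "DWI files": "dwi/*_dwi.nii.gz",
    "B-value files": "dwi/*_dwi.bval",
    "B-vector files": "dwi/*_dwi.bvec",
    "T1w anatomical": "anat/*_T1w.nii.gz",
}


def check_participant(bids_dir: Path, participant: str) -> dict[str, bool]:
    """Return {check name: passed} for one participant (illustrative only)."""
    subject_dir = bids_dir / f"sub-{participant}"
    checks = {
        "BIDS directory": bids_dir.is_dir(),
        "Participant directory": subject_dir.is_dir(),
    }
    for name, pattern in REQUIRED_PATTERNS.items():
        # A pattern passes when at least one file matches (minimum 1).
        checks[name] = subject_dir.is_dir() and any(subject_dir.glob(pattern))
    return checks


# Build a throwaway BIDS tree that lacks the .bvec file.
with tempfile.TemporaryDirectory() as tmp:
    bids = Path(tmp) / "bids"
    (bids / "sub-001" / "dwi").mkdir(parents=True)
    (bids / "sub-001" / "anat").mkdir()
    for filename in ("sub-001_dwi.nii.gz", "sub-001_dwi.bval"):
        (bids / "sub-001" / "dwi" / filename).touch()
    (bids / "sub-001" / "anat" / "sub-001_T1w.nii.gz").touch()
    demo_checks = check_participant(bids, "001")

for name, passed in demo_checks.items():
    print(f"{'✓' if passed else '✗'} {name}")
```

In this demo tree only the B-vector check fails, which is the kind of gap pre-validation is meant to report before a run starts.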

### Post-validation (After Running)
1. **QSIPrep output directory created** - `/derivatives/qsiprep` exists
2. **Participant output directory created** - `/derivatives/qsiprep/sub-{participant}` exists
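
The two post-validation checks can be pictured as plain directory tests. This is a sketch under assumptions, not the library's code: `missing_outputs` is a hypothetical helper, and it assumes the outputs live in a `qsiprep` folder under whatever derivatives directory is in use.

```python
import tempfile
from pathlib import Path


def missing_outputs(output_dir: Path, participant: str) -> list[Path]:
    """Return the expected output directories that do not exist (illustrative)."""
    qsiprep_dir = output_dir / "qsiprep"
    expected = [qsiprep_dir, qsiprep_dir / f"sub-{participant}"]
    return [path for path in expected if not path.is_dir()]


with tempfile.TemporaryDirectory() as tmp:
    derivatives = Path(tmp)
    # Simulate a run that created the top-level folder and then stopped.
    (derivatives / "qsiprep").mkdir()
    missing_names = [path.name for path in missing_outputs(derivatives, "001")]

print(missing_names)
```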

In [None]:
from pathlib import Path
from voxelops import run_procedure, QSIPrepInputs, QSIPrepDefaults

# Your data paths
bids_dir = Path("/data/bids")
output_dir = Path("/data/derivatives")
log_dir = Path("/data/logs")

## Example 1: Basic QSIPrep Run with Validation

In [None]:
# Define inputs
inputs = QSIPrepInputs(
    bids_dir=bids_dir,
    participant="001",
    session="01",  # Optional
)

# Run with validation
result = run_procedure("qsiprep", inputs, log_dir=log_dir)

# Check result
if result.success:
    print("✓ QSIPrep completed successfully!")
    print(f"  Duration: {result.duration_seconds:.1f}s")
    print(f"  Output: {result.execution['expected_outputs'].qsiprep_dir}")
else:
    print(f"✗ QSIPrep failed: {result.get_failure_reason()}")
    print(f"  Status: {result.status}")

## Example 2: Detailed Pre-validation Inspection

See exactly what checks are performed before running:

In [None]:
# Get pre-validation report
pre_report = result.pre_validation

print("Pre-validation Summary:")
print(f"  Status: {'PASSED' if pre_report.passed else 'FAILED'}")
print(f"  Total checks: {len(pre_report.results)}")
print(f"  Errors: {len(pre_report.errors)}")
print(f"  Warnings: {len(pre_report.warnings)}")
print(f"  Passed: {len(pre_report.passed_checks)}")

print("\n" + "="*50)
print("Detailed Results:\n")

# Show each check (named `check` so the loop does not overwrite `result`)
for i, check in enumerate(pre_report.results, 1):
    status = "✓" if check.passed else "✗"
    print(f"{i}. {status} {check.rule_description}")
    print(f"   Message: {check.message}")
    
    # Show details for failed checks
    if not check.passed and check.details:
        print("   Details:")
        for key, value in check.details.items():
            print(f"     {key}: {value}")
    print()

## Example 3: Handling Common Validation Failures

### Scenario A: Missing BIDS Directory

In [None]:
# This will fail pre-validation
bad_inputs = QSIPrepInputs(
    bids_dir=Path("/data/nonexistent"),
    participant="001",
)

result = run_procedure("qsiprep", bad_inputs, log_dir=log_dir)

if result.status == "pre_validation_failed":
    print("Pre-validation failed as expected!\n")
    
    # Find the specific error
    for error in result.pre_validation.errors:
        if "BIDS directory" in error.message:
            details = error.details or {}  # details may be empty
            print(f"Error: {error.message}")
            print(f"Path checked: {details.get('path')}")
            print("\nAction: Create the BIDS directory or fix the path")

### Scenario B: Missing DWI Files

In [None]:
# Participant exists but has no DWI data
inputs = QSIPrepInputs(
    bids_dir=bids_dir,
    participant="999",  # Participant with no DWI
)

result = run_procedure("qsiprep", inputs, log_dir=log_dir)

if result.status == "pre_validation_failed":
    # Report which DWI-related files are missing
    dwi_terms = ("DWI files", "b-value", "b-vector")
    for error in result.pre_validation.errors:
        if any(term in error.message for term in dwi_terms):
            details = error.details or {}  # details may be empty
            print(f"Missing: {error.rule_description}")
            print(f"  Pattern searched: {details.get('pattern')}")
            print(f"  In directory: {details.get('search_dir')}")
            print(f"  Files found: {details.get('found_count', 0)}")
            print()

## Example 4: Custom Configuration

Use custom defaults and overrides:

In [None]:
# Define custom configuration
config = QSIPrepDefaults(
    nprocs=16,
    mem_mb=32000,
    output_resolution=1.5,
    denoise_method="dwidenoise",
)

inputs = QSIPrepInputs(
    bids_dir=bids_dir,
    participant="001",
)

# Run with custom config
result = run_procedure(
    "qsiprep",
    inputs,
    config=config,
    log_dir=log_dir,
)

print(f"Status: {result.status}")

## Example 5: Batch Processing with Validation

In [None]:
# Process multiple participants
participants = ["001", "002", "003", "004", "005"]
results = []

for participant in participants:
    print(f"\nProcessing sub-{participant}...")
    
    inputs = QSIPrepInputs(
        bids_dir=bids_dir,
        participant=participant,
    )
    
    result = run_procedure("qsiprep", inputs, log_dir=log_dir)
    results.append(result)
    
    if result.success:
        print(f"  ✓ Success ({result.duration_seconds:.1f}s)")
    elif result.status == "pre_validation_failed":
        print(f"  ✗ Skipped: {result.get_failure_reason()}")
    else:
        print(f"  ✗ Failed: {result.get_failure_reason()}")

# Summary
successful = sum(1 for r in results if r.success)
pre_val_failed = sum(1 for r in results if r.status == "pre_validation_failed")
exec_failed = sum(1 for r in results if r.status == "execution_failed")
post_val_failed = sum(1 for r in results if r.status == "post_validation_failed")

print("\n" + "="*50)
print("Batch Summary:")
print(f"  Total: {len(results)}")
print(f"  Successful: {successful}")
print(f"  Pre-validation failures: {pre_val_failed}")
print(f"  Execution failures: {exec_failed}")
print(f"  Post-validation failures: {post_val_failed}")
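
The per-status sums above can also be collapsed into a single tally, which keeps working if other status values appear. The sketch below uses stand-in objects rather than real `run_procedure` results; the `"success"` status string is a placeholder assumed for the demo.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in results; real ones would come from run_procedure.
# "success" is an assumed placeholder status, not a documented value.
demo_results = [
    SimpleNamespace(participant="001", status="success"),
    SimpleNamespace(participant="002", status="pre_validation_failed"),
    SimpleNamespace(participant="003", status="success"),
    SimpleNamespace(participant="004", status="execution_failed"),
    SimpleNamespace(participant="005", status="pre_validation_failed"),
]

# Count how many runs ended in each status.
status_counts = Counter(r.status for r in demo_results)

for status, count in status_counts.most_common():
    print(f"{status}: {count}")
```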

## Example 6: Saving Results to Database

In [None]:
import json

def save_result_to_db(result, db_connection=None):
    """Example: save a result to MongoDB (optional) and to a JSON file."""
    # Convert to dict
    data = result.to_dict()
    
    # Example: MongoDB. pymongo objects do not support truth testing,
    # so compare with None. insert_one() adds an "_id" key to the document
    # it receives; passing a copy keeps `data` JSON-serializable.
    if db_connection is not None:
        db_connection.qsiprep_runs.insert_one(dict(data))
    
    # Example: JSON file
    filename = f"qsiprep_sub-{result.participant}_{result.run_id}.json"
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    
    print(f"Saved: {filename}")
    return data

# Save a result
if result:
    saved_data = save_result_to_db(result)
    
    # Show what's saved
    print("\nSaved keys:", list(saved_data.keys()))
    print(f"\nValidation data included: {bool(saved_data['pre_validation'])}")
    print(f"Audit log referenced: {saved_data['audit_log_file']}")
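
If the dictionary being saved contains values such as `Path` or `datetime` objects, `json.dump` raises `TypeError`. Whether `to_dict()` returns such values is not stated here, so the sketch below uses a hypothetical `record` with made-up keys to show one common workaround: passing `default=str`.

```python
import json
from datetime import datetime
from pathlib import PurePosixPath

# Hypothetical payload shaped like a result dictionary; real keys may differ.
record = {
    "participant": "001",
    "status": "pre_validation_failed",
    "started_at": datetime(2024, 1, 1, 12, 0, 0),
    "audit_log_file": PurePosixPath("/data/logs/example.jsonl"),
}

# default=str is called only for objects json cannot encode natively.
encoded = json.dumps(record, indent=2, default=str)
decoded = json.loads(encoded)

print(decoded["started_at"])
print(decoded["audit_log_file"])
```

The trade-off is that the converted values come back as plain strings, so any code that reads the file must parse them again if it needs the original types.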

## Example 7: Querying Failed Runs

Use validation data to understand failures:

In [None]:
# Collect the failed results from the batch in Example 5
failed_results = [r for r in results if not r.success]

# Group participant IDs by failure reason
failure_categories = {}

# Loop names differ from `result` and `participants` so earlier variables survive
for failed in failed_results:
    reason = failed.get_failure_reason()
    failure_categories.setdefault(reason, []).append(failed.participant)

# Show summary
print("Failure Analysis:\n")
for reason, failed_ids in failure_categories.items():
    print(f"{len(failed_ids)} participant(s): {reason}")
    print(f"  Participants: {', '.join(failed_ids)}")
    print()

## Example 8: Using Validators Directly (Advanced)

You can use validators independently for custom workflows:

In [None]:
from voxelops import QSIPrepValidator, ValidationContext

# Create validator
validator = QSIPrepValidator()

# Create context
inputs = QSIPrepInputs(
    bids_dir=bids_dir,
    participant="001",
)

context = ValidationContext(
    procedure_name="qsiprep",
    participant="001",
    inputs=inputs,
)

# Run just pre-validation
pre_report = validator.validate_pre(context)
print(f"Pre-validation: {'PASSED' if pre_report.passed else 'FAILED'}")

# List the rules the validator applies
print(f"Pre-validation rules: {len(validator.pre_rules)}")
for i, rule in enumerate(validator.pre_rules, 1):
    print(f"{i}. {rule.description}")

print(f"\nPost-validation rules: {len(validator.post_rules)}")
for i, rule in enumerate(validator.post_rules, 1):
    print(f"{i}. {rule.description}")

## Troubleshooting Guide

### Common Pre-validation Failures

| Error | Cause | Solution |
|-------|-------|----------|
| BIDS directory not found | Path incorrect | Check `bids_dir` path |
| Participant not found | Missing sub-directory | Verify participant ID, check BIDS structure |
| No DWI files found | Missing data | Check `dwi/` subdirectory, verify BIDS naming |
| Missing bval/bvec | Incomplete data | Ensure .bval and .bvec files exist alongside .nii.gz |
| No T1w found | Missing anatomical | Check `anat/` subdirectory |
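
For the bval/bvec row, it can help to see which image lacks which sidecar. The helper below is a hypothetical diagnostic, separate from the validator, and it assumes the sidecars sit next to each image with the same filename stem.

```python
import tempfile
from pathlib import Path


def missing_sidecars(dwi_dir: Path) -> dict[str, list[str]]:
    """Map each DWI image name to the sidecar suffixes it lacks."""
    problems = {}
    for image in sorted(dwi_dir.glob("*_dwi.nii.gz")):
        stem = image.name[: -len(".nii.gz")]
        missing = [
            suffix
            for suffix in (".bval", ".bvec")
            if not (dwi_dir / f"{stem}{suffix}").is_file()
        ]
        if missing:
            problems[image.name] = missing
    return problems


# Demo: run-1 is complete, run-2 lacks its .bvec file.
with tempfile.TemporaryDirectory() as tmp:
    dwi_dir = Path(tmp)
    for filename in (
        "sub-001_run-1_dwi.nii.gz",
        "sub-001_run-1_dwi.bval",
        "sub-001_run-1_dwi.bvec",
        "sub-001_run-2_dwi.nii.gz",
        "sub-001_run-2_dwi.bval",
    ):
        (dwi_dir / filename).touch()
    demo_problems = missing_sidecars(dwi_dir)

print(demo_problems)
```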

### Common Post-validation Failures

| Error | Cause | Solution |
|-------|-------|----------|
| Output directory not created | Execution failed | Check execution logs |
| Participant directory missing | Partial failure (QSIPrep may have crashed) | Check execution and audit logs for the point of failure |

### Performance Tips

1. **Pre-validation catches issues early** - a missing file is reported before QSIPrep starts, not partway through a long run
2. **Use batch processing** - loop over participants and collect one result per run (Example 5)
3. **Save results to a database** - stored results can be queried and analyzed later (Example 6)
4. **Check audit logs** - they give a detailed timeline of what happened
5. **Don't skip validation** - unless you have a specific reason