# Demo: Multi-echo preprocessing in Neurodesk (fMRIPrep + tedana)

This notebook is a **short, guided demo** that mirrors the style of our lab handouts. It shows how to:

1. Install an OpenNeuro dataset via **DataLad**
2. Download **only one participant**: `sub-10317`
3. Run **fMRIPrep** (multi-echo) in **Neurodesk**
4. Run **tedana** for multi-echo denoising on **one run**
5. Do a quick QC pass and locate the key outputs

---

## Learning objectives

By the end, you should be able to:

- Install a BIDS dataset with **DataLad**, then download only a target subject
- Run **fMRIPrep** for a single participant with multi-echo settings
- Identify the echo-wise preprocessed outputs produced by fMRIPrep
- Install and run **tedana** and interpret what it produces at a high level
- Perform basic QC using the fMRIPrep HTML report and quick file checks

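> Vocabulary reminder: in fMRI, **volume**, **time point**, and **measurement** are often used interchangeably. **TR** (repetition time) is strictly the time between successive volumes, but it is often used loosely to mean one volume.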

---

## Data and scope

- Dataset: OpenNeuro `ds005123`  
- Subject: **sub-10317 only**  
- Focus: **preprocessing + multi-echo denoising**, not statistical modeling  
- Environment: **Neurodesk / Neurodesktop**

**Important:** Do not write outputs into the dataset directory (`~/ds005123/`). Save everything into a separate working folder.

---
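
One way to enforce the rule above is a small guard that refuses to continue when an output folder sits inside the dataset. This is a minimal sketch; the helper name `is_inside` is made up for illustration, and the two paths are the defaults used in this notebook.

```python
from pathlib import Path


def is_inside(path: Path, parent: Path) -> bool:
    """Return True if `path` is `parent` itself or lies anywhere below it."""
    path = path.expanduser().resolve()
    parent = parent.expanduser().resolve()
    return path == parent or parent in path.parents


dataset = Path("~/ds005123")
outputs = Path("~/Lab_ME_fmriprep_tedana/derivatives/fmriprep")

if is_inside(outputs, dataset):
    raise ValueError(f"{outputs} is inside the dataset; choose another folder.")
print("Output folder is outside the dataset.")
```

Comparing resolved paths (rather than raw strings) avoids false matches such as `~/ds005123_outputs`, which merely shares a prefix with the dataset folder.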


## Before you begin

### Paths (inputs + outputs)

We will use these defaults. You can change them, but keep them consistent throughout the notebook.

- **Dataset (DataLad will create this):** `~/ds005123/`
- **All demo outputs live here:** `~/Lab_ME_fmriprep_tedana/`
  - fMRIPrep outputs: `~/Lab_ME_fmriprep_tedana/derivatives/fmriprep/`
  - fMRIPrep working directory: `~/Lab_ME_fmriprep_tedana/work/`
  - tedana outputs: `~/Lab_ME_fmriprep_tedana/derivatives/tedana/`

> **At this point, you should begin working on your own.**  
> TAs and the instructor are available for help as needed.

---


In [None]:
import os
from pathlib import Path

# --------
# Settings
# --------
SUB = "10317"  # participant label (without "sub-")

HOME = Path.home()
BIDS = HOME / "ds005123"
DEMO = HOME / "Lab_ME_fmriprep_tedana"

FMRIPREP_OUT = DEMO / "derivatives" / "fmriprep"
FMRIPREP_WORK = DEMO / "work"
TEDANA_OUT = DEMO / "derivatives" / "tedana"

# Make these available to shell commands in this notebook
os.environ["SUB"] = SUB
os.environ["BIDS"] = str(BIDS)
os.environ["DEMO"] = str(DEMO)
os.environ["FMRIPREP_OUT"] = str(FMRIPREP_OUT)
os.environ["FMRIPREP_WORK"] = str(FMRIPREP_WORK)
os.environ["TEDANA_OUT"] = str(TEDANA_OUT)

print("SUB =", SUB)
print("BIDS =", BIDS)
print("DEMO =", DEMO)


In [None]:
# Create output folders
!mkdir -p "$FMRIPREP_OUT" "$FMRIPREP_WORK" "$TEDANA_OUT"
!ls -ld "$DEMO" "$FMRIPREP_OUT" "$FMRIPREP_WORK" "$TEDANA_OUT"


# 1) Install the dataset with DataLad and download only sub-10317

We will install the dataset repository, then request only the files for `sub-10317`.

> Checkpoint: after `datalad get sub-10317`, you should see a `sub-10317/func/` folder with multi-echo BOLD files.

---
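
If you want to confirm that file *content* actually arrived, the sketch below looks for dangling symlinks. It assumes the dataset uses git-annex's default symlink layout, where a file that has not been fetched yet is a link whose target is missing; the helper name `find_missing_content` is hypothetical.

```python
from pathlib import Path


def find_missing_content(folder: Path) -> list:
    """Return symlinks under `folder` whose target does not exist."""
    missing = []
    for p in sorted(folder.rglob("*")):
        # A dangling symlink is a link whose target is absent.
        if p.is_symlink() and not p.exists():
            missing.append(p)
    return missing


subject_dir = Path.home() / "ds005123" / "sub-10317"
if subject_dir.is_dir():
    missing = find_missing_content(subject_dir)
    print(f"{len(missing)} file(s) still have no content")
else:
    print("Subject folder not found yet.")
```

An empty result suggests that every file under the subject folder has its content available locally.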


In [None]:
# Install the dataset (this creates ~/ds005123 if it doesn't exist)
!cd ~ && datalad install https://github.com/OpenNeuroDatasets/ds005123.git

# Download only this participant
!cd "$BIDS" && datalad get "sub-$SUB"

# Quick sanity check
!ls -lh "$BIDS/sub-$SUB" | head -n 50
!ls -lh "$BIDS/sub-$SUB/func" | head -n 50


# 2) Pick one multi-echo run for the tedana demo

For this demo, we will run tedana on **one run** (one series), to keep runtime reasonable.

We will automatically select the first echo-1 file we find. You can override it if you prefer a specific run.

---
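
If you would rather choose a run by name, it can help to split BIDS filenames into their `key-value` entities and group files that differ only in `echo`. The sketch below does that with plain string operations; the function names and the example filename (including the task label) are invented for illustration.

```python
def parse_entities(filename: str) -> dict:
    """Split a BIDS filename into its key-value entities plus the suffix."""
    stem = filename.split(".", 1)[0]      # drop ".nii.gz" or ".json"
    *pairs, suffix = stem.split("_")      # the last chunk is the suffix, e.g. "bold"
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    return entities


def run_key(filename: str) -> str:
    """Rebuild the filename stem without the echo entity and suffix."""
    entities = parse_entities(filename)
    return "_".join(
        f"{key}-{value}"
        for key, value in entities.items()
        if key not in ("echo", "suffix")
    )


# Hypothetical filename, used only to show the behaviour
example = "sub-10317_task-example_run-1_echo-2_bold.nii.gz"
print(parse_entities(example))
print(run_key(example))
```

Files that return the same `run_key` belong to the same multi-echo run.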


In [None]:
import glob

echo1_candidates = sorted(glob.glob(str(BIDS / f"sub-{SUB}/func/*echo-1*_bold.nii.gz")))
if not echo1_candidates:
    raise RuntimeError("No echo-1 BOLD files found. Check that the dataset downloaded correctly.")

RAW_ECHO1 = echo1_candidates[0]  # default choice
print("Default RAW_ECHO1:")
print(" ", RAW_ECHO1)

# BASE is everything before the echo entity
# (this also works if other entities, e.g. "_part-mag", follow "_echo-1")
BASE = Path(RAW_ECHO1).name.split("_echo-1", 1)[0]
print("\nBASE:")
print(" ", BASE)

# List all raw echoes for this run
raw_echos = sorted(glob.glob(str(BIDS / f"sub-{SUB}/func/{BASE}_echo-*_bold.nii.gz")))
print("\nRaw echo files:")
for p in raw_echos:
    print(" ", p)

os.environ["BASE"] = BASE


## Extract echo times (TEs) from JSON sidecars

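tedana needs one echo time per echo image, in the same order as the images. BIDS sidecars store `EchoTime` in **seconds**, and those are the values this demo passes. Confirm the unit your installed version expects for `-e` with `tedana --help`.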

> Checkpoint: you should see one EchoTime value per echo.

---
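
A quick plausibility check on the echo times can catch a wrong file order or a unit mix-up before tedana runs. The sketch below is illustrative only: the function name and the "between 0 and 1" heuristic for seconds are assumptions, and the millisecond conversion is only useful if your tedana version asks for milliseconds.

```python
def validate_echo_times(tes_seconds):
    """Sanity-check TEs read from BIDS sidecars; return them converted to ms."""
    if len(tes_seconds) < 2:
        raise ValueError("Multi-echo data needs at least two echo times.")
    # BIDS stores EchoTime in seconds, so values of 1 or more suggest milliseconds.
    if any(te <= 0 or te >= 1 for te in tes_seconds):
        raise ValueError(f"Values do not look like seconds: {tes_seconds}")
    # Echo 1 must have the shortest TE, echo 2 the next, and so on.
    if any(later <= earlier for earlier, later in zip(tes_seconds, tes_seconds[1:])):
        raise ValueError(f"Echo times are not strictly increasing: {tes_seconds}")
    return [round(te * 1000, 3) for te in tes_seconds]


# Echo times taken from the example tedana script in the appendix
tes_ms = validate_echo_times([0.0138, 0.03154, 0.04928, 0.06702])
print(tes_ms)
```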


In [None]:
import json

json_paths = sorted(glob.glob(str(BIDS / f"sub-{SUB}/func/{BASE}_echo-*_bold.json")))
if not json_paths:
    raise RuntimeError("No JSON sidecars found for the selected run.")

tes = []
print("EchoTime values:")
for jp in json_paths:
    with open(jp, "r") as f:
        meta = json.load(f)
    te = meta.get("EchoTime", None)
    tes.append(te)
    print(f"  {Path(jp).name}: {te}")

# Store TEs for later use
if any(te is None for te in tes):
    raise RuntimeError("At least one EchoTime is missing. Check the JSON files.")

os.environ["TE_LIST"] = " ".join(str(te) for te in tes)
print("\nTE_LIST =", os.environ["TE_LIST"])


# 3) Run fMRIPrep (multi-echo) for sub-10317

### Key choices for this demo

- We run a single participant: `--participant-label 10317`
- We **skip FreeSurfer recon-all** to reduce runtime: `--fs-no-reconall` (fMRIPrep still needs a FreeSurfer license file even with this flag)
- We request echo-wise outputs: `--me-output-echos`
- We write outputs into our demo folder (not into the dataset itself)

> If `fmriprep` is not found in this notebook environment, run the same command in Neurodesk via:  
> **Neurodesk menu → fMRIPrep → terminal**, then paste the command.

---
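
The choices above can also be written as an argument list in Python, which makes each flag easy to inspect or change before anything runs. This is a sketch that only assembles and prints the command; the function name `build_fmriprep_cmd` and its defaults are illustrative, and the flags are the ones used in this demo.

```python
import shlex


def build_fmriprep_cmd(bids_dir, out_dir, work_dir, label, nprocs=6, mem_mb=12000):
    """Assemble the fMRIPrep call used in this demo as a list of arguments."""
    return [
        "fmriprep", str(bids_dir), str(out_dir), "participant",
        "--participant-label", label,
        "--fs-no-reconall",        # skip FreeSurfer surface reconstruction
        "--me-output-echos",       # also write echo-wise preprocessed series
        "--use-syn-sdc",
        "--output-spaces", "T1w",
        "--nprocs", str(nprocs),
        "--mem", str(mem_mb),
        "--skip_bids_validation",
        "-w", str(work_dir),
        "-v",
    ]


cmd = build_fmriprep_cmd(
    "~/ds005123",
    "~/Lab_ME_fmriprep_tedana/derivatives/fmriprep",
    "~/Lab_ME_fmriprep_tedana/work",
    "10317",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line is for reading only: `shlex.quote` wraps the `~` paths in quotes, so the shell will not expand them. To run the command, pass full paths (for example `BIDS`, `FMRIPREP_OUT`, and `FMRIPREP_WORK` from the settings cell).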


In [None]:
# Check whether fmriprep is available on PATH from this notebook
!which fmriprep || echo "fmriprep not on PATH in this notebook. Use the Neurodesk fMRIPrep terminal."


In [None]:
# IMPORTANT:
# - This command can take a long time.
# - Put the work directory somewhere with plenty of space.
# - Adjust --nprocs/--mem for your machine.

!fmriprep "$BIDS" "$FMRIPREP_OUT" participant \
  --participant-label "$SUB" \
  --fs-no-reconall \
  --me-output-echos \
  --use-syn-sdc \
  --output-spaces T1w \
  --nprocs 6 --mem 12000 \
  --skip_bids_validation \
  -w "$FMRIPREP_WORK" \
  -v


## fMRIPrep outputs: where to look

After fMRIPrep completes, you should have:

- A subject-level HTML report: `derivatives/fmriprep/sub-10317.html`
- Preprocessed files under: `derivatives/fmriprep/sub-10317/`

> Checkpoint: confirm the report exists and the `func/` directory contains echo-wise `desc-preproc_bold.nii.gz` files.

---
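
The checkpoint can also be scripted. The sketch below reports whether the subject report exists and how many echo-wise preprocessed files are present; the function name `summarize_fmriprep` is made up, and the filename pattern is an assumption based on the `desc-preproc_bold.nii.gz` naming mentioned above.

```python
from pathlib import Path


def summarize_fmriprep(out_dir: Path, label: str) -> dict:
    """Report whether the HTML report exists and list echo-wise outputs."""
    report = out_dir / f"sub-{label}.html"
    func_dir = out_dir / f"sub-{label}" / "func"
    echo_files = []
    if func_dir.is_dir():
        echo_files = sorted(func_dir.glob("*_echo-*_desc-preproc_bold.nii.gz"))
    return {
        "report_exists": report.is_file(),
        "n_echo_files": len(echo_files),
        "echo_files": [p.name for p in echo_files],
    }


out_dir = Path.home() / "Lab_ME_fmriprep_tedana" / "derivatives" / "fmriprep"
print(summarize_fmriprep(out_dir, "10317"))
```

A count of zero echo files after a finished run usually means `--me-output-echos` was not passed.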


In [None]:
!ls -lh "$FMRIPREP_OUT" | head -n 200
!ls -lh "$FMRIPREP_OUT/sub-$SUB.html" || true
!ls -lh "$FMRIPREP_OUT/sub-$SUB/func" | head -n 50


In [None]:
from pathlib import Path
from IPython.display import HTML, display

report = Path(os.environ["FMRIPREP_OUT"]) / f"sub-{SUB}.html"
if report.exists():
    # display() is needed here: a bare expression inside an `if` block is not rendered
    display(HTML(f"<b>fMRIPrep report:</b> <a href='{report.as_posix()}' target='_blank'>{report.name}</a>"))
else:
    print("Report not found yet (fMRIPrep may still be running, or the run failed).")


# 4) Locate echo-wise preprocessed outputs (inputs for tedana)

We will find the echo-wise preprocessed images produced by fMRIPrep for the run you selected earlier (`BASE`).

> Checkpoint: you should see a list of `echo-1`, `echo-2`, … preprocessed NIfTIs for the same run.

---
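
A plain `sorted()` orders filenames alphabetically, which matches echo order only while there are fewer than ten echoes. If you ever need to be strict about it, you can sort on the echo number itself. A minimal sketch, with a hypothetical helper `echo_number` and invented filenames:

```python
import re


def echo_number(path) -> int:
    """Extract the integer after "_echo-" from a BIDS-style filename."""
    match = re.search(r"_echo-(\d+)_", str(path))
    if match is None:
        raise ValueError(f"No echo entity in {path}")
    return int(match.group(1))


# Hypothetical filenames, listed out of order on purpose
files = [
    "sub-10317_task-example_echo-10_desc-preproc_bold.nii.gz",
    "sub-10317_task-example_echo-2_desc-preproc_bold.nii.gz",
    "sub-10317_task-example_echo-1_desc-preproc_bold.nii.gz",
]
ordered = sorted(files, key=echo_number)
for name in ordered:
    print(echo_number(name), name)
```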


In [None]:
import glob

prep_glob = str(FMRIPREP_OUT / f"sub-{SUB}/func/{BASE}*echo-*desc-preproc_bold.nii.gz")
prep_echos = sorted(glob.glob(prep_glob))
print("Looking for:", prep_glob)
print("\nPreprocessed echo files:")
for p in prep_echos:
    print(" ", p)

if not prep_echos:
    raise RuntimeError("No preprocessed echo files found. Confirm fMRIPrep completed and used --me-output-echos.")


# 5) Install tedana (if needed)

First, check whether `tedana` is already available. If not, install it.

**Recommendation:** use `%pip install` so the package is installed into the current Jupyter environment.

> If installation fails due to permissions, use a user install: `python -m pip install --user tedana`.

---
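
The same check can be done from Python, which is handy when a package is installed but its command is not on `PATH` (the usual symptom of a `--user` install). This sketch assumes the package name and the command name are identical, as they are for tedana; `describe_tool` is an illustrative helper.

```python
import shutil
from importlib import metadata


def describe_tool(name: str) -> dict:
    """Report where a command lives and which package version is installed."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    return {"executable": shutil.which(name), "version": version}


print(describe_tool("tedana"))
```

If `version` is set but `executable` is `None`, the package is installed and only the `PATH` needs fixing.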


In [None]:
# Check whether tedana exists
!which tedana && tedana --version || echo "tedana not found yet."

# Install (uncomment one of the options below if you need it)

# Option A (preferred in notebooks):
# %pip install tedana

# Option B (user install):
# !python3 -m pip install --user tedana
# import os, pathlib
# os.environ["PATH"] = str(pathlib.Path.home() / ".local" / "bin") + ":" + os.environ.get("PATH","")


# 6) Run tedana on one run

We will run tedana on the **preprocessed echo-wise** time series from fMRIPrep.

Inputs:
- `-d`: echo-wise preprocessed images (one per echo)
- `-e`: echo times in seconds (same order as the echo images)

Outputs:
- an optimally-combined series (often called *optcom*)
- component metrics / classifications (TSV)
- a denoised time series (name varies by version)
- logs and, in some versions, a report directory

---
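
To build some intuition for the optimally-combined series, here is a toy version of a T2\*-weighted average, in which each echo is weighted by `TE * exp(-TE / T2*)`. This is a sketch of the general idea, not tedana's code: the single T2\* value below is a made-up number, whereas tedana estimates T2\* from the data for each voxel.

```python
import math


def optcom_weights(tes, t2star):
    """Normalised T2*-based weights; `tes` and `t2star` share the same unit."""
    raw = [te * math.exp(-te / t2star) for te in tes]
    total = sum(raw)
    return [w / total for w in raw]


tes = [0.0138, 0.03154, 0.04928, 0.06702]  # seconds, from the appendix script
t2star = 0.035                             # seconds, hypothetical value

weights = optcom_weights(tes, t2star)
for te, w in zip(tes, weights):
    print(f"TE = {te:.5f} s  weight = {w:.3f}")

# Combine one made-up voxel: one signal value per echo
signal = [1000.0, 640.0, 400.0, 250.0]
combined = sum(w * s for w, s in zip(weights, signal))
print("combined signal:", round(combined, 1))
```

The weight is largest for the echo whose TE is closest to T2\*, which is why the best echo differs between brain regions.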


In [None]:
import subprocess, shlex

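# Collect preprocessed echoes again
# (alphabetical sort; this matches echo order for fewer than 10 echoes)
prep_echos = sorted(glob.glob(str(FMRIPREP_OUT / f"sub-{SUB}/func/{BASE}*echo-*desc-preproc_bold.nii.gz")))

# Echo times extracted earlier (same order as JSONs)
te_values = [float(x) for x in os.environ["TE_LIST"].split()]

print("N echoes:", len(prep_echos))
print("TEs:", te_values)

# tedana needs exactly one echo time per echo image
if len(prep_echos) != len(te_values):
    raise RuntimeError(f"Found {len(prep_echos)} echo images but {len(te_values)} echo times.")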

# Build output directory for this run
run_out = TEDANA_OUT / f"sub-{SUB}" / "func" / BASE
run_out.mkdir(parents=True, exist_ok=True)
print("tedana out:", run_out)

# Compose tedana command
cmd = ["tedana", "-d", *prep_echos, "-e", *[str(te) for te in te_values],
       "--convention", "bids",
       "--out-dir", str(run_out),
       "--prefix", BASE,
       "--fittype", "curvefit",
       "--overwrite"]

print("\nCommand (copy/pasteable):")
print(" ".join(shlex.quote(c) for c in cmd))

# Run it
subprocess.run(cmd, check=True)


## tedana outputs: quick check

> Checkpoint: list the output folder and identify the main denoised / optcom files and the component tables.

---
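
Because output names vary between tedana versions, grouping the files by type is a version-independent way to get an overview. The sketch below only looks at file extensions and makes no assumption about specific tedana filenames; `group_by_extension` is an illustrative helper.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(folder: Path) -> dict:
    """Group the files directly inside `folder` by extension."""
    groups = defaultdict(list)
    for p in sorted(folder.iterdir()):
        if p.is_file():
            # Treat ".nii.gz" as one extension instead of just ".gz"
            ext = ".nii.gz" if p.name.endswith(".nii.gz") else p.suffix
            groups[ext].append(p.name)
    return dict(groups)


tedana_root = Path.home() / "Lab_ME_fmriprep_tedana" / "derivatives" / "tedana"
if tedana_root.is_dir():
    for run_dir in sorted(tedana_root.glob("sub-*/func/*")):
        if run_dir.is_dir():
            for ext, names in group_by_extension(run_dir).items():
                print(f"{run_dir.name}  {ext or '(none)'}: {len(names)} file(s)")
else:
    print("No tedana output folder yet.")
```

You should see image files (`.nii.gz`) alongside tables (`.tsv`) once tedana has finished.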


In [None]:
!ls -lh "$TEDANA_OUT/sub-$SUB/func/$BASE" | head -n 200


# 7) Quick QC (what to look at)

### fMRIPrep QC
Open the fMRIPrep HTML report and focus on:
- BOLD↔T1w alignment
- motion patterns and outliers
- (if present) susceptibility distortion correction outcomes

### tedana QC (high level)
For this demo, focus on:
- whether tedana ran without errors
- whether the component metrics/classification files exist
- whether a denoised time series was produced

> You are not expected to become a tedana expert here. The goal is to build a *reproducible workflow* and to know where outputs live.

---
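
To put a number on "motion patterns and outliers", you can summarise framewise displacement (FD) from the fMRIPrep confounds table. This is a minimal sketch: it assumes a tab-separated confounds file with a `framewise_displacement` column whose first row is `n/a`, and the 0.5 mm cutoff is an arbitrary example, not a recommendation.

```python
import csv


def summarize_fd(confounds_tsv, threshold_mm=0.5):
    """Summarise framewise displacement from a confounds TSV file."""
    with open(confounds_tsv, newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    # The first volume has no FD value, so skip missing entries
    fd = [
        float(row["framewise_displacement"])
        for row in rows
        if row["framewise_displacement"] not in ("n/a", "")
    ]
    return {
        "n_volumes": len(rows),
        "mean_fd": sum(fd) / len(fd) if fd else float("nan"),
        "max_fd": max(fd) if fd else float("nan"),
        "n_above_threshold": sum(value > threshold_mm for value in fd),
    }
```

Call it with the path to one confounds file from the subject's `func/` folder, for example `summarize_fd(path_to_confounds)`, where `path_to_confounds` is a variable you define yourself.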


# Questions / short write-up (for students)

1. In 2–3 sentences: why can multi-echo acquisitions help separate BOLD-like signal from artifacts?
2. Which part of the fMRIPrep report most increased your confidence that preprocessing worked? Be specific.
3. What do the TE values represent, and why do they matter for tedana?
4. Where are the echo-wise preprocessed images located in the fMRIPrep derivatives?
5. If you had to archive only a minimal set of outputs to reproduce this demo later, what would you keep?

---


# Appendix: example scripts (for reference)

These example scripts are provided for reference only. They are **not** run by this notebook, because they include environment-specific paths (e.g., `/ZPOOL/...`).  
Still, they provide a useful template for how you might script this workflow on an HPC system.

## Example: fMRIPrep script

```bash
#!/bin/bash


sub=$1

# ensure paths are correct regardless of where the user runs the script from
scriptdir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
maindir="$(dirname "$scriptdir")"

# make derivatives folder if it doesn't exist.
# let's keep this out of bids for now
if [ ! -d $maindir/derivatives ]; then
	mkdir -p $maindir/derivatives
fi

scratchdir=/ZPOOL/data/scratch/`whoami`
if [ ! -d $scratchdir ]; then
	mkdir -p $scratchdir
fi


TEMPLATEFLOW_DIR=/ZPOOL/data/tools/templateflow
export SINGULARITYENV_TEMPLATEFLOW_HOME=/opt/templateflow

if [ $sub -ge 300 ] ; then
	singularity run --cleanenv \
	-B ${TEMPLATEFLOW_DIR}:/opt/templateflow \
	-B $maindir:/base \
	-B /ZPOOL/data/projects/rf1-sra-data/bids:/input \
	-B /ZPOOL/data/tools/licenses:/opts \
	-B $scratchdir:/scratch \
	/ZPOOL/data/tools/fmriprep-23.2.1.simg \
	/input /base/derivatives/fmriprep \
	participant --participant_label $sub \
	-t sharedreward \
	--stop-on-first-crash \
	--me-output-echos \
	--use-syn-sdc \
	--output-spaces MNI152NLin6Asym:res-2 \
	--bids-filter-file /base/code/fmriprep_config.json \
	--fs-no-reconall --fs-license-file /opts/fs_license.txt -w /scratch
else
	singularity run --cleanenv \
	-B ${TEMPLATEFLOW_DIR}:/opt/templateflow \
	-B $maindir:/base \
	-B /ZPOOL/data/tools/licenses:/opts \
	-B $scratchdir:/scratch \
	/ZPOOL/data/tools/fmriprep-23.2.1.simg \
	/base/ds003745 /base/derivatives/fmriprep \
	participant --participant_label $sub \
	-t sharedreward \
	--stop-on-first-crash \
	--use-syn-sdc \
	--output-spaces MNI152NLin6Asym:res-2 \
	--fs-no-reconall --fs-license-file /opts/fs_license.txt -w /scratch
fi

#	--bids-filter-file /base/code/fmriprep_config.json \
```

## Example: tedana script

```bash
#!/bin/bash

# ensure paths are correct regardless of where the user runs the script from
scriptdir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
maindir="$(dirname "$scriptdir")"

sub=$1
task=sharedreward
run=$2

# prepare inputs and outputs; don't run if data is missing, but log missingness
prepdir=${maindir}/derivatives/fmriprep/sub-${sub}/func
echo1=${prepdir}/sub-${sub}_task-${task}_run-${run}_echo-1_part-mag_desc-preproc_bold.nii.gz
echo2=${prepdir}/sub-${sub}_task-${task}_run-${run}_echo-2_part-mag_desc-preproc_bold.nii.gz
echo3=${prepdir}/sub-${sub}_task-${task}_run-${run}_echo-3_part-mag_desc-preproc_bold.nii.gz
echo4=${prepdir}/sub-${sub}_task-${task}_run-${run}_echo-4_part-mag_desc-preproc_bold.nii.gz
outdir=${maindir}/derivatives/tedana/sub-${sub}
if [ ! -e $echo1 ]; then
	echo "missing ${echo1}"
	echo "missing ${echo1}" >> $scriptdir/missing-tedanaInput.log
	exit
fi
mkdir -p $outdir

# run tedana
tedana -d $echo1 $echo2 $echo3 $echo4 \
-e 0.0138 0.03154 0.04928 0.06702 \
--out-dir $outdir \
--prefix sub-${sub}_task-${task}_run-${run} \
--convention bids \
--fittype curvefit \
--overwrite

# clean up and save space: delete this run's NIfTI outputs (non-NIfTI outputs are kept)
rm -rf ${outdir}/sub-${sub}_task-${task}_run-${run}_*.nii.gz
```

---
