# Demo: Multi-echo preprocessing in Neurodesk (fMRIPrep → tedana)

This notebook is a **functional, step-by-step demo** that shows how to:

1. Install an OpenNeuro dataset via **DataLad**
2. Download **only one participant**: `sub-10317`
3. Run **fMRIPrep** (multi-echo) in **Neurodesk**
4. Run **tedana** on one run, using **fMRIPrep outputs**
5. Locate the key outputs for QC and downstream analyses

**Dataset:** `ds005123` (OpenNeuro)  
**Participant:** `sub-10317` only

---

## Before you begin

- Start **Neurodesktop** (Neurodesk GUI).
- Use a working folder with plenty of free disk space; fMRIPrep's working directory (`work/`) in particular can grow large.
- **Do not** write outputs into the dataset folder (`~/ds005123/`).

> **Checkpoint mindset:** after each step, confirm you see the expected files before moving on.

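How much space counts as "plenty" depends on the dataset and on fMRIPrep's working directory, so the sketch below only reports the free space under your home folder. The `MIN_FREE_GIB` threshold is an assumed placeholder, not a measured requirement; adjust it for your system.

```python
import shutil
from pathlib import Path

MIN_FREE_GIB = 50  # assumed placeholder threshold, not a measured requirement


def free_gib(path: Path) -> float:
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


available = free_gib(Path.home())
print(f"Free space under {Path.home()}: {available:.1f} GiB")
if available < MIN_FREE_GIB:
    print(f"Warning: less than {MIN_FREE_GIB} GiB free; outputs may not fit.")
```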

In [None]:
import os, re, glob, json
from pathlib import Path

# ----------------
# User-set options
# ----------------
SUB = "10317"  # participant label (without "sub-")

HOME = Path.home()

# Input BIDS dataset (DataLad will create this folder)
BIDS = HOME / "ds005123"

# All demo outputs live here (keep outputs out of the BIDS dataset folder)
DEMO = HOME / "Lab_ME_fmriprep_tedana"

FMRIPREP_OUT  = DEMO / "derivatives" / "fmriprep"
FMRIPREP_WORK = DEMO / "work"
TEDANA_OUT    = DEMO / "derivatives" / "tedana"

# Export to environment for shell cells
os.environ.update({
    "SUB": SUB,
    "BIDS": str(BIDS),
    "DEMO": str(DEMO),
    "FMRIPREP_OUT": str(FMRIPREP_OUT),
    "FMRIPREP_WORK": str(FMRIPREP_WORK),
    "TEDANA_OUT": str(TEDANA_OUT),
})

print("SUB =", SUB)
print("BIDS =", BIDS)
print("DEMO =", DEMO)


In [None]:
# Create output folders
!mkdir -p "$FMRIPREP_OUT" "$FMRIPREP_WORK" "$TEDANA_OUT"
!ls -ld "$DEMO" "$FMRIPREP_OUT" "$FMRIPREP_WORK" "$TEDANA_OUT"


# 1) Install the dataset (DataLad) and download only `sub-10317`

We will install the dataset repository, then download only the participant folder for `sub-10317`.

To keep the notebook readable, we will **log DataLad output to files** and print only a short tail.

> **Checkpoint:** you should see `sub-10317/func/` with multi-echo BOLD files.

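DataLad datasets are usually backed by git-annex, where a file whose content has not been downloaded appears as a symlink with a missing target. The sketch below is a checkpoint helper under that assumption. `unfetched_files` is a hypothetical name, and datasets in other annex modes may behave differently.

```python
from pathlib import Path


def unfetched_files(root: Path) -> list[Path]:
    """List symlinks under `root` whose targets are missing (content not fetched)."""
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_symlink() and not p.exists())


subject_dir = Path.home() / "ds005123" / "sub-10317"
missing = unfetched_files(subject_dir)
print(f"{len(missing)} file(s) without content under {subject_dir}")
for path in missing[:10]:
    print("  ", path.name)
```

An empty list after `datalad get` suggests the participant's files are present locally.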

In [None]:
# Install dataset (quiet-ish): write logs, then show the last few lines
!cd ~ && (datalad -l error install https://github.com/OpenNeuroDatasets/ds005123.git > "$DEMO/datalad_install.log" 2>&1 || true)
!tail -n 20 "$DEMO/datalad_install.log" || true

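# Download only this participant (quiet-ish)
# Note: "$SUB" is used rather than "${SUB}" because IPython evaluates {...} as a Python expression before the shell sees it.
!cd "$BIDS" && (datalad -l error get "sub-$SUB" > "$DEMO/datalad_get_sub-$SUB.log" 2>&1)
!tail -n 20 "$DEMO/datalad_get_sub-$SUB.log"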

# Quick sanity check
!ls -lh "$BIDS/sub-$SUB" | head -n 50
!ls -lh "$BIDS/sub-$SUB/func" | head -n 50


# 2) Run fMRIPrep (multi-echo) for `sub-10317`

### What we are doing here
- Single participant: `--participant-label 10317`
- Multi-echo echo-wise outputs: `--me-output-echos`
- Skip FreeSurfer recon-all for speed: `--fs-no-reconall` (fMRIPrep still expects a FreeSurfer license file, supplied via `--fs-license-file` or `$FS_LICENSE`)
- Keep outputs in `~/Lab_ME_fmriprep_tedana/derivatives/fmriprep/`

### Important
In Neurodesk/HPC environments, fMRIPrep is often provided via a module.

In this notebook, we try:
- `ml fmriprep` (or `module load fmriprep`)
- then check `which fmriprep`

If fMRIPrep still isn't found, run the same fMRIPrep command from:
**Neurodesk menu → fMRIPrep → terminal**, then return here for the tedana step.

> **Checkpoint:** you should have an HTML report at:  
> `~/Lab_ME_fmriprep_tedana/derivatives/fmriprep/sub-10317.html`

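The options listed above can also be assembled as an argument list, which makes the call easier to inspect before running it. This is a sketch only: `build_fmriprep_cmd` is a hypothetical helper, the flags are the ones used in this notebook, and the resource defaults are the notebook's example values.

```python
import shlex
from pathlib import Path


def build_fmriprep_cmd(bids: Path, out: Path, work: Path, sub: str,
                       nprocs: int = 6, mem_mb: int = 12000) -> list[str]:
    """Assemble the fMRIPrep call used in this demo as an argument list."""
    return [
        "fmriprep", str(bids), str(out), "participant",
        "--participant-label", sub,
        "--fs-no-reconall",        # skip FreeSurfer surface reconstruction
        "--me-output-echos",       # also write echo-wise preprocessed BOLD
        "--use-syn-sdc",
        "--output-spaces", "T1w",
        "--nprocs", str(nprocs), "--mem", str(mem_mb),
        "--skip_bids_validation",
        "-w", str(work),
        "-v",
    ]


demo = Path.home() / "Lab_ME_fmriprep_tedana"
cmd = build_fmriprep_cmd(Path.home() / "ds005123",
                         demo / "derivatives" / "fmriprep",
                         demo / "work", "10317")
print(shlex.join(cmd))  # copy/paste into the Neurodesk fMRIPrep terminal if needed
```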

In [None]:
# Try to load fmriprep via environment modules (HPC-style). Either command may work depending on the system.
!bash -lc 'ml fmriprep 2>/dev/null || module load fmriprep 2>/dev/null || true; which fmriprep || true'


In [None]:
# Run fMRIPrep (this can take a while). Adjust --nprocs/--mem for your machine.
# If "fmriprep: command not found", run this command in the Neurodesk fMRIPrep terminal instead.

!bash -lc 'ml fmriprep 2>/dev/null || module load fmriprep 2>/dev/null || true; fmriprep "$BIDS" "$FMRIPREP_OUT" participant --participant-label "$SUB" --fs-no-reconall --me-output-echos --use-syn-sdc --output-spaces T1w --nprocs 6 --mem 12000 --skip_bids_validation -w "$FMRIPREP_WORK" -v'


In [None]:
# Check fMRIPrep outputs
!ls -lh "$FMRIPREP_OUT" | head -n 200
!ls -lh "$FMRIPREP_OUT/sub-$SUB.html" || true
!ls -lh "$FMRIPREP_OUT/sub-$SUB/func" | head -n 50


# 3) Choose one run and extract echo times (TEs)

For the tedana demo, we will process **one run**.

We will:
1. Pick the first `echo-1` file found for this subject.
2. Parse the filename into a **run identifier** (prefix + suffix) so we can reliably match all echoes.
3. Extract `EchoTime` values from the JSON sidecars (in seconds).

> **Checkpoint:** you should see a list of raw echo files and one TE per echo.

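To illustrate the run-identifier parsing in step 2 of the list above, the sketch below applies the same kind of pattern to two made-up filenames (the task label and the `part` entity are hypothetical and not taken from `ds005123`). Everything before `_echo-N` becomes the prefix, and everything between the echo entity and `_bold` becomes the suffix.

```python
import re

ECHO_PATTERN = re.compile(r"^(.*)_echo-\d+(.*)_bold\.nii\.gz$")


def split_run_name(filename: str) -> tuple[str, str]:
    """Return (prefix, suffix) around the echo entity of a BOLD filename."""
    match = ECHO_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Not an echo-wise BOLD filename: {filename}")
    return match.group(1), match.group(2)


# Hypothetical filenames, for illustration only
print(split_run_name("sub-10317_task-demo_echo-1_bold.nii.gz"))
print(split_run_name("sub-10317_task-demo_echo-2_part-mag_bold.nii.gz"))
```

In the first case the suffix is empty; in the second it is `_part-mag`. Both share the same prefix, which is what lets a glob collect all echoes of one run.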

In [None]:
from pathlib import Path
import re, glob

# Find candidate echo-1 images ("_echo-1_" avoids also matching echo-10, echo-11, ...)
echo1_candidates = sorted(glob.glob(str(BIDS / f"sub-{SUB}/func/*_echo-1_*bold.nii.gz")))
if not echo1_candidates:
    raise RuntimeError(f"No echo-1 BOLD files found. Confirm you ran `datalad get sub-{SUB}`.")

RAW_ECHO1 = echo1_candidates[0]
fname = Path(RAW_ECHO1).name
print("Selected RAW_ECHO1:")
print(" ", RAW_ECHO1)

# Robust parse: split into RUN_PREFIX and RUN_SUFFIX
m = re.match(r"^(.*)_echo-\d+(.*)_bold\.nii\.gz$", fname)
if not m:
    raise RuntimeError(f"Could not parse echo/run structure from: {fname}")

RUN_PREFIX, RUN_SUFFIX = m.group(1), m.group(2)
print("\nRUN_PREFIX:", RUN_PREFIX)
print("RUN_SUFFIX:", RUN_SUFFIX)

# List all raw echoes for this run
raw_echo_glob = str(BIDS / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}_bold.nii.gz")
raw_echos = sorted(glob.glob(raw_echo_glob))
print("\nRaw echo files for this run:")
for p in raw_echos:
    print(" ", p)

if len(raw_echos) < 2:
    print("\nNOTE: Found <2 echoes. If this is unexpected, choose a different run file.")


In [None]:
# Extract EchoTime values (seconds) from JSON sidecars for this run
import json, glob
from pathlib import Path

json_glob = str(BIDS / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}_bold.json")
json_paths = sorted(glob.glob(json_glob))

if not json_paths:
    raise RuntimeError("No JSON sidecars found for this run.")

tes = []
print("EchoTime values:")
for jp in json_paths:
    with open(jp, "r") as f:
        meta = json.load(f)
    te = meta.get("EchoTime", None)
    tes.append(te)
    print(f"  {Path(jp).name}: {te}")

if any(te is None for te in tes):
    raise RuntimeError("At least one EchoTime is missing. Check the JSON files.")

print("\nTE list (seconds), in file order:")
print(tes)

# Save for later cells
os.environ["RUN_PREFIX"] = RUN_PREFIX
os.environ["RUN_SUFFIX"] = RUN_SUFFIX
os.environ["TE_LIST"] = " ".join(str(te) for te in tes)

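Because the TE list is passed on to tedana later, it can help to sanity-check it first. The sketch below assumes BIDS-style `EchoTime` values in seconds, with echoes listed from shortest to longest. `check_tes` is a hypothetical helper and the example values are invented.

```python
def check_tes(tes: list[float]) -> list[float]:
    """Validate a list of echo times given in seconds, shortest first."""
    if len(tes) < 2:
        raise ValueError("Multi-echo processing needs at least two echo times.")
    if any(te <= 0 for te in tes):
        raise ValueError(f"Echo times must be positive: {tes}")
    if any(te >= 1 for te in tes):
        raise ValueError(f"Values of 1 or more look like milliseconds, not seconds: {tes}")
    if any(later <= earlier for earlier, later in zip(tes, tes[1:])):
        raise ValueError(f"Echo times are not strictly increasing: {tes}")
    return tes


example_tes = [0.013, 0.031, 0.049]  # invented values, not from ds005123
print("TEs (s):", check_tes(example_tes))
print("TEs (ms):", [round(te * 1000, 3) for te in example_tes])
```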

# 4) Locate echo-wise **preprocessed** outputs from fMRIPrep (inputs for tedana)

We will now find the echo-wise fMRIPrep outputs for the run we selected above.

We look for files like:
- `.../sub-10317/func/..._echo-1..._desc-preproc_bold.nii.gz`
- `.../sub-10317/func/..._echo-2..._desc-preproc_bold.nii.gz`
- etc.

> **Checkpoint:** you should see one `desc-preproc_bold.nii.gz` per echo for the selected run.


In [None]:
import glob

prep_glob = str(FMRIPREP_OUT / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}*desc-preproc_bold.nii.gz")
prep_echos = sorted(glob.glob(prep_glob))

print("Looking for:")
print(" ", prep_glob)
print("\nPreprocessed echo files:")
for p in prep_echos:
    print(" ", p)

if not prep_echos:
    raise RuntimeError("No preprocessed echo files found. Confirm fMRIPrep finished and used --me-output-echos.")


# 5) Ensure `tedana` is available

If `tedana` is not found, install it into the current notebook environment.

> If install fails due to permissions, use a user install (`--user`) and then re-check `which tedana`.

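The same check can be done from Python, which is handy when later cells should fail early with a clear message. This sketch uses only the standard library; `tool_available` is a hypothetical helper name.

```python
import importlib.util
import shutil


def tool_available(command: str) -> bool:
    """Return True if `command` is an executable found on the current PATH."""
    return shutil.which(command) is not None


on_path = tool_available("tedana")
importable = importlib.util.find_spec("tedana") is not None
print("tedana on PATH:", on_path)
print("tedana importable in this kernel:", importable)
```

If the package is importable but the command is not on `PATH`, the console script was probably installed into a directory such as `~/.local/bin` that the notebook's `PATH` does not include.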

In [None]:
# Check whether tedana exists
!which tedana && tedana --version || echo "tedana not found yet."

# If needed, uncomment ONE option:

# Option A (recommended in notebooks):
# %pip install tedana

# Option B (user install):
# !python3 -m pip install --user tedana
# import os, pathlib
# os.environ["PATH"] = str(pathlib.Path.home() / ".local" / "bin") + ":" + os.environ.get("PATH","")


# 6) Run tedana (one run) using fMRIPrep outputs

Inputs:
- `-d`: echo-wise **fMRIPrep preprocessed** images (one per echo)
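- `-e`: echo times in seconds, in the same order as the images (the unit expected by `-e` has differed between tedana releases, so confirm with `tedana --help`)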

Outputs go here:
`~/Lab_ME_fmriprep_tedana/derivatives/tedana/sub-10317/func/<run-id>/`

> **Checkpoint:** the output folder should include NIfTI files and TSV files describing components/metrics.

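Since `-d` and `-e` are matched by position, pairing each image with its echo time before the call makes ordering mistakes visible. A sketch with invented filenames and TE values (`pair_echoes` is a hypothetical helper):

```python
def pair_echoes(images: list[str], tes: list[float]) -> list[tuple[str, float]]:
    """Pair echo images with echo times by position, refusing unequal lengths."""
    if len(images) != len(tes):
        raise ValueError(f"{len(images)} image(s) but {len(tes)} echo time(s).")
    return list(zip(images, tes))


# Invented inputs, for illustration only
images = [f"sub-10317_task-demo_echo-{n}_desc-preproc_bold.nii.gz" for n in (1, 2, 3)]
tes = [0.013, 0.031, 0.049]
for image, te in pair_echoes(images, tes):
    print(f"{te:.3f} s  {image}")
```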

In [None]:
import subprocess, shlex, glob

prep_echos = sorted(glob.glob(str(FMRIPREP_OUT / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}*desc-preproc_bold.nii.gz")))
te_values = [float(x) for x in os.environ["TE_LIST"].split()]

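print("Number of echoes:", len(prep_echos))
print("TEs:", te_values)

# tedana matches images and echo times by position, so the counts must agree
if len(prep_echos) != len(te_values):
    raise RuntimeError(f"Found {len(prep_echos)} echo images but {len(te_values)} echo times.")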

run_id = f"{RUN_PREFIX}{RUN_SUFFIX}"
run_out = TEDANA_OUT / f"sub-{SUB}" / "func" / run_id
run_out.mkdir(parents=True, exist_ok=True)

cmd = [
    "tedana",
    "-d", *prep_echos,
    "-e", *[str(te) for te in te_values],
    "--convention", "bids",
    "--out-dir", str(run_out),
    "--prefix", run_id,
    "--fittype", "curvefit",
    "--overwrite",
]

print("\nCommand (copy/pasteable):")
print(" ".join(shlex.quote(c) for c in cmd))

subprocess.run(cmd, check=True)
print("\nDone.")


In [None]:
# List tedana outputs
run_id = f"{os.environ['RUN_PREFIX']}{os.environ['RUN_SUFFIX']}"
!ls -lh "$TEDANA_OUT/sub-$SUB/func/$run_id" | head -n 200

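For a quick overview of what tedana wrote, without relying on specific filenames, the outputs can be grouped by file type. The sketch below is generic: `summarize_outputs` is a hypothetical helper, and the path assumes the folder layout used in this notebook.

```python
from collections import Counter
from pathlib import Path


def summarize_outputs(folder: Path) -> Counter:
    """Count files in `folder` by extension, treating `.nii.gz` as one extension."""
    counts = Counter()
    if not folder.is_dir():
        return counts
    for path in folder.iterdir():
        if path.is_file():
            ext = ".nii.gz" if path.name.endswith(".nii.gz") else path.suffix
            counts[ext or "(none)"] += 1
    return counts


func_dir = Path.home() / "Lab_ME_fmriprep_tedana/derivatives/tedana/sub-10317/func"
run_dirs = sorted(p for p in func_dir.iterdir() if p.is_dir()) if func_dir.is_dir() else []
for run_dir in run_dirs:
    print(run_dir.name, dict(summarize_outputs(run_dir)))
```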

# Done: What you should have now

- **fMRIPrep report:** `~/Lab_ME_fmriprep_tedana/derivatives/fmriprep/sub-10317.html`
- **Echo-wise preprocessed BOLD files:** `~/Lab_ME_fmriprep_tedana/derivatives/fmriprep/sub-10317/func/*echo-*_desc-preproc_bold.nii.gz`
- **tedana outputs for one run:** `~/Lab_ME_fmriprep_tedana/derivatives/tedana/sub-10317/func/<run-id>/`
