# Demo: Multi-echo preprocessing in Neurodesk (fMRIPrep → tedana)

This notebook is a **functional, step-by-step demo**:

1. Install the OpenNeuro dataset with **DataLad**
2. Download **only one participant**: `sub-10317`
3. Run **fMRIPrep** (multi-echo; echo-wise outputs)
4. Run **tedana** on **one run**, using **fMRIPrep outputs**
5. Verify where key outputs live

**Dataset:** `ds005123` (OpenNeuro)  
**Participant:** `sub-10317` only

---

## Notes before you start

- Keep outputs in a **separate working folder** (not inside the dataset folder itself).
- fMRIPrep requires a **FreeSurfer license** (we copy it into a local `licenses/` folder for container use).
- This demo runs fMRIPrep first, then tedana.


In [None]:
import os, glob, json, re
from pathlib import Path

# ----------------
# User-set options
# ----------------
SUB = "10317"  # participant label (without "sub-")

HOME = Path.home()
DEMO = HOME / "Lab_ME_fmriprep_tedana"

# Install the DataLad dataset here (matches the container bind scheme below)
BIDS = DEMO / "bids"

# fMRIPrep outputs (explicit version label)
FMRIPREP_DERIV = DEMO / "derivatives" / "fmriprep-24"
FMRIPREP_WORK  = DEMO / "scratch"

# tedana outputs
TEDANA_DERIV = DEMO / "derivatives" / "tedana"

# Support folders
TEMPLATEFLOW_DIR = DEMO / "templateflow"
MPLCONFIGDIR_DIR = DEMO / "mplconfigdir"
LICENSE_DIR      = DEMO / "licenses"
LOG_DIR          = DEMO / "logs"

# Optional: fMRIPrep container image path (edit if needed)
FMRIPREP_IMG = Path("~/work/tools/hpctools/fmriprep-24.1.1.simg").expanduser()

# Export for shell cells
os.environ.update({
    "SUB": SUB,
    "DEMO": str(DEMO),
    "BIDS": str(BIDS),
    "FMRIPREP_DERIV": str(FMRIPREP_DERIV),
    "FMRIPREP_WORK": str(FMRIPREP_WORK),
    "TEDANA_DERIV": str(TEDANA_DERIV),
    "TEMPLATEFLOW_DIR": str(TEMPLATEFLOW_DIR),
    "MPLCONFIGDIR_DIR": str(MPLCONFIGDIR_DIR),
    "LICENSE_DIR": str(LICENSE_DIR),
    "LOG_DIR": str(LOG_DIR),
    "FMRIPREP_IMG": str(FMRIPREP_IMG),
})

print("SUB =", SUB)
print("DEMO =", DEMO)
print("BIDS =", BIDS)
print("FMRIPREP_DERIV =", FMRIPREP_DERIV)
print("TEDANA_DERIV =", TEDANA_DERIV)
print("FMRIPREP_IMG =", FMRIPREP_IMG)


In [None]:
# Create folders (idempotent)
!mkdir -p "$DEMO" "$DEMO/derivatives" "$FMRIPREP_DERIV" "$FMRIPREP_WORK" "$TEDANA_DERIV" "$TEMPLATEFLOW_DIR" "$MPLCONFIGDIR_DIR" "$LICENSE_DIR" "$LOG_DIR"
!ls -ld "$DEMO" "$DEMO/derivatives" "$FMRIPREP_DERIV" "$FMRIPREP_WORK" "$TEDANA_DERIV" "$TEMPLATEFLOW_DIR" "$MPLCONFIGDIR_DIR" "$LICENSE_DIR" "$LOG_DIR"


# 1) Install `ds005123` with DataLad and download only `sub-10317`

To keep terminal output readable, we log DataLad output to files and show only the last lines.

> **Checkpoint:** you should see `bids/sub-10317/func/` populated with multi-echo BOLD files.


In [None]:
# Install dataset into DEMO/bids (quiet-ish: log to file)
!cd "$DEMO" && (datalad -l error install https://github.com/OpenNeuroDatasets/ds005123.git bids > "$LOG_DIR/datalad_install.log" 2>&1 || true)
!tail -n 20 "$LOG_DIR/datalad_install.log" || true

# Download only this participant (quiet-ish: log to file)
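# Note: use {SUB} (IPython interpolation) in the log name; IPython evaluates the braces in ${SUB} itself and leaves a stray "$"
!cd "$BIDS" && (datalad -l error get "sub-$SUB" > "$LOG_DIR/datalad_get_sub-{SUB}.log" 2>&1)
!tail -n 20 "$LOG_DIR/datalad_get_sub-{SUB}.log"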

# Quick sanity check
!ls -lh "$BIDS/sub-$SUB" | head -n 50
!ls -lh "$BIDS/sub-$SUB/func" | head -n 50
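
The `ls` listing above shows file names even when the content was never fetched. In a DataLad dataset on a typical filesystem, unfetched files appear as dangling git-annex symlinks. The following is a minimal sketch, assuming that layout, that lists such files. The helper name `find_unfetched_files` is hypothetical and not part of DataLad.

```python
import os
from pathlib import Path


def find_unfetched_files(root):
    """Return symlinks under `root` whose target does not exist.

    Assumption: unfetched git-annex content shows up as a dangling symlink.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # exists() follows the link, so it is False for a dangling symlink
            if path.is_symlink() and not path.exists():
                missing.append(path)
    return sorted(missing)
```

In this notebook it could be called as `find_unfetched_files(BIDS / f"sub-{SUB}")`; an empty list suggests that `datalad get` retrieved everything for the participant.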


# 2) FreeSurfer license (required for fMRIPrep)

This demo expects your FreeSurfer license at `~/.license`.  
We copy it into the demo folder as `licenses/fs_license.txt` so the container can mount it consistently.

> **Checkpoint:** you should see `~/Lab_ME_fmriprep_tedana/licenses/fs_license.txt`.


In [None]:
# Copy FreeSurfer license into the demo (do not commit it anywhere)
!test -r "$HOME/.license" || (echo "ERROR: FreeSurfer license not found at ~/.license" && exit 1)
!cp -f "$HOME/.license" "$LICENSE_DIR/fs_license.txt"
!ls -l "$LICENSE_DIR/fs_license.txt"


# 3) Create a BIDS filter file (optional but recommended)

The fMRIPrep command in the next step uses `--bids-filter-file`.  
We generate a simple filter that keeps:
- T1w anatomical
- all functional BOLD + SBRef
- fieldmaps


In [None]:
# Write a simple bids-filter-file JSON
from pathlib import Path
import json

cfg_path = Path(os.environ["DEMO"]) / "cfg_fmriprep.json"
cfg = {
    "t1w":   {"datatype": "anat", "suffix": "T1w"},
    "bold":  {"datatype": "func", "suffix": "bold"},
    "sbref": {"datatype": "func", "suffix": "sbref"},
    "fmap":  {"datatype": "fmap"}
}
cfg_path.write_text(json.dumps(cfg, indent=2) + "\n")
print("Wrote:", cfg_path)
print(cfg_path.read_text())
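
fMRIPrep resolves these queries with PyBIDS. The sketch below is only a simplified illustration of what a `datatype`/`suffix` query selects, not the PyBIDS implementation. The functions `parse_bids_name` and `matches` are hypothetical helpers, and the task name in the usage example is made up.

```python
def parse_bids_name(relpath):
    """Split a BIDS-style relative path into datatype, suffix, and entities."""
    parts = relpath.split("/")
    # The datatype is the folder that directly contains the file (anat, func, fmap)
    datatype = parts[-2] if len(parts) >= 2 else None
    stem = parts[-1].split(".")[0]
    *pairs, suffix = stem.split("_")
    entities = dict(p.split("-", 1) for p in pairs if "-" in p)
    return {"datatype": datatype, "suffix": suffix, **entities}


def matches(relpath, query):
    """Return True when every key in `query` equals the parsed value."""
    info = parse_bids_name(relpath)
    return all(info.get(key) == value for key, value in query.items())
```

For example, `matches("sub-10317/func/sub-10317_task-demo_echo-1_bold.nii.gz", {"datatype": "func", "suffix": "bold"})` is true, while the same query rejects an `sbref` file from the same folder.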


# 4) Run fMRIPrep (preferred command pattern)

We run fMRIPrep **first**, then use its echo-wise outputs as inputs to tedana.

This step uses a **container-style command** (`singularity run --cleanenv` with bind mounts).

## Avoiding the missing `subjects_dir` error
fMRIPrep can crash with a `TraitError` when the FreeSurfer `subjects_dir` path **does not exist**.  
Before running, we create a subjects directory under the fMRIPrep derivatives folder and export `SUBJECTS_DIR` before launching the container.

> **Checkpoint:** after fMRIPrep completes, confirm:
> - report: `derivatives/fmriprep-24/sub-10317.html`
> - echo-wise outputs: `derivatives/fmriprep-24/sub-10317/func/*echo-*_desc-preproc_bold.nii.gz`


In [None]:
# Confirm singularity exists and the image path is correct
!bash -lc 'command -v singularity >/dev/null 2>&1 && echo "singularity: OK" || echo "singularity: NOT FOUND"'
!bash -lc 'test -r "$FMRIPREP_IMG" && echo "fMRIPrep image: OK" || (echo "fMRIPrep image not found at: $FMRIPREP_IMG" ; true)'


In [None]:
%%bash
# Run fMRIPrep (container)
# - Logs go into $LOG_DIR
# - Adjust --nthreads if needed
# - %%bash runs the whole cell in one shell (a `!` line cannot span several lines);
#   the variables below come from os.environ, exported in the setup cell
set -euo pipefail

sub="$SUB"
maindir="$DEMO"
scratchdir="$FMRIPREP_WORK"
logdir="$LOG_DIR"
# Container-side path to the filter file ($maindir is bound to /base)
cfg="/base/cfg_fmriprep.json"

mkdir -p "$scratchdir" "$logdir" "$TEMPLATEFLOW_DIR" "$MPLCONFIGDIR_DIR" "$LICENSE_DIR"

# Ensure a valid FreeSurfer SUBJECTS_DIR exists (avoids the TraitError about a missing subjects_dir)
FS_SUBJECTS_DIR="$FMRIPREP_DERIV/sourcedata/freesurfer"
mkdir -p "$FS_SUBJECTS_DIR"

cmd="singularity run --cleanenv \
  -B ${TEMPLATEFLOW_DIR}:/opt/templateflow \
  -B ${MPLCONFIGDIR_DIR}:/opt/mplconfigdir \
  -B ${LICENSE_DIR}:/opts \
  -B $maindir:/base \
  -B $scratchdir:/scratch \
  ${FMRIPREP_IMG} \
  /base/bids /base/derivatives/fmriprep-24 \
  participant --participant_label $sub \
  --stop-on-first-crash \
  --skip-bids-validation \
  --nthreads 14 \
  --me-output-echos \
  --output-spaces T1w MNI152NLin6Asym \
  --bids-filter-file $cfg \
  --fs-no-reconall --fs-license-file /opts/fs_license.txt \
  -w /scratch/sub-${sub}"

echo "$cmd" | tee "$logdir/cmd_fmriprep.txt"

export SUBJECTS_DIR="$FS_SUBJECTS_DIR"
eval "$cmd" 2>&1 | tee "$logdir/fmriprep_runtime.log"


In [None]:
# Check fMRIPrep outputs
!ls -lh "$FMRIPREP_DERIV" | head -n 200
!ls -lh "$FMRIPREP_DERIV/sub-$SUB.html" || true
!ls -lh "$FMRIPREP_DERIV/sub-$SUB/func" | head -n 80
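
The checkpoint above can also be verified programmatically. The following sketch counts how many echo-wise `desc-preproc_bold` files exist per run, assuming the echo-wise outputs carry an `_echo-<n>` entity in their file names. The helper name `count_echoes_per_run` is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

ECHO_RE = re.compile(r"_echo-(\d+)")


def count_echoes_per_run(func_dir):
    """Map each run (file name with the echo entity removed) to its echo count."""
    counts = defaultdict(int)
    for path in Path(func_dir).glob("*_echo-*_desc-preproc_bold.nii.gz"):
        # Dropping "_echo-<n>" yields a key shared by all echoes of one run
        run_key = ECHO_RE.sub("", path.name, count=1)
        counts[run_key] += 1
    return dict(counts)
```

Calling `count_echoes_per_run(FMRIPREP_DERIV / f"sub-{SUB}" / "func")` should report the same number of echoes for every run of a given acquisition; a run with fewer echoes than the others may indicate an incomplete fMRIPrep run.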


# 5) Choose one run for tedana and extract echo times (TEs)

Now that fMRIPrep has run, we pick one run for the tedana demo.

> **Checkpoint:** you should see all raw echoes for the selected run and one TE per echo.


In [None]:
import glob, json, re
from pathlib import Path

# Match "_echo-1_" exactly so that echo-10 and above are not picked up
echo1_candidates = sorted(glob.glob(str(Path(os.environ["BIDS"]) / f"sub-{SUB}/func/*_echo-1_*bold.nii.gz")))
if not echo1_candidates:
    raise RuntimeError(f"No echo-1 BOLD files found in BIDS. Confirm DataLad downloaded sub-{SUB}.")

RAW_ECHO1 = echo1_candidates[0]
fname = Path(RAW_ECHO1).name
print("Selected RAW_ECHO1:")
print(" ", RAW_ECHO1)

m = re.match(r"^(.*)_echo-\d+(.*)_bold\.nii\.gz$", fname)
if not m:
    raise RuntimeError(f"Could not parse echo/run structure from: {fname}")

RUN_PREFIX, RUN_SUFFIX = m.group(1), m.group(2)
print("\nRUN_PREFIX:", RUN_PREFIX)
print("RUN_SUFFIX:", RUN_SUFFIX)

bids_path = Path(os.environ["BIDS"])

raw_echo_glob = str(bids_path / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}_bold.nii.gz")
raw_echos = sorted(glob.glob(raw_echo_glob))
print("\nRaw echo files for this run:")
for p in raw_echos:
    print(" ", p)

json_glob = str(bids_path / f"sub-{SUB}/func/{RUN_PREFIX}_echo-*{RUN_SUFFIX}_bold.json")
json_paths = sorted(glob.glob(json_glob))
if not json_paths:
    raise RuntimeError("No JSON sidecars found for this run.")

tes = []
print("\nEchoTime values:")
for jp in json_paths:
    with open(jp, "r") as f:
        meta = json.load(f)
    te = meta.get("EchoTime", None)
    tes.append(te)
    print(f"  {Path(jp).name}: {te}")

if any(te is None for te in tes):
    raise RuntimeError("At least one EchoTime is missing in the JSON sidecars.")

print("\nTE list (seconds), in file order:")
print(tes)

os.environ["RUN_PREFIX"] = RUN_PREFIX
os.environ["RUN_SUFFIX"] = RUN_SUFFIX
os.environ["TE_LIST"] = " ".join(str(te) for te in tes)
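
The cell above relies on alphabetical file order to keep TEs aligned with echoes, which holds for single-digit echo indices. As a more defensive sketch, the echo index can be parsed from each sidecar name and used as the sort key. The helpers `echo_index` and `load_echo_times` are hypothetical, and the sidecars are assumed to follow the `_echo-<n>` naming used in this dataset.

```python
import json
import re
from pathlib import Path


def echo_index(path):
    """Extract the integer echo index from a BIDS file name."""
    match = re.search(r"_echo-(\d+)", Path(path).name)
    if match is None:
        raise ValueError(f"No echo entity in: {path}")
    return int(match.group(1))


def load_echo_times(json_paths):
    """Return (echo index, EchoTime) pairs sorted numerically by echo index."""
    pairs = []
    for path in sorted(json_paths, key=echo_index):
        meta = json.loads(Path(path).read_text())
        if "EchoTime" not in meta:
            raise KeyError(f"EchoTime missing in: {path}")
        pairs.append((echo_index(path), meta["EchoTime"]))
    return pairs
```

With the variables from the cell above, `load_echo_times(json_paths)` returns the TEs in echo order even if the input list was shuffled.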


# 6) Locate echo-wise fMRIPrep outputs for the selected run

We use the echo-wise **preprocessed** files from fMRIPrep as inputs to tedana.

> **Checkpoint:** you should see one `desc-preproc_bold.nii.gz` file per echo.


In [None]:
import glob
from pathlib import Path

fprep = Path(os.environ["FMRIPREP_DERIV"])
prep_glob = str(fprep / f"sub-{SUB}/func/{os.environ['RUN_PREFIX']}_echo-*{os.environ['RUN_SUFFIX']}*desc-preproc_bold.nii.gz")
prep_echos = sorted(glob.glob(prep_glob))

print("Looking for:")
print(" ", prep_glob)
print("\nPreprocessed echo files:")
for p in prep_echos:
    print(" ", p)

if not prep_echos:
    raise RuntimeError("No preprocessed echo files found. Confirm fMRIPrep finished and used --me-output-echos.")
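
Before handing these files to tedana, it can help to confirm that the echo files and echo times line up. The sketch below checks two things: one TE per file, and TEs that increase with echo order. The function name `check_tedana_inputs` is hypothetical and the checks are an assumption about what is worth verifying, not a tedana requirement list.

```python
def check_tedana_inputs(echo_files, echo_times):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(echo_files) != len(echo_times):
        problems.append(
            f"{len(echo_files)} echo files but {len(echo_times)} echo times"
        )
    # Echo times are expected to grow with the echo index
    increasing = all(a < b for a, b in zip(echo_times, echo_times[1:]))
    if not increasing:
        problems.append(f"echo times are not strictly increasing: {echo_times}")
    return problems
```

In this notebook, `check_tedana_inputs(prep_echos, [float(x) for x in os.environ["TE_LIST"].split()])` should return an empty list before the tedana step is run.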


# 7) Run tedana on one run (using fMRIPrep outputs)

1. Confirm `tedana` is available (install if needed).
2. Run tedana with:
   - `-d`: echo-wise **fMRIPrep preprocessed** images
   - `-e`: echo times (seconds)

> **Checkpoint:** you should see NIfTI + TSV outputs in:  
`derivatives/tedana/sub-10317/func/<run-id>/`


In [None]:
# Check whether tedana exists
!which tedana && tedana --version || echo "tedana not found yet."

# If needed, uncomment ONE option:

# Option A (recommended in notebooks):
# %pip install tedana

# Option B (user install):
# !python3 -m pip install --user tedana
# import pathlib, os
# os.environ["PATH"] = str(pathlib.Path.home() / ".local" / "bin") + ":" + os.environ.get("PATH","")


In [None]:
import subprocess, shlex, glob
from pathlib import Path

fprep = Path(os.environ["FMRIPREP_DERIV"])
prep_echos = sorted(glob.glob(str(fprep / f"sub-{SUB}/func/{os.environ['RUN_PREFIX']}_echo-*{os.environ['RUN_SUFFIX']}*desc-preproc_bold.nii.gz")))
te_values = [float(x) for x in os.environ["TE_LIST"].split()]

run_id = f"{os.environ['RUN_PREFIX']}{os.environ['RUN_SUFFIX']}"
run_out = Path(os.environ["TEDANA_DERIV"]) / f"sub-{SUB}" / "func" / run_id
run_out.mkdir(parents=True, exist_ok=True)

cmd = [
    "tedana",
    "-d", *prep_echos,
    "-e", *[str(te) for te in te_values],
    "--convention", "bids",
    "--out-dir", str(run_out),
    "--prefix", run_id,
    "--fittype", "curvefit",
    "--overwrite",
]

print("Command (copy/pasteable):")
print(" ".join(shlex.quote(c) for c in cmd))

subprocess.run(cmd, check=True)
print("Done.")


In [None]:
# List tedana outputs
run_id = f"{os.environ['RUN_PREFIX']}{os.environ['RUN_SUFFIX']}"
!ls -lh "$TEDANA_DERIV/sub-$SUB/func/$run_id" | head -n 200
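
Beyond listing files, a quick summary of how components were classified is often the first thing to inspect. The sketch below assumes the tedana output folder contains a tab-separated component metrics table with a `classification` column. The exact file name depends on the tedana version and `--convention`, so the path is passed in rather than guessed, and `summarize_classifications` is a hypothetical helper.

```python
import csv
from collections import Counter


def summarize_classifications(metrics_tsv, column="classification"):
    """Count components per classification label in a tab-separated metrics table."""
    with open(metrics_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return Counter(row[column] for row in reader)
```

Pointing it at the metrics table found in the listing above gives counts per label, which is a fast sanity check before opening the tedana report.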


# Finished

You should now have:

- **fMRIPrep report:** `derivatives/fmriprep-24/sub-10317.html`
- **Echo-wise preprocessed BOLD:** `derivatives/fmriprep-24/sub-10317/func/*echo-*_desc-preproc_bold.nii.gz`
- **tedana outputs (one run):** `derivatives/tedana/sub-10317/func/<run-id>/`
- **Logs:** `logs/cmd_fmriprep.txt` and `logs/fmriprep_runtime.log`
