# Session 1: Basics of fMRI Preprocessing

In this session, we will learn the fundamental concepts and practical steps of **fMRI data preprocessing** — the essential process that prepares raw functional images for further analysis.



## Tools We’ll Use

### **FSL**
We will begin with **[FSL](https://fsl.fmrib.ox.ac.uk/fsl/docs/)** — a comprehensive neuroimaging software library developed at the **University of Oxford**.  
FSL provides command-line and GUI tools that let you perform each preprocessing step-by-step. You will gain a concrete understanding of what each stage in preprocessing actually does.

### **fMRIPrep**

While FSL remains widely used, most modern studies now rely on **[fMRIPrep](https://fmriprep.org/en/stable/)** — an automated, standardized preprocessing pipeline that integrates best-in-class tools from multiple neuroimaging packages, including FSL.

### **Why fMRIPrep?**
1. **Automation:** With just **one line of code**, you can execute the entire preprocessing workflow.
2. **Reproducibility:** Using a standardized pipeline minimizes variability caused by different preprocessing choices across labs and studies.

In short, fMRIPrep provides a *robust, reproducible, and community-endorsed* preprocessing workflow — but it’s valuable to first understand what happens under the hood, which is exactly what FSL will help us explore.

Please run the cell below at the start of every runtime session, for each notebook. It installs and imports the packages needed for this tutorial.

In [None]:
# --- Basic Setup (always run this first) ---

# Install dependencies 
%pip install -q gdown
%pip install -q git+https://github.com/Yuan-fang/fMRI-tutorial.git

# Import essential packages
import warnings
from pathlib import Path
import shutil
import module
from nilearn import image, plotting
from nilearn.image import index_img
from bids import BIDSLayout
import numpy as np
import os
import matplotlib.pyplot as plt

from tutorial.utils.paths import PathManager
from tutorial.utils.fetch import fetch_dataset

warnings.filterwarnings("ignore")

# --- Set up data directories ---

DATASET = "Haxby2001" # name of the dataset
BASE_DIR = Path.home() / "fmri_tutorial" # base directory for the tutorial
DATA_DIR = BASE_DIR / "data" / DATASET # data directory for the dataset
DERIV_DIR = BASE_DIR / "data" / "derivatives" # derivatives directory for processed data

for p in (DATA_DIR, DERIV_DIR):
    p.mkdir(parents=True, exist_ok=True)

# --- Download dataset if not already present ---

# Google Drive link to the dataset
download_url = "https://drive.google.com/uc?id=1fPjbWhY6ZDOGSm59duKmOcCpgp5Zf5tX"
fetch_dataset(download_url, DATA_DIR)

To begin, we need to specify the path to our raw BIDS dataset and create a `BIDSLayout` object to interface with it.

We will also create a derivatives folder where the output files will be saved.

For your own project, adjust these paths accordingly.

In [None]:
# Create a BIDSLayout object to interface with the BIDS dataset
layout = BIDSLayout(DATA_DIR, validate=False)  

print("Data directory:       ", DATA_DIR.resolve())
print("Derivatives directory:", DERIV_DIR.resolve())

Let's first get the path of the functional image for the first run of one subject (`sub-1`).

We also get this subject's T1w image path from the dataset.

In [None]:
# Get the file path of T1w image for a specific subject ("sub-1") 
anat_path = layout.get(subject="1", suffix="T1w", extension=".nii.gz", return_type="file")[0] 
print("Anatomical files for subject 1:", anat_path)

# Get the file path of run 1's functional image for a specific subject ("sub-1")
func_path = layout.get(subject="1", suffix="bold", extension=".nii.gz", run="1", return_type="file")[0]
print("Functional files for subject 1, run 1:", func_path)

### Organizing Output Data in fMRI Analysis

When analyzing fMRI data, it’s important to predefine a clean output structure for your results.
Otherwise, each preprocessing or modeling step can generate dozens of files across runs, sessions, and subjects, which quickly turn your derivatives folder into chaos.

To help you manage this, here we use a custom tool called `PathManager`, which automatically creates and retrieves output paths that follow the BIDS convention.

Below is a minimal example showing how to use it.

---
```python
# --- 1. Initialize PathManager ---

# BIDSlayout: the BIDSLayout object for your raw BIDS dataset
# deriv_base: the root folder for derivative (output) data
# pipeline: your processing pipeline name (e.g. "fMRIPrep", "my_pipeline", etc.)
path_manager = PathManager(BIDSlayout=layout,
                           deriv_base=DERIV_DIR,
                           pipeline="your_fancy_pipeline")


# --- 2. Create a new derivative file path ---

# src       : the source file (e.g., subject 1's run 1 bold image)
# proc      : label for this processing step (e.g., "smoothed")
# suffix    : data type (e.g., "bold")
# extension : file format (e.g., ".nii.gz")
output_path = path_manager.create_path(src=func_path,
                                       proc="smoothed",
                                       suffix="bold",
                                       extension=".nii.gz")


# --- 3. Find an existing derivative file ---

# Search for a derivative that matches certain entities
existing_path = path_manager.find_path(subject="1",
                                       run="1",
                                       proc="smoothed",
                                       suffix="bold",
                                       extension=".nii.gz")
```
---
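
To make the idea of an entity-based output path concrete, here is a minimal sketch of how a BIDS-style derivative path can be assembled from key-value entities. The helper `build_derivative_path` and the `desc` entity are assumptions for illustration only. This is not `PathManager`'s actual implementation, and its naming scheme may differ.

```python
from pathlib import Path


def build_derivative_path(deriv_base, pipeline, subject, datatype,
                          suffix, extension, **entities):
    """Assemble a BIDS-style derivative path (hypothetical helper)."""
    # Key-value entities come first, starting with the subject label
    parts = [f"sub-{subject}"]
    parts += [f"{key}-{value}" for key, value in entities.items()]
    # The suffix (e.g. "bold") closes the name, followed by the extension
    filename = "_".join(parts + [suffix]) + extension
    return Path(deriv_base) / pipeline / f"sub-{subject}" / datatype / filename


example = build_derivative_path(
    "derivatives", "my_pipeline", subject="1", datatype="func",
    suffix="bold", extension=".nii.gz", run="1", desc="smoothed",
)
print(example.as_posix())
```

Keeping every processing step's label inside the file name means a later step can locate its input by entities alone, without hard-coded paths.
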
Now let's create a PathManager object for our preprocessing.

In [None]:
# Create a PathManager object 'fsl_manager', which is specifically for managing file paths related to FSL processing.
fsl_manager = PathManager(
    BIDSlayout=layout,
    deriv_base=DERIV_DIR,
    pipeline="fsl_preproc"
    )

# Then use the 'fsl_manager' object to create the output file path for the brain-extracted anatomical image
# Here, we specify src (source) as the input anatomical image,
# the processing step as "brain" (brain extraction) and the suffix as "T1w" (T1-weighted image)
brain_path = fsl_manager.create_path(src=anat_path, proc="brain", suffix="T1w")
print("Brain-extracted anatomical image path:", brain_path)

### Brain Extraction from the T1w Image

Now let's apply our first FSL tool, **BET**, to the T1w image to extract the brain.

> 💡 **Tip:** FSL tools are command-line programs that run in a shell. To run a shell command (such as an FSL command) inside a Python notebook, prefix it with an **exclamation mark (`!`)**, which tells the notebook to execute the command in the **shell** rather than in Python.
> 
> For example, to check how to use `bet`:
> ```bash
> !bet -help
> ```

In [None]:
# load FSL (version 6.0.7.8)
# Note other versions may not work with this tutorial
await module.load('fsl/6.0.7.8')
await module.list()

In [None]:
# Skull-strip the anatomical image using FSL's BET
# Note: the notebook substitutes Python variables written in braces, so "{anat_path}" and "{brain_path}" are replaced by their values before the BET command runs.
!bet "{anat_path}" "{brain_path}" -R -f 0.5 -g 0
print(f"Brain-extracted anatomical image created: {brain_path}")
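
As an alternative to the `!` syntax, the same command can be launched from plain Python with the standard `subprocess` module, which also works outside notebooks. The sketch below only builds the argument list. The helper name `build_bet_command` and the example file names are assumptions, while `-R`, `-f`, and `-g` are the BET options used above.

```python
import shlex


def build_bet_command(in_file, out_file, frac=0.5, gradient=0.0, robust=True):
    """Return the BET call as an argument list (hypothetical helper)."""
    cmd = ["bet", str(in_file), str(out_file)]
    if robust:
        cmd.append("-R")          # robust brain-centre estimation
    cmd += ["-f", str(frac)]      # fractional intensity threshold (0 to 1)
    cmd += ["-g", str(gradient)]  # vertical gradient in the threshold (-1 to 1)
    return cmd


cmd = build_bet_command("sub-1_T1w.nii.gz", "sub-1_proc-brain_T1w.nii.gz")
print(shlex.join(cmd))
# With FSL on the PATH, the command could then be run with:
# import subprocess; subprocess.run(cmd, check=True)
```

Smaller `-f` values give a larger brain outline estimate, so lowering `-f` is a common first adjustment when BET removes too much tissue.
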

In [None]:
# Visualize the brain extraction image
plotting.plot_anat(brain_path, title="Brain-extracted anatomical image")

# Visualize the brain extraction result by overlaying the skull-stripped brain on the original anatomical image
plotting.plot_roi(brain_path, bg_img=anat_path, alpha=0.3, title="BET mask overlay")

### Slice timing correction

Slice timing correction adjusts for the fact that, during fMRI acquisition, **different slices of the brain are sampled at slightly different times within each TR**.
This correction is more important when precise timing of neural responses matters.

>When slice timing is *not necessary*
>
> For this dataset (a block design), slice timing correction is not needed.  
> In block designs, each condition lasts several seconds, and the hemodynamic responses are sustained and overlapping—so small slice-to-slice time offsets have negligible effect on the model fit.  
> 
> Even in event-related designs, slice timing becomes less critical when the TR is short (e.g., ≤ 1 s).  
> With such rapid sampling, the maximum inter-slice delay (a few tens of milliseconds) is small relative to the ~5 s width of the HRF.
> 
>---
>When slice timing is *important*
>
>For event-related designs with longer TRs (e.g., 2–3 s), slice timing correction helps align the BOLD signal with stimulus onsets more accurately before modeling.

In FSL, slice timing correction is performed using the command-line tool `slicetimer`.
You can view its options by running:
```bash
!slicetimer -help
```
Note that slice timing correction requires knowing the exact acquisition order of slices.
In a BIDS dataset, this information is stored in the JSON sidecar file associated with each functional run (under the "SliceTiming" field).
>⚠️ **Note:** In this tutorial dataset, JSON sidecar files are not available, and the slice timing information could not be recovered from the data or the original publication.
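
For datasets that do include sidecars, the `SliceTiming` field lists, for each slice, its acquisition onset in seconds relative to the start of the volume. A minimal sketch of reading it is shown below. The sidecar content is made up for illustration (an interleaved four-slice acquisition with a 2 s TR) and is not taken from this dataset.

```python
import json

# Hypothetical sidecar content (not from the tutorial dataset)
sidecar_text = '{"RepetitionTime": 2.0, "SliceTiming": [0.0, 1.0, 0.5, 1.5]}'
sidecar = json.loads(sidecar_text)

slice_times = sidecar["SliceTiming"]
tr = sidecar["RepetitionTime"]

# Slice indices sorted by acquisition time give the acquisition order
acquisition_order = sorted(range(len(slice_times)), key=lambda i: slice_times[i])

# The largest offset is the worst-case timing error without correction
max_offset = max(slice_times) - min(slice_times)

print("Acquisition order (slice indices):", acquisition_order)
print(f"Maximum slice offset: {max_offset:.2f} s of a {tr:.1f} s TR")
```

Comparing the maximum offset with the TR and with the design (block or event-related) is a quick way to judge how much slice timing correction is likely to matter.
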

### Motion correction

Motion correction is the process of **realigning all functional volumes to a reference image (usually the first or middle volume of the run)**.
This ensures that each voxel’s time series corresponds to a consistent brain location across the entire scan.

Motion correction is typically performed **before** spatial smoothing and temporal filtering.
If you applied smoothing first, the signal from moving voxels would be mixed with neighboring tissue, making alignment less accurate and motion artefacts harder to remove.

In FSL, motion correction is carried out using the tool `mcflirt` (Motion Correction using FMRIB’s Linear Image Registration Tool).
You can view its usage by typing:
````bash 
!mcflirt -help
````
mcflirt performs rigid-body registration (6 degrees of freedom) to estimate and correct head motion.
It can also output transformation matrices and motion parameter files (e.g., *.par), which are often used later as nuisance regressors.

In [None]:
# Use the PathManager object to create the output file path for motion-corrected functional image
# Here, we specify the src (source) as the input functional image, 
# the processing step as "mc" (motion correction) and the suffix as "bold" (BOLD image)
mc_path = fsl_manager.create_path(src=func_path, proc="mc", suffix="bold")

# Perform motion correction using FSL's MCFLIRT
!mcflirt -in "{func_path}" -out "{mc_path}" -refvol 0 -plots -report

Open the derivatives folder and you will see that two files were created under `fsl_preproc/sub-1/func/`: the motion-corrected functional image and the motion parameter file.
The motion-corrected functional image is the file we just saved at `mc_path`.
The motion parameter file is written by MCFLIRT because of the `-plots` option and has the extension `.nii.gz.par`.

We can get the file path of the motion parameter file and plot the motion parameters for each volume.

In [None]:
# Use the PathManager to find the motion parameter file generated by MCFLIRT.
# The motion parameter file has the extension '.nii.gz.par'.
# Here, we specify the subject as '1', processing step as 'mc' (motion correction), run as '1', and extension as '.nii.gz.par'.
# The find_path method returns a list of matching file paths, so we take the first element [0].
motion_par = fsl_manager.find_path(subject='1', proc='mc', run='1', extension='.nii.gz.par')[0]
print(f"The motion parameter file path: {motion_par}")

# Load motion parameters
motion = np.loadtxt(motion_par)

# Plot the motion parameters
# Create a figure with two subplots: one for translations and one for rotations
# Share the x-axis (time/volume)
fig, axes = plt.subplots(2, 1, figsize=(10, 6), sharex=True)

# Translations (mm)
axes[0].plot(motion[:, 3:]) # Note that the last three columns are translations
axes[0].legend(["x", "y", "z"], loc="upper right")
axes[0].set_ylabel("Translation (mm)")
axes[0].set_title("Head translation over time")

# Rotations (radians)
axes[1].plot(motion[:, :3]) # Note that the first three columns are rotations
axes[1].legend(["pitch", "roll", "yaw"], loc="upper right")
axes[1].set_xlabel("Volume (TR)")
axes[1].set_ylabel("Rotation (radians)")
axes[1].set_title("Head rotation over time")

plt.tight_layout()
plt.show()

The motion parameters estimated by mcflirt (three translations and three rotations) can later be included as nuisance regressors in the general linear model (GLM).
Including them helps remove residual signal fluctuations associated with head motion that are not fully corrected by realignment.

In practice, researchers often exclude or flag runs where head motion is excessively large, since strong movements can cause signal dropout and spin-history artefacts that cannot be fully corrected.


As a rule of thumb (exact thresholds vary across labs), motion is considered minimal when both:
>
>Translation (x, y, z displacement) is less than 1 mm, and
>
>Rotation (pitch, roll, yaw) is less than 1 degree.

Runs exceeding these thresholds may still be usable with careful scrubbing or motion modeling, but should be evaluated critically.

>⚠️ **Note:** In FSL’s motion parameter files (`*.par`), rotations are expressed in radians, not degrees.
>To convert radians to degrees, multiply each rotation value by $180/\pi \approx 57.3$.
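
The check described above can be sketched in a few lines. The helper `summarize_motion` and the demo values are assumptions for illustration. The column order (three rotations in radians, then three translations in mm) follows the plotting code above.

```python
import math


def summarize_motion(params, max_trans_mm=1.0, max_rot_deg=1.0):
    """Summarize MCFLIRT-style motion parameters (hypothetical helper).

    Each row holds three rotations (radians) followed by three
    translations (mm).
    """
    max_rot = max(abs(math.degrees(v)) for row in params for v in row[:3])
    max_trans = max(abs(v) for row in params for v in row[3:])
    return {
        "max_rotation_deg": max_rot,
        "max_translation_mm": max_trans,
        "exceeds_threshold": max_trans >= max_trans_mm or max_rot >= max_rot_deg,
    }


# Made-up parameters for three volumes (not real data)
demo = [
    [0.000, 0.000, 0.000, 0.0, 0.0, 0.0],
    [0.005, -0.002, 0.001, 0.2, -0.1, 0.3],
    [0.020, 0.001, -0.003, 0.4, 0.1, 1.2],
]
summary = summarize_motion(demo)
print(summary)
```

For the real run, the same function could be called as `summarize_motion(np.loadtxt(motion_par).tolist())`. Note that this measures displacement relative to the reference volume, not volume-to-volume (framewise) displacement.
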

### Spatial Smoothing

In [None]:
# Create smoothed functional image path
# processing step is "smoothed_6mm" indicating 6mm smoothing
smoothed_6mm_path = fsl_manager.create_path(src=func_path, proc="smoothed_6mm", suffix="bold", extension=".nii.gz")

# Perform spatial smoothing using nilearn's smooth_img function
# mc_path is the motion-corrected functional image
smoothed_6mm_img = image.smooth_img(mc_path, fwhm=6)  # Apply 6mm FWHM Gaussian smoothing

# Save the smoothed image
smoothed_6mm_img.to_filename(smoothed_6mm_path)
print(f"Smoothed image saved at: {smoothed_6mm_path}")
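
The `fwhm` argument gives the kernel width as the full width at half maximum (FWHM) in millimetres. Tools that describe the Gaussian by its standard deviation instead (for example `fslmaths -s`) need the conversion $\sigma = \mathrm{FWHM} / (2\sqrt{2\ln 2}) \approx \mathrm{FWHM} / 2.355$. A small sketch, with `fwhm_to_sigma` as an illustrative helper name:

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


for fwhm_mm in (4, 6, 8):
    print(f"FWHM {fwhm_mm} mm -> sigma {fwhm_to_sigma(fwhm_mm):.2f} mm")
```
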

In [None]:
# Visualize one volume of the smoothed functional image (index 10 is the 11th volume, since indexing starts at 0)
plotting.plot_img(index_img(smoothed_6mm_path, 10), title="Smoothed functional image (6mm FWHM)")

#### 🤔 Do it yourself: 
Smooth the data with a 4 mm FWHM kernel and visualize the result.

_Type your answer in the cell below, then check it against the solution._

<details>
<summary>💡 Show the correct answer</summary>

````python
# Create smoothed functional image path
# processing step is "smoothed_4mm" indicating 4mm smoothing
smoothed_4mm_path = fsl_manager.create_path(src=func_path, proc="smoothed_4mm", suffix="bold", extension=".nii.gz")

# Perform spatial smoothing using nilearn's smooth_img function
# mc_path is the motion-corrected functional image
smoothed_4mm_img = image.smooth_img(mc_path, fwhm=4)  

# Save the smoothed image
smoothed_4mm_img.to_filename(smoothed_4mm_path)
print(f"Smoothed image saved at: {smoothed_4mm_path}")

# Visualize one volume of the smoothed functional image (index 10 is the 11th volume, since indexing starts at 0)
plotting.plot_img(index_img(smoothed_4mm_path, 10), title="Smoothed functional image (4mm FWHM)")

````
</details>


In [None]:
# Write and execute your code below to smooth the functional image with 4mm FWHM
# --- YOUR CODE HERE ---



### Temporal filtering

In [None]:
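# Calculate the sigma values for high-pass and low-pass filtering
# fslmaths -bptf expects sigmas in volumes (not seconds), so we need the TR
# Here the TR is read from the NIfTI header; check that it matches your acquisition
tr = float(image.load_img(func_path).header.get_zooms()[3])  # repetition time in seconds
# 100 seconds/cycle (0.01 Hz) cutoff for high-pass filter
# FSL (FEAT) convention: sigma in volumes = cutoff in seconds / (2 * TR)
hp_sigma = 100 / (2 * tr)
# We only want high-pass filtering here because the sampling rate is very low,
# so we disable the low-pass filter by setting lp_sigma to a negative value
lp_sigma = -1 # set <0 to disable low-pass filter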

# create the output file path for the temporally filtered image
filtered_path = fsl_manager.create_path(src=func_path, proc="filtered-100s", suffix="bold", extension=".nii.gz")

# perform temporal filtering using FSL's fslmaths
# smoothed_6mm_path is the input smoothed functional image at 6mm FWHM
!fslmaths "{smoothed_6mm_path}" -bptf "{hp_sigma}" "{lp_sigma}"  "{filtered_path}" -odt float
print(f"Temporally filtered image saved at: {filtered_path}")
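
The value passed to `-bptf` is a sigma expressed in volumes, so it depends on the TR. The sketch below uses the conversion commonly attributed to FEAT (sigma equals the cutoff in seconds divided by twice the TR). The helper name and the TR of 2.5 s are assumptions for illustration and should be checked against your own acquisition.

```python
def highpass_sigma_in_volumes(cutoff_s, tr_s):
    """Sigma (in volumes) for `fslmaths -bptf`, following FEAT's convention."""
    return cutoff_s / (2.0 * tr_s)


tr_s = 2.5  # assumed TR in seconds; check your own acquisition
for cutoff_s in (60, 100, 128):
    sigma = highpass_sigma_in_volumes(cutoff_s, tr_s)
    print(f"cutoff {cutoff_s:>3} s ({1 / cutoff_s:.4f} Hz) -> sigma {sigma:.1f} volumes")
```

A longer cutoff keeps more slow signal, which matters for block designs: the cutoff should be comfortably longer than one full task cycle so that the filter does not remove the effect of interest.
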

In [None]:
# Load the before-filtered functional image for comparison
before_filtered_func_img = image.load_img(smoothed_6mm_path)

# Load the before-filtered image data as a numpy array
before_filtered_func_data = before_filtered_func_img.get_fdata()

# invert the affine
inv_affine = np.linalg.inv(before_filtered_func_img.affine)

# apply the inverse transform: world -> voxel
i, j, k = image.coord_transform(10, 30, 3, inv_affine)

# convert to integer voxel indices for numpy indexing
voxel_idx = np.round([i, j, k]).astype(int)

# extract the time series at the specified voxel
time_series_before_filtered = before_filtered_func_data[tuple(voxel_idx)]

# Plot the time series
# We use matplotlib for plotting.
# see https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html for more options
plt.plot(time_series_before_filtered, label="Before Filtering")
plt.title("Voxel time course at (10, 30, 3) mm")
plt.xlabel("Time (TRs)")
plt.ylabel("Signal intensity")
plt.legend()
plt.show()
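
The image affine maps voxel indices to world (mm) coordinates, so going from world to voxel requires its inverse, as in the cell above. For the special case of an axis-aligned affine (scaling plus translation, no rotation or shear), the inverse reduces to one subtraction and one division per axis. The sketch below illustrates that case with made-up geometry. Real images generally need the full inverse affine.

```python
def world_to_voxel(xyz_mm, voxel_size_mm, origin_mm):
    """Map world (mm) coordinates to voxel indices for an axis-aligned affine."""
    return tuple(
        round((w - o) / s) for w, o, s in zip(xyz_mm, origin_mm, voxel_size_mm)
    )


# Hypothetical geometry: 3 mm isotropic voxels, voxel (0, 0, 0) at (-60, -90, -30) mm
voxel = world_to_voxel((10, 30, 3), (3.0, 3.0, 3.0), (-60.0, -90.0, -30.0))
print(voxel)
```
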

In [None]:
# Load the filtered functional image
filtered_func_img = image.load_img(filtered_path)

# Load the filtered image data as a numpy array from the Nifti1Image object
filtered_func_data = filtered_func_img.get_fdata()

# extract the time series at the specified voxel
time_series_filtered = filtered_func_data[tuple(voxel_idx)]

# Plot the time series
# We use matplotlib for plotting.
# see https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html for more options
plt.plot(time_series_filtered, label="After Filtering")
plt.title("Voxel time course at (10, 30, 3) mm")
plt.xlabel("Time (TRs)")
plt.ylabel("Signal intensity")
plt.legend()
plt.show()

### Registration and Normalization

As the final step of preprocessing, we need to bring each run's data from the subject-specific native functional space into a common template space (e.g., MNI). This is usually done as a two-step normalization:

1. Registration: sub-A's func image **is aligned with** sub-A's T1w image (func -> T1w)
2. Normalization: sub-A's T1w image **is normalized to** a common template image (T1w -> MNI) 

Both steps involve estimating transformations and applying them to the data. 

Step 1 produces a transformation matrix $ M_{\text{reg}} $, which aligns the
functional image (`func`) to the participant’s anatomical image (`T1w`) by

$$
\text{func} \xrightarrow{\,*M_{\text{reg}}\,} \text{T1w}.
$$

Step 2 yields another matrix $ M_{\text{norm}} $, which brings the anatomical
(`T1w`) image into MNI template space:

$$
\text{T1w} \xrightarrow{\,*M_{\text{norm}}\,} \text{MNI}.
$$

Combining the two gives the total transformation that maps the functional
image directly into MNI space:

$$
\text{func} \xrightarrow{\,*M_{\text{total}}\,} \text{MNI},
\quad \text{where} \quad
M_{\text{total}} = M_{\text{reg}}\,M_{\text{norm}}.
$$

Hence, applying $ M_{\text{total}} $ to the functional data
corresponds to first aligning it to `T1w` and then normalizing to MNI:

$$
\text{func} * M_{\text{reg}} * M_{\text{norm}} = \text{func} * M_{\text{total}}.
$$
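
The composition can be checked numerically with two toy affine matrices. The sketch below is an illustration under stated assumptions: both steps are treated as 4x4 affines (in practice the second step is usually a nonlinear warp), and the matrices are made up. The notation above reads left to right in order of application, whereas the usual column-vector convention for 4x4 affines places the transform applied first on the right of the product.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply_affine(m, point):
    """Apply a 4x4 affine to a 3D point (column-vector convention)."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(m[i][k] * vec[k] for k in range(4)) for i in range(3))


# Toy transforms (not real registrations)
m_reg = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # shift x by 2
m_norm = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]  # scale by 2

# With column vectors, the transform applied first sits on the right
m_total = matmul4(m_norm, m_reg)

point = (1.0, 1.0, 1.0)
two_steps = apply_affine(m_norm, apply_affine(m_reg, point))
one_step = apply_affine(m_total, point)
print(two_steps, one_step)
```

Applying the combined transform once, rather than resampling the data twice, avoids accumulating interpolation blur, which is why the transforms are estimated separately but applied together.
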





#### Register functional to anatomical (func to T1w)
To register the functional data to the T1w image, we need to pick one representative frame (volume) from the functional run as the image to be registered (the input image). This is often the middle volume or the mean volume across time. The reference is the same subject's T1w image.

The registration produces two important files. One is the registered functional volume, which allows us to evaluate whether the registration is accurate. The other is the transformation matrix, which will be used for the final transformation.

In FSL, registration can be done with `flirt` or `epi_reg`. Here we use `flirt` to save processing time. `epi_reg` is more accurate but also more computationally expensive, and it performs best with **field map scans**, which we don't have in this dataset.

For the usage of both tools, type `!flirt --help` and `!epi_reg --help`.

In [None]:
# Create the output file path for the mean functional image
fmean_path = fsl_manager.create_path(src=func_path, proc="mean", suffix="bold", extension=".nii.gz")

# Create the file (mean across time) by applying FSL's fslmaths with the -Tmean option
# Note we use the motion-corrected functional image (mc_path) as input here,
# because registration relies on detailed image features, which smoothing blurs
!fslmaths "{mc_path}" -Tmean "{fmean_path}"
print(f"Mean motion-corrected functional image created: {fmean_path}")

# Register functional to anatomical using flirt
# Create the output file path for the registered functional image
reg_func_path = fsl_manager.create_path(src=func_path, proc="reg2highres", suffix="bold", extension=".nii.gz")

# Create the output file path for the transformation matrix
trans_mat_path = fsl_manager.create_path(src=func_path, proc="reg2highres", suffix="bold", extension=".mat")

# Perform registration using FSL's flirt
# dof 12 indicates 12 degrees of freedom (affine transformation including translation, rotation, scaling, and shearing)
!flirt -in "{fmean_path}" -ref "{brain_path}" -out "{reg_func_path}" -omat "{trans_mat_path}" -dof 12
print(f"Registered functional image created: {reg_func_path}")
print(f"Transformation matrix created: {trans_mat_path}")

Let's now overlay the registered mean functional image and the T1w image to see how well the registration worked.

In [None]:
# Mean functional image with T1 edges
display = plotting.plot_epi(reg_func_path, title="Mean EPI + T1 edges")
display.add_edges(brain_path)   # white edges by default
plotting.show()

Look at the edges of the T1w image: overall, they correspond very well with the coregistered mean functional image.

We can also inspect the overlay interactively.

In [None]:
# Interactive 3D view of the coregistered mean functional image overlaid on the brain-extracted anatomical image
view = plotting.view_img(
    reg_func_path,      # coregistered mean functional image
    bg_img=brain_path,     # T1 as background
    opacity=0.5,           # transparency of EPI overlay
    cmap="cold_hot",       # nice diverging colormap for EPI
)

view  

#### Normalizing T1w to MNI template
In essence, normalization serves the same purpose as the previous registration step. It is called normalization because it is no longer an alignment between different imaging modalities *within a subject*, but an alignment of an individual's idiosyncratic T1w image to a common template. For that reason, it often involves nonlinear warping to bring the subject's T1w image as close to the template as possible.

To do normalization, we need a template image (e.g., MNI152) at a specific resolution (e.g., 2 mm) as the reference image. For better accuracy, a linear pre-alignment between the input and reference images is usually done first to minimize gross spatial differences, followed by a nonlinear warp to fine-tune regional differences.

In FSL, the linear pre-alignment is done with `flirt`, and the nonlinear warp is done with `fnirt` (FMRIB’s Non-linear Image Registration Tool). Check the usage of `fnirt` with `!fnirt --help`.

In [None]:
# first, we need to fetch an MNI template image from FSL
template_path = !echo $FSLDIR/data/standard/MNI152_T1_2mm_brain.nii.gz
template_path = template_path[0]

# Check the template image info
# fslinfo provides information about the NIfTI image, including dimensions, voxel size, data type, etc.
!fslinfo "{template_path}"

In [None]:
# Now, we will perform linear pre-alignment of the subject's T1w image to the MNI template

# Create the pre-alignment transformation matrix path
prealign_mat_path = fsl_manager.create_path(src=anat_path, proc="highres2MNI", suffix="T1w", extension=".mat")

# Perform linear pre-alignment using FSL's flirt
# Note that flirt uses 12 degrees of freedom (a full affine transformation) by default
# We want the pre-alignment matrix only here, so we don't output the aligned image
!flirt -in "{brain_path}" -ref "{template_path}" -omat "{prealign_mat_path}"
print(f"Pre-alignment matrix created: {prealign_mat_path}")

In [None]:
# Now, we will perform nonlinear normalization of the subject's T1w image to the MNI template

# Create the output file path for the normalized T1w image
norm_anat_path = fsl_manager.create_path(src=anat_path, proc="highres2MNI_nonlinear", suffix="T1w", extension=".nii.gz")

# Create the output file path for the warp coefficient image
# Nonlinear warping produces warp coefficient files that describe the deformation field
# It's functionally similar to the transformation matrix in linear registration
warp_coef_path = fsl_manager.create_path(src=anat_path, proc="highres2MNI_nonlinear", suffix="T1w_warpcoef", extension=".nii.gz")

# Perform nonlinear normalization using FSL's fnirt
# Note that we use the pre-alignment matrix as an initial affine transformation
!fnirt --in="{brain_path}" --aff="{prealign_mat_path}" --ref="{template_path}" --cout="{warp_coef_path}" --iout="{norm_anat_path}"
print(f"Normalized anatomical image created: {norm_anat_path}")
print(f"Warp coefficient image created: {warp_coef_path}")

In [None]:
# Visualize the normalized anatomical image with MNI template edges to confirm how well the normalization did
display = plotting.plot_anat(norm_anat_path, title="Normalized T1w + MNI edges")
display.add_edges(template_path)   # white edges by default
plotting.show()

In [None]:
# Now, we will combine the transformations to bring the functional image to MNI space

# Create the output file path for the normalized functional image
norm_func_path = fsl_manager.create_path(src=func_path, proc="clean", suffix="bold_MNI", extension=".nii.gz")

# Perform the transformation using FSL's applywarp
# We combine the functional-to-T1w linear transformation matrix and T1w-to-MNI nonlinear warp coefficient
# Note we use the temporally filtered functional image (filtered_path) as input here, because we are only applying the already computed transformations to it
!applywarp --in="{filtered_path}" --ref="{template_path}" --out="{norm_func_path}" --warp="{warp_coef_path}" --premat="{trans_mat_path}"
print(f"Normalized functional image created: {norm_func_path}")

# To this point, we have completed the main preprocessing steps:
# 1) Brain extraction
# 2) Motion correction
# 3) Spatial smoothing
# 4) Temporal filtering
# 5) Registration of functional to anatomical
# 6) Normalization of anatomical to MNI space
# 7) Applying combined transformations to bring functional to MNI space
# The final preprocessed functional image is saved at 'norm_func_path', which is in MNI space and ready for further analysis.

### fMRIPrep

While we have now covered the basic steps of fMRI preprocessing, you may have noticed that they are tedious and susceptible to the analyst's arbitrary choices. Alternatively, we can use fMRIPrep, a ready-to-use, automated, and optimized pipeline that can preprocess most fMRI data.

To use fMRIPrep, the data must be in BIDS format. It's also important to bear in mind that an automated, general-purpose pipeline has its drawbacks. It can produce suboptimal results in scenarios it was not designed for, such as data acquired with unusual scanning sequences. So it is always recommended to check beforehand whether your data are suitable for fMRIPrep and to inspect its outputs, especially the registration and normalization.

Now let's see how preprocessing with fMRIPrep is implemented.

In [None]:
# load fmriprep
await module.load('fmriprep')      
await module.list()  

You can check the usage of fMRIPrep by typing:
`!fmriprep --help`

In [None]:
# Create a fmriprep derivatives directory specified for fMRIPrep outputs
DERIV_DIR_fmriprep = BASE_DIR / "data" / "derivatives_fmriprep"
DERIV_DIR_fmriprep.mkdir(parents=True, exist_ok=True)

# ---- The following license, environment variables and paths need to be set for fMRIPrep ----

# specify the freesurfer license file path
license_path = DATA_DIR / "license.txt"
os.environ['FS_LICENSE'] = str(license_path)
os.environ['APPTAINERENV_FS_LICENSE'] = str(license_path)  # Pass to container (environment values must be strings)

# Set up environment variables before running fMRIPrep
# These environment variables help fMRIPrep locate necessary resources and optimize performance
os.environ['SUBJECTS_DIR'] = f'{DERIV_DIR_fmriprep}/freesurfer' # specify FreeSurfer subjects directory within fmriprep derivatives
os.environ['APPTAINERENV_SUBJECTS_DIR'] = os.environ['SUBJECTS_DIR'] # Pass to container
os.environ['ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS'] = '6' # limit ITK threads to 6
os.environ['MPLCONFIGDIR'] = os.path.expanduser('~/matplotlib-mpldir') # set matplotlib config directory to avoid permission issues

# Create the freesurfer directory
!mkdir -p {DERIV_DIR_fmriprep}/freesurfer

# ---- Run fMRIPrep ----

# DATA_DIR: input BIDS dataset directory
# DERIV_DIR_fmriprep: output derivatives directory
# participant: run for participant level (can also run for group level, if given "group" instead. Group level requires all subjects to be preprocessed first and it will give a summary report)
# --participant-label 1: specify subject 1 (BIDS ID: "sub-1", if two subjects, use "--participant-label 1 2" to specify both)
# --nprocs 6 --mem 10000: allocate 6 processors and 10GB memory (adjust based on your system resources)
# --output-spaces MNI152NLin2009cAsym: specify output spaces (we just want MNI space here)
# --fs-no-reconall: skip FreeSurfer's recon-all step (this requires a lot of time and resources)
# --skip_bids_validation: skip BIDS validation step
# -v: verbose output
!fmriprep \
  "{DATA_DIR}" \
  "{DERIV_DIR_fmriprep}" \
  participant \
  --participant-label 1 \
  --nprocs 6 --mem 10000 \
  --output-spaces MNI152NLin2009cAsym \
  --fs-no-reconall \
  --skip_bids_validation \
  -v

print("fMRIPrep run finished. Check the log above and the HTML report for errors.")
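
Once the run has finished, it is worth locating the per-subject HTML report and the preprocessed BOLD files before moving on. The helper below is a sketch with an illustrative name (`find_fmriprep_outputs`). It relies only on fMRIPrep's usual file naming (`sub-<label>.html` and `..._desc-preproc_bold.nii.gz`) and searches recursively, because the folder layout differs between fMRIPrep versions.

```python
from pathlib import Path


def find_fmriprep_outputs(deriv_dir, subject, space="MNI152NLin2009cAsym"):
    """List a subject's HTML report and preprocessed BOLD files (sketch)."""
    deriv_dir = Path(deriv_dir)
    # rglob searches subfolders too, since the layout differs between versions
    reports = sorted(deriv_dir.rglob(f"sub-{subject}.html"))
    bold = sorted(
        deriv_dir.rglob(f"sub-{subject}_*space-{space}*desc-preproc_bold.nii.gz")
    )
    return reports, bold


# Example call for this tutorial (uses the directory defined above):
# reports, bold_files = find_fmriprep_outputs(DERIV_DIR_fmriprep, "1")
```

An empty result from this search is a quick signal that the run did not complete, even if the cell itself finished without raising a Python error.
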