# Session 0: Start with fMRI data

In this session, we will learn some basics of fMRI data, including how to visualize and access them.

## Tools We’ll Use

### **Nilearn**
In this tutorial, most of the analyses will be carried out with **[Nilearn](https://nilearn.github.io/stable/index.html)**.  Nilearn is a Python library that simplifies the use of machine learning and statistical tools for fMRI data. It provides convenient functions to load, manipulate, and visualize brain images, as well as to build predictive models and perform decoding analyses.

We will use Nilearn to perform both:
- **First-level analyses**, which model individual subjects’ brain responses to experimental events and produce subject-specific activation maps.
- **Second-level analyses**, which combine these individual results across participants to identify reliable group-level activation patterns.

Beyond GLM-based activation analyses, Nilearn also supports:
- Functional connectivity and network analyses  
- Decoding and multivariate pattern analysis (MVPA)  
- High-quality visualization in both 2D and 3D brain spaces  

## Running Environment: Neurodesktop

One big hassle in fMRI data analysis is that you need to install and configure many software packages, either locally or on a server. You are welcome to install them yourself after class, but for this tutorial we'd like to save you the time and effort of installing the many neuroimaging packages out there (there are tons of them!).

For this reason, we will run notebooks on **[Neurodesktop](https://neurodesk.org/)** — a software container with major neuroimaging tools, including FSL, AFNI, ANTs, and fMRIPrep preinstalled.

In [None]:
# --- Basic Setup (always run this first) ---

# Install dependencies 
%pip install -q gdown
%pip install -q git+https://github.com/Yuan-fang/fMRI-tutorial.git

# Import essential packages
import warnings
from pathlib import Path
import shutil
from nilearn import image, plotting
from nilearn.image import index_img
from bids import BIDSLayout
import numpy as np
import os
import matplotlib.pyplot as plt

from tutorial.utils.paths import PathManager
from tutorial.utils.fetch import fetch_dataset

warnings.filterwarnings("ignore")

We also need to set up the data directories. You can change the directories and names to suit your preference, but for this tutorial let's all stick with the same ones.

In [None]:
# --- Set up data directories ---

DATASET = "Haxby2001" # name of the dataset
BASE_DIR = Path.home() / "fmri_tutorial" # base directory for the tutorial
DATA_DIR = BASE_DIR / "data" / DATASET # data directory for the dataset
DERIV_DIR = BASE_DIR / "data" / "derivatives" # derivatives directory for processed data

for p in (DATA_DIR, DERIV_DIR):
    p.mkdir(parents=True, exist_ok=True)

In [None]:
# --- Download dataset if not already present ---

# Google Drive link to the dataset
download_url = "https://drive.google.com/uc?id=1fPjbWhY6ZDOGSm59duKmOcCpgp5Zf5tX"
fetch_dataset(download_url, DATA_DIR)

### fMRI data folder: BIDS

Let's first get familiar with how fMRI data are organized. As you can see from the "./data" folder, the data are structured in BIDS format.

BIDS (Brain Imaging Data Structure) is a standard for organizing neuroimaging data that many researchers follow. Because everyone uses the same folder structure and naming convention, the same pipelines can process data from different people with little extra work (for more about BIDS, see https://bids.neuroimaging.io/index.html).
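
To make the naming convention concrete, here is a minimal sketch that splits a BIDS-style file name into its `key-value` entities, suffix, and extension. The file name and the helper `parse_bids_name` are hypothetical and only for illustration; in practice pyBIDS does this parsing for you.

```python
from pathlib import Path

def parse_bids_name(filename):
    """Split a BIDS-style file name into entities, suffix, and extension."""
    name = Path(filename).name
    # Everything after the first dot is the extension (e.g. "nii.gz")
    stem, _, ext = name.partition(".")
    # The last underscore-separated chunk is the suffix (e.g. "bold", "T1w")
    *entity_parts, suffix = stem.split("_")
    # The remaining chunks are key-value pairs such as "sub-1" or "run-01"
    entities = dict(part.split("-", 1) for part in entity_parts)
    return entities, suffix, "." + ext

# Hypothetical file name, used only for illustration
entities, suffix, ext = parse_bids_name(
    "sub-1/func/sub-1_task-objectviewing_run-01_bold.nii.gz"
)
print(entities)     # {'sub': '1', 'task': 'objectviewing', 'run': '01'}
print(suffix, ext)  # bold .nii.gz
```

Because every file encodes the subject, task, run, and image type in its name, a tool can find "the BOLD image of run 1 for subject 1" without any dataset-specific logic.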

Let's quickly browse our data. The most intuitive way is to point and click through the data folder to see what's inside. Here we instead use the package pyBIDS (https://bids-standard.github.io/pybids/user_guide.html) to help us *interact* with the data through code.

In [None]:
# create a BIDSLayout object using BIDSLayout (https://bids-standard.github.io/pybids/generated/bids.layout.BIDSLayout.html)
# BIDSLayout is a tool from the package pybids that will help us index our dataset. 
# By creating a BIDSLayout object, we can easily query the dataset to get file paths of specific files.
layout = BIDSLayout(DATA_DIR, validate=False) # we set validate=False to skip BIDS validation for speed. Other parameters can be found in the documentation.

Let's explore what's inside this dataset.

In [None]:
# List all subjects in the dataset
subjects = layout.get_subjects()
print("Subjects:", subjects)

# List all data types available in the dataset for a specific subject.
# Data types can be anat (anatomical), func (functional), dwi (diffusion), etc.
# You can change the subject ID and session
datatypes = layout.get_datatypes(subject="1")
print("Data types:", datatypes)

# For functional data, we often have multiple runs. 
# Let's list all functional runs for a specific subject
runs = layout.get_runs(subject="1")
print("Runs for subject 1:", runs)

With the BIDSLayout object, we can not only browse the data at ease, but also get access to the path of specific data conveniently.

In [None]:
# Get the file path of anatomical image for a specific subject
# .nii.gz is the common file extension for NIfTI files, which is a standard format for storing neuroimaging data
anat_filepath = layout.get(subject="1", suffix="T1w", extension=".nii.gz", return_type="file")[0] # note that we use [0] to get the first item in the list
print("Anatomical files for subject 1:", anat_filepath)

# Get the file path of functional image in run 1 for a specific subject
func_filepath = layout.get(subject="1", suffix="bold", extension=".nii.gz", run="1", return_type="file")[0]
print("Functional files for subject 1, run 1:", func_filepath)

### Visualize the anatomical image

There are many other aspects of the BIDS structure that you can explore with the BIDSLayout object you've just created. I will leave them for you to explore after class.

Here, let's first focus on the anatomical image, i.e., T1w (T1-weighted) image. We will visualize the T1w image.

For visualization and many further fMRI analyses, we will rely on the package Nilearn (https://nilearn.github.io/stable/index.html)

In [None]:
# Plot this subject's anatomical image at the default coordinates
view = plotting.plot_anat(anat_filepath) # see https://nilearn.github.io/stable/modules/generated/nilearn.plotting.plot_anat.html for more options
plotting.show()

You can see above that the default display shows three views (remember the name of each view?) at one specific location (7, -3, 7). The coordinates are in millimeters. They have a similar meaning to MNI coordinates. However, because the T1w image has not yet been normalized to a template space (such as MNI space), we simply call them "coordinates in the local/native space".

Now let's look at another location in the local space, e.g., (-30, 10, 40) 

In [None]:
# change the cut coordinates to (-30, 10, 40) 
view = plotting.plot_anat(anat_filepath, cut_coords=(-30, 10, 40))
plotting.show()

You must have noticed that the above plotting method only lets you visualize the image at one specific location.

To explore the T1w image more interactively, you can use another more general-purpose tool in Nilearn.

In [None]:
# To explore the image interactively, you can use another method.
plotting.view_img(anat_filepath, cmap="gray") # see https://nilearn.github.io/stable/modules/generated/nilearn.plotting.view_img.html for more options

### Visualize the functional image

After visualizing the T1w image, let's now visualize one run of the func images.

Unlike the T1w image, which is 3D, func images are 4D. Plotting 4D data in Nilearn is difficult (in fact, Nilearn's interactive viewing functions are, overall, far from comparable to those of other software, such as FSL's fslview). Therefore, we need to pick one volume from the 4D series and plot it.
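
Picking one volume from a 4D series amounts to slicing along the last (time) axis. The sketch below shows the idea on a synthetic NumPy array with made-up dimensions; it is not the tutorial data. With an integer index, `index_img` does the equivalent on an image object and returns a 3D image.

```python
import numpy as np

# Synthetic 4D "functional" array: 4 x 5 x 6 voxels, 10 time points (made-up sizes)
rng = np.random.default_rng(0)
func_like = rng.normal(size=(4, 5, 6, 10))

volume_index = 0
# Time is the last axis, so one volume is a 3D slice along that axis
vol_like = func_like[..., volume_index]

print(func_like.shape)  # (4, 5, 6, 10)
print(vol_like.shape)   # (4, 5, 6)
```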

In [None]:
# Set the index of the first volume from the functional image (you can change this index to plot other volumes)
volume_index = 0 # note that the index starts from 0
# Get the volume with the specified index from the functional image
# see https://nilearn.github.io/stable/modules/generated/nilearn.image.index_img.html for more options
vol = image.index_img(func_filepath, volume_index)
# Plot the selected volume with plot_epi.
# plot_epi is a counterpart to plot_anat. see https://nilearn.github.io/stable/modules/generated/nilearn.plotting.plot_epi.html for more options
plotting.plot_epi(vol, title="BOLD volume t=0")
plotting.show()

Did you notice that the functional images come at much lower resolution than the T1w (T1-weighted) structural image?

In the next section, we will see how to get their respective resolution information from the images themselves.

#### 🧠 Do it yourself: 
Can you display the above volume at another location, say, (-30, 20, 11), with *plot_epi*?

_Type your answer in the cell below, then check the answer._

<details>
<summary>💡 Show the correct answer</summary>

````python
view = plotting.plot_epi(vol, title="BOLD volume t=0", cut_coords=(-30, 20, 11))
plotting.show()
````
</details>

In [None]:
# Write and execute your code below to display the above volume at different coordinates (-30, 20, 11)
# --- YOUR CODE HERE ---



#### 🧠 Do it yourself: 
Similar to the T1w image, we can also display the volume interactively. Remember the more general-purpose tool *view_img*?

_Type your answer in the cell below, then check the answer._

<details>
<summary>💡 Show the correct answer</summary>

````python
plotting.view_img(vol, cmap="gray") 
````
</details>

In [None]:
# Write and execute your code below to display the above volume interactively
# --- YOUR CODE HERE ---



Although Nilearn offers very limited ways to display 4D data, there is one way to visualize 4D func images. Perhaps contrary to what you expect, this method plots voxel intensities across time, so the 3D data in each volume are flattened to 1D.

This method is very useful for a quick look at head-movement noise in a func run. Run the code cell below. Do you notice several bands (abrupt changes in overall grayscale between neighboring time points)? They are often related to global (whole-brain) signal fluctuations induced by head movement.
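
The flattening behind a carpet plot can be sketched with plain NumPy. The example below uses synthetic data with an artificial global jump (all sizes and values are made up) and shows why such a jump appears as a vertical band: every voxel changes at the same time point. Nilearn's `plot_carpet` does more than this (for example, it restricts the plot to a brain mask), so treat this only as an illustration of the idea.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, n_z, n_t = 4, 5, 6, 20  # made-up sizes
data = rng.normal(loc=100.0, scale=1.0, size=(n_x, n_y, n_z, n_t))
# Simulate a global signal jump from time point 12 onward (purely synthetic)
data[..., 12:] += 10.0

# Flatten the three spatial axes: each row is one voxel, each column one time point
carpet = data.reshape(-1, n_t)
print(carpet.shape)  # (120, 20)

# Mean over voxels at each time point, then the largest jump between neighbors
global_signal = carpet.mean(axis=0)
jump_at = int(np.argmax(np.abs(np.diff(global_signal)))) + 1
print(jump_at)  # 12
```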

In [None]:
# Visualize the functional (bold) images
# see https://nilearn.github.io/stable/modules/generated/nilearn.plotting.plot_carpet.html for more options
plotting.plot_carpet(func_filepath)

### Access the fMRI data

We just displayed the T1w and func images. Very often, we want to access the fMRI data directly so that we can run specific computations on them. To do this, we need to load the data as 3D or 4D arrays.

Next, we first load the T1w image as a Nifti1Image object. A Nifti1Image is simply a Python object that holds the image data together with its metadata (such as the header and the affine).

In [None]:
# Load the T1w image
# It reads the image from the file and stores it as a Nifti1Image object (https://nilearn.github.io/stable/modules/generated/nilearn.image.load_img.html)
t1_img = image.load_img(anat_filepath)

# Print the Nifti1Image object to see its information
print(t1_img)

You can see that the Nifti1Image object holds all the information about the T1w image. We do not need to print all of it (it's too cluttered). Here we just want to know the image's shape and voxel size, so we can do the following:

In [None]:
# Print the shape of the T1w image (dimensions in x, y, z)
print("Shape (x, y, z):", t1_img.shape)

# Print the voxel size (in mm) in each dimension (x, y, z)
print("Voxel size (mm):", t1_img.header.get_zooms())

We can also access the actual value (MRI signal, or grayscale intensity) at a specific coordinate. You will see that the T1w data are essentially a 3D array.

In [None]:
# Load the image data as a numpy array from the Nifti1Image object
t1_data = t1_img.get_fdata()

# Print the shape and data type of the image data
print("Data type:", t1_data.dtype)
print("Data shape:", t1_data.shape)

# Access the value at a specific voxel coordinate (x=90, y=120, z=130) in voxel space. 
# You can change the coordinate to any value within the image dimensions.
# Note that the coordinate starts from (0, 0, 0)
print("Value at a specific voxel coordinate:", t1_data[90, 120, 130])

Note that in the above example, the coordinates (90, 120, 130) are voxel-space coordinates. Put another way, they are just *indices* of a specific location in the 3D data matrix (or array), so they start from (0, 0, 0) and their unit is the *voxel*. They differ from the native-space coordinates you saw earlier, such as (-30, 10, 40), whose unit is the *millimeter (mm)*.

But what if we want to know which native-space coordinates the voxel-space coordinates (90, 120, 130) correspond to?

In [None]:
# Get the affine matrix of the T1w image
affine = t1_img.affine
print("Affine matrix:\n", affine)

# Get the coordinates in native space (in mm) corresponding to the voxel space coordinate (90, 120, 130)
voxel_coord = [90, 120, 130, 1]  # add a 1 for homogeneous coordinates
native_coord = affine.dot(voxel_coord)  # matrix multiplication
print("Coordinates in native space (mm):", native_coord[:3])  # exclude the last element

# Another more convenient way to convert voxel space coordinates to native space coordinates is to use the function `nilearn.image.coord_transform`
# see https://nilearn.github.io/stable/modules/generated/nilearn.image.coord_transform
native_coord2 = image.coord_transform(90, 120, 130, affine)
print("Coordinates in native space (mm) using coord_transform:", native_coord2)

As the above example shows, to convert voxel-space coordinates to native-space coordinates, or put another way, to find the correspondence between voxel locations in an array and real-world coordinates, one piece of information is essential: the `affine` matrix.

`affine` is a **4×4 matrix** that maps from voxel indices `(i, j, k)`  to scanner-space (or native-space) coordinates `(x, y, z)` in **millimeters**.

Mathematically, the mapping is:

$$[x, y, z, 1]^T = \text{affine} \cdot [i, j, k, 1]^T$$
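
As a worked instance of this formula, the sketch below applies a made-up affine (2 mm isotropic voxels with a shifted origin, not taken from the tutorial data) to one voxel index and then maps the result back with the inverse affine.

```python
import numpy as np

# Made-up affine: 2 mm isotropic voxels, shifted origin (not from the tutorial data)
example_affine = np.array([
    [2.0, 0.0, 0.0, -90.0],
    [0.0, 2.0, 0.0, -126.0],
    [0.0, 0.0, 2.0, -72.0],
    [0.0, 0.0, 0.0, 1.0],
])

ijk1 = np.array([45, 63, 36, 1])  # voxel index plus the homogeneous 1
xyz1 = example_affine @ ijk1      # [x, y, z, 1]^T = affine . [i, j, k, 1]^T
print(xyz1[:3])  # [0. 0. 0.]

# The inverse affine maps millimeters back to voxel indices
back = np.linalg.inv(example_affine) @ xyz1
print(np.round(back[:3]).astype(int))  # [45 63 36]
```

For example, the first row gives x = 2 * 45 - 90 = 0 mm: the diagonal entry scales the index by the voxel size, and the last column shifts the origin.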



At its core, fMRI data are remarkably simple. They consist of two key components:

1. **A data matrix** — a 3D (or 4D) array of numbers. Conceptually, this is no different from any other dataset you might encounter — whether in agriculture, transportation, or finance. Each element in the matrix represents a data point (e.g., voxel intensity) at a specific index.

2. **A linear transformation matrix (`affine`)** — this defines how the voxel indices in the data matrix correspond to real-world spatial locations in millimeters. In other words, it tells you *where* each data point lies in the brain or scanner space.

Together, these two components allow us to move seamlessly between array indices and meaningful anatomical coordinates.
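
One consequence of this pairing is that the voxel size is already encoded in the affine: each of its first three columns is the step, in mm, taken when moving one voxel along an array axis. The sketch below recovers the voxel sizes from a made-up affine (values chosen only for illustration); for a real image the result would normally agree with the spatial values from `header.get_zooms()`.

```python
import numpy as np

# Made-up affine with anisotropic voxels (3.5 x 3.75 x 3.75 mm), for illustration only
example_affine = np.array([
    [-3.5, 0.0, 0.0, 68.25],
    [0.0, 3.75, 0.0, -116.25],
    [0.0, 0.0, 3.75, -71.25],
    [0.0, 0.0, 0.0, 1.0],
])

# The length of each of the first three columns is the voxel size along i, j, k
voxel_sizes = np.sqrt((example_affine[:3, :3] ** 2).sum(axis=0))
print(voxel_sizes.tolist())  # [3.5, 3.75, 3.75]
```

The negative sign in the first column only flips the direction of the axis; it does not change the voxel size.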


Next, let's see what a func image (T2* image) looks like.


In [None]:
# Load the functional image
func_img = image.load_img(func_filepath)

# Print the Nifti1Image object to see its information
print(func_img)

#### 🧠 Do it yourself: 
Similar to the T1w image, please print the shape of this func image and its voxel size.

_Type your answer in the cell below, then check the answer._

<details>
<summary>💡 Show the correct answer</summary>

````python
# Print the shape of the func image (dimensions in x, y, z, t)
print("Shape (x, y, z, t):", func_img.shape)

# Print the voxel size (in mm) in x, y, z; the 4th value is the TR
print("Voxel size (mm):", func_img.header.get_zooms())


````
</details>

In [None]:
# Write and execute your code below to display the shape of this func image and its voxel size.
# --- YOUR CODE HERE ---



Did you notice that for func images, unlike the T1w image, the outputs of `func_img.shape` and `func_img.header.get_zooms()` have a 4th dimension? It is the temporal dimension, since a functional image is a time series of volumes.

1. The 4th dimension in the output of `func_img.shape` tells you **how many volumes (TRs) this func image contains in total**
2. The 4th dimension in the output of `func_img.header.get_zooms()` tells you **the resolution in the temporal dimension (i.e., the TR)**
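
Combining these two numbers gives the duration of a run and the onset time of each volume. The sketch below uses hypothetical values standing in for the outputs of `func_img.shape` and `func_img.header.get_zooms()`, and it assumes the TR is stored in seconds, which is worth checking in the header of your own data.

```python
# Hypothetical values standing in for func_img.shape and func_img.header.get_zooms()
shape = (40, 64, 64, 121)
zooms = (3.5, 3.75, 3.75, 2.5)

n_volumes = shape[3]
tr = zooms[3]  # assumed to be in seconds; check the header's time units

run_duration_s = n_volumes * tr
print(run_duration_s)  # 302.5

# Onset time (in seconds) of each volume, starting at 0
volume_onsets = [t * tr for t in range(n_volumes)]
print(volume_onsets[:3])  # [0.0, 2.5, 5.0]
```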

Now, let's extract one voxel's time course from the `func_img`. Let's say the voxel is located at coordinates (0, -10, 3) (in mm).

#### 🧠 Do it yourself: 
Please first extract the actual data from `func_img`. Remember to print the data type and shape.

_Type your answer in the cell below, then check the answer._

<details>
<summary>💡 Show the correct answer</summary>

````python
# Load the image data as a numpy array from the Nifti1Image object
func_data = func_img.get_fdata()

# Print the shape and data type of the image data
print("Data type:", func_data.dtype)
print("Data shape:", func_data.shape)


````
</details>

In [None]:
# Write and execute your code below to extract the 4D data from `func_img`. Remember to print the data type and shape.
# --- YOUR CODE HERE ---



Once we have the 4D data matrix, we need to identify which element in the 4D array corresponds to the local-space coordinates (0, -10, 3).

Remember the `affine` matrix in the `func_img` object and how we transformed an element index to local-space coordinates for the T1w image? We can simply do it in reverse: from the local-space coordinates (0, -10, 3) to the element index in the matrix.

#### 🧠 Do it yourself: 
Please identify the element index corresponding to the local-space coordinates (0, -10, 3).

_Hint: you could use `image.coord_transform` to do this. But remember to pass the **inverse affine matrix** to the command._

```python

# invert the affine to get the inverse affine matrix
inv_affine = np.linalg.inv(func_img.affine)

```


_Type your answer in the cell below, then check the answer._

<details>
<summary>💡 Show the correct answer</summary>

````python
# invert the affine
inv_affine = np.linalg.inv(func_img.affine)

# apply the inverse transform: world -> voxel
i, j, k = image.coord_transform(0, -10, 3, inv_affine)

# the outputs i, j, k are generally not integers,
# so round them to integers before using them as voxel indices
voxel_idx = np.round([i, j, k]).astype(int)
print("voxel (i, j, k):", tuple(voxel_idx))

````
</details>

In [None]:
# Write and execute your code below to convert the local space coordinates (0, -10, 3) to the corresponding voxel indices (i, j, k) in the 4D data matrix.
# --- YOUR CODE HERE ---




Now, with the data matrix and the element index for the local-space coordinates (0, -10, 3), the rest is straightforward. You just need to extract the values at that index and plot them. It's purely Python plotting.
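
One caveat when indexing with rounded coordinates: a location outside the image yields indices that are either too large (NumPy raises an `IndexError`) or negative (NumPy silently wraps around to the other side of the array). The sketch below wraps the conversion in a small helper with a bounds check. The helper name `mm_to_voxel`, the affine, and the data are all hypothetical stand-ins, not part of Nilearn or the tutorial dataset.

```python
import numpy as np

def mm_to_voxel(xyz_mm, affine, shape):
    """Convert mm coordinates to integer voxel indices, checking array bounds."""
    ijk = np.linalg.inv(affine) @ np.append(xyz_mm, 1.0)
    idx = np.round(ijk[:3]).astype(int)
    if np.any(idx < 0) or np.any(idx >= np.array(shape[:3])):
        raise ValueError(f"{tuple(xyz_mm)} mm falls outside the image")
    return tuple(int(v) for v in idx)

# Synthetic stand-ins for func_img.affine and func_data (made-up values)
example_affine = np.array([
    [3.0, 0.0, 0.0, -30.0],
    [0.0, 3.0, 0.0, -40.0],
    [0.0, 0.0, 3.0, -20.0],
    [0.0, 0.0, 0.0, 1.0],
])
example_data = np.arange(20 * 25 * 15 * 8, dtype=float).reshape(20, 25, 15, 8)

idx = mm_to_voxel((0, -10, 3), example_affine, example_data.shape)
print(idx)  # (10, 10, 8)

# Indexing the three spatial axes leaves the time axis: one value per volume
example_series = example_data[idx]
print(example_series.shape)  # (8,)
```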

Below is the complete code for doing this: extracting the 4D data, converting the local-space coordinates to an element index, and then extracting the time course and plotting it.
You can come up with your own version, but make sure you understand each step of the code below.

In [None]:
# Load the image data as a numpy array from the Nifti1Image object
func_data = func_img.get_fdata()

# invert the affine
inv_affine = np.linalg.inv(func_img.affine)

# apply the inverse transform: world -> voxel
i, j, k = image.coord_transform(0, -10, 3, inv_affine)

# convert to integer voxel indices for numpy indexing
voxel_idx = np.round([i, j, k]).astype(int)

# extract the time series at the specified voxel
time_series = func_data[tuple(voxel_idx)]

# Plot the time series
# We use matplotlib for plotting.
# see https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html for more options
plt.plot(time_series)
plt.title(f"Voxel time course at (0, -10, 3) mm (voxel {tuple(voxel_idx)})")
plt.xlabel("Time (TRs)")
plt.ylabel("Signal intensity")
plt.show()