Importing Libraries:  
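The notebook imports the following modules at the start:  
-	os, sys, time, pprint, pathlib, random, numpy (np alias), pandas (pd alias), ipywidgets, tqdm.notebook, nibabel (nib alias), nilearn, bids, scipy.ndimage, h5py, einops, and the project modules noise_ceiling and tc2see. The matplotlib.pyplot and glmsingle imports are commented out in this cell; matplotlib.pyplot is imported later, in the visualization cell.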

In [37]:
%load_ext autoreload
%autoreload 2

import os
import sys
import time
from pprint import pprint
from pathlib import Path
from random import randint

import numpy as np
import pandas as pd
# import matplotlib.pyplot as plt
from ipywidgets import interact
from tqdm.notebook import tqdm
import nibabel as nib
import nilearn
from nilearn import image
# import glmsingle
# from glmsingle.glmsingle import GLM_single
import bids
from bids import BIDSLayout
from scipy.ndimage import zoom, binary_dilation
import h5py
from einops import rearrange

dir2 = os.path.abspath('..')
dir1 = os.path.dirname(dir2)
if dir1 not in sys.path:
    sys.path.append(dir1)
    
from noise_ceiling import (
    compute_ncsnr,
    compute_nc,
)

from tc2see import load_data

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


Defining Dataset Paths and Variables:  
This section establishes the paths to the project directories and initializes key variables. It defines the paths to the dataset, the derivatives, and the preprocessed fMRI data, and it sets the dataset version, the number of runs, the task name, the image space, and the subject number.  
(Adjust this path and any other path as needed)

In [38]:
dataset_root = Path('E:\\fmri_processing\\results')

In [39]:
tc2see_version = 3 # [1, 2, 3]
dataset_path = dataset_root
derivatives_path = dataset_path / 'derivatives_TC2See'
data_path = derivatives_path / 'fmriprep'
num_runs = 6 if tc2see_version in (1, 3) else 8

task = "bird"
space = 'T1w' # ['T1w', 'MNI152NLin2009cAsym']

subject_no = '31'

# Initialize BIDSLayouts for querying files.
dataset_layout = BIDSLayout(dataset_path / 'TC2See')
derivatives_layout = BIDSLayout(derivatives_path / 'fmriprep', derivatives=True, validate = False)
spm = False

Example contents of 'dataset_description.json':
{"Name": "Example dataset", "BIDSVersion": "1.0.2", "GeneratedBy": [{"Name": "Example pipeline"}]}
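
The path logic in the cell above can be sketched as a small helper that also reports which directories are missing before any BIDSLayout is built. This is only an illustration: `build_paths` and its dictionary keys are hypothetical names, not part of tc2see, and the `'results'` root is a placeholder.

```python
from pathlib import Path

def build_paths(dataset_root, tc2see_version):
    """Hypothetical helper mirroring the path cell above; not part of tc2see."""
    dataset_root = Path(dataset_root)
    derivatives_path = dataset_root / 'derivatives_TC2See'
    return {
        'raw': dataset_root / 'TC2See',
        'derivatives': derivatives_path,
        'fmriprep': derivatives_path / 'fmriprep',
        # Versions 1 and 3 use 6 runs; any other version is treated as 8 runs.
        'num_runs': 6 if tc2see_version in (1, 3) else 8,
    }

paths = build_paths('results', tc2see_version=3)  # placeholder root directory
missing = [str(p) for key, p in paths.items()
           if key != 'num_runs' and not p.exists()]
print(paths['num_runs'], 'runs; missing directories:', missing)
```

Checking for missing directories first tends to give a clearer error than a failed layout query later on.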


Processing fMRI Data for Subjects:  
This cell prepares the preprocessed fMRI data for analysis. It first sets configuration variables: the subject IDs, the number of TRs, the TR duration, the number of brain mask dilation iterations, and the number of stimuli. It then opens the stimulus image file and builds a mapping from stimulus names to unique identifiers. Next, it creates an HDF5 file, named after the dataset version, to hold the preprocessed data. For each subject, it creates a group in the file, loads the brain mask (dilating it unless the SPM branch is used), and creates datasets for the BOLD data, the per-run mean and standard deviation, the linear trend coefficients, and the stimulus information. For each run, a least-squares linear trend is fitted to the masked BOLD data, and the raw data, the trend coefficients, and the residual standard deviation are stored together with the stimulus TRs and IDs.

In [40]:
#subjects = ['01', '02']
# subjects = [str(sub) if sub >= 10 else '0'+str(sub) for sub in range(5,28)] # Subject ID to process data for
subjects = [subject_no]

num_trs = 236 #231 #229 #236 - 28,29  # Total number of TRs in the fMRI data

tr = 2. # 1.97  # TR duration (in seconds)
mask_dilations = 3  # Number of dilation iterations for the brain mask
num_stimuli = 75 # 112  # Total number of different stimuli

# Load stimulus images and create a mapping of stimulus names to unique identifiers
stimulus_images = h5py.File(derivatives_path / 'stimulus-images.hdf5', 'r')
stimulus_id_map = {name: i for i, name in enumerate(stimulus_images.attrs['stimulus_names'])}

new_or_append = 'w' # Use 'a' for append/overwrite, 'w' for new hdf5 file
           
# Create or append to an HDF5 file to store preprocessed fMRI data
with h5py.File(data_path / f'tc2see-v{tc2see_version}-bold-test-31.hdf5', new_or_append) as f:
    for subject in tqdm(subjects):
        if f'sub-{subject}' not in list(f.keys()):
            try:
                print(f"Processing subject {subject}...")

                group = f.require_group(f'sub-{subject}')
            
                # Load the brain mask and perform binary dilation to include neighboring voxels
                if spm:
                    mask_image = nib.load(derivatives_path / f'spm/sub-{subject}/func/arsub-{subject}_task-bird_run-1_bold.nii')
                    fmri_mask = mask_image.get_fdata()[..., 0].astype(bool)
                    fmri_mask[:] = True
                else:
                    mask_image = derivatives_layout.get(
                        subject=subject,
                        run=1,
                        task=task,
                        space=space, 
                        desc='brain',
                        extension='nii.gz',
                    )[0].get_image()

                    fmri_mask = mask_image.get_fdata().astype(bool)
                    fmri_mask = binary_dilation(fmri_mask, iterations=mask_dilations)

                num_voxels = fmri_mask.sum()

                # If necessary attributes and datasets don't exist in the group, create them
                if 'affine' not in group:
                    group['affine'] = mask_image.affine
                
                #H, W, D = fmri_mask.shape
                if 'fmri_mask' not in group:
                    group['fmri_mask'] = fmri_mask
                    
                group.require_dataset('bold', shape=(num_runs, num_trs, num_voxels), dtype='f4')
                group.require_dataset('bold_mean', shape=(num_runs, num_voxels), dtype='f4')
                group.require_dataset('bold_std', shape=(num_runs, num_voxels), dtype='f4')
                group.require_dataset('bold_trend', shape=(num_runs, 2, num_voxels), dtype='f4')
                group.require_dataset('bold_trend_std', shape=(num_runs, num_voxels), dtype='f4')
                group.require_dataset('stimulus_trs', shape=(num_runs, num_stimuli), dtype='f4')
                group.require_dataset('stimulus_ids', shape=(num_runs, num_stimuli), dtype='i4')
                
                for run_id in tqdm(range(num_runs)):
                    if spm:
                        bold = nib.load(derivatives_path / f'spm/sub-{subject}/func/arsub-{subject}_task-bird_run-{run_id+1}_bold.nii').get_fdata()
                    else:
                        # Load the preprocessed fMRI data for the current subject and run
                        bids_image = derivatives_layout.get(
                            subject=subject,
                            run=run_id + 1,
                            space=space, 
                            task=task,
                            desc='preproc', 
                            extension='nii.gz',
                        )[0]
                        
                        bold = bids_image.get_image().get_fdata()

                    bold = bold[fmri_mask].T  # Extract the relevant voxels
                    print(f'{bold.shape=}')

                    num_trs_run = bold.shape[0]
                    trend_coeffs = np.stack([np.arange(num_trs_run), np.ones(shape=num_trs_run)], axis=1)
                    
                    # Perform linear detrending on the bold data
                    bold_trend = np.linalg.lstsq(trend_coeffs, bold, rcond=None)[0]
                    bold_predicted = trend_coeffs @ bold_trend
                    bold_detrend = bold - bold_predicted

                    # Load events data for the current subject and run
                    events_file = dataset_layout.get(
                        subject=subject,
                        run=run_id + 1,
                        task=task,
                        extension='tsv'
                    )[0]
                    
                    events_df = pd.read_csv(events_file.path, sep='\t')
                    events_df = events_df[events_df['stimulus'] != '+']
                    stimulus_names = [Path(stimulus_path).stem for stimulus_path in events_df['stimulus']]
                    stimulus_names = [
                        name[:name.find('hash')-1] if "hash" in name else name
                        for name in stimulus_names
                    ]
                    stimulus_ids = [stimulus_id_map[name] for name in stimulus_names]                    
                    stimulus_trs = np.array(events_df['tr']).astype(np.float32)
                    
                    # Store various datasets in the HDF5 file
                    group['bold'][run_id, :num_trs_run] = bold
                    group['bold_mean'][run_id] = bold.mean(axis=0)
                    group['bold_std'][run_id] = bold.std(axis=0)
                    group['bold_trend'][run_id] = bold_trend
                    group['bold_trend_std'][run_id] = bold_detrend.std(axis=0)
                    group['stimulus_trs'][run_id] = stimulus_trs
                    group['stimulus_ids'][run_id] = stimulus_ids
                
            except Exception as e:
                print(f"Error processing {subject}: {e}")
                del f[f'sub-{subject}']
                continue
        else:
            print(f"Subject {subject} already exists")
            print(f[f'sub-{subject}']['bold'].shape)

  0%|          | 0/1 [00:00<?, ?it/s]

Processing subject 31...


  0%|          | 0/6 [00:00<?, ?it/s]

bold.shape=(236, 166634)
bold.shape=(236, 166634)
bold.shape=(236, 166634)
bold.shape=(236, 166634)
bold.shape=(236, 166634)
bold.shape=(236, 166634)
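
The linear detrending step in the cell above can be tried on synthetic data. The sketch below uses the same design matrix (a slope column and a constant column) and the same `np.linalg.lstsq` call; the slopes, intercepts, and noise level are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_trs, num_voxels = 236, 4

# Synthetic BOLD (time x voxels): per-voxel baseline + linear drift + noise.
t = np.arange(num_trs)
slopes = np.array([0.5, -0.2, 0.0, 1.0])
intercepts = np.array([100.0, 200.0, 300.0, 400.0])
bold = intercepts + np.outer(t, slopes) + rng.normal(scale=0.1, size=(num_trs, num_voxels))

# Design matrix with a slope column and a constant column.
design = np.stack([t, np.ones(num_trs)], axis=1)

# Least-squares fit: row 0 holds the slopes, row 1 the intercepts.
coeffs = np.linalg.lstsq(design, bold, rcond=None)[0]
residual = bold - design @ coeffs

print(coeffs.shape)  # (2, 4)
print(residual.std(axis=0))  # spread left after removing the trend
```

The `(2, num_voxels)` coefficient array matches the shape of one run in the `bold_trend` dataset, and the residual standard deviation corresponds to what is stored in `bold_trend_std`.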


### Noise Ceiling File Creation

Loading and Processing Events Data:  
This cell loads the events file for one subject, run, and task. It drops the fixation rows (stimulus '+'), strips the hash suffix from each stimulus file name, maps each name to its unique identifier, and prints the length of the stimulus_ids list.

In [41]:
end

NameError: name 'end' is not defined

In [None]:
events_file = dataset_layout.get(
    subject=subject_no,
    run=1,
    task=task,
    extension='tsv'
)[0]
events_df = pd.read_csv(events_file.path, sep='\t')
events_df = events_df[events_df['stimulus'] != '+']
stimulus_names = [Path(stimulus_path).stem for stimulus_path in events_df['stimulus']]
stimulus_names = [
    name[:name.find('hash')-1] if "hash" in name else name
    for name in stimulus_names
]
stimulus_ids = [stimulus_id_map[name] for name in stimulus_names]
print(len(stimulus_ids))

stimulus_trs = np.array(events_df['tr']).astype(np.float32)

75
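
The name cleaning and ID lookup used above can be isolated in a few lines. In this sketch, `stimulus_key` is a hypothetical helper name and the file names are invented; only the slicing rule is taken from the cell.

```python
from pathlib import Path

def stimulus_key(stimulus_path):
    """Return the file stem, cutting it one character before 'hash' if present."""
    name = Path(stimulus_path).stem
    if 'hash' in name:
        name = name[:name.find('hash') - 1]
    return name

# Invented file names, for illustration only.
example_paths = ['stimuli/robin_01_hash3fa9.png', 'stimuli/wren_02.png']
names = [stimulus_key(p) for p in example_paths]

# Same construction as stimulus_id_map: position in the name list is the ID.
id_map = {name: i for i, name in enumerate(['robin_01', 'wren_02'])}
ids = [id_map[name] for name in names]
print(names, ids)
```

Note that the rule assumes a single separator character sits directly before `hash`; a stem that starts with `hash` would lose its last character instead.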


Loading and Processing fMRI Data:  
This cell loads the stored BOLD data with load_data, passing a TR offset of 4, per-run linear-trend normalization, and no interpolation. It then prints the shapes of the BOLD array, the mask, and the stimulus IDs, along with the voxel count, as a sanity check. The 450 rows in the output correspond to 6 runs of 75 stimuli each.

In [None]:
bold, stimulus_ids, mask, affine = load_data(
    data_path / f'tc2see-v{tc2see_version}-bold-2.hdf5', 
    f'sub-{subject_no}', 
    tr_offset=4,
    run_normalize='linear_trend',
    interpolation=False,
)
print(bold.shape)
print(mask.shape)
print("num_voxels: ", mask.sum())
print(stimulus_ids.shape)

  run_bold = (run_bold - predicted_bold) / group['bold_trend_std'][i]


(450, 10828)
(63, 75, 67)
num_voxels:  10828
(450,)
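
The warning fragment in the output shows that load_data divides the detrended signal by `bold_trend_std`. The source of load_data is not shown here, so the following is only a guess at what one run of that step might look like: `normalize_and_sample`, the `eps` guard, and the rounding of stimulus TRs are all assumptions made for illustration.

```python
import numpy as np

def normalize_and_sample(run_bold, trend, trend_std, stimulus_trs, tr_offset, eps=1e-8):
    """Hypothetical stand-in for one run of linear-trend normalization."""
    num_trs = run_bold.shape[0]
    design = np.stack([np.arange(num_trs), np.ones(num_trs)], axis=1)
    residual = run_bold - design @ trend
    # Voxels with (near) zero residual spread are left at 0 instead of dividing by ~0.
    valid = trend_std > eps
    normalized = np.zeros_like(residual)
    normalized[:, valid] = residual[:, valid] / trend_std[valid]
    # Assumption: read each trial tr_offset TRs after onset, at the nearest volume.
    idx = np.clip(np.round(stimulus_trs + tr_offset).astype(int), 0, num_trs - 1)
    return normalized[idx]

rng = np.random.default_rng(0)
run_bold = rng.normal(size=(20, 3)) + 50.0
run_bold[:, 2] = 50.0  # constant voxel, e.g. outside the brain

design = np.stack([np.arange(20), np.ones(20)], axis=1)
trend = np.linalg.lstsq(design, run_bold, rcond=None)[0]
trend_std = (run_bold - design @ trend).std(axis=0)

samples = normalize_and_sample(run_bold, trend, trend_std, np.array([2.0, 6.0, 10.0]), tr_offset=4)
print(samples.shape)  # (3, 3)
```

Guarding the division this way keeps constant voxels from producing NaN or infinite values, which is one plausible cause of the warning seen in the output.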


Calculating Noise Ceiling:  
This section focuses on computing noise ceiling metrics, including noise ceiling signal-to-noise ratio (ncsnr) and noise ceiling (nc). The generated metrics can help assess the quality and reliability of the fMRI data.

In [None]:
ncsnr = compute_ncsnr(bold, stimulus_ids) # Compute noise ceiling signal-to-noise ratio
nc = compute_nc(ncsnr, num_averages=1)
nc_volume = np.zeros_like(mask, dtype=float)
nc_volume[mask] = nc
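
The source of compute_ncsnr and compute_nc is not shown in this notebook. As a rough guide to what such functions typically compute, the sketch below follows a common formulation for repeated-stimulus designs: responses are z-scored per voxel, the noise variance is the average within-stimulus variance, and the noise ceiling is 100 · ncsnr² / (ncsnr² + 1/n). The names `ncsnr_sketch` and `nc_sketch` are hypothetical, and the project functions may differ in detail.

```python
import numpy as np

def ncsnr_sketch(responses, stimulus_ids):
    """responses: (trials, voxels); stimulus_ids: (trials,). Assumed formulation."""
    z = (responses - responses.mean(axis=0)) / responses.std(axis=0)
    # Noise variance: within-stimulus variance, averaged over stimuli.
    noise_var = np.mean(
        [z[stimulus_ids == s].var(axis=0, ddof=1) for s in np.unique(stimulus_ids)],
        axis=0,
    )
    # Total variance is 1 after z-scoring, so the remainder is signal.
    signal_var = np.clip(1.0 - noise_var, 0.0, None)
    return np.sqrt(signal_var / noise_var)

def nc_sketch(ncsnr, num_averages=1):
    """Noise ceiling in percent for responses averaged over num_averages trials."""
    return 100.0 * ncsnr**2 / (ncsnr**2 + 1.0 / num_averages)

rng = np.random.default_rng(0)
num_stimuli, num_reps = 75, 6
stimulus_ids = np.repeat(np.arange(num_stimuli), num_reps)
signal = rng.normal(size=num_stimuli)[stimulus_ids]
responses = np.stack([
    signal + 0.1 * rng.normal(size=signal.size),  # voxel 0: reliable response
    rng.normal(size=signal.size),                 # voxel 1: noise only
], axis=1)
nc = nc_sketch(ncsnr_sketch(responses, stimulus_ids))

# Scatter the per-voxel values back into a volume, as in the cell above.
mask = np.zeros((2, 2, 2), dtype=bool)
mask[0, 0, 0] = mask[1, 1, 1] = True
nc_volume = np.zeros(mask.shape, dtype=float)
nc_volume[mask] = nc
print(nc)
```

Boolean-mask assignment fills the selected voxels in the same order that `bold[fmri_mask]` extracted them, which is why the flat `nc` vector maps back to the right locations.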

Visualizing Noise Ceiling Maps:  
This cell defines a function for interactive visualization of the noise ceiling map. The @interact decorator creates a slider for the value “d”, which selects a slice along the z-axis of the noise ceiling volume. The selected slice is displayed with the 'jet' color map on a fixed scale from 0 to 25, so red indicates a high noise ceiling. This makes it possible to examine how the noise ceiling varies across the brain.

In [None]:
import matplotlib.pyplot as plt

D = nc_volume.shape[2]
@interact(d=(0, D-1))
def show(d):
    plt.figure(figsize=(12, 12))
    plt.imshow(nc_volume[:, :, d], cmap='jet', vmin=0., vmax=25)

# Visualize noise ceiling (red is high noise ceiling)
# Left is back of the brain (visual cortex area)

interactive(children=(IntSlider(value=33, description='d', max=66), Output()), _dom_classes=('widget-interact'…

Generating Noise Ceiling Maps for Specific Subjects and TR Offsets:  
This section repeats the noise ceiling computation for each listed subject at several delays after stimulus onset. The values in tr_window are in seconds and are divided by tr to obtain the TR offset passed to load_data. The resulting volumes are stacked along a fourth axis and saved as a single NIfTI file per subject, with a name like “sub-12__noise-ceiling.nii.gz”, in the directory “subject_path” (data_path / 'noise_ceiling' / subject).

In [None]:
subjects = [f'sub-{subject_no}']# 'sub-04',] # ['sub-05','sub-06', 'sub-07']
tr_window = [0, 2, 4, 6, 8] # 4 (6 seconds) most white visible

results = {}
for subject in subjects:
    subject_path = data_path / 'noise_ceiling' / subject
    subject_path.mkdir(exist_ok=True, parents=True)
    out_file_name = f'{subject}__noise-ceiling.nii.gz'

    nc_series = []
    for t_offset in tr_window:
        bold, stimulus_ids, mask, affine = load_data(
            derivatives_path / 'fmriprep' / f'tc2see-v{tc2see_version}-bold-2.hdf5', 
            subject, 
            tr_offset=t_offset / tr,
            run_normalize='linear_trend',
            interpolation=False,
        )
        print(bold.shape)
        print(mask.shape)
        ncsnr = compute_ncsnr(bold, stimulus_ids)
        nc = compute_nc(ncsnr, num_averages=1)
        nc_volume = np.zeros_like(mask, dtype=float)
        nc_volume[mask] = nc
        nc_series.append(nc_volume)
    nc_series = np.stack(nc_series, axis=-1)

    results[out_file_name] = nc_series
    nc_image = nib.Nifti1Image(nc_series, affine)  # avoid shadowing nilearn's image module
    nib.save(nc_image, subject_path / out_file_name)

(450, 10828)
(63, 75, 67)
(450, 10828)
(63, 75, 67)
(450, 10828)
(63, 75, 67)
(450, 10828)
(63, 75, 67)
(450, 10828)
(63, 75, 67)
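
The conversion from seconds to TR offsets and the stacking into a 4D series can be checked without any fMRI data. In this sketch the volumes are placeholders filled with the offset value; only the shapes and the division by `tr` mirror the cell above.

```python
import numpy as np

tr = 2.0                      # TR duration in seconds
tr_window = [0, 2, 4, 6, 8]   # delays after stimulus onset, in seconds
tr_offsets = [seconds / tr for seconds in tr_window]

# Placeholder volumes standing in for the per-offset noise ceiling maps.
shape = (63, 75, 67)
volumes = [np.full(shape, offset) for offset in tr_offsets]

# Stack along a new last axis: one 3D map per offset.
nc_series = np.stack(volumes, axis=-1)
print(tr_offsets)        # [0.0, 1.0, 2.0, 3.0, 4.0]
print(nc_series.shape)   # (63, 75, 67, 5)
```

Viewers that support 4D NIfTI files can then step through the last axis to compare offsets.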


Generating Noise Ceiling Maps for Subject Groups and Run Combinations:  
This part generates the same noise ceiling maps, but restricted to specific combinations of runs. For each subject and each entry in run_id_groups, the data are loaded with run_ids set to that group, so the effect of run selection and TR offset on the noise ceiling can be compared. Each result is saved as a NIfTI file with a name like “sub-12__run_ids_0-3__noise-ceiling.nii.gz” in the directory “subject_path” (data_path / 'noise_ceiling' / subject).

In [None]:
subjects = [f'sub-{subject_no}',]# 'sub-04',]
tr_window = [0, 2, 4, 6, 8]
run_id_groups = [[0, 3], [1, 4], [2, 5], [2], [5]] # 6 runs total
tr = 2

results = {}
for subject in subjects:
    subject_path = data_path / 'noise_ceiling' / subject
    subject_path.mkdir(exist_ok=True, parents=True)
    for run_ids in run_id_groups:
        out_file_name = f'{subject}__run_ids_{"-".join([str(i) for i in run_ids])}__noise-ceiling.nii.gz'

        nc_series = []
        for t_offset in tr_window:
            bold, stimulus_ids, mask, affine = load_data(
                data_path / f'tc2see-v{tc2see_version}-bold-2.hdf5', 
                subject, 
                tr_offset=t_offset / tr,
                run_normalize='linear_trend',
                interpolation=False,
                run_ids=run_ids,
            )
            ncsnr = compute_ncsnr(bold, stimulus_ids)
            nc = compute_nc(ncsnr, num_averages=1)
            nc_volume = np.zeros_like(mask, dtype=float)
            nc_volume[mask] = nc
            nc_series.append(nc_volume)
        nc_series = np.stack(nc_series, axis=-1)

        results[out_file_name] = nc_series
        nc_image = nib.Nifti1Image(nc_series, affine)  # avoid shadowing nilearn's image module
        nib.save(nc_image, subject_path / out_file_name)
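
The output file names used in the last two cells follow one pattern, which can be captured in a single function. `nc_file_name` is a hypothetical helper introduced here for illustration; it only reproduces the f-strings above.

```python
def nc_file_name(subject, run_ids=None):
    """Hypothetical helper reproducing the naming scheme of the cells above."""
    if run_ids is None:
        return f'{subject}__noise-ceiling.nii.gz'
    runs = '-'.join(str(i) for i in run_ids)
    return f'{subject}__run_ids_{runs}__noise-ceiling.nii.gz'

run_id_groups = [[0, 3], [1, 4], [2, 5], [2], [5]]
names = [nc_file_name('sub-31', group) for group in run_id_groups]
for name in names:
    print(name)
```

Keeping the pattern in one place makes it easier to find the files again when loading results in a later notebook.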