# Betaseries extraction

This script combines the computed whole-brain trial beta maps with a brain parcellation and extracts trial beta-series for a predefined set of brain regions. This analysis step has to be run separately for each parcellation. `NiftiSpheresMasker` is used for signal extraction with the following parameters:
- `allow_overlap=False`: raises an error if any spheres in the parcellation overlap
- `standardize=True`: z-scores signal along trials dimension
- `detrend=False`: disable detrending since the data are no longer in the time domain
- `high_pass=None`: disable high-pass filtering
- `low_pass=None`: disable low-pass filtering
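
To make the effect of `standardize=True` concrete, the sketch below z-scores a made-up trials x ROIs array along the trials axis with plain NumPy. The array and its dimensions are hypothetical, and the population standard deviation is an assumption; the exact convention used by the masker may depend on the nilearn version.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical beta-series for one subject and condition: 6 trials x 3 ROIs
betaseries = rng.normal(loc=5.0, scale=2.0, size=(6, 3))

# z-score each ROI column across trials (axis 0), population SD assumed
zscored = (betaseries - betaseries.mean(axis=0)) / betaseries.std(axis=0)

print(zscored.mean(axis=0).round(6))  # approximately 0 for every ROI
print(zscored.std(axis=0).round(6))   # approximately 1 for every ROI
```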

The brain mask can be enabled or disabled with the `use_mask` flag. Note that because of signal dropout in orbitofrontal regions, enabling the mask will likely cause the masker to raise an error (some ROIs fall outside the mask). Therefore `use_mask=False` by default. Spheres outside the brain mask, i.e. those without signal, are removed in the next analysis step, during network construction.
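
One possible way to spot spheres without signal in the aggregated output is sketched below. The criterion (an ROI that is constant or non-finite across trials for any subject and condition) is an assumption made for illustration; the rule actually applied during network construction is not shown in this notebook. The array is synthetic.

```python
import numpy as np

# Hypothetical aggregated array: 2 subjects x 2 conditions x 5 trials x 4 ROIs
rng = np.random.default_rng(1)
betaseries_aggregated = rng.normal(size=(2, 2, 5, 4))

# Simulate a sphere without signal for subject 0 (ROI 2 is constant across trials)
betaseries_aggregated[0, :, :, 2] = 0.0

# Spread of each ROI across trials, per subject and condition
trial_std = betaseries_aggregated.std(axis=2)  # shape (2, 2, 4)

# Flag an ROI if it is constant or non-finite in any subject/condition
no_signal = (trial_std == 0) | ~np.isfinite(trial_std)
flagged_rois = np.flatnonzero(no_signal.any(axis=(0, 1)))
print(flagged_rois)  # [2]
```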

An additional within-condition normalization step can be enabled by setting the `within_condition_normalization` flag to `True`. This step is described in ([Conrad et al., 2020](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7462424/); see section "Within-format Normalization"). The procedure is adapted from the MVPA literature, in which beta values are normalized prior to classifier training.

After extraction, data from all subjects and both task conditions are aggregated into a single `.npy` file for convenience. The output file is stored as:

> `betaseries/<atlas_name>/betaseries_aggregated_<suffix>.npy`

The output array has shape `n_subjects` x `n_conditions` x `n_trials` x `n_rois`. Metadata corresponding to the first three dimensions can be found in the `behavioral_data_clean_all.json` file. The suffix is `norm` if within-condition normalization was performed; otherwise it is omitted and the file is named `betaseries_aggregated.npy`.
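
As a rough illustration of the naming scheme and array layout, the sketch below writes a placeholder array to a temporary directory and reads it back. The dimensions and the `example_atlas` folder name are hypothetical; only the file-name pattern follows the description above.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical dimensions, for illustration only
n_subjects, n_conditions, n_trials, n_rois = 3, 2, 10, 5
within_condition_normalization = True

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "betaseries" / "example_atlas"
    out_dir.mkdir(parents=True)

    # The suffix is appended only when normalization was performed
    suffix = "_norm" if within_condition_normalization else ""
    out_file = out_dir / f"betaseries_aggregated{suffix}.npy"
    np.save(out_file, np.zeros((n_subjects, n_conditions, n_trials, n_rois)))

    loaded = np.load(out_file)
    out_name = out_file.name

print(out_name)            # betaseries_aggregated_norm.npy
print(loaded.shape)        # (3, 2, 10, 5)
# Beta-series of the second subject in the first condition: trials x ROIs
print(loaded[1, 0].shape)  # (10, 5)
```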

In [None]:
import json
from os.path import join
from pathlib import Path

import nibabel as nib
import numpy as np
import pandas as pd
from dn_utils.behavioral_models import load_behavioral_data
from dn_utils.misc import normalize_4d_nifti
from dn_utils.path import path
from nibabel.funcs import concat_images, four_to_three
from nilearn.input_data import NiftiSpheresMasker
from tqdm.notebook import tqdm

### Select brain parcellation & masker options

In [None]:
atlas = "combined_roi"
roi_table_fname = "combined_roi_table.csv"

# Arguments for NiftiSpheresMasker
masker_kwargs = {
    "allow_overlap": False, 
    "standardize": True, 
    "detrend": False, 
    "high_pass": None,
    "low_pass": None
}

# Whether to use individual brain mask during signal extraction
use_mask = False

# Within-condition normalization
within_condition_normalization = False

In [None]:
# Create paths
path_betamaps = join(path["bsc"], "betamaps")
path_betaseries = join(path["bsc"], "betaseries")
Path(path_betaseries).mkdir(exist_ok=True)

# Load ROI data
df_roi = pd.read_csv(join(path["parcellations"], atlas, roi_table_fname))

# Load behavioral data
beh, meta = load_behavioral_data(path["behavioral"], verbose=False)
n_subjects = beh.shape[0]
n_conditions = beh.shape[1]
n_trials = beh.shape[2]

# Load masks
with open(join(path["data_paths"], "mask_filenames.json"), "r") as f:
    mask_files = json.load(f)

# Load betamaps
imgs = {"prlrew": [], "prlpun": []}
for con_idx, con in enumerate(meta["dim2"]):
    for sub_idx, sub in enumerate(meta["dim1"]):
        img_fname = f"sub-{sub}_task-prl{con}_betamaps.nii.gz" 
        imgs[f"prl{con}"].append(nib.load(join(path_betamaps, img_fname)))
        
df_roi.head()

### Within-condition normalization

This optional step normalizes the beta maps within condition, so that differences in mean activity or variance between conditions do not affect connectivity estimates. Beta maps are first separated by condition and concatenated, then each voxel-wise beta-series is normalized by subtracting its mean and dividing by its standard deviation. In the code below, trials are split by outcome (`won_bool`) within each task.

> This step will be performed only if `within_condition_normalization` flag is set to `True`
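
The idea can be sketched on a plain trials x voxels array. This is only an illustration under stated assumptions: the data and the `zscore_trials` helper are hypothetical, the population standard deviation is assumed, and the internals of `normalize_4d_nifti` are not shown in this notebook.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 8 trials x 4 voxels, with a trial outcome flag
betas = rng.normal(size=(8, 4))
won_bool = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
betas[won_bool] += 3.0  # one trial type has a higher mean


def zscore_trials(x):
    """z-score each voxel across trials (population SD; an assumption)."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


# Normalize each trial type separately and write values back in trial order
betas_norm = np.empty_like(betas)
betas_norm[won_bool] = zscore_trials(betas[won_bool])
betas_norm[~won_bool] = zscore_trials(betas[~won_bool])

print(betas_norm[won_bool].mean(axis=0).round(6))   # approximately 0 per voxel
print(betas_norm[~won_bool].mean(axis=0).round(6))  # approximately 0 per voxel
```

After this step the mean difference between the two trial types is removed, while the trial order of the original series is preserved.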

In [None]:
if within_condition_normalization:
    print("Normalizing images...")
    
    imgs_norm = {
        "prlpun": [None for _ in range(n_subjects)], 
        "prlrew": [None for _ in range(n_subjects)]
    }

    for sub_idx, sub in enumerate(tqdm(meta["dim1"])):
        for con_idx, con in enumerate(meta["dim2"]):

            won_bool_idx = meta["dim4"].index("won_bool")
            won_bool = beh[sub_idx, con_idx, :, won_bool_idx].astype(bool)
            won_indices = list(np.where(won_bool)[0])
            los_indices = list(np.where(~won_bool)[0])        

            # Grab 4d image
            con_key = f"prl{con}"
            img = imgs[con_key][sub_idx]

            # Split according to condition
            img_list = four_to_three(img)
            won_img_list = [img_3d for i, img_3d 
                            in enumerate(img_list) if i in won_indices]
            los_img_list = [img_3d for i, img_3d in 
                            enumerate(img_list) if i in los_indices]

            # Within-condition normalization
            img_list_norm = [None for _ in range(n_trials)]
            won_img_list_norm = normalize_4d_nifti(concat_images(won_img_list))
            los_img_list_norm = normalize_4d_nifti(concat_images(los_img_list))

            # Put normalized images back in place & concatenate
            for i, img_3d in zip(won_indices, won_img_list_norm):
                img_list_norm[i] = img_3d
            for i, img_3d in zip(los_indices, los_img_list_norm):
                img_list_norm[i] = img_3d
            img_norm = concat_images(img_list_norm)

            imgs_norm[con_key][sub_idx] = img_norm
else: 
    print("Skipping normalization...")

### Create maskers

Separate maskers are created here for ROIs with different sphere radii. This is required for parcellations with ROIs of unequal size, because a single `NiftiSpheresMasker` applies one `radius` to all of its seeds. For parcellations with uniform ROI size, this step is unnecessary (a single masker suffices).
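
The index bookkeeping behind this grouping can be sketched without nilearn. In the example below, `fake_extract` is a hypothetical stand-in for `masker.fit_transform`, and the radii are made up; the point is only that each radius group writes its result back into the columns of the ROIs it owns.

```python
import numpy as np

# Hypothetical ROI radii in mm (one entry per ROI, in table order)
radii = np.array([4, 5, 4, 5, 5])
n_trials, n_rois = 3, len(radii)


def fake_extract(n_trials, roi_indices):
    """Stand-in for masker.fit_transform: returns the ROI index in every trial."""
    return np.tile(roi_indices.astype(float), (n_trials, 1))


betaseries = np.zeros((n_trials, n_rois))
for radius in np.unique(radii):
    # Columns that belong to ROIs with this radius
    roi_indices = np.flatnonzero(radii == radius)
    betaseries[:, roi_indices] = fake_extract(n_trials, roi_indices)

print(betaseries[0])  # [0. 1. 2. 3. 4.]
```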

In [None]:
n_rois = len(df_roi)
seeds = list(df_roi[["x", "y", "z"]].itertuples(index=False, name=None))

maskers = {}

for radius in df_roi["radius(mm)"].unique():
    # Select ROIs with given radius
    roi_indices = np.flatnonzero(df_roi["radius(mm)"] == radius)
    
    # Create masker for single radius value
    masker = NiftiSpheresMasker(
        [seeds[idx] for idx in roi_indices], 
        radius=radius,                
        mask_img=None,
        **masker_kwargs
    )
    
    maskers[radius] = {"masker": masker, "indices": roi_indices}

In [None]:
betaseries_aggregated = np.zeros((n_subjects, n_conditions, n_trials, n_rois))

for sub_idx, sub in enumerate(tqdm(meta["dim1"])):
    for con_idx, con in enumerate(meta["dim2"]):

        # Load the individual brain mask only when it is used
        mask_img = nib.load(mask_files[f"prl{con}"][sub_idx]) if use_mask else None
        
        if within_condition_normalization:
            beta_img = imgs_norm[f"prl{con}"][sub_idx]
        else:
            beta_img = imgs[f"prl{con}"][sub_idx]

        betaseries = np.zeros((n_trials, n_rois))

        for radius in maskers:
            # Get indices to insert extracted beta-series into the right columns
            roi_indices = maskers[radius]["indices"]

            # Get the spheres masker for this radius
            masker = maskers[radius]["masker"]
            
            # Attach the individual brain mask if requested
            if use_mask:
                masker.mask_img = mask_img

            # Extract beta-series (trials x ROIs for this radius)
            betaseries[:, roi_indices] = masker.fit_transform(beta_img)
                
        betaseries_aggregated[sub_idx, con_idx] = betaseries
        
# Store betaseries
suffix = "_norm" if within_condition_normalization else ""
Path(join(path_betaseries, atlas)).mkdir(exist_ok=True, parents=True)
np.save(join(path_betaseries, atlas, f"betaseries_aggregated{suffix}.npy"), 
        betaseries_aggregated)