# THINGS-fMRI usage notes (Modified Version)

### THINGS-fMRI1 beta value extraction notebook

Parts of the code are adapted from the THINGS-data repository: [link](https://github.com/ViCCo-Group/THINGS-data/blob/main/MRI/notebooks/fmri_usage.ipynb)

For a detailed description of the data and the procedures that generated it, see [the THINGS-data preprint](https://doi.org/10.1101/2022.07.22.501123).

In [2]:
from os.path import join as pjoin
import glob
import numpy as np
import pandas as pd
# from nilearn.masking import apply_mask, unmask
# from nilearn.plotting import plot_epi, plot_stat_map
# from nilearn.image import load_img, index_img, iter_img
# import matplotlib.pyplot as plt
# import cortex
import os
import shutil

In [3]:
# Assumes you've downloaded the THINGS-fMRI data to this directory
basedir = '/mnt/c/Users/Wayne/Desktop/'

## Single trial responses

The single trial responses are arguably the easiest way to analyze the THINGS-fMRI data. They contain the magnitude of the fMRI response to each stimulus in each voxel as a single number. The single trial responses are provided in two formats: a) table format and b) volumetric format.

### Table format

Besides the fMRI response data, the table format contains metadata about each voxel (such as noise ceilings, pRF parameters, regions of interest) and about the stimulus (such as image file name, trial type, run and session). 

In [4]:
# Assumes you downloaded the single trial responses in table format to this directory 
betas_csv_dir = pjoin(basedir, 'betas_csv')

# and that you're interested in the data for the first subject
sub = '01'

The `sub-{subject}_ResponseData.h5` files contain the actual single trial responses. Rows are voxels, columns are trials.

In [5]:
data_file = pjoin(betas_csv_dir, f'sub-{sub}_ResponseData.h5')
responses = pd.read_hdf(data_file)  # this may take a minute
print('Single trial response data')
responses.head()


Single trial response data


| | voxel_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ... | 9830 | 9831 | 9832 | 9833 | 9834 | 9835 | 9836 | 9837 | 9838 | 9839 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | -0.089022 | 0.041923 | 0.16913 | -0.075151 | 0.015963 | -0.010098 | -0.012468 | 0.084902 | 0.091878 | ... | 0.024178 | -0.029775 | 0.099759 | 0.006796 | -0.044169 | 0.018878 | -0.057514 | 0.055157 | -0.036201 | -0.068705 |
| 1 | 1 | -0.062508 | -0.037973 | -0.009769 | 0.082478 | 0.056631 | -0.015929 | -0.027017 | 0.054912 | 0.017544 | ... | -0.071025 | 0.035286 | 0.066276 | -0.09253 | -0.074624 | 0.006528 | 0.005477 | 0.002302 | 0.074685 | -0.096532 |
| 2 | 2 | -0.070807 | -0.019326 | -0.019546 | -0.060038 | -0.024878 | 0.05275 | 0.163108 | 0.037861 | -0.073247 | ... | 0.045679 | 0.075059 | -0.020386 | -0.034966 | -0.027783 | 0.011068 | -0.025219 | 0.01235 | 0.029529 | 0.006157 |
| 3 | 3 | 0.006218 | 0.016355 | -0.075845 | -0.109495 | -0.007062 | 0.144785 | 0.086463 | -0.047257 | 0.011348 | ... | -0.050225 | 0.016627 | 0.083943 | -0.038645 | -0.014257 | 0.050435 | 0.032841 | -0.036794 | -0.000256 | 0.033482 |
| 4 | 4 | -0.014344 | -0.029792 | 0.136358 | -0.118176 | 0.007145 | 0.036102 | 0.036816 | 0.015313 | 0.035015 | ... | -0.104036 | -0.020143 | 0.063932 | -0.0809 | 0.010575 | -0.015148 | -0.085487 | 0.11867 | 0.073392 | -0.014972 |
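
Because the table mixes a `voxel_id` column with one column per trial, it can help to separate the two before doing numerical work. The following is a minimal sketch on a small synthetic stand-in for `responses`; the toy values and the names `toy_responses`, `voxel_ids`, and `betas` are illustrative assumptions, not part of the dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `responses`: 3 voxels x 4 trials plus a voxel_id column.
# The real table has one row per voxel and one column per trial.
rng = np.random.default_rng(0)
toy_responses = pd.DataFrame(rng.normal(size=(3, 4)))
toy_responses.insert(0, "voxel_id", [0, 1, 2])

# Separate the identifier column from the trial columns.
voxel_ids = toy_responses["voxel_id"].to_numpy()
betas = toy_responses.drop(columns="voxel_id").to_numpy()

print(betas.shape)  # (n_voxels, n_trials)
```

With the identifier column removed, positional indexing into the columns of `betas` lines up with trial order, assuming the trial columns are stored in order of `trial_id`.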


The `sub-{subject}_VoxelMetadata.csv` files contain additional information about each voxel, such as membership to ROIs, reliability measures, and noise ceilings.

In [6]:
vox_f = pjoin(betas_csv_dir, f'sub-{sub}_VoxelMetadata.csv')
voxdata = pd.read_csv(vox_f)
voxdata.head()

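| | voxel_id | subject_id | voxel_x | voxel_y | voxel_z | nc_singletrial | nc_testset | splithalf_uncorrected | splithalf_corrected | prf-eccentricity | ... | lSTS | rSTS | lPPA | rPPA | lRSC | rRSC | lTOS | rTOS | lLOC | rLOC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 39 | 33 | 0.0 | 0.0 | -0.03053 | -0.062982 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 39 | 34 | 1.102513 | 11.799193 | 0.070108 | 0.131029 | 4.941197 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | 1 | 1 | 39 | 35 | 2.134454 | 20.743164 | 0.121099 | 0.216036 | 11.064742 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 3 | 1 | 1 | 39 | 36 | 0.0 | 0.0 | -0.040901 | -0.08529 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 4 | 1 | 1 | 40 | 33 | 0.446367 | 5.105714 | 0.034924 | 0.067491 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |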


### Get Voxel IDs from the table

In [7]:
# get voxel_id of V1
def get_ROI_voxel_id(voxdata, region):
    return np.array(voxdata[voxdata[region] == 1]["voxel_id"].values)
    
v1_voxel_id = get_ROI_voxel_id(voxdata, "V1")
v2_voxel_id = get_ROI_voxel_id(voxdata, "V2")
hv4_voxel_id = get_ROI_voxel_id(voxdata, "hV4")
it_l_voxel_id = get_ROI_voxel_id(voxdata, "lLOC")
it_r_voxel_id = get_ROI_voxel_id(voxdata, "rLOC")

it_voxel_id = np.concatenate((it_l_voxel_id, it_r_voxel_id))

### Get voxel beta values from the h5 file

In [8]:
responses_v1 = responses[responses["voxel_id"].isin(v1_voxel_id)]
responses_v2 = responses[responses["voxel_id"].isin(v2_voxel_id)]
responses_hv4 = responses[responses["voxel_id"].isin(hv4_voxel_id)]
responses_it = responses[responses["voxel_id"].isin(it_voxel_id)]
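
The two steps above (look up voxel ids for an ROI, then filter the response rows) can also be combined for ROIs that span several columns, such as left and right LOC. This is a hedged sketch on made-up data; `toy_voxdata`, `toy_responses`, and the voxel ids are assumptions for illustration only.

```python
import pandas as pd

# Toy voxel metadata with dummy-coded ROI columns (1 = voxel is in the ROI).
toy_voxdata = pd.DataFrame({
    "voxel_id": [10, 11, 12, 13],
    "lLOC": [1, 0, 0, 0],
    "rLOC": [0, 0, 1, 0],
})
# Toy responses: one row per voxel, one column per trial.
toy_responses = pd.DataFrame({
    "voxel_id": [10, 11, 12, 13],
    0: [0.1, 0.2, 0.3, 0.4],
    1: [0.5, 0.6, 0.7, 0.8],
})

# A voxel belongs to the combined ROI if any of the listed columns equals 1.
in_roi = toy_voxdata[["lLOC", "rLOC"]].eq(1).any(axis=1)
roi_ids = toy_voxdata.loc[in_roi, "voxel_id"].to_numpy()

# Keep only the response rows whose voxel_id is in the ROI.
toy_roi_responses = toy_responses[toy_responses["voxel_id"].isin(roi_ids)]
```

Using `any(axis=1)` avoids concatenating per-hemisphere id arrays and would not duplicate a voxel that happened to be labelled in both columns.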



In [9]:
print('available voxel metadata:\n', voxdata.columns.to_list())

available voxel metadata:
 ['voxel_id', 'subject_id', 'voxel_x', 'voxel_y', 'voxel_z', 'nc_singletrial', 'nc_testset', 'splithalf_uncorrected', 'splithalf_corrected', 'prf-eccentricity', 'prf-polarangle', 'prf-rsquared', 'prf-size', 'V1', 'V2', 'V3', 'hV4', 'VO1', 'VO2', 'LO1 (prf)', 'LO2 (prf)', 'TO1', 'TO2', 'V3b', 'V3a', 'lEBA', 'rEBA', 'lFFA', 'rFFA', 'lOFA', 'rOFA', 'lSTS', 'rSTS', 'lPPA', 'rPPA', 'lRSC', 'rRSC', 'lTOS', 'rTOS', 'lLOC', 'rLOC']


The voxel indices can be used to reconstruct a volume, e.g. for visualizing results obtained from the single trial responses. Alternatively, the brain mask can be used for that purpose (see below). Membership of each voxel to the available ROIs is dummy coded, e.g. in `voxdata["V1"]` or `voxdata["rFFA"]`. The population receptive field parameters are encoded in the following columns: `prf-eccentricity`, `prf-polarangle`, `prf-size`, and `prf-rsquared`. Finally, different reliability estimates are available in the columns: `nc_testset`, `nc_singletrial`, `splithalf_uncorrected`, and `splithalf_corrected`.
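
As a rough illustration of reconstructing a volume from the voxel indices, the sketch below writes one value per voxel into a 3D array. It assumes that `voxel_x`, `voxel_y`, and `voxel_z` are zero-based array indices and, for lack of a real brain mask, that the volume is just large enough to hold the listed voxels; `toy_voxdata` and its values are made up.

```python
import numpy as np
import pandas as pd

# Toy voxel metadata with the same coordinate columns as VoxelMetadata.csv.
toy_voxdata = pd.DataFrame({
    "voxel_id": [0, 1, 2],
    "voxel_x": [0, 1, 2],
    "voxel_y": [1, 1, 0],
    "voxel_z": [0, 2, 1],
    "nc_testset": [0.0, 11.8, 20.7],
})

# Assumption: the volume is just large enough to hold every listed voxel.
# For real data, take the shape from the brain mask or a beta image instead.
coords = toy_voxdata[["voxel_x", "voxel_y", "voxel_z"]].to_numpy()
shape = tuple(coords.max(axis=0) + 1)

# Fill the volume with one value per voxel; voxels not in the table stay NaN.
volume = np.full(shape, np.nan)
volume[coords[:, 0], coords[:, 1], coords[:, 2]] = toy_voxdata["nc_testset"].to_numpy()
```

The same pattern applies to any per-voxel quantity, for example an ROI column or a statistic computed from the single trial responses.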

The `sub-{subject}_StimulusMetadata.csv` files contain the file name of the image shown in each trial, the run and session in which the trial occurred, and the `trial_type`.

In [10]:
# Stimulus metadata
stim_f = pjoin(betas_csv_dir, f'sub-{sub}_StimulusMetadata.csv')
stimdata = pd.read_csv(stim_f)
stimdata.head()

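| | trial_type | session | run | subject_id | trial_id | stimulus |
|---|---|---|---|---|---|---|
| 0 | train | 1 | 1 | 1 | 0 | dog_12s.jpg |
| 1 | train | 1 | 1 | 1 | 1 | mango_12s.jpg |
| 2 | train | 1 | 1 | 1 | 2 | spatula_12s.jpg |
| 3 | test | 1 | 1 | 1 | 3 | candelabra_14s.jpg |
| 4 | train | 1 | 1 | 1 | 4 | panda_12s.jpg |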


> 🚨 **Trial types**
>
> The THINGS-fMRI experiment presented participants with three different trial types:
> - `train`: Participants passively viewed an object image.
> - `test`: Same as `train`, but these trials belonged to a set of 200 images which were presented in each session. Their main purpose is to allow for estimating the reliability of the single trial responses in a given voxel.
> - `catch`: Participants saw a non-object image and responded with a button press. This was included to ensure participants were engaged throughout the experiment.
>
> Note: Catch trials are excluded from the single trial responses in table format as they are likely not of interest for most applications. However, catch trials are included in the volumetric format in order to make it possible to account for them in analyses.
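
To make the distinction between trial types concrete, here is a small sketch that splits trial ids by `trial_type` and groups repeated test images. The table `toy_stimdata` is a made-up stand-in that only mimics the column names of the stimulus metadata.

```python
import pandas as pd

# Toy stand-in for the stimulus metadata (values are made up for illustration).
toy_stimdata = pd.DataFrame({
    "trial_type": ["train", "test", "train", "test", "test"],
    "session": [1, 1, 1, 2, 2],
    "trial_id": [0, 1, 2, 3, 4],
    "stimulus": ["dog_12s.jpg", "candelabra_14s.jpg", "mango_12s.jpg",
                 "candelabra_14s.jpg", "brace_14s.jpg"],
})

# Boolean mask for test trials; the toy table contains no catch trials.
is_test = toy_stimdata["trial_type"] == "test"
train_ids = toy_stimdata.loc[~is_test, "trial_id"].to_numpy()
test_ids = toy_stimdata.loc[is_test, "trial_id"].to_numpy()

# Test images repeat across sessions, so group their trial ids by file name.
test_repeats = toy_stimdata[is_test].groupby("stimulus")["trial_id"].apply(list)
```

Grouping test trials by file name gives, for each repeated image, the trial ids needed to compare responses across sessions.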

### Get picture trials

In [11]:
# drop the last 8 characters (e.g. "_12s.jpg") to get the category name
def get_stimulus_name(stimdata):
    return np.array([stim[:-8] for stim in stimdata["stimulus"].values])
categories = get_stimulus_name(stimdata)
# concatenate the categories into stimdata as a new column
stimdata["categories"] = categories
print(stimdata)
# get unique categories from stimdata["categories"]
categories = stimdata["categories"].unique()
# get all the trial_ids that have the same category
def get_stimulus_id(stimdata, category):
    return np.array(stimdata[stimdata["categories"] == category]["trial_id"].values)

# get all the trial_id that has the same category and put them into a dictionary
stimulus_id_dict = {}
for category in categories:
    stimulus_id_dict[category] = get_stimulus_id(stimdata, category)


     trial_type  session  run  subject_id  trial_id            stimulus  \
0         train        1    1           1         0         dog_12s.jpg   
1         train        1    1           1         1       mango_12s.jpg   
2         train        1    1           1         2     spatula_12s.jpg   
3          test        1    1           1         3  candelabra_14s.jpg   
4         train        1    1           1         4       panda_12s.jpg   
...         ...      ...  ...         ...       ...                 ...   
9835      train       12   10           1      9835       sword_09s.jpg   
9836      train       12   10           1      9836      toilet_09s.jpg   
9837       test       12   10           1      9837     earring_16n.jpg   
9838       test       12   10           1      9838       brace_14s.jpg   
9839      train       12   10           1      9839   cockroach_09s.jpg   

      categories  
0            dog  
1          mango  
2        spatula  
3     candelabra  
4   
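
The `[:-8]` slice used above relies on every file name ending in an 8-character suffix such as `_12s.jpg`. As a sketch of an alternative that does not depend on the suffix length, one could split on the last underscore and build the category-to-trial mapping with `groupby`. The file names and the names `toy_stimdata` and `toy_id_dict` below are made up for illustration.

```python
import pandas as pd

# Toy stimulus table; file names are invented but follow the same pattern.
toy_stimdata = pd.DataFrame({
    "trial_id": [0, 1, 2, 3],
    "stimulus": ["dog_12s.jpg", "chicken_wire_01b.jpg",
                 "dog_03s.jpg", "earring_16n.jpg"],
})

# Everything before the last underscore is treated as the category name,
# so multi-word categories such as "chicken_wire" stay intact.
toy_stimdata["categories"] = toy_stimdata["stimulus"].str.rsplit("_", n=1).str[0]

# Map each category to the array of trial ids in which it was shown.
# sort=False keeps categories in order of first appearance.
toy_id_dict = {
    category: group["trial_id"].to_numpy()
    for category, group in toy_stimdata.groupby("categories", sort=False)
}
```

This assumes the image identifier after the last underscore never itself contains an underscore.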

### Average over trials and voxels

In the original HDF5 data file, rows are voxels and columns are trials.

In [12]:

ROI = ["V1", "V2", "hV4", "IT"]

### use the response data to get the mean response for each category
def get_mean_betas(responses, stimulus_id_dict, categories):
    # drop the voxel_id column so it is not averaged along with the betas;
    # the remaining columns are assumed to be ordered by trial_id
    betas = responses.drop(columns="voxel_id").to_numpy()
    # average over voxels (rows), leaving one value per trial
    responses_average = betas.mean(axis=0)
    # one mean beta per unique category
    betas_categories = np.zeros((len(categories), 1))
    for i, category in enumerate(categories):
        # trial_ids for this category, used as positional indices
        stimulus_id = stimulus_id_dict[category]
        betas_categories[i] = responses_average[stimulus_id].mean()
    return betas_categories

v1_mean_betas = get_mean_betas(responses_v1, stimulus_id_dict, categories)
v2_mean_betas = get_mean_betas(responses_v2, stimulus_id_dict, categories)
hv4_mean_betas = get_mean_betas(responses_hv4, stimulus_id_dict, categories)
it_mean_betas = get_mean_betas(responses_it, stimulus_id_dict, categories)

# concatenate the mean betas for each ROI into a pandas dataframe, row is ROI, column is category
final_mean_betas = pd.DataFrame(np.concatenate((v1_mean_betas, v2_mean_betas, hv4_mean_betas, it_mean_betas), axis=1).T, index=ROI, columns=categories)
print(final_mean_betas)

          dog     mango   spatula  candelabra     panda  pacifier  crayfish  \
V1   0.008226  0.017480 -0.000349    0.016366  0.003010  0.011333  0.002690   
V2   0.012045  0.020776  0.009508    0.020023  0.006212  0.017008  0.005667   
hV4  0.022013  0.025496  0.015969    0.027363  0.013007  0.031143  0.020203   
IT   0.013272  0.016870  0.013224    0.015172  0.009315  0.026026  0.008108   

        shirt  chicken_wire       cow  ...   lettuce    diaper   granite  \
V1   0.004290      0.016990  0.008359  ...  0.021588  0.015891  0.022580   
V2   0.008726      0.023003  0.011581  ...  0.022974  0.018392  0.021601   
hV4  0.015112      0.024035  0.022274  ...  0.027887  0.030999  0.019290   
IT   0.004882      0.010760  0.009132  ...  0.006865  0.026962  0.004102   

      pom-pom  windowsill   tadpole      leek     sheep    coffin  \
V1   0.014479    0.009649  0.008141  0.007890  0.007142  0.005645   
V2   0.023019    0.015574  0.013099  0.018520  0.007567  0.011964   
hV4  0.027569   
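
The two averaging steps can be checked by hand on a tiny example. The sketch below uses a made-up 2 x 4 beta matrix and a made-up category mapping (`toy_betas`, `toy_id_dict`) to show what each step produces.

```python
import numpy as np

# Toy betas: 2 voxels (rows) x 4 trials (columns).
toy_betas = np.array([[1.0, 2.0, 3.0, 4.0],
                      [3.0, 4.0, 5.0, 6.0]])
# Toy mapping from category name to the trial ids in which it was shown.
toy_id_dict = {"dog": np.array([0, 2]), "mango": np.array([1, 3])}

# Step 1: average over voxels, leaving one value per trial.
per_trial = toy_betas.mean(axis=0)

# Step 2: average the trials that belong to each category.
toy_means = {cat: per_trial[ids].mean() for cat, ids in toy_id_dict.items()}
```

Here `per_trial` is `[2, 3, 4, 5]`, so the category means are 3 for `dog` and 4 for `mango`. Averaging over voxels first discards spatial patterns within the ROI, which is fine for a per-ROI summary but not for pattern analyses.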

### Locate the images

In [14]:
pic_path = "/mnt/c/Users/Wayne/Desktop/THINGS_pics/THINGS/Images/all"

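# If a folder under pic_path matches a category name, copy it to
# FCAnet_stimulus/sub-{sub} on the Desktop; otherwise report the category.
dst_root = f"/mnt/c/Users/Wayne/Desktop/FCAnet_stimulus/sub-{sub}"
for category in categories:
    src = os.path.join(pic_path, category)
    if os.path.exists(src):
        # dirs_exist_ok=True (Python 3.8+) allows re-running this cell
        shutil.copytree(src, os.path.join(dst_root, category), dirs_exist_ok=True)
    else:
        print(f"no image folder found for category: {category}")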