[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hendersonneurolab/CogAI_Fall2025/blob/master/Lab08_Model_Comparisons.ipynb)

## Week 8: Compare different fMRI encoding models.

This week we will continue working with fMRI encoding models, building on what we covered in Lab 7. We will use data from the Natural Scenes Dataset (NSD), and features from the AlexNet DNN model at various layers. We will explore how to compare performance of different models using variance partitioning, and how to plot data on a cortical surface map using the software library PyCortex.

**Learning Objectives**
- Understand the basic procedure involved in variance partitioning, and how to compute the unique variance and shared variance between models.
- Know how to interpret results when plotted on a flattened cortical surface.

NOTE: before you start, make sure your runtime is set to T4: GPU. (Use menu in the top right, "change runtime type").

###Step 1: Installation and importing

First, we need to install PyCortex and some dependencies.
These are not included with Colab by default, and they are needed to make the brain map plots. This part should take about 2 minutes; let the instructor know if you encounter any issues.

In [None]:
import time
st = time.time()
# First, install some required dependencies
# !pip install -U setuptools wheel numpy cython
# Install the latest release of pycortex from pip
!pip install -U pycortex
print('elapsed = %.2f sec'%(time.time() - st))

In [None]:
# This installs inkscape, it will be needed for the flatmap plots.
st = time.time()
# Install Inkscape
!apt-get update
!apt-get install -y inkscape
# Verify installation
!inkscape --version
print('elapsed = %.2f sec'%(time.time() - st))

Now proceed with imports of other packages.

In [None]:
import numpy as np
import urllib.request
from io import BytesIO
import matplotlib.pyplot as plt
import pandas as pd
import scipy
import os, sys
import h5py
import time
import torch
import zipfile
import copy
import warnings
import shutil
warnings.filterwarnings('ignore')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

###Step 2: Download and load the data.

Like last week, we're using NSD data and features that are computed from AlexNet.


First, mount the Google Drive storage.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
# Navigate to the Colab Notebooks folder
colab_notebooks_path = '/content/drive/MyDrive/Colab Notebooks/'
os.chdir(colab_notebooks_path)
os.makedirs('CogAI', exist_ok=True)
os.makedirs('CogAI/data', exist_ok=True)
data_folder = os.path.join(colab_notebooks_path, 'CogAI', 'data')
print(data_folder)

Download the "PyCortex database" files (pycortex_db). These are files that PyCortex needs in order to make plots of data on an individual subject's cortical surface. They include information about the subject's cortical surface anatomy, and how to map from volume space to surface space.

You can learn more about PyCortex here: https://gallantlab.org/pycortex/install.html

In [None]:
# Info about pycortex database (need this to make the brain map plots)
dbox_link = 'https://www.dropbox.com/scl/fi/9737r06i53b0x672qnns9/pycortex_subj01.zip?rlkey=61vjlcv2p5dp0o05ff1bd0lnj&st=knf4uh33&dl=1'
filename = os.path.join(data_folder, 'S1_pycortex.zip')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

# unzip this file
st = time.time()
cortex_data_path = os.path.join(data_folder, 'pycortex_db')
folder_unzipped = cortex_data_path
with zipfile.ZipFile(filename, 'r') as zip_ref:
    zip_ref.extractall(folder_unzipped)
print('elapsed = %.2f seconds'%(time.time() - st))

In [None]:
# download a pycortex configuration file
dbox_link = 'https://www.dropbox.com/scl/fi/3ircd1y0s5jznpiyk2i1d/pycortex_config.cfg?rlkey=mz0gqvwlxoekdz3nxp1xowmmo&st=hv9fdozs&dl=1'
filename = os.path.join(data_folder, 'pycortex_config.cfg')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

# We're replacing the default pycortex config file with this version, which sets our subject database folder.
# When pycortex is imported, it should use this file.
config_file = os.path.join(data_folder, 'pycortex_config.cfg')
default_config_path = '/root/.config/pycortex/options.cfg'
os.makedirs(os.path.dirname(default_config_path), exist_ok=True)
shutil.copyfile(config_file, default_config_path)

Now import PyCortex:

In [None]:
import cortex
cortex.db.filestore, cortex.db.subjects

Download data files for this exercise. Note that several of these files were used in last week's lab too (Lab07), so you might already have them in your Drive storage.
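
The download cells in this section all repeat the same "check whether the file exists, otherwise fetch it" pattern. As a minimal sketch, that pattern could be wrapped in a small helper. The name `download_if_missing` is hypothetical and is not part of the original notebook; it only uses standard-library calls.

```python
import os
import time
import urllib.request


def download_if_missing(url, filename):
    """Download url to filename unless the file already exists.

    Hypothetical helper (not part of the original notebook).
    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(filename):
        print('We already have: %s' % filename)
        return False
    st = time.time()
    print('downloading to %s...' % filename)
    urllib.request.urlretrieve(url, filename)
    print('elapsed = %.2f seconds' % (time.time() - st))
    return True
```

With such a helper, each download cell would reduce to one call, for example `download_if_missing(dbox_link, os.path.join(data_folder, 'S1_image_info.csv'))`.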

In [None]:
# Info about ROIs
dbox_link = 'https://www.dropbox.com/scl/fi/kilrzj841mrpm17aj9gid/S1_voxel_roi_info.npy?rlkey=jgt1zje70ta8qpmaib8kmjskp&st=n9b3bxft&dl=1'
filename = os.path.join(data_folder, 'S1_voxel_roi_info.npy')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

In [None]:
# Info about images
dbox_link = 'https://www.dropbox.com/scl/fi/caabn3on6l9q8w32uxq8z/S1_image_info.csv?rlkey=cwb4mfruzcyrozrmdrfgerpp2&st=fg9i8ml3&dl=1'
filename = os.path.join(data_folder, 'S1_image_info.csv')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

fMRI data: this one can take a while to download, but shouldn't take more than 2 minutes. If it's taking too long, check with the professor.

In [None]:
# fMRI data
dbox_link = 'https://www.dropbox.com/scl/fi/e040hit5avrptp4hbnijy/S1_betas_avg_bigmask.hdf5?rlkey=s8bpoara1fln1dhf1ydjpuldn&st=323ct1jm&dl=1'
filename = os.path.join(data_folder, 'S1_betas_avg_bigmask.hdf5')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

Download DNN features:

In [None]:
dbox_links = ['https://www.dropbox.com/scl/fi/ck4al0qbe0tuljrky48uh/NSD_S1_ims224pix_Conv1_ReLU_pca100.hdf5?rlkey=6ppjrfkaii7z7k5hiz3zutzqf&st=fobms0tb&dl=1', \
              'https://www.dropbox.com/scl/fi/ilr537zl5wzkv9hacostg/NSD_S1_ims224pix_Conv2_ReLU_pca100.hdf5?rlkey=hsgr41ppossjsl1cqxp97eee3&st=mym3o32m&dl=1', \
              'https://www.dropbox.com/scl/fi/s6sdl8gsx1nzkykjdl6a8/NSD_S1_ims224pix_Conv3_ReLU_pca100.hdf5?rlkey=3xpos62w09s3wrkc87w18yadr&st=eu01sdzv&dl=1', \
              'https://www.dropbox.com/scl/fi/rwvz3m7hij6dptmw58nfg/NSD_S1_ims224pix_Conv4_ReLU_pca100.hdf5?rlkey=rie88fpywjn2ncpw9sridme19&st=60jf3zep&dl=1', \
              'https://www.dropbox.com/scl/fi/a8ksxnq380ycop4cc0cmh/NSD_S1_ims224pix_Conv5_ReLU_pca100.hdf5?rlkey=pc4k8mfaafh7t9dnwygpsnt3n&st=wgju9hox&dl=1']
layers = ['Conv1', 'Conv2', 'Conv3', 'Conv4', 'Conv5']

for link, layer in zip(dbox_links, layers):

  filename = os.path.join(data_folder, 'S1_%s_pca100.hdf5'%layer)
  # filename = os.path.join(data_folder, 'S1_%s.hdf5'%layer)
  if not os.path.exists(filename):
    st = time.time()
    print('downloading to %s...'%filename)
    urllib.request.urlretrieve(link, filename)
    print('elapsed = %.2f seconds'%(time.time() - st))
  else:
    print('We already have: %s'%filename)

Now that the files are downloaded, we need to load them into Python.


First, load the **image information**. This is a .csv file that contains info about all the images shown.

In [None]:
info_fn = os.path.join(data_folder, 'S1_image_info.csv')
print(info_fn)
info = pd.read_csv(info_fn)
# this df has [10,000] elements. Each element is 1 unique image.
# it contains info about the images and where they came from (within the MS COCO dataset).
# the rows of this correspond exactly to the features files (which will be loaded below).

image_order = np.array(info['unique_ims'])
n_reps = np.array(info['n_reps'])


Next, load the **fMRI data**.

The data (betas_avg_bigmask) is organized as: [images x voxels]

Each image was shown multiple times, and these values capture average response to each image.

Keep in mind this is already several steps of preprocessing removed from the raw data. Steps like motion correction have already been performed to improve signal quality.

Single-trial beta weights were already extracted using a GLM analysis (similar to what we did last week).

Data have also been z-scored, within each session, and the beta weights for repetitions of each image have been averaged.

Note also that the voxels in this matrix are not the entire brain. They represent a wide portion of visual cortex, including all voxels with reliable signal.
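
To make the last two preprocessing steps concrete, here is a toy sketch of z-scoring within each session and then averaging over repetitions of each image. All data, sizes, and variable names here are synthetic and chosen for illustration; this is not the actual NSD preprocessing code, which may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-trial data: 12 trials x 3 voxels (all values are synthetic).
n_trials, n_voxels = 12, 3
betas = rng.normal(loc=5.0, scale=2.0, size=(n_trials, n_voxels))
session = np.repeat([0, 1], 6)  # session label for each trial
image_id = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2])  # image shown on each trial

# 1) z-score each voxel within each session.
betas_z = np.zeros_like(betas)
for s in np.unique(session):
    rows = session == s
    m = betas[rows].mean(axis=0, keepdims=True)
    sd = betas[rows].std(axis=0, keepdims=True)
    betas_z[rows] = (betas[rows] - m) / sd

# 2) average the z-scored betas over repetitions of each image.
unique_ims = np.unique(image_id)
betas_avg = np.stack([betas_z[image_id == im].mean(axis=0) for im in unique_ims], axis=0)

print(betas_avg.shape)  # [n_images x n_voxels]
```

The result has one row per unique image, which is the same [images x voxels] layout as the matrix loaded in the next cell.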
   


In [None]:
data_filename = os.path.join(data_folder, 'S1_betas_avg_bigmask.hdf5')
print(data_filename)

t = time.time()
with h5py.File(data_filename, 'r') as data_set:
    # the "with" block closes the file automatically, no explicit close() needed
    values = np.copy(data_set['/betas'])
elapsed = time.time() - t
print('Took %.5f seconds to load file'%elapsed)

# Some of these values may be nans, only for some subjects
# this is for subjects who didn't complete all 40 sessions of NSD experiment.
# make sure we remove the nans now.
# for subject 1: we should have all the data, no nans.
good_values = ~np.isnan(values[:,0])
print(values.shape)
print(np.sum(~good_values))

voxel_data = values[good_values,:]
print(voxel_data.shape)

# check that nans are exactly where we expect
# nans happen when n_reps=0
assert(np.all(good_values[n_reps>0]))
assert(np.all(~good_values[n_reps==0]))

Load info about the **ROIs (regions of interest)** in this dataset. Conveniently, the labels for these regions are already provided for us.

In [None]:
# ROI = region of interest. These are visual areas we want to focus on for analysis.
fn = os.path.join(data_folder, 'S1_voxel_roi_info.npy')
print(fn)
rinfo = np.load(fn, allow_pickle=True).item()
# this is a dictionary that contains information about which voxels our data will include.
# voxel_mask is the whole set of voxels we're focusing on. basically all of visual cortex.
voxel_mask = rinfo['voxel_mask']

# noise ceiling: this is already computed, this tells us the maximum explainable variance in the data.
# like a "ceiling" for encoding model performance.
noise_ceiling = rinfo['noise_ceiling_avgreps'] / 100



Load the **DNN features (activations)**, which will be used to construct the encoding model.

These are features from AlexNet, a large CNN model pre-trained on ImageNet. We pre-computed these features ahead of time, from several different layers of the model.

These features are organized as [n_images x n_features], where the images are in the same order as the fMRI data. So, each row in the features matrices corresponds to one row in the fMRI data matrix.

In [None]:
fdata = dict([])
for layer in layers:
  filename = os.path.join(data_folder, 'S1_%s_pca100.hdf5'%layer)
  print(filename)
  with h5py.File(filename, 'r') as f:
      # Explore the file structure
      print("Keys in file:", list(f.keys()))

      # # Load your data (adjust based on your file structure)
      data = np.array(f['features'])
      fdata[layer] = data

[f.shape for f in fdata.values()]

###Step 3: Set up the functions needed for ridge regression.

This part is similar to what we did in the last lab.

During ridge regression, we typically split our data into 3 independent sets of images:

- **Training data:** used to fit the model weights
- **Holdout data** (nested validation): used to choose best pRF and ridge parameters
- **Validation data**: held out until the very end, used to compute the validation set $R^2$.


Split the data into training, holdout, and validation sets.

In [None]:
# fixed random seed, to make sure shuffling is repeatable
rndseeds = [171301, 42102, 490304, 521005, 11407, 501610, 552211, 450013, 824387]
subject = 1
si = subject-1 # remember python is zero-indexed. but subjects are one-indexed.

# Always holding out 1000 "shared images", which were seen by all NSD participants, as
# the validation set.
val_inds = np.array(info['shared1000'])

# Then take a random 10% of the remaining data, as the nested "holdout" set.
# Holdout set is used to choose ridge parameters and pRF parameters.
# You could experiment with different % holdout, 10% usually works well.
pct_holdout = 0.10
n_images_total = info.shape[0]
n_images_notval = np.sum(~val_inds)
n_images_holdout = int(np.ceil(n_images_notval*pct_holdout))
n_images_trn = n_images_notval - n_images_holdout

inds_notval = np.where(~val_inds)[0]
np.random.seed(rndseeds[si])
np.random.shuffle(inds_notval) # this is the only random part

inds_trn = inds_notval[0:n_images_trn]
inds_holdout = inds_notval[n_images_trn:]
assert(len(inds_holdout)==n_images_holdout)

trn_inds = np.isin(np.arange(0, n_images_total), inds_trn)
holdout_inds = np.isin(np.arange(0, n_images_total), inds_holdout)

# remove nan rows here
trn_inds = trn_inds[good_values]
val_inds = val_inds[good_values]
holdout_inds = holdout_inds[good_values]

# apply these indices to split the voxel data and image labels.
voxel_data_trn = voxel_data[trn_inds, :]
voxel_data_val = voxel_data[val_inds, :]
voxel_data_holdout = voxel_data[holdout_inds, :]

n_voxels = voxel_data_trn.shape[1]
print(voxel_data_trn.shape, voxel_data_val.shape, voxel_data_holdout.shape)

image_order_use = image_order[good_values]

image_inds_trn = image_order_use[trn_inds]
image_inds_val = image_order_use[val_inds]
image_inds_holdout = image_order_use[holdout_inds]

Make a function that splits the features into training, holdout, and validation partitions, and z-scores the features.

In [None]:
def split_normalize_feats(f):

    f_trn = f[trn_inds,:]
    f_val = f[val_inds,:]
    f_out = f[holdout_inds,:]

    # Z-score the data - this is a step that helps with fit stability.
    # I'm computing the normalization parameters (mean and std) on my training data only
    # (plus the nested held-out partition), but not the val set.
    # this helps reduce leakage of data between train and val partitions.
    # then apply those same normalization parameters to the val set too.
    f_concat = np.concatenate([f_trn, f_out], axis=0)
    # f_concat = f_trn

    features_m = np.mean(f_concat, axis=0, keepdims=True) #[:trn_size]
    # print(features_m[0,0:10])
    features_s = np.std(f_concat, axis=0, keepdims=True) + 1e-6

    f_trn -= features_m
    f_trn /= features_s
    f_out -= features_m
    f_out /= features_s
    f_val -= features_m
    f_val /= features_s

    # add the intercept: a column of ones
    f_trn = np.concatenate([f_trn, np.ones(shape=(len(f_trn), 1), dtype=f_trn.dtype)], axis=1)
    f_out = np.concatenate([f_out, np.ones(shape=(len(f_out), 1), dtype=f_out.dtype)], axis=1)
    f_val = np.concatenate([f_val, np.ones(shape=(len(f_val), 1), dtype=f_val.dtype)], axis=1)

    #
    return f_trn, f_val, f_out

Define the candidate lambda values for ridge regression.
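
To get a feel for the grid defined in the next cell, the sketch below rebuilds it in float64 and compares it with `np.geomspace`, which spaces points evenly on a log scale. The variable names `lo`, `hi`, `grid_log`, and `grid_geom` are introduced here for illustration only.

```python
import numpy as np

n_lambdas = 20
lo, hi = 0.0001, 10**10 + 0.01

# Same construction as in the notebook, but in float64 for readability.
grid_log = np.logspace(np.log(lo), np.log(hi), n_lambdas, base=np.e) - 0.01

# Equivalent form: geomspace places points evenly on a log scale between lo and hi.
grid_geom = np.geomspace(lo, hi, n_lambdas) - 0.01

print(np.allclose(grid_log, grid_geom))
print('smallest = %.4f, largest = %.3e' % (grid_log[0], grid_log[-1]))
```

Because 0.01 is subtracted from a grid that starts at 0.0001, the smallest candidate is slightly below zero. Relative to the scale of $X^TX$ for z-scored features this is a negligible penalty, so it behaves like essentially unregularized regression.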

In [None]:
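# lambda is the ridge penalty, bigger = more regularization
# the candidates are log-spaced; after subtracting 0.01, the smallest one is
# slightly negative (about -0.0099), which acts like almost no regularization.
n_lambdas = 20
lambdas = np.logspace(np.log(0.0001),np.log(10**10+0.01),n_lambdas, dtype=np.float32, base=np.e) - 0.01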

Write a function that returns the set of voxels in any ROI.
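
The lookup pattern used by the function below can be illustrated on a toy example: a dictionary maps each ROI name to an integer label, a label array covers the whole brain, and the voxel mask restricts it to the voxels in our data matrix. The names, labels, and sizes here are made up; the real fields in `rinfo` are much larger.

```python
import numpy as np

# Toy stand-ins for the fields in rinfo (names, labels, and sizes are made up).
roi_names = {'V1v': 1, 'V1d': 2, 'V2v': 3}  # ROI name -> integer label
roi_labels = np.array([0, 1, 1, 2, 0, 3, 3, 1, 0, 2])  # one label per brain voxel
voxel_mask = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0], dtype=bool)  # voxels we analyze

# Restrict the labels to the analyzed voxels, then compare against the ROI's label.
labels_in_mask = roi_labels[voxel_mask]
roi_inds = labels_in_mask == roi_names['V1v']

print(roi_inds)        # boolean vector, one entry per analyzed voxel
print(roi_inds.sum())  # number of analyzed voxels inside V1v
```

The resulting boolean vector has one entry per column of the voxel data matrix, so it can be used directly to index columns, as in `voxel_data_trn[:, roi_inds]`.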

In [None]:

def get_roi_vox(roi_name = 'FFA-2'):

  if roi_name in rinfo['ret_prf_roi_names']:
    ind_num = rinfo['ret_prf_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_retino'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_face_roi_names']:
    ind_num = rinfo['floc_face_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_face'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_place_roi_names']:
    ind_num = rinfo['floc_place_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_place'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_body_roi_names']:
    ind_num = rinfo['floc_body_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_body'][voxel_mask]==ind_num


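  elif roi_name=='all':
    # return all of vis cortex, very big
    roi_inds = np.ones((np.sum(rinfo['voxel_mask']),), dtype=bool)

  else:
    # fail early with a clear message, instead of an UnboundLocalError at the return
    raise ValueError('Unknown ROI name: %s'%roi_name)

  return roi_inds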


List all of the possible ROIs in the dataset. Retinotopic ROIs (ret_prf) are early visual areas defined using retinotopic mapping. Face-selective, place-selective, and body-selective ROIs are higher visual areas defined using category localizers.

In [None]:
# Retinotopic ROI definitions:
print(rinfo['ret_prf_roi_names'])
# Face-selective ROI definitions:
print(rinfo['floc_face_roi_names'])
# Place-selective ROI definitions:
print(rinfo['floc_place_roi_names'])
# Body-selective ROI definitions:
print(rinfo['floc_body_roi_names'])

Make a function to compute $R^2$.
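
As a reminder, $R^2 = 1 - SS_{res}/SS_{tot}$, where $SS_{res}$ is the sum of squared prediction errors and $SS_{tot}$ is the sum of squared deviations of the data from its mean. The toy sketch below uses a hypothetical stand-in function, `r2_score_columns`, with the same formula. It shows three reference cases, including the fact that $R^2$ can be negative on held-out data when predictions are worse than simply predicting the mean.

```python
import numpy as np


def r2_score_columns(actual, predicted):
    # Hypothetical stand-in with the same formula as the notebook's R2 function.
    ssres = np.sum((predicted - actual) ** 2, axis=0)
    sstot = np.sum((actual - actual.mean(axis=0)) ** 2, axis=0)
    return 1 - ssres / sstot


actual = np.array([[1.0], [2.0], [3.0], [4.0]])

perfect = actual.copy()                          # predictions equal to the data
mean_only = np.full_like(actual, actual.mean())  # always predict the mean
reversed_pred = actual[::-1].copy()              # anti-correlated predictions

print(r2_score_columns(actual, perfect))        # [1.]
print(r2_score_columns(actual, mean_only))      # [0.]
print(r2_score_columns(actual, reversed_pred))  # [-3.]
```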

In [None]:
def get_r2(actual,predicted):
    """
    This computes the coefficient of determination (R2).
    Always goes along first dimension (i.e. the trials/samples dimension)
    """
    ssres = np.sum(np.power((predicted - actual),2), axis=0)
    sstot = np.sum(np.power((actual - np.mean(actual, axis=0)),2), axis=0)
    r2 = 1-(ssres/sstot)

    return r2

Create a function to solve the ridge regression problem. You can look back to Lab 7 for a reminder on the details of this.

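This function will take as inputs:
- **xtrn**: DNN features from a layer of interest. [n_images x n_features]
- **vtrn**: Neural data from a brain region of interest. [n_images x n_voxels]
- **lambdas**: The candidate values for $\lambda$, the ridge penalty parameter.

Note that the function also uses the holdout partitions **xout** and **vout** to choose the best $\lambda$ for each voxel. These are not passed in as arguments; they are read from the notebook's global variables, so they must be defined before the function is called.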
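
For a single value of $\lambda$, the ridge solution has the closed form $W = (X^TX + \lambda I)^{-1}X^TY$. The sketch below is a small NumPy illustration of that formula on synthetic data. It is not the notebook's implementation: it handles one $\lambda$ at a time, uses `np.linalg.solve` rather than an explicit inverse, and the function name `ridge_weights` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_features, n_voxels = 200, 5, 3

# Synthetic features and responses (for illustration only).
X = rng.normal(size=(n_images, n_features))
W_true = rng.normal(size=(n_features, n_voxels))
Y = X @ W_true + 0.1 * rng.normal(size=(n_images, n_voxels))


def ridge_weights(X, Y, lam):
    # Solve (X^T X + lam * I) W = X^T Y without forming an explicit inverse.
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)


W_small = ridge_weights(X, Y, lam=0.0)     # no penalty: ordinary least squares
W_large = ridge_weights(X, Y, lam=1000.0)  # strong penalty: weights shrink toward zero

print(np.linalg.norm(W_small), np.linalg.norm(W_large))
```

The weights fitted with the large penalty have a smaller norm than the unpenalized ones. This is the trade-off that the holdout set is used to tune, separately for each voxel.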

In [None]:
def solve_ridge_fast(xtrn, vtrn, lambdas):

  n_features = xtrn.shape[1]
  # xT * x
  mult = xtrn.T @ xtrn

  # make an identity matrix
  ridge_term = torch.eye(xtrn.size()[1], device=device, dtype=torch.float64)

  # make versions of this matrix that are adjusted by each possible lambda value
  # this is: (X^T @ X + lambda*I)^(-1)
  # first dim is the different lambda values.
  lambda_matrices = torch.stack([(mult+ridge_term*l).inverse() \
            for l in lambdas], axis=0)

  cofactor = torch.tensordot(lambda_matrices, xtrn, dims=[[2],[1]])

  # solve for weights
  weights = torch.tensordot(cofactor, vtrn, dims=[[2], [0]]) # [#lambdas, #feature, #voxel]

  # predict the response on held-out data, using features from held-out data.
  # NOTE: xout (and vout, below) are not arguments; they are read from the notebook's global scope.
  pred = torch.tensordot(xout, weights, dims=[[1],[1]]) # [#samples, #lambdas, #voxels]

  # compute loss for held-out data
  # this will tell us the loss for each of the possible lambda values
  loss = torch.sum(torch.pow(vout[:,None,:] - pred, 2), dim=0) # [#lambdas, #voxels]
  loss = loss.cpu().numpy()

  weights_use = torch.zeros((n_features, vtrn.shape[1]),device=device, dtype=weights.dtype)

  # for each voxel, find its best weights
  for vi in range(vtrn.shape[1]):
    # choose the best lambda value, based on min loss
    best_lambda_ind = np.argmin(loss[:,vi])
    best_lambda = lambdas[best_lambda_ind]
    # print(vi, best_lambda_ind, best_lambda)
    weights_use[:, vi] = weights[best_lambda_ind,:,vi]

  return weights_use

###Step 4: Fit models for different brain areas and different layers.

First, let's run through the fitting for one region and one layer:

In [None]:
# pick an ROI
target_roi_name = 'V1v'
voxel_inds = get_roi_vox(roi_name = target_roi_name)
n_voxels = np.sum(voxel_inds)

# v is our fMRI data, these are the 3 different splits
vtrn = torch.Tensor(voxel_data_trn[:,voxel_inds]).to(device).to(torch.float64)
vout = torch.Tensor(voxel_data_holdout[:,voxel_inds]).to(device).to(torch.float64)
vval = torch.Tensor(voxel_data_val[:,voxel_inds]).to(device).to(torch.float64)

# pick a DNN layer
layer_name = 'Conv1'

# pull out the features from this DNN layer
f = fdata[layer_name]

f_trn, f_val, f_out = split_normalize_feats(f)

# x is the DNN features, split into partitions
xtrn = torch.Tensor(f_trn).to(device).to(torch.float64)
xout = torch.Tensor(f_out).to(device).to(torch.float64)
xval = torch.Tensor(f_val).to(device).to(torch.float64)

# Fit the encoding model weights
st = time.time()
weights_use = solve_ridge_fast(xtrn, vtrn, lambdas)
elapsed = time.time() - st
print('elapsed = %.5f s'%elapsed)

# predict voxel response in held-out validation data.
# yhat = X @ W
pred = xval @ weights_use

# remember to turn these back into numpy, from torch.
# sometimes tensors will give errors in your subsequent numpy code.
actual_array = vval.cpu().numpy()
pred_array = pred.cpu().numpy()

r2 = get_r2(actual_array, pred_array)

print(r2.shape)



---
***Question 1:***

Run the encoding model fitting for different AlexNet layers. Try: "Conv1","Conv2","Conv3","Conv4","Conv5". To do this, start by copying the code from above and modifying it; you may want to use a loop.

Print out the median $R^2$ for each layer.

How do the results compare for different layers? Does this make sense?


In [None]:
# [answer here]



---



Now, let's try running this same code for a larger group of voxels. Instead of just V1v, we can gather all the voxels in visual cortex, and fit at the same time.

NOTE that this part can take a longer time than the above code. Make sure you're attached to the T4: GPU runtime type.

In [None]:
# pick an ROI
target_roi_name = 'all' # if you pass in all, we get all visual cortex voxels.
voxel_inds = get_roi_vox(roi_name = target_roi_name)
n_voxels = np.sum(voxel_inds)

# v is our fMRI data, these are the 3 different splits
vtrn = torch.Tensor(voxel_data_trn[:,voxel_inds]).to(device).to(torch.float64)
vout = torch.Tensor(voxel_data_holdout[:,voxel_inds]).to(device).to(torch.float64)
vval = torch.Tensor(voxel_data_val[:,voxel_inds]).to(device).to(torch.float64)

# looping over the different layers, saving R2 for each layer.
r2_each_layer = np.zeros((len(layers), n_voxels))

for li, layer_name in enumerate(layers):

  # pull out the features from this DNN layer
  f = fdata[layer_name]

  f_trn, f_val, f_out = split_normalize_feats(f)

  # x is the DNN features, split into partitions
  xtrn = torch.Tensor(f_trn).to(device).to(torch.float64)
  xout = torch.Tensor(f_out).to(device).to(torch.float64)
  xval = torch.Tensor(f_val).to(device).to(torch.float64)

  # Fit the encoding model weights
  st = time.time()
  weights_use = solve_ridge_fast(xtrn, vtrn, lambdas)
  elapsed = time.time() - st
  print('elapsed = %.5f s'%elapsed)

  # predict voxel response in held-out validation data.
  # yhat = X @ W
  pred = xval @ weights_use

  # remember to turn these back into numpy, from torch.
  # sometimes tensors will give errors in your subsequent numpy code.
  actual_array = vval.cpu().numpy()
  pred_array = pred.cpu().numpy()

  r2 = get_r2(actual_array, pred_array)

  r2_each_layer[li,:] = r2

  print('\nPerformance for %s, %s layer:'%(target_roi_name, layer_name))
  print('\nMin R2: %.3f'%np.min(r2))
  print('Max R2: %.3f'%np.max(r2))
  print('Median R2: %.3f'%np.median(r2))




---
***Question 2:***

For each voxel, calculate which of the AlexNet layers resulted in the best $R^2$. You can use the variable "r2_each_layer" for this.

For each layer, print out how many of the voxels have that layer as their best layer.

In [None]:
#[answer here]



---



Now let's visualize these results on the cortical surface. We will use PyCortex for this. To do this, we need to know which voxels in the matrix we've been working with correspond to which positions on the cortical surface. In PyCortex, a transformation file (.xfm) provides that information. We need to re-organize the data slightly for this:

In [None]:
# this is the data we want to plot: best layer for each cortical voxel
# it's an integer 0-4
best_layer_each_vox = np.argmax(r2_each_layer, axis=0)

# need to put these values into a larger list
my_values = np.full(shape = voxel_inds.shape, fill_value = np.nan)
my_values[voxel_inds] = best_layer_each_vox

# more parameters needed for this
xfmname = 'func1pt8_to_anat0pt8_autoFSbbr'
# ^this is the pre-computed transform we will use - mapping from functional to anatomical surface space.
substr = 'subj%02d'%subject
voxel_mask = rinfo['voxel_mask'] # all vis cortex voxels to whole brain mask
vol_shape = rinfo['brain_nii_shape'] # size of the 3D volume
mask_3d = np.reshape(voxel_mask, vol_shape, order='C') # 3D mask, indicating which voxels are in our matrix.


Put the data into a 3D volume: has values for the voxels that we analyzed here, and NaNs for other voxels.
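
The core idea can be sketched on a tiny made-up volume: start from an all-NaN array and use the boolean mask to fill in only the analyzed voxels. The volume size and values below are arbitrary. The notebook's own function additionally reorders the axes with `np.moveaxis` before the volume is handed to PyCortex.

```python
import numpy as np

vol_shape = (2, 3, 4)  # toy volume size (made up)
voxel_mask = np.zeros(np.prod(vol_shape), dtype=bool)
voxel_mask[[1, 5, 6, 20]] = True  # four "analyzed" voxels, as flattened indices

values = np.array([10.0, 20.0, 30.0, 40.0])  # one value per analyzed voxel

# Start from an all-NaN volume and fill only the masked positions.
mask_3d = voxel_mask.reshape(vol_shape)  # C order, matching the flattened mask
full_vals = np.full(vol_shape, np.nan)
full_vals[mask_3d] = values  # boolean indexing fills positions in C order

print(np.sum(~np.isnan(full_vals)))      # 4
print(full_vals.ravel()[[1, 5, 6, 20]])  # [10. 20. 30. 40.]
```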

In [None]:
def get_full_volume(values, voxel_mask, shape):
    """
    For PyCortex: Put values for voxels that were analyzed back into their
    correct coordinates in full volume space matrix.
    """
    voxel_mask_3d = np.reshape(voxel_mask, shape)
    full_vals = copy.deepcopy(voxel_mask_3d).astype('float64')
    full_vals[voxel_mask_3d==0] = np.nan
    full_vals[voxel_mask_3d==1] = values

    full_vals = np.moveaxis(full_vals, [0,1,2], [2,1,0])

    return full_vals

v = get_full_volume(my_values, voxel_mask, vol_shape)

Now make the PyCortex flatmap plot.

In this plot, the colors correspond to different numbers, which indicate different AlexNet layers.

Blue = 0 = Conv1, and so on.
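
The plotting cell below sets vmin=-0.5 and vmax=4.5. One way to see why is to compute where each integer layer index falls along the colormap. This sketch assumes the values are mapped linearly between vmin and vmax (as in matplotlib's default normalization); that is an assumption about the plotting behavior, not something stated in this notebook.

```python
vmin, vmax = -0.5, 4.5
layer_names = ['Conv1', 'Conv2', 'Conv3', 'Conv4', 'Conv5']

# Fraction of the way along the colormap for each integer value,
# assuming a linear mapping between vmin and vmax.
positions = [(value - vmin) / (vmax - vmin) for value in range(len(layer_names))]

for value, (layer, pos) in enumerate(zip(layer_names, positions)):
    print('%d -> %s -> colormap position %.1f' % (value, layer, pos))
```

Under that assumption, the five integer values land at the centers of five equal-width bins, so each layer gets a distinct, evenly spaced color.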

In [None]:
plt.rcParams.update({'font.size': 18})

vol = cortex.Volume(data = v, cmap='BPROG', subject=substr, \
                                           vmin=-0.5, vmax=4.5,\
                                           xfmname=xfmname, mask=mask_3d)

fig = cortex.quickflat.make_figure(vol, with_curvature=True, with_labels=True)



---

***Question 3:***

What patterns do you notice in this surface plot?

Which colors (i.e. which layers) are most common in early visual versus higher visual areas? Why?

Are there any exceptions to this pattern?

[answer here]



---



###Step 5: Run variance partition analysis.

When comparing the performance of different encoding models, one thing we often notice is that multiple models can have high $R^2$ for a given voxel. This can create ambiguity in interpretation, because we don't know if they are both explaining the same component of the voxel's response, or independent components.


- **Variance partitioning** provides a tool for dissecting how much of the variance explained is shared versus unique between two (or more) models.


To do this, we start by constructing larger models that consist of multiple feature spaces. In this case they will be different layers of the same DNN, but the same principle can be applied to comparing different DNNs. The features from the layers will be concatenated together. Then, we construct sub-models by "leaving out" one set of features at a time. The theory is that if we combine two layers, $R^2$ is generally better than either layer alone, and if we leave out one of the layers, $R^2$ will drop.

- The drop in $R^2$ (i.e. variance explained) when leaving out a specific layer indicates the **unique variance** for that layer.

Mathematically:

**Given:**
- $R^2_A$ = $R^2$ from model with only predictor A
- $R^2_B$ = $R^2$ from model with only predictor B  
- $R^2_{AB}$ = $R^2$ from combined model with both predictors A and B


**Unique variance explained by A:**

$$R^2_{\text{unique A}} = R^2_{AB} - R^2_B$$

**Unique variance explained by B:**

$$R^2_{\text{unique B}} = R^2_{AB} - R^2_A$$

**Shared variance:**

$$R^2_{\text{shared}} = R^2_{AB} - R^2_{\text{unique A}} - R^2_{\text{unique B}}$$
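
Substituting the two unique terms into the last formula gives the equivalent form $R^2_{\text{shared}} = R^2_A + R^2_B - R^2_{AB}$. The toy sketch below applies these formulas to synthetic data in which predictors A and B share a common component. As a simplification, it uses ordinary least squares and a simple train/test split instead of the ridge procedure used in this lab; the helper name `fit_r2` and all data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic predictors: A and B share a common component, so they are correlated.
common = rng.normal(size=n)
A = (common + rng.normal(size=n))[:, None]
B = (common + rng.normal(size=n))[:, None]
y = A[:, 0] + B[:, 0] + rng.normal(size=n)  # response depends on both predictors


def fit_r2(X, y, n_train=1000):
    # Ordinary least squares on the first half, R^2 on the second half.
    X1 = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    w, *_ = np.linalg.lstsq(X1[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - X1[n_train:] @ w
    sstot = np.sum((y[n_train:] - y[n_train:].mean()) ** 2)
    return 1 - np.sum(resid ** 2) / sstot


r2_A = fit_r2(A, y)
r2_B = fit_r2(B, y)
r2_AB = fit_r2(np.hstack([A, B]), y)

unique_A = r2_AB - r2_B
unique_B = r2_AB - r2_A
shared = r2_AB - unique_A - unique_B

print('unique A = %.3f, unique B = %.3f, shared = %.3f' % (unique_A, unique_B, shared))
```

Because A and B were built from a common signal, part of what each explains is also explained by the other, which shows up as a positive shared term.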

Let's see how this works in code:

In [None]:
# pick an ROI
# target_roi_name = 'EBA'
target_roi_name = 'all'

voxel_inds = get_roi_vox(roi_name = target_roi_name)
n_voxels = np.sum(voxel_inds)

# v is our fMRI data, these are the 3 different splits
vtrn = torch.Tensor(voxel_data_trn[:,voxel_inds]).to(device).to(torch.float64)
vout = torch.Tensor(voxel_data_holdout[:,voxel_inds]).to(device).to(torch.float64)
vval = torch.Tensor(voxel_data_val[:,voxel_inds]).to(device).to(torch.float64)

Pick two DNN layers to compare:

In [None]:
layer_name_A = 'Conv1'
layer_name_B = 'Conv4'

# pull out the features from each DNN layer: layer A and layer B
f_A = fdata[layer_name_A]
f_B = fdata[layer_name_B]

# Make a set of "concatenated" features, combining across both sets.
f_concat = np.concatenate([f_A, f_B], axis=1)
print(f_A.shape, f_B.shape, f_concat.shape)


Now run the encoding model with each of these feature sets:

In [None]:
f = f_A

f_trn, f_val, f_out = split_normalize_feats(f)

# x is the DNN features, split into partitions
xtrn = torch.Tensor(f_trn).to(device).to(torch.float64)
xout = torch.Tensor(f_out).to(device).to(torch.float64)
xval = torch.Tensor(f_val).to(device).to(torch.float64)

# Fit the encoding model weights
st = time.time()
weights_use = solve_ridge_fast(xtrn, vtrn, lambdas)
elapsed = time.time() - st
print('elapsed = %.5f s'%elapsed)

# predict voxel response in held-out validation data.
# yhat = X @ W
pred = xval @ weights_use
actual_array = vval.cpu().numpy()
pred_array = pred.cpu().numpy()

r2 = get_r2(actual_array, pred_array)

r2_A = r2

In [None]:
f = f_B

f_trn, f_val, f_out = split_normalize_feats(f)

# x is the DNN features, split into partitions
xtrn = torch.Tensor(f_trn).to(device).to(torch.float64)
xout = torch.Tensor(f_out).to(device).to(torch.float64)
xval = torch.Tensor(f_val).to(device).to(torch.float64)

# Fit the encoding model weights
st = time.time()
weights_use = solve_ridge_fast(xtrn, vtrn, lambdas)
elapsed = time.time() - st
print('elapsed = %.5f s'%elapsed)

# predict voxel response in held-out validation data.
# yhat = X @ W
pred = xval @ weights_use
actual_array = vval.cpu().numpy()
pred_array = pred.cpu().numpy()

r2 = get_r2(actual_array, pred_array)

r2_B = r2

In [None]:
f = f_concat

f_trn, f_val, f_out = split_normalize_feats(f)

# x is the DNN features, split into partitions
xtrn = torch.Tensor(f_trn).to(device).to(torch.float64)
xout = torch.Tensor(f_out).to(device).to(torch.float64)
xval = torch.Tensor(f_val).to(device).to(torch.float64)

# Fit the encoding model weights
st = time.time()
weights_use = solve_ridge_fast(xtrn, vtrn, lambdas)
elapsed = time.time() - st
print('elapsed = %.5f s'%elapsed)

# predict voxel response in held-out validation data.
# yhat = X @ W
pred = xval @ weights_use
actual_array = vval.cpu().numpy()
pred_array = pred.cpu().numpy()

r2 = get_r2(actual_array, pred_array)

r2_concat = r2



---
***Question 4:***

Based on the formulas and code above, compute the unique variance for feature set A, the unique variance for feature set B, and the shared variance between A and B.

For each voxel, take the sum of [unique A, unique B, shared variance]. What do these values add up to?

Print out the minimum, maximum, and median values of unique and shared variance, across voxels.



In [None]:
# [answer here]



---



Now let's visualize the variance partitioning on a brain map.

First plot the overall performance of the concatenated model (feature set A + feature set B).


In [None]:
plt.rcParams.update({'font.size': 18})

vals = r2_concat

# need to put these values into a larger list
my_values = np.full(shape = voxel_inds.shape, fill_value = np.nan)
my_values[voxel_inds] = vals

v = get_full_volume(my_values, voxel_mask, vol_shape)

vol = cortex.Volume(data = v, cmap='Blues', subject=substr, \
                                          vmin=0, vmax=0.4,\
                                          xfmname=xfmname, mask=mask_3d)

fig = cortex.quickflat.make_figure(vol, with_curvature=True, with_labels=True)
plt.title('Combined A+B model performance')

Plot the unique variance for each model:

In [None]:
plt.rcParams.update({'font.size': 18})

unique_A = r2_concat - r2_B
unique_B = r2_concat - r2_A

shared = r2_concat - unique_A - unique_B

vals = unique_A

# need to put these values into a larger list
my_values = np.full(shape = voxel_inds.shape, fill_value = np.nan)
my_values[voxel_inds] = vals

v = get_full_volume(my_values, voxel_mask, vol_shape)

vol = cortex.Volume(data = v, cmap='Blues', subject=substr, \
                                          vmin=0, vmax=0.4,\
                                          xfmname=xfmname, mask=mask_3d)

fig = cortex.quickflat.make_figure(vol, with_curvature=True, with_labels=True)
plt.title('Unique Variance: %s layer'%layer_name_A)

In [None]:
plt.rcParams.update({'font.size': 18})

unique_A = r2_concat - r2_B
unique_B = r2_concat - r2_A

shared = r2_concat - unique_A - unique_B

vals = unique_B

# need to put these values into a larger list
my_values = np.full(shape = voxel_inds.shape, fill_value = np.nan)
my_values[voxel_inds] = vals

v = get_full_volume(my_values, voxel_mask, vol_shape)

vol = cortex.Volume(data = v, cmap='Blues', subject=substr, \
                                          vmin=0, vmax=0.4,\
                                          xfmname=xfmname, mask=mask_3d)

fig = cortex.quickflat.make_figure(vol, with_curvature=True, with_labels=True)
plt.title('Unique Variance: %s layer'%layer_name_B)

In [None]:
plt.rcParams.update({'font.size': 18})

unique_A = r2_concat - r2_B
unique_B = r2_concat - r2_A

shared = r2_concat - unique_A - unique_B

vals = shared

# need to put these values into a larger list
my_values = np.full(shape = voxel_inds.shape, fill_value = np.nan)
my_values[voxel_inds] = vals

v = get_full_volume(my_values, voxel_mask, vol_shape)

vol = cortex.Volume(data = v, cmap='Blues', subject=substr, \
                                          vmin=0, vmax=0.4,\
                                          xfmname=xfmname, mask=mask_3d)

fig = cortex.quickflat.make_figure(vol, with_curvature=True, with_labels=True)
plt.title('Shared Variance')



---
***Question 5:***

What patterns do you notice in the unique variance and shared variance plots across the brain?

Where is the unique variance highest for each layer? Where is the shared variance highest?




[answer here]



---



***Question 6:***

Try re-running the variance partitioning for a different pair of layers. (Change "layer_name_A" and "layer_name_B" above).

What patterns do you notice for this new pair of layers? Are there any pairs of layers that have especially high shared variance with each other?

[answer here]


---

