[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hendersonneurolab/CogAI_Fall2025/blob/master/Lab09_Neural_Decoding.ipynb)

## Week 9: Decoding variables from fMRI data.

In this lab, we'll learn how to construct a decoder to read out information from population-level neural activation patterns. This is a form of multi-voxel pattern analysis (MVPA). We will use a linear decoder, a tool that is commonly used in neuroscience to measure the information about a variable that is represented by a neural population. As in the previous exercises, we'll use data from the Natural Scenes Dataset (NSD).

**Learning Objectives:**
- Understand the steps involved in constructing and evaluating a classifier using neural data.
- Know how to interpret the accuracy of neural decoders, and compare results between brain regions.


### Step 1: Importing libraries and downloading data.

In [None]:
import numpy as np
import urllib.request
from io import BytesIO
import matplotlib.pyplot as plt
import pandas as pd
import scipy
import os, sys
import h5py
import time
import torch
import zipfile
import copy
import warnings
import shutil
import sklearn
import sklearn.svm, sklearn.discriminant_analysis, sklearn.linear_model
warnings.filterwarnings('ignore')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)


First, mount the Google Drive storage.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
# Navigate to the Colab Notebooks folder
colab_notebooks_path = '/content/drive/MyDrive/Colab Notebooks/'
os.chdir(colab_notebooks_path)
os.makedirs('CogAI', exist_ok=True)
os.makedirs('CogAI/data', exist_ok=True)
data_folder = os.path.join(colab_notebooks_path, 'CogAI', 'data')
print(data_folder)

Download data files for this exercise. Note that several of these files were also used in a previous week's lab (Lab07), so you might already have them in your Drive storage.

In [None]:
# Info about ROIs
dbox_link = 'https://www.dropbox.com/scl/fi/kilrzj841mrpm17aj9gid/S1_voxel_roi_info.npy?rlkey=jgt1zje70ta8qpmaib8kmjskp&st=n9b3bxft&dl=1'
filename = os.path.join(data_folder, 'S1_voxel_roi_info.npy')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

In [None]:
# Info about images
dbox_link = 'https://www.dropbox.com/scl/fi/caabn3on6l9q8w32uxq8z/S1_image_info.csv?rlkey=cwb4mfruzcyrozrmdrfgerpp2&st=fg9i8ml3&dl=1'
filename = os.path.join(data_folder, 'S1_image_info.csv')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

fMRI data: this one can take a while to download, but shouldn't take more than 2 minutes. If it's taking too long, check with the professor.

In [None]:
# fMRI data
dbox_link = 'https://www.dropbox.com/scl/fi/e040hit5avrptp4hbnijy/S1_betas_avg_bigmask.hdf5?rlkey=s8bpoara1fln1dhf1ydjpuldn&st=323ct1jm&dl=1'
filename = os.path.join(data_folder, 'S1_betas_avg_bigmask.hdf5')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

Download category labels for the images.

Recall from the NSD paper that all the images in this dataset come from the MS COCO dataset and include annotations for many different categories. Here, we'll download a simple version of these annotations.

More info about COCO can be found here: https://cocodataset.org/#explore

In [None]:
dbox_link = 'https://www.dropbox.com/scl/fi/81gok5e9abzeq45oz4wq6/S1_cocolabs_binary.csv?rlkey=fpxzptb96h6poznf2345lcnzz&st=g8dizxxm&dl=1'
filename = os.path.join(data_folder, 'S1_cocolabs_binary.csv')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)

Download the COCO images themselves, for this subject.

In [None]:
dbox_link = 'https://www.dropbox.com/scl/fi/vfoeto2lrpoz2ik2642nn/S1_stimuli_224.h5py?rlkey=v74bvk0he1qi4tx1mmjgjk70j&st=ohc37b2b&dl=1'
filename = os.path.join(data_folder, 'S1_stimuli_224.h5py')
if not os.path.exists(filename):
  st = time.time()
  print('downloading to %s...'%filename)
  urllib.request.urlretrieve(dbox_link, filename)
  print('elapsed = %.2f seconds'%(time.time() - st))
else:
  print('We already have: %s'%filename)


### Step 2: Load and inspect the data.

Now that the files are downloaded, we need to load them into Python.


First, load the **NSD images**. This is a big matrix [10,000 x 3 x 224 x 224]. It contains the pixel values for all 10,000 images in the experiment. The 10,000 images are in the same order as in the fMRI data matrix.

In [None]:
data_filename = os.path.join(data_folder, 'S1_stimuli_224.h5py')
print(data_filename)

t = time.time()
with h5py.File(data_filename, 'r') as data_set:
  print(data_set.keys())
  # copy into memory; the "with" block closes the file automatically
  images = np.copy(data_set['/stimuli'])
elapsed = time.time() - t
print('Took %.5f seconds to load file'%elapsed)

Next load the **COCO category labels** for the images.

These are stored as a pandas dataframe, where each column is one category.

They are binary labels, which indicate whether the category is present or absent in each image. 1=present, 0=absent.

Note that images can have more than one category in them. For example, an image of a person on a boat would be labeled with "person" and "vehicle".

In [None]:
fn = os.path.join(data_folder, 'S1_cocolabs_binary.csv')
print(fn)
cocolabs = pd.read_csv(fn, index_col=0)
cocolabs = cocolabs.iloc[:,0:12]
# I'm pulling out the first 12 columns here, which correspond to the "supercategories" in COCO.
# The remaining columns label the more fine-grained categories.

cocolabs

To understand this, let's plot a random image and its associated COCO labels. You can run this cell a few times to get different images.

In [None]:
# ii = 2343
ii = np.random.choice(np.arange(10000),1)
img=np.moveaxis(images[ii[0],:,:,:], [0], [2])
plt.imshow(img)

# pull out corresponding row of the dataframe.
cocolabs.iloc[ii]


Next, load the **fMRI data**.

The data (betas_avg_bigmask) is organized as: [images x voxels]

Each image was shown multiple times, and these values capture the average response to each image.

Keep in mind this is already several steps of preprocessing removed from the raw data. Steps like motion correction have already been performed to improve signal quality. Single-trial beta weights were already extracted using a GLM analysis. Data have also been z-scored, within each session, and the beta weights for repetitions of each image have been averaged.
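
To make the last two preprocessing steps concrete, here is a minimal sketch on made-up numbers. It is not the NSD pipeline: the array sizes, the `session_ids` and `image_ids` vectors, and all variable names are hypothetical, chosen only to illustrate z-scoring within a session followed by averaging over repeats of the same image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 12 single-trial beta patterns over 5 voxels.
n_trials, n_voxels = 12, 5
betas = rng.normal(loc=3.0, scale=2.0, size=(n_trials, n_voxels))

# Assumed bookkeeping: which session and which image each trial belongs to.
session_ids = np.repeat([0, 1], 6)   # 2 sessions, 6 trials each
image_ids = np.tile([0, 1, 2], 4)    # 3 images, 4 repeats each

# z-score each voxel within each session (mean 0, std 1 across trials).
betas_z = np.zeros_like(betas)
for ss in np.unique(session_ids):
    in_sess = session_ids == ss
    mu = betas[in_sess].mean(axis=0, keepdims=True)
    sd = betas[in_sess].std(axis=0, keepdims=True)
    betas_z[in_sess] = (betas[in_sess] - mu) / sd

# Average over the repeats of each image, giving [images x voxels].
betas_avg = np.stack([betas_z[image_ids == ii].mean(axis=0)
                      for ii in np.unique(image_ids)])

print(betas_avg.shape)
```

The result has one row per image, which is the same layout as the `betas_avg_bigmask` matrix loaded below.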

Note also that the voxels in this matrix are not the entire brain. They represent a wide portion of visual cortex, including all voxels with reliable signal.
   


In [None]:
data_filename = os.path.join(data_folder, 'S1_betas_avg_bigmask.hdf5')
print(data_filename)

t = time.time()
with h5py.File(data_filename, 'r') as data_set:
    # copy into memory; the "with" block closes the file automatically
    voxel_data = np.copy(data_set['/betas'])
elapsed = time.time() - t
print('Took %.5f seconds to load file'%elapsed)

Load info about the **ROIs (regions of interest)** in this dataset.

In [None]:
# ROI = region of interest. These are visual areas we want to focus on for analysis.
fn = os.path.join(data_folder, 'S1_voxel_roi_info.npy')
print(fn)
rinfo = np.load(fn, allow_pickle=True).item()
# this is a dictionary that contains information about which voxels our data will include.

Make a function to pull out voxels in a target ROI.

In [None]:
def get_roi_vox(roi_name = 'FFA-2'):

  voxel_mask = rinfo['voxel_mask']

  if roi_name in rinfo['ret_prf_roi_names']:
    ind_num = rinfo['ret_prf_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_retino'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_face_roi_names']:
    ind_num = rinfo['floc_face_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_face'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_place_roi_names']:
    ind_num = rinfo['floc_place_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_place'][voxel_mask]==ind_num

  elif roi_name in rinfo['floc_body_roi_names']:
    ind_num = rinfo['floc_body_roi_names'][roi_name]
    roi_inds = rinfo['roi_labels_body'][voxel_mask]==ind_num

  elif roi_name=='all':
    # return all of vis cortex, very big
    roi_inds = np.ones((np.sum(rinfo['voxel_mask']),), dtype=bool)

  return roi_inds

### Step 3: Classify animal vs. vehicle images.

We will start by constructing a binary classifier, which aims to use the fMRI activation patterns evoked by each image to decode the category of the viewed image.

Because the images in COCO are complex natural scenes, there are many possible ways we can classify them. They often have more than one category label.

As a starting point, we can define a binary distinction between images with animals and images with vehicles. Animals and vehicles are both visually and semantically distinct, so we should expect this classifier to do well. Other distinctions between more similar categories may be more difficult.

First let's define the animal and vehicle labels.

In [None]:
has_animal = np.array(cocolabs['animal']).astype(bool)
has_vehicle = np.array(cocolabs['vehicle']).astype(bool)

# Count the number that have one category, both, or neither.
print('Animal only: %d'%np.sum(has_animal & ~has_vehicle))
print('Vehicle only: %d'%np.sum(has_vehicle & ~has_animal))
print('Both: %d'%np.sum(has_animal & has_vehicle))
print('Neither: %d'%np.sum(~has_animal & ~has_vehicle))


We see that there are some images with both categories present, and some with neither category present. For the simplest implementation of a binary classifier, we want to ignore these "ambiguous" images and focus only on the "unambiguous" images, which contain either an animal or a vehicle but not both.

Let's pull out those unambiguous images only.

In [None]:
ambiguous = (has_animal & has_vehicle) | (~has_animal & ~has_vehicle)

print('%d ambiguous images'%np.sum(ambiguous))

labels = has_animal.astype(float)
# Setting the ambiguous ones to NaN here.
labels[ambiguous] = np.nan

print('%d unambiguous images'%np.sum(~np.isnan(labels)))

Decide how many cross-validation folds we want to use.

Recall some key terms here:

- **Cross-validation:** When we train a decoder (or, encoding model) on one set of data (*training set*) and test it on an independent set of data (*test set*). This is an essential step in machine learning, which ensures that we aren't just memorizing noise in the training data.
- **Folds:** During cross-validation, we split the data into train and test sets multiple times. Each time, we hold out some percent of the data, and we do this repeatedly until every trial has been in the test set exactly once. For example, in 4-fold cross validation, we would hold out 25% of the data on each fold, and would do that 4x, with a different 25% each time.

There is a trade-off here: with more folds, each fold trains on a larger share of the data, which can improve performance. But fewer folds will run faster, so let's try just 4 here.
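
The fold bookkeeping can be sketched on a toy example. The 8 "trials" and the hand-written `toy_cv_inds` vector below are made up for illustration; the point is only that each trial lands in the test set exactly once across the 4 folds.

```python
import numpy as np

n_folds = 4
# Hypothetical fold assignment for 8 trials: entry i says on which fold
# trial i is held out as test data.
toy_cv_inds = np.array([0, 1, 2, 3, 0, 1, 2, 3])

times_tested = np.zeros(len(toy_cv_inds), dtype=int)
for fold in range(n_folds):
    test_mask = toy_cv_inds == fold
    train_mask = ~test_mask
    times_tested += test_mask
    print('fold %d: train on %d trials, test on %d trials'
          % (fold, train_mask.sum(), test_mask.sum()))

# Each trial should have been in the test set exactly once.
print(times_tested)
```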

In [None]:
n_cv = 4

**Balancing the labels:** for the binary classifier to be completely unbiased, we would ideally like to have exactly half the trials in each label group (i.e., 50% animal and 50% vehicle).

In reality, there are rarely exactly half the trials in each category, so the classes are "unbalanced". There are various ways to get around this problem of unbalanced classes. Here we'll address this by randomly sub-sampling the trials in the larger category, so that it has the same number of trials as the smaller category.

Write a function to do this:

In [None]:
def make_balanced_labels(labels, n_cv = 4):

  # let's figure out how to make a balanced set of images.
  un, n_per_class = np.unique(labels[~np.isnan(labels)], return_counts=True)
  n_classes = len(un)

  # how many are in the smaller class?
  small_class = np.min(n_per_class)
  # trim this many to make it an even multiple of n_cv, just makes it easier.
  remove_each = int(np.mod(small_class, n_cv))
  small_class = small_class - remove_each

  # I'm going to randomly sample "n" from the larger class, to balance them.
  # seed the random number generator, so you get the same result every time.
  np.random.seed(213321)
  inds_use = []
  # loop over the classes
  for uu in un:

    # for each class, randomly sample "n" samples
    inds_all = np.where(labels==uu)[0]
    inds_samp = np.random.choice(inds_all, size=small_class, replace=False)
    inds_use += [inds_samp]

  # This is the indices we will use to make the balanced set.
  inds_use = np.concatenate(inds_use)
  assert(len(np.unique(inds_use))==len(inds_use)) # make sure we didn't mess this up

  # Adjusted set of labels for the images
  labels_use = labels[inds_use]
  un, n_per_class = np.unique(labels_use, return_counts=True)
  assert(np.all(n_per_class==n_per_class[0]))

  return labels_use, inds_use

In [None]:
print('Original counts in each group:')
print('%d images in label 0'%np.sum(labels==0))
print('%d images in label 1'%np.sum(labels==1))

labels_use, inds_use = make_balanced_labels(labels)

print('\nAdjusted counts in each group:')
print('%d images in label 0'%np.sum(labels_use==0))
print('%d images in label 1'%np.sum(labels_use==1))


The variable "labels_use" now contains the labels that have been adjusted to have 50% per category.

The variable "inds_use" now provides an index that tells us how to get the balanced set of data out of our original data matrices (which have 10,000 elements).



As a sanity check - let's plot some example images in each of the categories.
- If you run this cell multiple times, you'll get a different set of random images. Note how much variability there is in these images.

In [None]:
names = ['vehicle','animal']
# np.random.seed(234235)

for li, ll in enumerate(names):

  inds = np.where(labels_use==li)[0]
  inds = np.random.choice(inds, size=4, replace=False)

  # "inds" only indexes into the subset of trials in "labels_use"
  # inds_for_full_array now provides index back into the original 10,000 images.
  inds_for_full_array = inds_use[inds]

  plt.figure(figsize=(8, 3))
  pi=0

  for ii in inds_for_full_array:

    image = np.moveaxis(images[ii], [0],[2])

    pi+=1
    plt.subplot(1,4,pi)
    plt.imshow(image)
    plt.axis('off')
    # plt.title('image %d'%ii)

  plt.suptitle(ll, y=0.85)
  plt.tight_layout()


Next, define the cross-validation (CV) indices. This will be a vector of values [n_trials] long, which tells us on which cross-validation fold each trial should serve as part of the testing set.

In [None]:
def make_crossval_inds(labels_use, n_cv = 4):

  # fixed random seed to make sure shuffling is repeatable
  rndseeds = 53455
  np.random.seed(rndseeds)

  n_trials = len(labels_use)
  n_classes = len(np.unique(labels_use))

  # how many trials will participate in each CV fold
  n_per_cv = int(np.ceil(n_trials/n_cv))

  # initialize the cv_inds vector
  cv_inds = np.zeros_like(labels_use)

  # looping over label values (cv inds will be balanced within each label value)
  for li, ll in enumerate(np.unique(labels_use)):
    inds = labels_use==ll
    c = np.repeat(np.arange(n_cv), int(n_per_cv/n_classes))
    cv_inds[inds] = c[np.random.permutation(len(c))] # randomize within this group

  return cv_inds

In [None]:
cv_inds = make_crossval_inds(labels_use, n_cv)
# print unique values in cv_inds
np.unique(cv_inds, return_counts=True)

Now that the labels are handled, we're ready for the fMRI data. Choose one ROI to focus on here.

In [None]:
# roi_name = 'V1v'
roi_name = 'FFA-1'
roi_inds = np.copy(get_roi_vox(roi_name))

rdat = voxel_data[:,roi_inds]
rdat = rdat[inds_use,:]

rdat.shape

Before decoding, we generally **subtract the mean across voxels**, from each trial. This step ensures that the decoder is picking up on information represented across multivariate patterns, as opposed to the mean signal in the ROI.

For intuition on this: we know that some ROIs will have higher average activation for one category than another (for example, face-selective areas like FFA respond more strongly on average to faces than to objects). If we don't subtract this mean response, then decoding accuracy can be higher, and we can't tell whether decoding is driven by a mean difference between categories or by pattern-level differences (some voxels go up, some go down). If we do subtract the mean, then we know that decoding is driven by a distributed pattern-level representation.
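
A toy example may help here. The two 4-voxel "trials" below are made up: the second is the first plus a constant offset, so they differ only in mean signal. After subtracting each trial's mean across voxels, the two patterns are identical, so a decoder could no longer separate them using the mean alone.

```python
import numpy as np

# Hypothetical responses: 2 trials x 4 voxels.
pattern = np.array([1.0, -1.0, 2.0, 0.0])
toy = np.stack([pattern, pattern + 5.0])   # trial 2 = trial 1 + constant

# Subtract the mean across voxels (axis=1) from each trial.
toy_centered = toy - toy.mean(axis=1, keepdims=True)

print(toy_centered)
print(np.allclose(toy_centered[0], toy_centered[1]))
```

With `keepdims=True`, NumPy broadcasting expands the [trials x 1] mean across the voxel axis, so the subtraction works with or without `np.tile`.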

In [None]:
# subtract the mean here
rdat = rdat - np.tile(np.mean(rdat, axis=1, keepdims=True), [1, rdat.shape[1]])

rdat.shape

Now we're ready to run the classifier. Let's start with one *fold* at a time (eventually, we'll want to loop over many of these folds). Start by splitting the data into training and testing sets:

In [None]:
cv = 0

trninds = cv_inds!=cv
tstinds = cv_inds==cv

trndat = rdat[trninds,:]
tstdat = rdat[tstinds,:]

trnlabs = labels_use[trninds]
tstlabs = labels_use[tstinds]

# check for balanced labels
assert(not np.any(np.isnan(trnlabs)))
un, counts = np.unique(trnlabs, return_counts=True)
assert(counts[0]==counts[1])

print('Training data is shape:')
print(trndat.shape)

print('Testing data is shape:')
print(tstdat.shape)

The linear classifier we're using here is a ***ridge classifier***. This is similar to the ridge regression models we've used for encoding models in previous weeks. In this case, it is using ridge regression to learn a projection from voxel response patterns to category labels.

A benefit of the ridge classifier is that it uses regularization, which reduces the risk that the decoder over-fits. This can be very important for fMRI decoding, because we have many voxels in our ROI, and we don't know in advance which of these are most informative and which are mostly noise. The ridge penalty shrinks the decoder weights toward zero, which limits how much the decoder can rely on noisy voxels.
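
As a rough illustration of what the penalty does, the sketch below fits `RidgeClassifier` to synthetic data with three different penalty strengths and prints the size of the learned weight vector. The data, the sizes, and the specific alpha values are assumptions made up for this example, not values from the lab.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)

# Hypothetical data: 100 trials x 50 "voxels"; only voxel 0 is informative.
n_trials, n_voxels = 100, 50
y = np.repeat([0, 1], n_trials // 2)
X = rng.normal(size=(n_trials, n_voxels))
X[:, 0] += 2.0 * y

weight_norms = []
for alpha in [1e-3, 1.0, 1e3]:
    clf = RidgeClassifier(alpha=alpha).fit(X, y)
    # coef_ holds one weight per voxel; its norm measures overall weight size
    weight_norms.append(np.linalg.norm(clf.coef_))
    print('alpha = %g: weight norm = %.4f' % (alpha, weight_norms[-1]))
```

Larger penalties give a smaller weight vector overall. Note that ridge shrinks all weights smoothly rather than setting some voxels' weights exactly to zero.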

Note that this is different from how ridge is used in encoding models. In encoding, we're mapping from a set of features (DNN features) to a voxel response. In decoding, we're mapping from a voxel response pattern (many voxels) to a category output. Hence why this is decoding versus encoding.

Note also - this is not the only kind of linear classifier that can be used here. In neuroscience, other linear methods like [support vector machines](https://scikit-learn.org/stable/modules/svm.html) and [linear discriminant classifiers](https://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html) are often used.
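
For reference, here is a minimal sketch of how these alternatives could be swapped in. All three scikit-learn estimators share the same `fit` / `score` interface. The synthetic data, the train/test split, and the parameter values (`alpha=1.0`, `C=1.0`) are assumptions for illustration only, and the accuracies say nothing about how these models would compare on the fMRI data.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(1)

# Hypothetical data: 200 trials x 20 "voxels", balanced labels.
n_trials, n_voxels = 200, 20
y = np.tile([0, 1], n_trials // 2)
X = rng.normal(size=(n_trials, n_voxels))
X[:, :5] += 1.0 * y[:, None]   # first 5 voxels carry class information

# Simple split: first 150 trials for training, last 50 for testing.
train = np.arange(n_trials) < 150
test = ~train

models = {
    'ridge': RidgeClassifier(alpha=1.0),
    'linear SVM': LinearSVC(C=1.0),
    'LDA': LinearDiscriminantAnalysis(),
}
toy_acc = {}
for name, model in models.items():
    model.fit(X[train], y[train])
    toy_acc[name] = model.score(X[test], y[test])
    print('%s: test accuracy = %.2f' % (name, toy_acc[name]))
```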

To fit the ridge classifier - we need to choose a regularization parameter (a or $\alpha$). To choose the best a, we will fit the model for multiple candidate a values, and choose the one that results in highest decoding accuracy.

In [None]:
# define set of candidate a values
a_values = np.logspace(-9, 9, 8)

st = time.time()

acc_each_a = np.zeros_like(a_values)

# loop over the possible regularization penalties
for ai, a in enumerate(a_values):

  model = sklearn.linear_model.RidgeClassifier(alpha = a)
  model.fit(trndat, trnlabs)
  pred = model.predict(tstdat)
  acc_each_a[ai] = np.mean(pred==tstlabs)

elapsed = time.time() - st
print('elapsed = %.2f seconds'%(elapsed))


---

***Question 1:***

Which a value should we use?

Print out the accuracy associated with the best a value.

In [None]:
# [answer here]



---



In practice, we will make this procedure slightly more complicated by using **nested cross-validation** to select the a value.

On each cross-validation fold, we'll hold out 25% of the data as our test set, and we'll choose the a parameter just from within the 75% training set. This will be done using nested cross-validation, where we hold out a sub-set of the training data, and use this to choose the best alpha. Then once we've chosen the best alpha, we'll still evaluate performance using the test set only.

This nested procedure provides an additional safeguard against over-fitting. We're choosing all the model parameters (including a), based only on the training data, and using the testing data only for final evaluation.
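
To show the idea of the inner loop explicitly, here is a hand-written sketch on synthetic data. It is an illustration of the concept under made-up assumptions (data sizes, 3 inner folds, 4 candidate penalties), and is not meant to reproduce what `RidgeClassifierCV` does internally.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(2)

# Hypothetical training set for ONE outer fold: 120 trials x 30 "voxels".
n_trn, n_voxels = 120, 30
y_trn = np.tile([0, 1], n_trn // 2)
X_trn = rng.normal(size=(n_trn, n_voxels))
X_trn[:, :3] += 1.0 * y_trn[:, None]

# Inner folds are defined within the training set only.
n_inner = 3
inner_inds = np.repeat(np.arange(n_inner), n_trn // n_inner)

alphas = np.logspace(-3, 3, 4)
inner_acc = np.zeros((len(alphas), n_inner))
for ai, alpha in enumerate(alphas):
    for ff in range(n_inner):
        val = inner_inds == ff   # held-out part of the training set
        clf = RidgeClassifier(alpha=alpha).fit(X_trn[~val], y_trn[~val])
        inner_acc[ai, ff] = clf.score(X_trn[val], y_trn[val])

# Choose the penalty with the best average inner-fold accuracy,
# then refit on the full training set with that penalty.
best_alpha = alphas[np.argmax(inner_acc.mean(axis=1))]
final_clf = RidgeClassifier(alpha=best_alpha).fit(X_trn, y_trn)
print('chosen alpha = %g' % best_alpha)
```

The outer test set never appears in this loop. It would only be used afterwards, to score `final_clf`.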

Let's write a function to do this, including a loop over all the cross-validation folds.

In [None]:
def decode_loop(rdat, labels_use, cv_inds):

  pred_labels = np.zeros_like(labels_use)

  for cv in np.unique(cv_inds):

    print('CV fold %d of %d'%(cv, len(np.unique(cv_inds))))

    trninds = cv_inds!=cv
    tstinds = cv_inds==cv

    trndat = rdat[trninds,:]
    tstdat = rdat[tstinds,:]

    trnlabs = labels_use[trninds]
    tstlabs = labels_use[tstinds]

    # check for balanced labels
    assert(not np.any(np.isnan(trnlabs)))
    un, counts = np.unique(trnlabs, return_counts=True)
    assert(counts[0]==counts[1])

    print(trndat.shape, tstdat.shape)

    # define model
    st = time.time()
    a_values = np.logspace(-9, 9, 8)
    model = sklearn.linear_model.RidgeClassifierCV(cv = 3, \
                                                    alphas = a_values, \
                                                    scoring='accuracy',
                                                    )
    model.fit(trndat, trnlabs)
    elapsed = time.time() - st

    # pull out the best regularization parameter
    best_alpha = model.alpha_

    # compute train and test accuracy
    train_acc = model.score(trndat, trnlabs)
    test_acc = model.score(tstdat, tstlabs)

    print('    cv fold %d (elapsed = %.6f s): best alpha = %.5f, train acc = %.2f, test acc = %.2f'%(
        cv, elapsed, best_alpha, train_acc, test_acc))

    pred = model.predict(tstdat)
    pred_labels[tstinds] = pred

  return pred_labels



Implement the full decoding procedure here:

In [None]:
roi_name = 'FFA-1'
roi_inds = get_roi_vox(roi_name)

rdat = voxel_data[:,roi_inds]
rdat = rdat[inds_use,:]

# subtract the mean here
rdat = rdat - np.tile(np.mean(rdat, axis=1, keepdims=True), [1, rdat.shape[1]])

pred_labels = decode_loop(rdat, labels_use, cv_inds)



---
***Question 2:***

Compute and print out the final cross-validated decoding accuracy of the model.

What is the "chance" value for this decoder? Is the obtained accuracy close to chance, or above chance?


In [None]:
# [answer here]



---



In [None]:
# Retinotopic ROI definitions:
print(rinfo['ret_prf_roi_names'])
# Face-selective ROI definitions:
print(rinfo['floc_face_roi_names'])
# Place-selective ROI definitions:
print(rinfo['floc_place_roi_names'])
# Body-selective ROI definitions:
print(rinfo['floc_body_roi_names'])



---
***Question 3:***

Run the decoder for a different ROI (see options above), by copying the code above and modifying it.

Print out the accuracy for your chosen area.

- Note that if you choose a larger area (more voxels = more columns), this will take a bit longer.

Does your area have higher or lower accuracy compared to FFA-1 (tested in Question 2)? How do you interpret this difference?

In [None]:
# [answer here]



---



### Step 4: Classify other categories.

Now let's see how well the classifier can handle other category distinctions.

This will enable us to ask how information about different categories is represented across different visual regions.

cocolabs.keys() has a list of all the categories we can use.

These correspond to the "supercategories" in COCO (i.e., superordinate categories).

In [None]:
list(cocolabs.keys())

Make a binary food vs. not-food classifier, which will distinguish images that contain food from those without food.

In [None]:
categ_name = 'food'

labels = np.array(cocolabs[categ_name])

labels_use, inds_use = make_balanced_labels(labels)
cv_inds = make_crossval_inds(labels_use)

roi_name = 'FFA-1'
roi_inds = get_roi_vox(roi_name)

rdat = voxel_data[:,roi_inds]
rdat = rdat[inds_use,:]

# subtract the mean here
rdat = rdat - np.tile(np.mean(rdat, axis=1, keepdims=True), [1, rdat.shape[1]])

print('Decoding %s vs. not-%s:\n'%(categ_name, categ_name))
pred_labels = decode_loop(rdat, labels_use, cv_inds)




---
***Question 4:***

Compute and print the final accuracy of this classifier.

What does this tell us about the neural representation of the food category? Is this surprising?

In [None]:
# [answer here]



---





Next, let's visualize some of the errors made by this decoder.

These plots will show examples of images in each category, which are either correctly classified or mis-classified by the decoder.
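
The four groups of trials can be counted with boolean masks, as in this small sketch. The `toy_true` and `toy_pred` vectors are made-up labels for 10 trials, not decoder output.

```python
import numpy as np

# Hypothetical true and predicted labels for 10 trials (1 = category present).
toy_true = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
toy_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

toy_correct = toy_pred == toy_true

# Split trials by true label and by whether the prediction was right.
counts = {
    'category, correct': np.sum((toy_true == 1) & toy_correct),
    'category, error': np.sum((toy_true == 1) & ~toy_correct),
    'not category, correct': np.sum((toy_true == 0) & toy_correct),
    'not category, error': np.sum((toy_true == 0) & ~toy_correct),
}
for name, n in counts.items():
    print('%s: %d' % (name, n))
```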


In [None]:
is_correct = (pred_labels==labels_use).astype(int)

categ_correct = (labels_use==1) & (is_correct==1)
categ_incorrect = (labels_use==1) & (is_correct==0)

notcateg_correct = (labels_use==0) & (is_correct==1)
notcateg_incorrect = (labels_use==0) & (is_correct==0)

# np.random.seed(234235)

for inds, name in zip([categ_correct, categ_incorrect, notcateg_correct, notcateg_incorrect], \
                       ['%s, correct'%categ_name, '%s, error'%categ_name, \
                       'not %s, correct'%categ_name, 'not %s, error'%categ_name]):

    inds = np.where(inds)[0]
    # print(len(inds))
    inds = np.random.choice(inds, size=4, replace=False)

    # "inds" only indexes into the subset of trials in "labels_use"
    # inds_for_full_array now provides index back into the original 10,000 images.
    inds_for_full_array = inds_use[inds]

    plt.figure(figsize=(8, 3))
    pi=0

    for ii in inds_for_full_array:

      # have to move the axis, want [H x W x 3]
      image = np.moveaxis(images[ii], [0],[2])

      pi+=1
      plt.subplot(1,4,pi)
      plt.imshow(image)
      plt.axis('off')
      # plt.title('image %d'%ii)

    plt.suptitle(name, y=0.85)
    plt.tight_layout()



---
***Question 5:***

Examine the images above, for the correct and error trials.
- You can try running the cell above again to get a different set of images, the code includes random sampling so it'll be different each time.

Are there any patterns that you notice about the error trials?

What kinds of images are hardest for this decoder?



[answer here]



---



***Question 6:***

By copying and pasting some of the code from above, construct a classifier for a different one of the COCO categories (e.g., animal or person).

Run this decoder for at least 2 different ROIs, and print out the results for each ROI.

How high is the accuracy for your chosen category?

How does accuracy differ between regions? What does this tell you about the difference in representations between these regions?


In [None]:
# [answer here]



---

