# Comparing computational models

This week's tutorial is about using RSA to assess how well *different* computational models explain neural processing! These computational models will serve as hypotheses about the computations that are important for neural processing, just like we used the participants' behaviour or the experimental conditions in the last tutorial. In particular, we'll be looking at a new fMRI dataset on natural image processing, called **BOLD5000**, and we will evaluate two different computational models on one of the ROIs, the lateral occipital cortex (LOC).

**What you'll learn**: At the end of this tutorial, you ...

* are familiar with noise ceilings and know how to compute them
* know how to use a deep convolutional neural network (DCNN) for object recognition
* are able to obtain features from a DCNN and convert them into an RDM
* can estimate semantic similarity using linguistic word embeddings
* can convert word embeddings into an RDM
* can compare different candidate RDMs for explaining neural RDMs

**Estimated time needed to complete**: 8-12 hours

In [None]:
# Some imports for the rest of the tutorial
import sys
import os
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import h5py
import joblib
from scipy.spatial.distance import squareform, pdist
from niedu.utils.nipa import show_rankRDM
sys.path.append('/home/Public')

## Getting and getting to know BOLD5000
In this notebook, we are going to work with the **BOLD5000 dataset**. 

### Downloading the datasets
As a first step, we need to download the images the participants saw during the experiment, along with their labels. We also need to download the fMRI data. Conveniently for us, the patterns from a set of ROIs are already available for download, so we'll start from a set of patterns from each hemisphere for 4 participants!

The download below will take a long time! So in the meantime, you have time to familiarize yourself with the dataset by looking at the [website](https://bold5000.github.io/overview.html) and the [paper](https://www.nature.com/articles/s41597-019-0052-3). 

In [None]:
if not os.path.isdir(os.path.join(os.path.expanduser('~'), 'NI-edu-data')):
    os.mkdir(os.path.join(os.path.expanduser('~'), 'NI-edu-data'))

data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')
if not os.path.isdir(data_dir):
    os.mkdir(data_dir)

img_dir = os.path.join(data_dir, 'BOLD5000_Stimuli')
if not os.path.isdir(img_dir):
    print("Downloading the dataset (+- ... MB) ...")
    !wget -P $data_dir https://www.dropbox.com/s/5ie18t4rjjvsl47/BOLD5000_Stimuli.zip
    filename = os.path.join(data_dir, 'BOLD5000_Stimuli.zip')
    !unzip $filename -d $data_dir
    #!rm $filename
    print("\nDone!")
else:
    print("Dataset already downloaded!")
    
if not os.path.isfile(os.path.join(data_dir, 'BOLD5000_Stimuli', 'Image_Labels', 'coco-labels-paper.txt')):
    filename = os.path.join(data_dir, 'BOLD5000_Stimuli', 'Image_Labels', 'coco-labels-paper.txt')
    !wget -O $filename https://surfdrive.surf.nl/files/index.php/s/PcJgbhnso6XYcqz/download
                      
if not os.path.isfile(os.path.join(img_dir, 'Image_Labels', 'scene_final_labels_corrected.txt')):
    filename = os.path.join(img_dir, 'Image_Labels', 'scene_final_labels_corrected.txt')
    !wget -O $filename https://surfdrive.surf.nl/files/index.php/s/1Wvo7SHx3W22iox/download
            
roi_dir = os.path.join(data_dir, 'ROIs')
if not os.path.isdir(roi_dir):
    print("Downloading the neural data (+- ... MB) ...")
    filename = os.path.join(data_dir, 'ROIs.zip')
    !wget -O $filename https://ndownloader.figshare.com/files/12965447 
    !unzip $filename -d $data_dir
    #!rm $filename
    print("\nDone!")
else:
    print("Neural data already downloaded!")


So now we've downloaded all the data we will need for this notebook! Let's have a look at the neural data:

In [None]:
roi_dir = os.path.join(data_dir, 'ROIs')
print("We have the following folders:\n-", '\n- '.join(sorted(os.listdir(roi_dir))))

sub_dir = os.path.join(data_dir, 'ROIs', 'CSI1','h5')
print("We have the following folders:\n-", '\n- '.join(sorted(os.listdir(sub_dir))))

stim_dir = os.path.join(data_dir, 'ROIs', 'stim_lists')
print("We have the following folders:\n-", '\n- '.join(sorted(os.listdir(stim_dir))))


You can see that every folder contains multiple sets of patterns, each averaged over different TRs. In the next steps, we will load in these patterns!

<div class='alert alert-warning'>
    <b>ToDo</b> (1 point): Which TR dataset should we use for our analysis and why? (Hint: Check out the figures in the paper!). Store your answer in a variable called <code>TR</code>.
</div>

In [None]:
''' Implement your ToDo here. '''

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_TR

test_TR(TR)
    

### Organizing the datasets
As a first step, let's load in the patterns for our region of interest (LOC) for all participants.

In [None]:
# Get the data
data = {} # preallocate!
roi_key = 'LOC' # defines our ROI!
for sub in range(1,5): # to enumerate over subjects!
    print('Processing subject ' + str(sub) + ' ...')
    data['CSI' + str(sub)]={}
    filename = os.path.join(data_dir, 'ROIs', 'CSI' + str(sub),'h5',"CSI" + str(sub) + "_ROIs_TR34" + ".h5") 
    stimname = os.path.join(data_dir, 'ROIs','stim_lists','CSI0' + str(sub) + '_stim_lists.txt') 
    # reads in the neural patterns
    with h5py.File(filename, "r") as f:
        for k in f.keys():
            if k.endswith(roi_key):
                data['CSI' + str(sub)][k]= list(f[k])
    # reads in the stimulus information
    data['CSI' + str(sub)]['stim'] = open(stimname).read().splitlines()

print('Done!')

Nice, so we've just loaded in all the patterns that we need! As a next step, we need to put them in the same order for all subjects, so that we can turn them into RDMs. For convenience, we'll sort the images alphabetically by file name. This will create some structure in our RDMs, since the file names differ per image database:
* Images from ImageNet are named `n` followed by 8 digits, e.g. n01692333_12353.JPEG
* Images from COCO contain the string COCO_, e.g. COCO_train2014_000000273147.jpg
* Images from the Scene database have descriptive names such as dinosaur4.jpg

Subject 4 has fewer trials, only around 3000, so let's focus on these images and discard all other trials.

In [None]:
trial_ids = pd.DataFrame(np.zeros((len(data['CSI4']['stim']),5)), 
                         columns = ['ID', 'CSI4_trials', 'CSI1_trials', 'CSI2_trials', 'CSI3_trials'])
trial_ids['ID'] = data['CSI4']['stim']
trial_ids['CSI4_trials'] = np.arange(len(data['CSI4']['stim']))

# Now let's find the corresponding trials for the remaining subjects
for i in range(len(trial_ids['ID'])):
    for sub in range(1,4):
        trial_ids.loc[i, 'CSI' + str(sub) + '_trials'] = int(data['CSI'+ str(sub)]['stim'].index(trial_ids.loc[i,'ID']))

# Finally, let's label the source database and sort everything alphabetically (any other consistent order would work too).
trial_ids['Dataset'] = 'Scene'
#ind = trial_ids['ID'].str.startswith('COCO')
#print(ind)
trial_ids.loc[trial_ids['ID'].str.contains('COCO'), 'Dataset']  = 'COCO'
# A character class (instead of a capture group) matches the same names without a pandas warning
trial_ids.loc[trial_ids.ID.str.contains(r'n[01][0-9]{4}'), 'Dataset'] = 'ImageNet'

#trial_ids_sorted = trial_ids.sort_values(by=['Dataset', 'ID']).reset_index()
trial_ids_sorted = trial_ids.sort_values(by=['ID']).reset_index()
print(trial_ids_sorted)


### Constructing and visualizing neural RDMs
Now that we've organized the data, we can make both the individual and the group RDMs. To do this, we need to use the trial indices we've just created and apply them to the patterns!
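
As a minimal sketch of what happens for each subject and ROI, the toy example below builds a correlation-distance RDM from a small random pattern matrix. The array `patterns` and its size are made up for illustration and have nothing to do with the BOLD5000 data. It also shows that scipy's `pdist` with the `'correlation'` metric gives the same distances in condensed (vector) form.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
# Hypothetical toy data: 6 "images" x 50 "voxels" (not BOLD5000 data)
patterns = rng.normal(size=(6, 50))

# RDM as 1 - Pearson correlation between the rows (images)
rdm_toy = 1 - np.corrcoef(patterns)

# scipy's 'correlation' metric computes the same distance;
# squareform converts between the condensed vector and the square matrix
rdm_vec = pdist(patterns, metric='correlation')
rdm_from_pdist = squareform(rdm_vec)

print(rdm_toy.shape)   # (6, 6)
print(rdm_vec.shape)   # (15,) = 6 * 5 / 2 unique image pairs
print(np.allclose(rdm_toy, rdm_from_pdist))
```

The condensed vector only holds the unique image pairs, which is why we use `squareform` whenever two RDMs are correlated with each other.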

In [None]:
subs = 4
rois = data['CSI1'].keys()
rois = [roi for roi in rois if roi.endswith(roi_key)]
# preallocate for the rdms (subjects, regions, images, images)
rdms = np.zeros((subs, len(rois), len(trial_ids_sorted), len(trial_ids_sorted )))
for sub in range(1, subs + 1): # loop over subjects
    for n, roi in enumerate(rois): # and ROIs
        ids = list(trial_ids_sorted['CSI' + str(sub) + '_trials'].to_numpy(dtype='int'))
        
        sel_data = np.array(data['CSI' + str(sub)][roi])[ids]
        print(sel_data.shape)
        rdms[sub-1, n, :, : ] = 1 - np.round(np.corrcoef(sel_data),4) 


Before storing the final RDMs, we can visualize them!

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import rankdata
fig, ax = plt.subplots(4,2, figsize=(4,10))

for sub in range(subs):
    for roi in range(len(rois)):
        show_rankRDM(rdms[sub,roi], ax = ax[sub, roi], label=None)
        if sub == 0:
            ax[sub, roi].set_title(rois[roi])
        if roi == 0:
            ax[sub, roi].set_ylabel('Subject ' + str(sub+1) )
            

<div class='alert alert-warning'>
    <b>ToDo</b> (1 point): Compute and plot the grand average RDM. Store the grand average RDM in a variable called <code>GA</code>.
</div>

In [None]:
''' Implement your ToDo here. '''
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_GA

test_GA(rdms, GA)


In [None]:
# YOUR CODE HERE
raise NotImplementedError()

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): So we see that there is a lot of structure in this brain area in response to this diverse set of images! Most prominently, there is a big cluster in the middle of the RDM. What does it mean? Use the `trial_ids_sorted` dataframe to find out what these images are and why they might cluster! Explain what happens in the text field below.
</div>

YOUR ANSWER HERE

In [None]:
# Finally let's save the neural RDMs!
if not os.path.isdir(os.path.join(data_dir, 'Results')):
    os.mkdir(os.path.join(data_dir, 'Results'))
np.save(os.path.join(data_dir, 'Results', 'RDM_LO.npy'), rdms)

## Noise ceiling

As you can see in the RDMs above, our neural data are not perfect and there are quite some differences between the participants. This is a common problem, since any type of neural data consists of both signal and noise. Oftentimes, our definition of signal is derived quite simply from our experimental design in a homogeneous group of participants: signal is whatever is shared across all trials of the same type or across the participants in the group (this is the statistical rationale for averaging across trials and subjects). This view also entails that anything that is not reliably shared across subjects has to be treated as noise (i.e., unrelated to the studied process). Because our methods cannot capture it, it limits our explanatory power. 

In the later parts of the tutorial, we will ask how well these neural RDMs are captured by the features of different computational models. To understand how well any model could possibly do, we need to get a grasp on the reliable signal in this ROI. With RSA, this is typically done by constructing the upper and lower bound of the noise ceiling.

**Upper noise ceiling bound**: This one is usually obtained by correlating each single participant's RDM with the average RDM across all participants. This is a somewhat optimistic estimate, since every single subject's RDM also contributed to the average RDM; for this reason it is the upper bound. No model is expected to exceed this upper bound, as the average RDM acts as the best possible model.

**Lower noise ceiling bound**: This one is obtained by correlating each single participant's RDM with the average RDM of all *other* participants, i.e., the average computed without this individual participant. This is a measure of consistency across subjects. If the lower noise ceiling bound is close to zero, this means that there is no reliable variance to be explained.



<div class='alert alert-warning'>
    <b>ToDo</b> (2 points): Let's construct the noise ceiling for our data! To this end, first average across hemispheres to obtain a single RDM per participant. Compute the upper and lower noise ceiling per participant and the average across all participants. For estimating the correlation, use Spearman's rho. Store the results in `noiseCeiling['UpperBound']` and `noiseCeiling['LowerBound']`, respectively.
</div>

In [None]:
''' Implement your ToDo here. '''
from scipy.stats import spearmanr
rdms_neural = np.load(os.path.join(data_dir, 'Results', 'RDM_LO.npy'))
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_NC

test_NC(rdms_neural, noiseCeiling)

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): Why do you think the span between the lower and the upper bound is so large?
</div>

YOUR ANSWER HERE

In [None]:
# Save the trials and reset workspace
if not os.path.isdir(os.path.join(data_dir, 'Results')):
    os.mkdir(os.path.join(data_dir, 'Results'))
trial_ids_sorted.to_pickle(os.path.join(data_dir, 'Results', 'trial_ids.pkl'))
try:
    joblib.dump(noiseCeiling, os.path.join(data_dir, 'Results', 'NC_LO.pkl'))
except Exception:
    # If you didn't manage to compute the noise ceiling, you can download it here:
    if not os.path.isfile(os.path.join(data_dir, 'Results', 'NC_LO.pkl')):
        filename = os.path.join(data_dir, 'Results', 'NC_LO.pkl')
        !wget -O $filename https://surfdrive.surf.nl/files/index.php/s/Lyco1827lUTgAWO/download
print('Done saving!')

In [None]:
# Reset the workspace!
%reset -f

This was the first part of this tutorial! You have now organized the dataset across subjects into a single format and converted it into an RDM. In the next part, we will try to explain these neural RDMs by using different computational models. 

## Computational models as hypotheses on neural processing

In this part of the tutorial, we move our focus to computational models. These types of models can be loosely defined as an input-output mapping, that is, a certain operation is *computed* on the input and returns an output.

In some ways, these computational models can be viewed as hypotheses of neural information processing. The idea is that if a model can explain neural activity either with its output or its process (computations), then there may be some overlap in computations between such a model and the brain. One example of a computational vision model is the Gabor filter bank, where the input image is convolved with Gabor filters (sinusoidal gratings under a Gaussian envelope) of different orientations, phases, and spatial frequencies, giving rise to a set of more useful features for further visual processing. This rather simple model has been quite successful in predicting neural responses in V1, even though it is not the state of the art anymore ([Cadena et al., 2019](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006897)).
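
To make the Gabor filter bank idea concrete, here is a minimal numpy sketch. The function `gabor_kernel`, its parameter values, and the random `patch` are all made up for illustration; this is not the model used by Cadena et al. Each filter is a cosine grating multiplied by a Gaussian envelope, and the bank simply varies orientation and phase.

```python
import numpy as np

def gabor_kernel(size=21, wavelength=6.0, theta=0.0, phase=0.0, sigma=4.0):
    """Return a 2D Gabor kernel: a sinusoidal grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate system by the filter orientation
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    grating = np.cos(2 * np.pi * x_rot / wavelength + phase)
    return envelope * grating

# A small bank: 4 orientations x 2 phases
thetas = np.linspace(0, np.pi, 4, endpoint=False)
phases = [0.0, np.pi / 2]
bank = np.array([gabor_kernel(theta=t, phase=p) for t in thetas for p in phases])
print(bank.shape)  # (8, 21, 21)

# Response of every filter to one toy image patch
# (a dot product, i.e. a single position of a full convolution)
rng = np.random.default_rng(0)
patch = rng.normal(size=(21, 21))
responses = (bank * patch).sum(axis=(1, 2))
print(responses.shape)  # (8,)
```

Sliding each filter over all image positions would give one feature map per filter, which is the same basic operation a convolutional layer performs with learned filters.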

In recent years, more and more sophisticated computational models have entered neuroscience, fueled by the recent progress in deep learning. In the next two parts of this tutorial, we will first look at a vision deep learning model and later at a natural language processing model. The overarching goal is to leverage these models to explain the neural activity in LOC. This way, we can ask a couple of questions such as:

* How well can a vision model actually capture visual processing in LOC?
* Can a language model explain anything in this *visual* area?
* Do the vision and language models explain the same variance in the data or are they accounting for different things?

Please note that depending on our research question, we could have adopted many other computational models. 


### Deep convolutional neural networks
In this tutorial, we are going to focus on a single feedforward deep convolutional neural network architecture, termed [VGG19](https://arxiv.org/abs/1409.1556). We chose this one for the simplicity of its architecture; it is actually not the most successful one anymore (https://paperswithcode.com/sota/image-classification-on-imagenet). However, at the time of writing it is still the second-best model for explaining primate visual processing (http://www.brain-score.org/).

It is called VGG*19* because it consists of 16 convolutional layers and 3 dense layers.

Ok, as a first step, let's load in our model. Deep learning libraries such as [TensorFlow](https://www.tensorflow.org/) and [Keras](https://keras.io/) make this very easy for us:

In [None]:
# Some imports for the rest of the tutorial
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.applications.vgg19 import preprocess_input, decode_predictions
import os
import sys
import joblib
import numpy as np
from glob import glob
from sklearn.decomposition import PCA
from scipy.stats import pearsonr
import pandas as pd
from scipy.spatial.distance import squareform
from scipy.stats import rankdata, spearmanr
from niedu.utils.nipa import show_rankRDM
sys.path.append('/home/Public')
# This reduces the number of cpu cores used.
jobs=7
config = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=jobs,
                         inter_op_parallelism_threads=jobs,
                         allow_soft_placement=True)
session = tf.compat.v1.Session(config=config)
from tensorflow.python.keras import backend as K
K.set_session(session)


In [None]:
# Load in the model
model = keras.applications.VGG19(weights='imagenet') # and specify that it should also load the pretrained weights!
# print an overview!
model.summary()

The output above tells us a number of important things about our model!

A feedforward DCNN is a series of operations applied to an input image. The first row of the table states that this image is required to have the dimensions `224, 224, 3`, where the first two numbers refer to the height and width of the image and the last one to the color channels (RGB, so 3). The first dimension, `None`, refers to the batch size, that is, the number of images presented to the network at once. So if we were to give 5 images to the network, the input would have the dimensions `(5, 224, 224, 3)`.
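
A quick numpy sketch of the batch dimension, using an all-zero placeholder array instead of a real image (the array names are made up for illustration):

```python
import numpy as np

# Hypothetical stand-in for one RGB image of the required size
single_image = np.zeros((224, 224, 3), dtype=np.float32)

# Add a leading batch axis: one image becomes a batch of size 1
batch_of_one = np.expand_dims(single_image, axis=0)
print(batch_of_one.shape)   # (1, 224, 224, 3)

# Stacking five images yields a batch of size 5
batch_of_five = np.stack([single_image] * 5, axis=0)
print(batch_of_five.shape)  # (5, 224, 224, 3)
```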

Further inspecting the table, you can see that Convolutions and MaxPooling operations are repeatedly applied to the image. 
This means that successively applying these operations to our input image gradually transforms it into a 1000-dimensional vector, the network's prediction!

Finally, at the bottom of the table, you can see the number of free parameters that the network has at its disposal during training for finding an effective mapping between the input image and the target label. 

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): The right-hand side of the table specifies the number of parameters at every layer of the network. Why do you think the number of parameters grows with the increasing number of layers? Why do dense layers have so many more parameters compared to convolutional layers?
</div>

YOUR ANSWER HERE

Alright, let's try out if this model actually works! To do this, we'll load in the stimuli used in the BOLD5000 experiment.

In [None]:
# Redefine the directories
data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')
img_dir = os.path.join(data_dir, 'BOLD5000_Stimuli', 'Scene_Stimuli', 'Presented_Stimuli')
print("We have the following folders:\n-", '\n- '.join(sorted(os.listdir(img_dir))))
# Pick one specific image
img_path = os.path.join(img_dir, 'ImageNet','n01532829_11283.JPEG')

# Load this image
img = image.load_img(img_path, target_size=(224, 224))

# and visualize it!
sns.set_context('poster')
plt.imshow(img)
plt.axis('off')
plt.title('What kind of "object" is this?')


In the next step, we want to evaluate whether the model can recognize the object in this image. We can do it like this:

In [None]:
# We first have to convert it from an image format to an array
x = image.img_to_array(img)
# Then add the batch dimension, in this case just 1 image.
x = np.expand_dims(x, axis=0)

# and then we have to preprocess the image! (e.g. switch color channels, normalize etc.)
x = preprocess_input(x)

# finally, we can simply predict the image.
predictions = model.predict(x)
# This provides us with a vector of 1000 entries, the softmax probability for every class.

# we can use the function below to translate this vector into more interpretable prediction:
print('Predicted:', decode_predictions(predictions , top=3)[0]) # Returns the top 3 predictions for the first image!

Of course, this was clearly a house finch! The model's predictions also show that the other 999 categories did not stand a chance. The second most highly predicted class was only at 0.00045%!
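
To see why one class can dominate so strongly, here is a small sketch of the softmax function that turns the network's raw scores into probabilities. The scores and the four class names below are made up for illustration and are not VGG19 output.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Made-up scores for four made-up classes (not VGG19 output)
class_names = np.array(['house_finch', 'brambling', 'goldfinch', 'junco'])
logits = np.array([12.0, 2.0, 1.5, 0.5])

probs = softmax(logits)
top3 = np.argsort(probs)[::-1][:3]  # indices of the three largest probabilities
for idx in top3:
    print(class_names[idx], round(float(probs[idx]), 6))
```

Because of the exponential, a moderate lead in the raw score translates into a probability close to 1 for the winning class.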

<div class='alert alert-warning'>
    <b>ToDo</b> (1 point): Load in a picture from the COCO database, display it and have the model predict this image. Does the model's response make sense? If so, why? If not, why not? Store the model's response in a variable called `predictions` and use the image 'COCO_train2014_000000000036.jpg'.
</div>

In [None]:
''' Implement your ToDo here. '''
img_path = os.path.join(img_dir, 'COCO','COCO_train2014_000000000036.jpg')
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_prediction

test_prediction(predictions)

Ok, now that we have established that the model can actually do its job, we can start to harness its features for understanding the neural data.

### Feature extraction 

As outlined above, we want to use the VGG features to see whether they can account for our neural data from area LOC. To do this, we first have to translate the VGG model features into our common space, representational dissimilarity!
Doing this with a DCNN is actually pretty comparable to running a neuroimaging experiment: You present the model/subject with a set of experimental images, you record its activations while it's processing the images and probe it for its responses!

This also means that we can easily construct RDMs for the VGG model if we can access its activations. This is what we are going to do next, **feature extraction**. We will first load in the experimental stimuli, then present these images to the model and record its activations in response to these images. Later, based on the activations, we will construct an RDM.

As a first step, we need to decide where to extract the activations in the model. Let's take a look at a middle layer first:

In [None]:
layer = 'block4_pool' # name of the layer of interest

# create a new model that ends with the layer of interest.
model_features = Model(inputs=model.input, outputs=model.get_layer(layer).output)
# use this shortened model to evaluate the image
features = model_features.predict(x)
# the output are the intermediate features of the model!
print(features.shape)

Doing this for our house finch image gives us activations of shape `(1, 14, 14, 512)`, just like we would expect based on the summary table above. For constructing an RDM, we will flatten this array such that we obtain `(images, features)`, where the number of features is the product of all remaining dimensions (`14*14*512`), so 100,352 features for every image!
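
The flattening step can be sketched with a placeholder array of the same shape (here for 3 hypothetical images; `toy_features` is not real model output):

```python
import numpy as np

# Hypothetical activations for 3 images with the block4_pool shape
toy_features = np.zeros((3, 14, 14, 512), dtype=np.float32)

# Keep the image axis, collapse height x width x channels into one feature axis
toy_flat = toy_features.reshape(toy_features.shape[0], -1)
print(toy_flat.shape)   # (3, 100352)
print(14 * 14 * 512)    # 100352
```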

Most of the time, the actual information lies on a much lower-dimensional manifold. This means that while the information is currently represented along so many dimensions, fewer dimensions could suffice to approximate the same information. Let's use PCA, a trick you learnt about in week 1, to reduce the number of features. This will also help us to reduce the computational time for obtaining the pairwise distances for our model RDMs later on.

To do this, we need to load in many more images and obtain their activations, because the dimensionality reduction in PCA is limited by the number of samples available. If we only have 5 images, we can project our 100,352-dimensional space onto at most 5 dimensions, which will likely lose a lot of variance. So let's get more images!
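
The sample-size limit can be checked on toy data. The sketch below uses a plain numpy SVD rather than scikit-learn's `PCA`, and the sizes (5 samples, 1000 features) are made up and far smaller than the real data. The singular values of the mean-centered data correspond to the PCA components.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 1000   # toy sizes, far smaller than the real data
X = rng.normal(size=(n_samples, n_features))

# PCA works on mean-centered data; its components come from the SVD
X_centered = X - X.mean(axis=0)
singular_values = np.linalg.svd(X_centered, compute_uv=False)

# Count the components that carry non-negligible variance
n_informative = int(np.sum(singular_values > 1e-8 * singular_values[0]))
print(len(singular_values), n_informative)  # 5 4
```

There are never more components than samples, and centering removes one more, so 5 images yield only 4 components with non-zero variance.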


In [None]:
#Let's load in some more images and pick those, we used in the neural RDM.
trial_ids = pd.read_pickle(os.path.join(data_dir, 'Results', 'trial_ids.pkl'))

# Let's pick 1000 random images, around a third of our data,
# to get a good estimate of the dimensionality of the variance.
n = 1000
i_rand = np.random.choice(len(trial_ids), n, replace=False) # 1000 unique random images (no duplicates)

imgs = []
for i in i_rand:
    img_id = trial_ids.loc[i, 'ID']
    if img_id.startswith('rep_'): # Trials prefixed with rep_ are repeated presentations of the same image.
        img_id = img_id[4:]
          
    imgs.append(image.img_to_array(image.load_img(os.path.join(img_dir, trial_ids.loc[i, 'Dataset'], img_id ), target_size=(224, 224))))
    
# and preprocess (same as we did for a single image!)
imgs = preprocess_input(np.array(imgs)) 


In [None]:
# Obtain the activations
features = model_features.predict(imgs, verbose=True) # 1000 images --> ca. 2 - 2.3 GB RAM
print(features.shape)

In [None]:
# And apply the PCA transform!
pca = PCA(n_components=500)
features = np.reshape(features,(features.shape[0], -1)) # flatten, only maintaining the samples
# Fit and transform the data 
features_pca = pca.fit_transform(features)

# Evaluate how the variance is distributed across the PCA components. 
plt.plot(np.arange(500),pca.explained_variance_ratio_ )
plt.xlabel('Component')
plt.ylabel('Variance explained')
plt.title('Total variance explained: ' + str(np.sum(pca.explained_variance_ratio_)))


Ok, so doing a PCA seems to be a reasonable way to reduce the dimensionality of the data. We traded around 100,000 features for 500 and maintained around 84% of the variance. 

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): Why do you think we chose 1000 *random* images for fitting the PCA transform instead of just using images from ImageNet for instance? 
</div>

YOUR ANSWER HERE

### Constructing RDMs from DNN features
In our next step, we will apply this transformation to all our images and then we can construct an RDM with this data. This will take a little while!

In [None]:
# Store our first 1000 features in a large matrix (images x features)
features_pca_all = np.zeros((len(trial_ids), features_pca.shape[1]))
print(features_pca_all.shape) 

# transfer the features that we've already transformed
features_pca_all[i_rand] = features_pca

# Now load in the remaining pictures in batches 
batches = np.linspace(0, len(trial_ids), 4, endpoint=True, dtype=int)
for b in range(len(batches)-1):
    print('Processing batch ' + str(b + 1))
    imgs = []
    i_seq = []
    for i in range(batches[b], batches[b+1]):
        if i not in i_rand:
            i_seq.append(i)
            img_id = trial_ids.loc[i, 'ID']
            if img_id.startswith('rep_'):
                img_id = img_id[4:]

            imgs.append(image.img_to_array(image.load_img(os.path.join(img_dir, trial_ids.loc[i, 'Dataset'], img_id ), target_size=(224, 224))))

    # and preprocess!
    imgs = preprocess_input(np.array(imgs)) 
    
    # and predict!
    features = model_features.predict(imgs, verbose=True) 
    print(features.shape)

    # transform into lower dimensional space
    features_pca = pca.transform(np.reshape(features,(features.shape[0], -1)))
    
    # write to array!
    features_pca_all[i_seq] = features_pca
    
    del features, features_pca, imgs


Finally, with this we can now construct our RDM!

In [None]:
rdm = 1 - np.round(np.corrcoef(features_pca_all),4)  
np.save(os.path.join(data_dir, 'Results', 'RDM_' + layer + '.npy'), rdm)

In [None]:
fig, ax = plt.subplots()
show_rankRDM(rdm, label=layer,ax=ax)

So as you've just seen, constructing an RDM for a DCNN is quite comparable to constructing one for neural data! This model RDM even already looks a bit similar to our neural RDM. In our next step, we will compare it to the RDM that we've computed for the region LOC and see how well this DCNN layer explains the neural RDMs of the four participants.

In [None]:
# Let's load in our earlier result and the noise ceiling!
layer = 'block4_pool'
data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')
# model RDM
rdm = np.load(os.path.join(data_dir, 'Results', 'RDM_' + layer + '.npy'))
# neural RDM
rdms_neural = np.load(os.path.join(data_dir, 'Results', 'RDM_LO.npy'))
# Noise Ceiling
noiseCeiling = joblib.load(os.path.join(data_dir, 'Results', 'NC_LO.pkl'))

Since this dataset contains only four subjects, we cannot model them as a random factor. Instead we use the mean RDM across participants, estimate variability by sampling across the stimuli and determine significance by creating a null-distribution.
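
For the null distribution, the stimulus labels of one RDM are shuffled, which means applying the same permutation to its rows and columns. A minimal sketch on a made-up 5 x 5 RDM (`toy_rdm` is random, not neural data) shows that the result is still a valid RDM:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 5 x 5 RDM: symmetric with a zero diagonal
a = rng.random((5, 5))
toy_rdm = (a + a.T) / 2
np.fill_diagonal(toy_rdm, 0)

# One random relabeling of the 5 stimuli
perm = rng.permutation(5)

# Apply the same permutation to rows and columns
shuffled = toy_rdm[np.ix_(perm, perm)]

print(np.allclose(shuffled, shuffled.T))   # still symmetric
print(np.allclose(np.diag(shuffled), 0))   # diagonal still zero
# The set of dissimilarity values is unchanged, only their positions differ
print(np.allclose(np.sort(shuffled.ravel()), np.sort(toy_rdm.ravel())))
```

Shuffling only the rows, or shuffling the vectorized RDM entries independently, would break this structure and no longer correspond to a relabeling of stimuli.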

In [None]:
# Compute relatedness
mean_rdm = np.mean(np.mean(rdms_neural, axis=0),axis=0) # average over participants and hemispheres
feature_rdm = squareform(rdm)
rdm_corr = spearmanr(squareform(mean_rdm), feature_rdm)[0] # and compute the spearman corr to the feature RDM
print(rdm_corr)

In [None]:
# Statistical significance
def corr_nullDist(mean_rdm, feature_rdm_vec, iterations=100):
    rdm_corr_null = []
    for i in range(iterations):
        if i%10 == 0:
            print('Iteration: ' + str(i) )
        # Create a random index that respects the structure of an rdm.
        shuffle = np.random.choice(mean_rdm.shape[0],mean_rdm.shape[0],replace=False)

        # shuffle RDM consistently for both dims
        mean_rdm_shuffle = mean_rdm[shuffle,:] # rows
        mean_rdm_shuffle = mean_rdm_shuffle[:,shuffle] # columns

        # correlate the shuffled neural RDM with the feature RDM
        rdm_corr_null.append(spearmanr(squareform(mean_rdm_shuffle), feature_rdm_vec)[0])
    return rdm_corr_null

In [None]:
rdm_corr_null = corr_nullDist(mean_rdm, feature_rdm)

In [None]:
# plot null distribution and true test statistic
fig = sns.displot(rdm_corr_null,height=6,aspect=1.5)
fig.set_axis_labels("Rhos")
plt.axvline(rdm_corr,color="red")

In [None]:
# compute p-value
p_val = np.mean(np.asarray(rdm_corr_null) > rdm_corr) # fraction of null correlations above the observed one
print(p_val)

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): The p-value of our test statistic is zero. Explain what that means with regard to the null distribution and how it can be zero. Would the p-value change if we ran 1000000 iterations instead of 100?
    
</div>

YOUR ANSWER HERE

Finally, we would like to assess how variable our measure of relatedness is. That is, just knowing how the mean subject RDM correlates with the feature RDM does not yet tell us anything about the variability of this estimate. So let's estimate the confidence interval by using bootstrapping with replacement. Put simply, bootstrapping is a method in which you repeatedly sample from the same data. Importantly, you put each data point back after drawing it, so the same data point can be drawn multiple times within the same sample. Check out the example below:
![Example for bootstrapping with replacement](https://www.researchgate.net/profile/Paola-Galdi/publication/322179244/figure/fig2/AS:588191077777408@1517247089079/An-example-of-bootstrap-sampling-Since-objects-are-subsampled-with-replacement-some.png)
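
As a minimal sketch of the same idea on made-up numbers, the example below bootstraps the mean of 200 random values (not RDM entries) and reads a 95% confidence interval off the percentiles of the bootstrap distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 noisy measurements (not RDM values)
values = rng.normal(loc=0.3, scale=1.0, size=200)

n_boot = 1000
boot_means = np.empty(n_boot)
for b in range(n_boot):
    # Draw as many points as we have, with replacement:
    # some points appear several times, others not at all
    idx = rng.integers(0, len(values), size=len(values))
    boot_means[b] = values[idx].mean()

# Percentile-based 95% confidence interval of the mean
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(ci_low, values.mean(), ci_high)
```

In the tutorial code, the statistic is the Spearman correlation between two RDMs instead of a mean, but the resampling logic is the same.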

In [None]:
# Bootstrapping
def corr_variability(mean_rdm_vec, feature_rdm_vec, iterations=100):

    rdm_corr_boots = []
    for i in range(iterations):
        if i%10 == 0:
            print('Iteration: ' + str(i) )
        # Draw a bootstrap sample of RDM entries (pairwise dissimilarities), with replacement
        sample = np.random.choice(mean_rdm_vec.shape[0],mean_rdm_vec.shape[0],replace=True) # note that now replace is True

        # Apply the same sample to both the reference and the feature RDM
        mean_rdm_sample = mean_rdm_vec[sample] 
        feature_rdm_sample = feature_rdm_vec[sample] 

        # correlate the resampled neural RDM with the resampled feature RDM
        rdm_corr_boots.append(spearmanr(mean_rdm_sample, feature_rdm_sample)[0])

    return rdm_corr_boots

In [None]:
mean_rdm_vec = squareform(mean_rdm)
rdm_corr_boots = corr_variability(mean_rdm_vec, feature_rdm)

In [None]:
# plot test statistics and bootstrapped distribution
fig = sns.displot(rdm_corr_boots,height=6,aspect=1.5)
fig.set_axis_labels("Rhos")
plt.axvline(rdm_corr,color="red")
# 95% confidence intervals:
plt.axvline(np.percentile(rdm_corr_boots, 2.5), color='gray', ls='--') # 2.5%
plt.axvline(np.percentile(rdm_corr_boots, 97.5), color='gray', ls='--') # 97.5% 

<div class='alert alert-info'>
    <b>ToThink</b> (1 point): Please explain what the confidence interval tells us about the relatedness between our neural RDM and the feature RDM. What would a narrow or a wide confidence interval mean?
    
</div>

YOUR ANSWER HERE

Finally, we can assess how well this feature RDM fares relative to the noise ceiling:

In [None]:
np.mean(rdm_corr)/noiseCeiling['LowerBound']

Above you can see that the model features from block 4 actually account for a sizable proportion of the lower noise ceiling.

<div class='alert alert-warning'>
    <b>ToDo</b> (3 points): Let's see how well a higher layer, such as the first dense layer `fc1`, can account for the data! Note that the output of `fc1` is already two-dimensional (images x features), so you can skip the PCA step! Start by loading the images in batches, preprocess them, pass them to the feature model, and obtain the features for all images. Once you've processed all images, compute the RDM and compare the feature RDM to the mean subject RDM. Don't forget to save the resulting RDM: name this variable `rdm_fc1`, and store its correlation with the mean subject RDM in `rdm_corr_fc1`.
</div>

In [None]:
# Before you start, let's clean up!
del features_pca_all, rdm, pca

In [None]:
''' Implement your ToDo here. '''

layer = 'fc1'
model_features = Model(inputs=model.input, outputs=model.get_layer(layer).output)

# Store all features in a large matrix (images x features)
features_all = np.zeros((len(trial_ids), model.get_layer(layer).output_shape[-1]))
print(features_all.shape) 

# Now load in the remaining pictures in batches 
batches = np.linspace(0, len(trial_ids), 4, endpoint=True, dtype=int)
for b in range(len(batches)-1):
    print('Processing batch ' + str(b + 1))
    imgs = []
    i_seq = []
    
    for i in range(batches[b], batches[b+1]): # Loop over images in the batch
        i_seq.append(i)
        
        # YOUR CODE HERE
        raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_fc1

test_fc1(rdm_fc1, rdm_corr_fc1)

In [None]:
# and save.
try: 
    np.save(os.path.join(data_dir, 'Results', 'RDM_' + layer + '.npy'), rdm_fc1)
except:
    if not os.path.isfile(os.path.join(data_dir, 'Results', 'RDM_' + layer + '.npy')):
        filename = os.path.join(data_dir, 'Results', 'RDM_' + layer + '.npy')
        !wget -O $filename https://surfdrive.surf.nl/files/index.php/s/dmEjKGwQvtNJ1MN/download

In [None]:
# Visualize it again.
fig, ax = plt.subplots()
show_rankRDM(rdm_fc1, label=layer, ax=ax)

Finally, let's take a moment to visualize our results! Before we do, let's also estimate the variability of the FC1 correlation.

In [None]:
rdm_corr_boots_fc1 = corr_variability(mean_rdm_vec, squareform(rdm_fc1))
fig = sns.displot(rdm_corr_boots_fc1,height=6,aspect=1.5)
fig.set_axis_labels("Rhos")
plt.axvline(rdm_corr_fc1,color="red")
# 95% confidence intervals:
plt.axvline(np.percentile(rdm_corr_boots_fc1, 2.5), color='gray', ls='--') # 2.5%
plt.axvline(np.percentile(rdm_corr_boots_fc1, 97.5), color='gray', ls='--') # 97.5% 

In [None]:
# Store all data in a single dataframe, useful for plotting.

rhos = pd.DataFrame({'Features': np.repeat(np.arange(2)[np.newaxis,:],100,axis=1).flatten(),
                          'Spearman rho': np.zeros(200)})
# Assign the results per layer
rhos.loc[rhos['Features'] == 0, 'Spearman rho'] = rdm_corr_boots 
rhos.loc[rhos['Features'] == 1, 'Spearman rho'] = rdm_corr_boots_fc1

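# Assign the result back instead of using inplace=True on a column selection,
# which newer pandas versions may not propagate to the dataframe.
rhos['Features'] = rhos['Features'].replace({0: 'Block4', 1: 'FC1'})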

f, ax = plt.subplots(figsize=(4,5))
# Show each observation with a scatterplot
sns.stripplot(x="Features", y="Spearman rho",
              data=rhos, dodge=True, alpha=.5, zorder=1)

# Show the conditional means
sns.pointplot(x="Features", y="Spearman rho", 
              data=rhos, dodge=.532, join=False, palette="dark",
              markers="o", scale=.25, ci=None)
ax.set_ylim([0, noiseCeiling['LowerBound'] + 0.005])
ax.axhline(noiseCeiling['LowerBound'], color='gray', label='lower NC')
plt.tight_layout()
sns.despine()
ax.legend(loc=(1.04, 0), frameon=False)

In [None]:
rhos.to_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))

So, well done! In this part, you've learnt how to leverage the statistical knowledge captured in DCNNs to explain neural processing in area LOC. In the next section, we are going to look at a completely different kind of model: semantic embedding spaces.

In [None]:
# Reset the workspace!
%reset -f


### Semantic embedding spaces

Semantic word embeddings are based on the idea that words that co-occur in a text are likely to be semantically related. For example, consider two newspaper articles, one describing a trial in court and another describing the birth of a baby panda in the zoo. Now imagine the types of words that will occur in each of these articles: is the word 'zebra' more likely to appear in the first or the second article? Most likely, you would answer the second. A semantic embedding space trained on a large corpus (a collection of texts), such as news items, captures this intuition.

The idea of co-occurrence as a proxy for semantic relatedness has fuelled progress in Natural Language Processing (NLP) and has led to the development of embedding spaces. These semantic embedding spaces differ both in the kind of model and in the dataset used to create them.
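
The co-occurrence intuition can be sketched with a handful of made-up "documents". This is only an illustration based on raw counts: actual models such as GloVe or word2vec learn dense vectors from very large corpora, and the documents and helper names below are hypothetical.

```python
import numpy as np

# Four tiny made-up "documents" (purely illustrative)
docs = [
    "zebra zoo panda animal",
    "panda zoo baby animal",
    "court judge trial lawyer",
    "judge trial court verdict",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Count how often two different words appear in the same document
cooc = np.zeros((len(vocab), len(vocab)))
for d in docs:
    words = d.split()
    for w1 in words:
        for w2 in words:
            if w1 != w2:
                cooc[index[w1], index[w2]] += 1

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Each row of the count matrix acts as a crude word vector
sim_zebra_panda = cosine(cooc[index["zebra"]], cooc[index["panda"]])
sim_zebra_court = cosine(cooc[index["zebra"]], cooc[index["court"]])
print("zebra vs panda:", round(sim_zebra_panda, 2))
print("zebra vs court:", round(sim_zebra_court, 2))
```

In this toy corpus, 'zebra' and 'panda' share neighbours such as 'zoo' and 'animal', whereas 'zebra' and 'court' share none, so the first similarity is higher.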

In [None]:
import gensim.downloader as api
import seaborn as sns
import joblib
import os
import sys
import pandas as pd
import numpy as np
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt
from scipy.spatial.distance import squareform
from scipy.stats import rankdata, spearmanr
from niedu.utils.nipa import corr_variability
sys.path.append('/home/Public')
sns.set_context('poster')
data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')

In [None]:
# The name tells us that this is a GloVe model, trained on the Wikipedia and Gigaword corpora, with a 100-dimensional output space.
# For an overview of all available models, see https://github.com/RaRe-Technologies/gensim-data
embedding = "glove-wiki-gigaword-100" 
model = api.load(embedding)

We have now downloaded the pre-trained model. Let's see what we can do with it!

In [None]:
# We can give it a word and ask which other words are the most similar
model.most_similar(positive = ['zebra'])

So as you would expect, we see other animals being very similar to zebra.

In [None]:
# You can even do some more intricate stuff such as
model.most_similar(positive=['woman', 'king'], negative=['man'])

This means: return the words that are most similar to 'woman' and 'king', while being dissimilar to the word 'man'.
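
To see why such a query can work, here is a small sketch with hand-made three-dimensional vectors. The vectors and their axes are hypothetical, and the code only mimics the general idea (add the positive words, subtract the negative ones, then look for the nearest remaining word by cosine similarity); it is not a reimplementation of gensim's `most_similar`.

```python
import numpy as np

# Hand-made 3-d "embeddings" (hypothetical; axes roughly: royalty, male, female)
vectors = {
    "king":  np.array([0.9, 0.9, 0.1]),
    "queen": np.array([0.9, 0.1, 0.9]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "zebra": np.array([0.1, 0.5, 0.5]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

positive, negative = ["woman", "king"], ["man"]

# Vector arithmetic: woman + king - man
query = sum(vectors[w] for w in positive) - sum(vectors[w] for w in negative)

# Rank all words that were not part of the query
candidates = {w: cosine(query, v) for w, v in vectors.items()
              if w not in positive + negative}
best = max(candidates, key=candidates.get)
print(best)
```

With these toy vectors, subtracting 'man' removes the "male" component of 'king' and adding 'woman' supplies the "female" one, which lands on the vector for 'queen'.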

In [None]:
# The most important feature for us is that we can ask how far apart two words are in this 100-dimensional space:
print('Zebra and court: ' + str(model.distance('zebra', 'court')))
print('Zebra and zoo: ' + str(model.distance('zebra', 'zoo')))

In [None]:
# You can also give multiple words and evaluate them with regard to their distance to a single word:
model.distances('zebra', ['dog', 'body', 'frog', 'neuron'])

<div class='alert alert-warning'>
    <b>ToDo</b> (1 point): Let's do some vector arithmetic! Come up with your own example of an analogy, like the queen example we've seen above.
</div>

In [None]:
''' Implement your ToDo here. '''

# YOUR CODE HERE
raise NotImplementedError()

This feature allows us to construct an RDM from our image labels! 

In the next step, we will load in the labels of our images and compute the RDMs for this embedding space.

In [None]:
# We first need to load in all the label information.
trial_ids = pd.read_pickle(os.path.join(data_dir, 'Results', 'trial_ids.pkl'))
data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')
coco_ann = joblib.load(os.path.join(data_dir,'BOLD5000_Stimuli', 'Image_Labels', 'coco_final_annotations.pkl'))

coco_cat = open(os.path.join(data_dir,'BOLD5000_Stimuli', 'Image_Labels', 'coco-labels-paper.txt'), "r").read().split('\n')
imagenet_cat = open(os.path.join(data_dir,'BOLD5000_Stimuli', 'Image_Labels', 'imagenet_final_labels.txt'), "r").read().split('\n')
scene_cat = pd.read_csv(os.path.join(data_dir,'BOLD5000_Stimuli', 'Image_Labels', 'scene_final_labels_corrected.txt'), index_col=0)

print(trial_ids)

In [None]:
# Let's assign a label to every image!
for ID in range(len(trial_ids)):
    if trial_ids.loc[ID, 'ID'].startswith('rep_'):
        tag = trial_ids.loc[ID, 'ID'][4:]
    else:
        tag = trial_ids.loc[ID, 'ID']
    
    # There is a different procedure for obtaining the labels depending on the dataset.
    if trial_ids.loc[ID, 'Dataset'] == 'COCO':
        annot = coco_ann[int(tag[-15:-4])] # We first have to strip the ID from the start and the ending, so that only the image identifier is maintained.
        label = []
        for a in range(len(annot)): # Some images contain multiple objects
            label.append(coco_cat[annot[a]['category_id']-1]) # For every object, find the category id!
        trial_ids.loc[ID, 'Label'] = ', '.join(label) # and link it back to the actual label.
        
    elif trial_ids.loc[ID, 'Dataset'] == 'Scene': # For the scenes, the label is contained in the ID.
        label = ''.join(c for c in tag.split('.')[0] if not c.isdigit())
        trial_ids.loc[ID, 'Label'] = scene_cat.loc[scene_cat['original category'] == label, 'corrected category'].item() 
        
    elif trial_ids.loc[ID, 'Dataset'] == 'ImageNet': # For ImageNet, the label ID is contained in the ID.
        label = tag.split('.')[0].split('_')[0] #
        trial_ids.loc[ID, 'Label'] = [i for i in imagenet_cat if i.startswith(label)][0][10:]        
    else:
        print('Dataset not found.')
        
    print(trial_ids.loc[ID, 'Label'])



### Constructing RDMs from embedding spaces
Now that we've assigned a label to every image, that is, a description of what's in the picture, we can ask how closely related these images are semantically!

We can construct an RDM by computing the pairwise similarity between the label sets of every pair of images (we convert these similarities into dissimilarities afterwards):
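
The logic of this step can be sketched on a toy example. The word vectors, labels, and the helper `set_similarity` below are hypothetical stand-ins. The sketch assumes that the similarity between two label sets is the cosine similarity between their mean vectors, which, to our understanding, is roughly what gensim's `n_similarity` computes.

```python
import numpy as np

# Hypothetical 3-d word vectors standing in for the real embedding
toy_vectors = {
    "zebra":   np.array([1.0, 0.2, 0.0]),
    "giraffe": np.array([0.9, 0.3, 0.1]),
    "car":     np.array([0.0, 0.1, 1.0]),
    "person":  np.array([0.3, 1.0, 0.2]),
}

# One label set per "image"; an image can carry several labels
labels = [["zebra"], ["zebra", "giraffe"], ["car", "person"], ["unicorn"]]

def set_similarity(words1, words2):
    """Cosine similarity between the mean vectors of two label sets (NaN if a set has no known word)."""
    known1 = [toy_vectors[w] for w in words1 if w in toy_vectors]
    known2 = [toy_vectors[w] for w in words2 if w in toy_vectors]
    if not known1 or not known2:
        return np.nan
    m1, m2 = np.mean(known1, axis=0), np.mean(known2, axis=0)
    return m1 @ m2 / (np.linalg.norm(m1) * np.linalg.norm(m2))

n = len(labels)
sim = np.ones((n, n))
for i in range(n):
    for j in range(i, n):  # the matrix is symmetric, so fill both halves at once
        sim[i, j] = sim[j, i] = set_similarity(labels[i], labels[j])

rdm_toy = 1 - sim  # similarity -> dissimilarity
print(np.round(rdm_toy, 2))
```

The last "image" is labelled with a word that is missing from the toy vocabulary, so its entire row and column are NaN.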

In [None]:
rdm = np.ones((len(trial_ids),len(trial_ids))) # preallocate, note the ones since we are computing similarities now.

for i in range(len(trial_ids)): # Loop over all images
    if i % 100 == 0:
        print(i)
    label1 = trial_ids.loc[i,'Label'].lower().split(', ') # Prepare the label for the model.
    
    label1_clean = []
    for word in label1:
        if ' ' in word: # Some two word expressions, can be found as word1_word2 in the model corpus
            word = word.replace(' ', '_') 
                
        if word in model.key_to_index: # check if the label items are in the vocabulary of the model.
            label1_clean.append(word)
        else:
            for w in word.split('_'): # if the words are not in the vocabulary, try to split them up again, to get a match.
#                 if w in model.vocab:
                if w in model.key_to_index:
                    label1_clean.append(w) 
    
    for j in range(i, len(trial_ids)):  # we can do this, since it's symmetric!
        if len(label1_clean) !=0:
            label2 = trial_ids.loc[j,'Label'].lower().split(', ')

            label2_clean = []
            for word in label2:
                if ' ' in word:
                    word = word.replace(' ', '_')
#                 if word in model.vocab:
                if word in model.key_to_index:
                    label2_clean.append(word)
                else:
                    for w in word.split('_'):
#                         if w in model.vocab:
                        if w in model.key_to_index:
                            label2_clean.append(w) 

            if len(label2_clean) !=0:
            
                rdm[i, j] = model.n_similarity(label1_clean, label2_clean) # compute the similarity between the two label sets.
            else:
                rdm[i, j] = np.nan
        else:
            #print(label1)
            rdm[i, j] = np.nan
        
        rdm[j, i] = rdm[i, j]
    
    
np.save(os.path.join(data_dir, 'Results', 'RDM_' + embedding + '.npy'), rdm)

In [None]:
plt.imshow(1 - rdm)
plt.title(embedding)


In the output above, you can see that the model's vocabulary doesn't contain all the labels we've asked it for (the white stripes, i.e. NaNs, in the RDM). In fact, there are also some labels you've probably never heard of yourself. Crucially, this language model can only process words that were contained in the corpus (collection of texts) that it was trained on.

So now we have obtained the RDM based on the 100-dimensional embedding space! However, computing an RDM based on a larger corpus requires a lot of RAM. We've therefore precomputed an RDM for you, based on the 300-dimensional word2vec embedding space trained on the GoogleNews corpus. You can load this RDM in the next step.

In [None]:
embedding = 'word2vec-google-news-300'
if not os.path.isfile(os.path.join(data_dir, 'Results', 'RDM_' + embedding + '.npy')):
    filename = os.path.join(data_dir, 'Results', 'RDM_' + embedding + '.npy')
    !wget -O $filename https://surfdrive.surf.nl/files/index.php/s/AVb4Z843OJRyOOZ/download
rdm_w2v = np.load(os.path.join(data_dir, 'Results', 'RDM_' + embedding + '.npy'))

plt.imshow(1 - rdm_w2v)
plt.title(embedding)


At first sight, these two RDMs seem quite comparable. Yet if we correlate them, it becomes clear that they are by no means identical!
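
Before two such RDMs can be correlated, images without a usable label have to be dropped from both matrices, and each matrix has to be turned into a vector of its unique pairwise entries. Here is a minimal sketch on made-up 4x4 similarity matrices. Unlike the notebook cell, it checks the diagonal rather than the first row for NaNs and passes `checks=False` to `squareform` instead of rounding; both are choices made for this illustration.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr

# Two made-up similarity matrices; "image" 2 has no usable label in the first one
sim_a = np.array([[1.0, 0.8, np.nan, 0.2],
                  [0.8, 1.0, np.nan, 0.3],
                  [np.nan, np.nan, np.nan, np.nan],
                  [0.2, 0.3, np.nan, 1.0]])
sim_b = np.array([[1.0, 0.7, 0.5, 0.1],
                  [0.7, 1.0, 0.4, 0.35],
                  [0.5, 0.4, 1.0, 0.6],
                  [0.1, 0.35, 0.6, 1.0]])

# Keep only images that are valid in BOTH matrices
keep = ~np.isnan(np.diag(sim_a)) & ~np.isnan(np.diag(sim_b))

# Convert similarity to dissimilarity and keep the upper triangle as a vector
vec_a = squareform(1 - sim_a[np.ix_(keep, keep)], checks=False)
vec_b = squareform(1 - sim_b[np.ix_(keep, keep)], checks=False)

# Rank correlation between the two vectorized RDMs
rho, _ = spearmanr(vec_a, vec_b)
print(keep, vec_a, vec_b, rho)
```

Three images survive the filter, which leaves three unique pairs per RDM. Applying the same mask to both matrices keeps the remaining entries aligned pair by pair.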

In [None]:
# We have to exclude all RDM entries whose labels were missing from the vocabulary of either model.
nan_filter = (np.isnan(rdm_w2v)[0,:] == False) & (np.isnan(rdm)[0,:] == False) 
rdm_filtered_w2v = squareform(1- np.round((rdm_w2v[nan_filter, :][:,nan_filter]),5))
rdm_filtered_glove = squareform(1 - np.round((rdm[nan_filter, :][:,nan_filter]),5))


In [None]:
f, ax = plt.subplots(figsize=(6.5, 6.5))
sns.scatterplot(x=rdm_filtered_w2v, y=rdm_filtered_glove, zorder=1, s=2)
ax.plot([0, 1.5], [0, 1.5], linewidth = 5, c='gray', ls='--',zorder=4 )
ax.set_xlabel('Word2vec')
ax.set_ylabel('GloVe')
ax.set_title(spearmanr(rdm_filtered_w2v, rdm_filtered_glove))
sns.despine()

<div class='alert alert-info'>
    <b>ToDo</b> (1 point): As you can see above, the word embeddings are similar but not the same. Please name one reason why the embedding spaces differ and explain it.
</div>

YOUR ANSWER HERE

<div class='alert alert-warning'>
    <b>ToDo</b> (3 points): Take these language-based RDMs and link them back to the neural RDM. Compute the correlation and estimate the variability! What percentage of the lower noise ceiling can be accounted for on average by the word2vec and the GloVe model? Store your answers in the variables `prop_explained_w2v` and `prop_explained_GloVe`, and report the 95% confidence intervals in the vectors `CI_prop_explained_w2v` and `CI_prop_explained_GloVe`.
</div>

In [None]:
''' Implement your ToDo here. '''
# Let's load in our earlier result and the noise ceiling!
np.random.seed(3)

rdms_neural = np.load(os.path.join(data_dir, 'Results', 'RDM_LO.npy'))
noiseCeiling = joblib.load(os.path.join(data_dir, 'Results', 'NC_LO.pkl'))

rdm_neural = np.mean(np.mean(rdms_neural, axis=0),axis=0)

rdm_corrs = {}
rdm_corrs_boots={}
for embedding in ['word2vec-google-news-300','glove-wiki-gigaword-100']:
    rdm_embedding = np.load(os.path.join(data_dir, 'Results', 'RDM_' + embedding + '.npy'))
    # Since a few labels yield NaN similarities, we have to exclude these before proceeding:
    nan_filter = np.isnan(rdm_embedding)[0,:] == False
    rdm_filtered = 1 - np.round((rdm_embedding[nan_filter, :][:,nan_filter]),5)
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_embeddings

test_embeddings(prop_explained_w2v,prop_explained_GloVe, CI_prop_explained_w2v, CI_prop_explained_GloVe)

In [None]:
# Let's visualize this and compare it back to the other features
for embedding in ['word2vec-google-news-300','glove-wiki-gigaword-100']:
    rhos = pd.read_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))
    
    new_rhos = pd.DataFrame({'Features': np.zeros(100),
                          'Spearman rho': np.zeros(100)})

    new_rhos.loc[new_rhos['Features'] == 0, 'Spearman rho'] = rdm_corrs_boots[embedding]

    # Assign the result back instead of relying on inplace=True on a column selection
    new_rhos['Features'] = new_rhos['Features'].replace({0: embedding.split('-')[0]})

    rhos_updated = pd.concat([rhos, new_rhos])
    rhos_updated.to_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))

In [None]:
rhos_updated = pd.read_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))
f, ax = plt.subplots(figsize=(7,5))

# Show each observation with a scatterplot
sns.stripplot(x="Features", y="Spearman rho",
              data=rhos_updated, dodge=True, alpha=.5, zorder=1)

# Show the conditional means
sns.pointplot(x="Features", y="Spearman rho", 
              data=rhos_updated, dodge=.532, join=False, palette="dark",
              markers="o", scale=.25, ci=None)

ax.set_ylim([0, noiseCeiling['LowerBound'] + 0.005])
ax.axhline(noiseCeiling['LowerBound'], color='gray', label='lower NC')
plt.tight_layout()
sns.despine()


So in sum, this tells us that even a language model, that is, a model that has never learnt from anything *visual*, can account for some of the patterns in area LOC. This finding is also supported by more recent studies, if you'd like to read more about this:
* [Object representations in the human brain reflect the cooccurrence statistics of vision and language](https://www.biorxiv.org/content/10.1101/2020.03.09.984625v1.full.pdf)
* [Learning as unsupervised alignment of conceptual systems](http://arxiv.org/abs/1906.09012)

We also see that the word2vec model fits the neural data much better than the GloVe model.

In [None]:
# Reset the workspace!
%reset -f

## Combining feature spaces

You might have noticed that the VGG19 RDMs and the RDM we've obtained from the word2vec model resemble one another. In this final part, we are going to explore to what extent these models explain the *same* variance by combining them into a single model.

In [None]:
import joblib
import os
import sys
import pandas as pd
import numpy as np
from scipy.spatial.distance import squareform
from scipy.stats import rankdata
import seaborn as sns
import matplotlib.pyplot as plt
from niedu.utils.nipa import show_rankRDM
from niedu.utils.nipa import corr_variability
sns.set_context('poster')
sys.path.append('/home/Public')

In [None]:
# Let's load in all our RDMs 
data_dir = os.path.join(os.path.expanduser('~'), 'NI-edu-data', 'BOLD5000')

rdms_neural = np.load(os.path.join(data_dir, 'Results', 'RDM_LO.npy'))
rdm_neural = np.mean(np.mean(rdms_neural, axis=0),axis=0)
noiseCeiling = joblib.load(os.path.join(data_dir, 'Results', 'NC_LO.pkl'))
rdm_w2v = np.load(os.path.join(data_dir, 'Results', 'RDM_' + 'word2vec-google-news-300' + '.npy'))
rdm_pool4 = np.load(os.path.join(data_dir, 'Results', 'RDM_block4_pool.npy'))
rdm_fc1 = np.load(os.path.join(data_dir, 'Results', 'RDM_fc1.npy'))

# Since a few labels yield NaN similarities, we have to exclude these before proceeding:
nan_filter = np.isnan(rdm_w2v)[0,:] == False
rdm_w2v_filtered = 1 - np.round((rdm_w2v[nan_filter, :][:,nan_filter]),5)
rdm_pool4_filtered = np.round((rdm_pool4[nan_filter, :][:,nan_filter]),5)
rdm_fc1_filtered = np.round((rdm_fc1[nan_filter, :][:,nan_filter]),5)
rdm_neural_filtered = np.round((rdm_neural[nan_filter, :][:,nan_filter]),5)


In [None]:

fig, ax = plt.subplots(1, 4, figsize = (11,8))
ax = ax.flatten()
show_rankRDM(rdm_pool4_filtered, label='VGG19 - pool4', ax=ax[0])
show_rankRDM(rdm_fc1_filtered, label='VGG19 - FC1', ax=ax[1])
show_rankRDM(rdm_w2v_filtered, label='Word2Vec', ax=ax[2])
show_rankRDM(rdm_neural_filtered, label='Mean neural RDM', ax=ax[3])

plt.tight_layout()


<div class='alert alert-warning'>
    <b>ToDo</b> (1 point): Last week, you learnt about a way of combining different feature spaces! In the next step, please combine all our model RDMs into a single model to predict the mean participant RDM. Store the correlation in a variable called `corr_combined`, the bootstrapped correlations in `rdm_corrs_combined_boots`, and the confidence interval of this estimate in `corr_combined_CI`.
</div>


In [None]:
''' Implement your ToDo here. '''
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
''' Tests the above ToDo. '''
from week_lynn import test_multiFit

test_multiFit(corr_combined, corr_combined_CI)

In [None]:
rhos = pd.read_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))

new_rhos = pd.DataFrame({'Features': np.zeros(100),
                          'Spearman rho': rdm_corrs_combined_boots})

# Assign the result back instead of relying on inplace=True on a column selection
new_rhos['Features'] = new_rhos['Features'].replace({0: 'Combined'})

rhos_updated = pd.concat([rhos, new_rhos])
rhos_updated.to_pickle(os.path.join(data_dir, 'Results','NeuralFits.pkl'))

In [None]:
# Let's visualize this!
f, ax = plt.subplots(figsize=(10,5))

# Show each observation with a scatterplot
sns.stripplot(x="Features", y="Spearman rho",
              data=rhos_updated, dodge=True, alpha=.5, zorder=1)

# Show the conditional means
sns.pointplot(x="Features", y="Spearman rho", 
              data=rhos_updated, dodge=.532, join=False, palette="dark",
              markers="o", scale=.25, ci=None)


ax.set_ylim([0, noiseCeiling['LowerBound'] + 0.005])
ax.axhline(noiseCeiling['LowerBound'], color='gray', label='lower NC')
plt.tight_layout()
sns.despine()


## Concluding remarks

In this tutorial, you have learnt how to describe the variation in a given brain area in response to images by leveraging different computational models. There are several things we can conclude from this:

* Different computational models can pick up on similar or different variation in your neural data. This can be seen from our last result in which the combined model is better than the other ones by themselves. If there is a lot of unique variance in the neural data that is accounted for by the individual models, then their combination will result in an even better explanation of the neural data. If however the different features explain the *same* variation in the data, then combining them will not yield a better result.
* A language model can actually explain a visual area reasonably well, which tells us that linguistic co-occurrences and visual features might be related!
* Instead of fitting a single GLM, we could also have partitioned the variance accounted for by every set of features. That is, we could have described how much of the neural data is uniquely explained by a particular set of features, while controlling for the influence of the other features (this is called partial correlation). For an example of this approach, see this [paper](http://dx.doi.org/10.7554/eLife.32962).
* This tutorial largely omitted the issue of correcting for multiple comparisons. Especially when comparing multiple models to brain data, it is recommended to correct for the false discovery rate when interpreting the statistical significance of the model fits.
* Finally, all our resampling implementations (CI & stat. sig.) used only 100 iterations by default to keep computation times manageable. For your own research, you should run at least several thousand iterations to get reliable estimates.
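
As a pointer for the partial correlation idea mentioned in the list above, here is a minimal sketch on simulated data. The vectors are random numbers rather than actual RDMs, and `partial_spearman` is a hypothetical helper that rank-transforms all variables, regresses the control variable out of the other two, and correlates the residuals. It is one simple way to compute a partial rank correlation and not necessarily the approach used in the linked paper.

```python
import numpy as np
from scipy.stats import rankdata, pearsonr

rng = np.random.default_rng(1)

# Simulated RDM vectors (hypothetical): the two model RDMs share some structure,
# and the "neural" RDM is driven by both of them plus noise
shared = rng.normal(size=500)
model_a = shared + rng.normal(size=500)
model_b = shared + rng.normal(size=500)
neural = model_a + model_b + rng.normal(size=500)

def partial_spearman(y, x, control):
    """Rank correlation between y and x after regressing the ranks of `control` out of both."""
    y_r, x_r, c_r = rankdata(y), rankdata(x), rankdata(control)
    design = np.column_stack([np.ones_like(c_r), c_r])  # intercept + control
    res_y = y_r - design @ np.linalg.lstsq(design, y_r, rcond=None)[0]
    res_x = x_r - design @ np.linalg.lstsq(design, x_r, rcond=None)[0]
    return pearsonr(res_y, res_x)[0]

rho_a_given_b = partial_spearman(neural, model_a, model_b)
rho_b_given_a = partial_spearman(neural, model_b, model_a)
print("model A, controlling for model B:", round(rho_a_given_b, 2))
print("model B, controlling for model A:", round(rho_b_given_a, 2))
```

In this simulation, both models keep a positive partial correlation with the "neural" vector because each contributes variance that the other cannot account for.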

