# Final

For the first part of the exam, you will answer general questions about fMRI analysis. 

# Part 1 - General questions


### 1.1 Imports (4 points)
All semester long, we have been importing many of the same packages. Here, on the line of each import, add a comment that describes what that specific tool is for. Limit your comments/descriptions to one line each, and do NOT let them run off the page. 

In [None]:
# Imports
import os # ? [add description here!]
import cortex # ? [add description here!]
import neurods # ? [add description here!]
# NOTE: the next module is imported separately because of a quirk in the way we wrote neurods. 
import neurods.stats # [no answer needed here]
import numpy as np # ? [add description here!]
import matplotlib.pyplot as plt # ? [add description here!]

### (Just finish setting up)
Don't forget to actually run the cell above, too!

In [None]:
# Configure defaults for plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.aspect'] = 'auto'
plt.rcParams['image.cmap'] = 'viridis'
%matplotlib inline

### 1.2 Masks (6 points total)

#### 1.2a Cortical masks (3 points total)
In fMRI analysis, it is common to use *masks* to select particular sets of voxels in the brain. Most often in class, we have used masks to select *cortical* voxels, using a mask generated by pycortex called `'cortical'`. There are other options for how we might select voxels in or around the cortex. Here, we will explore two other masks in addition to the `'cortical'` mask.

We will load and visualize three different masks for the subject we will analyze in the experiment below. Load each of these masks using pycortex. 

For all three masks, the subject will be 's02', and the experiment/transform will be 'reading'. The names of the masks will be `'cortical'`, `'thin'`, and `'thick'`. (`'thin'` and `'thick'` are loaded in the same way as the `'cortical'` mask.) 

* Visualize each of these masks in 2D slices using the appropriate function in neurods. (2 points)
* Why would you use the 'thin' mask instead of the 'cortical' mask? Why would you use the 'thick' mask? (1 point)

In [None]:
# Define subject and functional-to-anatomical transform for this experiment
sub = 's02'
xfm = 'reading' 
# Load masks
#mask_cortical = # ?
#mask_thin = # ? 
#mask_thick = # ? 

# Visualize masks 
# ??

### STUDENT ANSWER


Why would you use the thin / thick mask instead of the cortical mask? 

`### STUDENT ANSWER`

[type answer in this cell]

#### 1.2b ROI masks (3 points)
A common practice in fMRI is to use a mask to select voxels in a particular region of interest (an area smaller than the whole cortex). Load the following mask, and answer the following questions about it. 

In [None]:
# Download mask
if not os.path.exists('mni_mask_final.npz'):
    neurods.io.download_file('https://drive.google.com/file/d/0B_iniuUpMJoGTkx4Y0NTUHgzTUk', 
                            'mni_mask_final.npz',
                            root_destination='./')
roi_mask = np.load('mni_mask_final.npz')['mask']

The mask is in the space of the MNI brain (the average template brain that we used in lectures 11 and 12). Thus, for pycortex purposes, the subject is `'MNI'` and the transform is `'atlas336'`. 

* How many voxels are in this mask? (1 point)
* Visualize this mask on the surface of the MNI brain. (1 point)
* Where is this mask located? (Please describe the location in appropriate neuroscientific terms!) (1 point)

[hint: for the last part, if you have trouble visualizing where the mask is in 3D based on a cortical flatmap, you can use pycortex to show this mask in a different way]

In [None]:
### STUDENT ANSWER


### 1.3 HRF (4 points)

Please explain in a few sentences: 

* What does HRF stand for? (1 point)

`### STUDENT ANSWER`

[type answer in this cell]

* Describe the shape of the HRF. Use words, but be quantitative. (1 point)

`### STUDENT ANSWER`

[type answer in this cell]

* How is the HRF generally used in fMRI analysis? (1 point)

(Avoid technical terms if you can - explain this to a non-specialist)

`### STUDENT ANSWER`

[type answer in this cell]

* Name one way in which the HRF constrains the kinds of experiments that can be done using fMRI. (1 point)

`### STUDENT ANSWER`

[type answer in this cell]

### 1.4 Bootstrapping (6 points)

Please explain in a few sentences:

* What is the purpose of a bootstrap analysis? (For example, why would you compute a bootstrap estimate of a regression weight?) (2 points)

`### STUDENT ANSWER`

[type answer in this cell]

* How would you compute a bootstrap estimate of a difference between two regression weights? (explain in words [no code!])  (1.5 points)

`### STUDENT ANSWER`

[type answer in this cell]

* How would you compute a bootstrap estimate of model prediction accuracy? (explain in words [no code!]) (1.5 points)

`### STUDENT ANSWER`

[type answer in this cell]

* Bootstrap estimates of a statistical quantity result in a distribution of values for that quantity. How can such distributions be used to draw a conclusion about the reliability or accuracy of fMRI results? (Please give a specific example of a quantity and a conclusion that could be drawn given a hypothetical outcome of a bootstrap analysis.) (1 point)


`### STUDENT ANSWER`

[type answer in this cell]


For the second part of the exam, you will analyze the following experiment. 

# Part 2 - Naturalistic reading experiment

In this experiment, subjects read multiple stories in the scanner. The stories were presented one word at a time, with the words appearing at the rate of natural speech. Each word was presented at the center of the screen by itself, for a few hundred milliseconds. So in every TR, subjects read about 4 to 15 words. 

For these words, one can extract multiple features. For example, some of the features can relate to the semantic properties of the words shown. We will not deal with such features today. We will look at only one type of feature for these words: the letters that compose the words. Our feature space is a 26-dimensional space in which each dimension corresponds to a letter in the alphabet. At every TR, we count how many times each letter occurred. 

For example, if during one TR the subject reads:

"it 

was 

the 

first 

time 

I 

saw

something

so"

...then, for that TR, the feature channel corresponding to the letter "s" will have a count of 5, the feature corresponding to "t" will have 5, the feature corresponding to "a" will have 2, the feature corresponding to "e" will have 3, etc. E.g.:

| a        |    ...       | e  | ...      | s          | t  | ... | z|
| ------------- |:-------------:| -----:|:-------------:| -----:|:-------------:|:-------------:| ---|
| 2     |  ... | 3 | ...  |5 | 5 | ...| 0|
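
The counts in this example can be checked with a short sketch. It is only an illustration of the letter-counting idea using the Python standard library, not the code that was used to build the actual feature files; the variable names (`words`, `counts`, `letter_features`) are made up for this example.

```python
from collections import Counter
import string

# Words shown during the example TR above
words = ["it", "was", "the", "first", "time", "I", "saw", "something", "so"]

# Count letters case-insensitively, ignoring anything that is not a-z
counts = Counter(ch for word in words for ch in word.lower()
                 if ch in string.ascii_lowercase)

# One entry per letter of the alphabet, in order a..z (26 values)
letter_features = [counts[letter] for letter in string.ascii_lowercase]

# Show the channels from the table above
print({letter: counts[letter] for letter in "aestz"})
# {'a': 2, 'e': 3, 's': 5, 't': 5, 'z': 0}
```

Stacking one such 26-element vector per TR (one row per TR) would give a matrix of shape (number of TRs, 26).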


We would like to learn a model that predicts the activity in the brain as a function of the letters that are read by the subject. Letters are used across words of all meanings, so you can see how this letter model captures low-level properties rather than high-level meaning. Thus, it may be a good model for brain mechanisms related to processing letters.

## 2.1 Loading data (3 points)

* Load the 'thin' mask for subject 2 (1 point)
* Load the data, z-score it, and mask it using the relevant function(s) in `neurods`. Load the first two runs of the experiment (the files listed in `fmri_files_train` in the cell below) into a variable called `Y_train`. (`Y_train` should be a single array of the two data sets concatenated along the time dimension.) Load the last run of the experiment (`"s02_reading_03.nii.gz"`) into a variable called `Y_test`. These two variables will serve as training and testing data for our model. (1 point)
* Create the experiment design matrix (here, a feature space that quantifies the letters being read by the subject). You will create one design matrix for the training data (`X_train`) and one for the testing data (`X_test`). The `design` variable loaded below contains separate design matrices for each run; your job is to split up/concatenate these matrices so that `X_train` and `X_test` match with `Y_train` and `Y_test`. (1 point)

In [None]:
# Define experiment directory
basedir = os.path.join(neurods.io.data_list['fmri'],'reading')

# Get 'thin' mask for s02
sub, xfm = 's02', 'reading'
# mask = # ?

# Load fMRI data
nruns = 2
fmri_files_train = ['s02_reading_{:02d}.nii.gz'.format(run) for run in range(nruns)]
fmri_files_train = [os.path.join(basedir, f) for f in fmri_files_train]
fmri_files_test = os.path.join(basedir, 's02_reading_03.nii.gz')
#Y_train = # ?
#Y_test = # ? 

# Load the design matrix (the letter model feature space, your X variable)
design = np.load(os.path.join(basedir, 'features.npz'))
#X_train = # ?
#X_test = # ?

### STUDENT ANSWER


If you can't figure out how to load the data above, restart your kernel, re-run the import cells above, and then run the following cell so that you can still continue with the exam. Obviously, this also provides a way to check whether your answer above is in the correct form - your answer to part 2.1 will be graded on the correctness of how you loaded the data. 

Also: ***fair warning!*** If you try to load *BOTH* the variables in the next cell *and* your own version of the same variables, you may run into memory limits (here or maybe in subsequent steps). 

In [None]:
# Load correct versions of variables
if False: # Switch this line to True if necessary
    if not os.path.exists('exp_data_file.npz'):
        neurods.io.download_file('https://drive.google.com/file/d/0B_iniuUpMJoGam9vNGFyR21IbE0', 
                                 'exp_data_file.npz', 
                                 root_destination='./')
    tmp = np.load('exp_data_file.npz')
    X_train = tmp['X_train']
    X_test = tmp['X_test']
    Y_train = tmp['Y_train']
    Y_test = tmp['Y_test']
    del tmp

### Visualizing X_train and X_test (3 points)

- Print the shape of the **`X_train`** and **`X_test`** arrays
- Use `plt.imshow` to show both matrices (Make sure you make it clear which is which in the plots!)
- Label the columns (`plt.xlabel`) and rows (`plt.ylabel`) of the plots

In [None]:
### STUDENT ANSWER


### Normalize design matrices

Run the following cell to normalize the feature matrices.

In [None]:
from scipy.stats import zscore
X_train = zscore(X_train, axis=0)
X_test = zscore(X_test, axis=0)
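
To make explicit what z-scoring along `axis=0` does, here is a minimal numpy sketch on a made-up toy matrix (`X_toy` is hypothetical and unrelated to the real design matrices). Each column (feature) is shifted to zero mean and scaled to unit standard deviation over time, which matches the default behavior of `scipy.stats.zscore` (population standard deviation, `ddof=0`).

```python
import numpy as np

# Toy "design matrix": 5 time points (rows) x 2 features (columns); values are made up
X_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0],
                  [5.0, 60.0]])

# Z-score each column over time (axis=0):
# subtract the column mean, then divide by the column standard deviation
X_toy_z = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

print(X_toy_z.mean(axis=0))  # approximately 0 for every column
print(X_toy_z.std(axis=0))   # approximately 1 for every column
```

Note that a column with no variation at all (for example, a letter that never appears in a run) has a standard deviation of zero, so z-scoring it produces NaN values.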

### Visualizing Y_train and Y_test (2 points)

- Print the shape of the `Y_train` and `Y_test` arrays.
- Do **NOT** use `imshow` here, because you might run into memory issues.
- What do the rows and columns of `Y_train` and `Y_test` correspond to?

In [None]:
### STUDENT ANSWER


## 2.2 Convolution of the design matrix with the HRF (4 points)

The TR here is 2 seconds. 

- Use the function in the package `neurods` to generate an appropriate hrf for this experiment. (1 point)
- Plot the hrf. Label the axes in the plot! (1 point)
- Use the np.convolve function to obtain conv_X_train. (Remember to trim the matrix properly). (1 point)
- Use plt.imshow to plot conv_X_train. Label the axes in the plot! (1 point)
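
As a reminder of why trimming is needed, the sketch below shows how `np.convolve` behaves on a toy 1-D signal. The kernel here is a made-up four-sample array standing in for an HRF (it is not a realistic HRF and does not come from `neurods`), and the variable names are only for illustration.

```python
import numpy as np

# Toy stimulus time course: a single event at time point 2 (10 time points total)
stimulus = np.zeros(10)
stimulus[2] = 1.0

# Made-up 4-sample kernel standing in for an HRF (NOT a realistic HRF)
toy_kernel = np.array([0.0, 0.5, 1.0, 0.25])

# The default mode is 'full': output length is len(stimulus) + len(toy_kernel) - 1
full = np.convolve(stimulus, toy_kernel)

# Keep only the first len(stimulus) samples so the result lines up with the data
trimmed = full[:len(stimulus)]

print(full.shape, trimmed.shape)
```

Because `np.convolve` only accepts 1-D inputs, a matrix with several feature columns has to be convolved one column at a time.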

In [None]:
### STUDENT ANSWER


## 2.3 Estimating brain responses to features in each voxel (3 points)

Here, we will use linear regression to estimate the brain response to each feature in each voxel. 

- Use the OLS function to estimate regression weights for all voxels. (1 point)
- Print the shape of the weight matrix. (1 point)
- What do the rows and columns of the weight matrix correspond to? (1 point)

In [None]:
### STUDENT ANSWER


## 2.4 Predicting training data (3 points)

We will first predict the training data, and see how well the predicted data correlates with the real data.

- compute Y_train_hat using the weights matrix and conv_X_train. (1 point) 
- use neurods.stats.compute_correlations below to compute the correlation of Y_train_hat and Y_train (1 point)
- make a flatmap of the correlation value over the brain. (1 point)
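
For intuition about what a per-voxel correlation between predicted and actual data means, here is a minimal numpy sketch of a column-wise Pearson correlation on random toy data. The helper `columnwise_correlation` is hypothetical and written only for this illustration; it is not the implementation of `neurods.stats.compute_correlations`, which is the function you should use for the exam.

```python
import numpy as np

def columnwise_correlation(a, b):
    """Pearson correlation between matching columns of two (time x voxel) arrays."""
    a_z = (a - a.mean(axis=0)) / a.std(axis=0)
    b_z = (b - b.mean(axis=0)) / b.std(axis=0)
    # The mean over time of the product of z-scores is Pearson's r for each column
    return (a_z * b_z).mean(axis=0)

# Toy data: 50 time points x 3 "voxels" (random numbers, not real fMRI data)
rng = np.random.default_rng(0)
actual = rng.standard_normal((50, 3))
predicted = actual.copy()                    # column 0: perfect prediction
predicted[:, 1] = -actual[:, 1]              # column 1: perfectly anti-correlated
predicted[:, 2] = rng.standard_normal(50)    # column 2: unrelated to the data

r = columnwise_correlation(predicted, actual)
print(r.shape)  # one correlation value per column (voxel)
```

The result has one value per voxel, which is the form needed to display prediction accuracy on a flatmap.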

In [None]:
### STUDENT ANSWER


## 2.5 Predicting test data (4 points)

We will now predict the held-out data, and see how well the predicted data correlates with the real held-out data.

- ATTENTION: you cannot use the matrix X_test to compute Y_test_hat. Remember, the weights you estimated are a function of the convolved design matrix. You need to use the hrf function above and np.convolve to obtain conv_X_test. And you need to trim it appropriately. (1 point)
- compute Y_test_hat using the weights matrix and conv_X_test. (1 point)
- use `neurods.stats.compute_correlations` below to compute the correlation of `Y_test_hat` and `Y_test` (1 point)
- make a flatmap of the correlation value over the brain. (1 point)

In [None]:
### STUDENT ANSWER


## 2.6a Interpretation (3 points)

- Which regions appear to be predicted by letters the subject sees on the screen? (Use neuroscientifically appropriate terms!) Does that make sense? (3 points)

`### STUDENT ANSWER`

[type answer in this cell]

## 2.6b Visualizations to aid interpretation (3 points)
* Make a histogram of the prediction accuracy (correlation) values for the training set data. (Label axes!) (1 point)
* Make a histogram of the prediction accuracy (correlation) values for the test set data. (Label axes!) (1 point)
* Make a scatter plot of training set prediction accuracy vs. test set prediction accuracy (each dot will be one voxel). (1 point)

In [None]:
### STUDENT ANSWER


## 2.7 Big picture interpretation  (2 points)
What is the difference between predicting the training data and predicting the testing data?
(How do the predictions differ? Why do you think that is the case?) (2 points)

`### STUDENT ANSWER`

[type answer in this cell]

--- 
# Extra credit  
---
Pick and choose among these questions (or answer them all) - there is a maximum of 5 points of extra credit possible. (that is, you will only get 5 points of extra credit even if you answer all of the questions perfectly.)

### Describe another feature space you might construct to model brain responses in this experiment! (2 points)
For this question, assume you have access to the full list of words that the subject read at each point in time. Describe (1) what the hypothesis is, (2) how you would go about making the feature space (what would you label / compute about the words for each TR?), and (3) what you might expect to find / why this would be an interesting hypothesis.

`### STUDENT ANSWER`

[type answer in this cell]

### Take a close look at the experimental design (2 points)
In which run were the most total letters read by the subject? In which run were there the most letters per TR? (Note that the runs were not all the same length!) (1 point)

How might this information influence the interpretation of the experiment? (1 point)

In [None]:
### STUDENT ANSWER


### Create a mask (a 3D array) that selects the 400 best-predicted voxels in the brain.  (2 points)

Do this for the best-predicted voxels in the training set and in the test set.

Make a flatmap for each of the masks to compare them.

Are they the same voxels? 


In [None]:
### STUDENT ANSWER


### Cross-validation (3 points)

In the main part of the exam we trained on 2 runs of data and predicted a held-out test set. Instead, here, we want to use cross-validation to measure the performance on the training data only. 

* Implement cross-validation:
 - For each fold, hold out one-fourth of the total training data. 
 - Use the remaining 3/4 of the data to compute the weights for the letter model.
 - Compute the correlation between the predicted activity for the held-out data and the real activity.
* Compute the average correlation across all held-out folds and plot it on a flatmap.

* Does this method produce the same results as what we did in 2.4? That is, is it the same as fitting the model on the entire training data and then predicting the training data? Why or why not?

In [None]:
### STUDENT ANSWER
