# Programming evaluation 1
Fill in the following cells as indicated. If you do not know what specific Python code should be used (e.g., if you forgot a function or package name, or otherwise don't remember the precise form of the code needed to accomplish a task), please write out a description of what needs to be done in pseudo-code or plain English. Comments describing what your code is doing are also strongly encouraged, especially if you are not sure that your code is correct. For example:

```python
# This line constructs a random normal array of size (100 time points x 50 voxels)
n_voxels = 50
n_timepoints = 100
Y = np.random.randn((n_timepoints, n_voxels))
```

The above line won't actually run correctly due to an error in the code, but would be worth 90% of the points for that component of the question.
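
For reference, the error in the example is that `np.random.randn` takes each dimension as a separate argument rather than as a single tuple. A corrected sketch of the same example looks like this:

```python
import numpy as np

n_voxels = 50
n_timepoints = 100
# randn takes each dimension as its own positional argument, not a tuple
Y = np.random.randn(n_timepoints, n_voxels)
print(Y.shape)
```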

Finally, if you aren't sure that you are doing something the way we have done it in class, or think your solution might be imperfect, that's OK. If your hacky or imperfect solution works, you may get full points, and you will get partial credit for trying something even if it doesn't.

In [None]:
# Imports
import matplotlib.pyplot as plt
import numpy as np
import cortex as cx
import h5py

# Make image plots easier
plt.rcParams['image.aspect'] = 'auto'

# Some utility functions
import sys
import os
sys.path.append(os.path.abspath('..'))
import utils.fmri as fmriutils

%matplotlib inline

## Part 1: construction of arrays & fake data

> `1. (5 points)` Construct a design matrix (call it `X`) consisting of five columns of binary indicator variables that is 200 TRs long (1 point). Each column should have five occurrences of a condition, each coded as a series of ones in the array (1 point), and each condition occurrence should last 2 TRs (1 point). No two conditions should ever be on at the same time (1 point). Once you have constructed this array, show it using `plt.imshow()` (1 point). Please attempt to construct this array as instructed, but if you have trouble, simply make a random array of the same size (200 TRs by 5 conditions) to use in the next questions.

In [None]:
# Answer
X = ... # ? 


In [None]:
# Quick convolution of the data with an HRF function
t, hrf = fmriutils.hrf()
Xc = np.zeros(X.shape)
for column in range(5):
    tmp = np.convolve(X[:,column], hrf)
    Xc[:,column] = tmp[:X.shape[0]]

> `2. (1 point)` Why is it necessary to clip the `tmp` variable above to a maximum length of `X.shape[0]`? (A written answer is all that is needed here.)

Answer:



> `3. (3 points)` Create an array of random normal weights (call it `B`), one for each condition in the array above for each of 30 voxels (1 point). Multiply the array of weights by the HRF-convolved design matrix (`Xc`) to generate an array of data (call it `Y`) (1 point), and add some noise to Y. (1 point)

In [None]:
# Answer
B = # ?
Y = # ?

> `4. (3 points)` Use ordinary least squares regression to estimate the weights from the noisy data (2 points). The normal equation for ordinary least squares is: $B = (X^TX)^{-1}X^TY$. Show that the estimated weights are close to the original weights however you can (1 point).
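
As a reminder of how the normal equation maps onto NumPy operations, here is a minimal sketch on unrelated toy data. The names `X_toy`, `B_true`, `Y_toy`, and `B_est`, along with the array sizes and noise level, are made up for illustration and are not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 200 samples, 3 regressors, 4 outputs
X_toy = rng.standard_normal((200, 3))
B_true = rng.standard_normal((3, 4))
Y_toy = X_toy @ B_true + 0.1 * rng.standard_normal((200, 4))

# Normal equation: B = (X^T X)^{-1} X^T Y
B_est = np.linalg.inv(X_toy.T @ X_toy) @ X_toy.T @ Y_toy

# Largest absolute difference between estimated and true weights
print(np.abs(B_est - B_true).max())
```

In practice `np.linalg.lstsq` is often preferred over an explicit inverse for numerical stability, but the explicit form mirrors the formula more directly.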

In [None]:
# Answer


> `5. (3 points)` The experiment above is a simulation of a traditional block design or event-related experiment, a la Karl Friston. Describe in words or code what it would take to make this simulation into an encoding model experiment. Specifically: what would be different about X? (1 point) What might be different about the regression and why? (1 point) A critical aspect of the encoding model framework is making predictions of withheld data. What would this involve, in this sort of simulation? (1 point)

Answer:


In [None]:
# Or: Answer in code & comments (either is acceptable for full credit here)


## Part 2: writing functions
Another task we've done several times in class is to translate a formula into Python code. Here, your task is to write a function that computes the Euclidean distance between two vectors. The Euclidean distance is:

## $D = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + ... + (x_n-y_n)^2}$

or, more compactly,

## $D = \sqrt{\sum_{i=1}^n (x_i-y_i)^2}$

$n$ is the number of elements in each vector; $x$ and $y$ are the vectors. 
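
As a quick worked example with made-up vectors $x = (1, 2, 3)$ and $y = (4, 6, 3)$, so $n = 3$:

$D = \sqrt{(1-4)^2 + (2-6)^2 + (3-3)^2} = \sqrt{9 + 16 + 0} = \sqrt{25} = 5$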

> `6. (3 points)` Write a function that computes the Euclidean distance between two vectors (2 points). Use that function to compute the distance between each pair of the three vectors (`a`, `b`, and `c`) below, and show the distance between each pair however you see fit (1 point).

In [None]:
# load data
rand_data = np.load('random_vars.npz')
a = rand_data['a']
b = rand_data['b']
c = rand_data['c']
print(a.shape, b.shape, c.shape)

In [None]:
# Answer
# Define the function
def euclidean_dist(x, y): 
    dst = # ??
    return dst

# Use the function to compute distances between pairs of variables ((a, b), etc.)
dst_ab = # ??

## Part 3: array masking
> `7. (1 point)` How many voxels are in the following mask for V4? (you will have to do something to the array to find out)

In [None]:
subject = 's01'
transform = 'color_natims'
roi_mask = cx.get_roi_masks(subject, transform, roi_list=['V4'])

In [None]:
# Answer 
n_voxels = # ? 

> `8. (1 point)` Use the mask to select all voxels in V4 in the val_brain data array. Show the shape of the resulting array.

In [None]:
# Load val_brain
with h5py.File('/unrshare/LESCROARTSHARE/IntroToEncodingModels/s01_color_natims_data.hdf', 'r') as hf:
    # dataset[()] reads the full array (the older .value attribute was removed in h5py 3.0)
    val_data = hf['val'][()]
    mask = hf['mask'][()]
val_brain = np.zeros((val_data.shape[0],) + mask.shape)
val_brain[:, mask] = val_data

In [None]:
# Answer
V4_voxels =  # ?

## Part 4: (Bonus, extra credit) data visualization
The following 3D array (`d`) contains a mask for a 3D shape within it (ones where the shape is present, 0 where it is absent). 

> `9. (+ 3 points)` Use the `plt.imshow()` (or `ax.imshow()`) function to visualize 3 different slices through the array in each of the 3 dimensions of the array. Thus, you should make a total of 9 plots (or subplots, if you want to be fancy; not required). These plots are analogous to looking at transverse, sagittal, and coronal sections of a brain. (To be clear, a slice involves specifying a value for one dimension and showing all values for the other dimensions. For each dimension, I recommend showing the 30th, 50th, and 70th values.)
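
To make the idea of a slice concrete, here is a small sketch on a made-up array (`toy` and its shape are hypothetical and unrelated to `d`). Fixing an index along one axis leaves a 2D array over the remaining two axes, which is the kind of input `imshow` expects.

```python
import numpy as np

# Hypothetical 3D array, used only to illustrate slicing
toy = np.zeros((4, 5, 6))

slice_axis0 = toy[2, :, :]  # fix the first dimension, keep the other two
slice_axis1 = toy[:, 2, :]  # fix the second dimension
slice_axis2 = toy[:, :, 2]  # fix the third dimension

print(slice_axis0.shape, slice_axis1.shape, slice_axis2.shape)
```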

In [None]:
d = np.load('mystery_shape.npy')
print(d.shape)

In [None]:
# Answer


> `10. (+ 1 point)` What is the 3D shape? 

Answer: 
