# HW 2 Problem 2: Logistic Regression

In this problem, you will build a decoding model that predicts the orientation of a visual stimulus based on neural activity.

The data come from experiments in which monkeys viewed static gratings of different orientations while the lab recorded activity of multiple single neurons in **primary visual cortex (V1)** using chronically implanted tetrode arrays.

The gratings were presented at 7 different orientations and two different contrasts (10% and 100%). Each grating was presented for 500 ms. This figure gives a sense of what a grating looks like and how orientation can vary (top row) and how contrast can vary (bottom row). </br></br>
![grating stimulus examples](grating_stimuli.jpg)
</br></br>
Bethge et al. published a classic paper investigating the population code of these visual neurons using decoding analyses. We will implement analyses similar to those in this article: https://www.jneurosci.org/content/32/31/10618.short. The data are publicly available from the Bethge lab [here](http://bethgelab.org/datasets/v1gratings/).
</br></br>



## Load in the data (100% contrast condition only)


In [None]:
import scipy.io as sio
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

In [None]:
# Prepare data
data = sio.loadmat('spks.mat')

spikes = data['spk']
ori = data['ori'].reshape((-1,))
timevec = data['times'].reshape((-1,))

print('spikes: ' + str(np.shape(spikes)))
print('ori: ' + str(np.shape(ori)))
print('timevec: ' + str(np.shape(timevec)))


**`spikes`**: contains the spike counts binned into 10 ms bins. The dimensions are (neurons x orientations x time bins x trials).

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Note that spikes is binned spikes over time, not a single response value per orientation.

**`ori`**: contains the 7 different orientations used (corresponding to the second dimension in spikes)

**`timevec`**: contains the time of each 10 ms time bin relative to when the grating was presented (grating presentation happened at 0 ms). Times reflect the middle of each time bin.
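
To make the 4-D layout concrete, here is a minimal sketch of how such an array can be indexed. It uses a synthetic stand-in (`toy_spikes`, a made-up name filled with random counts) rather than the real recordings, with sizes taken from the numbers quoted in this notebook.

```python
import numpy as np

# Synthetic stand-in for `spikes`; the real array comes from spks.mat.
# Sizes follow the text: 22 neurons, 7 orientations, 90 time bins, 85 trials.
rng = np.random.default_rng(0)
toy_spikes = rng.poisson(lam=0.2, size=(22, 7, 90, 85))

# All trials of neuron 0 at orientation index 2, across time: (time bins, trials)
one_condition = toy_spikes[0, 2, :, :]

# Trial-averaged time course for that neuron and orientation: (time bins,)
mean_over_trials = one_condition.mean(axis=1)

# Population response in a single time bin: (neurons, orientations, trials)
one_bin = toy_spikes[:, :, 40, :]

print(one_condition.shape, mean_over_trials.shape, one_bin.shape)
```

Averaging over the last axis collapses trials, which is the same operation the plotting cell below applies to the real data.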

<br><br>
**Explore the data**

Let's plot the average spike counts over time for neuron 1 at various orientations.

Change the variable `i_ori` to view responses to different stimuli.


In [None]:
i_ori = 0   # if you're used to matlab, remember python indexing begins at 0!

fig, ax = plt.subplots(1, 1)
ax.plot(timevec, np.mean(spikes[0, i_ori, :, :], axis=1))
_ = ax.set(xlabel='Time (ms)',
        ylabel = 'Mean spike count',
        title=f'orientation = {ori[i_ori]}');


# I. Decoding orientation at a single time bin

First, we will decode the orientation of the grating from the neural activity in the bin from 190 ms to 200 ms after the grating is presented. We will use a logistic regression model because the output variable is categorical in this case -- we will predict which of the 7 distinct orientations was presented.
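
For reference, one common way to extend logistic regression to more than two classes is the multinomial (softmax) formulation sketched below. The symbols are notation introduced here, not taken from the assignment: $\mathbf{x}$ is the vector of spike counts across neurons on one trial, and $\mathbf{w}_k$ and $b_k$ are the weight vector and intercept for orientation $k$. Whether sklearn fits exactly this form depends on its version and settings, so treat it as an illustration of the idea.

$$
P(y = k \mid \mathbf{x}) = \frac{\exp\left(\mathbf{w}_k^\top \mathbf{x} + b_k\right)}{\sum_{j=1}^{7} \exp\left(\mathbf{w}_j^\top \mathbf{x} + b_j\right)}, \qquad k = 1, \dots, 7
$$

The predicted orientation is the $k$ with the largest probability.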




## Preparing the data



Grab the spiking data in the correct time bin after the grating presentation.  Remember, the time bin from 190-200 ms corresponds to the entry in timevec equal to 195 ms.
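
An exact `==` comparison works when the bin centers are stored as exact values, but it can silently fail if they carry floating-point rounding. A minimal sketch of two more tolerant lookups follows, using a made-up `toy_timevec` (the real bin centers come from `data['times']` and may differ).

```python
import numpy as np

# Hypothetical bin centers for illustration only (90 bins, 10 ms apart)
toy_timevec = np.arange(-195.0, 705.0, 10.0)

target_ms = 195.0

# Option 1: tolerance-based match, robust to floating-point rounding
matches = np.flatnonzero(np.isclose(toy_timevec, target_ms))

# Option 2: index of the closest bin center; always returns exactly one index
nearest = int(np.argmin(np.abs(toy_timevec - target_ms)))

print(matches, nearest, toy_timevec[nearest])
```

The first option returns an empty array when no bin matches, which makes a wrong target easy to spot. The second always returns a bin, so it is worth checking that the returned center is the one intended.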

In [None]:
# print(timevec)
bin_index = np.where(timevec == 195)[0][0]

spikes_t195 = spikes[:, :, bin_index, :]
print(f'The shape of spikes_t195 is {spikes_t195.shape}')

<br>Now let's divide the data into train and test trials. We have 85 trials per orientation. We want to divide them randomly and use 80% for training the model and 20% for testing.
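
As a side note, the same kind of split can be sketched with NumPy's newer `Generator` interface. The names `demo_train_idx` and `demo_test_idx` are made up for this illustration, and the resulting split will not match the one produced by `np.random.seed(0)` in the cell below.

```python
import numpy as np

n_trials_demo = 85                          # trials per orientation, as stated above
n_train_demo = int(0.8 * n_trials_demo)     # 68 training trials

rng = np.random.default_rng(0)              # arbitrary seed, fixed for reproducibility
shuffled = rng.permutation(n_trials_demo)   # random ordering of trial indices

demo_train_idx = shuffled[:n_train_demo]
demo_test_idx = shuffled[n_train_demo:]

# Sanity checks: sizes add up and no trial appears in both sets
assert len(demo_test_idx) == n_trials_demo - n_train_demo
assert np.intersect1d(demo_train_idx, demo_test_idx).size == 0
print(len(demo_train_idx), len(demo_test_idx))
```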

In [None]:
n_neurons = spikes.shape[0]
n_trials = spikes.shape[3]
n_train_trials = int(0.8*n_trials)
n_test_trials = n_trials - n_train_trials

# Get random trials for train vs test
np.random.seed(0) # I'm adding this line so we all arrive at the same answer. In real analyses, you 
                  # would want trial allocation to be truly random.
trials = np.arange(0, n_trials)
np.random.shuffle(trials)
train_trials = trials[:n_train_trials]
test_trials = trials[n_train_trials:]

print(f'The train trials are {train_trials}')
print(f'\nThe test trials are {test_trials}')

<br>
Now we separate the spiking data on the train and test trials.

We will also reshape the arrays to combine orientation and trial into a single dimension. Now, each combo of orientation and trial is one data point. 

We'll create a corresponding array of the orientation on each trial, in accordance with how we reshaped the spikes array.
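
The label array has to follow the same ordering that `reshape` produces. The toy sketch below (made-up sizes and names) fills an array so that every entry equals the orientation of its condition, which makes the ordering after reshaping easy to check.

```python
import numpy as np

# Toy sizes, chosen only for illustration
n_neurons_toy, n_ori_toy, n_trials_toy = 2, 3, 4
ori_toy = np.array([0, 45, 90])

# Fill the array so every entry equals the orientation of its condition
toy = np.zeros((n_neurons_toy, n_ori_toy, n_trials_toy))
toy += ori_toy[np.newaxis, :, np.newaxis]

# Merge (orientation, trial) into one axis; trials vary fastest (C order)
flat = toy.reshape((n_neurons_toy, -1))

# np.repeat yields [0 0 0 0 45 45 45 45 90 90 90 90], matching that order
labels = np.repeat(ori_toy, n_trials_toy)

print(flat.shape)
print(np.array_equal(flat[0], labels))
```

Using `np.tile` instead of `np.repeat` would cycle through the orientations (0, 45, 90, 0, ...) and would no longer line up with the reshaped data.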

In [None]:
spikes_train = spikes_t195[:, :, train_trials].reshape((n_neurons, -1,)) # input data
ori_train = np.repeat(ori, n_train_trials)  # output data: correct label
# print(spikes_train.shape)
# print(ori_train.shape)

spikes_test = spikes_t195[:, :, test_trials].reshape((n_neurons, -1,)) # input data
ori_test = np.repeat(ori, n_test_trials)  # output data: correct label
# print(spikes_test.shape)
# print(ori_test.shape)


### Problem 2a: comprehension check
Why does the `spikes_test` array have 119 columns?
</br></br>




<font color=#2AAA8A><span style="font-size:larger;">
**Answer**

<font color=#2AAA8A><span style="font-size:larger;">
...


## Fitting the logistic regression model

To fit the logistic regression model, we will use sklearn's LogisticRegression functionality. Read about it here: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

The first step is to define our model.

In [None]:
model = LogisticRegression()

### Problem 2b (coding): Fitting a logistic regression model
Now we'll train the model to predict orientation from spike data by calling the fit method: `model.fit(inputs, outputs)`. `inputs` should be a 2D array where each row represents a single trial and each column represents a feature (each input variable will be weighted and combined in the logistic regression). `outputs` should be a 1D array containing the target values (orientations) for each trial, with length equal to the number of trials.

Put in the correct arguments to `model.fit` to fit the logistic model to our training data.

Hint: you can swap rows and columns in an array by transposing it.

In [None]:
# TODO: fill out the call to model.fit with the correct data 
model.fit(...)

### Problem 2c (coding): Evaluating the logistic regression model quantitatively

After fitting the model, we want to assess how good its predictions are. We can use `model.score` to compute the model's accuracy and evaluate its performance. Accuracy is the fraction of trials (a number between 0 and 1) on which the model predicted the correct orientation. Crucially, we want to use our held-out test data to evaluate model success -- different data points than those used to train the model.

Fill out the code below to compute the accuracy and evaluate the logistic regression model's performance.
The arguments to `model.score` should have the same format as `model.fit`.
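
To illustrate what an accuracy value represents, here is a small sketch with made-up labels and predictions (not output from any fitted model). For classifiers, sklearn documents `score` as the mean accuracy, which is the quantity computed by hand here.

```python
import numpy as np

# Made-up labels and predictions for six trials, for illustration only
true_ori = np.array([0, 45, 90, 0, 45, 90])
pred_ori = np.array([0, 45, 0, 0, 90, 90])

# Accuracy is the fraction of trials where the prediction matches the label
accuracy_fraction = np.mean(pred_ori == true_ori)

# Multiply by 100 to report it as a percentage
accuracy_percent = 100 * accuracy_fraction

print(accuracy_fraction, accuracy_percent)
```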


In [None]:
# TODO: fill out the call to model.score with the correct data
accuracy = model.score(...)

print(accuracy)

### Problem 2d: Interpreting model results

i) What does this accuracy score mean? State the result in plain English.

ii) If there were no information about orientation in the neural activity, what accuracy would the model have? In other words, what would be the baseline chance of correctly guessing the stimulus orientation?

iii) Does it seem like there's information about the orientation present in the neural activity about 200 ms after grating presentation (the 190-200 ms bin)? Why or why not?

iv) Is there overfitting happening? How do you find out? Demonstrate using code.

v) If we had fewer trials overall (10 trials) and followed the same procedure, do you think the accuracy of the model on the training data would increase, decrease, or stay the same? How about the accuracy on the testing data? Why?

<font color=#2AAA8A><span style="font-size:larger;">
**Answer**

<font color=#2AAA8A><span style="font-size:larger;">
...

In [None]:
# Answer to 2d iv


</br></br>
### Problem 2e: Inspect model coefficients

We can access the learned parameters of the model using `model.coef_`. In this case, the coefficients are in a 7 x 22 array representing the learned weights for all 22 neurons to the 7 orientations. You can plot the array with `plt.imshow` to visualize these weights.
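
To see how the rows of `coef_` map onto class labels, here is a minimal sketch on synthetic data (3 classes and 5 features, all names made up, unrelated to the V1 recordings). It relies on the fitted attributes `classes_`, `coef_`, and `intercept_` of sklearn's `LogisticRegression`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: 3 classes, 5 features, 40 samples per class
rng = np.random.default_rng(0)
labels_toy = np.repeat([0, 60, 120], 40)
centers = rng.normal(0.0, 2.0, size=(3, 5))        # one mean vector per class
X_toy = np.repeat(centers, 40, axis=0) + rng.normal(0.0, 1.0, size=(120, 5))

clf = LogisticRegression(max_iter=1000).fit(X_toy, labels_toy)

print(clf.classes_)          # sorted class labels; row i of coef_ belongs to classes_[i]
print(clf.coef_.shape)       # (n_classes, n_features)
print(clf.intercept_.shape)  # one intercept per class
```

Row `i` of `coef_` holds the weights for `classes_[i]`, so `classes_` is a convenient source of tick labels for the orientation axis of a coefficient plot.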

**In general, what can be gleaned from inspecting the coefficients?**

In [None]:
plt.imshow(model.coef_, aspect='auto')
plt.xlabel('Neuron')
plt.ylabel('Orientation')
plt.title('Model Coefficients')
plt.colorbar();

<font color=#2AAA8A><span style="font-size:larger;">
**Answer**

<font color=#2AAA8A><span style="font-size:larger;">
...

💪 _almost there_

# II. Decoding accuracy over time

Now, instead of looking at a single time bin, we want to get a sense of the timescale at which orientation information can be read out from the neural population. Each grating stimulus was presented for 500 ms duration. How long after the grating is presented is there information about orientation present in the neural population?

To investigate this, we can fit a logistic regression model for each time bin and calculate the decoding accuracy for that time bin. We can then look at how the accuracy changes over time relative to the grating presentation onset.

### Problem 2f (coding): Decoding accuracy over time

In the code above, you've already figured out how to fit and evaluate a logistic regression decoding model for a single time bin (the one corresponding to 195 ms after stimulus onset). Generalize that code to calculate the accuracy for each of the 90 time bins.


In [None]:
n_time_bins = spikes.shape[2]
accuracy = np.zeros((n_time_bins, ))

# loop over time bins
for i_bin in range(n_time_bins):

    # TODO: fit a model and compute its accuracy for the data in i_bin
    ...


fig, ax = plt.subplots(1, 1, figsize=(10, 5))
ax.plot(timevec, accuracy, '-o')
_ = ax.set(xlabel='Time (ms)',
       ylabel = 'Accuracy',
       xticks=np.arange(-200, 800, 50))
plt.grid()

<br>You should see a plot that looks approximately like [this](https://drive.google.com/file/d/1SlJ0-6q7D_phXUXRBhrVlkD1v6xzwtOS/view?usp=sharing). There might be small differences depending on how train vs test trials were split.
<br><br>


### Problem 2g: Interpreting this plot

Please answer these questions for the correct plot linked above, even if you got a different one.

i) About how long after the presentation of the grating (at time 0), does information about orientation appear in the neural population activity?

ii) What might that time delay between stimulus presentation and information presence be due to?

iii) Each stimulus was presented for 500 ms (from 0 to 500 ms). How does the amount of information about orientation change over time, compared to both the peak and chance accuracy levels?

<font color=#2AAA8A><span style="font-size:larger;">
**Answer**

<font color=#2AAA8A><span style="font-size:larger;">
...

### Problem 2h: Other scientific questions

**Propose a scientific question about neural coding in V1 that you could address with this data set using decoding methods. Describe the approach for how you would answer this question.**

Your answer should include: 
* **Research question** - It should include what specific aspect of neural coding you're investigating (i.e., don't just ask "can X be decoded from the data?").
* **Decoding approach** - What model will you use, logistic regression or something else? What will the input data consist of? What will you be comparing?
* **Interpretation** - Suggest 2 feasible outcomes. What would each outcome imply about how V1 encodes visual information? 

The exercise only used data from the 100% contrast stimuli, but we also have data for the 10% contrast stimuli!

If you want more info about the experimental details, look at the Bethge publication linked at the top of this page. 

_Avoid simply proposing to repeat the same question about time-course on 10% contrast data. Push yourself to ask a different question about neural representation or computation!_


***Example Answer for Problem 2h based on Part II of this assignment***

**Research Question:** What is the timecourse of information about grating stimulus orientation contained in the V1 population? For example, how quickly does orientation information become decodable, and how long does it last?

**Decoding Approach:** For each time bin, I would train a logistic regression classifier to decode the stimulus orientation from the population of V1 neurons recorded in this experiment. The input data for each classifier would be an array of spike counts with shape (n_neuron, n_trials). After calculating accuracy on held-out trials, I would plot accuracy as a function of time.

**Interpretation:**

Outcome 1: Information remains constantly above chance throughout stimulus duration. 
If decoding accuracy rose immediately at stimulus onset and remained at peak levels throughout the 500 ms stimulus presentation, returning to chance immediately at offset, this would suggest that V1 maintains equally strong orientation information about stimuli that are physically present in the visual field. An immediate return to chance would reveal that V1 represents only currently present visual information without any persistence.

Outcome 2: Accuracy increases (with a small delay) after stimulus onset, peaks early, then declines but remains above chance. 
This outcome implies that the most information about orientation is available shortly after the stimulus appears. While some information about orientation remains in the population, the amount is less after that initial transient response. This pattern would suggest that perhaps adaptation mechanisms alter the population activity, and/or that the V1 population is most responsive to changes within the visual field.


<font color=#2AAA8A><span style="font-size:larger;">
**Answer**

<font color=#2AAA8A><span style="font-size:larger;">
...