## <span style="color:black">Neural Decoding</span>


In lab, we will be working with files in the Zhang_neurons folder. This dataset contains recordings from 132 neurons in a monkey's inferior temporal (IT) cortex, an area known to be heavily involved in high-level vision and object perception. The recordings were made while a monkey viewed seven different objects, each presented at one of three screen locations. Each object was presented approximately 20 times at each of the three locations. In each trial, the monkey viewed a fixation dot for 500 ms and then viewed one of the seven objects for another 500 ms. The data were reported in Zhang et al. (2011, *PNAS*).

Note: The paper also contains conditions in which several objects were presented simultaneously, but only the single-object condition is included in this dataset.

Zhang, Y., Meyers, E. M., Bichot, N. P., Serre, T., Poggio, T. A., & Desimone, R. (2011). *Object decoding with attention in inferior temporal cortex*. Proceedings of the National Academy of Sciences, 108(21), 8850-8855.

https://doi.org/10.1073/pnas.1100999108

### About the dataset:

The data are in raster format, meaning that each .mat file contains data from one of the 132 neurons. Each of these files contains three variables.

*raster_site_info*: A structure corresponding to the recording parameters of the experiment that <u>can be ignored</u> for the purpose of this problem set.

*raster_labels*: A structure that contains the object being viewed (stimulus_ID), the position of the object (stimulus_position), and the combined object+position (combined_ID_position).

*raster_data*: A matrix in which each row corresponds to the data from one trial and each column corresponds to one 1-ms time point. Rows are in trial order: the first trial is in the first row and the last trial is in the last row.
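
To make the raster layout concrete, here is a minimal sketch on synthetic data. It assumes that each entry is 1 if the neuron spiked in that millisecond and 0 otherwise, which is an assumption about the files rather than something stated above. The helper name `window_columns` is made up for this illustration.

```python
import numpy as np

# Synthetic stand-in for one neuron's raster_data: 5 trials x 1000 one-ms bins.
# ASSUMPTION: 1 = spike, 0 = no spike (not confirmed by the dataset description).
rng = np.random.default_rng(0)
raster = (rng.random((5, 1000)) < 0.02).astype(int)

def window_columns(start_ms, end_ms):
    """Hypothetical helper: column slice for start_ms..end_ms (1-indexed, inclusive)."""
    # Millisecond t lives in column t - 1 because Python indexing starts at 0.
    return slice(start_ms - 1, end_ms)

# Spike count per trial during the 500 ms stimulus period (501-1000 ms).
stim_counts = raster[:, window_columns(501, 1000)].sum(axis=1)

print(raster.shape)       # one row per trial, one column per millisecond
print(stim_counts.shape)  # one spike count per trial
```

The off-by-one between millisecond labels (starting at 1) and column indices (starting at 0) is the main thing to watch when slicing time windows.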

### Working with the dataset:

Dealing with 132 separate data files can be a challenge. First, import these packages and define some helpful code snippets:

In [None]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from mat2array import loadmat
import glob
import os

# Define the path to the current directory on your computer
homeDirectory = os.getcwd()

# Change the directory to the directory containing the raw data
os.chdir(homeDirectory+ '/Zhang_neurons')

# Create a file list of all neurons' data files
neuronList = glob.glob('*.mat')

# Change directory back to home
os.chdir(homeDirectory)

Next, I recommend reading the files in this way:

In [None]:
# Loops through each file in neuronList
for i in range(len(neuronList)):  
    # creates a full path on your computer to the ith file
    path = homeDirectory + '/Zhang_neurons/' + neuronList[i]
    
    # loads the file
    neuron = loadmat(path)
    
    # parses the data into the relevant variables
    raster_data = neuron['raster_data']
    stimID = neuron['raster_labels']['stimulus_ID']
    stimPosition = neuron['raster_labels']['stimulus_position']

In this lab, we will only concern ourselves with the seven object identities. It's helpful to define them in a list.

In [None]:
classes = ['car','couch','face','flower','guitar','hand','kiwi']

To get the indices of the first class (car), you could do the following:

In [None]:
carInd, = np.where(stimID == classes[0])
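
As a small illustration of what that line does, the sketch below uses a made-up label array. It assumes `stimID` behaves like a NumPy array of strings; if your loader returns a plain list, wrap it in `np.array` first.

```python
import numpy as np

classes = ['car', 'couch', 'face', 'flower', 'guitar', 'hand', 'kiwi']

# Made-up labels for six trials (the real stimID has one label per trial).
stimID = np.array(['face', 'car', 'kiwi', 'car', 'hand', 'face'])

# np.where with a single argument returns a tuple holding one index array.
# The trailing comma unpacks that one-element tuple into a plain array.
carInd, = np.where(stimID == classes[0])
print(carInd)        # [1 3]
print(len(carInd))   # 2
```

Without the trailing comma, `carInd` would be a tuple, and `len(carInd)` would be 1 regardless of how many car trials there are.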

Not all IT neurons are equally responsive to visual stimuli. Calculate the mean spike count rate for each neuron in the interval from 601-1000 ms, and plot a histogram of the population's firing rates. (We omit the first 100 ms after stimulus onset, 501-600 ms, because there is little visually driven activity in this area during that period.)

In [None]:
# Initialize data storage for the average firing rate of each neuron
meanRate = np.zeros()

# Loop through each neuron in neuronList
for i in range():  
    # Define the file path for loading the .mat files
    
    # Use the loadmat function to load the file
    
    # Defining the data stored in the .mat file
    raster_data = neuron['raster_data']
    stimID = neuron['raster_labels']['stimulus_ID']
    stimPosition = neuron['raster_labels']['stimulus_position']
    
    # Calculate the mean spike count rate for each neuron and store in meanRate

# Plotting
plt.hist(meanRate);
plt.title('Histogram of population firing rates')
plt.xlabel('Firing rate')
plt.ylabel('Counts')

In 2-3 sentences:

What do you conclude about the visual responsiveness of this population? What might be a negative consequence of decoding using these raw firing rates?

In [None]:
# Answer

We want to turn the raw rasters into spike count rate matrices. I have provided a function that computes the firing rate matrix for one neuron: a 420-trial by 18-time-bin matrix in which each entry is the neuron's spike count rate (in spikes per second) within one time window. The windows begin every 50 ms (1 ms, 51 ms, 101 ms, etc.) and are 150 ms long, so consecutive windows overlap. Thus, window 1 spans 1-150 ms, window 2 spans 51-200 ms, and so on up to window 18, which spans 851-1000 ms.

Run this cell to define the function and see an example.

In [None]:
neuron = loadmat('Zhang_neurons/bp1001spk_01A_raster_data.mat')

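def rate(neuron):
    # bins is global so that later cells can reuse it as the x-axis when plotting
    global bins
    # window start columns (0-indexed): 0, 50, ..., 850
    bins = np.arange(0,890,50)
    # 420 trials x 18 time windows
    rateMat = np.zeros((420,18))
    raster_data = neuron['raster_data']
    for i in range(len(bins)):
        # select the 150 columns (150 ms) that make up window i
        rate1 = raster_data[:,bins[i]:bins[i]+150]
        # count the spikes in the window on each trial
        rate2 = np.sum(rate1,axis=1)
        # divide by the window length in seconds to get spikes per second
        rate3 = rate2/.15
        rateMat[:,i] = rate3
    
    return rateMat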

# Example using the new function
rateMat = rate(neuron)
print(rateMat.shape)
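
To check the window arithmetic, the sketch below lists the time windows implied by the same start values and width used in `rate`, converted to 1-indexed milliseconds. The variable names here are for illustration only.

```python
import numpy as np

starts = np.arange(0, 890, 50)   # same 0-indexed window starts as in rate()
width = 150                      # window length in ms

# Convert to 1-indexed, inclusive millisecond ranges for readability.
windows = [(int(s) + 1, int(s) + width) for s in starts]

print(len(windows))   # 18
print(windows[0])     # (1, 150)
print(windows[1])     # (51, 200)
print(windows[-1])    # (851, 1000)

# A spike count becomes a rate by dividing by the window length in seconds:
# 3 spikes in a 150 ms window is about 20 spikes per second.
rate_hz = 3 / (width / 1000)
```

Because each window is 150 ms long but starts only 50 ms after the previous one, neighboring windows share 100 ms of data, so neighboring points on the final accuracy curve are not independent.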

In order to fix the problems you outlined above, you want to z-score the firing rates for this neuron. Recall that a z-score is calculated as follows:
$$z = \frac{x-\mu}{\sigma}$$ 

Where $x$ is the raw firing rate, $\mu$ is the mean firing rate, and $\sigma$ is the standard deviation of the cell's firing rate.
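
As a quick numerical illustration of the formula, the sketch below z-scores a made-up rate matrix and a copy of it scaled by 10. The helper name `z_score_demo` and the numbers are invented for this example. It uses one mean and one standard deviation for the whole matrix.

```python
import numpy as np

def z_score_demo(rates):
    """Hypothetical helper: z-score a matrix using its overall mean and std."""
    return (rates - rates.mean()) / rates.std()

# Made-up firing rates (trials x time bins) for a low-rate neuron...
low = np.array([[0.0, 10.0],
                [20.0, 30.0]])
# ...and for a neuron with the same response pattern at 10x the rate.
high = 10 * low

z_low = z_score_demo(low)
z_high = z_score_demo(high)

print(z_low.mean(), z_low.std())    # approximately 0 and 1
print(np.allclose(z_low, z_high))   # True
```

After z-scoring, the two neurons are on the same scale, so a classifier is not dominated by whichever cell happens to fire the most.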

Use the zScore function to find the z-score of your firing rate matrix.

In [None]:
def zScore(rateMat): 
    globalMean = np.mean(rateMat)
    globalSTD = np.std(rateMat)
    z = (rateMat - globalMean)/globalSTD
    return z

# Apply the zScore function to your firing rate matrix

Next, create a 3-dimensional matrix of z-scored firing rates that holds every valid neuron. This lets us decode from the entire population of neurons at each time point.

<img src="image2.jpg" alt="drawing" width="250"/>

In [None]:
# initialize data structures
trialInds = np.zeros() # hint: should be the same as the number of trials
dataMat = np.zeros() # hint: should be N-trials by M-time points by 125 valid neurons

# start a neuron count at 0
count = 0

# loop through each neuron file
for i in range():
    # Define the file path for loading the .mat files
    
    # Use the loadmat function to load the file
    
    # Defining the data stored in the .mat file
    raster_data = neuron['raster_data']
    stimID = neuron['raster_labels']['stimulus_ID']
    stimPosition = neuron['raster_labels']['stimulus_position']

    # Check if the neuron has the full number of trials
    if raster_data.shape[0] == :
        
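        # If so, calculate the rate and z-scores of the neuron
        rateMat = rate()
        z = zScore()
        
        # Fill trialInds with each trial's class index (0-6, its position in classes)
        # before sorting (hint: use np.where on stimID, as shown earlier)
        
        # We need to sort the data by trial type to make our lives easier later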
        sortedTrials = sorted(enumerate(trialInds), key=lambda x:x[1])
        sortedTrials = np.asarray(sortedTrials)
        sortedData = sortedTrials[:,0].astype(int)
        allClasses = sortedTrials[:,1].astype(int)
        
        # place sorted data into array
        dataMat[:,:,count] = z[sortedData,:]
        
        # increment valid cell counter
        count += 1
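
The sorting idiom in the cell above can be hard to read at first. The sketch below shows, on made-up labels, that it yields the same trial order as a stable `np.argsort`. This is an equivalent alternative for illustration, not a required change.

```python
import numpy as np

# Made-up class index for six trials and a matching 6 x 2 data matrix.
labels = np.array([2, 0, 1, 0, 2, 1])
data = np.arange(12).reshape(6, 2)

# Idiom from the lab: sort (trial index, label) pairs by label.
pairs = np.asarray(sorted(enumerate(labels), key=lambda x: x[1]))
order_from_pairs = pairs[:, 0].astype(int)

# Alternative: a stable argsort keeps the original trial order within each class.
order = np.argsort(labels, kind='stable')

print(order)           # [1 3 2 5 0 4]
print(labels[order])   # [0 0 1 1 2 2]
sorted_data = data[order, :]   # rows regrouped by class
```

Either way, the important point is that the same ordering is applied to the labels and to the rows of the data, so each row still lines up with its class.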

Now that we have our data structure, we will do the actual decoding. 

Our goal is to train and test a support vector machine (SVM) classifier at each of the 18 time bins. For each time bin, each of the 125 neurons is a feature, and we will use 10-fold cross-validation. We will then calculate and plot the classifier's accuracy across the population as a function of time bin.
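
For orientation, here is a minimal sketch of 10-fold cross-validation with a linear SVM on synthetic data of the same shape as one time bin (420 trials by 125 features, 7 classes). The data, noise level, and variable names are all invented. It uses scikit-learn's `KFold` and `cross_val_score` as a shortcut, whereas the lab asks you to write the fold loop by hand.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 7, 60, 125

# Synthetic labels: 60 trials for each of 7 classes, sorted by class.
y = np.repeat(np.arange(n_classes), n_per_class)

# Synthetic features: one random "template" per class plus trial-to-trial noise.
centers = rng.normal(0, 1, size=(n_classes, n_features))
X = centers[y] + rng.normal(0, 2, size=(y.size, n_features))

# shuffle=True matters here because the trials are sorted by class.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel='linear'), X, y, cv=cv)

chance = 1 / n_classes   # about 0.143 for seven equally likely classes
print(scores.shape)      # one accuracy per fold
print(scores.mean(), chance)
```

With seven equally likely classes, chance accuracy is 1/7, which is a useful reference line when you read your own accuracy curve.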

In [None]:
# Import necessary machine learning tools
from sklearn.model_selection import KFold 
from sklearn import svm
classifier = svm.SVC(kernel='linear')

# Define categories
classes = ['car','couch','face','flower','guitar','hand','kiwi']

# initialize accuracy data structure (hint: there will be one accuracy for each time bin)
totalAccuracy = np.zeros()

# loop through each of the 18 time bins
for t in range():
    
    # Define 420x125 feature matrix per time bin
    timeMat = dataMat[]
    
    # Randomly assign each of the 420 trials to one of 10 folds (0-9) for cross validation
    randInds = np.random.randint(0, high=10, size=(420))
    
    # 10-fold cross validation
    for j in range():
        
        # define testing data for this fold
        testInds = 
        testVec = allClasses[]
        testData = timeMat[]
        
        # define training data for this fold
        trainInds = 
        trainVec = allClasses[]
        trainData = timeMat[]
        
        # train the SVM classifier
        classifier.fit(trainData,trainVec)
        
        # get classifier's predictions on testing data
        predClass = classifier.predict()
        
        # Initialize data storage to calculate accuracy for each item in this fold
        accuracy = np.zeros()
        
        # Check the accuracy of each item
        for h in range():
            # Conditional to check accuracy of predClass with respect to testVec
            
    # Average over the accuracies in each fold to yield an overall accuracy for the time bin
    totalAccuracy[t] = 
    
# Plot accuracy of population with respect to each time bin
plt.figure()
plt.plot(bins, totalAccuracy)
plt.xlabel("Fill me in")
plt.ylabel("Fill me in")


If all went according to plan, your decoding graph should look qualitatively similar to the blue curve from the original paper:

<img src="image1.png" alt="drawing" width="250"/>

Note: Zhang et al. used a different type of classifier, so your accuracies will be somewhat different.

Congrats! This is a big project! Please upload this notebook to Lyceum for grading.