# Examples of all decoders (except Kalman Filter)

In this example notebook, we:
1. Import the necessary packages
2. Load a data file (spike trains and outputs we are predicting)
3. Preprocess the data for use in all decoders
4. Run all decoders and print the goodness of fit
5. Plot example decoded outputs

See "Examples_kf_decoder" for a Kalman filter example. <br>
Because the Kalman filter uses different preprocessing, we don't include it here, which keeps this notebook easier to follow.

## 1. Import Packages

Below, we import both standard packages and functions from the accompanying .py files.

In [None]:
#Import standard packages
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy import io
from scipy import stats
import pickle

# To load the '.h5' example file instead of the '.pickle' example file, you need the deepdish package
# import deepdish as dd 

#Import function to get the covariate matrix that includes spike history from previous bins
from Neural_Decoding.preprocessing_funcs import get_spikes_with_history

#Import metrics
from Neural_Decoding.metrics import get_R2
from Neural_Decoding.metrics import get_rho

from Neural_Decoding.decoders import WienerFilterClassification
from Neural_Decoding.decoders import SVClassification
from Neural_Decoding.decoders import XGBoostClassification
from Neural_Decoding.decoders import DenseNNClassification
from Neural_Decoding.decoders import SimpleRNNClassification
from Neural_Decoding.decoders import GRUClassification
from Neural_Decoding.decoders import LSTMClassification


from Neural_Decoding.decoders import WienerFilterDecoder
from Neural_Decoding.decoders import WienerFilterRegression


## 2. Load Data
The data for this example can be downloaded at this [link](https://www.dropbox.com/sh/n4924ipcfjqc0t6/AACPWjxDKPEzQiXKUUFriFkJa?dl=0&preview=example_data_s1.pickle). It was recorded by Raeed Chowdhury from Lee Miller's lab at Northwestern.


The data that we load is in the format described below. Another example notebook, "Example_format_data", may help with putting your own data in this format.

Neural data should be a matrix of size "number of time bins" x "number of neurons", where each entry is the firing rate of a given neuron in a given time bin

The output you are decoding should be a matrix of size "number of time bins" x "number of features you are decoding"
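
As a quick sanity check on this format, the sketch below builds synthetic arrays with the expected shapes. The sizes and the names `fake_neural_data` and `fake_outputs` are hypothetical and only illustrate the layout; they are not the example dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

num_time_bins = 1000   # hypothetical number of time bins
num_neurons = 52       # hypothetical number of recorded neurons
num_features = 2       # hypothetical number of decoded features (e.g. x and y velocity)

# Rows are time bins, columns are neurons
fake_neural_data = rng.poisson(lam=2.0, size=(num_time_bins, num_neurons)).astype(float)

# Rows are time bins, columns are decoded features
fake_outputs = rng.normal(size=(num_time_bins, num_features))

# Both matrices must share the same number of time bins
assert fake_neural_data.shape[0] == fake_outputs.shape[0]
print(fake_neural_data.shape, fake_outputs.shape)
```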

 

In [None]:
# folder='' #ENTER THE FOLDER THAT YOUR DATA IS IN
# folder='/home/jglaser/Data/DecData/' 
folder='/Users/jig289/Dropbox/Public/Decoding_Data/'

with open(folder+'example_data_s1.pickle','rb') as f:
    neural_data,vels_binned=pickle.load(f,encoding='latin1') #If using python 3
#     neural_data,vels_binned=pickle.load(f) #If using python 2

# #If you would prefer to load the '.h5' example file rather than the '.pickle' example file.
# data=dd.io.load(folder+'example_data_s1.h5')
# neural_data=data['neural_data']
# vels_binned=data['vels_binned']

## 3. Preprocess Data

### 3A. User Inputs
The user can define what time period to use spikes from (with respect to the output).
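
To make the window concrete: with the settings below, each output bin is decoded from 6 + 1 + 6 = 13 bins of neural data. The following is a minimal sketch of how such a windowed covariate array could be built. `build_history_windows` is a hypothetical helper written for illustration; it is only assumed to mirror the behavior of `get_spikes_with_history` (in particular, leaving edge bins without a full window as NaN).

```python
import numpy as np

def build_history_windows(neural_data, bins_before, bins_after, bins_current=1):
    """Return an array of shape (time bins, window length, neurons).

    Rows without a complete window (near the edges) are left as NaN.
    """
    num_bins, num_neurons = neural_data.shape
    window = bins_before + bins_current + bins_after
    X = np.full((num_bins, window, num_neurons), np.nan)
    for t in range(bins_before, num_bins - bins_after):
        start = t - bins_before
        X[t] = neural_data[start:start + window]
    return X

# Small made-up example: 10 time bins, 2 neurons
demo = np.arange(20, dtype=float).reshape(10, 2)
X_demo = build_history_windows(demo, bins_before=2, bins_after=1)
print(X_demo.shape)        # (10, 4, 2)

# "Flat" format: one feature per neuron / time-bin pair
X_demo_flat = X_demo.reshape(X_demo.shape[0], -1)
print(X_demo_flat.shape)   # (10, 8)
```

The 3D array is the natural input for recurrent models, while the flattened version suits models that expect one feature vector per example.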

In [None]:
bins_before=6 #How many bins of neural data prior to the output are used for decoding
bins_current=1 #Whether to use concurrent time bin of neural data
bins_after=6 #How many bins of neural data after the output are used for decoding

### 3B. Format Covariates

#### Format Input Covariates

In [None]:
# Format for recurrent neural networks (SimpleRNN, GRU, LSTM)
# Function to get the covariate matrix that includes spike history from previous bins
X=get_spikes_with_history(neural_data,bins_before,bins_after,bins_current)

# Format for Wiener Filter, Wiener Cascade, XGBoost, and Dense Neural Network
#Put in "flat" format, so each "neuron / time" is a single feature
X_flat=X.reshape(X.shape[0],(X.shape[1]*X.shape[2]))

#### Format Output Covariates

In [None]:
#Set decoding output
y=vels_binned

### 3C. Split into training / testing / validation sets
Note that hyperparameters should be determined using a separate validation set.
Then, the goodness of fit should be tested on a testing set (separate from the training and validation sets).
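
The fractional ranges in the cells below can be turned into index sets as in this sketch. `range_to_indices` is a hypothetical helper, not part of the package, and the number of time bins is made up. It trims `bins_before` bins from the start and `bins_after` bins from the end of each range so that different sets never share neural data.

```python
import numpy as np

def range_to_indices(frac_range, num_examples, bins_before, bins_after):
    """Convert a [start, end] fraction of the data into buffered row indices."""
    start = int(np.round(frac_range[0] * num_examples)) + bins_before
    stop = int(np.round(frac_range[1] * num_examples)) - bins_after
    return np.arange(start, stop)

num_examples_demo = 1000   # hypothetical number of time bins
train_idx = range_to_indices([0.2, 0.5], num_examples_demo, 6, 6)
test_idx = range_to_indices([0.7, 0.85], num_examples_demo, 6, 6)
valid_idx = range_to_indices([0.85, 1.0], num_examples_demo, 6, 6)

# The sets are disjoint, even where the fractional ranges touch (0.85)
assert len(np.intersect1d(train_idx, test_idx)) == 0
assert len(np.intersect1d(test_idx, valid_idx)) == 0
print(train_idx[[0, -1]], test_idx[[0, -1]], valid_idx[[0, -1]])
```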

#### User Options

In [None]:
#Set what part of data should be part of the training/testing/validation sets
training_range=[0.2, 0.5]
testing_range=[0.7, 0.85]
valid_range=[0.85,1]

#### Split Data

In [None]:
num_examples=X.shape[0]

#Note that each range has a buffer of "bins_before" bins at the beginning, and "bins_after" bins at the end
#This makes it so that the different sets don't include overlapping neural data
training_set=np.arange(int(np.round(training_range[0]*num_examples))+bins_before,int(np.round(training_range[1]*num_examples))-bins_after)
testing_set=np.arange(int(np.round(testing_range[0]*num_examples))+bins_before,int(np.round(testing_range[1]*num_examples))-bins_after)
valid_set=np.arange(int(np.round(valid_range[0]*num_examples))+bins_before,int(np.round(valid_range[1]*num_examples))-bins_after)

#Get training data
X_train=X[training_set,:,:]
X_flat_train=X_flat[training_set,:]
y_train=y[training_set,:]

#Get testing data
X_test=X[testing_set,:,:]
X_flat_test=X_flat[testing_set,:]
y_test=y[testing_set,:]

#Get validation data
X_valid=X[valid_set,:,:]
X_flat_valid=X_flat[valid_set,:]
y_valid=y[valid_set,:]

### 3D. Process Covariates
We normalize (z-score) the inputs and zero-center the outputs.
Parameters for z-scoring (mean/std.) should be determined on the training set only, and then these z-scoring parameters are also used on the testing and validation sets.
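
A minimal sketch of this step on synthetic data is shown here. The arrays and the helper name `zscore_with_train_stats` are hypothetical. Unlike the cell below, the sketch also replaces zero standard deviations with 1, so that a feature that is constant in the training set (for example, a neuron that never fires) does not cause a division by zero. That guard is an addition of this sketch, not something the notebook does.

```python
import numpy as np

def zscore_with_train_stats(train, *others):
    """Z-score all arrays using the mean and std of the training array only."""
    mean = np.nanmean(train, axis=0)
    std = np.nanstd(train, axis=0)
    std_safe = np.where(std == 0, 1.0, std)   # guard against constant features
    return tuple((a - mean) / std_safe for a in (train,) + others)

rng = np.random.default_rng(0)
train_demo = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
test_demo = rng.normal(loc=5.0, scale=2.0, size=(50, 3))
train_demo[:, 2] = 0.0   # a constant (silent) feature
test_demo[:, 2] = 0.0

train_z, test_z = zscore_with_train_stats(train_demo, test_demo)
print(train_z.mean(axis=0), train_z.std(axis=0))
```

The test set is not exactly zero-mean or unit-variance after this transform, which is expected: its statistics are never used.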

In [None]:
#Z-score "X" inputs. 
X_train_mean=np.nanmean(X_train,axis=0)
X_train_std=np.nanstd(X_train,axis=0)
X_train=(X_train-X_train_mean)/X_train_std
X_test=(X_test-X_train_mean)/X_train_std
X_valid=(X_valid-X_train_mean)/X_train_std

#Z-score "X_flat" inputs. 
X_flat_train_mean=np.nanmean(X_flat_train,axis=0)
X_flat_train_std=np.nanstd(X_flat_train,axis=0)
X_flat_train=(X_flat_train-X_flat_train_mean)/X_flat_train_std
X_flat_test=(X_flat_test-X_flat_train_mean)/X_flat_train_std
X_flat_valid=(X_flat_valid-X_flat_train_mean)/X_flat_train_std

#Zero-center outputs
y_train_mean=np.mean(y_train,axis=0)
y_train=y_train-y_train_mean
y_test=y_test-y_train_mean
y_valid=y_valid-y_train_mean

### 3E. Make Output Categorical

In [None]:
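#Bin the first output column (column 0 of the zero-centered outputs) into 3 classes:
#0: at or below -10, 1: above -10 and at most 10, 2: above 10
y_train_cat=(y_train[:,0]>-10).astype(int)
y_train_cat[y_train[:,0]>10]=2

y_valid_cat=(y_valid[:,0]>-10).astype(int)
y_valid_cat[y_valid[:,0]>10]=2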
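
The same three-way binning can be written with `np.digitize`, which may be easier to extend to more classes. This is a sketch on made-up values; the thresholds of -10 and 10 are taken from the cell above, and `right=True` is chosen so that values exactly equal to a threshold fall in the lower class, matching the strict `>` comparisons.

```python
import numpy as np

vel_demo = np.array([-25.0, -10.0, -3.0, 0.0, 10.0, 10.5, 40.0])  # made-up values

# Thresholding as in the notebook cell
cat_a = (vel_demo > -10).astype(int)
cat_a[vel_demo > 10] = 2

# Equivalent binning with np.digitize
cat_b = np.digitize(vel_demo, bins=[-10, 10], right=True)

print(cat_a)  # [0 0 1 1 1 2 2]
assert np.array_equal(cat_a, cat_b)
```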

## 4. Run Decoders
Note that in this example, we evaluate each model's fit on the validation set.
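
Because the three classes need not be equally frequent, raw accuracy is easier to interpret next to a majority-class baseline. The sketch below uses made-up labels; `accuracy` and `majority_baseline` are hypothetical helpers, not functions from the package.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted class matches the true class."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def majority_baseline(y_train, y_eval):
    """Accuracy obtained by always predicting the most frequent training class."""
    counts = np.bincount(y_train)
    return accuracy(y_eval, np.full(len(y_eval), counts.argmax()))

y_train_demo = np.array([1, 1, 1, 0, 2, 1, 1, 2])   # made-up training labels
y_valid_demo = np.array([1, 0, 1, 1, 2])            # made-up validation labels
y_pred_demo = np.array([1, 1, 1, 1, 2])             # made-up predictions

print(accuracy(y_valid_demo, y_pred_demo))            # 0.8
print(majority_baseline(y_train_demo, y_valid_demo))  # 0.6
```

A decoder is only informative to the extent that its validation accuracy exceeds this baseline.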

In [None]:
model=WienerFilterClassification()
model.fit(X_flat_train,y_train_cat)

y_train_predicted=model.predict(X_flat_train)
y_valid_predicted=model.predict(X_flat_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=SVClassification(max_iter=200)
model.fit(X_flat_train,y_train_cat)

y_train_predicted=model.predict(X_flat_train)
y_valid_predicted=model.predict(X_flat_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=XGBoostClassification(num_round=100)
model.fit(X_flat_train,y_train_cat)

y_train_predicted=model.predict(X_flat_train)
y_valid_predicted=model.predict(X_flat_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=DenseNNClassification()
model.fit(X_flat_train,y_train_cat)

y_train_predicted=model.predict(X_flat_train)
y_valid_predicted=model.predict(X_flat_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=SimpleRNNClassification(num_epochs=5)
model.fit(X_train,y_train_cat)

y_train_predicted=model.predict(X_train)
y_valid_predicted=model.predict(X_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=GRUClassification(num_epochs=5)
model.fit(X_train,y_train_cat)

y_train_predicted=model.predict(X_train)
y_valid_predicted=model.predict(X_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

In [None]:
model=LSTMClassification(num_epochs=5)
model.fit(X_train,y_train_cat)

y_train_predicted=model.predict(X_train)
y_valid_predicted=model.predict(X_valid)

print("Training accuracy: ", np.mean(y_train_predicted==y_train_cat))
print("Validation accuracy: ", np.mean(y_valid_predicted==y_valid_cat))

### 4A. Wiener Filter (Linear Regression)

In [None]:
#Declare model
model_wf=WienerFilterDecoder()

#Fit model
model_wf.fit(X_flat_train,y_train)

#Get predictions
y_valid_predicted_wf=model_wf.predict(X_flat_valid)

#Get metric of fit
R2s_wf=get_R2(y_valid,y_valid_predicted_wf)
print('R2s:', R2s_wf)

In [None]:
#Declare model
model_wf=WienerFilterRegression()

#Fit model
model_wf.fit(X_flat_train,y_train)

#Get predictions
y_valid_predicted_wf=model_wf.predict(X_flat_valid)

#Get metric of fit
R2s_wf=get_R2(y_valid,y_valid_predicted_wf)
print('R2s:', R2s_wf)
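
For reference, the coefficient of determination can be computed per output column as in the sketch below. `r2_per_output` is a hypothetical helper implementing the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. It is assumed, not verified, that `get_R2` follows the same definition. The data are synthetic.

```python
import numpy as np

def r2_per_output(y_true, y_pred):
    """Standard R^2, computed separately for each output column."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - np.mean(y_true, axis=0)) ** 2, axis=0)
    return 1 - ss_res / ss_tot

rng = np.random.default_rng(0)
y_true_demo = rng.normal(size=(500, 2))                        # synthetic targets
y_pred_demo = y_true_demo + 0.5 * rng.normal(size=(500, 2))    # noisy predictions

r2_demo = r2_per_output(y_true_demo, y_pred_demo)
print(r2_demo)
```

A perfect prediction gives 1 for every column, and a prediction no better than the column mean gives 0 or less.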