Neural_Decoding_sklearn

A Python package that includes many methods for decoding neural activity.

The package contains a mixture of classic decoding methods and modern machine learning methods.

For regression, we currently include: Wiener Filter, Wiener Cascade, Kalman Filter, Naive Bayes, Support Vector Regression, XGBoost, Dense Neural Network, Recurrent Neural Net, GRU, LSTM.

For classification, we currently include: Logistic Regression, Support Vector Classification, XGBoost, Dense Neural Network, Recurrent Neural Net, GRU, LSTM.

This package was originally designed for regression, and the classification functions were added only recently. The ReadMe, examples, and preprocessing functions are therefore still geared toward regression. We are in the process of adding more for classification.

What is new in Neural_Decoding_sklearn compared to its upstream?

Upstream: KordingLab/Neural_Decoding

  1. We take full advantage of scikit-learn's extensive machine learning toolkit for data preprocessing (e.g. z-scoring), scoring metrics, cross-validation, hyperparameter search, and pipelines:
    • We treat each decoder model as an sklearn estimator
    • Neural network models are no longer based on Keras but on PyTorch, with the sklearn adaptor provided by skorch
  2. The preprocessing function get_spikes_with_history is replaced by a custom sklearn transformer class LagMat which is a wrapper of the similarly named function from statsmodels.
  3. A test suite to test the new implementation (against the original functions when relevant).
  4. Because we can rely on other popular and well-maintained codebases, the new workflow needs only a few additional scripts, located in Neural_Decoding.nonlinear and Neural_Decoding.nn.
    • Neural_Decoding.decoding is approaching deprecation but is retained in the current version for testing purposes.
  5. Test data is now hosted on a different server that supports token-less download.
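
As a rough illustration of the first point, the sketch below treats a Wiener-filter-style decoder (plain `LinearRegression`, as in the table further down) as one step of a scikit-learn pipeline and scores it with cross-validation. The data are synthetic, and the array names, shapes, and the choice of `TimeSeriesSplit` are assumptions made for this example rather than the workflow used in the notebooks.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins (assumed shapes): 600 time bins, 40 flattened features,
# 2 decoded outputs such as x/y velocity.
rng = np.random.default_rng(0)
X_flat = rng.poisson(3.0, size=(600, 40)).astype(float)
true_weights = rng.normal(size=(40, 2))
y = X_flat @ true_weights + rng.normal(scale=0.5, size=(600, 2))

# z-scoring and the decoder live in one estimator, so the scaler is re-fit
# on the training portion of every fold.
decoder = make_pipeline(StandardScaler(), LinearRegression())

# TimeSeriesSplit keeps each test fold later in time than its training data.
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(decoder, X_flat, y, cv=cv, scoring="r2")
print("mean R^2 across folds:", scores.mean())
```

Because every decoder exposes the same estimator interface, swapping `LinearRegression()` for another estimator from the table should leave the rest of this snippet unchanged.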

Overview of available models

The following decoders have been tested.

| Original | Current sklearn estimator | Notes |
|---|---|---|
| WienerFilterDecoder | sklearn.linear_model.LinearRegression | |
| WienerCascadeDecoder | Neural_Decoding.nonlinear.WienerCascade | sklearn.preprocessing.PolynomialFeatures is too computationally intensive, so a wrapper that combines sklearn.linear_model.LinearRegression with numpy.polyfit is used. |
| KalmanFilterDecoder | | |
| NaiveBayesDecoder | | |
| SVRDecoder | sklearn.multioutput.MultiOutputRegressor(sklearn.svm.SVR) | MultiOutputRegressor is needed only for multi-target use. |
| XGBoostDecoder | xgboost.XGBRegressor | |
| DenseNNDecoder | skorch.NeuralNetRegressor(Neural_Decoding.nn.FNN) | |
| SimpleRNNDecoder | skorch.NeuralNetRegressor(Neural_Decoding.nn.RNN) | |
| GRUDecoder | skorch.NeuralNetRegressor(Neural_Decoding.nn.GRU) | |
| LSTMDecoder | skorch.NeuralNetRegressor(Neural_Decoding.nn.LSTM) | |
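
The Wiener Cascade note above describes a linear regression combined with numpy.polyfit. A minimal NumPy-only sketch of that idea follows. It is not the code in Neural_Decoding.nonlinear.WienerCascade; the function names `fit_wiener_cascade` and `predict_wiener_cascade` are hypothetical, and the linear stage uses `numpy.linalg.lstsq` in place of `LinearRegression` to keep the example dependency-free.

```python
import numpy as np

def fit_wiener_cascade(X_flat, y, degree=3):
    """Fit a linear stage, then one polynomial per output column."""
    # Linear stage: least squares with an intercept column appended.
    X_aug = np.column_stack([X_flat, np.ones(len(X_flat))])
    weights, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    y_lin = X_aug @ weights
    # Static nonlinearity: polynomial mapping linear output -> true output.
    polys = [np.polyfit(y_lin[:, k], y[:, k], degree)
             for k in range(y.shape[1])]
    return weights, polys

def predict_wiener_cascade(X_flat, weights, polys):
    """Apply the linear stage followed by the per-output polynomials."""
    X_aug = np.column_stack([X_flat, np.ones(len(X_flat))])
    y_lin = X_aug @ weights
    return np.column_stack([np.polyval(p, y_lin[:, k])
                            for k, p in enumerate(polys)])

# Synthetic example: outputs are a saturating function of a linear drive.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 6))
y_demo = np.tanh(X_demo @ rng.normal(size=(6, 2)))
weights, polys = fit_wiener_cascade(X_demo, y_demo, degree=3)
y_hat = predict_wiener_cascade(X_demo, weights, polys)
print("training MSE:", np.mean((y_hat - y_demo) ** 2))
```

Fitting the polynomial on the one-dimensional linear output needs only degree + 1 coefficients per output, whereas expanding every input feature polynomially grows rapidly with the number of features.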

Usage Examples

Two accompanying notebooks walk through our current workflow.

  • Examples_sklearn_decoders.ipynb: Run each decoder separately, based on Examples_original/Examples_all_decoders.ipynb.
  • Examples_sklearn_cross_validation.ipynb: Run multiple decoders with cross validation, a direct comparison with Paper_code/ManyDecoders_FullData.ipynb.

Our manuscript and datasets

This package accompanies a manuscript that compares the performance of these methods on several datasets. We would appreciate it if you cited that manuscript when using our code or data in your research.

Code used for the paper is in the "Paper_code" folder. It is described further at the bottom of this read-me.

All 3 datasets (motor cortex, somatosensory cortex, and hippocampus) used in the paper can be downloaded here. They are provided in both MATLAB and Python formats, and can be used in the example files described below.

Installation

This package can be installed via pip at the command line by typing

pip install 'Neural_Decoding @ git+https://github.com/theowoo/Neural_Decoding_sklearn.git'

or via SSH

pip install 'Neural_Decoding @ git+ssh://git@github.com/theowoo/Neural_Decoding_sklearn.git'

We've designed the code so that not all machine learning packages need to be installed for the others to work.

Dependency groups

Specific dependencies may be specified. full will install all dependencies:

pip install 'Neural_Decoding @ git+https://github.com/theowoo/Neural_Decoding_sklearn.git' --group full

Or combined:

pip install 'Neural_Decoding @ git+https://github.com/theowoo/Neural_Decoding_sklearn.git' --group nn --group xgboost

  • To run all the decoders based on neural networks, use nn.
  • To run the XGBoost Decoder, use xgboost.
  • To run the Wiener Filter, Wiener Cascade, or Support Vector Regression, use sklearn.
  • To do hyperparameter optimization, use hyperopt.
  • To run tests, use test.
  • For all of the above, use full.
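
Since the optional dependencies are independent, it can help to check which ones are importable before picking a decoder. The sketch below uses only the standard library. The mapping from group name to module names is an assumption made for illustration; the authoritative lists are in the project's packaging metadata.

```python
from importlib.util import find_spec

# Hypothetical mapping from dependency group to the top-level modules it is
# assumed to provide; check the project's packaging metadata for the real lists.
GROUP_MODULES = {
    "nn": ["torch", "skorch"],
    "xgboost": ["xgboost"],
    "sklearn": ["sklearn"],
    "hyperopt": ["hyperopt"],
}

def available_groups(group_modules=GROUP_MODULES):
    """Return {group: bool}, True when every module of the group is importable."""
    return {
        group: all(find_spec(name) is not None for name in modules)
        for group, modules in group_modules.items()
    }

for group, ok in available_groups().items():
    print(f"{group:10s} {'installed' if ok else 'missing'}")
```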

Getting started

We have included jupyter notebooks that provide detailed examples of how to use the decoders.

  • The file central_concepts_in_ML_for_decoding.ipynb is designed for users who are new to machine learning. It builds basic concepts and shows some examples, and also has several exercises to make sure you know your stuff. (Link to the solutions is inside).
  • The file Examples_kf_decoder.ipynb is for the Kalman filter decoder
  • The file Examples_all_decoders.ipynb is for all other decoders. These examples work well with the somatosensory and motor cortex datasets.
  • There are minor differences in the hippocampus dataset, so we have included a folder, Examples_hippocampus, with analogous example files. This folder also includes an example file for using the Naive Bayes decoder (since it works much better on our hippocampus dataset).
  • We have also included a notebook, Example_hyperparam_opt.ipynb, that demonstrates how to do hyperparameter optimization for the decoders.

Here we provide a basic example where we are using an LSTM decoder.
For this example we assume we have already loaded matrices:

  • "neural_data": a matrix of size "total number of time bins" x "number of neurons," where each entry is the firing rate of a given neuron in a given time bin.
  • "y": the output variable that you are decoding (e.g. velocity), and is a matrix of size "total number of time bins" x "number of features you are decoding."

We have provided a Jupyter notebook, Example_format_data.ipynb, with an example of how to get Matlab data into this format.

First we will import the necessary functions

from Neural_Decoding.decoders import LSTMDecoder #Import LSTM decoder
from Neural_Decoding.preprocessing_funcs import get_spikes_with_history #Import function to get the covariate matrix that includes spike history from previous bins

Next, we will define the time period we are using spikes from (relative to the output we are decoding)

bins_before=13 #How many bins of neural data prior to the output are used for decoding
bins_current=1 #Whether to use concurrent time bin of neural data
bins_after=0 #How many bins of neural data after the output are used for decoding

Next, we will compute the covariate matrix that includes the spike history from previous bins

# Function to get the covariate matrix that includes spike history from previous bins
X=get_spikes_with_history(neural_data,bins_before,bins_after,bins_current)

In this basic example, we will ignore some additional preprocessing we do in the example notebooks. Let's assume we have now divided the data into a training set (X_train, y_train) and a testing set (X_test,y_test).
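
For readers who want something concrete for the split assumed above, here is a minimal sketch of a chronological train/test split followed by z-scoring with training-set statistics. The helper names and the 80/20 fraction are assumptions for this example and are not taken from the notebooks.

```python
import numpy as np

def chronological_split(X, y, train_frac=0.8):
    """Split along time without shuffling: early bins train, late bins test."""
    n_train = int(len(X) * train_frac)
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]

def zscore_with_train_stats(X_train, X_test):
    """z-score both sets using the mean and std of the training set only."""
    mean = np.nanmean(X_train, axis=0)
    std = np.nanstd(X_train, axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant features unscaled
    return (X_train - mean) / std, (X_test - mean) / std

# Tiny example with 10 time bins and 4 features.
X_demo = np.arange(40, dtype=float).reshape(10, 4)
y_demo = np.arange(10, dtype=float).reshape(10, 1)
X_train, X_test, y_train, y_test = chronological_split(X_demo, y_demo)
X_train, X_test = zscore_with_train_stats(X_train, X_test)
print(X_train.shape, X_test.shape)
```

Computing the statistics on the training set alone keeps information about the test period out of the fitted model.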

We will now finally train and test the decoder:

#Declare model and set parameters of the model
model_lstm=LSTMDecoder(units=400,num_epochs=5)

#Fit model
model_lstm.fit(X_train,y_train)

#Get predictions
y_test_predicted_lstm=model_lstm.predict(X_test)

What's Included

There are 3 files with functions. An overview of the functions is below. More details can be found in the comments within the files.

decoders.py:

This file provides all of the decoders. Each decoder is a class with functions "fit" and "predict".

First, we will describe the format of data that is necessary for the decoders

  • For all the decoders, you will need to decide the time period of spikes (relative to the output) that you are using for decoding.
  • For all the decoders other than the Kalman filter, you can set "bins_before" (the number of bins of spikes preceding the output), "bins_current" (whether to use the bin of spikes concurrent with the output), and "bins_after" (the number of bins of spikes after the output). Let "surrounding_bins" = bins_before+bins_current+bins_after. This allows us to get a 3d covariate matrix "X" that has size "total number of time bins" x "surrounding_bins" x "number of neurons." We use this input format for the recurrent neural networks (SimpleRNN, GRU, LSTM). We can also flatten the matrix, so that there is a vector of features for every time bin, to get "X_flat" which is a 2d matrix of size "total number of time bins" x "surrounding_bins x number of neurons." This input format is used for the Wiener Filter, Wiener Cascade, Support Vector Regression, XGBoost, and Dense Neural Net.
  • For the Kalman filter, you can set the "lag" - what time bin of the neural data (relative to the output) is used to predict the output. The input format for the Kalman filter is simply the 2d matrix of size "total number of time bins" x "number of neurons," where each entry is the firing rate of a given neuron in a given time bin.
  • The output, "y" is a 2d matrix of size "total number of time bins" x "number of output features."
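
The shapes described above can be made concrete with a small NumPy sketch. `build_history_matrix` is a hypothetical helper written for this example, not the package's get_spikes_with_history or LagMat. In this sketch, rows whose window would run past either end of the recording are left as NaN.

```python
import numpy as np

def build_history_matrix(neural_data, bins_before, bins_after, bins_current=1):
    """Return X with shape (n_bins, surrounding_bins, n_neurons)."""
    n_bins, n_neurons = neural_data.shape
    surrounding_bins = bins_before + bins_current + bins_after
    X = np.full((n_bins, surrounding_bins, n_neurons), np.nan)
    # Window offsets relative to the output bin, e.g. [-2, -1, 0, 1];
    # offset 0 is dropped when bins_current is 0.
    offsets = [o for o in range(-bins_before, bins_after + 1)
               if o != 0 or bins_current]
    for t in range(bins_before, n_bins - bins_after):
        X[t] = neural_data[[t + o for o in offsets]]
    return X

# 6 time bins x 2 neurons; entry values encode (bin, neuron) for readability.
neural_data = np.arange(12, dtype=float).reshape(6, 2)
X = build_history_matrix(neural_data, bins_before=2, bins_after=1)
X_flat = X.reshape(X.shape[0], -1)  # (n_bins, surrounding_bins * n_neurons)
print(X.shape, X_flat.shape)
```

With 2 bins before, the current bin, and 1 bin after, `X[2]` holds rows 0 to 3 of `neural_data`, and `X_flat[2]` is the same window unrolled into a single feature vector.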


Here are all the decoders within "decoders.py" for performing regression:

  1. WienerFilterDecoder
    • The Wiener Filter is simply multiple linear regression using X_flat as an input.
    • It has no input parameters.
  2. WienerCascadeDecoder
    • The Wiener Cascade (also known as a linear-nonlinear model) fits a linear regression (the Wiener filter) followed by fitting a static nonlinearity.
    • It has parameter degree (the degree of the polynomial used for the nonlinearity).
  3. KalmanFilterDecoder
    • We used a Kalman filter similar to that implemented in Wu et al. 2003. In the Kalman filter, the measurement was the neural spike trains, and the hidden state was the kinematics.
    • We have one parameter C (which is not in the previous implementation). This parameter scales the noise matrix associated with the transition in kinematic states. It effectively allows changing the weight of the new neural evidence in the current update.
  4. NaiveBayesDecoder
    • We used a Naive Bayes decoder similar to that implemented in Zhang et al. 1998 (see manuscript for details).
    • It has parameters encoding_model (for either a linear or quadratic encoding model) and res (to set the resolution of predicted values).
  5. SVRDecoder
    • This decoder uses support vector regression using X_flat as an input.
    • It has parameters C (the penalty of the error term) and max_iter (the maximum number of iterations).
    • It works best when the output ("y") has been normalized.
  6. XGBoostDecoder
    • We used the Extreme Gradient Boosting (XGBoost) algorithm to relate X_flat to the outputs. XGBoost is based on the idea of boosted trees.
    • It has parameters max_depth (the maximum depth of the trees), num_round (the number of trees that are fit), eta (the learning rate), and gpu (if you have the gpu version of XGBoost installed, you can select which gpu to use).
  7. DenseNNDecoder
    • Using the Keras library, we created a dense feedforward neural network that uses X_flat to predict the outputs. It can have any number of hidden layers.
    • It has parameters units (the number of units in each layer), dropout (the proportion of units that get dropped out), num_epochs (the number of epochs used for training), and verbose (whether to display progress of the fit after each epoch).
  8. SimpleRNNDecoder
    • Using the Keras library, we created a neural network architecture where the spiking input (from matrix X) was fed into a standard recurrent neural network (RNN) with a relu activation. The units from this recurrent layer were fully connected to the output layer.
    • It has parameters units, dropout, num_epochs, and verbose.
  9. GRUDecoder
    • Using the Keras library, we created a neural network architecture where the spiking input (from matrix X) was fed into a network of gated recurrent units (GRUs; a more sophisticated RNN). The units from this recurrent layer were fully connected to the output layer.
    • It has parameters units, dropout, num_epochs, and verbose.
  10. LSTMDecoder
    • All methods were the same as for the GRUDecoder, except Long Short Term Memory networks (LSTMs; another more sophisticated RNN) were used rather than GRUs.
    • It has parameters units, dropout, num_epochs, and verbose.
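
To make the role of the Kalman filter parameter C more tangible, here is a compact NumPy sketch of a Kalman filter decoder in the spirit of the description above. It is an illustration under stated assumptions, not the package's KalmanFilterDecoder: the function names are hypothetical, inputs are assumed to be zero-centered, and dividing the transition noise by C is one way of implementing the scaling described in the list.

```python
import numpy as np

def fit_kalman(states, obs, C=1.0):
    """states: (T, n_states) kinematics; obs: (T, n_neurons) firing rates."""
    X, Z = states.T, obs.T                      # columns are time points
    X1, X2 = X[:, :-1], X[:, 1:]
    A = X2 @ X1.T @ np.linalg.pinv(X1 @ X1.T)   # state transition matrix
    resid_x = X2 - A @ X1
    # Transition noise covariance; larger C shrinks it (assumed direction).
    W = resid_x @ resid_x.T / X1.shape[1] / C
    H = Z @ X.T @ np.linalg.pinv(X @ X.T)       # observation (tuning) matrix
    resid_z = Z - H @ X
    Q = resid_z @ resid_z.T / X.shape[1]        # observation noise covariance
    return A, W, H, Q

def kalman_decode(obs, x0, params):
    """Run the filter over obs (T, n_neurons), starting from state x0."""
    A, W, H, Q = params
    n_states = A.shape[0]
    x, P = x0.copy(), np.zeros((n_states, n_states))
    decoded = np.empty((len(obs), n_states))
    decoded[0] = x
    for t in range(1, len(obs)):
        x_prior = A @ x                          # predict
        P_prior = A @ P @ A.T + W
        K = P_prior @ H.T @ np.linalg.pinv(H @ P_prior @ H.T + Q)
        x = x_prior + K @ (obs[t] - H @ x_prior)  # update with neural evidence
        P = (np.eye(n_states) - K @ H) @ P_prior
        decoded[t] = x
    return decoded

# Synthetic check: a slowly rotating 2-D state observed through 10 "neurons".
rng = np.random.default_rng(0)
theta = 0.1
A_true = 0.98 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta), np.cos(theta)]])
H_true = rng.normal(size=(10, 2))
states = np.zeros((500, 2))
for t in range(1, 500):
    states[t] = A_true @ states[t - 1] + rng.normal(scale=0.1, size=2)
obs = states @ H_true.T + rng.normal(scale=0.5, size=(500, 10))

params = fit_kalman(states[:400], obs[:400], C=1.0)
decoded = kalman_decode(obs[400:], states[400], params)
print("test MSE:", np.mean((decoded - states[400:]) ** 2))
```

In this sketch a larger C makes the transition model look more certain, so the filter leans more on its own prediction and less on each new bin of neural data.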

When designing the XGBoost and neural network decoders, there were many additional parameters that could have been exposed (e.g. regularization). For ease of use, we only included parameters that were sufficient for producing good fits.

metrics.py:

The file has functions for metrics to evaluate model fit. It currently has functions to calculate:

  • $R^2$: The coefficient of determination
  • $\rho$: The Pearson correlation coefficient
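
Both metrics are straightforward to compute per output column. The sketch below shows one way to do so with NumPy. The function names are chosen for this example and are not necessarily those in metrics.py.

```python
import numpy as np

def r2_per_output(y_true, y_pred):
    """Coefficient of determination for each output column."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

def pearson_per_output(y_true, y_pred):
    """Pearson correlation coefficient for each output column."""
    return np.array([np.corrcoef(y_true[:, k], y_pred[:, k])[0, 1]
                     for k in range(y_true.shape[1])])

y_true = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
y_shifted = y_true + 1.0  # perfectly correlated, but biased by a constant
print(r2_per_output(y_true, y_shifted))
print(pearson_per_output(y_true, y_shifted))
```

The shifted prediction shows how the two metrics differ: the correlation ignores a constant offset, while the coefficient of determination penalizes it.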

preprocessing_funcs.py:

The file contains functions for preprocessing data that may be useful for putting the neural activity and outputs in the correct format for our decoding functions.

  • bin_spikes: converts spike times to the number of spikes within time bins
  • bin_output: converts a continuous stream of outputs to the average output within time bins
  • get_spikes_with_history: using binned spikes as input, this function creates a covariate matrix of neural data that incorporates spike history
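
As a rough sketch of what the two binning steps involve, the NumPy functions below count spikes per bin and average a sampled output within each bin. They are illustrative stand-ins with hypothetical names and signatures, not the package's bin_spikes and bin_output, and they assume the time window spans a whole number of bins.

```python
import numpy as np

def bin_spike_times(spike_times, dt, t_start, t_end):
    """spike_times: list of 1-D arrays (one per neuron) -> (n_bins, n_neurons)."""
    n_bins = int(round((t_end - t_start) / dt))  # assumes a whole number of bins
    edges = t_start + dt * np.arange(n_bins + 1)
    counts = [np.histogram(times, bins=edges)[0] for times in spike_times]
    return np.column_stack(counts)

def bin_continuous_output(values, value_times, dt, t_start, t_end):
    """Average samples (n_samples, n_features) within each time bin."""
    n_bins = int(round((t_end - t_start) / dt))
    bin_index = np.floor((value_times - t_start) / dt).astype(int)
    binned = np.full((n_bins, values.shape[1]), np.nan)
    for b in range(n_bins):
        in_bin = bin_index == b
        if in_bin.any():                 # bins without samples stay NaN
            binned[b] = values[in_bin].mean(axis=0)
    return binned

# Two neurons and one output feature, binned at 0.5 s from 0 to 1.5 s.
spike_times = [np.array([0.1, 0.2, 0.7]), np.array([0.6, 1.4])]
neural_data = bin_spike_times(spike_times, dt=0.5, t_start=0.0, t_end=1.5)

value_times = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25])
values = np.array([[0.0], [2.0], [4.0], [6.0], [8.0], [10.0]])
y = bin_continuous_output(values, value_times, dt=0.5, t_start=0.0, t_end=1.5)
print(neural_data)
print(y)
```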

Paper code

In the folder "Paper_code", we include code used for the manuscript.

  • Files starting with "ManyDecoders" use all decoders except the Kalman Filter and Naive Bayes
  • Files starting with "KF" use the Kalman filter
  • Files starting with "BayesDecoder" use the Naive Bayes decoder
  • Files starting with "Plot" create the figures in the paper
  • Files ending with "FullData" are for figures 3/4
  • Files ending with "DataAmt" are for figures 5/6
  • Files ending with "FewNeurons" are for figure 7
  • Files ending with "BinSize" are for figure 8
  • Files mentioning "Hyperparams" are for figure 9
