<a href="https://githubtocolab.com/neurallatents/nlb_tools/blob/main/examples/tutorials/slds_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SLDS Demo

In this notebook, we will use a switching linear dynamical system (SLDS) to model the neural data. We will use the Linderman Lab's [`ssm` package](https://github.com/lindermanlab/ssm), which you should install before running this demo. We recommend first viewing `basic_example.ipynb` for more explanation of the `nlb_tools` functions we use here.

## 1. Setup

Below, we import the necessary functions from `nlb_tools` and additional standard packages.

In [1]:
## Install packages if necessary
# !pip install git+https://github.com/lindermanlab/ssm
# !pip install -U scikit-learn
# !pip install nlb-tools

In [2]:
## Imports

from nlb_tools.nwb_interface import NWBDataset
from nlb_tools.make_tensors import make_train_input_tensors, make_eval_input_tensors, make_eval_target_tensors, save_to_h5
from nlb_tools.evaluation import evaluate

import ssm
import numpy as np
import h5py
import sys

In [3]:
## If necessary, download dataset from DANDI
# !pip install dandi
# !dandi download https://dandiarchive.org/dandiset/000138 # replace URL with URL for dataset you want
# # URLs are:
# # - MC_Maze: https://dandiarchive.org/dandiset/000128
# # - MC_RTT: https://dandiarchive.org/dandiset/000129
# # - Area2_Bump: https://dandiarchive.org/dandiset/000127
# # - DMFC_RSG: https://dandiarchive.org/dandiset/000130
# # - MC_Maze_Large: https://dandiarchive.org/dandiset/000138
# # - MC_Maze_Medium: https://dandiarchive.org/dandiset/000139
# # - MC_Maze_Small: https://dandiarchive.org/dandiset/000140

## 2. Loading data

Below, enter the path to the dataset and the dataset name to load the data. You can also choose the bin width (5 or 20 ms) at which to run the notebook; it is set in the dataset preparation cell of Section 3.

In [4]:
## Load dataset

dataset_name = 'mc_maze_small'
datapath = './000140/sub-Jenkins/'
prefix = '*ses-small'
dataset = NWBDataset(datapath, prefix)

## 3. Input prep

`ssm` expects inputs as a list of 2d arrays of type `int`, so we will use functions from `make_tensors` to create 3d arrays and then split them along the trial axis to get our list. Since an SLDS can perform forward prediction, we pass `include_forward_pred=True` to `make_train_input_tensors`. This returns the next 200 ms of spiking activity after the required window for each trial, in separate tensors called `'train_spikes_heldin_forward'` and `'train_spikes_heldout_forward'`.

In [5]:
## Dataset preparation

# Choose the phase here, either 'val' or 'test'
phase = 'val'

# Choose bin width and resample
bin_width = 5
dataset.resample(bin_width)

# Create suffix for group naming later
suffix = '' if (bin_width == 5) else f'_{int(round(bin_width))}'
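As a small illustration of the naming rule above, the sketch below shows which group name each bin width produces. The helper `group_suffix` is a hypothetical name introduced here; it only mirrors the expression in the cell above.

```python
def group_suffix(bin_width_ms):
    """Mirror the suffix rule used above: no suffix at 5 ms, '_<width>' otherwise."""
    return '' if bin_width_ms == 5 else f'_{int(round(bin_width_ms))}'

# Group names for the two bin widths mentioned in this notebook
group_names = {bw: 'mc_maze_small' + group_suffix(bw) for bw in (5, 20)}
print(group_names)  # {5: 'mc_maze_small', 20: 'mc_maze_small_20'}
```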

In [6]:
## Make train input data

# Generate input tensors
train_trial_split = 'train' if (phase == 'val') else ['train', 'val']
train_dict = make_train_input_tensors(dataset, dataset_name=dataset_name, trial_split=train_trial_split, save_file=False, include_forward_pred=True)

# Unpack input data
train_spikes_heldin = train_dict['train_spikes_heldin']
train_spikes_heldout = train_dict['train_spikes_heldout']

In [7]:
## Make eval input data

# Generate input tensors
eval_trial_split = phase
eval_dict = make_eval_input_tensors(dataset, dataset_name=dataset_name, trial_split=eval_trial_split, save_file=False)

# Unpack data
eval_spikes_heldin = eval_dict['eval_spikes_heldin']

In [8]:
## Prep input

# Combine train spiking data into one array
train_spikes_heldin = train_dict['train_spikes_heldin']
train_spikes_heldout = train_dict['train_spikes_heldout']
train_spikes_heldin_fp = train_dict['train_spikes_heldin_forward']
train_spikes_heldout_fp = train_dict['train_spikes_heldout_forward']
train_spikes = np.concatenate([
    np.concatenate([train_spikes_heldin, train_spikes_heldin_fp], axis=1),
    np.concatenate([train_spikes_heldout, train_spikes_heldout_fp], axis=1),
], axis=2)

# Fill missing test spiking data with zeros and make masks
eval_spikes_heldin = eval_dict['eval_spikes_heldin']
eval_spikes = np.full((eval_spikes_heldin.shape[0], train_spikes.shape[1], train_spikes.shape[2]), 0.0)
masks = np.full((eval_spikes_heldin.shape[0], train_spikes.shape[1], train_spikes.shape[2]), False)
eval_spikes[:, :eval_spikes_heldin.shape[1], :eval_spikes_heldin.shape[2]] = eval_spikes_heldin
masks[:, :eval_spikes_heldin.shape[1], :eval_spikes_heldin.shape[2]] = True

# Make lists of arrays
train_datas = [train_spikes[i, :, :].astype(int) for i in range(len(train_spikes))]
eval_datas = [eval_spikes[i, :, :].astype(int) for i in range(len(eval_spikes))]
eval_masks = [masks[i, :, :].astype(bool) for i in range(len(masks))]

num_heldin = train_spikes_heldin.shape[2]
tlen = train_spikes_heldin.shape[1]
num_train = len(train_datas)
num_eval = len(eval_datas)
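
To make the padding and masking step easier to follow, here is a minimal sketch on toy arrays. All sizes and the `toy_*` names are assumptions chosen for illustration; only the pattern (zero-fill, mark observed entries, split into per-trial 2d integer arrays) follows the cell above.

```python
import numpy as np

# Toy sizes (assumptions for illustration only)
n_trials, t_obs, t_fwd = 2, 4, 2      # observed bins and forward-prediction bins
n_heldin, n_heldout = 3, 2            # held-in and held-out neurons

rng = np.random.default_rng(0)
toy_eval_heldin = rng.poisson(1.0, size=(n_trials, t_obs, n_heldin))

# Zero-filled container covering every time bin and every neuron
toy_eval = np.zeros((n_trials, t_obs + t_fwd, n_heldin + n_heldout))
toy_mask = np.zeros(toy_eval.shape, dtype=bool)

# Only the held-in neurons in the observed window are real data
toy_eval[:, :t_obs, :n_heldin] = toy_eval_heldin
toy_mask[:, :t_obs, :n_heldin] = True

# One 2d integer array (time x neurons) per trial, plus matching masks
toy_datas = [trial.astype(int) for trial in toy_eval]
toy_masks = list(toy_mask)

print(toy_masks[0].astype(int))  # 1 = observed, 0 = to be inferred by the model
```

The mask tells the model which entries are observed; the zeros elsewhere are placeholders rather than real spike counts.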

## 4. Running SLDS

Now that we have our input data prepared, we can fit an SLDS to it. Feel free to vary the parameters as you see fit.
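
For intuition about what the parameters `K`, `D`, and `N` control, the following is a toy simulation of an SLDS with Poisson emissions and a log link, written in plain NumPy. It is a conceptual sketch under assumed, randomly drawn parameters and does not use or reproduce `ssm`'s implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, N, T = 3, 2, 5, 50   # toy sizes, much smaller than in the notebook

# Hypothetical parameters, drawn at random for illustration
P = np.full((K, K), 0.05 / (K - 1))        # discrete-state transition matrix
np.fill_diagonal(P, 0.95)                  # states tend to persist
A = np.stack([0.9 * np.eye(D) + 0.05 * rng.standard_normal((D, D)) for _ in range(K)])
b = 0.1 * rng.standard_normal((K, D))      # per-state dynamics offsets
C = 0.5 * rng.standard_normal((N, D))      # emission weights
d = np.log(0.5) * np.ones(N)               # emission bias (log baseline rate)

z = np.zeros(T, dtype=int)                 # discrete states
x = np.zeros((T, D))                       # continuous latent states
for t in range(1, T):
    z[t] = rng.choice(K, p=P[z[t - 1]])    # switch (or stay) according to P
    x[t] = A[z[t]] @ x[t - 1] + b[z[t]] + 0.1 * rng.standard_normal(D)

rates = np.exp(x @ C.T + d)                # log link: rate = exp(C x + d)
y = rng.poisson(rates)                     # spike counts, shape (T, N)
```

Each discrete state selects its own linear dynamics for the continuous latent state, and the spike counts depend on that latent state only through the emission model.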

In [9]:
## Run SLDS

# Set parameters
T = train_datas[0].shape[0] # trial length
K = 3 # number of discrete states
D = 15 # dimensionality of latent states
N = train_datas[0].shape[1] # input dimensionality

slds = ssm.SLDS(N, K, D,
    transitions='standard',
    emissions='poisson',
    emission_kwargs=dict(link="log"),
    dynamics_kwargs={
        'l2_penalty_A': 3000.0,
    }
)

# Train
q_elbos_lem_train, q_lem_train = slds.fit(
    datas=train_datas,
    method="laplace_em",
    variational_posterior="structured_meanfield",
    num_init_iters=25, num_iters=25, alpha=0.2,
)

# Pass eval data
q_elbos_lem_eval, q_lem_eval = slds.approximate_posterior(
    datas=eval_datas,
    masks=eval_masks,
    method="laplace_em",
    variational_posterior="structured_meanfield",
    num_iters=25, alpha=0.2,
)

Initializing with an ARHMM using 25 steps of EM.

## 5. Generating rate predictions

We now have estimates of the continuous neural population state, which we use to predict neuron firing rates. `SLDS` does this with its `smooth` method, which returns the expected observations given the inferred states.

In [10]:
## Generate rate predictions

# Smooth observations using inferred states
train_rates = [slds.smooth(q_lem_train.mean_continuous_states[i], train_datas[i]) for i in range(num_train)]
eval_rates = [slds.smooth(q_lem_eval.mean_continuous_states[i], eval_datas[i], mask=eval_masks[i]) for i in range(num_eval)]

# Reshape output
train_rates = np.stack(train_rates)
eval_rates = np.stack(eval_rates)

train_rates_heldin = train_rates[:, :tlen, :num_heldin]
train_rates_heldout = train_rates[:, :tlen, num_heldin:]
eval_rates_heldin = eval_rates[:, :tlen, :num_heldin]
eval_rates_heldout = eval_rates[:, :tlen, num_heldin:]
eval_rates_heldin_forward = eval_rates[:, tlen:, :num_heldin]
eval_rates_heldout_forward = eval_rates[:, tlen:, num_heldin:]
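
The slicing above can be checked on a toy tensor. The sizes below are assumptions for illustration; the point is that the four blocks (held-in or held-out neurons, observed or forward-prediction window) tile the stacked rates tensor without overlap.

```python
import numpy as np

# Toy sizes (assumptions): 4 observed bins, 2 forward bins, 3 held-in, 2 held-out neurons
n_trials, tlen, fp_len, num_heldin, num_heldout = 2, 4, 2, 3, 2
toy_rates = np.arange(n_trials * (tlen + fp_len) * (num_heldin + num_heldout), dtype=float)
toy_rates = toy_rates.reshape(n_trials, tlen + fp_len, num_heldin + num_heldout)

blocks = {
    'heldin': toy_rates[:, :tlen, :num_heldin],
    'heldout': toy_rates[:, :tlen, num_heldin:],
    'heldin_forward': toy_rates[:, tlen:, :num_heldin],
    'heldout_forward': toy_rates[:, tlen:, num_heldin:],
}
for name, block in blocks.items():
    print(name, block.shape)

# The four blocks together cover every entry of the full tensor exactly once
assert sum(block.size for block in blocks.values()) == toy_rates.size
```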

## 6. Making the submission

Now, we'll make the submission dict manually. As described in `basic_example.ipynb`, you can also use the function `save_to_h5` from `make_tensors.py` to save the output as an h5 file for submission on EvalAI.

In [11]:
## Prepare submission data

output_dict = {
    dataset_name + suffix: {
        'train_rates_heldin': train_rates_heldin,
        'train_rates_heldout': train_rates_heldout,
        'eval_rates_heldin': eval_rates_heldin,
        'eval_rates_heldout': eval_rates_heldout,
        'eval_rates_heldin_forward': eval_rates_heldin_forward,
        'eval_rates_heldout_forward': eval_rates_heldout_forward,
    }
}

# To save as an h5 file:
# save_to_h5(output_dict, 'submission.h5')
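
Before saving, it can help to sanity-check the rate arrays. The helper below, `check_rates`, is a hypothetical function written for this example and is not part of `nlb_tools`; it only looks for non-finite or negative values and for mismatched trial counts within the train and eval groups.

```python
import numpy as np

def check_rates(group):
    """Hypothetical sanity check: finite, non-negative rates and consistent trial counts."""
    problems = []
    for key, arr in group.items():
        arr = np.asarray(arr)
        if not np.all(np.isfinite(arr)):
            problems.append(f'{key}: non-finite values')
        elif np.any(arr < 0):
            problems.append(f'{key}: negative rates')
    for split in ('train', 'eval'):
        counts = {np.asarray(a).shape[0] for k, a in group.items() if k.startswith(split)}
        if len(counts) > 1:
            problems.append(f'{split}: inconsistent trial counts {sorted(counts)}')
    return problems

# Toy group with one deliberately broken array (assumed shapes, for illustration)
toy_group = {
    'train_rates_heldin': np.ones((4, 6, 3)),
    'train_rates_heldout': np.ones((4, 6, 2)),
    'eval_rates_heldin': np.ones((2, 6, 3)),
    'eval_rates_heldout': np.full((2, 6, 2), np.nan),
}
print(check_rates(toy_group))  # ['eval_rates_heldout: non-finite values']
```

In the notebook, one could call it as `check_rates(output_dict[dataset_name + suffix])`.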

## 7. Evaluation

Finally, we will create the evaluation targets with `make_eval_target_tensors` and evaluate our model.

In [12]:
## Make data to test predictions with and evaluate

if phase == 'val':
    target_dict = make_eval_target_tensors(dataset, dataset_name=dataset_name, train_trial_split='train', eval_trial_split='val', include_psth=('mc_rtt' not in dataset_name), save_file=False)

    print(evaluate(target_dict, output_dict))

Zero rate predictions found. Replacing zeros with 1e-9


[{'mc_maze_scaling_split': {'[100] co-bps': 0.15331211388512617, '[100] vel R2': 0.6514445468827901, '[100] psth R2': 0.21864536599656217, '[100] fp-bps': -2.7198655704068924}}]
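
To help interpret the bits-per-spike numbers, including the negative `fp-bps` value, here is a simplified sketch of how such a metric is commonly computed: the Poisson log-likelihood gain of the predicted rates over a per-neuron mean-rate model, normalized by spike count and converted to bits. This is an assumption-laden illustration on synthetic data, not the `nlb_tools` implementation, and the function names are introduced here.

```python
import numpy as np

def poisson_log_lik(rates, spikes):
    """Poisson log-likelihood up to the log(k!) term, which cancels in differences."""
    rates = np.clip(rates, 1e-9, None)   # avoid log(0), cf. the warning printed above
    return np.sum(spikes * np.log(rates) - rates)

def bits_per_spike(rates, spikes):
    """Log-likelihood gain over a per-neuron mean-rate model, in bits per spike."""
    null_rates = np.broadcast_to(spikes.mean(axis=(0, 1), keepdims=True), spikes.shape)
    gain = poisson_log_lik(rates, spikes) - poisson_log_lik(null_rates, spikes)
    return gain / (spikes.sum() * np.log(2))

rng = np.random.default_rng(0)
true_rates = rng.uniform(0.1, 2.0, size=(20, 50, 4))   # trials x time x neurons
spikes = rng.poisson(true_rates)

print(bits_per_spike(true_rates, spikes))                    # informative rates: positive
print(bits_per_spike(np.full(spikes.shape, 5.0), spikes))    # poor rates: negative
```

A negative value means the predicted rates explain the spikes worse than simply predicting each neuron's mean rate.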


## Summary

In this notebook, we used `nlb_tools` and `ssm` to run and evaluate SLDS on our benchmark.