## DataModule Configuration
The first step in applying lfads-torch to your dataset is to save your preprocessed data in the HDF5 format. Store each array with shape n_samples x n_timesteps x n_channels under the following keys:

- train_encod_data: Data to be used as input when training the model.
- train_recon_data: Data to be used as a reconstruction target when training the model.
- valid_encod_data: Data to be used as input when validating the model.
- valid_recon_data: Data to be used as a reconstruction target when validating the model.

Note that for both training and validation data, encod_data may be the same as recon_data, but the two can differ to allow prediction of held-out neurons or time steps.
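
As a minimal sketch of this step, the snippet below writes four synthetic arrays to an HDF5 file with h5py under the keys listed above. The helper name `save_lfads_data`, the file name, and the array sizes are illustrative assumptions and not part of lfads-torch; here the encod and recon arrays are simply identical.

```python
import os
import tempfile

import h5py
import numpy as np


def save_lfads_data(path, train_encod, train_recon, valid_encod, valid_recon):
    """Write four arrays to an HDF5 file (hypothetical helper, not part of lfads-torch)."""
    with h5py.File(path, "w") as h5file:
        h5file.create_dataset("train_encod_data", data=train_encod)
        h5file.create_dataset("train_recon_data", data=train_recon)
        h5file.create_dataset("valid_encod_data", data=valid_encod)
        h5file.create_dataset("valid_recon_data", data=valid_recon)


# Synthetic data with shape n_samples x n_timesteps x n_channels
rng = np.random.default_rng(0)
train = rng.poisson(1.0, size=(8, 20, 5)).astype(np.float32)
valid = rng.poisson(1.0, size=(2, 20, 5)).astype(np.float32)

example_path = os.path.join(tempfile.mkdtemp(), "my_dataset.h5")
save_lfads_data(example_path, train, train, valid, valid)

# Read the file back to confirm the keys and shapes
with h5py.File(example_path, "r") as h5file:
    saved_shapes = {key: h5file[key].shape for key in h5file}
print(saved_shapes)
```

To predict held-out neurons or time steps, pass recon arrays with more channels or more time steps than the encod arrays.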

Create a new configuration file for your dataset (e.g. configs/datamodule/my_datamodule.yaml). For single-session runs, set datafile_pattern to the path to your data file. For multi-session runs, set datafile_pattern to a glob-style pattern that matches all of your data files.

_target_: lfads_torch.datamodules.BasicDataModule
datafile_pattern: <PATH-TO-HDF5-FILE>
batch_size: <YOUR-BATCH-SIZE>
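
Before launching a multi-session run, it can help to check which files a glob-style pattern actually matches. The sketch below assumes the pattern follows the semantics of Python's standard `glob` module; the directory and file names are made up for illustration.

```python
import glob
import os
import tempfile

# Create a scratch directory with two session files and one unrelated file
data_dir = tempfile.mkdtemp()
for name in ("session_a.h5", "session_b.h5", "notes.txt"):
    open(os.path.join(data_dir, name), "w").close()

# A pattern of the kind you might place in `datafile_pattern`
datafile_pattern = os.path.join(data_dir, "session_*.h5")

# Expand the pattern and list the matching file names
matched_files = sorted(os.path.basename(p) for p in glob.glob(datafile_pattern))
print(matched_files)
```

If the list is empty or contains unexpected files, adjust the pattern before training.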
We provide preprocessed example data files from the Neural Latents Benchmark in the datasets directory. With nlb_tools installed in your environment, you can additionally use the NLBEvaluation extension to monitor NLB metrics while training lfads-torch models.

In [13]:
import h5py
data_module_path = 'datasets/mc_rtt-05ms-val.h5'
data_module_config_file_path = "configs/datamodule/nlb_mc_rtt.yaml"
f = h5py.File(data_module_path, 'r')

In [14]:
for key, value in f.items():
    print(key)
    print(value)

train_behavior
<HDF5 dataset "train_behavior": shape (810, 120, 2), type "<f4">
train_decode_mask
<HDF5 dataset "train_decode_mask": shape (810, 1), type "|b1">
train_encod_data
<HDF5 dataset "train_encod_data": shape (810, 120, 98), type "<f2">
train_recon_data
<HDF5 dataset "train_recon_data": shape (810, 160, 130), type "<f2">
valid_behavior
<HDF5 dataset "valid_behavior": shape (270, 120, 2), type "<f4">
valid_decode_mask
<HDF5 dataset "valid_decode_mask": shape (270, 1), type "|b1">
valid_encod_data
<HDF5 dataset "valid_encod_data": shape (270, 120, 98), type "<f2">
valid_recon_data
<HDF5 dataset "valid_recon_data": shape (270, 160, 130), type "<f2">


## Model Configuration
Next, you'll need to create a model configuration file that defines the architecture of your LFADS model (e.g. configs/model/my_model.yaml). We provide several examples in configs/model.

While these config files provide an easy way to get up and running with LFADS relatively quickly, these default hyperparameters are unlikely to be the best ones for your dataset. We recommend sweeping over architecture and regularization hyperparameters in order to maximize performance.
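
One simple way to organize such a sweep is to enumerate a grid of override dictionaries, one per run. The sketch below uses only the standard library; the hyperparameter names and values are hypothetical placeholders, so check your model config file for the names it actually defines.

```python
import itertools

# Hypothetical search space: these keys and values are placeholders,
# not confirmed lfads-torch settings
search_space = {
    "model.dropout_rate": [0.1, 0.3],
    "model.kl_co_scale": [1e-6, 1e-4],
}

# Build one override dictionary per combination of values
keys = list(search_space)
sweep_overrides = [
    dict(zip(keys, values))
    for values in itertools.product(*search_space.values())
]

for overrides in sweep_overrides:
    print(overrides)
```

A grid grows multiplicatively with each added hyperparameter, so random search or a dedicated tuning library may be a better fit for larger spaces.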

### Single-Session
You can find an example configuration for a single-session model at configs/model/nlb_mc_maze.yaml. You'll need to update the following values with your dataset-specific parameters:

encod_data_dim: The n_channels dimension of encod_data from your data file.
encod_seq_len: The n_timesteps dimension of encod_data from your data file.
recon_seq_len: The n_timesteps dimension of recon_data from your data file.
readout.modules.0.out_features: The n_channels dimension of recon_data from your data file.
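
The mapping from array shapes to these four values can be sketched as a small helper. The function name `model_dims` is an assumption made for illustration; the shapes passed in are the ones printed for the mc_rtt file above.

```python
def model_dims(encod_shape, recon_shape):
    """Derive the dataset-specific model settings from array shapes.

    Both shapes are (n_samples, n_timesteps, n_channels).
    Hypothetical helper, not part of lfads-torch.
    """
    _, encod_seq_len, encod_data_dim = encod_shape
    _, recon_seq_len, recon_data_dim = recon_shape
    return {
        "encod_data_dim": encod_data_dim,
        "encod_seq_len": encod_seq_len,
        "recon_seq_len": recon_seq_len,
        "readout.modules.0.out_features": recon_data_dim,
    }


# Shapes of train_encod_data and train_recon_data from the listing above
dims = model_dims((810, 120, 98), (810, 160, 130))
for name, value in dims.items():
    print(f"{name}: {value}")
```

With an open h5py file `f`, the same helper could be called as `model_dims(f["train_encod_data"].shape, f["train_recon_data"].shape)`.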

In [11]:
model_config_path = "configs/model/nlb_mc_rtt.yaml"  # Double-checked that this config matches the dataset-specific parameters

## Training a Model


In [12]:
import os
import shutil
from datetime import datetime
from pathlib import Path

from lfads_torch.run_model import run_model

# ---------- OPTIONS -----------
PROJECT_STR = "lfads-torch-example"
DATASET_STR = "nlb_mc_rtt"
RUN_TAG = datetime.now().strftime("%y%m%d") + "_exampleSingle"
RUN_DIR = Path("runs") / PROJECT_STR / DATASET_STR / RUN_TAG
OVERWRITE = True
# ------------------------------

# Overwrite the directory if necessary
if RUN_DIR.exists() and OVERWRITE:
    shutil.rmtree(RUN_DIR)
RUN_DIR.mkdir(parents=True)
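# Copy this script into the run directory
# (`__file__` is undefined in a notebook, so the copy is skipped there)
if "__file__" in globals():
    shutil.copyfile(__file__, RUN_DIR / Path(__file__).name)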
# Switch to the `RUN_DIR` and train the model
os.chdir(RUN_DIR)
run_model(
    overrides={
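        # Assumption: run_model forwards these as Hydra config-group overrides,
        # which expect the config name (file stem) rather than a file path
        "datamodule": Path(data_module_config_file_path).stem,
        "model": Path(model_config_path).stem,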
    },
    config_path="../configs/single.yaml",
)


  from .autonotebook import tqdm as notebook_tqdm

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/Users/rosskempner/opt/anaconda3/envs/lfads-torch/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/rosskempner/opt/anaconda3/envs/lfads-torch/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/rosskempner/opt/anaconda3/envs/lfads-torch/lib/python3.9/site-packages/ipykernel_launcher.py", line 18, in <module>
    app.launch_new_instance()
  File "/Users/rosske

AttributeError: module 'numpy' has no attribute 'bool8'

In [16]:
import numpy as np
np.__version__

'2.0.2'
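
The AttributeError above is consistent with the installed NumPy 2.0.2: the `np.bool8` alias was removed in NumPy 2.0, and the warning printed with the traceback itself suggests downgrading to `numpy<2`. The sketch below is one way to check for this before importing lfads_torch; the helper name `lacks_bool8` is an assumption made for illustration.

```python
from importlib import metadata


def lacks_bool8(version_string):
    """Return True if this NumPy version no longer provides `np.bool8` (2.0 and later)."""
    major = int(version_string.split(".")[0])
    return major >= 2


# Look up the installed NumPy version without importing NumPy itself
try:
    installed = metadata.version("numpy")
except metadata.PackageNotFoundError:
    installed = None

if installed is not None and lacks_bool8(installed):
    print(f"NumPy {installed} has no np.bool8; consider installing 'numpy<2'.")
```

After changing the NumPy version in the environment, restart the kernel so the new version is picked up.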