# LFADS
This notebook is part of the [SachsLab Workshop for Intracranial Neurophysiology and Deep Learning](https://github.com/SachsLab/IntracranialNeurophysDL).

Follow the link below to run in Google Colab or continue if running locally.

Run the first few cells to normalize Local / Colab environments, then proceed below for the lesson.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/SachsLab/IntracranialNeurophysDL/blob/master/notebooks/06_01_LFADS.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/SachsLab/IntracranialNeurophysDL/blob/master/notebooks/06_01_LFADS.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

In [None]:
import os
import sys
from pathlib import Path
try:
    # See if we are running on google.colab
    import google.colab
    from google.colab import files
    if sys.version_info > (3, 0):
        from importlib import reload
        !pip install --upgrade -q https://storage.googleapis.com/jax-wheels/cuda$(echo $CUDA_VERSION | sed -e 's/\.//' -e 's/\..*//')/jaxlib-latest-cp36-none-linux_x86_64.whl
    else:
        !pip install --upgrade -q https://storage.googleapis.com/jax-wheels/cuda$(echo $CUDA_VERSION | sed -e 's/\.//' -e 's/\..*//')/jaxlib-latest-cp27-none-linux_x86_64.whl
    !pip install --upgrade -q git+https://github.com/google/jax.git

    os.chdir('..')
    if not (Path.home() / '.kaggle').is_dir():
        # Configure kaggle
        files.upload()  # Find the kaggle.json file in your ~/.kaggle directory.
        !pip install -q kaggle
        !mkdir -p ~/.kaggle
        !mv kaggle.json ~/.kaggle/
        !chmod 600 ~/.kaggle/kaggle.json
    if Path.cwd().stem != 'IntracranialNeurophysDL':
        if not (Path.cwd() / 'IntracranialNeurophysDL').is_dir():
            # Download the workshop repo and change to its directory
            !git clone --recursive https://github.com/SachsLab/IntracranialNeurophysDL.git
        os.chdir('IntracranialNeurophysDL')
    IN_COLAB = True
except ModuleNotFoundError:
    IN_COLAB = False
    if Path.cwd().stem == 'notebooks':
        os.chdir(Path.cwd().parent)
    # Make sure the kaggle executable is on the PATH.
    # os.pathsep is ';' on Windows and ':' elsewhere; 'Scripts' is the Windows location.
    os.environ['PATH'] = os.environ['PATH'] + os.pathsep + str(Path(sys.executable).parent / 'Scripts')

In [None]:
# Download and unzip data (2.1 GB)
datadir = Path.cwd() / 'data' / 'joeyo'
if not (datadir / 'converted').is_dir():
    !kaggle datasets download --unzip --path {str(datadir / 'converted')} cboulay/joeyo_nhp_reach_mea
    print("Finished downloading and extracting data.")
else:
    print("Data directory found. Skipping download.")

## Imports

In [None]:
import datetime
import h5py
import jax.numpy as np
from jax import random
from jax.experimental import optimizers
from jax.config import config
#config.update("jax_debug_nans", True) # Useful for finding numerical errors
import matplotlib.pyplot as plt
import numpy as onp  # original CPU-backed NumPy
import scipy.signal
import scipy.stats
import time

## Configs, Constants, and Hyperparameters

In [None]:
SESS_IDX = 0          # Index of the recording session we will use. Valid values: 0 to 8.
BIN_DURATION = 0.250  # Width of window used to bin spikes, in seconds
N_TAPS = 4            # Number of bins of history used in a sequence.
P_TRAIN = 0.8         # Proportion of data used for training.
BATCH_SIZE = 32       # Number of sequences in each training step.
P_DROPOUT = 0.05      # Proportion of units to set to 0 on each step.
N_RNN_UNITS = 60      # Size of RNN output (state)
L2_REG = 1.7e-5       # Parameter regularization strength.
STATEFUL = False      # Whether or not to keep state between sequences (True is not tested)
EPOCHS = 10           # Number of loops through the entire data set.

# LFADS Hyper parameters
# data_dim = train_data.shape[2]  # input to lfads should have dimensions:
# ntimesteps = train_data.shape[1] #   (batch_size x ntimesteps x data_dim)

# LFADS architecture - The size of the numbers is rather arbitrary, 
# but relatively small because we know the integrator RNN isn't too high 
# dimensional in its activity.
ENC_DIM = 128         # encoder dim
CON_DIM = 128         # controller dim
II_DIM = 1            # inferred input dim, we know there is 1 dim in integrator RNN
GEN_DIM = 128         # generator dim, should be large enough to generate integrator RNN dynamics
FACTORS_DIM = 32      # factors dim, should be large enough to capture most variance of dynamics

# Numerical stability
VAR_MIN = 0.001       # Minimum variance any Gaussian is allowed to have.

# Initial state prior parameters
# the mean is set to zero in the code
ic_prior_var = 0.1 # this is $\sigma^2_p$ of uninformative prior

# Inferred input autoregressive prior parameters. I don't plan to do inputs.
# Again, these hyper parameters are set "in the ballpark" but otherwise
# pretty randomly.
ar_mean = 0.0                 # process mean
ar_autocorrelation_tau = 1.0  # seconds, how correlated each time point is; determines the AR(1) coefficient $\phi$.
ar_noise_variance = 0.1       # noise variance


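The autoregressive prior above is given as a time constant and a noise variance. The sketch below shows one way those two numbers could map onto an AR(1) coefficient and a stationary variance. It assumes the convention $\phi = e^{-\Delta t / \tau}$ with $\Delta t$ equal to `BIN_DURATION`, and that `ar_noise_variance` is the variance of the per-step innovation. The helper `ar1_params` is hypothetical and is not taken from the LFADS code, which may parameterize the prior differently.

```python
import math

def ar1_params(tau, noise_var, dt):
    """Hypothetical helper: AR(1) coefficient and stationary variance.

    Assumes x[t] = mean + phi * (x[t-1] - mean) + eps[t], eps ~ N(0, noise_var),
    with phi = exp(-dt / tau). Both conventions are assumptions of this sketch.
    """
    phi = math.exp(-dt / tau)
    # Stationary variance of an AR(1) process with |phi| < 1.
    process_var = noise_var / (1.0 - phi ** 2)
    return phi, process_var

# Values copied from the configuration cell above.
phi, process_var = ar1_params(tau=1.0, noise_var=0.1, dt=0.250)
print(f"phi = {phi:.4f}, stationary variance = {process_var:.4f}")
```

Under these assumptions, a longer `ar_autocorrelation_tau` pushes $\phi$ toward 1, so the inferred input changes more slowly from bin to bin.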
## Load Spiking Data

In [None]:
from data.utils.fileio import load_joeyo_reaching

def load_dat_with_vel_accel(datadir, sess_idx):
    BEHAV_CHANS = ['CursorX', 'CursorY']
    sess_names = ['indy_201' + _ for _ in ['60921_01', '60927_04', '60927_06', '60930_02', '60930_05', '61005_06',
                                       '61006_02', '60124_01', '60127_03']]
    X, Y, X_ax_info, Y_ax_info = load_joeyo_reaching(datadir, sess_names[sess_idx], x_chunk='mu_spiketimes')

    # Slice Y to only keep required behaviour data (cursor position).
    # Use CPU NumPy (onp) here: jax.numpy does not support arrays of strings.
    b_keep_y_chans = onp.isin(Y_ax_info['channel_names'], BEHAV_CHANS)
    Y = Y[b_keep_y_chans, :]
    Y_ax_info['channel_names'] = [_ for _ in Y_ax_info['channel_names'] if _ in BEHAV_CHANS]

    # Calculate discrete derivative and double-derivative to get velocity and acceleration.
    # These are per-sample differences; they are not divided by the sample period.
    vel = onp.diff(Y, axis=1)
    vel = onp.concatenate((vel[:, 0][:, None], vel), axis=1)  # Assume velocity was constant across the first two samples.
    accel = onp.concatenate(([[0], [0]], onp.diff(vel, axis=1)), axis=1)  # Assume accel was 0 in the first sample.
    Y = onp.concatenate((Y, vel, accel), axis=0)
    Y_ax_info['channel_names'] += ['VelX', 'VelY', 'AccX', 'AccY']
    
    return X, Y, X_ax_info, Y_ax_info

X, Y, X_ax_info, Y_ax_info = load_dat_with_vel_accel(datadir, SESS_IDX)

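LFADS expects binned spike counts with shape `(batch_size x ntimesteps x data_dim)`, so the spike times loaded above need to be counted in windows of `BIN_DURATION`. The following is a minimal sketch of that step on toy data. It assumes spike times arrive as one sequence of seconds per channel; the actual layout returned by `load_joeyo_reaching` with `x_chunk='mu_spiketimes'` is not shown here and may differ. `bin_spike_times` is a hypothetical helper, not part of the workshop repo.

```python
import numpy as onp  # same alias the notebook uses for CPU NumPy

def bin_spike_times(spike_times, t_start, t_stop, bin_duration):
    """Hypothetical helper: count spikes per channel in fixed-width bins.

    spike_times: list with one 1-D sequence of spike times (seconds) per channel.
    Returns an integer array of shape (n_bins, n_channels).
    """
    # Small epsilon guards against floating-point error in the division.
    n_bins = int(onp.floor((t_stop - t_start) / bin_duration + 1e-9))
    edges = t_start + bin_duration * onp.arange(n_bins + 1)
    counts = [onp.histogram(st, bins=edges)[0] for st in spike_times]
    return onp.stack(counts, axis=1)

# Toy data: 2 channels, 1 second of recording, 250 ms bins.
toy_spikes = [[0.01, 0.10, 0.30, 0.90], [0.26, 0.27, 0.60]]
binned = bin_spike_times(toy_spikes, t_start=0.0, t_stop=1.0, bin_duration=0.250)
print(binned.shape)  # (n_bins, n_channels)
print(binned)
```

Stacking consecutive bins into fixed-length sequences then gives the time axis of the LFADS input.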

### Segment Into Trials 


### Visualize Data

In [None]:
#TODO

## LFADS - Latent Factor Analysis via Dynamical Systems
This notebook is based on a notebook found in the [google-research/computation-thru-dynamics GitHub repo](https://github.com/google-research/computation-thru-dynamics).

In the diagram below, the "Generator" is the cortical neuronal population, which is assumed to be a nonlinear
dynamical system and is modeled with an RNN. For now we will ignore the bottom "Controller" and "Inferred inputs" parts.

![](https://raw.githubusercontent.com/google-research/computation-thru-dynamics/master/images/lfads_architecture_w_inferred_inputs_3.png)

The data are passed through nonlinear, recurrent **encoders**, which produce an **initial state distribution**:
a per-trial mean and variance from which a random vector encoding that trial is drawn.
This is the same idea as the 'bottleneck' or 'latent variables' we saw in the variational auto-encoder tutorial.

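Drawing from the initial state distribution can be sketched with the reparameterization trick, together with the KL penalty that ties the distribution to the zero-mean prior whose variance is `ic_prior_var`. This is a minimal NumPy sketch, not the notebook's JAX implementation. It assumes the encoder outputs a mean and a log-variance, and that `VAR_MIN` is added as a variance floor. The function names and the 8-dimensional toy latent are hypothetical.

```python
import numpy as onp  # same alias the notebook uses for CPU NumPy

def sample_initial_state(rng, mean, logvar, var_min=0.001):
    """Draw g0 = mean + sqrt(var) * eps with eps ~ N(0, I) (reparameterization)."""
    var = onp.exp(logvar) + var_min  # assumed variance floor for numerical stability
    eps = rng.standard_normal(mean.shape)
    return mean + onp.sqrt(var) * eps, var

def kl_to_prior(mean, var, prior_var=0.1):
    """KL( N(mean, var) || N(0, prior_var) ), summed over latent dimensions."""
    return 0.5 * onp.sum(onp.log(prior_var / var) + (var + mean ** 2) / prior_var - 1.0)

rng = onp.random.default_rng(0)
enc_mean = onp.zeros(8)                  # toy encoder output: per-trial mean
enc_logvar = onp.full(8, onp.log(0.1))   # toy encoder output: per-trial log-variance
g0, var = sample_initial_state(rng, enc_mean, enc_logvar)
print(g0.shape, kl_to_prior(enc_mean, var))
```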
The initial state of the generator is a randomly drawn vector from this distribution.
The **generator** marches through time and at each time point produces "factors" and "rates".
The "factors" are the low-dimensional neural state. The "rates" are the projection of the neural state into neuronal activations.
The rates are then used to parameterize a Poisson process to generate spikes.
The loss function scores how likely the original input spike trains are under these rates (the Poisson log-likelihood), together with the variational KL penalty.

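The generative half described above (initial state, then factors, then rates, then a Poisson likelihood) can be sketched in a few lines. To keep it short, the sketch swaps in a vanilla tanh RNN for the generator, uses random untrained weights, and uses small toy dimensions. The function names and weight matrices are hypothetical, and the exponential link from factors to rates is an assumption of this sketch.

```python
import numpy as onp  # same alias the notebook uses for CPU NumPy

def generate_rates(g0, w_rec, w_fac, w_rate, b_rate, n_steps):
    """Roll a vanilla tanh RNN forward from g0; return factors and rates per step."""
    g = g0
    factors, rates = [], []
    for _ in range(n_steps):
        g = onp.tanh(w_rec @ g)            # autonomous generator update (no inputs)
        f = w_fac @ g                      # low-dimensional factors
        r = onp.exp(w_rate @ f + b_rate)   # strictly positive Poisson rates per bin
        factors.append(f)
        rates.append(r)
    return onp.stack(factors), onp.stack(rates)

def poisson_nll(counts, rates):
    """Negative Poisson log-likelihood, dropping the constant log(counts!) term."""
    return onp.sum(rates - counts * onp.log(rates))

rng = onp.random.default_rng(0)
gen_dim, factors_dim, n_neurons, n_steps = 16, 4, 10, 5  # toy sizes
g0 = rng.standard_normal(gen_dim)
w_rec = rng.standard_normal((gen_dim, gen_dim)) / onp.sqrt(gen_dim)
w_fac = rng.standard_normal((factors_dim, gen_dim)) / onp.sqrt(gen_dim)
w_rate = rng.standard_normal((n_neurons, factors_dim)) / onp.sqrt(factors_dim)
b_rate = onp.zeros(n_neurons)

factors, rates = generate_rates(g0, w_rec, w_fac, w_rate, b_rate, n_steps)
spikes = rng.poisson(rates)              # stand-in for observed spike counts
loss = poisson_nll(spikes, rates)
print(factors.shape, rates.shape, float(loss))
```

In training, the spike counts come from the recorded data rather than from `rng.poisson`, and the weights are adjusted to lower this loss plus the KL and L2 terms.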