# LSTM and GRU with cortical electrophys
This notebook is part of the [SachsLab Workshop for Intracranial Neurophysiology and Deep Learning](https://github.com/SachsLab/IntracranialNeurophysDL).

Run the first three cells to normalize Local / Colab environments, then proceed below for the lesson.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path
import os
try:
    # See if we are running on google.colab
    import google.colab
    from google.colab import files
    os.chdir('..')
    if not (Path.home() / '.kaggle').is_dir():
        # Configure kaggle
        files.upload()  # Find the kaggle.json file in your ~/.kaggle directory.
        !pip install -q kaggle
        !mkdir -p ~/.kaggle
        !mv kaggle.json ~/.kaggle/
        !chmod 600 ~/.kaggle/kaggle.json
    if Path.cwd().stem != 'IntracranialNeurophysDL':
        if not (Path.cwd() / 'IntracranialNeurophysDL').is_dir():
            # Download the workshop repo and change to its directory
            !git clone --recursive https://github.com/SachsLab/IntracranialNeurophysDL.git
        os.chdir('IntracranialNeurophysDL')
    IN_COLAB = True
    # Setup tensorflow 2.0
    !pip install -q tensorflow-gpu==2.0.0-alpha0
except ModuleNotFoundError:
    IN_COLAB = False
    import sys
    if Path.cwd().stem == 'notebooks':
        os.chdir(Path.cwd().parent)
    # Make sure the kaggle executable is on the PATH
    # os.pathsep is ';' on Windows and ':' elsewhere
    os.environ['PATH'] = os.environ['PATH'] + os.pathsep + str(Path(sys.executable).parent / 'Scripts')

# Try to clear any logs from previous runs
if (Path.cwd() / 'logs').is_dir():
    import shutil
    try:
        shutil.rmtree(str(Path.cwd() / 'logs'))
    except PermissionError:
        print("Unable to remove logs directory.")

# Additional imports
import tensorflow as tf
import datetime
import numpy as np
import matplotlib.pyplot as plt
from indl import enable_plotly_in_cell
%load_ext tensorboard.notebook

In [6]:
# Download and unzip data (3.4 GB)
datadir = Path.cwd() / 'data' / 'joeyo'
if not (datadir / 'converted').is_dir():
    !kaggle datasets download --unzip --path {str(datadir / 'converted')} cboulay/joeyo_nhp_reach_mea
    print("Finished downloading and extracting data.")
else:
    print("Data directory found. Skipping download.")

Data directory found. Skipping download.


## Prepare Data

### Import

In [17]:
from data.utils.fileio import load_joeyo_reaching
sess_names = ['indy_2016' + _ for _ in ['0921_01', '0927_04', '0927_06', '0930_02', '0930_05', '1005_06', '1006_02']]
sess_name = sess_names[0]
X, Y, X_ax_info, Y_ax_info = load_joeyo_reaching(datadir, sess_name, x_chunk='spikerates')

### Prepare for TensorFlow
TODO: The LFP data in X are sampled at 1 kHz, whereas the behavior data are sampled at 250 Hz; the two need to be brought to a common rate.
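
One possible way to reconcile the two rates, sketched here on synthetic data, is to average non-overlapping blocks of four 1 kHz samples so that each block lines up with one 250 Hz behavior sample. The function name `block_average` and the array shapes are assumptions for illustration, not part of the workshop code. Block averaging is only a crude anti-aliasing filter, so a proper decimation filter may be preferable.

```python
import numpy as np

def block_average(x, factor):
    """Downsample the last axis of x by averaging non-overlapping blocks."""
    n_keep = (x.shape[-1] // factor) * factor  # drop trailing samples that don't fill a block
    trimmed = x[..., :n_keep]
    return trimmed.reshape(*x.shape[:-1], n_keep // factor, factor).mean(axis=-1)

# Synthetic stand-in for LFP: 3 channels, 1 second at 1 kHz (hypothetical shapes)
lfp_1khz = np.arange(3 * 1000, dtype=np.float32).reshape(3, 1000)
lfp_250hz = block_average(lfp_1khz, factor=4)
print(lfp_250hz.shape)  # (3, 250)
```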

In [20]:
X_train.shape

(72030, 257)

In [19]:
from sklearn.model_selection import train_test_split
p_train = 0.8
seq_length = 100
batch_size = 5
b_keep_y_chans = np.isin(Y_ax_info['channel_names'], ['CursorX', 'CursorY'])

# TODO: Segment into sequences before doing train_test_split
X_train, X_valid, Y_train, Y_valid = train_test_split(X.T.astype(np.float32),
                                                      Y[b_keep_y_chans, :].T.astype(np.float32),
                                                      train_size=p_train)

ds_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))
ds_valid = tf.data.Dataset.from_tensor_slices((X_valid, Y_valid))

# Use the number of training samples as the shuffle buffer size
ds_train = ds_train.shuffle(X_train.shape[0]).batch(batch_size, drop_remainder=True)
ds_valid = ds_valid.batch(batch_size)
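
The TODO in the cell above (segmenting before splitting) could be approached as sketched below: cut the continuous recording into non-overlapping windows of `seq_length` samples, then split whole windows between training and validation. Synthetic arrays stand in for `X` and `Y`, and `segment` is a hypothetical helper, not part of the workshop code.

```python
import numpy as np

def segment(data, seq_length):
    """Reshape (n_samples, n_features) into (n_seqs, seq_length, n_features)."""
    n_seqs = data.shape[0] // seq_length  # trailing samples that don't fill a window are dropped
    return data[:n_seqs * seq_length].reshape(n_seqs, seq_length, data.shape[1])

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1050, 8)).astype(np.float32)  # (samples, channels), synthetic
Y_demo = rng.standard_normal((1050, 2)).astype(np.float32)  # (samples, cursor x/y), synthetic

seq_length = 100
X_seq, Y_seq = segment(X_demo, seq_length), segment(Y_demo, seq_length)

# Shuffle whole sequences, then split 80/20
perm = rng.permutation(X_seq.shape[0])
n_train = int(0.8 * len(perm))
X_tr, X_va = X_seq[perm[:n_train]], X_seq[perm[n_train:]]
Y_tr, Y_va = Y_seq[perm[:n_train]], Y_seq[perm[n_train:]]
print(X_tr.shape, X_va.shape)  # (8, 100, 8) (2, 100, 8)
```

Arrays shaped like this can be passed to `tf.data.Dataset.from_tensor_slices`, so each element has the `(seq_length, n_features)` shape that a recurrent layer expects.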

## Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
[This blog post](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) from mid-2015 is still one of the best explanations of LSTMs around. In short, each LSTM cell has three 'gates' that decide, on each time step, how much of the previous cell state is kept (forget gate), how much of the new candidate values computed from the input is written into the state (input gate, sometimes called the update gate), and how much of the updated state is exposed as the output (output gate).
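
The gate arithmetic can be sketched for a single time step in NumPy. This is a minimal illustration of the standard LSTM equations, not the Keras implementation; the weight layout (four blocks concatenated along the last axis, in the order input, forget, candidate, output) and all sizes are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (n_in, 4*n_units), U: (n_units, 4*n_units), b: (4*n_units,)."""
    z = x @ W + h_prev @ U + b
    i, f, g, o = np.split(z, 4, axis=-1)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gates take values in (0, 1)
    g = np.tanh(g)                                # candidate values
    c = f * c_prev + i * g                        # forget part of the old state, write new values
    h = o * np.tanh(c)                            # output gate scales what is exposed
    return h, c

rng = np.random.default_rng(0)
n_in, n_units = 5, 3  # arbitrary illustrative sizes
W = rng.standard_normal((n_in, 4 * n_units)) * 0.1
U = rng.standard_normal((n_units, 4 * n_units)) * 0.1
b = np.zeros(4 * n_units)

h, c = np.zeros(n_units), np.zeros(n_units)
for x in rng.standard_normal((10, n_in)):  # run over a 10-step sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```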

GRUs are similar in concept but use only two gates: an update gate and a reset gate. [See here](https://arxiv.org/pdf/1412.3555v1.pdf).
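
A single GRU step can be sketched in the same way. This follows the formulation in the linked paper, where the new state is `(1 - z) * h_prev + z * h_cand`; some libraries swap the roles of `z` and `1 - z`. The weight layout and sizes are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, W, U, b):
    """One GRU step. W: (n_in, 3*n_units), U: (n_units, 3*n_units), b: (3*n_units,)."""
    W_z, W_r, W_h = np.split(W, 3, axis=-1)
    U_z, U_r, U_h = np.split(U, 3, axis=-1)
    b_z, b_r, b_h = np.split(b, 3)
    z = sigmoid(x @ W_z + h_prev @ U_z + b_z)             # update gate
    r = sigmoid(x @ W_r + h_prev @ U_r + b_r)             # reset gate
    h_cand = np.tanh(x @ W_h + (r * h_prev) @ U_h + b_h)  # candidate state
    return (1.0 - z) * h_prev + z * h_cand                # blend old state and candidate

rng = np.random.default_rng(0)
n_in, n_units = 5, 3  # arbitrary illustrative sizes
W = rng.standard_normal((n_in, 3 * n_units)) * 0.1
U = rng.standard_normal((n_units, 3 * n_units)) * 0.1
b = np.zeros(3 * n_units)

h = np.zeros(n_units)
for x in rng.standard_normal((10, n_in)):  # run over a 10-step sequence
    h = gru_step(x, h, W, U, b)
print(h.shape)  # (3,)
```

Unlike the LSTM, there is no separate cell state: the hidden state `h` is both the memory and the output.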

With an extra gate and a separate cell state, LSTMs are slightly more expressive in theory, but in practice GRUs often perform comparably and are cheaper to compute because they have fewer parameters.
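
The difference in cost can be made concrete by counting parameters. The sketch below assumes the textbook formulation with one bias vector per block (some implementations use two for the GRU), 257 input features as in `X_train` above, and an arbitrary 64 recurrent units.

```python
def lstm_param_count(n_in, n_units):
    # 4 blocks (input, forget and output gates, plus the candidate),
    # each with input weights, recurrent weights and a bias
    return 4 * (n_in * n_units + n_units * n_units + n_units)

def gru_param_count(n_in, n_units):
    # 3 blocks (update and reset gates, plus the candidate)
    return 3 * (n_in * n_units + n_units * n_units + n_units)

n_in, n_units = 257, 64  # 64 units is an assumed value
print(lstm_param_count(n_in, n_units))  # 82432
print(gru_param_count(n_in, n_units))   # 61824
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same width.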

### Implementation Details
This will probably change, but the unification of LSTM APIs in TensorFlow 2.0 means that an LSTM layer must satisfy the following constraints to use the fast cuDNN kernel on the GPU; otherwise it falls back to a slower generic implementation.

1. activation == 'tanh'
2. recurrent_activation == 'sigmoid'
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
6. No use of masking.
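
The first five constraints can be checked against a set of layer keyword arguments before building a model. The helper below is hypothetical, written only for illustration, and is not a TensorFlow API; it assumes that any argument left out keeps a compliant default, and it cannot check for masking, which depends on the rest of the model.

```python
# Required values, taken from the list above
CUDNN_REQUIREMENTS = {
    'activation': 'tanh',
    'recurrent_activation': 'sigmoid',
    'recurrent_dropout': 0,
    'unroll': False,
    'use_bias': True,
}

def cudnn_violations(layer_kwargs):
    """Return the names of arguments that break the constraints listed above."""
    return [name for name, required in CUDNN_REQUIREMENTS.items()
            if layer_kwargs.get(name, required) != required]

print(cudnn_violations({'activation': 'tanh'}))
# []
print(cudnn_violations({'activation': 'relu', 'recurrent_dropout': 0.2}))
# ['activation', 'recurrent_dropout']
```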

In [8]:
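N_RNN_UNITS = 64  # Placeholder value; the original cell left this undefined
# X_train is (samples, features), so the feature count is its last dimension
inputs = tf.keras.layers.Input(shape=(seq_length, X_train.shape[-1]))
# The layer must be called on the input tensor to produce an output tensor
outputs = tf.keras.layers.LSTM(N_RNN_UNITS, return_sequences=True)(inputs)
model = tf.keras.Model(inputs, outputs)
model.summary()
model.compile(optimizer='rmsprop', loss='mean_squared_error')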

In [None]:
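EPOCHS = 10  # Placeholder value; not defined in the original notebook
model.fit(x=ds_train, validation_data=ds_valid, epochs=EPOCHS, verbose=1)  # Requires sequence-shaped batches (see TODO above)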