---
---

# Introduction to DeepInsight - Decoding position, speed and head direction from tetrode CA1 recordings

This notebook is an example of how to use DeepInsight v0.5 on tetrode data and can serve as a guide for adapting it to your own datasets. All methods are stored in the deepinsight library and can be called directly or through their respective submodules. A typical workflow might look like the following:
- Load your dataset into a format which can be directly indexed (numpy array or pointer to a file on disk)
- Preprocess the raw data (wavelet transformation)
- Preprocess your outputs (the variable you want to decode)
- Define appropriate loss functions for your output and train the model 
- Evaluate performance across all cross-validated models
- Visualize the influence of different input frequencies on the model output
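
The loss function should match the geometry of the variable being decoded. As an illustration, for a circular variable such as head direction, a plain absolute error on raw angles would treat 0.1 rad and 2π − 0.1 rad as far apart, although they differ by only 0.2 rad. The following is a minimal NumPy sketch of a wrap-around absolute error. The function name `cyclical_abs_error` is hypothetical and the code is not taken from DeepInsight, whose own loss implementations may differ.

```python
import numpy as np

def cyclical_abs_error(y_true, y_pred):
    """Mean absolute angular error in radians, respecting wrap-around at 2*pi."""
    diff = np.asarray(y_pred) - np.asarray(y_true)
    # arctan2(sin, cos) maps any angular difference into [-pi, pi]
    wrapped = np.arctan2(np.sin(diff), np.cos(diff))
    return np.mean(np.abs(wrapped))

# Two example angle pairs, each 0.2 rad apart on the circle
true_angles = np.array([0.1, np.pi])
pred_angles = np.array([2 * np.pi - 0.1, np.pi + 0.2])
error = cyclical_abs_error(true_angles, pred_angles)
```

A non-wrapping absolute error would report a large error for the first pair, which is why circular outputs usually get their own loss.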


In [1]:
# Import DeepInsight (adjust the path to your local clone of the repository)
import sys
sys.path.insert(0, "/home/marx/Documents/Github/DeepInsight")
import deepinsight
# Choose GPU
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# Additional imports
from scipy.io import loadmat
import numpy as np

---
---
Here you can define the paths to your raw data files and create file names for the preprocessed HDF5 datasets.

The data we use here is usually relatively large in its raw format. Running it through the next cell takes roughly 24 hours for a 40-minute recording.

We provide a preprocessed file so you can try out the code without running this step. See the download cell further below.

In [2]:
# Define base paths
base_path = './example_data/calcium/'
fp_raw_file = base_path + 'traces_M1336.mat' # This is an example dataset containing calcium traces and linear position in a virtual track
fp_deepinsight = base_path + 'processed_M1336.h5' # This will be the processed HDF5 file
sampling_rate = 30 # Might also be stored in above mat file for easier access

if os.path.exists(fp_raw_file):
    # Load data 
    calcium_data = loadmat(fp_raw_file)['dataSave']
    raw_data = np.squeeze(calcium_data['df_f'][0][0])
    raw_timestamps = np.arange(0, raw_data.shape[0]) / sampling_rate
    output = np.squeeze(calcium_data['pos_dat'][0][0])
    output_timestamps = raw_timestamps # In this recording timestamps are the same for output and raw_data
        
    # Transform raw data to frequency domain
    deepinsight.preprocess.preprocess_input(fp_deepinsight, raw_data, sampling_rate=sampling_rate,
                                            average_window=1, wave_highpass=1/500,
                                            wave_lowpass=sampling_rate)
    # # Prepare outputs
    # deepinsight.util.tetrode.preprocess_output(fp_deepinsight, raw_timestamps, output,
    #                                            output_timestamps, sampling_rate=sampling_rate)

Starting wavelet transformation (n=73073, chunks=685, frequencies=26)


ValueError: not enough values to unpack (expected 3, got 2)

In [None]:
%debug

In [None]:
import deepinsight.util.wavelet_transform as wt

In [None]:
(_, wavelet_frequencies) = wt.wavelet_transform(np.ones(50000), sampling_rate, average_window=1, scaling_factor=0.5, wave_highpass=1/500, wave_lowpass=sampling_rate)

In [None]:
wavelet_frequencies.shape

In [None]:
1/500

---
---
The above steps create an HDF5 file with all the data needed to train the model.

You can download the preprocessed dataset by running the following command:

In [None]:
!wget https://ndownloader.figshare.com/files/20150468 -O ./example_data/processed_R2478.h5

---
---
Now we can train the model. 

The following command trains the models using 5 cross-validation splits and stores the weights in HDF5 files.
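
To make the idea of cross-validation splits concrete, here is a minimal sketch that divides a recording into contiguous test blocks, a common choice for time series because neighbouring samples are strongly correlated. The helper `contiguous_folds` is hypothetical and only illustrates the concept. How DeepInsight actually partitions the data is not shown in this notebook and may differ.

```python
import numpy as np

def contiguous_folds(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs where each test fold is one contiguous block."""
    indices = np.arange(n_samples)
    for test_idx in np.array_split(indices, n_folds):
        # Everything outside the current test block is used for training
        train_idx = np.setdiff1d(indices, test_idx)
        yield train_idx, test_idx

folds = list(contiguous_folds(n_samples=100, n_folds=5))
for fold_number, (train_idx, test_idx) in enumerate(folds):
    print(fold_number, len(train_idx), len(test_idx), test_idx[0], test_idx[-1])
```

Each sample appears in exactly one test fold, so predictions from all folds together cover the whole experiment.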

In [None]:
# Define loss functions and train model
loss_functions = {'position' : 'euclidean_loss', 
                  'head_direction' : 'cyclical_mae_rad', 
                  'speed' : 'mae'}
loss_weights = {'position' : 1, 
                'head_direction' : 25, 
                'speed' : 2}
deepinsight.train.run_from_path(fp_deepinsight, loss_functions, loss_weights)

In [None]:
# Get loss and shuffled loss for the influence plot; both are also written back to the HDF5 file
losses, output_predictions, indices = deepinsight.analyse.get_model_loss(fp_deepinsight,
                                                                         stepsize=10)
shuffled_losses = deepinsight.analyse.get_shuffled_model_loss(fp_deepinsight, axis=1,
                                                              stepsize=10)

---
---
The cell above calculates the loss and the shuffled loss across the full experiment and writes both back to the HDF5 file.

The command below visualizes the influence of the different frequency bands for all samples.

Note that Figure 3 in the manuscript shows influence across animals, while this plot shows the influence for one animal across the experiment.
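
The idea behind comparing the loss with a shuffled loss can be sketched on toy data: permute one input band across samples, re-evaluate the model, and measure how much the loss increases. The example below uses a hypothetical stand-in model (`toy_model`) and synthetic inputs. It illustrates the general shuffling approach only and is not DeepInsight's internal computation, whose exact influence definition may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1000 samples, 4 "frequency bands"; only band 2 carries signal
n_samples, n_bands = 1000, 4
inputs = rng.normal(size=(n_samples, n_bands))
target = 3.0 * inputs[:, 2]

def toy_model(x):
    """Stand-in for a trained decoder (hypothetical: it uses only band 2)."""
    return 3.0 * x[:, 2]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

baseline_loss = mae(target, toy_model(inputs))

influence = np.zeros(n_bands)
for band in range(n_bands):
    shuffled = inputs.copy()
    # Permute one band across samples, breaking its relation to the target
    shuffled[:, band] = rng.permutation(shuffled[:, band])
    influence[band] = mae(target, toy_model(shuffled)) - baseline_loss

print(influence)
```

Bands the model does not rely on leave the loss unchanged when shuffled, while the informative band produces a clear increase.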

In [None]:
# Plot influence across behaviours
deepinsight.visualize.plot_residuals(fp_deepinsight, frequency_spacing=2,
                                     output_names=['Position', 'Head Direction', 'Speed'])

---
---