## Background

To manage the information in these recordings more effectively, I created a class called `cage_data`. Each `.pkl` file in the `continuous` folder contains one `cage_data` object, corresponding to one 15-minute recording file. For practical reasons, there is a 2-second pause between every two adjacent 15-minute recording files.
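
As a rough illustration of what the pause means for timing, the sketch below computes the nominal start time of each file within a session. It assumes that files are exactly 15 minutes long and that the pause is exactly 2 seconds; real files can deviate slightly from these values. The function name `nominal_start_time` is made up for this example.

```python
# Nominal timing taken from the description above; real files may differ slightly.
FILE_DURATION_S = 15 * 60   # each recording file is about 15 minutes long
PAUSE_S = 2                 # pause between two adjacent files


def nominal_start_time(file_index):
    """Nominal start time (s) of the file at position file_index (0-based)."""
    return file_index * (FILE_DURATION_S + PAUSE_S)


for i in range(3):
    print('File %d nominally starts at %d s' % (i, nominal_start_time(i)))
```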

To read those `.pkl` files, you first need the code that defines the `cage_data` class on your Python path. You can find it in this [repo](https://github.com/limblab/cage_data). Clone it to your local machine and add it to your path like this:

In [None]:
import fnmatch, os, sys
sys.path.append('path to the cloned repo/cage_data')

In [2]:
import cage_data
import numpy as np

## Load one file as an example

Data are stored in the variable `my_cage_data`, an instance of the `cage_data` class.

In [4]:
import pickle

data_path = 'E:/Box Sync/Pop_20201020/cage/50ms/continuous/'
file_name = '20201020_Pop_Cage_003.pkl'
print('The file %s is going to be loaded'%(data_path + file_name))
with open(data_path + file_name, 'rb') as fp:
    my_cage_data = pickle.load(fp)
my_cage_data.pre_processing_summary()

The file E:/Box Sync/Pop_20201020/cage/50ms/continuous/20201020_Pop_Cage_003.pkl is going to be loaded
This is a non-sorted file
EMG filtered? -- True
EMG filtered? -- True
Cortical data cleaned? -- True
Data binned? -- True
Spikes smoothed? -- True
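
Since the `continuous` folder holds one `.pkl` file per recording, it can be handy to list or load them all at once. The following is a minimal sketch that uses only the standard library. The helper names `list_pkl_files` and `load_all` are made up for this example and are not part of the `cage_data` repo. Unpickling real files still requires the `cage_data` code to be on your path, as described above.

```python
import fnmatch
import os
import pickle


def list_pkl_files(folder, pattern='*.pkl'):
    """Return the sorted names of files in `folder` that match `pattern`."""
    return sorted(f for f in os.listdir(folder) if fnmatch.fnmatch(f, pattern))


def load_all(folder):
    """Load every matching file in `folder` into a dict keyed by file name."""
    loaded = {}
    for name in list_pkl_files(folder):
        with open(os.path.join(folder, name), 'rb') as fp:
            loaded[name] = pickle.load(fp)
    return loaded
```

For example, `load_all(data_path)` would return a dictionary that maps each file name to its unpickled object. Loading many 15-minute files at once may need a lot of memory, so looping over `list_pkl_files(data_path)` and handling one file at a time is probably the safer choice.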


## Some basic information about the file just loaded

In [5]:
print('There are %d cortical channels'%(len(my_cage_data.spikes)))
print('There are %d EMG channels'%(len(my_cage_data.EMG_diff)))
print('The raw EMG signals are sampled at %.3f Hz'%(my_cage_data.EMG_fs))
print('There are %d behavior segments in this file'%(len(my_cage_data.behave_tags['tag'])))
print('The length of this file is %.3f seconds'%(my_cage_data.EMG_timeframe[-1]))
print('Spikes and EMGs are binned or downsampled with %.2f seconds time bins'%(my_cage_data.binned['timeframe'][1]-my_cage_data.binned['timeframe'][0]))

There are 72 cortical channels
There are 15 EMG channels
The raw EMG signals are sampled at 2011.061 Hz
There are 124 behavior segments in this file
The length of this file is 900.108 seconds
Spikes and EMGs are binned or downsampled with 0.05 seconds time bins


## How to get raw EMGs?

The raw EMGs are acquired by a DSPW wireless system with an Intan RHD2132 front end. Since the channels on the RHD2132 are all single-ended, we compute the differential signals in software after acquisition. The raw EMGs are therefore stored in a field called `EMG_diff`.
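
To make the idea of a software differential concrete, here is a small sketch on synthetic numbers. The electrode pairing used here is hypothetical: the text above does not say which single-ended channels are paired for each muscle, and the actual processing in `cage_data` may differ.

```python
import numpy as np

# Synthetic single-ended recordings: 4 electrodes x 5 samples (made-up values).
single_ended = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [0.5, 1.0, 1.5, 2.0, 2.5],
    [2.0, 2.0, 2.0, 2.0, 2.0],
    [1.0, 0.0, 1.0, 0.0, 1.0],
])

# Hypothetical pairing: electrodes (0, 1) form one EMG channel, (2, 3) another.
pairs = [(0, 1), (2, 3)]

# Differential signal = one electrode minus its partner, sample by sample.
emg_diff = np.array([single_ended[a] - single_ended[b] for a, b in pairs])
print(emg_diff)
```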

More specifically, `EMG_diff` is an attribute of the `cage_data` class and can be accessed with:
* raw_EMGs = my_cage_data.EMG_diff

The sampling frequency of the raw EMGs can be obtained with:
* fs_raw_EMG = my_cage_data.EMG_fs

The time frame of the raw EMGs can be obtained with:
* raw_EMG_timeframe = my_cage_data.EMG_timeframe

The name of each EMG channel can be obtained with:
* EMG_names = my_cage_data.EMG_names

In [35]:
raw_EMGs = my_cage_data.EMG_diff
print('There are %d channels, and each channel has %d time samples'%(len(raw_EMGs), len(raw_EMGs[0])))

raw_EMG_timeframe = my_cage_data.EMG_timeframe

fs_raw_EMG = my_cage_data.EMG_fs
print('The raw EMG signals are sampled %.3f Hz'%(fs_raw_EMG))

EMG_names = my_cage_data.EMG_names
print(EMG_names)

There are 15 channels, and each channel has 1810172 time samples
The raw EMG signals are sampled 2011.061 Hz
['APB', 'Lum', 'PT', 'FDP2', 'FCR1', 'FCU1', 'FCUR', 'FPB', '3DI', 'SUP', 'ECU', 'ECR', 'EDC1', 'BI', 'TRI']
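
Once you have a signal and its time frame, a common next step is to cut out a time window. The sketch below does this on synthetic data. It assumes the time frame is a 1-D array of sample times in seconds and that each channel is a 1-D array of the same length, which matches the printed sizes above but is not otherwise confirmed here. The helper `cut_window` is made up for this example.

```python
import numpy as np

fs = 2000.0                                   # made-up sampling rate (Hz)
timeframe = np.arange(20000) / fs             # stand-in for EMG_timeframe (10 s)
emg = [np.sin(2 * np.pi * 5 * timeframe)]     # stand-in for EMG_diff (one channel)


def cut_window(signal, timeframe, t_start, t_stop):
    """Return the samples of `signal` with t_start <= time < t_stop."""
    mask = (timeframe >= t_start) & (timeframe < t_stop)
    return signal[mask], timeframe[mask]


segment, segment_time = cut_window(emg[0], timeframe, 2.0, 3.0)
print('%d samples between %.1f s and %.1f s' % (len(segment), 2.0, 3.0))
```

With real data, the same call would look like `cut_window(raw_EMGs[0], raw_EMG_timeframe, 2.0, 3.0)`.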


## How to get raw spike timings?

Raw spike timings are stored in the `spikes` attribute and can be accessed like this:
* spike_timing = my
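
As a quick sanity check on spike timings, you can compute a mean firing rate per channel. The sketch below assumes that each entry of `spikes` is a 1-D array of spike times in seconds for one channel. That layout is an assumption, since only `len(my_cage_data.spikes)` is shown in this notebook. All values are made up.

```python
import numpy as np

# Made-up spike times (s) for two channels, standing in for my_cage_data.spikes.
spikes = [np.array([0.10, 0.50, 0.90, 1.30]), np.array([0.20, 1.80])]
duration = 2.0  # made-up recording length (s)

# Mean firing rate = number of spikes / recording length.
rates = [len(ch) / duration for ch in spikes]
for i, r in enumerate(rates):
    print('Channel %d: %.1f spikes/s' % (i, r))
```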

## How to get binned and downsampled spike counts and filtered EMG envelopes?

The binned data are stored in another attribute named `binned`. It is a dictionary and can be accessed like this:
* binned = my_cage_data.binned

There are 4 fields in this dictionary:
* spikes: binned spike counts
* filtered_EMG: EMG envelopes that have been rectified, filtered, and downsampled
* FSR_data: the data from force-sensitive resistors inside the plastic cage, only meaningful for power grasping
* timeframe: the common time frame for the 3 types of data above

The data in the files we are using now are binned with 50 ms time bins.
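
To show what "rectified, filtered and downsampled" can look like in practice, here is a rough sketch on synthetic noise. It uses a moving average as a simple stand-in for the low-pass filter. The actual filter type and cutoff used by `cage_data` are not stated in this notebook, so treat every parameter below as an assumption.

```python
import numpy as np

fs = 2000                      # made-up raw sampling rate (Hz)
bin_size = 0.05                # 50 ms bins, as in these files
samples_per_bin = int(round(fs * bin_size))   # 100 samples per bin

rng = np.random.default_rng(0)
raw = rng.standard_normal(fs * 2)             # 2 s of synthetic "raw EMG"

rectified = np.abs(raw)                       # full-wave rectification

# Moving average as a simple stand-in for a low-pass filter.
kernel = np.ones(samples_per_bin) / samples_per_bin
smoothed = np.convolve(rectified, kernel, mode='same')

# Downsample by averaging the samples inside each 50 ms bin.
n_bins = len(smoothed) // samples_per_bin
trimmed = smoothed[:n_bins * samples_per_bin]
envelope = trimmed.reshape(n_bins, samples_per_bin).mean(axis=1)
print(envelope.shape)
```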

In [7]:
print(my_cage_data.binned.keys())

# To get the binned spike counts
binned_spike_counts = my_cage_data.binned['spikes']

# To get the rectified, filtered and downsampled EMGs
filtered_EMG = my_cage_data.binned['filtered_EMG']

# To get the time frame of the binned data
timeframe = my_cage_data.binned['timeframe']

dict_keys(['timeframe', 'spikes', 'filtered_EMG', 'FSR_data'])
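
To relate the binned spike counts to raw spike timings, the sketch below bins a handful of made-up spike times into 50 ms bins with `np.histogram`. Labeling each bin by its left edge is a choice made for this example. How `cage_data` defines its own `timeframe` is not shown here.

```python
import numpy as np

bin_size = 0.05                                   # 50 ms bins
spike_times = np.array([0.01, 0.02, 0.07, 0.16])  # made-up spike times (s)
duration = 0.2                                    # made-up recording length (s)

# Bin edges 0, 0.05, ..., 0.2, built from integers to limit float drift.
n_bins = int(round(duration / bin_size))
edges = np.arange(n_bins + 1) * bin_size

counts, _ = np.histogram(spike_times, bins=edges)
timeframe = edges[:-1]        # here each bin is labeled by its left edge
print(counts)
```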
