# A pipeline for recording, processing & successfully merging e-phys, IMU and tracking data (v3.0.0).
### author: github/bartulem


#### **[Step 0]** Considerations before, during and after conducting the experiments.

1. Set up [SpikeGLX](https://billkarsh.github.io/SpikeGLX/) to accommodate your specific recording configuration.
2. Turn on and calibrate the IMU before every session: the system calibration does not need to reach 3, but the other calibration values do before the rat is plugged in. The time readings should not contain duplicates either.
3. Open Motive.
- Check whether the system is calibrated (continuous calibration should be on).
- If necessary, for each camera change strobe light settings to continuous light.
- Check that the rigid bodies for the head & arena LEDs exist (this enables on-line automatic labeling).
- Check whether the acquisition directory is the correct one.
- Check whether camera 1 is recording in MJPG/greyscale mode.
4. Put three circular markers on the back of the animal and plug it in.
5. Conduct the recording.
- In the NPX, tracking and IMU acquisition programs, you should see the microcontroller-generated random LED pulses.
- If the data acquisition looks OK in SpikeGLX, start recording.
- Start acquiring data on the IMU.
- Start recording in Motive.
- Keep it going for some time (e.g. 20-25 min).
- Stop recording in Motive.
- Stop acquiring data on the IMU.
- Stop recording in SpikeGLX.
6. It's good practice to label the back points immediately in Motive (the head & LEDs should already be labeled if the rigid bodies checked in step 3 exist) and export the data to a .csv file.
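
Item 2 above asks that the IMU time readings contain no duplicates. Below is a minimal sketch of such a check, assuming the time readings have already been parsed into a list of numbers. The function name `find_duplicate_times` and the sample values are hypothetical and not part of kisn_pylab.

```python
from collections import Counter


def find_duplicate_times(time_readings):
    """Return {timestamp: count} for every timestamp that occurs more than once."""
    counts = Counter(time_readings)
    return {t: n for t, n in counts.items() if n > 1}


# Hypothetical IMU time readings (ms); the value 1020 appears twice.
imu_times = [1000, 1010, 1020, 1020, 1030]
duplicates = find_duplicate_times(imu_times)
print(duplicates)
```

An empty dictionary means every time reading is unique.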

#### **[Step 1]** Merge Neuropixel sessions to run Kilosort2. This step is *optional*; you can skip it if you are interested in only one session.

As instance attributes, you determine:
1. the directories where the NPX .bin files are (all associated files should be in the same directory; also, make sure they're named in a way that will order them properly)
2. the desired paths to the future merged files
3. the desired paths to the future .pkl files

Thus, multiple unrelated data streams can be processed sequentially.

As inputs to the concat_npx function, note that you have the option to set:
1. file_type (concatenate lf or ap files; defaults to ap)
2. cmd_prompt (whether to do the merging through the terminal/cmd prompt; defaults to True)
3. nchan (the number of channels on the probe; defaults to 385)
4. npx_sampling_rate (sampling rate of the NPX system; defaults to 3e4)

Along with the merged file, this code outputs a .pkl file with information about changepoints of the merged sessions (necessary for extracting spike times later).
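
To illustrate the idea of merging binaries while keeping track of changepoints, here is a minimal sketch. It is not the `concat_npx` implementation: the function name `concat_binaries`, the 2-byte (int16) sample size, and storing the changepoints as a plain list of cumulative sample counts are all assumptions made for the example.

```python
import os
import pickle
import shutil
import tempfile


def concat_binaries(bin_paths, merged_path, pkl_path, nchan=385, bytes_per_sample=2):
    """Append files in the given order and save cumulative sample counts (changepoints)."""
    changepoints = [0]
    with open(merged_path, 'wb') as merged:
        for path in bin_paths:
            with open(path, 'rb') as part:
                shutil.copyfileobj(part, merged)
            n_samples = os.path.getsize(path) // (nchan * bytes_per_sample)
            changepoints.append(changepoints[-1] + n_samples)
    with open(pkl_path, 'wb') as pkl:
        pickle.dump(changepoints, pkl)
    return changepoints


# Toy demonstration: two tiny fake recordings with 4 channels of int16 samples.
tmp_dir = tempfile.mkdtemp()
parts = []
for name, n_samples in [('s1.bin', 3), ('s2.bin', 5)]:
    path = os.path.join(tmp_dir, name)
    with open(path, 'wb') as f:
        f.write(bytes(n_samples * 4 * 2))  # zero-filled placeholder data
    parts.append(path)

# Sorting by name mirrors the advice to name files so that they order properly.
changepoints = concat_binaries(sorted(parts),
                               os.path.join(tmp_dir, 'merged.bin'),
                               os.path.join(tmp_dir, 'lengths.pkl'),
                               nchan=4)
print(changepoints)  # [0, 3, 8]
```

Each pair of neighbouring changepoints marks the first and one-past-last sample of one original session in the merged file.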

In [None]:
from kisn_pylab import concatenate

file_directories = [r'A:\store\Bartul\neuropixel\26148_bruno\060520\spikes_imec0']
new_file_names = [r'A:\store\Bartul\neuropixel\26148_bruno\060520\spikes_imec0\060520_distal_all_g0_t0.imec0.ap.bin']
pkl_lengths = [r'A:\store\Bartul\neuropixel\26148_bruno\060520\060520_distal_all_g0_t0.imec0.ap.pkl']

In [None]:
for file_dir, new_file_name, pkl_len in zip(file_directories, new_file_names, pkl_lengths):
    concatClass = concatenate.Concat(file_dir, new_file_name, pkl_len)
    concatClass.concat_npx()

#### **[Step 2]** Run Kilosort2 through Python.

This step assumes you are happy with *everything* in the config file. If you need to modify anything, either the code needs to change or you complete this step in Matlab.
If this doesn't bother you, then you should do the following:
1. Download/clone [Kilosort2](https://github.com/MouseLand/Kilosort2) and set up the config, master and CUDA files accordingly.
2. Install [matlab engine](https://www.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html) (as an admin!).
3. Kilosort2 runs on all the .bin files in the given directory below. Make sure that this is what you want.
4. Don't use my Kilosort2 directory, but rather your own (created in item 1 of this list).
5. Set the file and Kilosort2 directories, and run the cell below.

!NB: While Kilosort2 is running, it's a good opportunity to label the tracking data if you haven't done so already!

As inputs to the run_kilosort function, note that you have to set:
1. file_dir (the absolute path to the directory the binary file is in)
2. kilosort2_dir (the absolute path to the directory the Kilosort2 code is in)

You, therefore, run one binary file at a time.
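
Because Kilosort2 picks up every .bin file in the directory, it can help to list them before starting. The following is a small sketch of such a check; the helper `list_bin_files` and the toy file names are hypothetical and not part of kisn_pylab.

```python
import tempfile
from pathlib import Path


def list_bin_files(file_dir):
    """Return the sorted names of all .bin files directly inside file_dir."""
    return sorted(p.name for p in Path(file_dir).glob('*.bin'))


# Toy directory standing in for file_dir.
demo_dir = Path(tempfile.mkdtemp())
for name in ['session_g0_t0.imec0.ap.bin', 'session_g0_t0.imec0.ap.meta']:
    (demo_dir / name).touch()

bin_files = list_bin_files(demo_dir)
print(bin_files)  # ['session_g0_t0.imec0.ap.bin']
```

If the list contains more files than you intend to sort, move the extra ones elsewhere before running the cell below.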

In [None]:
from kisn_pylab import kilosort

file_dir = r'A:\store\Bartul\neuropixel\26148_bruno\060520\spikes_imec0'
kilosort2_dir = r'A:\group\bartulm\Kilosort2-master'

In [None]:
kilosort.run_kilosort(file_dir, kilosort2_dir)

#### **[Step 3]** Arbitrate what is noise and what are clusters in Phy.
1. Install Phy: [Phy v2.0](https://github.com/cortex-lab/phy)
2. Navigate to the directory where the Kilosort2 results were saved, open PowerShell and type "cmd", followed by "activate phy2", followed by "phy template-gui params.py".
3. Complete the manual curation ([Phy tutorial](https://phy.readthedocs.io/en/latest/)) and save your work.

#### **[Step 4]** Read in the sync events (make sure the PC has enough memory to run this, say 64 GB of RAM) and put them in separate .txt files.

If you haven't done so already, label the tracked rigid bodies and marker sets in Motive (read the tutorial if you need to) and export the data:
1. File > Export Tracking Data.
2. The following options need to be OFF: (1) Unlabeled markers, (2) Rigid Bodies, (3) Rigid Body markers, (4) Bones, (5) Bone markers.
3. Click "Export" and you should have created a .csv file (it may take ~1 minute).

As instance attributes, you determine:
1. the list with the files whose sync events you'd like to read (in practice this would be the imec0 and imec1 files for a given recording session)
2. the absolute path of the future sync .pkl file

As inputs to the read_se function, note that you have the option to set:
1. nchan (the number of channels on the probe; defaults to 385)
2. sync_chan (the specific sync port channel on the probe; defaults to 385)
3. track_file (the absolute path to the tracking file for that session; defaults to 0)
4. imu_file (the absolute path to the IMU file for that session; defaults to 0)
5. imu_pkl (the absolute future path to the IMU dataframe .pkl file; defaults to 0)
6. jitter_samples (number of samples in the imec data across which LED jitter could arise; defaults to 3)
7. half_smooth_window (number of frames over which the tracking data is smoothed to correct NaNs in fully empty frames; defaults to 10)
8. ground_probe (in a multi probe setting, the probe other probes are synced to; if you only have imec1, this needs to be set to 1; defaults to 0)
9. frame_rate (the tracking camera frame rate for that session; defaults to 120)
10. npx_sampling_rate (the sampling rate of the NPX system; defaults to 3e4)
11. sync_sequence (the length of the sequence the LED events should be matched across data streams; defaults to 10)
12. sample_error (the tolerance, in ms, by which the presumed IMEC/IMU LED events are allowed to deviate; defaults to 20)
13. which_imu_time (the IMU time to be used in the analyses, loop.starttime (0) or sample.time (1); defaults to 1)

Therefore, you proceed one recording session at a time.
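
To make the roles of nchan and sync_chan concrete, here is a minimal sketch of pulling LED onsets out of an interleaved int16 binary. It is not the `read_se` implementation. The function name `read_sync_onsets`, the `chunk_samples` parameter, and the rule that any non-zero sync value counts as "LED on" are assumptions made for the example.

```python
import os
import struct
import tempfile


def read_sync_onsets(bin_path, nchan=385, sync_chan=385, chunk_samples=30000):
    """Return sample indices at which the sync channel goes from 0 to non-zero."""
    frame_bytes = nchan * 2        # one int16 value per channel per sample
    offset = (sync_chan - 1) * 2   # sync_chan is 1-based
    onsets, previous, sample = [], 0, 0
    with open(bin_path, 'rb') as f:
        while True:
            # Read in chunks so the whole file never has to sit in memory.
            chunk = f.read(frame_bytes * chunk_samples)
            if len(chunk) < frame_bytes:
                break
            for start in range(0, len(chunk) - frame_bytes + 1, frame_bytes):
                value = struct.unpack_from('<h', chunk, start + offset)[0]
                if previous == 0 and value != 0:
                    onsets.append(sample)
                previous = value
                sample += 1
    return onsets


# Toy file: 4 channels, the last one is the sync channel (high at samples 3-4 and 7).
sync_values = [0, 0, 0, 64, 64, 0, 0, 64, 0, 0]
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
with open(demo_path, 'wb') as f:
    for v in sync_values:
        f.write(struct.pack('<4h', 1, 2, 3, v))

led_onsets = read_sync_onsets(demo_path, nchan=4, sync_chan=4, chunk_samples=3)
print(led_onsets)  # [3, 7]
```

Onsets found this way are expressed in NPX samples; dividing by npx_sampling_rate converts them to seconds.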

In [None]:
from kisn_pylab import reader

npx_files = [r'D:\SGL_DATA\test_100520\distal_s2_sound_g0\distal_s2_sound_g0_imec1\distal_s2_sound_g0_t0.imec1.ap.bin']
track_file = r'A:\store\Bartul\neuropixel\test_100520\Sound\Take 2020-05-10 01.15.15 PM.csv'
imu_file = r'A:\store\Bartul\neuropixel\test_100520\Sound\CoolTerm Capture 2020-05-10 13-15-20.txt'
sync_df = r'A:\store\Bartul\neuropixel\test_100520\Sound\sync_df_100520_s2.pkl'
imu_pkl = r'A:\store\Bartul\neuropixel\test_100520\Sound\CoolTerm Capture 2020-05-10 13-15-20.pkl'

In [None]:
readClass = reader.EventReader(npx_files, sync_df)
readClass.read_se(track_file=track_file, imu_file=imu_file, imu_pkl=imu_pkl)

#### **[Step 5]** Load the sync data from the .pkl file(s) and analyze how well the tracking/IMU data are synced with NPX data.

Before running this step, make sure you have [plotly](https://plotly.com/python/getting-started/) installed.

You set the absolute paths to:
1. the sync_df .pkl files

Thus, multiple files can be processed in sequence.

As inputs to the estimate_sync_quality function, note that you have the option to set:
1. npx_sampling_rate (sampling rate of the NPX system; defaults to 3e4)
2. to_plot (plot or not to plot y_test and y_test_prediction statistics; defaults to False)
3. ground_probe (in a multi probe setting, the probe other probes are synced to; defaults to 0)
4. imu_files (the list of absolute paths to imu_pkl files that contain the raw IMU data; defaults to 0)
5. which_imu_time (the IMU time to be used in the analyses, loop.starttime (0) or sample.time (1); defaults to 1)

The imu_files should be ordered such that the first sync file corresponds to the IMU file of the same session, and so forth.
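
The y_test and y_test_prediction statistics suggest a fit-then-predict check of how well one clock maps onto the other. Here is a minimal sketch of that idea, assuming a simple linear relation between tracking frames and NPX samples. The functions `fit_line` and `sync_errors_ms`, the even/odd split of events, and the event values are hypothetical and do not describe what `estimate_sync_quality` does internally.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sync_errors_ms(track_frames, npx_samples, npx_sampling_rate=3e4):
    """Fit on even-indexed LED events, return prediction errors (ms) on odd-indexed ones."""
    slope, intercept = fit_line(track_frames[::2], npx_samples[::2])
    errors = []
    for frame, sample in zip(track_frames[1::2], npx_samples[1::2]):
        predicted = slope * frame + intercept
        errors.append((predicted - sample) / npx_sampling_rate * 1e3)
    return errors


# Hypothetical LED events: 120 fps camera, 30 kHz probe, perfectly linear relation.
track_frames = [0, 120, 240, 360, 480, 600]
npx_samples = [15000 + 250 * f for f in track_frames]
errors = sync_errors_ms(track_frames, npx_samples)
print(max(abs(e) for e in errors))
```

With real data the errors will not be zero; their spread indicates how tightly the two streams are synced.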

In [None]:
from kisn_pylab import synchronize

sync_pkls = [r'A:\store\Bartul\neuropixel\test_100520\No_sound\sync_df_100520_s1.pkl']
imu_files = [r'A:\store\Bartul\neuropixel\test_100520\No_sound\CoolTerm Capture 2020-05-10 12-31-19.pkl']
to_plot = 1

In [None]:
syncClass = synchronize.Sync(sync_pkls)
syncClass.estimate_sync_quality(to_plot=to_plot, imu_files=imu_files)

#### **[Step 6]** Split clusters back into individual sessions and get spike times. This step should be completed irrespective of the number of sorted sessions.

The variable the_dirs is a list of directories where Kilosort2 results are stored for each recording probe (processing one recording day in one go).

As inputs to the split_clusters function, note that you have the option to set:
1. nchan (the number of channels on the probe; defaults to 385)
2. one_session (whether you have only one session; defaults to True)
3. min_spikes (the minimum number of spikes in one session to consider the cluster worthy of saving; defaults to 100)
4. npx_sampling_rate (sampling rate of the NPX system; defaults to 3e4)
5. ground_probe (in a multi probe setting, the probe other probes are synced to; defaults to 0)
6. to_plot (plot or not to plot y_test and y_test_prediction statistics; defaults to False)
7. sync_pkls (paths to as many sync .pkl files as there are recording sessions; defaults to 0)
8. pkl_lengths (.pkl files that have information about where concatenated files were stitched together; defaults to 0)
9. print_details (whether or not to print details about spikes in every individual cluster; defaults to 0)

!NB: make sure each directory has the imec ID in the name!
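
The core idea of splitting spikes back into sessions can be sketched as follows. This assumes the changepoints are cumulative sample counts of the merged sessions and that spike times are sample indices into the merged file. The function `split_spikes` and the example numbers are hypothetical, and this is not the `split_clusters` implementation.

```python
from bisect import bisect_right


def split_spikes(spike_samples, changepoints, npx_sampling_rate=3e4):
    """Assign each spike to a session and convert it to seconds from that session's start."""
    sessions = [[] for _ in range(len(changepoints) - 1)]
    for spike in spike_samples:
        idx = bisect_right(changepoints, spike) - 1
        if 0 <= idx < len(sessions):  # ignore spikes outside the merged range
            sessions[idx].append((spike - changepoints[idx]) / npx_sampling_rate)
    return sessions


# Hypothetical changepoints for two merged sessions of 60000 and 90000 samples.
changepoints = [0, 60000, 150000]
spike_samples = [30000, 59999, 60000, 120000]
per_session = split_spikes(spike_samples, changepoints)
print(per_session)
```

A min_spikes-style filter could then discard clusters whose per-session spike count is too low.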

In [None]:
from kisn_pylab import spikes2sessions

the_dirs = [r'A:\store\Bartul\neuropixel\26148_bruno\060520\spikes_imec0']
sync_pkls = [r'A:\store\Bartul\neuropixel\05022020_both_distal_g0_t0.imec0.ap.pkl']
pkl_lengths = [r'A:\store\Bartul\neuropixel\26148_bruno\060520\060520_distal_all_g0_t0.imec0.ap.pkl']
nchan = 385
one_session = 0
min_spikes = 100
to_plot = 1

In [None]:
sstClass = spikes2sessions.ExtractSpikes(the_dirs)
sstClass.split_clusters(one_session=one_session, min_spikes=min_spikes, nchan=nchan, pkl_lengths=pkl_lengths)

#### **[Step 7]** Create .pkl file for GUI.

As instance attributes, you determine:
1. the absolute paths where the .csv (tracking) files are (tracking files with the suffix "final" should be used!)
2. the absolute paths where the .pkl (sync event) files are

Thus, multiple files can be processed in sequence. Every GUI .pkl file is saved to the same directory as the tracking .csv file.

!NB: The rat-cam is meant to be used for the raw tracking video, which should be exported to match the sequence from the start of the first to the start of the last LED event!

As inputs to the csv_to_pkl function, note that you have the option to set:
1. frame_rate (the tracking camera frame rate; if you do not set it manually, it is read from the sync .pkl file)
2. npx_sampling_rate (sampling rate of the NPX system; defaults to 3e4)
3. ground_probe (in a multi probe setting, the probe other probes are synced to; defaults to 0)
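
For intuition about how a frame rate could be derived from sync information, here is a minimal sketch that divides the frames elapsed between two LED events by the NPX-clock time elapsed between them. This is an assumption for illustration only: the function `empirical_frame_rate` and the event values are hypothetical, and the source does not state how the value stored in the sync .pkl file is computed.

```python
def empirical_frame_rate(first_frame, last_frame, first_sample, last_sample,
                         npx_sampling_rate=3e4):
    """Frames elapsed divided by NPX-clock seconds elapsed between two LED events."""
    elapsed_s = (last_sample - first_sample) / npx_sampling_rate
    return (last_frame - first_frame) / elapsed_s


# Hypothetical first and last LED events of a 20 min session.
rate = empirical_frame_rate(first_frame=100, last_frame=144100,
                            first_sample=45000, last_sample=36045000)
print(rate)  # 120.0
```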

In [None]:
from kisn_pylab import motive2GUI

the_csvs = [r'A:\store\Bartul\neuropixel\test_100520\No_sound\Take 2020-05-10 12.31.15 PM_final.csv']
sync_pkls = [r'A:\store\Bartul\neuropixel\test_100520\No_sound\sync_df_100520_s1.pkl']

In [None]:
for the_csv, sync_pkl in zip(the_csvs, sync_pkls):
    mtgClass = motive2GUI.Transformer(the_csv, sync_pkl)
    mtgClass.csv_to_pkl()

#### **[Step 8]** Final processing steps before you start analyzing.

1. create the head in the GUI (trackedpointdata_V3_5_LEDs.py version)
2. load the spiking .mat files
3. export everything as a .mat file