In [1]:
%matplotlib inline

# Preprocessing and Spike Sorting Tutorial

- In this introductory example, you will see how to use `spikeinterface` to perform a full electrophysiology analysis.
- We will first create some simulated data, and we will then perform some pre-processing, run a couple of spike sorting algorithms, inspect and validate the results, export to Phy, and compare spike sorters.


In [2]:
import os
import pickle
import _pickle as cPickle
import glob
import warnings
import git
import imp



In [3]:
from collections import defaultdict
import time
import json
from datetime import datetime

In [4]:
import matplotlib.pyplot as plt
from matplotlib.pyplot import cm
import numpy as np
import pandas as pd
import scipy.signal

In [5]:
# Changing the figure size
from matplotlib.pyplot import figure
figure(figsize=(8, 6), dpi=80)

<Figure size 640x480 with 0 Axes>


The `spikeinterface` module by itself imports only the `spikeinterface.core` submodule,
which is not useful for end users.



In [6]:
import spikeinterface

We need to import the different submodules one by one (preferred).
There are 5 modules:

- `extractors` : file IO
- `toolkit` : processing toolkit for pre-, post-processing, validation, and automatic curation (the imports below use `spikeinterface.preprocessing` for the pre-processing functions)
- `sorters` : Python wrappers of spike sorters
- `comparison` : comparison of spike sorting output
- `widgets` : visualization



In [7]:
import spikeinterface as si  # import core only
import spikeinterface.extractors as se
import spikeinterface.sorters as ss
import spikeinterface.preprocessing as sp

import spikeinterface.comparison as sc
import spikeinterface.widgets as sw
from spikeinterface.exporters import export_to_phy
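- Because each submodule has its own dependencies, it can help to check which ones are installed before running the rest of the notebook. The sketch below is only an illustration that uses the standard library; the helper name `is_available` is hypothetical and not part of SpikeInterface.

```python
import importlib.util


def is_available(module_name):
    """Return True if `module_name` can be located on this system."""
    try:
        # For dotted names, find_spec imports the parent package first
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when the parent package of a dotted name is missing
        return False


wanted = [
    "spikeinterface.extractors",
    "spikeinterface.preprocessing",
    "spikeinterface.sorters",
    "spikeinterface.comparison",
    "spikeinterface.widgets",
    "spikeinterface.exporters",
]
for name in wanted:
    print("{}: {}".format(name, "found" if is_available(name) else "missing"))
```

- Any submodule reported as missing would need to be installed before the corresponding import cell can run.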

In [8]:
import spikeinterface.core

In [9]:
from probeinterface import get_probe
from probeinterface.plotting import plot_probe, plot_probe_group
from probeinterface import write_prb, read_prb

In [10]:
import mountainsort5 as ms5

We can also import all submodules at once with `spikeinterface.full`.
  This internally imports core+extractors+toolkit+sorters+comparison+widgets+exporters.

This is useful for notebooks, but it is a heavier import because many more dependencies
are imported internally (scipy/sklearn/networkx/matplotlib/h5py...).



In [11]:
import spikeinterface.full as si

In [12]:
# Increase size of plot in jupyter

plt.rcParams["figure.figsize"] = (10,6)

- Getting the root directory of the GitHub repository so that file paths can be built relative to it

In [13]:
git_repo = git.Repo(".", search_parent_directories=True)
git_root = git_repo.git.rev_parse("--show-toplevel")

In [14]:
git_root

'/nancy/projects/reward_competition_extention'
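- If GitPython is not installed, the repository root can also be found by walking up the directory tree until a `.git` entry appears. This is a minimal sketch under that assumption; `find_repo_root` is a hypothetical helper, and the temporary folders exist only to make the example self-contained.

```python
import tempfile
from pathlib import Path


def find_repo_root(start="."):
    """Walk upwards from `start` until a directory containing `.git` is found."""
    start_path = Path(start).resolve()
    for candidate in [start_path, *start_path.parents]:
        if (candidate / ".git").exists():
            return str(candidate)
    raise FileNotFoundError("No .git entry found above {}".format(start_path))


# Demonstration on a throwaway directory tree that mimics a repository
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".git").mkdir()
    nested = Path(tmp) / "notebooks" / "sorting"
    nested.mkdir(parents=True)
    found_root = find_repo_root(nested)
    root_matches = found_root == str(Path(tmp).resolve())

print(root_matches)
```

- In this notebook, `find_repo_root(".")` would play the same role as `git_root`.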

# Part 0: Loading in the Probe

In [15]:
probe_filepath_glob = "data/*.prb"

In [16]:
probe_absolutepath_glob = os.path.join(git_root, probe_filepath_glob)

In [17]:
# Getting the file paths of all the probe files (which all end in `.prb`)
all_probe_files = glob.glob(probe_absolutepath_glob, recursive=True)

In [18]:
all_probe_files

['/nancy/projects/reward_competition_extention/data/linear_probe_with_large_spaces.prb']

- If you have more than one probe file, then you must either:
    - A. Put the index of the file in `all_probe_files[0]` below. You would replace the `0` with the correct index. (Remember, Python is zero-indexed, so the first file in the list is 0, the second is 1, and so forth.)
    - B. Add an absolute or relative path to `open({./path/to/probe_file.prb})` below. You would replace `{./path/to/probe_file.prb}` with the path of the probe file.
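- The two options above can be sketched as one small helper. This is only an illustration: `select_probe_file` and the example file names are hypothetical and are not used elsewhere in this notebook.

```python
def select_probe_file(probe_files, index=0, explicit_path=None):
    """Pick one probe file: an explicit path wins, otherwise use the list index."""
    if explicit_path is not None:
        # Option B: the caller points directly at a file
        return explicit_path
    if not probe_files:
        raise FileNotFoundError("No probe files were found")
    # Option A: sorting keeps the index stable, since glob does not guarantee an order
    return sorted(probe_files)[index]


example_files = ["data/probe_b.prb", "data/probe_a.prb"]
print(select_probe_file(example_files, index=1))
print(select_probe_file(example_files, explicit_path="other/custom.prb"))
```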

In [19]:
if len(all_probe_files) < 1:
    warnings.warn("There are no parameter files in the directory that you specified. Please add a file, or correct the directory path")
else:
    probe_parameters = imp.load_source("probe_parameters", all_probe_files[0])
    with open(all_probe_files[0]) as info_file:
        for line in info_file:
            # Each line already ends with a newline, so suppress the one print adds
            print(line, end="")

channel_groups = {0: {'channels': [0,
                  1,
                  2,
                  3,
                  4,
                  5,
                  6,
                  7,
                  8,
                  9,
                  10,
                  11,
                  12,
                  13,
                  14,
                  15,
                  16,
                  17,
                  18,
                  19,
                  20,
                  21,
                  22,
                  23,
                  24,
                  25,
                  26,
                  27,
                  28,
                  29,
                  30,
                  31],
     'geometry':{
    0: (0, 0),
    1: (5, 20),
    2: (-7, 40),
    3: (9, 60),
    4: (-11, 80),
    5: (13, 100),
    6: (-15, 120),
    7: (17, 140),
    8: (-19, 160),
    9: (21, 180),
    10: (-23, 200),
    11: (25, 220),
    12: (-27

- Reading the probe information into SpikeInterface

In [20]:
if len(all_probe_files) < 1:
    warnings.warn("There are no parameter files in the directory that you specified. Please add a file, or correct the directory path")
else:
    # Reading in the probe data
    probe_object = read_prb(all_probe_files[0])

In [21]:
probe_object.to_dataframe()

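| | probe_index | x | y | contact_shapes | radius | shank_ids | contact_ids |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.0 | 0.0 | circle | 5.0 | | |
| 1 | 0 | 5.0 | 20.0 | circle | 5.0 | | |
| 2 | 0 | -7.0 | 40.0 | circle | 5.0 | | |
| 3 | 0 | 9.0 | 60.0 | circle | 5.0 | | |
| 4 | 0 | -11.0 | 80.0 | circle | 5.0 | | |
| 5 | 0 | 13.0 | 100.0 | circle | 5.0 | | |
| 6 | 0 | -15.0 | 120.0 | circle | 5.0 | | |
| 7 | 0 | 17.0 | 140.0 | circle | 5.0 | | |
| 8 | 0 | -19.0 | 160.0 | circle | 5.0 | | |
| 9 | 0 | 21.0 | 180.0 | circle | 5.0 | | |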


In [22]:
probe_object.get_global_contact_ids()

array(['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
       '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
      dtype='<U64')

In [23]:
probe_object.get_global_device_channel_indices()

array([(0,  0), (0,  1), (0,  2), (0,  3), (0,  4), (0,  5), (0,  6),
       (0,  7), (0,  8), (0,  9), (0, 10), (0, 11), (0, 12), (0, 13),
       (0, 14), (0, 15), (0, 16), (0, 17), (0, 18), (0, 19), (0, 20),
       (0, 21), (0, 22), (0, 23), (0, 24), (0, 25), (0, 26), (0, 27),
       (0, 28), (0, 29), (0, 30), (0, 31)],
      dtype=[('probe_index', '<i8'), ('device_channel_indices', '<i8')])

- Creating a dictionary of all the variables in the probe file

In [24]:
if 'probe_parameters' in locals():
    probe_dict = defaultdict(dict)
    for attribute in dir(probe_parameters):
        # Removing built in attributes
        if not attribute.startswith("__"): 
            probe_dict[attribute] = getattr(probe_parameters, attribute)

In [25]:
if "probe_dict" in locals():
    for key, value in probe_dict.items():
        print("{}: {}".format(key, value))

channel_groups: {0: {'channels': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], 'geometry': {0: (0, 0), 1: (5, 20), 2: (-7, 40), 3: (9, 60), 4: (-11, 80), 5: (13, 100), 6: (-15, 120), 7: (17, 140), 8: (-19, 160), 9: (21, 180), 10: (-23, 200), 11: (25, 220), 12: (-27, 240), 13: (29, 260), 14: (-31, 280), 15: (33, 300), 16: (-35, 320), 17: (37, 340), 18: (-39, 360), 19: (41, 380), 20: (-43, 400), 21: (45, 420), 22: (-47, 440), 23: (49, 460), 24: (-51, 480), 25: (53, 500), 26: (-55, 520), 27: (57, 540), 28: (-59, 560), 29: (61, 580), 30: (-63, 600), 31: (65, 620)}, 'graph': [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15), (15, 16), (16, 17), (17, 18), (18, 19), (19, 20), (20, 21), (21, 22), (22, 23), (23, 24), (24, 25), (25, 26), (26, 27), (27, 28), (28, 29), (29, 30), (30, 31)]}}
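- The `imp` module used above is deprecated and was removed in Python 3.12. One possible replacement, sketched here with the standard library only, is `runpy.run_path`, which executes a file and returns its variables as a dictionary. The small probe file below is a made-up stand-in so that the example runs on its own. As with `imp.load_source`, this executes the file, so it should only be used on files you trust.

```python
import runpy
import tempfile
from pathlib import Path

# A tiny stand-in probe file, written only so that this sketch is self-contained
example_prb = """
channel_groups = {
    0: {
        'channels': [0, 1, 2],
        'geometry': {0: (0, 0), 1: (5, 20), 2: (-7, 40)},
        'graph': [(0, 1), (1, 2)],
    }
}
"""

with tempfile.TemporaryDirectory() as tmp:
    prb_path = Path(tmp) / "example.prb"
    prb_path.write_text(example_prb)
    # run_path executes the file and returns its global namespace as a dict
    namespace = runpy.run_path(str(prb_path))

# Drop dunder entries such as __name__ and __builtins__
probe_dict = {key: value for key, value in namespace.items() if not key.startswith("__")}
print(probe_dict["channel_groups"][0]["channels"])  # prints [0, 1, 2]
```

- The resulting `probe_dict` has the same shape as the one built from `dir(probe_parameters)` above, without needing a module object in between.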


# Part 1: Importing Data

## Loading in the Electrophysiology Recording

- We are loading the electrophysiology recording data with probe information. This should have been created in the previous notebook in a directory created by SpikeInterface. If you have already read in your own electrophysiology recording data with probe information in a different way, then follow these instructions.
    - If you want to use a different directory, then you must either:
        - Change `glob.glob({./path/to/with/*/recording_raw})` to point to the directories created by SpikeInterface. You can use a wildcard if you have multiple folders. You would replace `{./path/to/with/*/recording_raw}` with the path to either the parent directory or the actual directory containing the electrophysiology recording data read into SpikeInterface.
        - Or change `(file_or_folder_or_dict={./path/to/recording_raw})`. You would replace `{./path/to/recording_raw}` with the path to either the parent directory or the actual directory containing the electrophysiology recording data read into SpikeInterface.
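- To see how a wildcard pattern picks up recordings across several session folders, here is a minimal sketch on a temporary directory. The folder and file names are invented for the example. Note that `recursive=True` only changes the behaviour of the `**` wildcard, so a pattern built from single `*` wildcards matches exactly one directory level per `*`.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical session folders, each holding one merged recording
    for session in ["session_a.rec", "session_b.rec"]:
        os.makedirs(os.path.join(root, session))
        merged_name = session.replace(".rec", "_base_1_merged.rec")
        open(os.path.join(root, session, merged_name), "w").close()
    # A file without "base" in its name should not match the pattern
    open(os.path.join(root, "session_a.rec", "session_a_top_only.rec"), "w").close()

    pattern = os.path.join(root, "*", "*base*.rec")
    matches = sorted(glob.glob(pattern))
    basenames = [os.path.basename(path) for path in matches]

print(basenames)
```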

In [26]:
recording_filepath_glob = "/scratch/back_up/reward_competition_extention/data/pilot/*/*base*.rec"

In [27]:
all_recording_files = glob.glob(recording_filepath_glob, recursive=True)

In [28]:
all_recording_files

['/scratch/back_up/reward_competition_extention/data/pilot/20221203_154800_omission_and_competition_subject_6_4_and_6_1.rec/20221203_154800_omission_and_competition_subject_6_1_top_1_base_3_merged.rec',
 '/scratch/back_up/reward_competition_extention/data/pilot/20221122_161341_omission_subject_6_1_and_6_3.rec/20221122_161341_omission_subject_6_1_top_4_base_2.rec',
 '/scratch/back_up/reward_competition_extention/data/pilot/20221125_152723_competition_subject_6_1_and_6_2.rec/20221125_152723_competition_subject_6_1_top_3_base_2_merged.rec',
 '/scratch/back_up/reward_competition_extention/data/pilot/20221125_144832_omission_subject_6_1_and_6_2.rec/20221125_144832_omission_subject_6_1_top_1_base_2_merged.rec',
 '/scratch/back_up/reward_competition_extention/data/pilot/20221122_164720_competition_subject_6_1_and_6_3.rec/20221122_164720_competition_6_1_top_3__base_3_merged.rec',
 '/scratch/back_up/reward_competition_extention/data/pilot/20221215_145401_comp_amd_om_6_1_and_6_3.rec/20221215_145

# Part 2: Sorting

In [29]:
successful_files = [] 
failed_files = []
for recording_file in all_recording_files:
    try:
        trodes_recording = se.read_spikegadgets(recording_file, stream_id="trodes")       
        trodes_recording = trodes_recording.set_probes(probe_object)
        recording_basename = os.path.basename(recording_file)
        recording_output_directory = "/scratch/back_up/reward_competition_extention/proc/spike_sorting/{}".format(recording_basename)
        
        os.makedirs(recording_output_directory, exist_ok=True)
        print("Output directory: {}".format(recording_output_directory))
        child_spikesorting_output_directory = os.path.join(recording_output_directory,"ss_output")
               
        if not os.path.exists(child_spikesorting_output_directory):
            start = time.time()
            # Make sure the recording is preprocessed appropriately
            # lazy preprocessing
            recording_filtered = sp.bandpass_filter(trodes_recording, freq_min=300, freq_max=6000)
            recording_preprocessed: si.BaseRecording = sp.whiten(recording_filtered, dtype='float32')
            spike_sorted_object = ms5.sorting_scheme2(
                recording=recording_preprocessed,
                sorting_parameters=ms5.Scheme2SortingParameters(
                    detect_sign=0,
                    phase1_detect_channel_radius=700,
                    detect_channel_radius=700,
                    # other parameters...
                ),
            )
    
            spike_sorted_object.save(folder=child_spikesorting_output_directory)
    
            print("Sorting finished in: ", time.time() - start)
            
            
        else:
            warnings.warn("""Directory already exists for: {}. 
            Either continue on if you are satisfied with the previous run 
            or delete the directory and run this cell again""".format(child_spikesorting_output_directory))
            continue
                        
        sw.plot_rasters(spike_sorted_object)
        plt.title(recording_basename)
        plt.ylabel("Unit IDs")
        
        plt.savefig(os.path.join(recording_output_directory, "{}_raster_plot.png".format(recording_basename)))
        plt.close()
        
        waveform_output_directory = os.path.join(recording_output_directory, "waveforms")
        
        we_spike_sorted = si.extract_waveforms(
            recording=recording_preprocessed,
            sorting=spike_sorted_object,
            folder=waveform_output_directory,
            ms_before=1, ms_after=1, progress_bar=True,
            n_jobs=8, total_memory="1G", overwrite=True,
            max_spikes_per_unit=2000,
        )
        
        phy_output_directory = os.path.join(recording_output_directory, "phy")
        
        export_to_phy(we_spike_sorted, phy_output_directory,
              compute_pc_features=True, compute_amplitudes=True, remove_if_exists=False)
        successful_files.append(recording_file)
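    except Exception as e:
        # Report which recording failed so the error can be traced back to a file
        print("Failed on {}: {}".format(recording_file, e))
        failed_files.append(recording_file)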


Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221203_154800_omission_and_competition_subject_6_1_top_1_base_3_merged.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221122_161341_omission_subject_6_1_top_4_base_2.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221125_152723_competition_subject_6_1_top_3_base_2_merged.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221125_144832_omission_subject_6_1_top_1_base_2_merged.rec


            Either continue on if you are satisfied with the previous run 
            or delete the directory and run this cell again


Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221122_164720_competition_6_1_top_3__base_3_merged.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221215_145401_comp_amd_om_6_1_top_4_base_3.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221202_134600_omission_and_competition_subject_6_1_top_2_base_3_merged.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221123_113957_omission_subject_6_1_top_4_base_2_merged.rec
Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221214_125409_om_and_comp_6_1_top_1_base_2_vs_6_3.rec


            Either continue on if you are satisfied with the previous run 
            or delete the directory and run this cell again


Output directory: /scratch/back_up/reward_competition_extention/proc/spike_sorting/20221123_122652_competition_subject_6_1_top_3_base_3.rec


            Either continue on if you are satisfied with the previous run 
            or delete the directory and run this cell again
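- When many recordings are processed in one loop, keeping the full traceback for each failure makes it easier to see which step broke. The sketch below shows that pattern in isolation; `run_batch` and `fake_sort` are hypothetical stand-ins and do not call any spike sorter.

```python
import traceback


def run_batch(items, process):
    """Apply `process` to each item, collecting successes and failure tracebacks."""
    succeeded, failed = [], {}
    for item in items:
        try:
            process(item)
            succeeded.append(item)
        except Exception:
            # Keep the full traceback so the failing step can be found later
            failed[item] = traceback.format_exc()
    return succeeded, failed


def fake_sort(recording_file):
    # Hypothetical stand-in for the sorting steps in the loop above
    if "bad" in recording_file:
        raise ValueError("could not read {}".format(recording_file))


succeeded, failed = run_batch(["a.rec", "bad.rec", "c.rec"], fake_sort)
print(succeeded)
print(list(failed))
```

- Here `succeeded` and `failed` correspond to `successful_files` and `failed_files` above, with the difference that each failure also carries its traceback text.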


In [30]:
raise ValueError()

ValueError: 