# Spike sorting

Ready to move on? Let's try to figure out what we have in terms of action potentials. Extracting action potentials from multielectrode voltage traces, and assigning them to specific neurons, has long been more art than science. One reason is that ground truth is generally not available, so quantifying whether one algorithm performs better than another has been difficult. Also, traditionally, different spike sorters have required input data in slightly different formats, and presented their results in different formats. Format conversion is not hard, but unpleasant enough that unbiased and quantitative comparison has been rare. And spike sorting is a slow process, so running lots of spike sorters on your data requires a real commitment of time.

Faster computers and the publication of a generalized interface to spike sorting have improved the situation recently. In this exercise, you will feed a section of our data into one modern spike sorter, then compare your results with other students who used different sorters.

## Installing the sorter and the generalized interface

The wrapper software is called "spikeinterface" (https://elifesciences.org/articles/61834) and is just one pip away:

In [1]:
!pip install spikeinterface

Collecting spikeinterface
  Downloading spikeinterface-0.98.0-py3-none-any.whl (747 kB)
Collecting neo>=0.11.1 (from spikeinterface)
  Downloading neo-0.12.0-py3-none-any.whl (586 kB)
Collecting probeinterface>=0.2.16 (from spikeinterface)
  Downloading probeinterface-0.2.17-py3-none-any.whl (36 kB)
Collecting quantities>=0.14.1 (from neo>=0.11.1->spikeinterface)
  Downloading quantities-0.14.1-py3-none-any.whl (87 kB)
Installing collected packages: quantities, probeinterface, neo, spikeinterface
Successfully installed neo-0.12.0 probeinterface-0.2.17 quantities-0.14.1 spikeinterface-0.98.0


It comes pre-configured with just a few spike sorters:

In [2]:
import spikeinterface.sorters as ss
ss.installed_sorters()

['spykingcircus2', 'tridesclous2']

but we can easily install several more:

In [3]:
!pip install herdingspikes
!pip install mountainsort5

Collecting herdingspikes
  Downloading herdingspikes-0.3.102.tar.gz (458 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: herdingspikes
  Building wheel for herdingspikes (pyproject.toml) ... done
  Created wheel for herdingspikes: filename=herdingspikes-0.3.102-cp310-cp310-linux_x86_64.whl size=1412247 sha256=d3ba9a1db8a30b909d63d5f99b6b9bba13d34537b6affbdd11914561b34ea963
  Stored in direct
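
Before going further, it can be worth confirming that the freshly installed packages are actually importable in the current runtime. The following is a minimal sketch using only the standard library; `find_missing` is a hypothetical helper name, and the module names are assumed to match the pip package names used above.

```python
from importlib.util import find_spec


def find_missing(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names if find_spec(name) is None]


# ASSUMPTION: the import names are the same as the pip package names above.
missing = find_missing(["spikeinterface", "herdingspikes", "mountainsort5"])
if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("All sorter packages are importable.")
```

This only checks that Python can find each package. It does not import them, so it will not catch a build that installed but is broken.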

The standard invocation for importing spikeinterface is a little elaborate:

In [4]:
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.postprocessing as spost
import spikeinterface.qualitymetrics as sqm
import spikeinterface.comparison as sc
import spikeinterface.exporters as sexp
import spikeinterface.widgets as sw
from probeinterface import Probe
from probeinterface.plotting import plot_probe

import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path


## Loading data into spikeinterface

Loading raw data into spikeinterface is straightforward, though I had to jump through some hoops to make it load our SALPA-preprocessed data. Check out the "SI Hoops" notebook for details. It also teaches you how to tell spikeinterface about the geometry of the Neuropixels probe.

In [5]:
from google.colab import drive
drive.mount('/content/drive')
!ls /content/drive/MyDrive/datasai-daw

Mounted at /content/drive
data  notebooks  __pycache__  vizly.py


In [6]:
root = "/content/drive/MyDrive/datasai-daw/data/2021-07-20_11-59-01"
src = Path(root) / "Record Node 115"

To load the raw data, you would do:

    rec = se.read_openephys(src, stream_name="Record Node 115#Neuropix-PXI-111.0")

However, we will read the pre-processed data:

In [7]:
rec = si.load_extractor(src / "salpa")

In [10]:
# fs_Hz = 30e3 # Now irrelevant [was: check that this is really true]

In [15]:
fs_Hz = rec.get_sampling_frequency()
fs_Hz

30000.0

This data set is an hour long, so spike sorting can take many hours. For the purposes of this tutorial, we will work with a subset of the data:

In [11]:
rec_sub = rec.frame_slice(start_frame=0, end_frame=int(5 * 60 * fs_Hz))  # grab the first 5 minutes; frame indices must be integers

That's still 6.5 GB of data, so feel free to experiment with an even shorter snippet. However, too short a snippet will make the sorters produce unreliable output.
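
To decide how short a snippet you can afford, it helps to estimate its size before slicing. The sketch below does the arithmetic in plain Python. The channel count (384) and sample width (2 bytes, i.e. int16) are assumptions about this recording rather than values read from the data, and `snippet_frames` and `snippet_bytes` are hypothetical helper names.

```python
def snippet_frames(duration_s, fs_hz):
    """Number of whole sample frames in a snippet of the given duration."""
    return int(round(duration_s * fs_hz))


def snippet_bytes(n_frames, n_channels, bytes_per_sample):
    """Uncompressed size of a snippet, in bytes."""
    return n_frames * n_channels * bytes_per_sample


fs_hz = 30000.0        # sampling rate reported by rec.get_sampling_frequency()
n_channels = 384       # ASSUMPTION: all channels of a Neuropixels 1.0 probe
bytes_per_sample = 2   # ASSUMPTION: samples stored as int16

n_frames = snippet_frames(5 * 60, fs_hz)
size_bytes = snippet_bytes(n_frames, n_channels, bytes_per_sample)
print(f"{n_frames} frames, {size_bytes / 1e9:.2f} GB ({size_bytes / 2**30:.2f} GiB)")
```

Under these assumptions, five minutes comes to about 6.9 GB (6.4 GiB), roughly in line with the figure quoted above. In the notebook you could replace the assumed values with `rec.get_num_channels()` and the itemsize of `rec.get_dtype()`. The resulting frame count is what you would pass as `end_frame` to `rec.frame_slice`.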

Choose one of the installed sorters:

In [12]:
ss.installed_sorters()

['herdingspikes', 'mountainsort5', 'spykingcircus2', 'tridesclous2']

and educate yourself on the available parameters for that sorter:

In [13]:
ss.get_default_sorter_params('herdingspikes') # or 'mountainsort5', etc.

{'clustering_bandwidth': 5.5,
 'clustering_alpha': 5.5,
 'clustering_n_jobs': -1,
 'clustering_bin_seeding': True,
 'clustering_min_bin_freq': 16,
 'clustering_subset': None,
 'left_cutout_time': 0.3,
 'right_cutout_time': 1.8,
 'detect_threshold': 20,
 'probe_masked_channels': [],
 'probe_inner_radius': 70,
 'probe_neighbor_radius': 90,
 'probe_event_length': 0.26,
 'probe_peak_jitter': 0.2,
 't_inc': 100000,
 'num_com_centers': 1,
 'maa': 12,
 'ahpthr': 11,
 'out_file_name': 'HS2_detected',
 'decay_filtering': False,
 'save_all': False,
 'amp_evaluation_time': 0.4,
 'spk_evaluation_time': 1.0,
 'pca_ncomponents': 2,
 'pca_whiten': True,
 'freq_min': 300.0,
 'freq_max': 6000.0,
 'filter': True,
 'pre_scale': True,
 'pre_scale_value': 20.0,
 'filter_duplicates': True}

It is worth reading the sorter's own documentation to see what its authors say about these parameters. Pay particular attention to options that let the sorter use more than one CPU core or a GPU (for example, `clustering_n_jobs` in the herdingspikes defaults above). Also, make sure that your Colab runtime has a GPU and plenty of memory.
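
Since each sorter accepts a different set of parameters, one way to avoid surprises is to check your overrides against the sorter's defaults before launching a long run. The following is a minimal sketch of that idea; `check_overrides` is a hypothetical helper, the defaults are a short excerpt of the herdingspikes dictionary printed above, and `detect_thresh` is a deliberately misspelled key used for illustration.

```python
def check_overrides(defaults, overrides):
    """Split overrides into those the sorter knows about and those it does not."""
    accepted = {key: value for key, value in overrides.items() if key in defaults}
    rejected = sorted(key for key in overrides if key not in defaults)
    return accepted, rejected


# Excerpt of the herdingspikes defaults shown above (not the full dictionary).
hs_defaults = {
    "detect_threshold": 20,
    "freq_min": 300.0,
    "freq_max": 6000.0,
    "filter": True,
}

# 'detect_thresh' is a hypothetical typo for 'detect_threshold'.
accepted, rejected = check_overrides(hs_defaults, {"filter": False, "detect_thresh": 15})
print("accepted:", accepted)
print("rejected:", rejected)
```

In the notebook, you would pass the full dictionary returned by `ss.get_default_sorter_params(...)` for your chosen sorter instead of the excerpt. Any key that ends up in `rejected` is probably a typo or a parameter that belongs to a different sorter.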

Next, set a destination folder:

In [14]:
dst = Path("/content/drive/MyDrive")

and run *one* of the following:

In [None]:
sorting_hs = ss.run_sorter("herdingspikes", rec_sub, output_folder=dst / 'res_slp_hs', verbose=True, filter=False)

In [None]:
sorting_ms = ss.run_sorter("mountainsort5", rec_sub, output_folder=dst / 'res_slp_ms', verbose=True, filter=False)

In [None]:
sorting_tri = ss.run_sorter("tridesclous2", rec_sub, output_folder=dst / 'res_slp_tri', verbose=True, filter=False)

In [None]:
sorting_sc2 = ss.run_sorter("spykingcircus2", rec_sub, output_folder=dst / 'res_slp_sc2', verbose=True, filter=False)

You may well run into a few errors. That's OK. Resolving those is part of the exercise. But don't bang your head against any brick walls. Ask for help instead!
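
Once a sorter finishes, a quick sanity check is to look at how many units it found and how fast each one fires. The sketch below computes mean firing rates with NumPy on synthetic spike trains, so it runs without any sorter output. The data and the `firing_rates` helper are illustrative stand-ins, not results from our recording.

```python
import numpy as np


def firing_rates(spike_trains, duration_s):
    """Mean firing rate in Hz for each unit, given spike times as sample indices."""
    return {unit: len(train) / duration_s for unit, train in spike_trains.items()}


# Synthetic stand-in for sorter output: two units in a 5-minute snippet.
rng = np.random.default_rng(0)
fs_hz = 30000.0
duration_s = 300.0
n_frames = int(duration_s * fs_hz)
spike_trains = {
    0: np.sort(rng.choice(n_frames, size=1500, replace=False)),
    1: np.sort(rng.choice(n_frames, size=300, replace=False)),
}

rates = firing_rates(spike_trains, duration_s)
for unit, rate in rates.items():
    print(f"unit {unit}: {rate:.2f} Hz")
```

With a real result, you could build the same dictionary from the sorting object, for example `{u: sorting_hs.get_unit_spike_train(u) for u in sorting_hs.get_unit_ids()}`. Units with implausibly low or high rates are worth a closer look before you compare sorters.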