# Investigating spike sorting results

Have your spike sorters finished? We are going to compare the results from the various sorters, following the SpikeInterface paper (https://elifesciences.org/articles/61834). Very helpfully, the authors have provided code that reproduces their Figure 1: https://spikeinterface.github.io/blog/ensemble-sorting-of-a-neuropixels-recording.

In this notebook, I will show you how to load the results of multiple sorter runs. After that, I invite you to copy the code that accompanies the SpikeInterface paper into this notebook and apply its techniques to our data.

## Prerequisites

First, the usual prerequisites:

In [None]:
!pip install spikeinterface

In [None]:
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.postprocessing as spost
import spikeinterface.qualitymetrics as sqm
import spikeinterface.comparison as sc
import spikeinterface.exporters as sexp
import spikeinterface.widgets as sw
from probeinterface import Probe
from probeinterface.plotting import plot_probe

import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
import pandas as pd
import seaborn as sns
from collections import defaultdict
from matplotlib_venn import venn3

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
root = "/content/drive/MyDrive/datasai-daw/data/2021-07-20_11-59-01"
src = Path(root) / "Record Node 115"

## Loading spike sorting results

Here is how to load spike sorting results I obtained earlier by passing the entire 1-hour-long recording into various sorters:

In [None]:
# Results saved by SpikeInterface-run sorters
sorting_hs = ss.read_sorter_folder(src / 'res_slp_hs')  # HerdingSpikes2
sorting_ms = ss.read_sorter_folder(src / 'res_slp_ms')  # MountainSort5
sorting_tri = ss.read_sorter_folder(src / 'res_slp_tri')  # Tridesclous
sorting_sc = ss.read_sorter_folder(src / 'res_slp_sc2')  # SpykingCircus2

# Kilosort2 output: once with all units, once with only the units Kilosort labels "good"
ks_folder = src / 'experiment1/recording9/continuous/Neuropix-PXI-111.0/kilosort20_output'
sorting_ksall = se.KiloSortSortingExtractor(ks_folder)
sorting_ks = se.KiloSortSortingExtractor(ks_folder, keep_good_only=True)


(The difference between `sorting_ksall` and `sorting_ks` is that the former also includes units that Kilosort itself expresses doubt about, whereas `keep_good_only=True` keeps only the units Kilosort labels as good.)

In [None]:
allsorts = {
    "HS": sorting_hs,
    "MS": sorting_ms,
    "TRI": sorting_tri,
    "SC": sorting_sc,
    "KS0": sorting_ksall,
    "KS": sorting_ks
}
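
A quick sanity check after loading is to tabulate how many units each sorter found. The following is a minimal sketch, not part of the original notebook: `count_units` and `StandInSorting` are hypothetical names introduced here, and the only call borrowed from the sorting objects is `get_unit_ids()`. The stand-in class is there only so the example runs on its own.

```python
def count_units(sortings):
    """Return {sorter_name: number_of_units} for a dict of sorting objects.

    Each value only needs to offer a get_unit_ids() method.
    """
    return {name: len(sorting.get_unit_ids()) for name, sorting in sortings.items()}


class StandInSorting:
    """Hypothetical stand-in for a sorting object, used only for this demo."""

    def __init__(self, unit_ids):
        self._unit_ids = list(unit_ids)

    def get_unit_ids(self):
        return self._unit_ids


# Demo with made-up unit ids; these are not real sorting results.
demo = {"A": StandInSorting([0, 1, 2]), "B": StandInSorting([10, 11])}
for name, n_units in count_units(demo).items():
    print(f"{name}: {n_units} units")
```

In this notebook, the equivalent call would be `count_units(allsorts)`. Comparing the counts for `"KS0"` and `"KS"` shows how many units the `keep_good_only=True` filter removes.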

I encourage you to load your own data (and those of your colleagues) instead of mine.

## Quick look at spike sorting results

Before we compare the various results, let's take a quick look at the output of a single sorter. For instance, HerdingSpikes:

In [None]:
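# Count the spikes in each unit found by HerdingSpikes
unit_ids = sorting_hs.get_unit_ids()
nspk = []
for unit in unit_ids:
    spiketrain = sorting_hs.get_unit_spike_train(unit)
    nspk.append(len(spiketrain))
    print(f"Unit {unit} - num spikes: {len(spiketrain)}")
print("Total number of units:", len(unit_ids))

# Plot the spike counts in ascending order
nspk.sort()
plt.plot(nspk, '.')
plt.xlabel("Unit rank")
plt.ylabel("Number of spikes")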
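
Raw spike counts are easier to interpret as mean firing rates. Here is a minimal sketch, assuming the recording lasts 3600 s (the "1-hour-long recording" mentioned above). `firing_rates_hz` is a hypothetical helper introduced here, and the counts in the demo are made up.

```python
def firing_rates_hz(spike_counts, duration_s):
    """Convert per-unit spike counts to mean firing rates in Hz."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return [n / duration_s for n in spike_counts]


# Demo with made-up counts; in the notebook you would pass nspk instead.
duration_s = 3600.0  # assumption: a one-hour recording
demo_counts = [180, 3600, 36000]
for n, rate in zip(demo_counts, firing_rates_hz(demo_counts, duration_s)):
    print(f"{n} spikes -> {rate:.2f} Hz")
```

Units with very low rates contribute few spikes to any later comparison, so it can be worth noting them before moving on.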

## Comparing spike sorting results

Now, work through the notebook at https://spikeinterface.github.io/blog/ensemble-sorting-of-a-neuropixels-recording and apply its analysis to our data.
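
Comparisons between sorters rest on deciding whether a unit from one sorter and a unit from another describe the same neuron. The sketch below illustrates one common idea under stated assumptions: spikes from two units are paired if they fall within a small tolerance of each other, and the agreement score is the number of pairs divided by the number of distinct spikes. This is a simplified illustration written for this notebook, not the SpikeInterface implementation. `count_matches` and `agreement_score` are hypothetical names, and spike times and `delta` are assumed to be in samples.

```python
def count_matches(train_a, train_b, delta):
    """Count one-to-one spike pairs within +/- delta samples.

    Both spike trains must be sorted in ascending order.
    """
    i = j = matches = 0
    while i < len(train_a) and j < len(train_b):
        diff = train_a[i] - train_b[j]
        if abs(diff) <= delta:
            # Close enough: pair the two spikes and move past both.
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # spike in train_a is too early to match; skip it
        else:
            j += 1  # spike in train_b is too early to match; skip it
    return matches


def agreement_score(train_a, train_b, delta):
    """Matches divided by the number of distinct spikes across both trains."""
    matches = count_matches(train_a, train_b, delta)
    distinct = len(train_a) + len(train_b) - matches
    return matches / distinct if distinct else 0.0


# Demo with made-up spike times (in samples).
a = [100, 200, 300, 400]
b = [101, 199, 305, 900]
print("matches:", count_matches(a, b, delta=2))
print("agreement:", round(agreement_score(a, b, delta=2), 3))
```

A score near 1 suggests the two units capture the same spikes, and a score near 0 suggests they are unrelated. The blog notebook relies on SpikeInterface's own comparison tools for this step, so treat the code above only as a way to build intuition.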

In [None]:
# Insert your code here


A useful tutorial notebook that teaches much more about spikeinterface is here: https://github.com/SpikeInterface/spiketutorials/blob/master/Official_Tutorial_SI_0.96_Oct22/SpikeInterface_Tutorial.ipynb.