# COMPARISON MODULE

This notebook shows how to use the spiketoolkit.comparison module to:
- compare a pair of spike sorters
- compare multiple spike sorters
- extract units in agreement with multiple sorters (consensus-based)
- perform systematic performance comparisons on ground truth recordings

# Comparing several sorters on several datasets with ground truth

This simple notebook illustrates how to run several sorters on several datasets with ground truth.

This is done mainly with 2 functions:
  * **spiketoolkit.sorters.run_sorters** : runs several sorters on several datasets
  * **spiketoolkit.comparison.gather_sorting_comparison** : runs all possible comparisons
    with ground truth and returns some metrics (accuracy, true positive rate, ...)
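
To make the metrics concrete, here is a minimal sketch of how per-unit scores can be derived from spike counts. The formulas are the definitions commonly used in spike-sorting benchmarks and are an assumption here: this notebook does not state how `gather_sorting_comparison` computes them. The function name `unit_scores` and the counts are hypothetical.

```python
def unit_scores(tp, fn, fp):
    """Return (accuracy, recall, precision) in percent for one unit.

    tp: ground-truth spikes matched by the sorter
    fn: ground-truth spikes the sorter missed
    fp: sorted spikes with no ground-truth match
    These definitions are assumed, not taken from the spiketoolkit source.
    """
    def pct(num, den):
        # Avoid a division by zero when a unit has no spikes at all
        return 100.0 * num / den if den else 0.0

    accuracy = pct(tp, tp + fn + fp)
    recall = pct(tp, tp + fn)        # also called true positive rate
    precision = pct(tp, tp + fp)
    return accuracy, recall, precision


# Hypothetical counts for a single unit
acc, rec, prec = unit_scores(tp=80, fn=15, fp=5)
print(f"accuracy={acc:.1f}%  recall={rec:.1f}%  precision={prec:.1f}%")
```

With these definitions a false positive lowers accuracy and precision but leaves recall unchanged, which is why it is useful to look at more than one column of the performance table.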



In [1]:
import spiketoolkit as st
import spikeextractors as se

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
import shutil

import time

%matplotlib notebook

14:29:15 [I] klustakwik KlustaKwik2 version 0.2.6


## Step 1 : generate several datasets with "toy_example"



In [2]:
rec0, gt_sorting0 = se.example_datasets.toy_example(num_channels=4, duration=30)
rec1, gt_sorting1 = se.example_datasets.toy_example(num_channels=32, duration=30)

In [3]:
st.sorters.available_sorters()

['herdingspikes',
 'ironclust',
 'kilosort',
 'kilosort2',
 'klusta',
 'mountainsort4',
 'spykingcircus',
 'tridesclous']
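
Before launching a long run, it can help to check that every requested name appears in this list. The sketch below is one way to do that; `check_sorters` is a hypothetical helper, not part of spiketoolkit, and the list is hard-coded from the output above so the example runs on its own.

```python
# Copy of the list printed above, hard-coded so the sketch runs without spiketoolkit
available = ['herdingspikes', 'ironclust', 'kilosort', 'kilosort2',
             'klusta', 'mountainsort4', 'spykingcircus', 'tridesclous']


def check_sorters(requested, available):
    """Return the requested names, or raise ValueError if one is unknown."""
    missing = sorted(set(requested) - set(available))
    if missing:
        raise ValueError(f"unknown sorter(s): {missing}")
    return list(requested)


sorter_list = check_sorters(['klusta', 'tridesclous'], available)
print(sorter_list)
```

In the notebook itself, `available` would come from `st.sorters.available_sorters()`. A name being listed does not guarantee that the corresponding sorter is installed on the machine.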

## Step 2 : run all sorters on all datasets

In [4]:
# this cell is very verbose because of some sorters, so consider switching off the console output

recording_dict = {'toy_tetrode' : rec0, 'toy_probe32': rec1}
sorter_list = ['klusta', 'tridesclous']
path = 'comparison_example/'
working_folder = path + 'working_folder'
# start from a clean folder; ignore_errors avoids a failure when it does not exist yet (first run)
shutil.rmtree(working_folder, ignore_errors=True)

t0 = time.perf_counter()
st.sorters.run_sorters(sorter_list, recording_dict, working_folder, engine=None)
t1 = time.perf_counter()
print('total run time', t1-t0)

('toy_tetrode', <spikeextractors.extractors.numpyextractors.numpyextractors.NumpyRecordingExtractor object at 0x7f5111354668>, 'klusta', PosixPath('comparison_example/working_folder/output_folders/toy_tetrode/klusta'), None, False, True)
'group' property is not available and it will not be saved.
('toy_tetrode', <spikeextractors.extractors.numpyextractors.numpyextractors.NumpyRecordingExtractor object at 0x7f5111354668>, 'tridesclous', PosixPath('comparison_example/working_folder/output_folders/toy_tetrode/tridesclous'), None, False, True)
'group' property is not available and it will not be saved.
probe allready in dir




order_clusters waveforms_rms
make_catalogue 0.040502187999663875
('toy_probe32', <spikeextractors.extractors.numpyextractors.numpyextractors.NumpyRecordingExtractor object at 0x7f510e798128>, 'klusta', PosixPath('comparison_example/working_folder/output_folders/toy_probe32/klusta'), None, False, True)
'group' property is not available and it will not be saved.
('toy_probe32', <spikeextractors.extractors.numpyextractors.numpyextractors.NumpyRecordingExtractor object at 0x7f510e798128>, 'tridesclous', PosixPath('comparison_example/working_folder/output_folders/toy_probe32/tridesclous'), None, False, True)
'group' property is not available and it will not be saved.




probe allready in dir
order_clusters waveforms_rms
make_catalogue 0.11518473900014214
Unable to extract clusters from /home/alessio/Documents/Codes/spike_sorting/spiketoolkit/examples/comparison_example/working_folder/output_folders/toy_tetrode/klusta/recording.kwik
Unable to extract clusters from /home/alessio/Documents/Codes/spike_sorting/spiketoolkit/examples/comparison_example/working_folder/output_folders/toy_probe32/klusta/recording.kwik
total run time 16.184670593999726
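
The log above suggests that each (recording, sorter) pair gets its own folder under `working_folder/output_folders/<rec_name>/<sorter_name>`. The sketch below lists those folders so that a missing one is easy to spot. The layout is inferred from the printed paths only and is an assumption, not a documented API; `expected_output_folders` is a hypothetical helper.

```python
from pathlib import Path


def expected_output_folders(working_folder, rec_names, sorter_names):
    """Return one output folder per (recording, sorter) pair.

    Layout inferred from the run log: an assumption, not a documented API.
    """
    base = Path(working_folder) / 'output_folders'
    return [base / rec / sorter for rec in rec_names for sorter in sorter_names]


folders = expected_output_folders('comparison_example/working_folder',
                                  ['toy_tetrode', 'toy_probe32'],
                                  ['klusta', 'tridesclous'])
for folder in folders:
    # Report whether each folder is present on disk
    print(folder, 'exists' if folder.exists() else 'missing')
```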




## Step 3 : collect the comparison dataframes

In [None]:
ground_truths = {'toy_tetrode': gt_sorting0, 'toy_probe32': gt_sorting1}

comp_dataframes = st.comparison.gather_sorting_comparison(working_folder, ground_truths, use_multi_index=True)

## Step 4 : display tables

In [None]:
comp_dataframes['performances']

In [None]:
comp_dataframes['run_times']
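
Because the tables are built with `use_multi_index=True`, the recording and sorter names live in the index rather than in ordinary columns, and `reset_index()` is what turns them back into columns for plotting. Here is a minimal sketch with a hand-made table. The level names `rec_name` and `sorter_name` and the `run_time` column follow the names used in the plotting code of this notebook; the numbers are placeholders, not results.

```python
import pandas as pd

# Hand-made stand-in for comp_dataframes['run_times']; values are placeholders
index = pd.MultiIndex.from_product(
    [['toy_tetrode', 'toy_probe32'], ['klusta', 'tridesclous']],
    names=['rec_name', 'sorter_name'])
run_times = pd.DataFrame({'run_time': [1.0, 2.0, 3.0, 4.0]}, index=index)

# With the multi-index, one recording can be selected directly
print(run_times.loc['toy_tetrode'])

# reset_index() moves the index levels into regular columns
flat = run_times.reset_index()
print(flat.columns.tolist())
```

The flat form is the one seaborn expects, since `x=` and `hue=` refer to column names.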

## Step 5 : easy plot with seaborn

In [None]:
run_times = comp_dataframes['run_times'].reset_index()
fig, ax = plt.subplots()
sn.barplot(data=run_times, x='rec_name', y='run_time', hue='sorter_name', ax=ax)
ax.set_title('Run times')

In [None]:
perfs = comp_dataframes['performances'].reset_index()
fig, ax = plt.subplots()
sn.barplot(data=perfs, x='rec_name', y='tp', hue='sorter_name', ax=ax)
ax.set_title('True positive rate')
ax.set_ylim(0, 100)

In [None]:
perfs = comp_dataframes['performances'].reset_index()
fig, ax = plt.subplots()
ax = sn.barplot(data=perfs, x='rec_name', y='accuracy', hue='sorter_name', ax=ax)
ax.set_title('Accuracy')
ax.set_ylim(0, 100)

## Et voilà!!