# SpikeInterface DEMO v0.95 - DCBT school Nijmegen - Sep 2022


In this demo, you will use SpikeInterface to analyze a 64-channel dataset from an "ASSY-156-P1" probe from Cambridge Neurotech. 
The dataset is kindly provided by [Samuel McKenzie's Lab](https://mckenzieneurolab.com/). 

The objective of this demo is to show all the functionalities of SpikeInterface on a real-world example.

# Table of contents

* [0. Preparation](#preparation)
* [1. Loading the data and probe information](#loading)
* [2. Preprocessing](#preprocessing)
* [3. Saving and loading SpikeInterface objects](#save-load)
* [4. Spike sorting](#spike-sorting)
* [5. Extracting waveforms](#waveforms)
* [6. Postprocessing](#postprocessing)
* [7. Viewers](#viewers)
* [8. Validation and curation](#curation)
* [9. Spike sorting comparison](#comparison)
* [10. Exporters](#exporters)

# 0. Preparation <a class="anchor" id="preparation"></a>

### Download the ephys data

First, we need to download the recording. Feel free to use your own recordings as well later on. 

The dataset is called `cambridge_data.dat` and can be found on this [drive link](https://drive.google.com/drive/folders/1eWPuOd8q4MjpVpwazkWygQJDzJnToN3i) (`practice_2_real_dataset` folder). Move the dataset into the current folder.
The recording was performed with the "ASSY-156-P1" probe with 4 shanks of 16 channels (in total 64 channels).


### Import the modules

Let's now import the `spikeinterface` modules that we need:

In [None]:
import spikeinterface as si
import spikeinterface.extractors as se 
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.postprocessing as spost
import spikeinterface.qualitymetrics as sqm
import spikeinterface.comparison as sc
import spikeinterface.exporters as sexp
import spikeinterface.widgets as sw

In [None]:
print(f"SpikeInterface version: {si.__version__}")

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

import spikeinterface_gui as sigui

%matplotlib widget

# 1. Loading recording and probe information <a class="anchor" id="loading"></a>

In [None]:
# file path (adjust the file name below if your download is called `cambridge_data.dat`)
base_folder = Path('.')
recording_file = base_folder / 'cambridge_data.bin'

# parameters to load the bin/dat format
num_channels = 64
sampling_frequency = 20000
gain_to_uV = 0.195
offset_to_uV = 0
dtype = "int16"
time_axis = 0

In [None]:
recording = si.read_binary(recording_file, num_chan=num_channels, sampling_frequency=sampling_frequency,
                           dtype=dtype, gain_to_uV=gain_to_uV, offset_to_uV=offset_to_uV, 
                           time_axis=time_axis)

The `read_binary()` function returns a `RecordingExtractor` object. We can print it to visualize some of its properties:

In [None]:
recording
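
To make the binary parameters above more concrete, here is a minimal NumPy sketch of how a flat `int16` file with `time_axis=0` can be interpreted. It uses a tiny toy file and hypothetical variable names, and it is not meant to mirror SpikeInterface's internal implementation:

```python
import tempfile
from pathlib import Path

import numpy as np

num_channels_toy = 4      # toy value; the demo dataset has 64 channels
gain_to_uV_toy = 0.195    # same gain as in the cell above
num_frames_toy = 10

# Write a tiny int16 file whose samples are stored frame by frame (time_axis=0)
raw = np.arange(num_frames_toy * num_channels_toy, dtype="int16")
raw = raw.reshape(num_frames_toy, num_channels_toy)
toy_file = Path(tempfile.mkdtemp()) / "toy.dat"
raw.tofile(toy_file)

# Memory-map the file: data is only read when we index into the array
mapped = np.memmap(toy_file, dtype="int16", mode="r").reshape(-1, num_channels_toy)

# Read the first 5 frames and scale them to microvolts
traces_uV = mapped[:5].astype("float32") * gain_to_uV_toy

print(mapped.shape, traces_uV.shape)
```

The number of frames is never stored in the file itself: it follows from the file size, the `dtype`, and the number of channels, which is why these parameters must be given explicitly.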

We can further `annotate` the recording to tell SI that it is not filtered yet. This will prevent mistakes later in the pipeline, such as attempting to extract waveforms from unfiltered data.

In [None]:
recording.annotate(is_filtered=False)

While the `read_binary()` function is part of the `core` module (as it's used internally by SI to store data in a convenient format), the `extractors` module allows you to load many file formats used in electrophysiology. 

Most of the extractors available in SI rely on the [NEO](https://neo.readthedocs.io/en/stable/) python package to read the data.

We can access the full list of available extractors with:

In [None]:
se.recording_extractor_full_dict

A `RecordingExtractor` object extracts information about channel ids, channel locations (if present), the sampling frequency of the recording, and the extracellular traces (when prompted). The `BinaryRecordingExtractor` is designed specifically for raw binary datasets (.bin, .dat, .raw).

Here we load information from the recording using the built-in functions of the `RecordingExtractor`:

In [None]:
channel_ids = recording.get_channel_ids()
fs = recording.get_sampling_frequency()
num_chan = recording.get_num_channels()
num_segments = recording.get_num_segments()

print(f'Channel ids: {channel_ids}')
print(f'Sampling frequency: {fs}')
print(f'Number of channels: {num_chan}')
print(f"Number of segments: {num_segments}")

SpikeInterface supports multi-segment recordings. A segment is a contiguous piece of data. Sometimes a recording is made of multiple acquisitions, for example a baseline, a stimulation phase, and a post-stimulation phase. In such cases, the recording object will be made of multiple segments and will be treated as such throughout the pipeline.

The `get_traces()` function returns a TxN numpy array where N is the number of channel ids passed in (all channel ids are passed in by default) and T is the number of frames (determined by start_frame and end_frame).

In [None]:
trace_snippet = recording.get_traces(start_frame=int(fs*0), end_frame=int(fs*2))

In [None]:
print('Traces shape:', trace_snippet.shape)
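
The relationship between a time window, the sampling frequency, and the resulting `(T, N)` shape can be sketched as follows. The helper name is hypothetical and only illustrates the arithmetic:

```python
def seconds_to_frames(t_start_s, t_stop_s, sampling_frequency):
    """Convert a time window in seconds to (start_frame, end_frame)."""
    return int(t_start_s * sampling_frequency), int(t_stop_s * sampling_frequency)


fs_demo = 20000.0  # sampling frequency used in this demo
start_frame, end_frame = seconds_to_frames(0, 2, fs_demo)

# T is the number of frames in the window, N the number of channels
expected_shape = (end_frame - start_frame, 64)
print(start_frame, end_frame, expected_shape)  # 0 40000 (40000, 64)
```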

The `widgets` module includes several convenient plotting functions that can be used to explore the data. We will use them once the probe information is attached to the recording.

Before moving on with the analysis, we have to load the probe information. For this we will use the [ProbeInterface](https://probeinterface.readthedocs.io/en/main/index.html) package. 

ProbeInterface makes it easy to create, manipulate, and visualize neural probes. Moreover, it comes with a wide range of IO functions to import and export existing formats. Finally, we have created a public library of commercial probes (https://gin.g-node.org/spikeinterface/probeinterface_library/) that can be retrieved with a single line of code.

Let's import `probeinterface`, download the probe and plot it!

In [None]:
import probeinterface as pi
from probeinterface.plotting import plot_probe

In [None]:
manufacturer = 'cambridgeneurotech'
probe_name = 'ASSY-156-P-1'

probe = pi.get_probe(manufacturer, probe_name)
print(probe)

In most experiments, the neural probe has a connector that is interfaced to a headstage, which in turn connects to the acquisition system. This *pathway* usually results in a channel remapping, which means that the order of the contacts on the probe is different from the order of the recorded traces.

`probeinterface` provides a growing collection of common pathways that can be loaded directly to wire a device and apply the correct channel mapping:

In [None]:
pi.get_available_pathways()

In [None]:
probe.wiring_to_device('ASSY-156>RHD2164')
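
The effect of such a channel remapping can be illustrated with a dependency-free sketch. The wiring below is made up for a 4-contact toy probe; the real mapping for this dataset is the one loaded by `wiring_to_device()`:

```python
# Hypothetical wiring for a 4-contact probe: contact i is recorded on
# device channel device_channel_indices[i]
device_channel_indices = [2, 0, 3, 1]

# One recorded frame, in device (file) order
frame_device_order = ["dev0", "dev1", "dev2", "dev3"]

# Reorder so that position i holds the sample recorded from contact i
frame_contact_order = [frame_device_order[dev] for dev in device_channel_indices]
print(frame_contact_order)  # ['dev2', 'dev0', 'dev3', 'dev1']
```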

In [None]:
fig, ax = plt.subplots(figsize=(14, 10))
plot_probe(probe, ax=ax)


In [None]:
fig, ax = plt.subplots(figsize=(14, 10))
plot_probe(probe, with_contact_id=True, with_device_index=True, ax=ax)
ax.set_xlim(-50, 300)

The probe now has contact ids `id#` and device ids `dev#`! We can also visualize the probe information as a `pandas` dataframe:

In [None]:
probe.to_dataframe(complete=True).loc[:, ["contact_ids", "shank_ids", "device_channel_indices"]]

Note that the `shank_ids` are also loaded with the probe.

A `probeinterface` object can be attached directly to an SI recording object:

In [None]:
recording_prb = recording.set_probe(probe, group_mode="by_shank")

When loading the probe, the device indices (and all the other contact properties) are automatically sorted:

In [None]:
probe_rec = recording_prb.get_probe()
probe_rec.to_dataframe(complete=True).loc[:, ["contact_ids", "shank_ids", "device_channel_indices"]]

In [None]:
print(f'Channels after loading the probe file: {recording_prb.get_channel_ids()}')
print(f'Channel groups after loading the probe file: {recording_prb.get_channel_groups()}')

In [None]:
w_ts = sw.plot_timeseries(recording_prb, order_channel_by_depth=True, backend="ipywidgets")

### Properties 

`RecordingExtractor` objects can have *properties*. A property is a piece of information attached to a channel, e.g. group or location.

Similarly, for `SortingExtractor` objects (that we'll cover later), anything related to a unit can be stored as a property. 

We can check which properties are in the extractor as follows:

In [None]:
print("Properties before loading the probe:", list(recording.get_property_keys()))

In [None]:
print("Properties after loading the probe:", list(recording_prb.get_property_keys()))

After loading the probe we now have some new properties: `contact_vector`, `location`, and `group`.

Let's add some new properties! 
The first 32 channels are in the CA1 area, the second 32 are in the CA3 area:

In [None]:
brain_area_property_values = ['CA1']*32 + ['CA3']*32
print(brain_area_property_values)

In [None]:
recording_prb.set_property(key='brain_area', values=brain_area_property_values)

We can also specify a property on a subset of channels. In this case, the non-specified channels will be filled with empty values:

In [None]:
recording_prb.set_property(key='quality', values=["good"]*(recording_prb.get_num_channels() - 3),
                           ids=recording_prb.get_channel_ids()[:-3])

In [None]:
recording_prb.get_property("quality")
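
As a rough mental model of what happens when a property is set on a subset of channels, consider this plain-Python sketch. The helper and the choice of an empty string as the fill value are assumptions made for illustration:

```python
channel_ids_demo = ["ch0", "ch1", "ch2", "ch3", "ch4"]


def set_property_subset(all_ids, ids, values, missing=""):
    """Return one value per channel, using `missing` where none was given."""
    given = dict(zip(ids, values))
    return [given.get(ch, missing) for ch in all_ids]


# Label all channels except the last three as "good"
quality = set_property_subset(channel_ids_demo, channel_ids_demo[:-3], ["good"] * 2)
print(quality)  # ['good', 'good', '', '', '']
```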

In [None]:
print("Properties after adding custom properties:", list(recording_prb.get_property_keys()))

**NOTE:** Internally, the properties are just a dictionary attached to the recording, accessible as `_properties`:

In [None]:
print(recording_prb._properties.keys())

### Annotations

*Annotations* can be attached to any object and they can carry any information related to the recording or sorting objects.

Let's add an annotation about this tutorial:

In [None]:
recording_prb.annotate(description="Dataset for SI 0.95 tutorial")

In [None]:
print(recording_prb.get_annotation_keys())

# 2. Preprocessing <a class="anchor" id="preprocessing"></a>


Now that the probe information is loaded, we can do some preprocessing using the `preprocessing` module.

We can filter the recordings, rereference the signals to remove noise, discard noisy channels, whiten the data, remove stimulation artifacts, etc. (more info [here](https://spiketoolkit.readthedocs.io/en/latest/preprocessing_example.html)).

For this notebook, let's filter the recordings and apply common median reference (CMR). All preprocessing modules return new `RecordingExtractor` objects that apply the underlying preprocessing function. This allows users to access the preprocessed data in the same way as the raw data.

We will focus only on the first shank (group `0`) for the following analysis:

In [None]:
recordings_by_group = recording_prb.split_by("group")
print(recordings_by_group)

In [None]:
recording_to_process = recordings_by_group[0]

In [None]:
recording_to_process.get_num_channels()
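
Splitting by a property boils down to grouping channel ids that share the same property value. A small sketch with made-up ids and groups:

```python
from collections import defaultdict

channel_ids_toy = [0, 1, 2, 3, 4, 5]
channel_groups_toy = [0, 0, 1, 1, 2, 2]  # hypothetical: one group per shank

# Collect the channel ids that belong to each group
channels_by_group = defaultdict(list)
for ch, grp in zip(channel_ids_toy, channel_groups_toy):
    channels_by_group[grp].append(ch)

print(dict(channels_by_group))  # {0: [0, 1], 1: [2, 3], 2: [4, 5]}
```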

Below, we bandpass filter the recording of the first shank:

In [None]:
recording_f = spre.bandpass_filter(recording_to_process, freq_min=300, freq_max=6000)

w = sw.plot_timeseries(recording_f)

After filtering, we can observe spiking activity on many channels! We can also apply other preprocessing steps to further increase the quality of the recording. 

For example, let's apply Common Median Reference (CMR):

In [None]:
recording_cmr = spre.common_reference(recording_f, reference='global', operator='median')
recording_cmr
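
To see why a global median reference removes noise shared across channels, here is a minimal NumPy sketch on synthetic traces. It illustrates the idea only and does not reproduce the SpikeInterface implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
num_frames_toy, num_channels_toy = 1000, 16

# Toy traces: independent noise plus an artifact shared by all channels
independent = rng.normal(0, 1, size=(num_frames_toy, num_channels_toy))
shared = 10 * np.sin(np.linspace(0, 20, num_frames_toy))[:, None]
traces_toy = independent + shared

# Global CMR: at every frame, subtract the median across channels
traces_cmr = traces_toy - np.median(traces_toy, axis=1, keepdims=True)

print(traces_toy.std(), traces_cmr.std())
```

The shared artifact dominates the raw standard deviation and is largely gone after referencing, while the channel-specific signal is preserved.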

We can plot the traces after applying CMR:

In [None]:
w = sw.plot_timeseries(recording_cmr)

## Take only 5 min. for demo

Since we are going to spike sort the data, let's first cut out a 5-minute recording, to speed up computations.

We can easily do so with the `frame_slice()` function:

In [None]:
fs = recording_cmr.get_sampling_frequency()
recording_sub = recording_cmr.frame_slice(start_frame=int(0*fs), end_frame=int(300*fs))
recording_sub

# 3. Saving and loading SpikeInterface objects <a class="anchor" id="save-load"></a>

All operations in SpikeInterface are *lazy*, meaning that they are not performed until needed. This is why the creation of our filtered recording was almost instantaneous. However, to speed up further processing, we might want to **save** it to a file and perform those operations (e.g. filters, CMR, etc.) at once. 

In [None]:
recording_saved = recording_sub.save(folder=base_folder/"preprocessed", progress_bar=True, 
                                     n_jobs=4, total_memory="100M")
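
The lazy behaviour described above can be pictured with a toy example: each processing step only stores a reference to its parent and a function, and nothing is computed until traces are requested. The classes below are hypothetical stand-ins, not SpikeInterface classes:

```python
class LazyStep:
    """Wrap a parent object and apply `func` only when traces are requested."""

    def __init__(self, parent, func):
        self.parent = parent
        self.func = func

    def get_traces(self, start_frame, end_frame):
        return [self.func(x) for x in self.parent.get_traces(start_frame, end_frame)]


class ToySource:
    """Stand-in for a recording; counts how often data is actually read."""

    def __init__(self, data):
        self.data = data
        self.reads = 0

    def get_traces(self, start_frame, end_frame):
        self.reads += 1
        return self.data[start_frame:end_frame]


source = ToySource(list(range(10)))
chain = LazyStep(LazyStep(source, lambda x: x * 2), lambda x: x - 1)

reads_before = source.reads       # 0: building the chain reads nothing
result = chain.get_traces(0, 3)   # both steps run now, on 3 frames only
print(reads_before, result, source.reads)  # 0 [-1, 1, 3] 1
```

Saving to disk trades this on-demand computation for a one-off cost: the whole chain is evaluated once and later reads come straight from the file.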

If we inspect the `preprocessed` folder, we find that a few files have been saved. Let's take a look at what they are:

In [None]:
recording_saved

In [None]:
!ls -ll {base_folder}/preprocessed

The `traces_cached_seg0.raw` file contains the processed raw data, while the `.json` files include information on how to reload the binary file. The `provenance.json` file includes the information of the recording before saving it to a binary file, and the `probe.json` file represents the probe object. 

The `save()` function returns a new *cached* recording that has all the previously loaded information:

In [None]:
print(f'Cached channels ids: {recording_saved.get_channel_ids()}')
print(f'Channel groups after caching: {recording_saved.get_channel_groups()}')

After saving the SI object, we can easily load it back in a new session:

In [None]:
recording_loaded = si.load_extractor(base_folder/"preprocessed")

In [None]:
print(f'Loaded channels ids: {recording_loaded.get_channel_ids()}')
print(f'Channel groups after loading: {recording_loaded.get_channel_groups()}')

We can double check that the traces are exactly the same as the `recording_saved` that we saved:

In [None]:
fig, axs = plt.subplots(ncols=2)
w_saved = sw.plot_timeseries(recording_saved, ax=axs[0])
w_loaded = sw.plot_timeseries(recording_loaded, ax=axs[1])
axs[0].set_title("Saved")
axs[1].set_title("Loaded")

**IMPORTANT**: the same saving mechanisms are also available for all `SortingExtractor` objects.

# 4. Spike sorting <a class="anchor" id="spike-sorting"></a>

We can now run spike sorting on the above recording. We will use `tridesclous` and `ironclust` for this demonstration, to show how easy SpikeInterface makes it to run different sorters interchangeably :)

Let's first check the installed sorters to see if `tridesclous` is available. We can then check the `tridesclous` default parameters.
We will sort the cached, preprocessed recording (the `recording_saved` object).

In [None]:
ss.installed_sorters()

We can retrieve the parameters associated with any sorter with the `get_default_params()` function from the `sorters` module:

In [None]:
ss.get_default_params('tridesclous')

In [None]:
ss.get_params_description('tridesclous')

In [None]:
ss.run_sorter?

In [None]:
ss.run_tridesclous?

To modify a parameter, we can pass it to the `run_sorter()` function as an extra keyword argument. Here we keep the `tridesclous` defaults and sort the saved, preprocessed recording:

In [None]:
# run spike sorting on the entire saved recording
sorting_TDC = ss.run_sorter('tridesclous', recording_saved, 
                            output_folder=base_folder/'results_TDC', verbose=True)
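
Extra keyword arguments passed to the run function are combined with the sorter's default parameters. The sketch below shows one plausible way to do this, assuming that unknown parameter names are rejected. The helper and the default values are hypothetical; inspect the real defaults with `ss.get_default_params('tridesclous')`:

```python
def merge_sorter_params(default_params, **overrides):
    """Return the defaults updated with overrides, rejecting unknown names."""
    unknown = set(overrides) - set(default_params)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**default_params, **overrides}


# Hypothetical defaults, for illustration only
toy_defaults = {"detect_threshold": 5, "freq_min": 400.0}

merged = merge_sorter_params(toy_defaults, detect_threshold=6)
print(merged)  # {'detect_threshold': 6, 'freq_min': 400.0}
```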


In [None]:
sorting_TDC

In [None]:
print('Found', len(sorting_TDC.get_unit_ids()), 'units')

SpikeInterface ensures full provenance of the spike sorting pipeline. Upon running a spike sorter, a `spikeinterface_params.json` file is saved in the `output_folder`. This contains a `.json` version of the recording and all the input parameters. 

In [None]:
!ls {base_folder}/results_TDC

In [None]:
!cat {base_folder}/results_TDC/spikeinterface_params.json

### Installing IronClust (requires MATLAB)

For MATLAB-based sorters, all you need to do is clone the sorter repo and point SpikeInterface to it:

Let's clone ironclust in the current directory:

In [None]:
!git clone https://github.com/flatironinstitute/ironclust

Now all we have to do is tell the `IronClustSorter` class where the ironclust repo is:

In [None]:
ss.IronClustSorter.set_ironclust_path('./ironclust')

Note that we can also set a global environment variable called `IRONCLUST_PATH`. In that case we don't need to set the path in each session because the sorter class looks for this environment variable.
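
For completeness, this is how the `IRONCLUST_PATH` variable could be set from Python for the current process. It is a sketch under two assumptions: the repository was cloned into `./ironclust` as above, and the variable is set before `spikeinterface.sorters` is imported, since it may be read at import time (worth checking for your version). For a permanent setting, define the variable in your shell profile instead:

```python
import os
from pathlib import Path

# Assumption: the repository was cloned into ./ironclust as in the cell above
ironclust_dir = Path("ironclust").absolute()

# Only affects the current Python process and its children
os.environ["IRONCLUST_PATH"] = str(ironclust_dir)

print(os.environ["IRONCLUST_PATH"])
```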

Now ironclust should be installed and we can run it:

In [None]:
ss.IronClustSorter.ironclust_path

In [None]:
ss.installed_sorters()

In [None]:
# run IronClust on the saved, preprocessed recording
sorting_IC = ss.run_sorter('ironclust', recording_saved, 
                           output_folder=base_folder/'results_IC',
                           verbose=True)


In [None]:
sorting_IC

In [None]:
print(f'IronClust found {len(sorting_IC.get_unit_ids())} units')

The spike sorting returns a `SortingExtractor` object. Let's see some of its functions:

In [None]:
print(f'Ironclust unit ids: {sorting_IC.get_unit_ids()}')

In [None]:
print(f'Spike train of a unit: {sorting_IC.get_unit_spike_train(13)}')

We can use functions from the `widgets` module for some quick visualizations:

In [None]:
w_rs = sw.plot_rasters(sorting_IC)

## Running multiple sorter jobs in parallel

So far we have seen how to run one sorter at a time. SI provides a convenient launcher in order to run multiple sorters on multiple recordings with one line of code!

The `run_sorters()` function of the `sorters` module allows you to specify a list of sorters to use on a list (or dictionary) of recordings. The jobs are by default run in a loop, but the `engine` argument lets you specify a parallel backend (`joblib` or `dask`) and its related parameters.

In the following example, we run 2 jobs, `tridesclous` and `ironclust`, in parallel:

In [None]:
sorting_outputs = ss.run_sorters(sorter_list=["tridesclous", "ironclust"],
                                 recording_dict_or_list={"group0": recording_saved},
                                 working_folder=base_folder / "all_sorters",
                                 verbose=True,
                                 engine="joblib",
                                 engine_kwargs={'n_jobs': 2})

The returned `sorting_outputs` variable is a dictionary that has (rec_name, sorter_name) as keys, and the `SortingExtractor` objects as values:

In [None]:
print(sorting_outputs.keys())
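
The structure of these keys follows from pairing every recording with every sorter. A dependency-free sketch, with placeholder strings standing in for the sorting objects:

```python
from itertools import product

recording_names = ["group0"]
sorter_names = ["tridesclous", "ironclust"]

# One job per (recording, sorter) pair; the pair doubles as the result key
jobs = list(product(recording_names, sorter_names))
toy_outputs = {job: f"sorting for {job[1]} on {job[0]}" for job in jobs}

print(list(toy_outputs.keys()))
# [('group0', 'tridesclous'), ('group0', 'ironclust')]
```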

For the rest of the tutorial, let's pick the `ironclust` output:

In [None]:
sorting_IC = sorting_outputs[('group0', 'ironclust')]
sorting_IC

## Spike sort in Docker containers
### (Linux and macOS only)

Some sorters are hard to install! To alleviate this headache, SI provides a built-in mechanism to run a spike sorting job in a docker container.

We are maintaining a set of sorter-specific docker files in the [spikeinterface-dockerfiles repo](<https://github.com/SpikeInterface/spikeinterface-dockerfiles>)
and most of the docker images are available on Docker Hub from the [SpikeInterface organization](<https://hub.docker.com/orgs/spikeinterface/repositories>).

Running spike sorting in a docker container only requires you to:

1. have docker installed
2. have docker python SDK installed (`pip install docker`)

When docker is installed, you can simply run the sorter in a specified docker image:

In [None]:
sorting_SC = ss.run_spykingcircus(recording_saved, output_folder=base_folder / "results_SC",
                                  docker_image="spikeinterface/spyking-circus-base:1.0.7", 
                                  verbose=True)

In [None]:
sorting_SC

In [None]:
sw.plot_rasters(sorting_SC)

# 5. Extracting waveforms <a class="anchor" id="waveforms"></a>

The core of postprocessing spike sorting results revolves around extracting waveforms from paired recording-sorting objects.

**NEW** In the SI API, waveforms are extracted using the `WaveformExtractor` class in the `core` module.

The `WaveformExtractor` object has convenient functions to retrieve waveforms and templates.
Let's see how it works.

To extract the waveforms, we can run:

In [None]:
si.extract_waveforms?

In [None]:
we = si.extract_waveforms(recording_saved, sorting_IC, folder=base_folder / "wf_IC", progress_bar=True,
                          n_jobs=4, total_memory="500M", overwrite=False, load_if_exists=True)
print(we)

Now all waveforms are computed and stored in the provided `wf_IC` folder. We can now retrieve waveforms and templates easily:

In [None]:
waveforms0 = we.get_waveforms(unit_id=0)
print(f"Waveforms shape: {waveforms0.shape}")
template0 = we.get_template(unit_id=0)
print(f"Template shape: {template0.shape}")
all_templates = we.get_all_templates()
print(f"All templates shape: {all_templates.shape}")

For waveforms, the dimension is (num_spikes, num_samples, num_channels), while each template has dimension (num_samples, num_channels). Note that the number of spikes in this case is 500. We'll get back to this later!
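
These shapes follow directly from how waveforms are cut out of the traces. Here is a minimal NumPy sketch on random data; the spike times and the window sizes `nbefore` and `nafter` are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
traces_toy = rng.normal(size=(20000, 16))          # (num_frames, num_channels)
spike_frames = np.array([500, 4000, 9000, 15000])  # toy spike times, in frames
nbefore, nafter = 20, 40                           # samples kept around each spike

# Cut one snippet per spike and stack them along a new first axis
waveforms_toy = np.stack([traces_toy[f - nbefore:f + nafter] for f in spike_frames])

# A template is a summary over spikes, here the average
template_toy = waveforms_toy.mean(axis=0)

print(waveforms_toy.shape, template_toy.shape)  # (4, 60, 16) (60, 16)
```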

The `WaveformExtractor` is also compatible with several `widgets` to visualize the spike sorting output:

In [None]:
w = sw.plot_unit_waveforms(we, unit_ids=[2, 3, 4])
w = sw.plot_unit_templates(we, unit_ids=[2, 3, 4])
w = sw.plot_unit_probe_map(we, unit_ids=[2,])

In [None]:
w = sw.plot_unit_summary(we, unit_id=3)

As we noticed before, the number of spikes for the waveforms is 500. Let's check the number of spikes for other waveforms:

In [None]:
for unit in sorting_IC.get_unit_ids():
    waveforms = we.get_waveforms(unit_id=unit)
    spiketrain = sorting_IC.get_unit_spike_train(unit)
    print(f"Unit {unit} - num waveforms: {waveforms.shape[0]} - num spikes: {len(spiketrain)}")

No unit has more than 500 waveforms! This is because, by default, the `WaveformExtractor` extracts waveforms on a random subset of 500 spikes per unit. To extract waveforms on all spikes, we can use the `max_spikes_per_unit` argument:

In [None]:
we_all = si.extract_waveforms(recording_saved, sorting_IC, folder=base_folder / "wf_IC_all", 
                              max_spikes_per_unit=None, progress_bar=True, load_if_exists=True)
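
The effect of `max_spikes_per_unit` can be pictured with a small sketch of random subsampling. The helper is hypothetical and only mimics the behaviour described above:

```python
import random


def subsample_spikes(spike_frames, max_spikes_per_unit=500, seed=0):
    """Return all spikes, or a sorted random subset when there are too many."""
    if max_spikes_per_unit is None or len(spike_frames) <= max_spikes_per_unit:
        return list(spike_frames)
    rng = random.Random(seed)
    return sorted(rng.sample(list(spike_frames), max_spikes_per_unit))


many = list(range(0, 30000, 10))  # 3000 toy spike times
few = list(range(0, 1000, 10))    # 100 toy spike times

print(len(subsample_spikes(many)),
      len(subsample_spikes(few)),
      len(subsample_spikes(many, max_spikes_per_unit=None)))
# 500 100 3000
```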

In [None]:
for unit in sorting_IC.get_unit_ids():
    waveforms = we_all.get_waveforms(unit_id=unit)
    spiketrain = sorting_IC.get_unit_spike_train(unit)
    print(f"Unit {unit} - num waveforms: {waveforms.shape[0]} - num spikes: {len(spiketrain)}")

Now waveforms have been extracted for all spikes! Let's move on to explore the capabilities of the `postprocessing` module.

# 6. Postprocessing <a class="anchor" id="postprocessing"></a>

Postprocessing spike sorting results ranges from computing additional information, such as spike amplitudes and Principal Component Analysis (PCA) scores, to computing features of the extracellular waveforms, similarity between templates and crosscorrelograms. All of this is possible with the `postprocessing` module.

### PCA scores

PCA scores can be easily computed with the `compute_principal_components()` function. Similarly to `extract_waveforms()`, the function returns an object of type `WaveformPrincipalComponent` that allows you to retrieve all PC scores on demand.

In [None]:
spost.compute_principal_components?

In [None]:
pc = spost.compute_principal_components(we, n_components=3, load_if_exists=True)

In [None]:
pc0 = pc.get_components(unit_id=0)
print(f"PC scores shape: {pc0.shape}")
all_labels, all_pcs = pc.get_all_components()
print(f"All PC scores shape: {all_pcs.shape}")

For pc scores of a single unit, the dimension is (num_spikes, num_components, num_channels). 
The `get_all_components()` function returns an array with the label/unit id for each component (`all_labels`) and an array of dimension (num_all_samples, num_components, num_channels). 
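
The (num_spikes, num_components, num_channels) layout can be reproduced with a minimal NumPy sketch, assuming that one PCA is fitted per channel. The data is random and the code is an illustration, not the SpikeInterface implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
num_spikes, num_samples, num_channels_toy, n_components = 200, 60, 4, 3
waveforms_toy = rng.normal(size=(num_spikes, num_samples, num_channels_toy))

scores = np.zeros((num_spikes, n_components, num_channels_toy))
for ch in range(num_channels_toy):
    # Center the waveforms of this channel, then project on the top components
    x = waveforms_toy[:, :, ch] - waveforms_toy[:, :, ch].mean(axis=0)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    scores[:, :, ch] = x @ vt[:n_components].T

print(scores.shape)  # (200, 3, 4)
```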

### Spike amplitudes

Spike amplitudes can be computed with the `compute_spike_amplitudes()` function.

In [None]:
amplitudes = spost.compute_spike_amplitudes(we, outputs="concatenated", progress_bar=True)

In [None]:
amplitudes

In [None]:
sw.plot_amplitudes_distribution(we)

In [None]:
sw.plot_amplitudes_timeseries(we, unit_ids=[3, 11])

By default, all amplitudes are concatenated in one array.

The corresponding spike times and labels can be easily retrieved as:

In [None]:
all_spike_times, all_spike_labels = sorting_IC.get_all_spike_trains()[0]

The [0] index is to select the first segment. In case of multiple segments each element will correspond to a different segment and will contain spike times and labels for that segment.


### Compute template metrics

Template metrics, or extracellular features, such as peak-to-valley duration or full-width half maximum, are important to classify neurons into putative classes (excitatory - inhibitory). The `postprocessing` module allows one to compute several of these metrics:

In [None]:
print(spost.get_template_metric_names())

In [None]:
template_metrics = spost.calculate_template_metrics(we)
display(template_metrics)

For more information about these template metrics, we refer to this [documentation](https://github.com/AllenInstitute/ecephys_spike_sorting/tree/master/ecephys_spike_sorting/modules/mean_waveforms) from the Allen Institute.
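
As an example of what such a metric measures, here is a simplified NumPy sketch of a peak-to-valley duration on a synthetic single-channel template. The template shape and the exact definition used here (time from the trough to the following peak) are assumptions for illustration and may differ from the library's implementation:

```python
import numpy as np

sampling_frequency_toy = 20000.0
t = np.arange(90)

# Toy template: a negative trough at sample 30 followed by a smaller,
# broader positive peak at sample 45
template_toy = (-1.0 * np.exp(-((t - 30) ** 2) / 8.0)
                + 0.4 * np.exp(-((t - 45) ** 2) / 50.0))

trough_idx = int(np.argmin(template_toy))
peak_idx = trough_idx + int(np.argmax(template_toy[trough_idx:]))

# Duration in seconds between the trough and the following peak
peak_to_valley_s = (peak_idx - trough_idx) / sampling_frequency_toy

print(trough_idx, peak_idx, peak_to_valley_s)
```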

# 7. Viewers <a class="anchor" id="viewers"></a>

Let's check out the `spikeinterface-gui` to explore our spike sorting results:

In [None]:
!sigui wf_IC

# 8. Validation and curation <a class="anchor" id="curation"></a>

The `qualitymetrics` module provides several functions to compute quality metrics to validate the spike sorting results.

Let's see what metrics are available:

In [None]:
print(sqm.get_quality_metric_list())

In [None]:
qc = sqm.compute_quality_metrics(we)

In [None]:
display(qc)

For more information about these quality metrics, we refer to this [documentation](https://allensdk.readthedocs.io/en/latest/_static/examples/nb/ecephys_quality_metrics.html) from the Allen Institute.
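
To give a feel for one of these metrics, the sketch below counts inter-spike-interval (ISI) violations, i.e. pairs of consecutive spikes that are closer than a refractory period. The helper and the 1.5 ms threshold are assumptions for illustration; the library derives its ISI metrics from such counts using its own definitions:

```python
def count_isi_violations(spike_frames, sampling_frequency, isi_threshold_ms=1.5):
    """Count inter-spike intervals shorter than the refractory threshold."""
    threshold_frames = isi_threshold_ms * 1e-3 * sampling_frequency
    frames = sorted(spike_frames)
    return sum(1 for a, b in zip(frames, frames[1:]) if (b - a) < threshold_frames)


toy_train = [100, 110, 5000, 9000, 9020, 20000]  # spike times in frames at 20 kHz
num_violations = count_isi_violations(toy_train, 20000.0)
print(num_violations)  # 2
```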

## Automatic curation based on quality metrics

A viable option to curate (or at least pre-curate) a spike sorting output is to filter units based on quality metrics. As we have already computed quality metrics a few lines above, we can simply filter the `qc` dataframe based on some thresholds.

Here, we'll only keep units with an SNR > 5 and an ISI violation rate < 0.2:

In [None]:
snr_thresh = 5
isi_viol_thresh = 0.2

A straightforward way to filter a pandas dataframe is via the `query`.
We first define our query (make sure the names match the column names of the dataframe):

In [None]:
our_query = f"snr > {snr_thresh} & isi_violations_rate < {isi_viol_thresh}"
print(our_query)

and then we can use the query to select units:

In [None]:
keep_units = qc.query(our_query)
keep_unit_ids = keep_units.index.values

In [None]:
sorting_auto = sorting_IC.select_units(keep_unit_ids)
print(f"Number of units before curation: {len(sorting_IC.get_unit_ids())}")
print(f"Number of units after curation: {len(sorting_auto.get_unit_ids())}")
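
The same filtering logic can be tried on a toy table, which makes it easy to check by eye which units pass. The metric values below are made up:

```python
import pandas as pd

# Toy quality-metric table; one row per unit id
toy_qc = pd.DataFrame(
    {"snr": [8.1, 3.2, 12.5, 6.0],
     "isi_violations_rate": [0.05, 0.01, 0.50, 0.10]},
    index=[0, 1, 2, 3],
)

toy_query = "snr > 5 & isi_violations_rate < 0.2"
kept_unit_ids = toy_qc.query(toy_query).index.tolist()
print(kept_unit_ids)  # [0, 3]
```

Unit 1 fails the SNR criterion and unit 2 fails the ISI criterion, so only units 0 and 3 are kept.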

# 9. Spike sorting comparison <a class="anchor" id="comparison"></a>

Can we combine the output of multiple sorters to curate the spike sorting output?

To answer this question we can use the `comparison` module.
We first compare and match the output spike trains of the different sorters, and we can then extract a new `SortingExtractor` with only the units in agreement.
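
Matching two units comes down to counting spikes that coincide within a small time window and relating that count to the total number of spikes. A dependency-free sketch, in which the greedy matching and the 8-frame window (0.4 ms at 20 kHz) are simplifying assumptions:

```python
def count_matches(train_a, train_b, delta_frames):
    """Greedily pair spikes of two sorted trains that lie within delta_frames."""
    i = j = matches = 0
    while i < len(train_a) and j < len(train_b):
        diff = train_a[i] - train_b[j]
        if abs(diff) <= delta_frames:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return matches


def agreement_score(train_a, train_b, delta_frames=8):
    """Matches divided by the size of the union of the two spike trains."""
    n_match = count_matches(train_a, train_b, delta_frames)
    return n_match / (len(train_a) + len(train_b) - n_match)


unit_a = [100, 500, 900, 1500]
unit_b = [102, 498, 1200, 1503]
score = agreement_score(unit_a, unit_b)
print(score)  # 0.6
```

Three of the four spikes in each train coincide, so the score is 3 / (4 + 4 - 3) = 0.6.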

### Compare two sorters

In [None]:
comp = comp_TDC_IC = sc.compare_two_sorters(sorting_TDC, sorting_IC, 'TDC', 'IC')

In [None]:
sw.plot_agreement_matrix(comp)

In [None]:
comp = comp_SC_IC = sc.compare_two_sorters(sorting_SC, sorting_IC, 'SC', 'IC')
sw.plot_agreement_matrix(comp)

In [None]:
comp = comp_SC_TDC = sc.compare_two_sorters(sorting_SC, sorting_TDC, 'SC', 'TDC')
sw.plot_agreement_matrix(comp)

### Compare all sorters

In [None]:
mcmp = sc.compare_multiple_sorters([sorting_TDC, sorting_IC, sorting_SC], ['TDC', 'IC', 'SC'], 
                                   spiketrain_mode='union', verbose=True)

In [None]:
w = sw.plot_multicomp_agreement(mcmp)
w = sw.plot_multicomp_agreement_by_sorter(mcmp)

In [None]:
sw.plot_multicomp_graph(mcmp, draw_labels=True)

In [None]:
agreement_sorting = mcmp.get_agreement_sorting(minimum_agreement_count=2)

In [None]:
sw.plot_rasters(agreement_sorting)

# 10. Exporters <a class="anchor" id="exporters"></a>

## Export to Phy for manual curation

To perform manual curation we can export the data to [Phy](https://github.com/cortex-lab/phy). 

In [None]:
from spikeinterface.exporters import export_to_phy

In [None]:
export_to_phy(we_all, output_folder=base_folder / 'phy_IC',
              progress_bar=True, total_memory='100M')

In [None]:
%%capture --no-display
!phy template-gui phy_IC/params.py

After curating the results, we can reload them using the `PhySortingExtractor` and exclude the units that we labeled as `noise`:

In [None]:
sorting_IC_phy_curated = se.PhySortingExtractor(base_folder / 'phy_IC/', exclude_cluster_groups=['noise'])

In [None]:
print(f"Number of units before curation: {len(sorting_IC.get_unit_ids())}")
# in manual curation, 4 units were labeled as noise:
print(f"Number of units after curation: {len(sorting_IC_phy_curated.get_unit_ids())}")

## Export spike sorting report

In [None]:
from spikeinterface.exporters import export_report

In [None]:
%%capture --no-display
export_report(waveform_extractor=we, output_folder=base_folder/"SI_report", format="png", show_figures=False)

In [None]:
!ls {base_folder}/SI_report

In [None]:
!ls {base_folder}/SI_report/units/

# Et voilà