Searches for clusters that likely belong to the same cell, and merges them automatically.
This module has not been used since switching to Kilosort 2, which has far fewer split units than Kilosort 1. We're keeping the code around in case others find it useful. For example, it could be helpful for matching units across a series of chronic recordings. However, much more complete implementations of this functionality exist elsewhere (UnitMatch, for example).
The SpikeInterface postprocessing module can compute information that is helpful for making merge decisions, such as waveform similarity and cross-correlograms:
import spikeinterface.full as si
from spikeinterface.postprocessing import (compute_template_similarity,
                                           compute_correlograms)
# run a sorter and extract waveforms
# note that this omits some important pre-processing steps and parameters for brevity
recording = si.read_openephys('/path/to/data')
sorting = si.run_sorter('kilosort2_5', recording)
waveform_extractor = si.extract_waveforms(recording=recording,
                                          sorting=sorting,
                                          folder='waveforms')
# run post-processing steps
_ = compute_template_similarity(waveform_extractor)
_ = compute_correlograms(waveform_extractor)
In addition, the curation module includes a function for identifying units that likely need to be merged:
from spikeinterface.curation import get_potential_auto_merge
# Returns a list of unit pairs that should be merged
potential_merges = get_potential_auto_merge(waveform_extractor)
More information can be found in the documentation for the Curation module and the get_potential_auto_merge function.
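Pairwise merge suggestions often chain together (A with B, B with C), so it can help to collapse them into groups before merging. The following is a minimal sketch of that grouping step using a union-find structure. It assumes the suggestions are available as a list of `(unit_a, unit_b)` tuples with mutually comparable IDs. The function name `group_merge_pairs` and the example pairs are hypothetical and not part of SpikeInterface or this module.

```python
def group_merge_pairs(pairs):
    """Group pairwise merge suggestions into sorted lists of connected units."""
    parent = {}

    def find(unit):
        # Register unseen units as their own root, then walk up to the root.
        parent.setdefault(unit, unit)
        while parent[unit] != unit:
            parent[unit] = parent[parent[unit]]  # path halving
            unit = parent[unit]
        return unit

    for unit_a, unit_b in pairs:
        root_a, root_b = find(unit_a), find(unit_b)
        if root_a != root_b:
            # Assumption: keep the smaller ID as the group's representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for unit in parent:
        groups.setdefault(find(unit), []).append(unit)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical unit pairs, for illustration only
example_pairs = [(3, 7), (7, 12), (20, 21)]
merge_groups = group_merge_pairs(example_pairs)
print(merge_groups)  # [[3, 7, 12], [20, 21]]
```

Each resulting group lists the units that would end up sharing one label. Whether every suggested merge should actually be applied remains a curation decision.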
To run this module from the command line:
python -m ecephys_spike_sorting.modules.automerging --input_json <path to input json> --output_json <path to output json>
Two arguments must be included:
- The location of an existing file in JSON format containing a list of paths and parameters.
- The location to write a file in JSON format containing information generated by the module while it was run.
See the _schemas.py file for detailed information about the contents of the input JSON.
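When the module is launched from a script instead of a shell, the same command can be assembled as an argument list. This is only a sketch: the helper names `build_automerging_command` and `run_automerging` and the file names are hypothetical, and the input JSON must already exist with the contents described in _schemas.py.

```python
import subprocess
import sys


def build_automerging_command(input_json, output_json):
    """Assemble the command line shown above as a list of arguments."""
    return [
        sys.executable, "-m", "ecephys_spike_sorting.modules.automerging",
        "--input_json", str(input_json),
        "--output_json", str(output_json),
    ]


def run_automerging(input_json, output_json):
    """Run the module; raises CalledProcessError if it exits with an error."""
    command = build_automerging_command(input_json, output_json)
    return subprocess.run(command, check=True)


# Hypothetical file names, for illustration only
command = build_automerging_command("automerging-input.json",
                                    "automerging-output.json")
print(" ".join(command))
```

Calling `run_automerging` requires the ecephys_spike_sorting package to be installed in the same Python environment.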
Input data:
- Kilosort outputs: include spike times, spike clusters, cluster quality, etc.
Output data:
- spike_clusters.npy: updated with new cluster labels
- cluster_group.tsv: updated with new cluster labels
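To illustrate what "updated with new cluster labels" means for the per-spike labels, here is a minimal sketch of relabeling after a merge. It uses a plain Python list in place of the array stored in spike_clusters.npy. It also assumes every merged cluster takes the first ID in its group; the source does not state which ID the module keeps. The function name `relabel_spikes` and the example values are hypothetical.

```python
def relabel_spikes(spike_clusters, merge_groups):
    """Return per-spike labels with each merge group collapsed to one ID."""
    new_label = {}
    for group in merge_groups:
        target = group[0]  # assumption: the first ID in the group survives
        for cluster_id in group:
            new_label[cluster_id] = target
    # Clusters that are not part of any group keep their original label.
    return [new_label.get(cluster_id, cluster_id) for cluster_id in spike_clusters]


# Hypothetical per-spike cluster labels and merge groups
spike_clusters = [3, 7, 5, 12, 7, 20, 21, 5]
merged = relabel_spikes(spike_clusters, [[3, 7, 12], [20, 21]])
print(merged)  # [3, 3, 5, 3, 3, 20, 20, 5]
```

After such a relabeling, the rows of cluster_group.tsv would need to match the labels that remain.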