There are several implementations of nearest neighbor metrics which can be used to evaluate unit quality.
When calling the ``compute_quality_metrics()`` function, the following options are available to calculate NN metrics:

- The ``nearest_neighbor`` option will return ``nn_hit_rate`` and ``nn_miss_rate`` (based on [Siegle]_, inspired by [Chung]_).
- The ``nn_isolation`` option will return the nearest neighbor isolation metric (adapted from [Chung]_).
- The ``nn_noise_overlap`` option will return the nearest neighbor noise overlap metric (adapted from [Chung]_).

All options involve non-parametric calculations in PCA space.

The membership function, :math:`\rho`, is defined such that for any spike :math:`g_i` in some cluster :math:`G`, :math:`\rho(g_i) = G`. Additionally, the nearest neighbor function :math:`n_k(g_i)` is defined such that the output of the function is the set of :math:`k` spikes which are closest to :math:`g_i`.

For a unit associated with cluster :math:`C`, a subset of spikes is randomly drawn to form the cluster :math:`A`. A subset of spikes which are not in :math:`C` is drawn to form the cluster :math:`B`. Note that :math:`|A| = |B|`. The NN-hit rate for :math:`C` is then:

.. math::
    NN_{\textrm{hit}}(C) = \frac{1}{k} \sum_{i=1}^{k} \frac{ | \{x \in A : \rho(n_i(x)) = A \} |}{ | A | }

Similarly, the NN-miss rate for :math:`C` is:

.. math::
    NN_{\textrm{miss}}(C) = \frac{1}{k} \sum_{i=1}^{k} \frac{ | \{x \in B : \rho(n_i(x)) = A \} |}{ | B | }

NN-hit rate gives an estimate of contamination (an uncontaminated unit should have a high NN-hit rate). NN-miss rate gives an estimate of completeness. A more complete unit should have a low NN-miss rate.
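
The two rates can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SpikeInterface implementation: the helper names ``k_nearest`` and ``nn_hit_miss_rates`` are hypothetical, the points are synthetic 2D stand-ins for PCA features, neighbors are searched in the pooled set of A and B, and ``n_i(x)`` is read as the i-th nearest neighbor of ``x``.

```python
import math
import random


def k_nearest(points, query_index, k):
    """Indices of the k points closest to points[query_index], excluding itself."""
    query = points[query_index]
    others = [j for j in range(len(points)) if j != query_index]
    others.sort(key=lambda j: math.dist(query, points[j]))
    return others[:k]


def nn_hit_miss_rates(cluster_a, cluster_b, k=4):
    """Fraction of neighbors labelled A, for queries in A (hit) and in B (miss)."""
    points = cluster_a + cluster_b  # pooled search space (assumption)
    n_a = len(cluster_a)
    hits = 0
    misses = 0
    for i in range(len(points)):
        neighbors_in_a = sum(1 for j in k_nearest(points, i, k) if j < n_a)
        if i < n_a:
            hits += neighbors_in_a    # query spike belongs to A
        else:
            misses += neighbors_in_a  # query spike belongs to B
    return hits / (k * n_a), misses / (k * len(cluster_b))


# Synthetic, well-separated clusters with |A| = |B|.
rng = random.Random(0)
cluster_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(50)]
cluster_b = [(rng.gauss(20.0, 1.0), rng.gauss(20.0, 1.0)) for _ in range(50)]

hit_rate, miss_rate = nn_hit_miss_rates(cluster_a, cluster_b, k=4)
print(hit_rate, miss_rate)
```

Because the two synthetic clusters sit far apart, every neighbor of a spike in A is also in A and no spike in B has a neighbor in A, which is the ideal case of a high hit rate and a low miss rate. The brute-force search is quadratic in the number of spikes and is only meant for small examples.
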
The overall logic of this approach is to choose a cluster for which the isolation is to be computed, and compute the pairwise isolation score between the chosen cluster and every other cluster. The isolation score is then the minimum of the pairwise scores (the worst case).
Let :math:`A` and :math:`B` be two clusters from sorting.
We set :math:`|A| = |B|` by subsampling as appropriate to match the size of the smaller cluster (or the ``max_spikes_for_nn`` parameter value, if used).
We also restrict the waveforms to channels with significant signal.
The pairwise isolation between clusters :math:`A` and :math:`B` is then:

.. math::
    NN_{\textrm{isolation}}(A, B) = \frac{1}{k} \sum_{i=1}^{k} \frac{ | \{x \in A \cup B : \rho(n_i(x)) = \rho(x) \} |}{ | A \cup B | }

Note that ``nn_isolation`` is affected by the size of the clusters, so setting the ``max_spikes_for_nn`` parameter may aid downstream comparison of scores.
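
The pairwise score and the worst-case minimum can be sketched as follows. This is an illustrative sketch rather than the library's code: ``pairwise_isolation`` and ``isolation`` are hypothetical helpers, ``max_spikes`` stands in for ``max_spikes_for_nn``, the data are synthetic 2D points, and the channel restriction step is omitted.

```python
import math
import random


def pairwise_isolation(cluster_a, cluster_b, k=4, max_spikes=None, seed=0):
    """Fraction of k nearest neighbors sharing the query's label, over A and B."""
    rng = random.Random(seed)
    # Subsample both clusters to a common size (assumes that size exceeds k).
    n = min(len(cluster_a), len(cluster_b))
    if max_spikes is not None:
        n = min(n, max_spikes)
    points = rng.sample(cluster_a, n) + rng.sample(cluster_b, n)
    labels = [0] * n + [1] * n
    same = 0
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        same += sum(1 for j in order[:k] if labels[j] == labels[i])
    return same / (k * len(points))


def isolation(clusters, target, k=4, max_spikes=None):
    """Worst-case (minimum) pairwise isolation of `target` against all others."""
    return min(
        pairwise_isolation(clusters[target], other, k, max_spikes)
        for name, other in clusters.items()
        if name != target
    )


rng = random.Random(1)


def blob(cx, cy, n):
    """Synthetic Gaussian cluster centred on (cx, cy)."""
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]


clusters = {
    "unit_1": blob(0.0, 0.0, 60),
    "unit_2": blob(1.5, 0.0, 40),   # close to unit_1
    "unit_3": blob(20.0, 20.0, 80), # far from unit_1
}
print(isolation(clusters, "unit_1", k=4, max_spikes=50))
```

In this example the nearby ``unit_2`` determines the score for ``unit_1``, since the minimum over pairs picks out the least isolated comparison. Fixing ``max_spikes`` keeps the number of points per pair bounded, which is the reason the text suggests setting it when comparing scores.
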
A noise cluster is generated by randomly sampling voltage snippets from the recording. Following a similar procedure to that of the ``nn_isolation`` method, the isolation is computed between the cluster of interest and the generated noise cluster. Noise overlap is then :math:`1 - NN_{\textrm{isolation}}`.
This metric gives an indication of the contamination present in the unit cluster.
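
A simplified sketch of the idea is shown below. Everything in it is an assumption made for illustration: the recording is a synthetic Gaussian trace, the spike template and the helper ``nn_isolation_score`` are hypothetical, and the comparison is done directly on raw snippets rather than in PCA space. The actual implementation may include further processing steps that are not reproduced here.

```python
import math
import random

rng = random.Random(1)


def nn_isolation_score(cluster_a, cluster_b, k=4):
    """Fraction of k nearest neighbors that share the query's cluster."""
    points = cluster_a + cluster_b
    n_a = len(cluster_a)
    same = 0
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        same += sum(1 for j in order[:k] if (j < n_a) == (i < n_a))
    return same / (k * len(points))


# Hypothetical single-channel recording made of unit-variance noise.
n_samples, snippet_len, n_spikes = 5000, 10, 40
trace = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Hypothetical unit: a large-amplitude template plus noise on each spike.
template = [0, -5, -30, -60, -30, -5, 10, 15, 10, 5]
unit_snippets = [[t + rng.gauss(0.0, 1.0) for t in template] for _ in range(n_spikes)]

# Noise cluster: snippets taken at random positions in the recording.
starts = [rng.randrange(n_samples - snippet_len) for _ in range(n_spikes)]
noise_snippets = [trace[s:s + snippet_len] for s in starts]

noise_overlap = 1.0 - nn_isolation_score(unit_snippets, noise_snippets, k=4)
print(noise_overlap)
```

With a template this much larger than the noise, the unit snippets and noise snippets never appear among each other's nearest neighbors, so the overlap is zero. A low-amplitude unit would mix with the noise cluster and push the value toward higher overlap.
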
.. autofunction:: spikeinterface.qualitymetrics.pca_metrics.nearest_neighbors_metrics
.. autofunction:: spikeinterface.qualitymetrics.pca_metrics.nearest_neighbors_isolation
.. autofunction:: spikeinterface.qualitymetrics.pca_metrics.nearest_neighbors_noise_overlap
Introduced by [Chung]_ and adapted by [Siegle]_ and Kyu Hyun Lee.