Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double-counted spikes #29

Closed
jsiegle opened this issue Mar 13, 2019 · 18 comments · Fixed by #595
Closed

Double-counted spikes #29

jsiegle opened this issue Mar 13, 2019 · 18 comments · Fixed by #595

Comments

@jsiegle
Copy link

jsiegle commented Mar 13, 2019

It seems like there's a tradeoff between Kilosort's spike threshold and the frequency of putative double-counted spikes. Presumably, this results from spike residuals getting fit by the same template or a different template on the same set of channels. So, the lower the threshold, the higher the likelihood that a residual will get counted as a different spike. Here's an example of a pair of units that has a lots of overlapping spikes at zero time lag, which is unlikely to be physiological:

double-counting_example_sort1_574_326

The good news is that the latest version of Kilosort is performing much better than previously when it comes to double-counting. Even with a lower threshold (ops.Th = [10 4] vs. [12 12] previously), the percentage of units with large zero-lag ACG peaks drops from 10% to 3%. But because false-positive zero-lag synchrony is so dangerous, it would be advantageous to reduce double-counting even further.

A few questions:

  • Do other people see this as well? The easiest way to visualize it in phy is by selecting a block of units from nearby channels and looking for pairs with abnormally large peaks at 0.
  • Are there any parameters besides ops.Th that would help reduce double counting? I tried changing ops.lam to 40, but it didn't have much of an effect.
  • Does double-counting always happen on the last pass, after the residuals are subtracted? Or can the same spike be fit by multiple templates during the main sort?
  • Would it be possible to flag spikes with highly overlapping templates that are detected within +/-2 samples? Nick suggested notifying the user about undetected spikes in tell user about undetected spikes #3, and I think adding information about potential double-counted spikes could also be helpful.
@marius10p
Copy link
Contributor

Thanks Josh. Is the issue of zero-lag ACG peaks related to the issue of zero-lag CCG peaks?

@jsiegle
Copy link
Author

jsiegle commented Mar 13, 2019

I think so...they both occur more often when the threshold is lowered. But I am most worried about zero-lag CCG peaks, because we do expect there to be some perfectly overlapping spikes, just not as many as we're seeing in certain pairs.

@AurelieP23
Copy link

I may be seeing this as well, though not very often. It seems to correlate with a smaller ops.lam (in this example ops.lam = 0, ops.Th = [10 2]).

Ex_doublecountedAPs

The templates for the two clusters are very different ( blue one was manually merged, hence the blue and grey traces).

Ex_doublecountedAPs_templates

Any thoughts about the role of ops.lam and ops.Th in this issue?

@AurelieP23
Copy link

It seems like increasing ops.Th(1) may help... from #17 (example with blue, red and yellow clusters)

@marius10p
Copy link
Contributor

@AurelieP23 I would not be too worried about your example... the red cluster should be discarded anyway, since it only has 159 spikes. You are right that increasing Th(1) would reduce this problem, but you might lose other perfectly good units in the process. This situation should happen only for the very large spikes, when there is a residual after template subtraction.

Are you sure the blue one should have been merged? Looks like pretty large refractory violation bin at 0.

@chris-angeloni
Copy link

I have also seen this with my data, and that makes sense that it is just refitting the residual, as I usually see it with very high amplitude units, as seen in the screen shot below. I'll note that this clustering was run with the default parameters, where threshold is [10 4] and lam is 10.

image

@kalfton
Copy link

kalfton commented May 30, 2019

I encountered this problem as well, and I will show this problem in another prospective.
We recorded neurons by electrode array, and each electrode channel is far from others, so there should not be any neuron appears in multiple channels.

I use kilosort to detect spikes from the channels, and then import the result to Plexon's offline sorter to check. (ops.lam = 10; ops.Th = [10 4]; )

The picture below shows one of the channels in offline sorter. There are 3 units here(yellow , green and blue). And we found that a large number of the spikes in the blue unit is just a small shift of that in the yellow unit, and I will explain it below:

The 3 subplots on the top row are: Raw spike shapes, ACG and CCG of units, and PCA result. The CCG of yellow and green units has a large peak at zero lag. (even for yellow and blue units as well)

The row in the middle shows the units' average shapes and histograms of Inter-spike intervals

The subplot in the lowest row: It plots spikes versus time. Here as you can see, a spike from the green unit overlaps with a spike from the yellow unit. Also another spike from the yellow unit overlaps with a unsorted(mua) spike.

plexon

I found this double counting problem happened in about 10% of the channels. These double counted spikes are very dangerous for us, and is not easy to discover and remove them in the phy GUI. So it would be very beneficial to find a way to get rid of it.

@susuchen66
Copy link

susuchen66 commented Jul 7, 2019

Hi we are also seeing such cases which show up as high peak at near zero of cross correlogram between two clusters, which presumably is a result of a second template fit onto residual after the first template subtraction. This is obvious in about 5% of clusters, but could be more prevalent in cases that are harder to detect.

We realised this both when using JRClust GUI and Phy during manual curation.

ks2output15
templates1
templates2

@susuchen66
Copy link

susuchen66 commented Jul 7, 2019

In addition, we encounter large fraction of double counted spikes within the same cluster, and they are slightly temporally shifted, so the autocorrelogram shows high peak near zero.

High amplitude unit example visualised in JRClust GUI:

Screenshot 2019-07-07 at 17 23 45

Same unit in Phy. Note that Phy does not show the peak at near zero!! (this issue has been reported to Cyrille on Phy Github page):

Screenshot 2019-07-07 at 17 24 09

@susuchen66
Copy link

susuchen66 commented Jul 7, 2019

Here is a summary of ISI violation % when setting a threshold of 0.5 ms, 1 ms, or 2 ms for 180 manually curated units. a quick way we can pick out clusters with likely double counting by comparing the number of ISI < 0.5 and ISI < 2.0

Picture1

@DJ-Hayden
Copy link

DJ-Hayden commented Jul 12, 2019

Could this simply be fixed by searching through every channel and flagging spikes that occur (in separate clusters) within a millisecond of each other?

@thomaszhihaoluo
Copy link

Hi! Has anyone has arrived at a way to address this issue and would be willing to let me know? I would really appreciate it. I encounter a similar problem with my datasets. Thanks!

@NoahDolev
Copy link

NoahDolev commented Dec 2, 2019

We have similar issues sorting retinal spikes on MEAs. In addition to double counted spikes, we also have spikes where only a portion of them is detected as part of the waveform.

Can you point us to parameters that would be likely to affect this? Is anyone willing to post their ops settings for retina?

image

@bapungiri
Copy link

Another example of this issue. Does anyone have a solution to this?

image

@nsteinme
Copy link
Collaborator

nsteinme commented Jan 5, 2020

In this last example from @bapungiri, the red unit is one with a much smaller template (can see this in the amplitude column value), where the red template is erroneously found on the residuals of the blue. Since the red one has very low amplitude (judging by the few red spikes that don't look like the blue), a simple amplitude threshold on the template would solve this, i.e. you ought to call the red one noise, and an amplitude threshold would achieve that. Are there examples where neither of the templates are very small? I doubt it, but please show if there are. (P.S. thanks for the full phy screenshot, very helpful).

@bapungiri
Copy link

You are correct, these are mostly restricted to high amplitude spikes. The red cluster shown above was previously part of another cluster (53 in the above example) which had a smaller amplitude. I guess after subtracting the high amplitude template, the residuals were similar to template of smaller waveform (cluster 53). Just as an example, I re-merged the red cluster to cluster number 53 to show two waveforms after which i decided to separate them.
image

@DJ-Hayden
Copy link

Hello,

I don't think this is just a template-subtraction issue.

I've attached two images. The first is of unit 14 alone. Check out TraceView, where 3 spikes in a row are clearly visible (and likely the same unit), but only the first is included. Then check out unit 14 and 314. The first spike is a part of BOTH unit 14 and unit 314.

This is...troubling to say the least. Are there any plans to make sure the algorithm isn't double-counting spikes?

Unit 14:
unit14_only

Unit 14 and 314:
unit14_and_314

@marius10p
Copy link
Contributor

It's a very small proportion of all spikes that get double counted. Remove them yourself in post-processing if you're still worried about them, but right now this is the price to pay for the template matching approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.