Improve default dimensions in feature grid view #360
Comments
Hm. Definitely it makes sense to show different dimensions in (i, j) vs. (j, i) rather than just mirroring. Are you saying that it currently picks just one channel, and shows you PC1, PC2, and PC3 from that channel in columns 1, 2, and 3? And the same for a different channel in the rows?

Well, I don't want to claim I'm certain I have the best answer, but here's how I'd do it if I were doing it from scratch:

- (1, 1) = time vs. depth (where depth is a feature computed for each spike; did I request this as a feature? I think it's a really good idea).
- I would keep the x axis of all first-row and first-column plots as time (like it currently is, but keeping time on x rather than y, which I think is much easier to look at).
- Columns 2, 3, 4 = channels x, y, and z, where those are the best channels for the first selected cluster. Then you plot PC1, PC1, PC2, and PC3 for each of the four rows in that column (all on the y axes).
- Rows 2, 3, 4 = channels m, n, and p, where those are the best channels for the second selected cluster.

So for example, plot (3, 1) is channel n PC1 vs. time. Plot (3, 2) is channel n PC1 [again] vs. channel x PC2. Plot (4, 4) is channel p PC3 vs. channel z PC3. Etc. This gives you 18 different features to look at, plus six different PC1-vs-time plots. It could know to pick a different channel for m, n, or p if those were already selected as x, y, and z. Just one idea! Just thinking how to maximize the information there.

There could be another "mode" that would be very useful, in which all plots were PC1 vs. time, and the plots were arranged by their spatial layout on the probe. So plot (2, 1) is the channel physically just above plot (3, 1), and you're seeing PC1 vs. time for each. This would really help see drift.
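The grid assignment described above can be sketched concretely. This is a minimal illustration only: the function name, the `(channel, pc)` dimension encoding, and the string channel labels are all assumptions, not phy's actual API. Grid positions here are 0-based, while the discussion uses 1-based plot coordinates.

```python
def feature_grid(chans_a, chans_b):
    """Map each (row, col) of a 4x4 feature grid to an (x_dim, y_dim) pair.

    chans_a: the 3 best channels of the first selected cluster (columns 1-3).
    chans_b: the 3 best channels of the second selected cluster (rows 1-3).
    A dimension is 'time', 'depth', or a (channel, pc_index) pair.
    """
    # Inner grid position -> PC index: positions 1, 2, 3 show PC1, PC2, PC3.
    pc = {1: 0, 2: 1, 3: 2}
    grid = {(0, 0): ('time', 'depth')}      # top-left: time vs. per-spike depth
    for j in range(1, 4):
        grid[(0, j)] = ('time', (chans_a[j - 1], 0))   # first row: PC1 vs. time
    for i in range(1, 4):
        grid[(i, 0)] = ('time', (chans_b[i - 1], 0))   # first column: PC1 vs. time
    for i in range(1, 4):
        for j in range(1, 4):
            x = (chans_a[j - 1], pc[i])     # column's channel; PC set by the row
            y = (chans_b[i - 1], pc[j])     # row's channel; PC set by the column
            grid[(i, j)] = (x, y)
    return grid
```

With `feature_grid(['x', 'y', 'z'], ['m', 'n', 'p'])`, cell (2, 1) (plot (3, 2) in the 1-based numbering above) comes out as channel n PC1 on y vs. channel x PC2 on x, matching the example in the comment.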
Great stuff in there!
What is it, exactly? Is it the "waveform mean position"?
Yes, the y-coordinate of it. Except computed for each spike individually, rather than the mean.
More generally, if I compute a quantity of my own for each spike, can I plot it in the feature view?
Great idea! You can already register your own cluster statistic; we could do the same for spikes, and then add an option to plot it in the feature view. What do you have in mind specifically? Would it be OK if the interface were something like:
Basically, computing your quantities on a per-cluster basis: this would just make my life simpler, because this mechanism already exists...
Yep, I think that would do it. I don't think computing on a per-cluster basis would be a problem. Are the computed values saved anywhere?
No. The assumption is that custom stats are fast to compute, and they are always computed on the fly. But that's something we could change if necessary. How intensive are your functions? What is it that you want to compute, exactly? (That's also the case for the default stats, except mean features/masks/waveforms, which take a lot of time to compute due to the high I/O cost of loading everything into RAM...)
The main thing is computing the x, y location of each individual spike, which optimally would be based on the waveform amplitude on each channel for each spike. That might be fairly intensive; you might be able to do it on the fly for a selection of waveforms. Another thing that I would sometimes find useful (and I know at least two other people who think it is a major feature) is computing the value of the waveform at a particular sample, i.e. just taking the value session.store.waveforms(clusterID)[spikeNum, specifiedSampleNum, specifiedChannelNum], then plotting that as a feature so you can use the lasso to split on it.
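One plausible way to compute the per-spike location described here is the amplitude-weighted centroid of each spike's peak amplitude across channels. This is only a sketch of that idea; the function and argument names are assumptions, not phy's API:

```python
import numpy as np

def spike_positions(waveforms, channel_xy):
    """Estimate the (x, y) position of each spike.

    Uses the amplitude-weighted centroid of the channel positions,
    weighted by each spike's peak absolute amplitude on each channel.

    waveforms:  (n_spikes, n_samples, n_channels) array.
    channel_xy: (n_channels, 2) array of channel coordinates.
    Returns an (n_spikes, 2) array.
    """
    amp = np.abs(waveforms).max(axis=1)        # (n_spikes, n_channels)
    w = amp / amp.sum(axis=1, keepdims=True)   # normalize weights per spike
    return w @ channel_xy                      # weighted centroid per spike
```

The second request (the waveform value at one sample) is just an indexing operation, `waveforms[:, sample, channel]`, once the waveforms are in memory; the expensive part is fetching them, as the next comment explains.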
You don't have fast access to the waveforms of all spikes: they are fetched dynamically from the raw data, which is slow for thousands of waveforms. So instead, phy samples ~100 waveforms per cluster and saves them for fast per-cluster read access. You can therefore quickly compute anything based on the waveforms, as long as only this small subset of waveforms is used. Anything that uses all waveforms will be extremely slow.
That is an argument for being able to save the thing you compute, so you don't have to recompute it every time.
Or, how about this: let's say I computed the waveform positions myself in advance and saved them, could phy then load and plot them as a feature?
That would work, but it's kind of a hack... OK, so that suggests a different approach:
cc @nippoo
OK, so I'm not sure how useful this will be in reality. The concept of an amplitude discriminator at a fixed time offset seems nice in theory, but isn't this the same goal the PCs are trying to achieve? If you're finding that the features don't discriminate adequately between waveforms, perhaps try increasing the number of PCs per waveform from 3 to 5 or something? Arbitrary user-defined amplitude splitting is always going to be REALLY slow (aka cups-of-coffee slow), since you're effectively bypassing PCA and going back to the waveforms. We should really avoid any clustering method based on the raw waveforms, since IIRC you should be able to get as much information as you need out of the features. (Correct me if I'm wrong.) I'm not entirely confident of the maths, but @kdharris101 or @shabnamkadir might have some ideas to suggest about the best way of going about this?
The problem arises when some neuron has a really unusual waveform, and this isn't well captured by the first few PCs. And adding more PCs isn't a good solution: it will take a lot longer to do the clustering.
We could write a function quite easily to split a cluster in two based on amplitude above/below a certain point, clickable in the WaveformView. But I reckon it'd be something along the lines of:
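As a minimal sketch of what such a split could look like, assuming the waveforms for all spikes in the cluster have been fetched (which, per the earlier comment, is the slow part); the function name and signature are illustrative, not phy's API:

```python
import numpy as np

def split_by_amplitude(waveforms, sample, channel, threshold):
    """Split a cluster's spikes by the waveform value at one sample/channel.

    waveforms: (n_spikes, n_samples, n_channels) array; for a real split
               these would have to be fetched from the raw data for every
               spike, which is where the 1-2 minute estimate comes from.
    Returns (above_idx, below_idx): spike indices at or above, and below,
    the threshold.
    """
    values = waveforms[:, sample, channel]
    above = np.nonzero(values >= threshold)[0]
    below = np.nonzero(values < threshold)[0]
    return above, below
```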
So for a cluster with 100,000 spikes, it'd take about 1-2 minutes, as a rough guess.
I feel like this is the sort of thing that's best done from IPython rather than hard-coded in the GUI...
Well, I'll put it this way: this is a feature that you can do in Plexon.
Same. Unless it's something you'll end up doing regularly...? If this is more than a one-off thing and it'll actually be useful, there should be some way of clicking to set the discriminator and visually confirm it's about right.
Anyway, this is something that might be partially doable with this suggestion. Do you still need it? If so, I'll open a new issue.
I feel like that might be too general a solution to a specific problem - probably YAGNI for the moment, though I think I've come round to Nick's viewpoint that we probably need something to deal with this particular issue. |
Well, I think there are many other cases where you'll want to compute custom per-spike data and plot it (like the position of the mouse at the time of every spike, etc.), no?
Well, there's a big difference between per-spike data that is computed from the waveforms and per-spike data that comes from an external source.
@nsteinme I'm gonna do this. What about the following API:
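The API snippet itself isn't reproduced above. Purely as a hypothetical illustration of the kind of per-spike feature under discussion (the waveform value at a fixed sample and channel, usable as a lasso-split dimension); the function name and the commented registration call are assumptions, not phy's API:

```python
import numpy as np

def my_feature(waveforms):
    """Per-spike scalar: waveform value at a fixed sample and channel.

    waveforms: (n_spikes, n_samples, n_channels) array.
    """
    # e.g. the value at sample 20 on channel 3 for every spike
    return np.asarray(waveforms)[:, 20, 3]

# Hypothetical registration, then the feature view could offer
# 'my_feature' as an axis dimension:
# session.register_statistic(my_feature, per='spike')
```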
I'm not 100% sure what this API is supposed to achieve, but if I understand right, you're suggesting we should be able to define an arbitrary feature as the amplitude at a certain point and calculate it for all spikes? If so, I think this isn't really a very good solution to the problem: this will be for one cluster, the time offset will probably be different every time you want to do it, and you'll want to discard this information as soon as the split is complete. But maybe I'm understanding it wrongly?
This is something the very early spike sorting systems had (e.g. Michael Recce's software). At that time I was against it, as it makes it easy to "carve out" things that look like clear spikes from pure MUA. But I guess it could be useful in a modern context.
There are a couple of things this is addressing (correct me if I'm wrong).
Looks good to me!
@nsteinme yes, you understood correctly; you can define any custom per-spike quantity.
@nsteinme what should be the `x` and `y` dimensions in `subplot(i, j)`? Previously, `x=dim_j, y=dim_i`; now this is reversed. I can find arguments for both ways. WDYT?

Also, maybe we could show different dimensions in `(i, j)` and `(j, i)` (currently, the plots are just symmetric).