Q: correct way to do permutation clustering for multiple label time courses #1176
AFAIK you would have to construct a custom connectivity matrix that connects the columns but not rows and then pass it using the |
@mluessi yes, that's where I'm stuck --- setting up the correct structure .... |
Using a custom matrix will be slow (because it can't use the optimized cluster-finding code). I think in your case you could pass the data as a 1D signal, i.e., concatenate all rows together and add one-element zero values between the rows (to make sure there can't be clusters that span more than one row). |
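A minimal sketch of the flattening idea above, assuming the data are an array of shape (n_observations, n_rows, n_times). The helper name `flatten_with_gaps` is hypothetical and not part of mne-python. It appends one zero-valued sample after each row, flattens, and drops the trailing pad, so neighbouring rows end up separated by a single zero element. Whether such a gap really stays below the cluster-forming threshold depends on the statistic used, so that part remains an assumption.

```python
import numpy as np


def flatten_with_gaps(X):
    """Flatten (n_obs, n_rows, n_times) to 1D features with zero gaps.

    Hypothetical helper: one zero-valued element is placed between
    consecutive rows so that they are not adjacent in the 1D signal.
    """
    n_obs, n_rows, n_times = X.shape
    # reserve one extra (zero) sample at the end of every row
    padded = np.zeros((n_obs, n_rows, n_times + 1), dtype=X.dtype)
    padded[:, :, :n_times] = X
    # flatten rows and drop the pad that follows the last row
    return padded.reshape(n_obs, -1)[:, :-1]


# toy data: 16 observations, 10 time courses, 500 samples each
X = np.random.RandomState(0).randn(16, 10, 500)
X_flat = flatten_with_gaps(X)  # shape (16, 10 * 500 + 9)
```

Cluster indices found on the flattened signal would then have to be mapped back to (row, time) pairs, using a stride of n_times + 1.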
yeah, I thought of that as well, the zero padding idea was missing though :-) |
Might this be beneficial on the mne_analysis list, where others could follow? |
Anyways, with 10 time courses performance should not be an issue. Would actually be nice to expose some custom connectivity matrix example in the -- to-be-started -- cookbook or so |
Yes, I agree, that would be nice. I also agree with @dgwakeman, I think it is good to send general analysis related questions to the mailing list. This question is a good example of a question about an advanced feature available only in mne-python, so it would be a good idea to expose more users to it. At the same time, it is not a good idea to spam the mailing list ;). |
Maybe a good compromise would be to post the original question to the list, |
Hi folks, another obvious solution is to put empty time courses between the label time courses. An array of size (16, 10, 500) would then be (16, 20, 500). This should be equivalent to @mluessi's proposal, with the advantage that the returned cluster indices are bool ndarrays instead of slices. Subsequent visualization seems more straightforward to me this way, at least given my spatio-temporal viz code. I think it would be nice to properly support this use case, maybe by providing a temporal-only connectivity matrix. |
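The interleaving described above could look roughly like this. It is a sketch under the assumption that the data have shape (n_observations, n_labels, n_times); the helper name `interleave_empty_rows` is hypothetical.

```python
import numpy as np


def interleave_empty_rows(X):
    """Insert an all-zero time course after every label time course.

    Hypothetical helper: (n_obs, n_labels, n_times) becomes
    (n_obs, 2 * n_labels, n_times), with the original rows at even indices.
    """
    n_obs, n_labels, n_times = X.shape
    out = np.zeros((n_obs, 2 * n_labels, n_times), dtype=X.dtype)
    out[:, ::2] = X  # odd rows stay zero and act as separators
    return out


X = np.random.RandomState(0).randn(16, 10, 500)
X_padded = interleave_empty_rows(X)  # shape (16, 20, 500)
```

Row 2 * i of the padded array corresponds to label i of the original, which keeps the mapping back to labels simple.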
the use case is also relevant for multi sensors without the spatial see this very old example: one could avoid the mean over the time axis |
yes +1 |
So we would need to set up a conn matrix which comprises different blocks of features. The temporal features would be connected via the diagonal and diagonal + 1 (for the upper triangle). The spatial feature block would only have zeros, just as the spatio-temporal cross-section. Does that make sense? @Eric89GXL @mluessi @agramfort |
That being said, it would of course be nice to have a top-level function that allows you to set e.g. one type of connectivity {regular lattice, ...} selectively for one block of features, e.g., {frequency only, time only, space only}. |
Probably it would also be nice to set the degree (1, 2, 3, neighbours ...) |
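To make the block structure and the "degree" idea concrete, here is a small sketch. It assumes features are ordered location-major, i.e. feature index = location * n_times + time, and that only the upper triangle is needed. The names `temporal_band` and `block_connectivity` are hypothetical, not existing mne-python functions.

```python
import numpy as np
from scipy import sparse


def temporal_band(n_times, degree=1):
    """Upper-triangular band linking each sample to its next `degree` samples."""
    if not 1 <= degree < n_times:
        raise ValueError("degree must satisfy 1 <= degree < n_times")
    offsets = list(range(1, degree + 1))
    diagonals = [np.ones(n_times - k) for k in offsets]
    return sparse.diags(diagonals, offsets, shape=(n_times, n_times),
                        format="csr")


def block_connectivity(n_locations, n_times, degree=1):
    """One temporal band per location; no links between locations."""
    band = temporal_band(n_times, degree)
    # the Kronecker product with the identity repeats `band` along the
    # block diagonal, leaving every cross-location block empty
    return sparse.kron(sparse.identity(n_locations), band, format="csr")


conn = block_connectivity(n_locations=2, n_times=3, degree=2)
```

Using a Kronecker product keeps the "which block is connected" decision separate from the "how are features within a block connected" decision, which is roughly what a top-level helper would need.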
So how would people think about a function that covers the creation of the most relevant connectivity matrices? It could be used internally. |
Alternatively: if good sklearn / scipy utils exist, it would be worth adding an example that covers this use case: custom connectivity + multiple label time series. |
I doubt those packages provide use cases. I'm +1 for adding relevant connectivity matrices like this. |
So we would create a matrix of zeros with n_columns = n_features, e.g. n_times * n_locations. |
we should start a list of todos for the sprint in june... |
Ok folks, here's a first draft of a temporal-only connectivity function:

from scipy import sparse

def get_temporal_connectivity(n_locations, n_times):
    """Create temporal connectivity matrix

    Useful for analyses of non-neighbouring labels.
    Note. It is assumed that time is the last dimension

    Parameters
    ----------
    n_locations : int
        the number of locations.
    n_times : int
        the number of time points.
    """
    n_features = n_locations + n_times
    c1 = sparse.eye(n_features, k=1)
    c1.data[:, :n_locations + 1] = 0
    c2 = sparse.eye(n_features, k=-1)
    c2.data[:, :n_locations] = 0
    return c1 + c2 |
Ok I think I now see what's wrong here (the dimensions...) I'm currently testing and will let you know once everything works with my use case. |
Ok, here we go:

from scipy import sparse

def get_temporal_connectivity(n_locations, n_times):
    """Create temporal connectivity matrix

    Useful for analyses of non-neighbouring labels.
    Note. It is assumed that time is the last dimension

    Parameters
    ----------
    n_locations : int
        the number of locations.
    n_times : int
        the number of time points.
    """
    n_features = n_locations * n_times
    # upper diagonal: feature i is connected to feature i + 1
    connectivity = sparse.eye(n_features, k=1)
    # DIA storage is indexed by column, so entry (i, i + 1) lives at
    # data[0, i + 1]; zero the link from the last sample of one location
    # to the first sample of the next
    connectivity.data[:, n_times::n_times] = 0
    return connectivity

works on my machine ;-) any thoughts? shall we add this as a function or make an example? @agramfort @mluessi @Eric89GXL |
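One way to sanity-check a temporal-only connectivity matrix is to count its connected components: with time as the last dimension there should be exactly one component per location. The sketch below is an illustration under that assumption, not the function from the thread. It builds the matrix with a Kronecker product (the helper name `temporal_only_connectivity` is hypothetical) and checks it with `scipy.sparse.csgraph.connected_components`.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.csgraph import connected_components


def temporal_only_connectivity(n_locations, n_times):
    """Upper-triangular matrix linking consecutive samples within a location."""
    step = sparse.diags([np.ones(n_times - 1)], [1], shape=(n_times, n_times))
    return sparse.kron(sparse.identity(n_locations), step, format="csr")


n_locations, n_times = 3, 4
conn = temporal_only_connectivity(n_locations, n_times)

# directed=False treats the upper triangle as an undirected graph
n_components, labels = connected_components(conn, directed=False)

# every location should form its own component spanning all of its samples
per_location = labels.reshape(n_locations, n_times)
ok = (n_components == n_locations
      and np.all(per_location == per_location[:, :1]))
```

The same check can be applied to any candidate matrix before it is handed to the clustering function: a wrong slice offset shows up immediately as an unexpected number of components.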
For generalization: a higher order use case would then be to disconnect labels while connecting times and frequencies for multiple TFR analyses. And so on. |
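A rough sketch of that higher-order case, assuming features are flattened as label * n_freqs * n_times + freq * n_times + time and that first-order neighbours along time and frequency are connected. The helper names `chain` and `tfr_connectivity` are hypothetical.

```python
import numpy as np
from scipy import sparse


def chain(n):
    """Upper-triangular links between consecutive elements of one axis."""
    return sparse.diags([np.ones(n - 1)], [1], shape=(n, n), format="csr")


def tfr_connectivity(n_labels, n_freqs, n_times):
    """Connect neighbours in time and frequency, but never across labels."""
    # neighbours along time within each frequency bin
    along_time = sparse.kron(sparse.identity(n_freqs), chain(n_times),
                             format="csr")
    # neighbours along frequency at each time point
    along_freq = sparse.kron(chain(n_freqs), sparse.identity(n_times),
                             format="csr")
    lattice = along_time + along_freq
    # repeat the time-frequency lattice once per label on the block diagonal
    return sparse.kron(sparse.identity(n_labels), lattice, format="csr")


conn = tfr_connectivity(n_labels=2, n_freqs=2, n_times=3)
```

The pattern extends to further dimensions by nesting additional Kronecker products, with an identity factor for every axis that should stay disconnected.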
Shouldn't it be
This assumes that your data is originally a |
see updated code ... |
lol.. I see you figured it out while I was writing my comment. The solution is always the same :), but you also need the |
parallel discoveries ;-)
... why actually? The documentation says only the upper triangle is used. |
Oh.. I guess that makes sense :). You should be all set then. I wonder how using this custom matrix compares in terms of computation time to using the flattening trick we discussed earlier.. I assume that using the matrix will be much slower.. |
@mluessi for that use case it does not matter at all: 10 time courses ;-) |
I'm wondering what would be the appropriate way to do clustering permutation stats with a bunch of time courses extracted from multiple labels. Currently I'm using 'permutation_cluster_test' with an F-test. However, this can lead to pseudo-spatial clusters just because one label is put next to another label in the data ndarray. My hunch would be to pass a connectivity matrix that only connects temporal features but not the spatial ones. Any thoughts on that, @christianmbrodbeck @agramfort?