
How to look at clustering in time, frequency and space #28

Open · pokor076 opened this issue Mar 31, 2020 · 9 comments

@pokor076
Hi there,
I'm enjoying playing around with the toolkit and love the idea of the TFCE approach. Is it possible to perform a t-test that identifies significant clusters in time, frequency, and space? I tried running two matrices (condition1 and condition2) of 128 electrodes x 129 frequency bins x 513 time bins through the ept_TFCE function (with ft_flag set to 1). The function ran, but the output didn't have any spatial information and I couldn't get the results to load (I know I selected a valid e_loc file when running the t-test, because I was able to get spatial information when reading in subj x time x elec matrices). Essentially, I would like to contrast two grand-average time-frequency matrices to identify spatio-temporo-frequency clusters of interest. Is this possible with this toolkit, and if so, do you have any tips for where I might be going astray?

Thanks!
Victor

@Mensen (Owner)

Mensen commented Apr 1, 2020

In principle, you shouldn't need to change anything; your input matrix will simply carry the extra dimension.

The ft_flag is only for when you don't have any channel information at all.

@pokor076 (Author)

pokor076 commented Apr 1, 2020 via email

@pokor076 (Author)

Ok so I was able to get everything to run smoothly (I believe). The result viewer is super handy and intuitive! Now my issue is that I'm not seeing any compelling clusters in space, frequency or time (screenshot attached). I realize that it's possible that there are just no strong clusters that modulate between my conditions, but I wanted to check with you to ask: 1.) is this a fairly common result of TFCE on TF EEG data? 2.) do you have any recommendations for approaches that better leverage the power of TFCE for TF EEG data (in contrast to the approach I described in my first post)?

Thanks so much for this tool! My lab produces a lot of TF EEG data and a perennial issue we face is choosing electrodes, time points and frequency bins of interest for exploratory analyses.

[image attached: results screenshot]

@Mensen (Owner)

Mensen commented Apr 13, 2020

Hmmm... I wouldn't say that is a typical result, actually. Although I don't do a lot of channel x time x frequency analysis myself... so it's difficult for me to comment on "typical".

I'm glad you think the result viewer is still handy... I don't think I've touched the code for almost 5 years at this point and would probably do a lot of things differently (if I could ever find the time).

I'd go as far as to say there may be some mistake in the analysis... however, you do get "some" results.

If you plot the individual TF maps for a single channel of interest, does it seem like there should be a difference there?

If you then look at the "Results" variable in the file, and find the T_obs (observed T-value) for a few specific points of interest, are those T-values very high?

While the TFCE method does "generally" reduce significance compared to analysing any one selected point (because it corrects for the multiple comparisons issue), this is not always the case: if a point has strong support from its neighbours (in channel, time, and frequency), TFCE can in fact increase its significance (lower p-value) relative to a single-point analysis.

Let's first make sure there are no bugs somewhere in that code, and then you can be more confident in those results (or lack thereof).
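The neighbourhood-support behaviour described above can be sketched in a few lines. This is a generic 1-D illustration of TFCE enhancement (following Smith & Nichols, 2009, with the conventional E = 0.5, H = 2 parameters), not the toolbox's own implementation:

```python
import numpy as np

def tfce_1d(t_vals, dh=0.1, E=0.5, H=2.0):
    """Enhance a 1-D array of positive t-values: each point's score is the
    integral over thresholds h of (cluster extent)^E * h^H * dh, so points
    with strong neighbour support accumulate a larger score."""
    enhanced = np.zeros_like(t_vals, dtype=float)
    for h in np.arange(dh, t_vals.max() + dh, dh):
        above = t_vals >= h
        # label contiguous runs of supra-threshold points
        labels = np.cumsum(np.diff(np.r_[0, above.astype(int)]) == 1) * above
        for lab in range(1, labels.max() + 1):
            cluster = labels == lab
            enhanced[cluster] += (cluster.sum() ** E) * (h ** H) * dh
    return enhanced

t = np.array([0.2, 2.5, 2.6, 2.4, 0.1, 2.6])
print(tfce_1d(t))  # the run of three supported points outscores the lone 2.6
```

The same idea extends to the channel x frequency x time neighbourhood, where "extent" becomes the size of the connected cluster across all three dimensions.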

@pokor076 (Author)

Hey, thanks for the response! Plotting TF maps for a single channel of interest definitely makes it seem like there are condition differences (example subtraction surfaces attached). Similarly, when I plot Results.Obs (which I'm guessing is the raw t-values variable) for a single channel, the surface looks fairly believable (attached). However, the p-values corresponding to those believable raw t-values are all 1... This makes me think there are two possible explanations: (1) my data genuinely lacks spatial, frequency, and/or time clusters of difference between conditions, or (2) there is an issue with how the cluster-enhancement code handles the three dimensions. Let me know what you think. Like I said, I'm super motivated to get the details of this method ironed out, because it could be a real game changer!

[image attached: subtraction surfaces]

[image attached: Results.Obs surface]
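For what it's worth, under a max-statistic permutation scheme a p-value of exactly 1 means the observed (enhanced) score at that point was beaten by every single permutation's maximum. A generic sketch of that mechanism with made-up numbers (not the toolbox's actual permutation code):

```python
import numpy as np

# Stand-in numbers only: 1000 "maximum enhanced score per permutation"
# draws, and three hypothetical observed enhanced scores.
rng = np.random.default_rng(0)
perm_max = rng.normal(10.0, 1.0, size=1000)
obs = np.array([2.0, 9.0, 14.0])

# p-value at each point = fraction of permutation maxima >= observed score
p = np.array([(perm_max >= o).mean() for o in obs])
print(p)  # a score below every permutation maximum gets p == 1.0
```

So believable raw t-values with p = 1 everywhere would mean the observed enhanced scores never clear the permutation maxima, which is exactly the situation worth inspecting stage by stage.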

@Mensen (Owner)

Mensen commented Apr 25, 2020

Could you restrict the analysis to a single time point, or frequency bin, or even channel of particular interest and see if the code still generates only 1's for the p-value?

@pokor076 (Author)

Yep, so I tried another t-test contrasting two conditions for 128 elecs x 1 frequency bin x 513 time bins, and again the original t-values seem reasonable, but the p-values are dominated by 1s (histogram attached).

This may or may not be related, but I've noticed that if I run an independent t-test, the code takes about 2-4 min, whereas the dependent t-test on the same data takes around 8 hours. Does that seem right? The independent t-test also seems to come up with more significant p-values (I would've thought the dependent t-test would be more sensitive). I tried running an independent t-test on my original grand averages (elecs x freq x time), but MATLAB gives me the attached error.

Thanks again!

Untitled document.pdf

[image attached: p-value histogram]
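On the sensitivity question above: with a real subject effect shared across conditions, a dependent (paired) test should usually give smaller p-values than an independent one, because pairing removes the between-subject variance. A small SciPy sketch with made-up numbers (unrelated to the toolbox's permutation machinery) illustrates the expected direction:

```python
import numpy as np
from scipy import stats

# Made-up data: large between-subject baseline, small within-subject shift
base = np.array([10.0, 12.0, 9.0, 15.0, 11.0, 13.0, 8.0, 14.0])
noise = np.array([0.3, -0.2, 0.1, 0.4, -0.1, 0.2, -0.3, 0.1])
cond1 = base + 0.5 + noise   # condition 1: baseline + 0.5 shift + noise
cond2 = base                 # condition 2: baseline only

ind = stats.ttest_ind(cond1, cond2)   # ignores the pairing
dep = stats.ttest_rel(cond1, cond2)   # uses the pairing
print(ind.pvalue, dep.pvalue)  # the paired p-value is far smaller here
```

If the dependent test is instead coming out less significant, that is another hint that something in the dependent-test path deserves a closer look.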

@pokor076 (Author)

Hi just wanted to check in about this issue. Would it be helpful if I DM'd you some example files?

@Mensen (Owner)

Mensen commented Jun 23, 2020

Sorry for the late/no responses. Unfortunately, I'm just not sure what could be going on here. This project (and even this research field) is no longer related to my daily work, so it's very difficult to find the time to go into much detail here.

It's certainly not normal to have such a long run-time for the dependent tests... the calculation itself is actually faster to complete. However, the majority of the time is spent finding the clusters of results at all the different thresholds (the approximation to "threshold-free"), so if the data contains a lot of different clusters, this can take more time. Even so, this should still not take longer than a few minutes, so with a run-time of 8 hours something is certainly not right.

If you're willing to share some anonymised summary files with me, perhaps I can have a look under the hood at each stage of the calculation and give you some more info. You can email me, or share with research.mensen@gmail.com.
