
Problem with tedpca #949

Closed
BahmanTahayori opened this issue Jun 6, 2023 · 6 comments · Fixed by #950
Labels
bug issues describing a bug or error found in the project

Comments


BahmanTahayori commented Jun 6, 2023

Summary

When tedpca is set to a number (float/integer) in tedana 23, an error is reported and the process stops. Here is an example with tedpca set to 0.99:

tedana: error: argument --tedpca: invalid choice: 0.99 (choose from 'mdl', 'kic', 'aic')

@handwerkerd handwerkerd added the bug issues describing a bug or error found in the project label Jun 6, 2023
@handwerkerd
Member

Thank you for noticing this. I think I see what changed. We added the following line to the command line argument parser to limit string inputs, but it prevented users from giving numbers as inputs. Our test used the API, which doesn't have this issue. I'll try to get this fixed soon.

https://github.com/ME-ICA/tedana/blob/9a3b83f21a245e4c0fbf1c8f97ceb9b0c46b52c0/tedana/workflows/tedana.py#LL153C8-L153C8
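The failure mode can be sketched in a few lines. This is hypothetical illustration code, not tedana's actual parser: an argparse `choices=` list of strings rejects numeric values outright, while a custom `type=` function (here a made-up `pca_option` helper) can accept both a float and the named estimation methods.

```python
# Hypothetical sketch of the bug and one possible fix; the function
# name and argument handling are illustrative, not tedana's code.
import argparse

def pca_option(value):
    """Accept either a numeric value (variance threshold or component
    count) or one of the named dimensionality-estimation methods."""
    try:
        return float(value)
    except ValueError:
        if value in ("mdl", "kic", "aic"):
            return value
        raise argparse.ArgumentTypeError(f"invalid choice: {value}")

parser = argparse.ArgumentParser()
parser.add_argument("--tedpca", type=pca_option, default="aic")

args = parser.parse_args(["--tedpca", "0.99"])
print(args.tedpca)  # parsed as the float 0.99 instead of being rejected
```

With `choices=("mdl", "kic", "aic")` instead of the custom `type=`, the same invocation fails with the "invalid choice" error quoted above.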

@handwerkerd
Member

Please check and confirm #950 addresses this issue.

I also added code to skip the dimensionality reduction step, but I realized it caused additional problems, so I removed it. My added code would just give the maximum number of components, which would likely make ICA fail since many components would just be gaussian noise. I think your existing method of setting --tedpca=0.99 (or 0.999?) might actually be the best option. You'll effectively model the full dataset, but remove dimensions that are just gaussian. As an added benefit, the reduced number of components would effectively be your loss in degrees of freedom from an earlier MP-PCA or NORDIC process. That is, if the full time series has 300 volumes and 99.9% of the variance is in 100 components, that means the PCA-based denoising process effectively removed 200 degrees of freedom. tedana would not keep track of that because it adds that noise back in, but you'd be able to track and use that information as appropriate.
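The degrees-of-freedom bookkeeping above can be illustrated with a small numpy sketch on synthetic data (the dimensions and noise level are made up, not taken from tedana): with a variance threshold like 0.99, components beyond the cumulative-variance cutoff are treated as Gaussian noise, and the difference between volume count and retained components approximates the degrees of freedom removed by an earlier denoising step.

```python
# Synthetic illustration of "N volumes, K components above threshold"
# bookkeeping; all numbers here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_vols, n_voxels = 300, 2000

# Low-rank "signal" (rank 50) plus small Gaussian noise
signal = rng.standard_normal((n_vols, 50)) @ rng.standard_normal((50, n_voxels))
data = signal + 0.01 * rng.standard_normal((n_vols, n_voxels))

# Explained-variance ratios from the singular values of the centered data
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
var_ratio = s**2 / np.sum(s**2)
n_kept = int(np.searchsorted(np.cumsum(var_ratio), 0.99) + 1)

print(f"{n_kept} of {n_vols} components explain 99% of the variance")
# Volumes minus kept components ~ degrees of freedom lost upstream
print(f"effective loss of degrees of freedom: {n_vols - n_kept}")
```

On this toy dataset the threshold recovers roughly the planted rank, and `n_vols - n_kept` is the quantity the comment describes tracking manually.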

@BahmanTahayori
Author

Thanks Dan,

When we wanted to apply changes, I tried to skip PCA but ran into some issues, so I decided to set tedpca to a number close to 1 (I used 0.9999, but the results are similar to 0.99). A good number of components have empty maps and will not be considered. I agree with you that manipulating the PCA with --tedpca is a good option. Thanks for your help.

@Lestropie

@handwerkerd Can you provide any additional detail on the additional problems caused by skipping the dimensionality reduction step? I'd still like to pursue that prospect, even if it's a little tricky code-wise.

I'm a little hesitant about setting a variance threshold "close to 100%" and hoping that it does a reasonable job. If there are many low-variance components due to a prior PCA, then a small change in that variance threshold (e.g. 99% -> 99.9% -> 99.99%) might actually change the number of components considerably, and I don't have a feel for what influence that might have on the ICA. So having a full bypass and dealing with classification exclusively at the ICA stage is still appealing.
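The sensitivity concern can be made concrete with a small numpy sketch using a synthetic singular-value spectrum (purely illustrative, not real fMRI data): a few strong components plus a slowly decaying tail, of the sort a prior PCA-based denoising might leave behind, where nearby thresholds select very different component counts.

```python
# Illustrative spectrum: 3 strong components plus a slow geometric tail.
import numpy as np

s = np.concatenate([np.array([100.0, 80.0, 60.0]),
                    5.0 * 0.97 ** np.arange(200)])
var_ratio = s**2 / np.sum(s**2)
cumvar = np.cumsum(var_ratio)

# Number of components needed to reach each cumulative-variance threshold
counts = {thr: int(np.searchsorted(cumvar, thr) + 1)
          for thr in (0.99, 0.999, 0.9999)}
for thr, n in counts.items():
    print(f"threshold {thr}: {n} components")
```

With a long low-variance tail like this, each extra "9" on the threshold pulls in substantially more components, which is exactly the instability the comment worries about.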

@handwerkerd
Member

@Lestropie I think setting the threshold to 99% only makes sense if estimated Gaussian noise was removed in an earlier preprocessing step. I think you are aware of that context, but I'm realizing it might not be obvious to others reading this thread.

Ideally, the PCA + dimensionality estimation step should identify the minimum number of components/dimensions required to model the parts of the signal that are not Gaussian noise. That would be my ultimate goal. From a practical perspective, I'm not sure how much that matters. The practical requirement is to end up with a manageable number of components (i.e. <1/3 of the total number of volumes in my typical experience with fMRI) and where ICA reliably converges. Within tedana, any component that is not modeled by PCA is retained in the signal, so there is no risk of removing task-locked information in excluded low-variance PCA components. Additionally, both of the current decision tree options skew towards keeping low-variance ICA components rather than losing statistical degrees of freedom.

That means getting the dimensionality estimation right may make the ICA more accurate and opens ways to remove undesired noise more aggressively, but the worst case of getting it slightly wrong is failing to remove some noise.

FWIW, the kundu method for PCA dimensionality estimation in tedana takes advantage of multi-echo information to remove PCA components that are likely to be thermal noise. I think taking advantage of multi-echo information in this step has a lot of potential, but that option is implemented in a confusing and slightly brittle manner, so I tend not to use it. Extending that concept, one of my projects is to see if there's a way to use a generalization of ICA to create components that are more purely T2* or S0 weighted, thereby making noise removal easier and more effective. Some early work in that direction is here: https://fim.nimh.nih.gov/sites/default/files/handwerker_multiechodenoising_ohbm2018_small.pdf

Does this answer your question?

@Lestropie

There's a lot of discussion to potentially branch out from here, but it all diverges further and further from the original purpose of this issue listing. What I might do is let this thread die naturally, explore some data more closely with @BahmanTahayori, and if I still think there's merit in the prospect of a full bypass (or maybe something else, e.g. a tailored PCA rank estimation heuristic for our use case), then I'll re-raise it in its own thread.
