Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP, ENH: Avoid ICA on rank-deficient data #8316

Closed
wants to merge 1 commit into from

Conversation

hoechenberger
Copy link
Member

@hoechenberger hoechenberger commented Sep 29, 2020

  • ICA.max_pca_components can now be set to rank to limit the number of components to the rank the data.
  • This is also the new default.
  • The previous default value, None, is still an accepted, but deprecated.
  • None, too, now dictates to use the rank of the data. Previously, it implied to use the number of channels, which caused problems with rank-deficient data.
  • If requested number of ICA or PCA components exceeds the rank of the data, we will now raise an exception.
  • compute_rank() and _estimate_rank_meeg_signals() have gained a ref_meg=False kwarg to allow for rank calculation to include MEG reference channels. This was required to make some tests work.

Implementing this helped discover a few existing test cases that never converged because they fed rank-deficient data to ICA.

Closes #8313.

I need help:

  • mne/viz/tests/test_ica.py::test_plot_ica_overlay doesn't pass because of Accessing data after applying gradient compensation & picking doesn't work #8315
  • mne/preprocessing/tests/test_ica.py::test_ica_full_data_recovery doesn't pass because the difference between actual and expected test results is too large. It's still rather small but larger than the predefined limits. I have no idea why that is the case, and whether it might be safe to just make the test more lenient or not.

- `ICA.max_pca_components` can now be set to `rank` to use the
rank the data.
- This is also the new default. The previous default value, ``None``,
  is still an accepted, but deprecated input. ``None``, too, now
  dictates to use the rank of the data. Previously, it implied to
  use the number of channels, which caused problems with
  rank-deficient data.
- If requested number of ICA or PCA components exceeds the rank of
  the data, we will now raise an exception.
- `compute_rank()` and `_estimate_rank_meeg_signals()` have gained
  a `reg_meg=False` kwarg, to allow rank calculation for MEG reference
  channels. This was required to make some tests work.
@hoechenberger
Copy link
Member Author

Another call with @agramfort … will try yet another approach. Stay tuned ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant