
MRG, API: Deprecate max_pca_components #8351

Merged: 12 commits into mne-tools:master on Oct 12, 2020

Conversation

@larsoner (Member) commented Oct 7, 2020

Todo:

  • Deprecate max_pca_components
  • Get test_ica.py working
  • Add n_pca_components to plot_ica_overlay
  • Make keyword-only after #8350 (MRG, BUG: Force keywords for verbose in short sigs) is in
  • Update plot_40_artifact_correction_ica.py
  • Add warning when n_components leads to unstable pseudoinversion
  • Update doc/_static/diagrams/ica.dot
  • Decide whether n_components as a float selects based on the variance explained by the n_pca_components retained components. This is how max_pca_components worked (it was relative to that)

Closes #8337
Closes #8342
Closes #8327
Closes #8328
Closes #8316
Closes #8313
Closes #7727
Closes #8353

@larsoner changed the title from "API: Deprecate max_pca_components" to "MRG, API: Deprecate max_pca_components" on Oct 7, 2020
@larsoner (Member, Author) commented Oct 7, 2020

Okay this is ready for review/merge from my end

@larsoner (Member, Author) commented Oct 7, 2020

The Travis failure is an unrelated conda-forge Python 3.9 bug that's hitting all PRs/branches.

@agramfort (Member) left a comment

@hoechenberger and @cbrnr you might want to review and test this branch before it gets merged

mne/preprocessing/ica.py: 4 review comments (outdated, resolved)
@agramfort (Member) commented Oct 8, 2020 via email

@larsoner (Member, Author) commented Oct 8, 2020

should break. you cannot limit the number of PCA components and require all the variance to be explained for ICA.

It depends on whether you're talking about wanting n_components=1. to mean the fraction of explained variance of the N PCA components (which is how it works in master with max_pca_components) or the fraction of the total variance of your original data (what you're saying/assuming). I can certainly code it that way, and if I were designing it from scratch this is probably how I would do it, but I suspect it would break more people's code than the way I have it coded now.
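For concreteness, a toy illustration of the two readings (plain NumPy, not MNE code; all numbers are made up):

```python
import numpy as np

eigvals = np.array([5.0, 3.0, 1.0, 0.5, 0.3, 0.2])  # hypothetical PCA variances
n_pca_components = 4                                 # pretend only 4 are retained

kept = eigvals[:n_pca_components]
cum_vs_kept = np.cumsum(kept) / kept.sum()      # relative to the 4 kept components
cum_vs_total = np.cumsum(kept) / eigvals.sum()  # relative to all of the original data

print(cum_vs_kept)   # ~[0.53, 0.84, 0.95, 1.00]
print(cum_vs_total)  # ~[0.50, 0.80, 0.90, 0.95]
# A threshold like n_components=0.93 keeps 3 components under the first reading
# but 4 under the second, so the choice isn't cosmetic.
```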

@larsoner (Member, Author) commented Oct 8, 2020

(It certainly breaks more tests, but those are quick enough to fix)

@cbrnr (Contributor) commented Oct 8, 2020

Wait, are we really distinguishing 1 from 1.0? That's a bit unexpected, so can we find a better solution for this edge case? (Wanting just 1 component should rarely be needed, whereas all components corresponding to a fraction of 1.0 should be the default.)

@larsoner (Member, Author) commented Oct 8, 2020

Wait, are we really distinguishing 1 from 1.0

Yes, float and int are different. I can add a test that n_components=1 will raise an error (it already does at least when selecting by explained variance; I can make sure it does for the int case too).

whereas all components corresponding to a fraction of 1.0 should be the default

This is the default, as n_components=None means "use all".
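For readers following along, a rough sketch of the three kinds of values (the exact float semantics are what this thread is deciding; the numbers are placeholders):

```python
from mne.preprocessing import ICA

# Integer: keep exactly this many components.
ica_int = ICA(n_components=15)

# Float in (0, 1): keep as many components as needed to explain at least this
# fraction of variance (exactly which variance that fraction refers to is the
# point being debated above).
ica_float = ICA(n_components=0.99)

# None (the default): "use all", subject to whatever the PCA step retains.
ica_none = ICA(n_components=None)
```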

@hoechenberger (Member) commented

@larsoner

It depends on whether you're talking about wanting n_components=1. to mean the fraction of explained variance of the N PCA components (which is how it works in master with max_pca_components) or the fraction of the total variance of your original data (what you're saying/assuming). I can certainly code it that way, and if I were designing it from scratch this is probably how I would do it, but I suspect it would break more people's code than the way I have it coded now.

If we're re-thinking these questions from scratch anyway, I'd prefer it if n_components: float referred to the fraction of total explained variance. So we'd need to check if n_components: float <= n_pca_components.

I cannot really imagine anyone wanting a fraction of a fraction?

@hoechenberger (Member) commented

Wait, are we really distinguishing 1 from 1.0

Yes, float and int are different. I can add a test that n_components=1 will raise an error (it already does at least when selecting by explained variance; I can make sure it does for the int case too).

I believe this would lead to a cleaner API, yes. Good idea.

@cbrnr (Contributor) commented Oct 8, 2020

What if I do want to select just one component?

@hoechenberger (Member) commented

What if I do want to select just one component?

Is that an actual use case or just a theoretical consideration?

@cbrnr (Contributor) commented Oct 8, 2020

Is that an actual use case or just a theoretical consideration?

It's theoretical, but I'm pretty sure someone will want to use it for a good reason.

Also, it is pretty unusual behavior to treat 1 and 1.0 differently: even though the types are different, 1 == 1.0, and I'd prefer a solution where we can avoid this case.

@hoechenberger (Member) commented

But OK, allowing 1 as an int probably won't hurt anyone either.

@cbrnr (Contributor) commented Oct 8, 2020

But OK, allowing 1 as an int probably won't hurt anyone either.

If both 1 and 1.0 do the same thing, then yes.

@larsoner (Member, Author) commented Oct 8, 2020

If both 1 and 1.0 do the same thing, then yes.

Can you open a separate issue about this? Let's solve it and discuss it elsewhere.

@agramfort (Member) commented Oct 8, 2020 via email

@larsoner larsoner marked this pull request as ready for review October 8, 2020 13:59
@larsoner (Member, Author) left a comment

  • docs updated
  • killed 1. (float) and 1 (int) to be safe
  • updated behavior of n_components=None (see inline comment)
  • made n_components float value operate on total explained variance, not relative to n_pca_components

Comment on lines 154 to 158

    - ``None``
        ``n_pca_components`` or ``0.999999`` will be used, whichever
        results in fewer components. This is done to avoid numerical
        stability problems when whitening.

@larsoner (Member, Author) commented

This is an important change/fix I think. The defaults of n_pca_components=None and n_components=None are still dangerous for rank-reduced data. This makes our defaults safe for rank-reduced data.
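(To make the rank-reduction concern concrete, a tiny illustrative sketch in plain NumPy, not MNE code; the 0.999999 cap is the one from the docstring quoted above:)

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((10, 1000))    # 10 hypothetical channels
data -= data.mean(axis=0, keepdims=True)  # average reference -> rank drops to 9

eigvals = np.linalg.eigvalsh(np.cov(data))[::-1]  # sorted descending
cum = np.cumsum(eigvals) / eigvals.sum()

# The last eigenvalue is ~0, so whitening all 10 components would divide by ~0.
# Capping at 0.999999 of the explained variance drops the degenerate component:
n_keep = np.searchsorted(cum, 0.999999) + 1
print(n_keep)  # 9 for this toy example
```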

@larsoner (Member, Author) commented Oct 9, 2020

Why do we insist on the float values referring to more than the specified explained variance?

To be consistent with sklearn:

"If 0 < n_components < 1 and svd_solver == 'full', select the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by n_components"

For floats you generally shouldn't rely on exact equality anyway, so in practice this is a tiny (hopefully meaningless) distinction.
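(A quick illustrative check of the scikit-learn convention being matched here, using random data; not code from this PR:)

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 8)) @ rng.standard_normal((8, 8))  # correlated toy data

# With a float n_components and svd_solver='full', sklearn keeps the smallest
# number of components whose cumulative explained variance exceeds the threshold.
pca = PCA(n_components=0.9, svd_solver='full').fit(X)
print(pca.n_components_)                           # number of components kept
print(pca.explained_variance_ratio_.cumsum()[-1])  # > 0.9 by construction
```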

@hoechenberger (Member) left a comment

Proposal to make the docs a little easier to understand, otherwise looks super great and good to merge!

mne/preprocessing/ica.py: 1 review comment (outdated, resolved)
@hoechenberger (Member) commented

@cbrnr Would be great to get your feedback / approval too before merging. Thanks!

@hoechenberger (Member) commented

Friendly ping @cbrnr

@cbrnr (Contributor) commented Oct 12, 2020

I'll take a look within the next hour or so!

@hoechenberger (Member) commented

Great, thank you

@cbrnr (Contributor) commented Oct 12, 2020

Where can I find the updated rendered ICA diagram?

@hoechenberger (Member) commented

Where can I find the updated rendered ICA diagram?

The doc artifact is at https://22659-1301584-gh.circle-artifacts.com/0/dev/index.html

@cbrnr (Contributor) left a comment

LGTM! I've tested this with one of my current data sets, and everything seems to work as before.

I have one comment that we could also discuss in a separate issue, but I wanted to bring it up here. Do we really need n_pca_components in ICA.__init__? As far as I can see, this parameter is neither used during initialization nor fitting. It is used in ICA.apply, which also has an n_pca_components parameter, which when set overrides the value passed in the initializer (I think). Wouldn't our API be much cleaner and easier to understand if we dropped n_pca_components from __init__ and just kept it in apply? Or am I missing something and we actually need to pass it when constructing the ICA object? Again, we don't have to discuss this here, so feel free to merge this PR and discuss this in a separate issue.

doc/changes/latest.inc: 1 review comment (outdated, resolved)
@hoechenberger (Member) commented Oct 12, 2020

@cbrnr

I have one comment that we could also discuss in a separate issue, but I wanted to bring it up here. Do we really need n_pca_components in ICA.__init__? As far as I can see, this parameter is neither used during initialization nor fitting. It is used in ICA.apply, which also has an n_pca_components parameter, which when set overrides the value passed in the initializer (I think). Wouldn't our API be much cleaner and easier to understand if we dropped n_pca_components from __init__ and just kept it in apply? Or am I missing something and we actually need to pass it when constructing the ICA object?

You're absolutely right, it's only used during apply(). @larsoner and I had a brief discussion somewhere (too lazy to look it up now) and figured it might be best to just keep it in __init__() so you can set it right when instantiating the ICA object and forget about it later, but still be able to override this setting when calling apply().

But happy to open this discussion again!
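(For concreteness, the pattern described above as a rough sketch; the toy data and all parameter values are placeholders, not recommendations:)

```python
import numpy as np
import mne
from mne.preprocessing import ICA

# Toy Raw object just to make the sketch self-contained (50 fake EEG channels).
rng = np.random.default_rng(0)
info = mne.create_info(50, sfreq=100., ch_types='eeg')
raw = mne.io.RawArray(rng.standard_normal((50, 5000)), info)
raw.filter(1., None)  # ICA is usually fit on high-passed data

ica = ICA(n_components=15, n_pca_components=40)  # set once at construction...
ica.fit(raw)
ica.apply(raw.copy())                            # ...and forget about it later,

# ...or override the construction-time value for one particular call:
ica.apply(raw.copy(), n_pca_components=30)
```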

@hoechenberger hoechenberger merged commit 9147314 into mne-tools:master Oct 12, 2020
@hoechenberger (Member) commented

Thanks a bunch, @larsoner!

@larsoner larsoner deleted the ica branch October 12, 2020 13:46
@cbrnr (Contributor) commented Oct 12, 2020

This must have been the PR that closed the most issues at once! Thanks @larsoner!

@hoechenberger (Member) commented

Absolutely. I like how I kept opening issues and @larsoner would come along closing them. How convenient 💪☺️

marsipu pushed a commit to marsipu/mne-python that referenced this pull request Oct 14, 2020
* API: Deprecate max_pca_components

* FIX: Flake

* FIX: Test

* FIX: Fix viz tests

* FIX: Test

* ENH: Warn and doc

* FIX: Exp var and 1. and 1

* STY: Update docs [skip travis] [skip github]

* TST: Faster test

* DOC: Better docs [ci skip]

* Apply suggestions from code review [skip travis]

* Fix docstring [skip travis]

Co-authored-by: Richard Höchenberger <richard.hoechenberger@gmail.com>
@cbrnr (Contributor) commented Dec 1, 2020

@larsoner any chance we can backport this? It contains a nice fix in read_ica which I originally hadn't noticed. Previously, method was set to the default FastICA (as in ICA.__init__), but now we're actually reading in the method used to compute a particular solution. This is nice, because I have a bunch of files where I used picard, and currently I do need to install scikit-learn just because we don't read the actual method.

If it's too big of a change for a backport that's OK, it's not that I can't just install scikit-learn.
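(In other words, roughly; the filename here is just a placeholder:)

```python
from mne.preprocessing import read_ica

ica = read_ica('my_subject-ica.fif')  # hypothetical file previously fit with method='picard'
print(ica.method)  # with this fix, reports the method that was actually used, e.g. 'picard'
```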

@cbrnr (Contributor) commented Dec 1, 2020

Although, of course, just because of this I'm running into the issue that scikit-learn has not been released for Python 3.9 yet, so building the wheel myself takes ages.

@agramfort (Member) commented Dec 1, 2020 via email

@cbrnr (Contributor) commented Dec 1, 2020

Yes, if we don't have to stick to a 6-month release cycle, I'd be happy if the next one came a bit sooner.

@agramfort (Member) commented Dec 1, 2020 via email

@hoechenberger (Member) commented

Yes, can we cut a new release in December already? These ICA changes are really… important. Also lots of @drammock's viz changes are a real game changer already!

@agramfort (Member) commented Dec 1, 2020 via email
