ENH Warn future change of default init in NMF #18525
Conversation
It should be noted that the current PR will break backward compatibility:
- raising an error when using solver="mu" and init="nndsvd" explicitly;
- changing the results with both solvers when using init=None. The changes will probably be small with the "cd" solver, but might be larger with the "mu" solver (though this might be considered a bugfix).
I am fine with it, but other maintainers might disagree.
Other suggested changes:
- add "Not available for solver='mu'." in the nndsvd section of the docstrings (in _initialize_nmf, NMF, and non_negative_matrix_factorization);
- add a whats_new entry, including in the "Changed models" section.
Another possibility is to keep allowing init="nndsvd" with solver="mu" (the fixed sparsity might be considered as a kind of regularization), keep the warning, and change the behavior of init=None through a deprecation process.
Not sure what the best solution is, to be honest.
If we go ahead with this change, a clear note in the whats_new should be added.
The proposed backward-incompatible change may be against our guidelines: people need to be able to update scikit-learn without worrying about breakage. An argument for going ahead with such a backward-incompatible change would be that the prior behavior should be considered a bug. In other words, if it often leads to significant breakage, it can be changed without a compatibility shim. Trying to summarize things to make sure that I understand: in the current settings, the "transform" is broken by default, and the change is required to fix things? Am I correct?
This will not completely fix the fit().transform() vs fit_transform() issue. It's just a better default for the multiplicative update solver because it avoids zeros in the init.
This PR currently changes the default for both solvers. Shouldn't it be only for the 'mu' solver? I'm in favor of keeping the warning and changing the default for NMF through a deprecation cycle. (Note that for MiniBatchNMF you can already set the right default.)
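As a minimal sketch of the deprecation-cycle approach suggested in the two comments above: when the user leaves init=None, keep the old default for now and emit a FutureWarning about the upcoming change. The helper name and warning text below are hypothetical, not the actual scikit-learn code.

```python
# Hypothetical helper illustrating the deprecation cycle discussed above;
# the function name and message are illustrative, not scikit-learn's code.
import warnings

def _resolve_init(init):
    if init is None:
        warnings.warn(
            "The default value of init will change from 'nndsvd' to 'nndsvda' "
            "in a future release. Pass init explicitly to silence this warning.",
            FutureWarning,
        )
        init = "nndsvd"  # keep the old default during the deprecation cycle
    return init
```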
I think I like the last suggestion of @TomDLT's #18525 (review) most.
I believe this is the same solution as @jeremiedbb's #18525 (comment).
Thanks for all your comments. |
Maybe we could take the opportunity to rename it "auto" (in two versions).
You also need to change the default in _initialize_nmf, and maybe elsewhere.
Could we also make the warning visible (i.e. even when init is None) in _check_string_parameters?
I think we should change from 'nndsvd' to 'nndsvda' for all solvers. Indeed, I think it makes almost no difference for the "cd" solver, and it simplifies the behavior/documentation.
The (non-negative) SVD initialization 'nndsvd' produces exact zeros by construction (the "fixed sparsity" mentioned above).
Importantly, the "mu" solver is based on multiplicative updates, so zeros in the initialization remain zeros during optimization. This is why 'nndsvda', which replaces those zeros with X.mean(), is a safer default for this solver.
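To illustrate the point about multiplicative updates, here is a minimal NumPy sketch using the standard multiplicative-update rules for the Frobenius loss; it is illustrative only, not the scikit-learn implementation.

```python
# Minimal sketch of why zeros are "sticky" under multiplicative updates
# (illustrative only, not scikit-learn's implementation).
import numpy as np

rng = np.random.RandomState(0)
X = np.abs(rng.randn(6, 4))
W = np.abs(rng.randn(6, 2))
H = np.abs(rng.randn(2, 4))
W[0, 0] = 0.0  # a zero such as an 'nndsvd'-style init would produce

eps = 1e-10
for _ in range(200):
    # Standard multiplicative updates for the Frobenius loss: each entry is
    # multiplied by a non-negative factor, so an entry at zero stays at zero.
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

print(W[0, 0])  # prints 0.0: the solver can never move this entry away from zero
```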
I need some direction here.
I guess this is due to the fact that (quoting @TomDLT) the initialization "changes zeros into X.mean()" when init="nndsvda"? Should I keep 'nndsvd' here, since we want to separate the effect of the initialization from the effect of the regularization?
It's failing for both 'cd' and 'mu'. I wonder if the test is correct. I don't think it's guaranteed that the mean of W and the mean of H should both be smaller when using L2 regularization. Don't you think we should check that…
I'm not the right person to ask... :)
You are right, the test is wrong; we should check what you suggest.
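A hedged sketch of the kind of check being discussed: compare a combined norm of the two factors with and without an L2 penalty, rather than requiring each mean to shrink individually. The parameter names (alpha_W, alpha_H, l1_ratio) follow recent scikit-learn releases and are not necessarily those used at the time of this PR; the data and values are made up for illustration.

```python
# Illustrative only: one possible way to test the effect of L2 regularization,
# assuming the alpha_W/alpha_H/l1_ratio parameters of recent scikit-learn.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(30, 10))

def factorize(alpha):
    model = NMF(n_components=5, init="nndsvda", solver="cd",
                alpha_W=alpha, alpha_H="same", l1_ratio=0.0,
                max_iter=1000, random_state=0)
    W = model.fit_transform(X)
    return W, model.components_

W_plain, H_plain = factorize(0.0)
W_regul, H_regul = factorize(0.5)

# Compare a combined quantity instead of requiring that each mean shrinks.
norm_plain = np.linalg.norm(W_plain) ** 2 + np.linalg.norm(H_plain) ** 2
norm_regul = np.linalg.norm(W_regul) ** 2 + np.linalg.norm(H_regul) ** 2
assert norm_regul < norm_plain
```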
Thanks @jeremiedbb and @TomDLT, I will change the test then. Is this worth a new PR?
It's fine to do it in this PR.
Thanks @cmarmo!
Thanks for this and for catching the incorrect test! Just a note: there is a leftover comment in the test file that is now outdated (https://github.com/scikit-learn/scikit-learn/pull/18525/files#diff-e809907f1fe6840a81b7b9b9f1d0ef494052df97d328eeb25ae3e0495e238575R470) and it might confuse devs in the future. (I probably introduced this mistake...)
Right! Thanks @vene! I will fix it in #16948 if there are no objections.
Reference Issues/PRs
Closes #18505
What does this implement/fix? Explain your changes.
Any other comments?
To make tests pass I had to increase the tolerance for the float32/float64 consistency check. Ping @jeremiedbb, @TomDLT.
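For context, a minimal sketch of the kind of float32/float64 consistency comparison referred to above; the data, model settings, and tolerance mentioned in the comment are hypothetical here, not the values used in the actual test.

```python
# Illustrative only: comparing NMF results on float32 vs float64 input;
# the settings below are made up, not those of the actual scikit-learn test.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X64 = np.abs(rng.randn(20, 10))
X32 = X64.astype(np.float32)

def fit_W(X):
    model = NMF(n_components=5, init="nndsvda", solver="mu",
                max_iter=500, random_state=0)
    return model.fit_transform(X)

W64, W32 = fit_W(X64), fit_W(X32)
print("max abs difference:", np.max(np.abs(W64 - W32.astype(np.float64))))
# a looser tolerance (e.g. rtol=1e-4) may be needed for such a comparison to pass
```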