Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Ensure determinism of SVD output in dict_learning #18433

Merged
merged 2 commits into from Feb 3, 2021

Conversation

brcharron
Copy link
Contributor

@brcharron brcharron commented Sep 20, 2020

Reference Issues/PRs

Fixes #12826
See also #12799

What does this implement/fix? Explain your changes.

Adds a call to sklearn.utils.extmath.svd_flip after the scipy.linalg.svd call in dict_learning (used by sklearn.decomposition.DictionaryLearning) to ensure that the output is deterministic across platforms.
This could allow to put back the example in #12799 (reverting 4a7075a and fixing the signs) but there is already another docstring example now so not sure if needed.

Any other comments?

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @brcharron !

The concern I have with this, is that it will change the behavior of dict_learning, which makes it backward incompatible.

We can either add a flip_sign to dict_learning or consider this a bug fix by making it more consistent with dict_learning_online. WDYT @ogrisel ?

@jeremiedbb
Copy link
Member

@thomasjpfan I'd consider non determinism a bug. Moreover there are ongoing fixes that will change the behavior (see #19198 for instance) so one more change won't make a difference.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @brcharron, please add a what's new entry

sklearn/decomposition/_dict_learning.py Show resolved Hide resolved
Base automatically changed from master to main January 22, 2021 10:53
@brcharron
Copy link
Contributor Author

@jeremiedbb Thank you for the review, I added a comment and a What's New entry.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also consider this a bug fix. I am not sure how to test this because I believe the non-determinism could only be observed by switching from one platform to another or from one version of numpy / scipy / openblas to the next.

+1 for merging as it is. The existing tests pass which is a clue that this should not break to much our users' code.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks @brcharron

@jeremiedbb jeremiedbb merged commit 4fd851c into scikit-learn:main Feb 3, 2021
@brcharron brcharron deleted the deterministic_dict_learning branch February 3, 2021 13:37
@glemaitre
Copy link
Member

Shall we consider to backport this bug in 0.24.2 instead of 1.0?

@jeremiedbb
Copy link
Member

I don't think it's that urgent. It's been there for a long time and it's not a nasty bug.

@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DictionaryLearning not stable
5 participants