Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] KernelPCA: fix transform issue when zero eigenvalues are present and not removed (issue 12141) #12143

Merged
merged 21 commits into from Mar 21, 2019

Conversation

@smarie
Copy link
Contributor

commented Sep 24, 2018

This fixes #12141

Minor edits for clarity:
 - Added a few comments to clarify `_fit_transform`, `fit_transform` and `transform`.
 - Uniformized the numpy coding style for `fit` and `fit_transform`.
@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Sep 24, 2018

All set, waiting for review.

@smarie smarie changed the title KernelPCA: fix transform issue when zero eigenvalues are present and not removed (issue 12141) [MRG] KernelPCA: fix transform issue when zero eigenvalues are present and not removed (issue 12141) Sep 25, 2018

@NicolasHug
Copy link
Contributor

left a comment

Thanks for the fix, this looks like a legit bug.

Here are a few first comments, but we would need some core contributors to jump in. @amueller @jnothman maybe?

sklearn/decomposition/kernel_pca.py Show resolved Hide resolved
smarie added 2 commits Oct 9, 2018
Following code review from @NicolasHug : now only scaling eigenvector…
…s where there is no zero division, to avoid numpy warnings.
Following code review from @NicolasHug : now using pytest warning cap…
…ture to perform fine-grain warning assertion
@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

Thanks @smarie , LGTM. Let's wait for other reviews now.

smarie added 2 commits Oct 12, 2018
@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Oct 12, 2018

Thanks @NicolasHug . If you have still a couple time for reviews in the next days, I would really appreciate a review for the companion PRs of this PR:

  • #12145 to provide kernel warnings when the gram matrix has conditioning issues
  • #12069 to add the randomized svd solver to accelerate partial decompositions. This was my original PR actually :)
@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Oct 16, 2018

Your #12069 (comment)

the randomized method seems to naturally find the "perfect zero eigenvalues" while arpack and dense methods most of the time find a tiny non-zero number instead (e.g. of the order of magnitude of 1e-16 if my memory is correct)

makes me think we should not test for strict equality with zero but instead have an approximate comparison.

@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Oct 16, 2018

Actually that's already taken care of - but in #12145 :) where we round to zero the small eigenvalues that are due to bad numerical conditioning (as well as negative eigenvalues because a kernel is supposed to be semi-positive definite so they are probably numerical errors).

See https://github.com/scikit-learn/scikit-learn/pull/12145/files#diff-3b70045c110a30d29de66ed0ea3fb86dR1082 for details

@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

OK. I guess this would make sense to do it here?

IMHO the way to go would be to first merge this one, then #12145, and then #12069. This way you can merge master each time to avoid the redundant changes in the diffs (right now #12145 diffs also show the changes from here and that make reviewing trickier)

@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Oct 17, 2018

Yes, this is the merge order that should be used since each PR contains and relies on the commits from the previous. I am not sure that it would make sense to backport the zero eigenvalues rounding here: indeed, that is precisely what #12145 is about : controling and harmonizing the eigenvalues found by the various solvers.

@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2018

You're right, if #12145 sets small eigenvalues to 0 then the code here won't have to change.

@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2018

Hi @NicolasHug , @jnothman , @ogrisel : any update on this PR and the subsequent ones #12145 and #12069 ? Experience shows that the more we wait the more likely the PRs will not be mergeable again...

Of course I understand that this is a volunteer-based open-source project with many contributions, but I really believe that the effort we collectively made on these three PR both in terms of idea (@grilling original code in matlab), coding, and code review, is mature now.

Sylvain MARIE
@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2019

@jnothman I updated what's new following your advice on the other kPCA PR. The documentation does not need to be updated, this is ready to Merge (this PR is the first of a series of 3 : #12143, #12145 and #12069)

@NicolasHug
Copy link
Contributor

left a comment

Nitpicks from me, this still LGTM.

I find the comments that you added very helpful.

Hope we can get other reviewers on this!

doc/whats_new/v0.21.rst Outdated Show resolved Hide resolved
doc/whats_new/v0.21.rst Outdated Show resolved Hide resolved
sklearn/decomposition/kernel_pca.py Outdated Show resolved Hide resolved
sklearn/decomposition/kernel_pca.py Outdated Show resolved Hide resolved
sklearn/decomposition/kernel_pca.py Outdated Show resolved Hide resolved
@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

Maybe @adrinjalali and @qinhanmin2014 could to review this? Should be quick to review IMO (LGTM already) and it'd be nice to address @smarie great efforts ;)

@thomasjpfan
Copy link
Member

left a comment

Few nits about the comments. Otherwise LGTM

doc/whats_new/v0.21.rst Outdated Show resolved Hide resolved
sklearn/decomposition/kernel_pca.py Show resolved Hide resolved
sklearn/decomposition/kernel_pca.py Show resolved Hide resolved
thomasjpfan and others added 8 commits Mar 14, 2019
Update doc/whats_new/v0.21.rst
Co-Authored-By: smarie <sylvain.marie@schneider-electric.com>
Sylvain MARIE
Sylvain MARIE added 2 commits Mar 19, 2019
@jnothman
Copy link
Member

left a comment

Thanks @smarie

@jnothman jnothman added this to the 0.21 milestone Mar 20, 2019

@NicolasHug

This comment has been minimized.

Copy link
Contributor

commented Mar 21, 2019

This is +2 now, @thomasjpfan can you merge it please?

@thomasjpfan thomasjpfan merged commit 2f5bb34 into scikit-learn:master Mar 21, 2019

16 checks passed

LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 93.33% of diff hit (target 92.37%)
Details
codecov/project 92.38% (+0.01%) compared to b73a51b
Details
scikit-learn.scikit-learn Build #20190319.16 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_np_atlas) Linux py35_np_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda) Linux pylatest_conda succeeded
Details
scikit-learn.scikit-learn (Windows py35_32) Windows py35_32 succeeded
Details
scikit-learn.scikit-learn (Windows py37_64) Windows py37_64 succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda) macOS pylatest_conda succeeded
Details
@smarie

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

Thanks everyone!

xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde added a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.