[MRG before #12069] KernelPCA: raise Errors and Warnings according to eigenvalue decomposition numerical/conditioning issues #12145
While waiting for the review I proposed a way to fix the coverage and improve maintainability: the method to check the kernel eigenvalues (
In addition I created a test for kPCA to check that warnings and errors are raised correctly in case of bad conditioning.
…ange failure in Travis.
…ed ('randomized' is not available in this branch !)
…o always check the inner check_kernel_eigenvalues method before the kPCA fit.
…aced during call to fit() - we now execute the test directly on `_fit_transform` instead, so that the matrix is untouched.
I had some afterthoughts on one of the warnings: it is actually normal to find quasi-zero eigenvalues when the number of samples is high enough. My intuition, in the case of a Gaussian kernel, is that this happens when the number of samples exceeds the dimensionality of the underlying distribution's manifold, but maybe there are better explanations in this paper.
For this reason I will push a new commit where there is no warning by default about zero eigenvalues.
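To make the intuition about quasi-zero eigenvalues concrete, here is a minimal NumPy-only sketch. It is not code from this PR: the helper names `rbf_kernel_matrix` and `count_quasi_zero`, the relative tolerance `rel_tol`, and the choice of uniform 2-D data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix exp(-gamma * ||x - y||^2) for the rows of X."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Round-off can produce tiny negative squared distances; clip them to zero.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-gamma * sq_dists)


def count_quasi_zero(n_samples, n_features=2, rel_tol=1e-10, seed=0):
    """Count eigenvalues whose magnitude is below rel_tol times the largest one.

    rel_tol is an illustrative threshold, not the one used in the PR.
    """
    rng = np.random.RandomState(seed)
    X = rng.uniform(size=(n_samples, n_features))
    # The kernel matrix is symmetric, so eigvalsh returns real eigenvalues.
    eigvals = np.linalg.eigvalsh(rbf_kernel_matrix(X))
    return int(np.sum(np.abs(eigvals) < rel_tol * eigvals.max()))


if __name__ == "__main__":
    for n in (5, 50, 300):
        print(n, count_quasi_zero(n))
```

Under these assumptions, the count for a few hundred samples should be far larger than for a handful of samples, which is why warning about such eigenvalues by default would be noisy.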
…all: this is most probably a common case especially when the number of samples gets high. Removing the warning by default.
@adrinjalali are you ok with:
We are clearly over-engineering this PR by trying too hard to future-proof it. Let's keep things simple.
A clarification concerning your last sentences, @NicolasHug, "one parameter to control all the warnings" and "I would be ok to remove all the warnings": please have a look at my previous comments. There is a huge difference between the warning for significant negative values (which could even be transformed into a
* Renamed `small_nonzeros_warning` into `enable_warnings`.
* Now consistent warnings are raised for all three cases (imaginary parts, negative, small non zero), and the parameter disables all of them.
* Improved string formatting using `%g` instead of `%f` or other things.
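For readers following along, the behaviour described in this list can be sketched roughly as below. This is a simplified, standalone illustration and not the code merged in this PR: the function name `check_eigenvalues_sketch`, the thresholds `rel_tol` and `small_tol`, and the message wording are all assumptions. Only the `enable_warnings` switch, the three cases, and the `%g` formatting come from the discussion.

```python
import warnings

import numpy as np


def check_eigenvalues_sketch(lambdas, enable_warnings=False,
                             rel_tol=1e-5, small_tol=1e-12):
    """Clean the eigenvalues of a supposedly PSD matrix (illustrative only).

    Significant imaginary parts or negative values raise ValueError.
    Insignificant ones are removed, with a warning only if enable_warnings
    is True. rel_tol and small_tol are hypothetical thresholds.
    """
    lambdas = np.asarray(lambdas)

    # Case 1: imaginary parts.
    if np.iscomplexobj(lambdas):
        max_imag = np.abs(lambdas.imag).max()
        max_real = np.abs(lambdas.real).max()
        if max_imag > rel_tol * max_real:
            raise ValueError(
                "Significant imaginary parts found: max imaginary part is "
                "%g, max real part is %g" % (max_imag, max_real))
        if enable_warnings and max_imag > 0:
            warnings.warn("Small imaginary parts (max %g) were dropped"
                          % max_imag)
        lambdas = lambdas.real

    lambdas = lambdas.astype(float)  # astype copies, so the input is untouched
    max_eig = lambdas.max()
    if max_eig < 0:
        raise ValueError("All eigenvalues are negative (maximum is %g)"
                         % max_eig)

    # Case 2: negative eigenvalues.
    min_eig = lambdas.min()
    if min_eig < 0:
        if -min_eig > rel_tol * max_eig:
            raise ValueError(
                "Significant negative eigenvalue %g (largest is %g)"
                % (min_eig, max_eig))
        if enable_warnings:
            warnings.warn("Small negative eigenvalues (min %g) set to zero"
                          % min_eig)
        lambdas[lambdas < 0] = 0.0

    # Case 3: small positive eigenvalues, i.e. poor conditioning.
    small = (lambdas > 0) & (lambdas < small_tol * max_eig)
    if small.any():
        if enable_warnings:
            warnings.warn("Eigenvalues below %g times the largest were set "
                          "to zero" % small_tol)
        lambdas[small] = 0.0

    return lambdas
```

With `enable_warnings=False` the cleaning still happens but silently, so a single flag silences all three warnings while the errors for significant imaginary or negative values are always raised.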
Ready for a last round. That "simple" last change actually had quite an impact, but I think the result is now straightforward and consistent. I updated the docstring, please have a look.
…th the others. Now adopting the same message everywhere.
…ot copied back. Added it.
…y parts are all zeros, to convert to float dtype
…r faster partial decompositions, like in PCA (#12069)
Co-authored-by: Sylvain MARIE <firstname.lastname@example.org>
Co-authored-by: Thomas J Fan <email@example.com>
Co-authored-by: Nicolas Hug <firstname.lastname@example.org>
Co-authored-by: Joel Nothman <email@example.com>
Co-authored-by: Olivier Grisel <firstname.lastname@example.org>
Co-authored-by: Olivier Grisel <email@example.com>
Co-authored-by: Tom Dupré la Tour <firstname.lastname@example.org>