
ENH propagate eigen_tol to all eigen solver #11968

Closed
wants to merge 12 commits

Conversation

@massich (Contributor) commented Sep 1, 2018

Reference Issues/PRs

closes #21243
closes #6489

What does this implement/fix? Explain your changes.

Propagate the eigen_tol to the 'arpack', 'amg', and 'lobpcg' solvers. By default we use the scipy defaults ('arpack' => 0, otherwise None).
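
For concreteness, a minimal sketch of the dispatch being proposed (the function name solve_eigen is illustrative, not the PR's actual code; only the scipy calls and their defaults are real):

import numpy as np
from scipy.sparse.linalg import eigsh, lobpcg

def solve_eigen(laplacian, n_components, eigen_solver="arpack", eigen_tol=None):
    # Forward the user's tolerance, falling back to each scipy solver's
    # own default when eigen_tol is None.
    if eigen_solver == "arpack":
        # eigsh interprets tol=0 as machine precision
        tol = 0 if eigen_tol is None else eigen_tol
        return eigsh(laplacian, k=n_components, sigma=1.0, which="LM", tol=tol)
    # 'amg' is lobpcg with a preconditioner, so both take the same tol;
    # tol=None lets lobpcg compute its internal default.
    X = np.random.default_rng(0).standard_normal((laplacian.shape[0], n_components))
    return lobpcg(laplacian, X, tol=eigen_tol, largest=False)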

Any other comments?

@massich (Contributor, Author) commented Sep 1, 2018

This is not how the default should be done, but it gives an idea.
cc:@lobpcg, @glemaitre

@lobpcg (Contributor) commented Sep 3, 2018

I ran a few tests using the new code and it now seems to work as I expected. Thanks!

Outdated review threads (resolved): sklearn/manifold/spectral_embedding_.py; sklearn/cluster/tests/test_spectral.py (two threads)

@massich (Contributor, Author) commented Sep 18, 2018 via email

@massich changed the title from "propagate eigen_tol" to "[MRG+1] propagate eigen_tol" on Sep 19, 2018
@massich (Contributor, Author) commented Sep 21, 2018

ping: @rth

@rth (Member) left a comment

Maybe add a note to the docstring of eigen_tol saying that for the amg solver it will always be set greater than or equal to 1e-12, and document the behavior of this parameter for lobpcg. Also see the comment below. Otherwise LGTM.

@@ -303,6 +305,10 @@ def spectral_embedding(adjacency, n_components=8, eigen_solver=None,
             raise ValueError

     elif eigen_solver == "lobpcg":
+        if eigen_tol == 0:
+            eigen_tol = 1e-15
@rth (Member) commented:

Would there be a disadvantage to doing eigen_tol = max(1e-15, eigen_tol), same as for amg? I find it strange that eigen_tol is not monotonic in this case: e.g. 0 will produce 1e-15 but 1e-16 will produce 1e-16. Or are there use cases for tolerances below 1e-15? cc @lobpcg
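
For illustration, the branch above rendered standalone; it shows the non-monotonic mapping in question:

def effective_tol(eigen_tol):
    return 1e-15 if eigen_tol == 0 else eigen_tol

assert effective_tol(0) == 1e-15
assert effective_tol(1e-16) == 1e-16  # requesting 1e-16 yields a stricter tol than requesting 0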

@lobpcg (Contributor) replied:


  1. "amg" solve actually is the same lobpcg, only with an extra option, so indeed, the default eigen_tol should be the same for "amg" and "lobpcg" My recommendation is to set eigen_tol=None in both cases and thus let the lobpcg code to choose its default.

  2. 1e-15 is way too small for the default eigen_tol. The default eigen_tol in the https://github.com/scipy/scipy/blob/v1.1.0/scipy/sparse/linalg/eigen/lobpcg/lobpcg.py line 307-308 is np.sqrt(1e-15) * n where n is the size of the matrix. The clean fix may be NOT to specify ANY default eigen_tol in "spectral_embedding" but rather let lobpcg to do it, as I suggest above.

  3. I do not like in lines 136-138 of in "spectral_embedding":

def spectral_embedding(adjacency, n_components=8, eigen_solver=None,
random_state=None, eigen_tol=0.0,

Why not eigen_tol=None ? As far as I can see, eigen_tol=0.0 makes little sense, while eigen_tol=None would be passed to "lobpcg" and let it choose its default, as I propose. I do not know how "arpack" reacts to eigen_tol=None, however.

  1. Yes, there may be use cases for tolerances below 1e-15 in general, because the stopping criteria currently used in "lobpcg" is not scale invariant, i.e., finding eigenvectors of the matrix 1e-10*A with the same accuracy as of finding the eigenvectors of the matrix A would require to scale the eigen_tol by the same magnitude, 1e-10. I have not actually tested it, but this is how the code is written. The default eigen_tol in "lobpcg" which is np.sqrt(1e-15) * n implicitly assumes that the matrix A if reasonably scaled.
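
To make point 2 concrete, scipy's internal default computed in isolation (only the two cited lines are mirrored; n=1000 is an arbitrary example size):

import numpy as np

n = 1000  # size of the matrix
default_tol = np.sqrt(1e-15) * n  # scipy 1.1.0 lobpcg.py, lines 307-308
print(default_tol)  # ~3.16e-05, ten orders of magnitude above 1e-15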

@amueller (Member) commented Aug 5, 2019

what's the status of this?

@glemaitre self-assigned this Aug 12, 2019
@massich changed the title from "[MRG+1] propagate eigen_tol" to "[MRG] propagate eigen_tol" on Sep 6, 2019
@massich (Contributor, Author) commented Sep 6, 2019

This PR has been updated taking into account the changes in #13707 that were merged in master.

The original issue in #6489 is already fixed in master, probably by #13707. But the tolerance passed in the function call was still not propagated to all solvers.

This PR proposes to unify them. It changes the default to 1e-15 for all solvers that support tol, except for amg and lobpcg, where we kept 1e-5.

@lobpcg do you think that 1e-15 is a good default in general? What's your opinion?

@massich (Contributor, Author) commented Sep 6, 2019

Maybe it would also be nice to catch the case where eigen_tol is passed to spectral_embedding but the chosen method does not support tol.
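
One possible shape for such a guard (the helper name _check_eigen_tol and the solver set are hypothetical, sketched for illustration):

def _check_eigen_tol(eigen_tol, eigen_solver):
    # Hypothetical guard: reject a tolerance that the chosen solver ignores.
    solvers_with_tol = {"arpack", "lobpcg", "amg"}
    if eigen_tol is not None and eigen_solver not in solvers_with_tol:
        raise ValueError(
            f"eigen_tol={eigen_tol!r} was given, but eigen_solver="
            f"{eigen_solver!r} does not accept a tolerance."
        )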

@lobpcg (Contributor) commented Sep 6, 2019

> This PR has been updated taking into account the changes in #13707 that were merged in master.
>
> The original issue in #6489 is already fixed in master, probably by #13707. But the tolerance passed in the function call was still not propagated to all solvers.
>
> This PR proposes to unify them. It changes the default to 1e-15 for all solvers that support tol, except for amg and lobpcg, where we kept 1e-5.
>
> @lobpcg do you think that 1e-15 is a good default in general? What's your opinion?

Propagating eigen_tol is a good idea. There are only two solvers involved: arpack and lobpcg; AMG is also lobpcg, just with an extra option. 1e-15 is a good default only for trivial unit tests. It is never a good default in practical calculations: way too small. It is difficult to come up with a good default, as it needs to depend, e.g., on the size of the matrix and on whether the matrix is float64 or float32. So I strongly advocate delegating this to the solver, choosing the default None.
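
As one illustration of that dependence (our own sketch; scipy 1.1.0 hardcodes np.sqrt(1e-15) * n and does not consult the dtype):

import numpy as np

def suggested_tol(A):
    # Scale the tolerance with matrix size and floating-point precision.
    return np.sqrt(np.finfo(A.dtype).eps) * A.shape[0]

print(suggested_tol(np.eye(100)))                    # float64: ~1.5e-06
print(suggested_tol(np.eye(100, dtype=np.float32)))  # float32: ~3.5e-02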

@lobpcg (Contributor) commented Sep 6, 2019

I also advocate propagating max_it (the cap on the number of iterations) and introducing and propagating a convergence_flag output, True or False, showing whether the tolerance requirement has been achieved.
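
A sketch of what such a flag could look like from the caller's side (the wrapper is ours; only lobpcg's tol and maxiter parameters are real, and scipy's lobpcg as of 1.1.0 returns no flag itself):

import numpy as np
from scipy.sparse.linalg import lobpcg

def lobpcg_with_flag(A, X, tol, maxiter=20):
    w, v = lobpcg(A, X, tol=tol, maxiter=maxiter, largest=False)
    # Recompute residual norms ||A v_i - w_i v_i|| to decide convergence.
    residuals = np.linalg.norm(A @ v - v * w, axis=0)
    return w, v, bool(np.all(residuals <= tol))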

@cmarmo (Contributor) commented Aug 15, 2020

Following @lobpcg's comment, the corresponding issue seems to be solved. I'm going to close this PR then; feel free to reopen if necessary.

@cmarmo closed this Aug 15, 2020
@lobpcg (Contributor) commented Aug 15, 2020

> Following @lobpcg's comment, the corresponding issue seems to be solved. I'm going to close this PR then; feel free to reopen if necessary.

@cmarmo My comment quoted above is unrelated to this issue, which is still unresolved to my knowledge and needs to be reopened...

@cmarmo (Contributor) commented Aug 15, 2020

OK, sorry for the misunderstanding.
To be clear, what is missing here is:

  • set the default eigen_tol=None
  • introduce a max_iter parameter
  • introduce a ConvergenceWarning when convergence is not achieved

Am I understanding correctly?

@lobpcg (Contributor) commented Aug 15, 2020

> OK, sorry for the misunderstanding.
> To be clear, what is missing here is:
>
>   • set the default eigen_tol=None
>   • introduce a max_iter parameter
>   • introduce a ConvergenceWarning when convergence is not achieved
>
> Am I understanding correctly?

Yes. And of course unit tests checking that each one actually works for all (both) implemented solvers.
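
A sketch of such a test (spectral_embedding and its listed parameters are real; the eigen_tol=None behavior under test is the proposal, not yet merged):

import numpy as np
import pytest
from sklearn.manifold import spectral_embedding

@pytest.mark.parametrize("eigen_solver", ["arpack", "lobpcg"])
def test_eigen_tol_none_delegates_to_solver(eigen_solver):
    rng = np.random.RandomState(0)
    X = rng.rand(50, 50)
    adjacency = X @ X.T  # symmetric, positive semi-definite affinity
    maps = spectral_embedding(
        adjacency,
        n_components=2,
        eigen_solver=eigen_solver,
        eigen_tol=None,
        random_state=0,
    )
    assert maps.shape == (50, 2)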

@albertvillanova (Contributor) commented

Take

@albertvillanova (Contributor) commented

@glemaitre sorry, I just saw the "help wanted" label, but afterwards I realized you are the assignee. I guess you are already working on this...

Base automatically changed from master to main January 22, 2021 10:50
@glemaitre changed the title from "[MRG] propagate eigen_tol" to "ENH propagate eigen_tol to all eigen solver" on Dec 17, 2021
@jeremiedbb (Member) left a comment

Here are some suggestions. It will also require a what's new entry and some tests.

@@ -8,7 +8,7 @@ doctest_optionflags = NORMALIZE_WHITESPACE ELLIPSIS
 testpaths = sklearn
 addopts =
     --doctest-modules
-    --disable-pytest-warnings
+    # --disable-pytest-warnings

Suggested change:
-    # --disable-pytest-warnings
+    --disable-pytest-warnings

@@ -339,7 +355,13 @@ def spectral_embedding(
         X = random_state.randn(laplacian.shape[0], n_components + 1)
         X[:, 0] = dd.ravel()
         X = X.astype(laplacian.dtype)
-        _, diffusion_map = lobpcg(laplacian, X, M=M, tol=1.0e-5, largest=False)
+        # Until scikit-learn minimum scipy dependency <1.4.0 we require high

Suggested change:
- # Until scikit-learn minimum scipy dependency <1.4.0 we require high
+ # As long as scikit-learn has minimum scipy dependency <1.4.0 we require high

-    eigen_tol : float, default=0.0
-        Stopping criterion for eigendecomposition of the Laplacian matrix
-        when using arpack eigen_solver.
+    eigen_tol : float or None, default=None

Suggested change:
-    eigen_tol : float or None, default=None
+    eigen_tol : float, default=None

-    eigen_tol : float, default=0.0
-        Stopping criterion for eigendecomposition of the Laplacian matrix
-        when ``eigen_solver='arpack'``.
+    eigen_tol : float or None, default=None

Suggested change:
-    eigen_tol : float or None, default=None
+    eigen_tol : float, default=None

-    eigen_tol : float, default=0.0
-        Stopping criterion for eigendecomposition of the Laplacian matrix
-        when using arpack eigen_solver.
+    eigen_tol : float or None, default=None

Suggested change:
-    eigen_tol : float or None, default=None
+    eigen_tol : float, default=None

@@ -443,6 +470,21 @@ class SpectralEmbedding(BaseEstimator):
         to be installed. It can be faster on very large, sparse problems.
         If None, then ``'arpack'`` is used.

+    eigen_tol : float or None, default=None

Suggested change:
-    eigen_tol : float or None, default=None
+    eigen_tol : float, default=None

@thomasjpfan (Member) left a comment

Recently, we have been using "auto" to change behavior depending on another parameter.

@jeremiedbb What do you think of eigen_tol="auto"?
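
For example, an "auto" sentinel could resolve per solver along these lines (a sketch, not the eventual implementation; the helper name is hypothetical):

def _resolve_eigen_tol(eigen_tol, eigen_solver):
    if eigen_tol == "auto":
        # eigsh treats tol=0 as machine precision; lobpcg/amg pick their
        # own internal default when tol is None.
        return 0 if eigen_solver == "arpack" else None
    return eigen_tol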

# Until scikit-learn minimum scipy dependency <1.4.0 we require high
# tolerance as explained in:
# https://github.com/scikit-learn/scikit-learn/pull/13707#discussion_r314028509
tol = max(1e-5, 1e-5 if eigen_tol is None else eigen_tol)
@thomasjpfan (Member) commented Apr 6, 2022

May we update the docstrings to explain this clipping behavior for eigen_tol + lobpcg.

A Member replied:

From an IRL discussion with @ogrisel and @glemaitre, we don't think clipping is a good solution here. We should let the user choose the tol they want. Instead, we can document in the parameter description that it is advised not to use too low a tolerance when scipy < 1.4.
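
For instance, the parameter description could carry the advisory instead of clipping (the wording below is a sketch, not merged text):

# One possible wording for the advisory (ours, not merged text):
EIGEN_TOL_DOC = """\
eigen_tol : float, default=None
    Stopping criterion for eigendecomposition of the Laplacian matrix.
    With ``eigen_solver='lobpcg'`` or ``'amg'`` and scipy < 1.4, tolerances
    much below 1e-5 may fail to converge and are not advised.
"""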

@jeremiedbb (Member) commented

> @jeremiedbb What do you think of eigen_tol="auto"?

Makes sense.

@Micky774 (Contributor) commented

Is someone still actively working on this PR? @massich are you still involved with this one?

@glemaitre (Member) commented

@Micky774 You can go ahead. I personally know @massich and he will not be able to carry on the work here.

@cmarmo added the Superseded (PR has been replaced by a newer PR) label on May 10, 2022
@cmarmo (Contributor) commented May 10, 2022

I'm closing this one as superseded by #23210.
Thanks @massich for your work: all your commits have been included in the new PR.

@cmarmo closed this May 10, 2022
Labels: module:manifold, Superseded (PR has been replaced by a newer PR)