Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for spectral clustering error when using 'amg' solver #13707

Merged
merged 27 commits into from Aug 29, 2019

Conversation

whitews
Copy link
Contributor

@whitews whitews commented Apr 24, 2019

Reference Issues/PRs

Fixes #13393. See also PR #12316

What does this implement/fix? Explain your changes.

Fixes LinAlgError when using spectral clustering with the amg solver

Any other comments?

This PR is derived from the previous PR #12316 submitted by Andrew Knyazev (lobpcg). In that PR, Andrew fixed issue #13393 and also added a new label assignment option 'clusterQR'. It was requested that the PR be split to separate the fix and the new label assignment functionality. This PR contains Andrew's fix for the AMG bug.

@whitews whitews changed the title Spec clust amg fix Fix for spectral clustering error when using 'amg' solver Apr 24, 2019
Copy link
Member

@jnothman jnothman left a comment

Thanks for this @whitews

I'm not sure what chance it has to get into 0.21, but just in case:
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
@whitews
Copy link
Contributor Author

@whitews whitews commented Apr 24, 2019

Updated the change log. Let me know if anything else is needed.

sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
sklearn/manifold/tests/test_spectral_embedding.py Outdated Show resolved Hide resolved
sklearn/manifold/tests/test_spectral_embedding.py Outdated Show resolved Hide resolved
sklearn/manifold/tests/test_spectral_embedding.py Outdated Show resolved Hide resolved
@whitews
Copy link
Contributor Author

@whitews whitews commented Apr 25, 2019

Don't understand the codecov/patch failure, all local tests are passing for me. What does this failure mean?

@jnothman
Copy link
Member

@jnothman jnothman commented Apr 25, 2019

@whitews
Copy link
Contributor Author

@whitews whitews commented Apr 25, 2019

@jnothman This seems pretty clean now, everything is passing.

@lobpcg
Copy link
Contributor

@lobpcg lobpcg commented Apr 26, 2019

I think that the mathematically proper fix is changing in sklearn/manifold/spectral_embedding_.py the present

        laplacian = _set_diag(laplacian, 1 + 1e-5, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))

into

        laplacian = _set_diag(laplacian, 1, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian + 1e-5 * sparse.eye(laplacian.shape[0], 'csr'))

so that the LOBPCG solver is still called on the unchanged Laplacian, but only the AMG preconditioner is fed with the shifted Laplacian. I have updated my #13393 to highlight this.

I am unsure how the memory allocation would work in my suggestion above. May be to save memory one can do something like:

        laplacian = _set_diag(laplacian, 1, norm_laplacian)
        laplacian = laplacian + 1e-5 * sparse.eye(laplacian.shape[0]
        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))
        laplacian = laplacian - 1e-5 * sparse.eye(laplacian.shape[0]

centers = np.eye(n_clusters, n_features)
S, true_labels = make_blobs(n_samples=n_samples, centers=centers,
cluster_std=1., random_state=42)

Copy link
Contributor

@lobpcg lobpcg Apr 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be check separately norm_laplacian = False and norm_laplacian = True . The latter is the default, the only option currently checked.

@jnothman
Copy link
Member

@jnothman jnothman commented May 22, 2019

@whitews are you continuing with this?

@whitews
Copy link
Contributor Author

@whitews whitews commented May 23, 2019

@jnothman Yes, I've updated the PR. Have the tests changed? Getting a 404 response for a deb package in the np_atlas Azure build.

@jnothman
Copy link
Member

@jnothman jnothman commented May 23, 2019

@jnothman
Copy link
Member

@jnothman jnothman commented May 27, 2019

Yes, bit I've been low on time to review this and other pull requests. I hope one of us can get to it soon.

@lobpcg
Copy link
Contributor

@lobpcg lobpcg commented Jul 23, 2019

@jnothman is there a plan trying to finish this one?

@jnothman jnothman closed this Jul 25, 2019
@jnothman jnothman reopened this Jul 25, 2019
Copy link
Member

@jnothman jnothman left a comment

Thanks @whitews.

Btw, I've confirmed that the test fails on master, which is good.

doc/whats_new/v0.21.rst Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
@jnothman
Copy link
Member

@jnothman jnothman commented Aug 2, 2019

Please merge the master from upstream to avoid the Circle CI failure.

@whitews
Copy link
Contributor Author

@whitews whitews commented Aug 2, 2019

@jnothman I think this is ready. Are there any outstanding requests?

Copy link
Member

@jnothman jnothman left a comment

Otherwise LGTM. Awaiting another review. Thanks @whitews and @lobpcg.

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
doc/whats_new/v0.22.rst Show resolved Hide resolved
@lobpcg
Copy link
Contributor

@lobpcg lobpcg commented Aug 13, 2019

https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html, says that using amg eigen solver may lead to instabilities. This can be now removed, I think.

Copy link
Member

@ogrisel ogrisel left a comment

LGTM, I pushed some cosmetic commits, will merge when CI is green.

@ogrisel ogrisel merged commit 372092c into scikit-learn:master Aug 29, 2019
18 checks passed
@jnothman
Copy link
Member

@jnothman jnothman commented Aug 30, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants