Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for spectral clustering error when using 'amg' solver #13707

Merged
merged 27 commits into from Aug 29, 2019

Conversation

@whitews
Copy link
Contributor

whitews commented Apr 24, 2019

Reference Issues/PRs

Fixes #13393. See also PR #12316

What does this implement/fix? Explain your changes.

Fixes LinAlgError when using spectral clustering with the amg solver

Any other comments?

This PR is derived from the previous PR #12316 submitted by Andrew Knyazev (lobpcg). In that PR, Andrew fixed issue #13393 and also added a new label assignment option 'clusterQR'. It was requested that the PR be split to separate the fix and the new label assignment functionality. This PR contains Andrew's fix for the AMG bug.

@whitews whitews changed the title Spec clust amg fix Fix for spectral clustering error when using 'amg' solver Apr 24, 2019
Copy link
Member

jnothman left a comment

Thanks for this @whitews

I'm not sure what chance it has to get into 0.21, but just in case:
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
@whitews

This comment has been minimized.

Copy link
Contributor Author

whitews commented Apr 24, 2019

Updated the change log. Let me know if anything else is needed.

@whitews

This comment has been minimized.

Copy link
Contributor Author

whitews commented Apr 25, 2019

Don't understand the codecov/patch failure, all local tests are passing for me. What does this failure mean?

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Apr 25, 2019

whitews added 2 commits Apr 25, 2019
@whitews

This comment has been minimized.

Copy link
Contributor Author

whitews commented Apr 25, 2019

@jnothman This seems pretty clean now, everything is passing.

@lobpcg

This comment has been minimized.

Copy link

lobpcg commented Apr 26, 2019

I think that the mathematically proper fix is changing in sklearn/manifold/spectral_embedding_.py the present

        laplacian = _set_diag(laplacian, 1 + 1e-5, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))

into

        laplacian = _set_diag(laplacian, 1, norm_laplacian)

        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian + 1e-5 * sparse.eye(laplacian.shape[0], 'csr'))

so that the LOBPCG solver is still called on the unchanged Laplacian, but only the AMG preconditioner is fed with the shifted Laplacian. I have updated my #13393 to highlight this.

I am unsure how the memory allocation would work in my suggestion above. May be to save memory one can do something like:

        laplacian = _set_diag(laplacian, 1, norm_laplacian)
        laplacian = laplacian + 1e-5 * sparse.eye(laplacian.shape[0]
        # noinspection PyUnboundLocalVariable
        ml = smoothed_aggregation_solver(check_array(laplacian, 'csr'))
        laplacian = laplacian - 1e-5 * sparse.eye(laplacian.shape[0]
centers = np.eye(n_clusters, n_features)
S, true_labels = make_blobs(n_samples=n_samples, centers=centers,
cluster_std=1., random_state=42)

This comment has been minimized.

Copy link
@lobpcg

lobpcg Apr 26, 2019

May be check separately norm_laplacian = False and norm_laplacian = True . The latter is the default, the only option currently checked.

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented May 22, 2019

@whitews are you continuing with this?

whitews added 4 commits May 23, 2019
@whitews

This comment has been minimized.

Copy link
Contributor Author

whitews commented May 23, 2019

@jnothman Yes, I've updated the PR. Have the tests changed? Getting a 404 response for a deb package in the np_atlas Azure build.

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented May 23, 2019

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented May 27, 2019

Yes, bit I've been low on time to review this and other pull requests. I hope one of us can get to it soon.

@lobpcg

This comment has been minimized.

Copy link

lobpcg commented Jul 23, 2019

@jnothman is there a plan trying to finish this one?

@jnothman jnothman closed this Jul 25, 2019
@jnothman jnothman reopened this Jul 25, 2019
Copy link
Member

jnothman left a comment

Thanks @whitews.

Btw, I've confirmed that the test fails on master, which is good.

doc/whats_new/v0.21.rst Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
sklearn/manifold/spectral_embedding_.py Outdated Show resolved Hide resolved
whitews and others added 5 commits Aug 2, 2019
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Aug 2, 2019

Please merge the master from upstream to avoid the Circle CI failure.

…into spec-clust-amg-fix
@whitews

This comment has been minimized.

Copy link
Contributor Author

whitews commented Aug 2, 2019

@jnothman I think this is ready. Are there any outstanding requests?

Copy link
Member

jnothman left a comment

Otherwise LGTM. Awaiting another review. Thanks @whitews and @lobpcg.

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
doc/whats_new/v0.22.rst Show resolved Hide resolved
@lobpcg

This comment has been minimized.

Copy link

lobpcg commented Aug 13, 2019

https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html, says that using amg eigen solver may lead to instabilities. This can be now removed, I think.

Copy link
Member

ogrisel left a comment

LGTM, I pushed some cosmetic commits, will merge when CI is green.

@ogrisel ogrisel merged commit 372092c into scikit-learn:master Aug 29, 2019
18 checks passed
18 checks passed
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc artifact Link to 0/doc/_changed.html
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.91%)
Details
codecov/project Absolute coverage decreased by -0.75% but relative coverage increased by +3.08% compared to 00ce061
Details
scikit-learn.scikit-learn Build #20190829.15 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
Details
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded
Details
@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Aug 30, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.