sklearn.cluster.bicluster.BaseSpectral._svd: n_discard eigenvectors from svds #12863
Comments
Thanks for the report. It would be useful to either present a snippet of code that fails your assumptions or link to the error in the code (https://help.github.com/articles/creating-a-permanent-link-to-a-code-snippet/). A pull request fixing the issue, ideally with a test, is also welcome.
Code is presented. The issue is that the eigenvalues come back in a different order under the two settings (randomized and arpack). A simple fix would be to reorder the eigenvalues after the computation, which would also avoid future errors if more parameter choices are added. See scikit-learn/sklearn/cluster/bicluster.py, lines 132 to 162 in 8d7e849.
I can confirm the bug:
gives
, but
gives
The ARPACK docs clearly state that the singular values are returned in ascending order. I just submitted pull request #12898 fixing the issue.
Description
The function sklearn.cluster.bicluster.BaseSpectral._svd handles n_discard incorrectly when svd_method == 'arpack'.
The function _svd should discard the eigenvectors with the largest eigenvalues, but svds, which is used when svd_method == 'arpack', returns the eigenvectors in order of ascending eigenvalue. This differs from the behavior of randomized_svd, which is used when svd_method == 'randomized'.
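The reordering fix suggested in the comments can be sketched as follows. This is only an illustration, not the code from the pull request: the helper name `reorder_and_discard` is hypothetical, and the ascending output reported for `svds` is simulated here by reversing a NumPy SVD so that the example needs nothing beyond NumPy.

```python
import numpy as np


def reorder_and_discard(u, s, vt, n_discard=0):
    """Hypothetical helper (not part of scikit-learn).

    Sort the decomposition by descending singular value, then drop the
    n_discard leading components, whatever order the solver returned.
    """
    order = np.argsort(s)[::-1]          # indices of largest values first
    u, s, vt = u[:, order], s[order], vt[order]
    return u[:, n_discard:], s[n_discard:], vt[n_discard:]


rng = np.random.RandomState(0)
A = rng.rand(8, 6)

# Reference decomposition: NumPy returns singular values in descending order.
u, s, vt = np.linalg.svd(A, full_matrices=False)

# Simulate a solver that returns the same components in ascending order.
u_asc, s_asc, vt_asc = u[:, ::-1], s[::-1], vt[::-1]

u_fix, s_fix, vt_fix = reorder_and_discard(u_asc, s_asc, vt_asc, n_discard=1)
```

Sorting explicitly, rather than relying on either solver's documented order, means the discard step removes the same components for every svd_method.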