Digits example #9

Open
hedgefair opened this issue Nov 12, 2017 · 8 comments

Comments

@hedgefair

569 self._raise_no_convergence()
570 else:
--> 571 raise ArpackError(self.info, infodict=self.iterate_infodict)
572
573 def extract(self, return_eigenvectors):

ArpackError: ARPACK error 3: No shifts could be applied during a cycle of the Implicitly restarted Arnoldi iteration. One possibility is to increase the size of NCV relative to NEV.

@lmcinnes
Owner

The computation of eigenvectors of the Laplacian has failed -- now why that happens is a bit of a mystery. Potentially it means that some eigenvalues are very close to each other and hard to extract. This can happen if the whole thing gets distorted by an outlier too badly. I would recommend tweaking the parameter values (increasing n_neighbors incrementally) to see if that remedies the problem. If so then there is something in the data that the code isn't quite handling well in a corner case. Let me know how that goes and we can work from there.
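The incremental sweep suggested above could be sketched roughly as follows. `first_working_n_neighbors` and `fit_umap` are hypothetical helpers written for illustration, not part of umap, and the candidate values in the usage note are arbitrary. The sweep helper accepts any callable, so it does not itself depend on umap.

```python
def first_working_n_neighbors(fit, candidates):
    """Call fit(n) for each candidate until one succeeds.

    Returns (n, result, failures). `failures` maps each failed candidate to
    the repr of the exception it raised, which is handy to paste into a
    bug report. If nothing succeeds, n and result are None.
    """
    failures = {}
    for n in candidates:
        try:
            result = fit(n)
        except Exception as exc:  # e.g. ArpackError or ValueError, as seen in this thread
            failures[n] = repr(exc)
            continue
        return n, result, failures
    return None, None, failures


def fit_umap(n_neighbors):
    """Hypothetical wrapper: embed the digits data with the given n_neighbors."""
    # Imported lazily; assumes umap-learn and scikit-learn are installed.
    import umap
    from sklearn.datasets import load_digits

    X = load_digits().data
    return umap.UMAP(n_neighbors=n_neighbors).fit_transform(X)
```

Calling `first_working_n_neighbors(fit_umap, [5, 10, 15, 30, 50])` would then report the smallest candidate that embeds without error, along with the errors raised by the smaller ones.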

@asstergi

Hi @lmcinnes,

I'm seeing the same issue.
I increased 'n_neighbors' up to 1000, but the issue remains. Do you have any idea why?

Then, I set 'n_neighbors' to 2000 and got this error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\asterioss\AppData\Local\Continuum\anaconda3\lib\site-packages\umap\umap_.py", line 792, in fit_transform
    self.fit(X)
  File "C:\Users\asterioss\AppData\Local\Continuum\anaconda3\lib\site-packages\umap\umap_.py", line 759, in fit
    self._metric, self.metric_kwds)
  File "C:\Users\asterioss\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\sparse\coo.py", line 184, in __init__
    self._check()
  File "C:\Users\asterioss\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\sparse\coo.py", line 236, in _check
    raise ValueError('negative column index found')
ValueError: negative column index found

@lmcinnes
Owner

This is a little mysterious to me. You should definitely not need an n_neighbors value that large, so something is going wrong somewhere along the line. Can you share the data you are using? I think, unfortunately, this will take a bit of digging for me to figure out exactly what is going wrong here, so I can't really promise a quick fix. Thanks for the report though, it is very helpful to know about these edge cases that can cause problems like this (and I know it is frustrating for users).

@lmcinnes
Owner

As an interim solution you can use init='random' to avoid this issue. I'm not sure exactly what will happen with the data.
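For reference, a minimal sketch of that interim workaround on the digits data mentioned in this thread. `embed_digits` is a hypothetical helper name; the sketch assumes umap-learn and scikit-learn are installed and leaves every other UMAP parameter at its default.

```python
def embed_digits(init="random"):
    """Embed the scikit-learn digits data with the chosen initialisation."""
    # Lazy imports so that defining this helper does not require the packages.
    import umap
    from sklearn.datasets import load_digits

    X = load_digits().data
    # init='random' skips the spectral (Laplacian eigenvector) initialisation,
    # which is the step that raised ArpackError above.
    return umap.UMAP(init=init).fit_transform(X)
```

`embed_digits()` returns the low-dimensional embedding as an array with one row per digit image.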

@lmcinnes
Owner

I've just pushed code that will at least work around the issue. The result will be slower performance (because we need to try spectral initialisation, have it fail, and fall back) but it should work. If @asstergi or @hedgefair have time or opportunity to pull from master and reinstall to verify whether this resolves the issue for them I would appreciate it. Thanks again for the feedback.
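The try-then-fall-back pattern described above might look something like the sketch below. This is not the code that was pushed to master; `init_with_fallback` and its two callable arguments are hypothetical names used only to illustrate the control flow.

```python
import warnings


def init_with_fallback(spectral_init, random_init):
    """Return (result, method), preferring spectral_init().

    If spectral_init() raises, a warning is emitted and random_init() is
    used instead. The wasted spectral attempt is what makes the fallback
    path slower.
    """
    try:
        return spectral_init(), "spectral"
    except Exception as exc:  # e.g. ArpackError from the eigenvector solver
        warnings.warn(
            f"Spectral initialisation failed ({exc!r}); falling back to random."
        )
        return random_init(), "random"
```

With umap installed, the two callables could be, for example, `lambda: umap.UMAP(init='spectral').fit_transform(X)` and `lambda: umap.UMAP(init='random').fit_transform(X)`.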

@asstergi

Regarding the data, I'm using the digits = load_digits() example.
Thanks for your help. When I find some time I'll reinstall and let you know.

@lmcinnes
Owner

That's odd because I have definitely run successfully on that exact dataset. I'll continue looking into this.

@sleighsoft
Collaborator

Did this get resolved? Can the issue be closed?
