umap non determinism - intended? #27
Comments
That would definitely be a bug. I believe I had the seed working such that it eliminated randomness; that is, the latest version of UMAP has a … Just setting numpy's random seed is not going to be enough, because of interactions with numba and the fact that UMAP uses its own internal PRNG for speed. Can you clarify under what conditions you aren't getting repeatability?
Hi, thanks for responding. In this case I'm using the random_state parameter and setting it to 42:
I then plot the embedding like so: The graph I get is different each time. I tried switching the axes, but that doesn't explain the differences. FYI, I just pip installed umap. Any thoughts would be helpful. Thanks!
Okay, that's definitely disconcerting, because I worked through getting random_state to work properly (which turned out to be frustratingly non-trivial), and for at least the test dataset I was working with it produced perfectly consistent results when fixed. I'll try a few other datasets to verify that it is indeed working for me at least, and then perhaps we can start trying to track down why it isn't working for you. Which Python version are you using? That's potentially one reason for issues...
The nondeterminism probably comes from an unstable result of the metric_nn_descent function. I observe that some rows of the returned knn_indices and knn_dists are not sorted by knn distance (this may be a serious bug, not sure).
That is a bug that was caught and should be fixed in more recent versions. It should either be in the current master or will appear in version 0.3.
This is still happening for me on 0.3.8. However, @warenlg found that fixing the numpy seed makes it fully deterministic:
So this is somewhat disconcerting, and is definitely on my list of things to fix. I am honestly not quite sure where or how this is happening.
This is still happening for me on 0.3.10.
@ericloud Can you provide example code of what you did in order for others to reproduce the issue?
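A minimal reproduction for this kind of report might simply compare repeated runs on identical input. The sketch below is an assumption, not code from this thread: `is_reproducible` and `toy_embed` are hypothetical helpers, and `toy_embed` is a pure-Python stand-in for a real call such as `umap.UMAP(random_state=42).fit_transform(data)`.

```python
import random


def toy_embed(data, random_state=None):
    """Hypothetical stand-in for an embedding call; this is NOT UMAP.

    Projects each row to 2-D using weights drawn from a private PRNG.
    """
    # random_state=None seeds the generator from OS entropy or the clock.
    rng = random.Random(random_state)
    n_features = len(data[0])
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(2)]
    return [
        tuple(sum(w * x for w, x in zip(row_w, row)) for row_w in weights)
        for row in data
    ]


def is_reproducible(embed, data, runs=3):
    """Return True if repeated calls of embed(data) give identical output."""
    first = embed(data)
    return all(embed(data) == first for _ in range(runs - 1))


data = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]

# With a fixed seed the stand-in returns the same embedding every time.
seeded = is_reproducible(lambda d: toy_embed(d, random_state=42), data)
# Without a seed the runs are expected to differ.
unseeded = is_reproducible(lambda d: toy_embed(d), data)

print("seeded reproducible:", seeded)
print("unseeded reproducible:", unseeded)
```

To adapt this to an actual report, replace the lambda with the real embedding call and include the library version and Python version alongside the output.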
Eureka!
Thanks.
Glad you resolved it. For details on sets, see the Python tutorial section on data structures: https://docs.python.org/3/tutorial/datastructures.html#sets
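The thread does not record what the fix was, but the pointer to sets suggests the input may have been assembled from a `set`. Assuming that, here is a sketch of the pitfall: Python sets have no guaranteed iteration order, and for strings the order can change between interpreter runs because of hash randomization, so rows built from a set may reach UMAP in a different order each time even with random_state fixed. All names and values below are hypothetical.

```python
# Hypothetical example data: feature vectors keyed by sample name.
features = {
    "sample_c": [0.3, 0.1],
    "sample_a": [0.9, 0.4],
    "sample_b": [0.5, 0.7],
}

names = set(features)  # a set has no guaranteed iteration order

# Fragile: row order follows set iteration order, which for strings can
# differ between interpreter runs (hash randomization).
fragile_rows = [features[name] for name in names]

# Stable: sort the names first so the row order is identical on every run.
stable_names = sorted(names)
stable_rows = [features[name] for name in stable_names]

print(stable_names)
```

Sorting (or keeping a list from the start) fixes the row order, so a seeded embedding sees identical input on every run.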
I was testing it out and noticed that setting the random seed doesn't stop the embedding from changing between runs.
Is non-determinism part of the design (like t-SNE)? Is there a way to replicate prior results?