Return UMAP graph? #47

Closed
twhiscock opened this issue Feb 7, 2020 · 5 comments

@twhiscock commented Feb 7, 2020

Hello! Thank you for writing such a useful package. It is great not to have to switch between Python and R to use UMAP :).

I was wondering: is it possible to output the graph (i.e. the fuzzy simplicial set) that is an intermediate step in the UMAP projection?

In the original Python implementation, I obtained this using the function:

umap.umap_.fuzzy_simplicial_set

I have found that this graph has several nice properties, and can be used to cluster data directly using graphical clustering methods.

Tom

@jlmelville (Owner) commented Feb 7, 2020

Hello, right now the fuzzy simplicial set is not available for output. But it could be added as a new option in the next release.

@twhiscock (Author) commented Feb 10, 2020

OK, thanks! I think this would be a great feature for the next release :)

@jlmelville (Owner) commented Feb 17, 2020

@twhiscock the GitHub version of uwot can now return the fuzzy simplicial set:

res <- umap(X, ret_extra = c("fgraph"))

The coordinates will be in res$embedding and the graph in res$fgraph. It's a sparse dgCMatrix from the Matrix package.
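
As a rough sketch of how the returned graph might be used for the graph clustering mentioned in the original question: the example below assumes the igraph package is installed and uses Louvain clustering purely for illustration. The iris data, the seed, and the choice of clustering method are assumptions made for this sketch, not something uwot prescribes.

```r
library(uwot)
library(igraph)

# Small example dataset; any numeric matrix should work the same way
X <- as.matrix(iris[, 1:4])

set.seed(42)
res <- umap(X, ret_extra = c("fgraph"))

# One row of coordinates per observation, and an N x N sparse graph
dim(res$embedding)
dim(res$fgraph)

# Treat the fuzzy graph as a weighted, undirected adjacency matrix.
# mode = "max" is used so that a symmetric matrix yields one edge per pair.
g <- graph_from_adjacency_matrix(res$fgraph, mode = "max",
                                 weighted = TRUE, diag = FALSE)

# Louvain community detection uses the edge weights by default
cl <- cluster_louvain(g)
table(membership(cl))
```

Any other weighted graph clustering method from igraph could be swapped in for cluster_louvain here.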

This will show up in the next CRAN version, whenever that is.

Note that the graph is further sparsified by dropping low-weight edges that would not be sampled during optimization. Which edges are dropped depends on the n_epochs parameter. If you only care about the graph and not the embedded coordinates, set n_epochs = 0 and no edges are removed.

The effect of the sparsifying is small with default values. I looked at n_epochs = 200 and n_epochs = 500 with MNIST (a largeish dataset) and iris (a very small one) and the number of edges that are dropped is always < 1%.
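
To check the effect of the sparsification on your own data, something like the following sketch could be used. It compares the number of non-zero entries in the graph with n_epochs = 0 against a run with n_epochs = 200. The iris data and the seed are just assumptions for the example, and the exact fraction dropped will depend on the dataset and settings.

```r
library(uwot)
library(Matrix)

X <- as.matrix(iris[, 1:4])

# n_epochs = 0: no optimization, so no low-weight edges are removed
set.seed(42)
res_full <- umap(X, ret_extra = c("fgraph"), n_epochs = 0)

# n_epochs = 200: edges too weak to be sampled are dropped from the graph
set.seed(42)
res_opt <- umap(X, ret_extra = c("fgraph"), n_epochs = 200)

# Count the stored non-zero entries in each sparse matrix
n_full <- nnzero(res_full$fgraph)
n_opt <- nnzero(res_opt$fgraph)

# Fraction of entries removed by the sparsification
frac_dropped <- (n_full - n_opt) / n_full
frac_dropped
```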

@twhiscock (Author) commented Feb 24, 2020

Awesome :)

@jlmelville (Owner) commented Mar 16, 2020

uwot 0.1.8 is on CRAN with this feature available.

@jlmelville closed this Mar 16, 2020
yuhanH pushed a commit to yuhanH/uwot that referenced this issue Jul 20, 2020