UMAP performance #13

Open
bwang12 opened this issue Sep 12, 2019 · 2 comments

bwang12 commented Sep 12, 2019

It's great to have a Julia implementation of UMAP. I have been using the Python one (https://github.com/lmcinnes/umap) quite a bit and am very impressed with its performance thus far.

Since I am somewhat new to Julia, I am wondering how much faster the Julia version can be.

Currently, the Python UMAP takes about 3.2 seconds to run on a randomized 2000 by 2000 matrix.

import numpy as np
import umap
test = np.random.rand(2000, 2000)
UMAP = umap.UMAP(n_components=2)
%timeit UMAP.fit_transform(test)

The Julia UMAP, by contrast, takes about 4.3 seconds on a random matrix of the same size.

using BenchmarkTools
using UMAP
test = rand(2000, 2000)
@btime umap($test)
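
One caveat when comparing the two numbers: BenchmarkTools' `@btime` prints the minimum time over its samples, while IPython's `%timeit` prints a mean and standard deviation, so the figures are not quite like for like. A minimal sketch of a plain-Python timing helper that records every run, so the first run, the minimum, and the mean can be looked at separately, might look like this. The names `time_runs` and `summarize` are made up for illustration and are not part of either library.

```python
import statistics
import time


def time_runs(fn, repeats=5):
    """Call fn() `repeats` times; return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the first run, the minimum, and the mean of the measured runs."""
    return {
        "first": durations[0],           # may include one-off warm-up cost
        "min": min(durations),           # closest analogue to what @btime prints
        "mean": statistics.mean(durations),  # closest analogue to what %timeit prints
    }


# Hypothetical usage with the benchmark from this issue
# (assumes numpy and umap-learn are installed):
#   import numpy as np, umap
#   test = np.random.rand(2000, 2000)
#   print(summarize(time_runs(lambda: umap.UMAP(n_components=2).fit_transform(test))))
```

Comparing the `min` value against the Julia `@btime` output should give a fairer picture than comparing a mean against a minimum.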

I'd love to get your take on this, @dillondaudert.

@SimonDanisch

Maybe this can get you into the same ballpark: #2

kadir-gunel commented Apr 17, 2021

Hello, I am trying to use umap to reduce the dimensions of word embeddings from 200k x 300 to 200k x 2, but I cannot get any results because it runs on a single thread. I changed the number of threads that Julia uses from 1 to 8, but UMAP.jl still runs on a single thread.
What is wrong?
