Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential optimization opportunity in supervised.cpp #34

Closed
LTLA opened this issue Oct 26, 2019 · 1 comment
Closed

Potential optimization opportunity in supervised.cpp #34

LTLA opened this issue Oct 26, 2019 · 1 comment

Comments

@LTLA
Copy link
Contributor

LTLA commented Oct 26, 2019

Looking at:

uwot/src/supervised.cpp

Lines 71 to 75 in 3e359a3

for (auto k = indptr1[i]; k < indptr1[i + 1]; k++) {
if (indices1[k] == j) {
left_val = data1[k];
}
}

This seems like it could be replaced by a binary search, assuming that the inputs represent slots from a dgCMatrix; entries of i should always be sorted within each column specified by p.

    auto left_end=indices1.begin() + indptr1[i + 1];
    auto left_it=std::lower_bound(indices1.begin() + indptr1[i], left_end, j);
    double left_val = (left_it!=left_end && *left_it==j ? data1[left_it - indices1.begin()] : left_min);

This should be faster for any decently sized input matrix where you're getting >100 non-zero entries in each column (I don't know if this is particularly common?), and saves two lines as well.

library(uwot)
library(Matrix)

X <- rsparsematrix(10000, 10000, 0.1)
Y <- rsparsematrix(10000, 10000, 0.1)
Z <- as(X + Y, 'dgTMatrix')

system.time({
uwot:::general_sset_intersection_cpp(
    X@p, X@i, X@x,
    Y@p, Y@i, Y@x, 
    Z@i, Z@j, Z@x)
})
##    user  system elapsed 
##  19.838   0.000  19.843 

system.time({
uwot:::general_sset_intersection_cpp2( # modified as above
    X@p, X@i, X@x,
    Y@p, Y@i, Y@x, 
    Z@i, Z@j, Z@x)
})
##    user  system elapsed 
##    1.43    0.00    1.43 

(Not that I have any concerns about speed; I was trawling through the code for other reasons and just happened to notice this. Just something to consider.)

@jlmelville
Copy link
Owner

Thanks a lot. I think that part of the code was translated directly from the Python and I forgot to go back and make any sort of optimization or C++ify it.

Would you be interested in creating a PR and adding yourself to the author list as a contributor in the DESCRIPTION? I already have a couple of tests around general_simplicial_set_intersection which calls the C++ code, so you wouldn't have to add any tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants