Do add_many in Rust, use it in LCA _signatures #826

luizirber merged 3 commits into master from rust_add_many on Jan 7, 2020

luizirber commented Jan 6, 2020

Calling .add_hash() on a MinHash sketch is fine, but if you're calling it repeatedly it's better to pass a list of hashes to .add_many() instead. Before this PR, add_many just called add_hash for each hash it was passed; now it passes the full list to Rust in a single call, which is much faster.

No changes for public APIs, and I changed the _signatures method in LCA to accumulate hashes for each sig first, and then set them all at once. This is way faster, but might use more intermediate memory (I'll evaluate this now).
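A minimal sketch of the accumulate-then-set pattern described above. `ToySketch`, `build_per_hash`, `build_batched`, and the `hash_to_sigs` layout are hypothetical stand-ins for illustration, not sourmash code; `boundary_calls` only counts how many times an imagined Python-to-Rust boundary would be crossed.

```python
from collections import defaultdict


class ToySketch:
    """Hypothetical stand-in for a MinHash sketch backed by native code."""

    def __init__(self):
        self.hashes = set()
        self.boundary_calls = 0  # pretend each method call crosses into Rust

    def add_hash(self, h):
        self.boundary_calls += 1
        self.hashes.add(h)

    def add_many(self, hashes):
        self.boundary_calls += 1  # one crossing for the whole batch
        self.hashes.update(hashes)


def build_per_hash(hash_to_sigs):
    """Old pattern: one add_hash call per (hash, signature) pair."""
    sketches = defaultdict(ToySketch)
    for h, sig_ids in hash_to_sigs.items():
        for sig_id in sig_ids:
            sketches[sig_id].add_hash(h)
    return dict(sketches)


def build_batched(hash_to_sigs):
    """New pattern: collect hashes per signature in a list, then add_many once."""
    pending = defaultdict(list)
    for h, sig_ids in hash_to_sigs.items():
        for sig_id in sig_ids:
            pending[sig_id].append(h)
    sketches = {}
    for sig_id, hashes in pending.items():
        sketch = ToySketch()
        sketch.add_many(hashes)
        sketches[sig_id] = sketch
    return sketches


# Made-up data: hash value -> ids of the signatures that contain it.
hash_to_sigs = {10: [0, 1], 20: [0], 30: [1], 40: [0, 1]}
slow = build_per_hash(hash_to_sigs)
fast = build_batched(hash_to_sigs)
```

Both functions end up with the same hashes per signature; the batched version holds the intermediate `pending` lists in memory until they are flushed, which is where the extra memory use would come from.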

Checklist

  • Is it mergeable?
  • make test: Did it pass the tests?
  • make coverage: Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

codecov bot commented Jan 6, 2020

Codecov Report

Merging #826 into master will increase coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #826      +/-   ##
==========================================
+ Coverage   79.17%   79.21%   +0.03%     
==========================================
  Files          45       45              
  Lines        6705     6707       +2     
  Branches      469      469              
==========================================
+ Hits         5309     5313       +4     
+ Misses       1096     1094       -2     
  Partials      300      300
| Flag | Coverage Δ |
| --- | --- |
| #pytests | 90.42% <100%> (+0.04%) ⬆️ |
| #rusttests | 48.71% <ø> (ø) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| sourmash/lca/lca_utils.py | 96.88% <100%> (+0.02%) ⬆️ |
| sourmash/_minhash.py | 97.81% <100%> (-0.01%) ⬇️ |
| sourmash/utils.py | 75.43% <0%> (+3.5%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7e99e85...c6cbdf0.

luizirber commented Jan 7, 2020

> No changes for public APIs, and I changed the _signatures method in LCA to accumulate hashes for each sig first, and then set them all at once. This is way faster, but might use more intermediate memory (I'll evaluate this now).

As expected, it's using more memory. I tried using both a set and a list to accumulate hashes.

| version | mem | time |
| --- | --- | --- |
| original | 1.5 GB | 160 s |
| set | 3.8 GB | 80 s |
| list | 1.7 GB | 73 s |

So I kept the list version, since the memory increase is not so bad (and it's faster than the set).
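A rough way to see why a set might cost more than a list for the same hashes, as a sketch only: the values below are made-up stand-ins for hash values, `sys.getsizeof` reports just the container (not the integer objects it holds), and the result is specific to CPython. It does not reproduce the numbers in the table above.

```python
import sys

n = 100_000
hashes = list(range(n))  # made-up stand-ins for 64-bit hash values

as_list = list(hashes)
as_set = set(hashes)

# Container overhead only; the int objects themselves are shared by both.
list_bytes = sys.getsizeof(as_list)
set_bytes = sys.getsizeof(as_set)
print(f"list: {list_bytes} bytes, set: {set_bytes} bytes")
```

A list stores one pointer per element, while a set keeps a sparse hash table with a stored hash alongside each entry, so the set container is expected to be the larger of the two here.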

luizirber changed the title from "[WIP] do add_many in Rust, use it in LCA _signatures" to "Do add_many in Rust, use it in LCA _signatures" on Jan 7, 2020
luizirber added the rust label on Jan 7, 2020
luizirber requested review from ctb and olgabot on Jan 7, 2020
ctb approved these changes on Jan 7, 2020
luizirber merged commit 6a2a14e into master on Jan 7, 2020
20 checks passed

  • Check
  • build
  • test (beta)
  • test (stable)
  • test (windows)
  • test (macos)
  • test_all_features
  • coverage
  • Lints
  • Check if wasm-pack builds a valid package for the sourmash crate
  • Run tests under wasm32-wasi
  • Publish (dry-run)
  • minimum_rust_version
  • LGTM analysis: JavaScript: No code changes detected
  • LGTM analysis: C/C++: No new or fixed alerts
  • LGTM analysis: Python: No new or fixed alerts
  • Travis CI - Pull Request: Build Passed
  • codecov/patch: 100% of diff hit (target 79.17%)
  • codecov/project: 79.21% (+0.03%) compared to 7e99e85
  • netlify/sourmash-docs/deploy-preview: Deploy preview canceled.

luizirber deleted the rust_add_many branch on Jan 7, 2020
luizirber mentioned this pull request on Jan 13, 2020
5 of 5 tasks complete
luizirber added this to the 3.1 milestone on Jan 14, 2020