Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep smaller temp lists in _signatures #840

Merged
merged 6 commits into from Jan 15, 2020
Merged

Keep smaller temp lists in _signatures #840

merged 6 commits into from Jan 15, 2020

Conversation

@luizirber
Copy link
Member

luizirber commented Jan 13, 2020

Implement suggestion posted in lobste.rs:

I wonder if you could get a reasonable compromise between cpu and memory consumption by growing the list till it hits (say) 64 elements then flushing it? That way you’re amortizing the per-call ffi overhead by buffering, but your buffer doesn’t grow very large.

Turns out this works great!

version mem time
original 1.5 GB 160s
set 3.8GB 80s
list 1.7GB 73s
This PR 1.5GB 77s

Runtime increase is small (77s versus 73s in my tests) and memory consumption is stable (down to 1.5GB again, instead of using 1.7GB as in #826). While this is not a big change for current LCA indices, as they increase in number of signatures in the future this can potentially save quite a bit of memory.

Additionally, fix two issues detected during writing of https://blog.luizirber.org/2020/01/10/sourmash-pr/:

  • hashes_ptr as *mut u64 should have been hashes_ptr as *const u64 (changing mutability here doesn't matter much, but can be dangerous as code changes in the future)
  • Lack of benchmark for add_many. Added both time and mem benchmarks for it. (It would also be nice to have an LCA _signatures benchmark, but that involves more infra).

Checklist

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?
@codecov

This comment has been minimized.

Copy link

codecov bot commented Jan 13, 2020

Codecov Report

Merging #840 into master will decrease coverage by 0.36%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #840      +/-   ##
==========================================
- Coverage   78.41%   78.04%   -0.37%     
==========================================
  Files          82       82              
  Lines        6884     6891       +7     
  Branches      469      469              
==========================================
- Hits         5398     5378      -20     
- Misses       1185     1212      +27     
  Partials      301      301
Flag Coverage Δ
#pytests 89.48% <100%> (+0.01%) ⬆️
#rusttests 48.54% <ø> (ø) ⬆️
Impacted Files Coverage Δ
sourmash/commands.py 83.54% <ø> (ø) ⬆️
sourmash/search.py 91.26% <ø> (ø) ⬆️
sourmash/sbt.py 86.59% <100%> (+0.17%) ⬆️
sourmash/sbtmh.py 84.5% <100%> (ø) ⬆️
sourmash/sbt_storage.py 76.66% <0%> (-10%) ⬇️
sourmash/command_compute.py 86.26% <0%> (-9.9%) ⬇️
sourmash/lca/command_gather.py 83.66% <0%> (-0.11%) ⬇️
sourmash/lca/lca_utils.py 96.03% <0%> (-0.02%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5ff6306...1977557. Read the comment docs.

sourmash/lca/lca_utils.py Outdated Show resolved Hide resolved
luizirber added 3 commits Jan 13, 2020
(I hate you python 2)
@luizirber luizirber added this to the 3.1 milestone Jan 14, 2020
@ctb
ctb approved these changes Jan 15, 2020
ctb and others added 2 commits Jan 15, 2020
@luizirber luizirber merged commit 34514e7 into master Jan 15, 2020
18 of 19 checks passed
18 of 19 checks passed
archive
Details
Check
Details
build
Details
test (beta)
Details
test (stable)
Details
test (windows)
Details
test (macos)
Details
test_all_features
Details
coverage
Details
Lints
Details
Check if wasm-pack builds a valid package for the sourmash crate
Details
Run tests under wasm32-wasi Run tests under wasm32-wasi
Details
Publish (dry-run)
Details
minimum_rust_version
Details
Travis CI - Pull Request Build Errored
Details
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
netlify/sourmash-docs/deploy-preview Deploy preview canceled.
Details
@luizirber luizirber deleted the add_many_more branch Jan 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.