Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Move similarity with abundance into Rust #808

Merged
merged 10 commits into from
Dec 27, 2019
Merged

Conversation

luizirber
Copy link
Member

@luizirber luizirber commented Dec 22, 2019

Similarity calculations had to call .merge() (to check if minhashes are compatible), .get_mins() (which copies a lot of data) and calculate the dotproduct in Python, so moving this into Rust makes things... faster. I used sourmash_benchmarks to run a compare test, and for that benchmark (using 380 signatures from the genbank k51 SBT) it takes 30s now (down from 90s in 2.3.1).

I also changed compare to avoid downsampling a signature if it is already at the right scaled.

(Incidentally, also avoids calling .merge() and leaking memory 🙈)

Checklist

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@codecov
Copy link

codecov bot commented Dec 22, 2019

Codecov Report

Merging #808 into master will decrease coverage by 0.3%.
The diff coverage is 22.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #808      +/-   ##
==========================================
- Coverage   79.74%   79.43%   -0.31%     
==========================================
  Files          45       45              
  Lines        6630     6673      +43     
  Branches      454      467      +13     
==========================================
+ Hits         5287     5301      +14     
- Misses       1042     1071      +29     
  Partials      301      301
Flag Coverage Δ
#pytests 90.36% <90%> (?)
#rusttests 49.3% <0%> (?)
Impacted Files Coverage Δ
src/sketch/minhash.rs 35.36% <0%> (-2.57%) ⬇️
sourmash/commands.py 87.94% <100%> (+0.06%) ⬆️
sourmash/_minhash.py 97.81% <75%> (+0.3%) ⬆️
sourmash/lca/lca_utils.py 96.78% <0%> (+0.05%) ⬆️
sourmash/index.py 98.46% <0%> (+0.07%) ⬆️
sourmash/sbt.py 86.61% <0%> (+0.07%) ⬆️
sourmash/sbtmh.py 85.21% <0%> (+0.21%) ⬆️
sourmash/sbt_storage.py 85.71% <0%> (+0.32%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ba5b5a1...4ed598f. Read the comment docs.

@luizirber luizirber changed the title [WIP] Move similarity with abundance into Rust Move similarity with abundance into Rust Dec 24, 2019
@luizirber luizirber changed the title Move similarity with abundance into Rust [MRG] Move similarity with abundance into Rust Dec 24, 2019
@luizirber
Copy link
Member Author

Ready for review @ctb @olgabot

The failing checks on coverage are because the Rust coverage is still a bit spotty, and I haven't figure out how to also calculate coverage when calling from Python. I can reimplement Python tests on the Rust side, but since they are working... It's working?

@luizirber luizirber requested a review from ctb December 27, 2019 01:40
@ctb
Copy link
Contributor

ctb commented Dec 27, 2019

LGTM!

In the future it'd be nice to avoid too many changes to infrastructure code (the workflows and Cargo.toml stuff) along with the other changes...

Copy link
Contributor

@ctb ctb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@luizirber
Copy link
Member Author

In the future it'd be nice to avoid too many changes to infrastructure code (the workflows and Cargo.toml stuff) along with the other changes...

Fair point...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants