[MRG] require scaled signatures for containment #1381

Merged
merged 7 commits into from
Mar 9, 2021

Contributor

@ctb ctb commented Mar 9, 2021

I stumbled across some code that shouldn't have worked while writing a new test for #1374, and discovered that we are allowing contained_by to be called on num MinHash objects.

This is unambiguously wrong and I think we should remove it from the code base. Only scaled MinHashes allow accurate estimation of containment. I guess we didn't know that a few years back, though, so it crept into the code base :).

#1345 is relevant - but while there are reasonable places to call count_common on regular MinHashes, there are no reasonable places to call contained_by, AFAICT.
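
To illustrate the distinction drawn above, here is a toy sketch in plain Python. It is not sourmash code: `ToySketch`, `toy_hash`, and the item names are hypothetical stand-ins, and the guard in `contained_by` only approximates the kind of check this PR describes. The assumption being illustrated is that a scaled sketch keeps every hash under a fixed cutoff, so the same cutoff applies to both sets, while a num sketch keeps a fixed count of the smallest hashes regardless of set size.

```python
import hashlib

MAX_HASH = 2**64  # toy hash space: 64-bit values


def toy_hash(item: str) -> int:
    """Map a string to a 64-bit integer (stand-in for a k-mer hash)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToySketch:
    """Hypothetical stand-in for a MinHash; NOT the sourmash class."""

    def __init__(self, items, *, num=0, scaled=0):
        if bool(num) == bool(scaled):
            raise ValueError("set exactly one of num or scaled")
        self.num, self.scaled = num, scaled
        hashes = sorted({toy_hash(i) for i in items})
        if scaled:
            # scaled: keep every hash below a fixed cutoff
            cutoff = MAX_HASH // scaled
            self.hashes = {h for h in hashes if h < cutoff}
        else:
            # num: keep only the `num` smallest hashes
            self.hashes = set(hashes[:num])

    def count_common(self, other):
        """Number of shared hashes; defined for either sketch type."""
        return len(self.hashes & other.hashes)

    def contained_by(self, other):
        """Estimate the fraction of this set that is present in `other`."""
        if not (self.scaled and other.scaled):
            raise TypeError("containment requires scaled sketches")
        if self.scaled != other.scaled:
            raise ValueError("mismatched scaled values")
        if not self.hashes:
            return 0.0
        return self.count_common(other) / len(self.hashes)


small_items = [f"kmer{i}" for i in range(1_000)]
large_items = [f"kmer{i}" for i in range(10_000)]  # superset of small_items

# Scaled sketches: every retained hash of the small set is also retained
# for the large set, so the estimate matches the true containment.
small_scaled = ToySketch(small_items, scaled=10)
large_scaled = ToySketch(large_items, scaled=10)
scaled_estimate = small_scaled.contained_by(large_scaled)

# Num sketches: the 500 smallest hashes of the large set mostly come from
# items outside the small set, so a naive ratio understates containment.
small_num = ToySketch(small_items, num=500)
large_num = ToySketch(large_items, num=500)
naive_num_ratio = small_num.count_common(large_num) / len(small_num.hashes)
```

Here the small set is a strict subset of the large one, so the true containment is 1.0. The scaled estimate recovers that, while the ratio computed from the num sketches falls short of it, and calling `small_num.contained_by(large_num)` raises `TypeError` in this toy version.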

NOTE: this breaks backwards compatibility, but I think it rectifies a bug, so we're OK.

Checklist

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@ctb ctb changed the title from "[wrequire scaled signatures for containment" to "[WIP] require scaled signatures for containment" Mar 9, 2021
@codecov
codecov bot commented Mar 9, 2021

Codecov Report

Merging #1381 (5b3a8b9) into latest (001cd35) will increase coverage by 5.32%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           latest    #1381      +/-   ##
==========================================
+ Coverage   88.84%   94.16%   +5.32%     
==========================================
  Files         123       96      -27     
  Lines       18270    14707    -3563     
  Branches     1409     1410       +1     
==========================================
- Hits        16232    13849    -2383     
+ Misses       1800      621    -1179     
+ Partials      238      237       -1     
| Flag | Coverage Δ |
|---|---|
| python | 94.16% <100.00%> (+0.02%) ⬆️ |
| rust | ? |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| src/sourmash/minhash.py | 93.35% <100.00%> (+0.04%) ⬆️ |
| src/sourmash/sourmash_args.py | 92.24% <100.00%> (+0.32%) ⬆️ |
| tests/test__minhash.py | 99.31% <100.00%> (+<0.01%) ⬆️ |
| tests/test_sourmash.py | 99.42% <100.00%> (+<0.01%) ⬆️ |
| src/core/src/wasm.rs | |
| src/core/src/index/bigsi.rs | |
| src/core/src/errors.rs | |
| src/core/src/encodings.rs | |
| src/core/src/ffi/signature.rs | |
| src/core/src/ffi/minhash.rs | |
| ... and 22 more | |

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 001cd35...5b3a8b9.

@ctb ctb changed the title from "[WIP] require scaled signatures for containment" to "[MRG] require scaled signatures for containment" Mar 9, 2021
@ctb
Contributor Author

ctb commented Mar 9, 2021

Ready for review @luizirber @bluegenes

@ctb
Contributor Author

ctb commented Mar 9, 2021

yay, tests pass!

@luizirber luizirber merged commit 55741dc into latest Mar 9, 2021
@luizirber luizirber deleted the fix/contained_by branch March 9, 2021 22:21
@ctb
Contributor Author

ctb commented Mar 9, 2021

🎉

2 participants