[MRG] Refactor the gather code so that it uses 'hashes' instead of 'mins' #1329

Merged: 5 commits, Feb 15, 2021

@ctb ctb commented Feb 15, 2021

During #1328, I got annoyed at the code in search.py:gather_databases because it used the old mins terminology and not the new hashes terminology. So I fixed it.

No functional changes, internal refactoring only.

NOTE: contains changes from #1328, so we should merge that one first. Or merge this into that one.

Checklist

  • Is it mergeable?
  • make test: Did it pass the tests?
  • make coverage: Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

codecov bot commented Feb 15, 2021

Codecov Report

Merging #1329 (7389fe1) into latest (bf5eeba) will increase coverage by 5.33%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           latest    #1329      +/-   ##
==========================================
+ Coverage   88.73%   94.06%   +5.33%     
==========================================
  Files         123       96      -27     
  Lines       18125    14510    -3615     
  Branches     1399     1399              
==========================================
- Hits        16083    13649    -2434     
+ Misses       1803      622    -1181     
  Partials      239      239              

Flag     Coverage Δ
python   94.06% <100.00%> (+<0.01%) ⬆️
rust     ?

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/sourmash/search.py 91.22% <100.00%> (+0.07%) ⬆️
src/core/tests/test.rs
src/core/src/ffi/utils.rs
src/core/src/index/sbt/mod.rs
src/core/src/wasm.rs
src/core/src/index/bigsi.rs
src/core/src/sketch/hyperloglog/mod.rs
src/core/src/ffi/cmd/compute.rs
src/core/src/sketch/minhash.rs
src/core/src/index/linear.rs
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bf5eeba...7389fe1.

ctb commented Feb 15, 2021

Ready for review @luizirber

ctb commented Feb 15, 2021

sigh, or @bluegenes too :)

ctb commented Feb 15, 2021

ok ready for review & maybe merge :). This doesn't change anything so there's no real urgency.


  # calculate fractions wrt second denominator - metagenome size
  orig_query_mh = orig_query_mh.downsample(scaled=cmp_scaled)
- query_n_mins = len(orig_query_mh)
- f_unique_to_query = len(intersect_mins) / float(query_n_mins)
+ query_n_hashes = len(orig_query_mh) # @CTB reuse ^^^?
Contributor

just calling attn to your note here -- not sure if there's something else you wanted to do here

Contributor Author

sigh. yes. thanks.
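
For context on the snippet under review: the quantity being computed is the fraction of the query's hashes that fall in the intersection with a match. The following is a minimal sketch of that arithmetic using plain Python sets as stand-ins for sketches. It is not the code from search.py; the function name `fraction_unique_to_query` and the example hash values are hypothetical, and only `query_n_hashes` and `f_unique_to_query` echo names from the snippet.

```python
def fraction_unique_to_query(query_hashes, intersect_hashes):
    """Return the fraction of query hashes that are in the intersection.

    Hypothetical helper for illustration; plain sets stand in for sketches.
    """
    query_n_hashes = len(query_hashes)
    if query_n_hashes == 0:
        # Avoid dividing by zero for an empty query.
        return 0.0
    # In Python 3, "/" is true division, so no float() cast is needed.
    return len(intersect_hashes) / query_n_hashes


# Made-up hash values, for illustration only.
query_hashes = {11, 22, 33, 44}   # stand-in for the downsampled query
match_hashes = {22, 44, 55}       # stand-in for one database match
intersect_hashes = query_hashes & match_hashes

f_unique_to_query = fraction_unique_to_query(query_hashes, intersect_hashes)
print(f_unique_to_query)
```

Computing the denominator once and passing it along, rather than calling `len()` on the same object twice, is presumably the kind of reuse the inline note in the snippet is asking about.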

@bluegenes bluegenes left a comment

other than your note, lgtm!

ctb commented Feb 15, 2021

thanks! a bit more cleanup, if tests pass I will merge.

@ctb ctb merged commit 3bfd0fa into latest Feb 15, 2021
@ctb ctb deleted the refactor/gather_code branch February 15, 2021 20:15