-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Refactor the gather code so that it uses 'hashes' instead of 'mins' #1329
Conversation
Codecov Report
@@ Coverage Diff @@
## latest #1329 +/- ##
==========================================
+ Coverage 88.73% 94.06% +5.33%
==========================================
Files 123 96 -27
Lines 18125 14510 -3615
Branches 1399 1399
==========================================
- Hits 16083 13649 -2434
+ Misses 1803 622 -1181
Partials 239 239
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
Ready for review @luizirber |
sigh, or @bluegenes too :) |
ok ready for review & maybe merge :). This doesn't change anything so there's no real urgency. |
src/sourmash/search.py
Outdated
|
||
# calculate fractions wrt second denominator - metagenome size | ||
orig_query_mh = orig_query_mh.downsample(scaled=cmp_scaled) | ||
query_n_mins = len(orig_query_mh) | ||
f_unique_to_query = len(intersect_mins) / float(query_n_mins) | ||
query_n_hashes = len(orig_query_mh) # @CTB reuse ^^^? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just calling attn to your note here -- not sure if there's something else you wanted to do here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sigh. yes. thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other than your note, lgtm!
thanks! a bit more cleanup, if tests pass I will merge. |
During #1328, I got annoyed at the code in
search.py:gather_databases
because it used the oldmins
terminology and not the newhashes
terminology. So I fixed it.No functional changes, internal refactoring only.
NOTE: contains changes from #1328, so we should merge that one first. Or merge this into that one.
Checklist
make test
Did it pass the tests?make coverage
Is the new code covered?without a major version increment. Changing file formats also requires a
major version number increment.
changes were made?