Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Debug and fix SBT search #484

Merged
merged 10 commits into from Jun 5, 2018
Merged

[MRG] Debug and fix SBT search #484

merged 10 commits into from Jun 5, 2018

Conversation

ctb
Copy link
Contributor

@ctb ctb commented Jun 2, 2018

We have a persistent problem in that sourmash sbt search does not find everything it should. Containment analysis (sourmash gather) seems to work fine.

First task: figure out how to reproduce it quickly & easily.

  • add utils/check-tree.py to systematically search trees for their leaves

Second task: scour the codebase with flame and acid.

Ref #244 and #454.

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@codecov-io
Copy link

codecov-io commented Jun 2, 2018

Codecov Report

Merging #484 into master will decrease coverage by 0.03%.
The diff coverage is 92.15%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #484      +/-   ##
=========================================
- Coverage   90.54%   90.5%   -0.04%     
=========================================
  Files          33      33              
  Lines        4906    4931      +25     
  Branches       36      36              
=========================================
+ Hits         4442    4463      +21     
- Misses        463     467       +4     
  Partials        1       1
Impacted Files Coverage Δ
sourmash/sbtmh.py 86.25% <85.71%> (-0.11%) ⬇️
sourmash/sbt.py 83.29% <94.59%> (+0.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5791cc...c4fd90d. Read the comment docs.

@ctb
Copy link
Contributor Author

ctb commented Jun 2, 2018

It looks to my naive eye like the problem is due to use of max instead of min to calculate the highest possible Jaccard similarity below an internal tree node. If so, this is a really embarrassing brain fart on my part.

@ctb
Copy link
Contributor Author

ctb commented Jun 2, 2018

@luizirber your thoughts welcome ... might be ready for merging.

@ctb ctb changed the title [WIP] Debug and fix SBT search [MRG] Debug and fix SBT search Jun 2, 2018
@luizirber
Copy link
Member

I aggree with the changes, but trying to think how to support them on 2.0.0a6 without breaking existing DBs (or sourmash installation): probably need to keep max_n_below as a synonym for min_n_below in the metadata?

@ctb
Copy link
Contributor Author

ctb commented Jun 4, 2018 via email

@luizirber
Copy link
Member

luizirber commented Jun 5, 2018

I bumped the version to 2.0.0a7.

On the DB side, two things:

@luizirber luizirber merged commit ad9999e into master Jun 5, 2018
@luizirber luizirber deleted the bug/check_tree branch June 5, 2018 19:42
@ctb
Copy link
Contributor Author

ctb commented Jun 6, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants