[MRG] Fix Jaccard calculation to be intersection over union #150

Merged
merged 15 commits into from Apr 18, 2017

@ctb ctb commented Apr 4, 2017

Fixes #125: the Jaccard estimator is now computed as the intersection over the union of k-mers, which makes us MinHash-correct. This makes the comparison symmetric and presumably mash-compatible, but it also breaks many tests :)

  • Is it mergeable?
  • make test: Did it pass the tests?
  • make coverage: Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

TODO:

  • confirm that the default behavior is mash compatible
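
For readers unfamiliar with the estimator discussed in this PR, here is a minimal Python sketch of a bottom-k MinHash Jaccard estimate computed as intersection over union. It is an illustration under stated assumptions, not the code changed in this PR: the function name `jaccard_estimate`, the `num` parameter, and the use of plain integer lists for sketches are all hypothetical. The idea is to merge the two sketches, keep the `num` smallest hashes of the union, and count how many of those appear in both inputs.

```python
def jaccard_estimate(mins_a, mins_b, num):
    """Estimate Jaccard similarity from two bottom-`num` MinHash sketches.

    mins_a, mins_b: iterables of integer hash values, each holding the
    smallest hashes seen for one sequence. Hypothetical helper for
    illustration only; this is not sourmash's API.
    """
    set_a, set_b = set(mins_a), set(mins_b)
    # Bottom-`num` sketch of the union: merge both sketches, keep the smallest.
    union_sketch = sorted(set_a | set_b)[:num]
    if not union_sketch:
        return 0.0
    # Count hashes in the union sketch that occur in both input sketches.
    shared = sum(1 for h in union_sketch if h in set_a and h in set_b)
    return shared / len(union_sketch)


a = [1, 3, 5, 7, 9]
b = [1, 2, 3, 4, 5]
print(jaccard_estimate(a, b, num=5))  # 0.6
print(jaccard_estimate(a, b, num=5) == jaccard_estimate(b, a, num=5))  # True
```

Because both the union and the membership test treat the two sketches identically, swapping the arguments cannot change the result, which is the symmetry property mentioned above.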

@ctb ctb mentioned this pull request Apr 4, 2017

ctb commented Apr 4, 2017

@luizirber I ran headlong into Cython issues related to calling merge on KmerMinHash objects. I would like to solicit your opinions on my changes :)


ctb commented Apr 18, 2017

@luizirber all tests fixed. Review pls?

codecov-io commented Apr 18, 2017

Codecov Report

Merging #150 into master will decrease coverage by 0.21%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #150      +/-   ##
==========================================
- Coverage   85.52%   85.31%   -0.22%     
==========================================
  Files          13       13              
  Lines        1879     1879              
  Branches       52       52              
==========================================
- Hits         1607     1603       -4     
- Misses        263      265       +2     
- Partials        9       11       +2

Impacted Files                   Coverage Δ
sourmash_lib/kmer_min_hash.hh    88% <100%> (-2.29%) ⬇️
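
As a quick sanity check on the numbers in the report, the percentages can be recomputed from the hit, miss, and partial counts. This sketch assumes coverage is hits divided by the sum of hits, misses, and partials, which agrees with the Lines total shown here (1607 + 263 + 9 = 1879). The helper name `coverage_pct` is hypothetical.

```python
def coverage_pct(hits, misses, partials):
    """Coverage as a percentage, assuming total lines = hits + misses + partials."""
    total = hits + misses + partials
    return 100.0 * hits / total


master = coverage_pct(1607, 263, 9)   # counts from the "master" column
pr = coverage_pct(1603, 265, 11)      # counts from the "#150" column
print(f"{master:.2f}% -> {pr:.2f}% (change {pr - master:+.2f})")
```

Under that assumption the script reproduces the 85.52% and 85.31% figures and the 0.21 point decrease stated in the report summary.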

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c0e3476...d650b22.

@luizirber luizirber changed the title from "[WIP] Fix Jaccard calculation to be intersection over union" to "[MRG] Fix Jaccard calculation to be intersection over union" Apr 18, 2017

@luizirber luizirber left a comment

LGTM. We shouldn't need to change the merge to merge_abund, but it's not a problem that needs fixing in this PR.

@ctb ctb merged commit 1033479 into master Apr 18, 2017
@ctb ctb deleted the fix/jaccard branch April 18, 2017 22:43
@luizirber luizirber added this to Done in sourmash 2.0 May 19, 2017
@luizirber luizirber removed this from Done in sourmash 2.0 May 19, 2017