add functionality to sourmash signature to extract common hashes? #603

Open
ctb opened this issue Jan 5, 2019 · 5 comments

@ctb

ctb commented Jan 5, 2019

see #587 (comment) by @taylorreiter - "It would be useful if we could have a threshold for the intersection -- like if the hash occurs in 80% of signatures, give it to me in the intersection."
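
A minimal sketch of the thresholded intersection described above, just to pin down the semantics. It works on plain Python sets of integer hashes and deliberately does not call any sourmash API; the function name `common_hashes` and the `threshold` parameter are hypothetical, not existing sourmash features.

```python
from collections import Counter


def common_hashes(hash_sets, threshold=0.8):
    """Return the hashes present in at least `threshold` (a fraction in
    (0, 1]) of the given hash collections.

    `hash_sets` is any sequence of iterables of integer hashes, one per
    signature. How the hashes are pulled out of real signatures is left
    to the caller (an assumption of this sketch).
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")

    total = len(hash_sets)
    if total == 0:
        return set()

    # Count, for every hash, how many signatures contain it.
    counts = Counter()
    for hashes in hash_sets:
        counts.update(set(hashes))  # set() so a hash counts once per signature

    # Keep hashes whose occurrence fraction meets the threshold.
    return {h for h, n in counts.items() if n / total >= threshold}
```

With `threshold=1.0` this reduces to a strict intersection; lower values give the "occurs in 80% of signatures" behavior. Note that it holds every distinct hash in memory at once, which is fine for a handful of signatures but is part of why a general solution over large collections needs an index.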

@ctb

ctb commented Jan 5, 2019

note that this functionality is not trivial, and enabling this kind of general query is in fact one of the goals of the works in progress at https://github.com/ctb/2017-sourmash-revindex and in #604 :)

@ctb

ctb commented Jan 5, 2019

(not trivial doesn't mean it's hard; we have several implementations of such a thing, including e.g. https://github.com/ctb/2017-sourmash-revindex/blob/master/hashes-to-numpy-2.py)

@ctb

ctb commented Apr 18, 2020

It would now (as of #946) be very easy to add a filtering option to LCA_Database to support this functionality.

@ctb

ctb commented Jan 26, 2022

similarly for SqliteIndex (#1808): this is quite easy there.

@ctb

ctb commented Sep 23, 2023

this is now generically available as a plugin: https://github.com/ctb/sourmash_plugin_commonhash

also ref #2383
