Creating a function to compare sequences of DNA #57
adding a shuffling feature
Can we rename the similarity function so that its name states the type of similarity computed?
for more information, see https://pre-commit.ci
Adding functions to calculate Jaccard similarity of two sequences.
adding the kmer metrics
Done.
I've also added a function to ingest FASTA files, though I'm not sure if it's needed.
Many thanks @hssn-20!
It looks like something unintentional may have happened here, resulting in changes to many more files than you intended to edit. The issue causing this was resolved by @ssenan in #99.
Could you please synchronize your fork's default branch to the tip of this upstream, make a branch for this particular contribution such as hssn-20:kmer_metric, and submit the PR from that branch instead?
You may find some of the information in the info pane of the project here helpful.
We will likely close this PR in the meantime, but you can still find it in the list of closed PRs. If we misunderstood what happened here we can always reopen it.
Related to #39.
Please rebase after #99 or later and resubmit from a non-default branch on your fork so that we can see only the intended changes. Apologies for the inconvenience and thank you!
The functions in this file use the sourmash library to compare sequences of DNA. I've set defaults for a few parameters (i.e. the number of hashes and the k-mer size); we can change them later if needed.
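For reviewers who want a feel for what the two parameters control, here is a rough, standard-library-only sketch of k-mer Jaccard similarity and a bottom-k MinHash estimate of it. It is not the code in this PR and does not use sourmash's API: the function names, the SHA-1 based hash, and the default values `k=3` and `num_hashes=100` are all assumptions made for illustration, and it ignores reverse complements.

```python
import hashlib


def kmers(seq, k):
    """Return the set of overlapping k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k=3):
    """Exact Jaccard similarity of the two k-mer sets: |A & B| / |A | B|."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        return 0.0  # neither sequence is long enough to contain a k-mer
    return len(a & b) / len(union)


def _hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash, not sourmash's)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def minhash_sketch(seq, k=3, num_hashes=100):
    """Keep only the num_hashes smallest k-mer hash values (bottom-k sketch)."""
    return set(sorted(_hash_kmer(km) for km in kmers(seq, k))[:num_hashes])


def minhash_jaccard(seq_a, seq_b, k=3, num_hashes=100):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    sketch_a = minhash_sketch(seq_a, k, num_hashes)
    sketch_b = minhash_sketch(seq_b, k, num_hashes)
    # The smallest num_hashes values of the merged sketches act as a
    # random sample of the union; count how many fall in both sketches.
    merged = sorted(sketch_a | sketch_b)[:num_hashes]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in sketch_a and h in sketch_b)
    return shared / len(merged)
```

A larger k makes matches more specific, while more hashes give a tighter estimate at the cost of a bigger sketch. When a sequence has fewer distinct k-mers than `num_hashes`, the sketch keeps every k-mer and the estimate matches the exact value (barring hash collisions).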