Creating a function to compare sequences of DNA #57

hssn-20 · 2022-11-23T11:50:13Z

The functions in this file use the sourmash library to compare DNA sequences. I've set defaults for a few parameters (i.e. the number of hashes and the k-mer size); we can change them later.
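
For readers unfamiliar with the approach, here is a minimal sketch of what a k-mer based comparison computes. It is not the code in this PR and it does not call sourmash: it uses only the standard library, and the function names (`kmers`, `exact_jaccard`, `minhash_jaccard`) and defaults (`ksize=5`, `num_hashes=100`) are hypothetical, chosen for illustration. It shows the exact Jaccard similarity of two k-mer sets next to a bottom-k MinHash estimate, which is where the "number of hashes" and "k-mer size" parameters come in.

```python
import hashlib


def kmers(seq, ksize):
    """Return the set of distinct k-mers (substrings of length ksize) in seq."""
    seq = seq.upper()
    return {seq[i:i + ksize] for i in range(len(seq) - ksize + 1)}


def exact_jaccard(a, b, ksize=5):
    """Exact Jaccard similarity: |A & B| / |A | B| over the two k-mer sets."""
    ka, kb = kmers(a, ksize), kmers(b, ksize)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def _hash(kmer):
    # Illustrative choice: the first 8 bytes of SHA-1 as a 64-bit integer.
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def _bottom_sketch(seq, ksize, num_hashes):
    """Keep only the num_hashes smallest k-mer hash values."""
    return set(sorted({_hash(k) for k in kmers(seq, ksize)})[:num_hashes])


def minhash_jaccard(a, b, ksize=5, num_hashes=100):
    """Bottom-k MinHash estimate of the Jaccard similarity."""
    sa = _bottom_sketch(a, ksize, num_hashes)
    sb = _bottom_sketch(b, ksize, num_hashes)
    # The smallest hashes of the merged sketches act as a sample of the union.
    merged = set(sorted(sa | sb)[:num_hashes])
    if not merged:
        return 0.0
    # Fraction of that sample present in both sketches.
    return len(merged & sa & sb) / len(merged)
```

When both sequences have fewer distinct k-mers than `num_hashes`, the sketch holds every hash and the estimate matches the exact value (barring hash collisions); for longer sequences it is an approximation whose accuracy improves as `num_hashes` grows.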

mateibejan1 · 2022-11-29T08:37:55Z

Can we rename the similarity function to contain the type of similarity computed?

hssn-20 · 2023-03-09T20:25:59Z

Done

hssn-20 · 2023-03-09T20:27:50Z

I've also added a function to ingest FASTA files, though I'm not sure if it's needed.
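
As a rough illustration of what FASTA ingestion involves (a sketch only, not the function added in this PR; the name `read_fasta` and its signature are hypothetical), a minimal parser can walk the lines, start a new record at each `>` header, and join the sequence lines that follow:

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Accepts an open file handle or any iterable of strings. Sequence lines
    that appear before the first '>' header are ignored.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

With a file on disk this could be used as `with open(path) as fh: records = list(read_fasta(fh))`, where `path` is whatever FASTA file is being compared.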

cameronraysmith

Many thanks @hssn-20!

It looks like something unintentional may have happened here, resulting in changes to many more files than you intended to edit. The issue causing this was resolved by @ssenan in #99.

Could you please synchronize your fork's default branch with the tip of this upstream, make a branch for this particular contribution, such as hssn-20:kmer_metric, and submit the PR from that branch instead?

You may find some of the information in the info pane of the project here helpful.

We will likely close this PR in the meantime, but you can still find it in the list of closed PRs. If we misunderstood what happened here we can always reopen it.

Related to #39.

cameronraysmith · 2023-03-10T02:37:39Z

Please rebase after #99 or later and resubmit from a non-default branch on your fork so that we can see only the intended changes.

Apologies for the inconvenience and thank you!

hssn-20 added 2 commits November 21, 2022 20:41

Add files via upload

6632014

Update k_mer_based_metric.py

d769418

adding a shuffling feature

github-actions bot added the stale label Mar 2, 2023

hssn-20 mentioned this pull request Mar 2, 2023

Implement a sequence quality metric based on k-mer composition #42

Closed

cameronraysmith linked an issue Mar 3, 2023 that may be closed by this pull request

Implement a sequence quality metric based on k-mer composition #42

Closed

cameronraysmith removed the stale label Mar 3, 2023

hssn-20 and others added 3 commits March 8, 2023 17:17

Merge branch 'dna-diffusion' into dna-diffusion

83792d1

[pre-commit.ci] auto fixes from pre-commit.com hooks

33a0b1f

for more information, see https://pre-commit.ci

adding the kmer metrics

5a3802f

Adding functions to calculate Jaccard similarity of two sequences.

hssn-20 mentioned this pull request Mar 9, 2023

adding the kmer metrics hssn-20/DNA-Diffusion#1

Merged

6 tasks

hssn-20 and others added 6 commits March 9, 2023 20:12

Merge pull request #1 from hssn-20/hssn-20-kmer-2

0e4e1f3

adding the kmer metrics

[pre-commit.ci] auto fixes from pre-commit.com hooks

376c033

for more information, see https://pre-commit.ci

Update k_mer_based_metric.py

a5df01b

[pre-commit.ci] auto fixes from pre-commit.com hooks

336d218

for more information, see https://pre-commit.ci

Update k_mer_based_metric.py

155475c

[pre-commit.ci] auto fixes from pre-commit.com hooks

1f6cddd

for more information, see https://pre-commit.ci

cameronraysmith added this to the 0.0.1 milestone Mar 10, 2023

cameronraysmith added the metrics modifies definition or measurement of model metrics label Mar 10, 2023

cameronraysmith requested review from ssenan, mateibejan1 and LucasSilvaFerreira March 10, 2023 00:49

cameronraysmith requested changes Mar 10, 2023

cameronraysmith removed the request for review from ssenan March 10, 2023 00:57

ssenan self-requested a review March 10, 2023 01:18

cameronraysmith closed this Mar 10, 2023

cameronraysmith linked an issue Mar 10, 2023 that may be closed by this pull request

Implement a sequence quality metric based on k-mer composition #39

Closed

cameronraysmith removed a link to an issue Mar 11, 2023

Implement a sequence quality metric based on k-mer composition #39

Closed
