Terminal Ns not recognized as missing #29

Open
ktmeaton opened this issue May 6, 2022 · 1 comment

ktmeaton commented May 6, 2022

While investigating cov-lineages/pango-designation#590, I noticed that samples with the BA.2 S2M deletion (29734:29759) were being incorrectly visualized as having reference bases in sc2rf:

Consensus View:
image

sc2rf View:
image

I think this could be for a couple of reasons:

  1. When --enable-deletions is used, perhaps deletions should not be considered missing data?

    missings_matches = ["N"]
    if not args.enable_deletions:
        missings_matches.append("-")
  2. I think there is missing logic when detecting a run of Ns, to catch a run that extends to the end of the genome?

    if s in missings_matches:
        if start_n == -1:
            start_n = i  # mark the start of possible run of N's
    elif start_n >= 0:
        # we've been tracking a run of N's, this base marks the end
        missings.append((start_n, i-1))  # Python-style (closed, open) interval
        start_n = -1

    # Missing logic to catch missing data at the end of the genome?
    if i == len(reference) and s in missings_matches:
        missings.append((start_n, i-1))
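
To illustrate both points outside of sc2rf, here is a minimal standalone sketch. It is not the sc2rf code: the function name `find_missing_runs`, the `enable_deletions` keyword, and the use of 0-based closed `(start, end)` intervals are all assumptions made for the example. It treats `-` as missing only when deletions are not enabled, and it flushes a run that is still open when the sequence ends.

```python
def find_missing_runs(seq, enable_deletions=False):
    """Return (start, end) pairs for each run of missing characters.

    Hypothetical helper for illustration only; indices are 0-based and
    both ends of each interval are inclusive (an assumption, not sc2rf's
    convention).
    """
    # "N" is always missing; "-" counts as missing only when deletions
    # are NOT being tracked as real features.
    missing_chars = {"N"}
    if not enable_deletions:
        missing_chars.add("-")

    runs = []
    start = -1  # -1 means "not currently inside a run"
    for i, base in enumerate(seq):
        if base in missing_chars:
            if start == -1:
                start = i  # first missing character of a new run
        elif start >= 0:
            # a non-missing base closes the run that ended at i - 1
            runs.append((start, i - 1))
            start = -1

    # Terminal case: the sequence ended while a run was still open.
    if start >= 0:
        runs.append((start, len(seq) - 1))
    return runs


print(find_missing_runs("NNACGT--ANN"))
# [(0, 1), (6, 7), (9, 10)]
print(find_missing_runs("NNACGT--ANN", enable_deletions=True))
# [(0, 1), (9, 10)]
```

Checking for an open run once after the loop, rather than testing the index on every iteration, avoids depending on whether the loop counter is 0-based or 1-based.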

With these changes, the sc2rf output more closely matches the consensus sequence/my expectation:

image

I think this is a bug, but if it's the intended behaviour for deletions, please let me know. Thanks!

ktmeaton commented May 6, 2022

Sorry, the command used for both of these is:

python3 sc2rf.py pango_designation_590.fasta --ansi --unique 1 --enable-deletions
