Lineages B.1.526, B.1.526.1 and B.1.526.2 #45
Comments
Hi @andersonbrito, thanks for letting us know. I've updated the classifications for B.1.526, B.1.526.1 and B.1.526.2 from your genomes; the update should appear in release 1.1.14.
So the two unassigned ones that split these branches are probably the problematic sequences. Also, that long one looks odd.
@andersonbrito Can you send the alignment to me offline?
FWIW, S:L5F (C21575T) is in the Problematic Sites list as highly homoplasic, so we mask it out when building our trees.
Problematic Sites initial report on virological (there have been several updates, but 21575 was there from the beginning): https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473
Problematic Sites VCF file: https://raw.githubusercontent.com/W-L/ProblematicSites_SARS-CoV2/master/problematic_sites_sarsCov2.vcf
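As a rough illustration of what masking a problematic site involves, here is a minimal Python sketch. It is not the pipeline used for the Pango trees: the function names are hypothetical, the input is assumed to be tab-separated VCF text with the standard column order (position in column 2, FILTER in column 7), and the FILTER value `mask` used to select sites is an assumption to check against the actual file. Sequences are assumed to be aligned to the reference, so that position 21575 maps to index 21574.

```python
def parse_masked_positions(vcf_text, filter_value="mask"):
    """Collect 1-based positions whose FILTER column equals filter_value.

    Assumes standard VCF column order: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
    The default filter_value is an assumption, not taken from the thread.
    """
    positions = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the header line
        fields = line.split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        if fields[6] == filter_value:
            positions.add(int(fields[1]))
    return positions


def mask_sequence(aligned_seq, positions, mask_char="N"):
    """Replace the given 1-based positions with mask_char.

    aligned_seq must be in reference coordinates; positions outside
    the sequence are ignored.
    """
    chars = list(aligned_seq)
    for pos in positions:
        if 1 <= pos <= len(chars):
            chars[pos - 1] = mask_char
    return "".join(chars)
```

Applied to every sequence in an alignment before tree building, this hides the listed sites from the phylogenetic inference, which is the effect described above for C21575T.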
I think it could be pertinent here because it is associated with Q1011H and G1946S. I suspect all three need to be pushed back to the common ancestor of B.1.526 and B.1.526.2, with the two unassigned ones (the thin branches) being the problems (presumably they are missing the corresponding pairs). Homoplasic sites can just be neutral sites that are flip-flopping and thus actually provide useful fine-scale information (I don't know the pattern of L5F; will look into it).
Also, C21575T (L5F) has occurred within patients, suggesting that it is a true homoplasy and so likely phylogenetically informative.
Possibly a recombinant but more likely an artifactual mosaic. @andersonbrito - worth checking these two out for mixtures or contamination issues.
Thank you for letting us know, Andrew.
We will check the raw data of those genomes.
If those genomes are excluded, is the remaining data enough to distinguish the three lineages?
I'll send you the alignment in a minute, offline.
*Anderson Brito*
On Tue, 13 Apr 2021 at 18:03, Andrew Rambaut wrote:
> Possibly a recombinant but more likely an artifactual mosaic. @andersonbrito - worth checking these two out for mixtures or contamination issues.
@rambaut, I have just sent you an email with alignment and tree files.
@rambaut: I'm not sure what version of pangolin is being run by GISAID, but here's the latest I see on GISAID currently. Take a look at: https://nextstrain.org/groups/blab/ncov/ny/B.1.526?c=gt-S_253
I have noticed that the homoplasies do complicate the structure within B.1.526, but I believe that 253G is consistently partitioning the lineages in question as sister lineages.
Looking at the designated sequences (https://raw.githubusercontent.com/cov-lineages/pango-designation/master/lineages.csv), B.1.526, B.1.526.1 and B.1.526.2 look very similar on the surface from an epi point of view: they've all got sequences from CT, with few other locations designated. Currently the only sequences designated B.1.526.3 are
Any assigned sequences beyond those already in the designation list can be included, provided they are >95% complete, to help with these assignments. The question for the .1 and .2 sublineages here may be whether they should remain sublineages or just be merged back into the parent, if it seems they're not being robustly distinguished in a phylogeny.
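The ">95% complete" criterion can be sketched as a simple filter. This is only an illustration under stated assumptions: the thread does not define how completeness is computed, so the sketch assumes it means the fraction of unambiguous bases (A, C, G, T) relative to the 29,903-nt Wuhan-Hu-1 reference (MN908947.3). The function names are hypothetical.

```python
REFERENCE_LENGTH = 29903  # length of the Wuhan-Hu-1 reference genome (MN908947.3)


def completeness(seq, reference_length=REFERENCE_LENGTH):
    """Fraction of the reference covered by unambiguous bases.

    Assumption: N, gaps and IUPAC ambiguity codes all count as missing.
    """
    called = sum(1 for base in seq.upper() if base in "ACGT")
    return called / reference_length


def is_complete_enough(seq, threshold=0.95, reference_length=REFERENCE_LENGTH):
    """True when completeness is strictly above the threshold (>95% by default)."""
    return completeness(seq, reference_length) > threshold
```

A stricter or looser definition (for example, counting ambiguity codes as called) would change which borderline genomes pass, so the definition should be fixed before comparing counts across lineages.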
Merged the sublineages into B.1.526 in commit ec2335a.
By Anderson Brito & Nathan Grubaugh Lab.
Description
Sub-lineage of: B.1.526
Earliest sequence:
B.1.526 (2020-11-23); B.1.526.1 (2020-09-07); B.1.526.2 (2021-02-10)
Most recent sequence: 2020-09-07
Countries circulating: USA
These B.1.526 lineages and sub-lineages were recently reassigned, and their classifications are currently mixed up, as can be seen in the tree below (see image and link).
Genomes
B.1.526: metadata of 230 genomes (download here)
B.1.526.1: metadata of 49 genomes (download here)
B.1.526.2: metadata of 89 genomes (download here)
Evidence
Image: available here
Build: available here
Proposed lineage name
Same as already proposed by Pango team. We only want to report the need for an update of the B.1.526 lineage group assignments.