BA.* sublineages with S:L452R and S:F486V (79 sequences as of 2022-04-05, mainly South Africa) #517
Comments
Agree with this proposal. @Rajeev_The_King was asking some days ago about a sudden uptick in positivity rate there, and the only thing I noticed was that S:F486V |
Very interesting sequences. I just had a look at them in Nextclade. In addition to S:L452R (a Delta mutation) and S:F486V (very rare during the pandemic, only 800 sequences!), there are 2 reversions with respect to BA.2:
Silent mutations:
The fact that there is so much commonality, especially nuc:G12160A, makes this cluster look very real to me, despite all the variation. Too much coincidence. |
Since the RBD region is often subject to amplicon dropout, I tried an alternative query to find potential members of this cluster that have amplicon dropout. I find some 52 sequences in total that seem to belong to this cluster. Here's a covSpectrum query that should catch them all, even those with RBD dropout, with few (if any) false positives: https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?variantQuery=BA.2*+%26+12160A+%26+%289866C+%7C+21570G+%7C+27788T+%7C+28724T%29&
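The boolean logic of that query can be sketched in Python. This is only an illustration: the helper names `build_variant_query` and `matches` are hypothetical, and the local check assumes mutations are written as position plus derived base (e.g. "12160A"), as in the link.

```python
from urllib.parse import quote_plus

# Markers taken from the query in the comment above: one cluster-wide marker
# plus four alternatives, any one of which is enough.
REQUIRED = "12160A"
ANY_OF = ("9866C", "21570G", "27788T", "28724T")


def build_variant_query(lineage="BA.2*"):
    """Return the boolean query string in the form used in the link above."""
    alternatives = " | ".join(ANY_OF)
    return f"{lineage} & {REQUIRED} & ({alternatives})"


def matches(is_ba2_descendant, nuc_mutations):
    """Hypothetical local check mirroring the same boolean logic.

    nuc_mutations: set of strings such as "12160A" (position + derived base).
    No spike mutation is required, so sequences with RBD dropout still match.
    """
    return (
        is_ba2_descendant
        and REQUIRED in nuc_mutations
        and any(m in nuc_mutations for m in ANY_OF)
    )


query = build_variant_query()
# "*" is kept literal so the encoded string matches the link above.
encoded = quote_plus(query, safe="*")
print(encoded)
```

Because the query relies only on markers outside the RBD, a sequence missing S:L452R and S:F486V through dropout is still caught as long as 12160A and one of the four alternatives are called.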
|
Caveat: 22917 (S:452) is one of several sites that I have been masking in BA.1 in the UCSC/UShER tree since the 2022-03-18 build (22813 22898 22882 22917 23854) because they caused a lot of split branches -- i.e. important lineage-defining mutations, that I think were not homoplasic, appeared to be highly homoplasic because they would sometimes appear with, and sometimes without, various combinations of the Omicron mutations at 22813, 22898, 22882 or 23854 or the Delta mutation at 22917, which I think can "sneak in" to amplicon dropout regions from stray contamination reads. ... Or 22917 could be truly homoplasic, in which case, bummer! Likewise, BA.2 has 22917 (along with 22786, 22882, and 23854) masked since the latest completed build, 2022-03-30.
Tom's tree there is the "downsampled global tree" output of UShER, which shows the uploaded sequences in the context of the whole tree but without their nearest neighbors. If you click on one of the uploaded samples, there is a link to view subtree 1, which shows the local context (nearest neighbors in the tree and downsampled nearby branches). Here I have copied the temporary URL for subtree 1 which will expire in a couple days to a URL that should last much longer:
Note that for samples already in the tree and uploaded, the uploaded samples are placed one mutation past the tree placement -- that's 22917, not masked in uploaded fasta, but masked on BA.1 and BA.2: which is all to say -- beware what you see in the UShER tree in BA.1 and BA.2 with respect to 22917. It may appear more homoplasic in uploaded samples than it should.
[Meanwhile, the tree still has quite a few more sites that could be masked to make lineage-defining mutations more coherent, but I don't want to overdo it and make the currently known recombinants incorrect; the "reversions" accurately reflect their lack of some mutations for the branch on which they're placed.] |
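To make the effect of the masking concrete, here is a minimal Python sketch. It is not the UCSC/UShER implementation: the dictionary layout and the helper names `position` and `apply_mask` are assumptions for illustration, and only the site lists come from the comment above.

```python
import re

# Site lists as given in the comment above; the dict layout is an assumption.
MASKED_SITES = {
    "BA.1": {22813, 22898, 22882, 22917, 23854},
    "BA.2": {22917, 22786, 22882, 23854},
}

# Nucleotide mutation written as reference base, position, derived base.
_MUTATION = re.compile(r"^[ACGT](\d+)[ACGT]$")


def position(mutation):
    """Extract the genome position from a string such as 'T22917G'."""
    m = _MUTATION.match(mutation)
    if m is None:
        raise ValueError(f"not a nucleotide mutation: {mutation!r}")
    return int(m.group(1))


def apply_mask(lineage, mutations):
    """Drop mutations at sites masked for this lineage (hypothetical helper)."""
    masked = MASKED_SITES.get(lineage, set())
    return [m for m in mutations if position(m) not in masked]


# On a BA.2 branch, 22917 is ignored, so only G12160A remains visible.
print(apply_mask("BA.2", ["T22917G", "G12160A"]))
```

An uploaded sample is not masked this way, which is why it can end up one mutation (22917) past its placement in the tree.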
22917G = S:L452R has already appeared as defining mutation independently in a number of lineages: |
Here are UShER trees for all the 51 sequences. They split in two clusters. Here's the global downsampled tree: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/singleSubtreeAuspice_genome_334ad_a01370.json?c=country&label=nuc%20mutations:T670G,C2790T,G4184A,C4321T,C9344T,A9424G,C9534T,C9866T,C10198T,G10447A,C12880T,T15240C,C15714T,C17410T,C19955T,A20055G,C21618T,T21762C,T21846C,T22200G,C22673T,A22688G,G22775A,A22786C,A24130C,C26060T,C26858T,G27382C,A27383T,T27384C,A29510C Here's tree 1: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_334ad_a01370.json?c=country&label=nuc%20mutations:C28724T |
The two trees posted by @corneliusroemer differ outside the spike in their defining AA mutations: M:D3N (tree 2) and N:P151S + ORF7b:L11F (tree 1). M:D3N appeared only in March. The earliest sequence of the G12160A, ORF7b:L11F branch should be this one: EPI_ISL_10860989 from South Africa (2022-01-26). |
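As a rough sketch of that split, sequences could be sorted into the two branches by their amino acid mutations outside the spike. The function name `assign_branch` and the labels are hypothetical; the any-of matching is an assumption made because ORF7b:L11F and N:P151S are not always called.

```python
# Branch markers as described in the comment above; labels are illustrative.
BRANCH_MARKERS = {
    "M:D3N branch": {"M:D3N"},
    "N:P151S branch": {"N:P151S", "ORF7b:L11F"},
}


def assign_branch(aa_mutations):
    """Return the branch whose markers appear in the sequence's AA mutations.

    A single marker is enough for a branch to count as a hit, since the
    individual markers are sometimes missing from a sequence.
    """
    hits = [
        name
        for name, markers in BRANCH_MARKERS.items()
        if markers & set(aa_mutations)
    ]
    if len(hits) == 1:
        return hits[0]
    return "ambiguous" if hits else "unassigned"


print(assign_branch({"S:L452R", "S:F486V", "M:D3N"}))
print(assign_branch({"S:L452R", "S:F486V", "ORF7b:L11F"}))
```

Sequences carrying markers from both branches, or from neither, are flagged rather than forced into a branch, since dropouts make either outcome plausible.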
I wouldn't draw strong conclusions based on these 2 UShER trees - placement is strongly affected by dropouts. The branch is too long for good placement. |
24 more sequences from South Africa: Includes 16 out of 40 uploads from KwaZulu-Natal today, 6 out of 30 from Eastern Cape and 2 out of 56 from Western Cape. edit: the sequences from KwaZulu-Natal are all in the M:D3N branch and the rest are all in the N:P151S branch. |
@silcn great that you are monitoring it. Could you update the title of the issue with the number of sequences you find day by day? |
5 more from Gauteng today out of 12 uploads: |
Changed the title to avoid confusion now that the cat's out of the bag. Interested to see the reasoning behind the decision. |
@silcn I have made a head-to-head comparison, by nucleotide, of the first sequence of BA.4 vs. the first sequence of BA.5 (the first ones I was able to find) |
Thanks @silcn, this has been included as BA.4 and BA.5 in release v1.3 |
CovSpectrum queries to track both lineages already: |
Proposal for a sublineage of BA.*
Earliest sequence: 2022-01-10 (South Africa)
Latest sequence: 2022-03-24 (Denmark)
Countries detected: South Africa (30 including 21 in Gauteng), Botswana (3), Denmark (3), United Kingdom (2)
Mutations differing with respect to BA.2:
S: L452R, F486V
ORF7b: L11F*
N: P151S*
nuc: G12160A
*These two don't always seem to be picked up; G12160A is a better marker outside the spike.
In addition, there might be reversions ORF1a:F3201L and S:R493Q, but someone with more expertise can weigh in on whether these are real.
Notably, 452 and 486 are two of the biggest antigenic sites that are not already hit by BA.2 mutations; 452R might knock out some of the same antibodies that 446S does in BA.1.
This lineage has risen quickly to a large proportion of BA.2 in Gauteng. The sample sizes are very small, but as there are signals that prevalence is rising again in South Africa, I think this is worth keeping an eye on.
https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?aaMutations=S%3AL452R%2CS%3AF486V&pangoLineage=BA.2*&
Genomes:
452_486_genomes.txt