2nd-Generation BA.2 Saltation Lineage, >30 spike mutations (3 seq, 2 countries, Aug 14) #2183
Comments
Alternative non-spike nucleotide query: A7842G, C8293T, G8393A. Another query, by @HynnSpylor, is C897A, G3431T, A7842G, G8393A. |
Folks, regarding the Israeli sample: |
alternative discussion sars-cov-2-variants/lineage-proposals#606 |
This is missing C9866T = ORF1a:L3201F, which was present in almost all BA.2 outside southern Africa due to a founder effect. Suggests a southern African origin for this variant, potentially even the Omicron source. There are a few shared mutations with BA.1 as I think has been alluded to on Twitter. The ones in Spike probably arose independently even if this did come from the Omicron source, but G8393A = ORF1a:A2710T is fairly rare outside BA.1 and might be suggestive of recombination. (note: given that it's unlikely a recombinative origin can ever be proved, and the evidence for this coming from the Omicron source is much weaker than for BA.4 and BA.5, I would suggest this gets the next available BA.2.x designation rather than BA.6) |
Yeah, the 9866C branch of BA.2 was common only in SA. @corneliusroemer proposed a bunch of sublineages of it, and I recall at least one got designated. If I recall correctly, they were successfully exported only to Germany. I am going to check the RKI sequences on Open CovSpectrum. |
A12160G is just a reversion of G12160A from BA.4/5, so I think it is not real but an artefact of misrooting by Usher. |
BA.2-without-9866T had a branch with C26681T which reached 5-10% of BA.2-without-9866T in South Africa in early 2022. This could be a descendant of that branch, in which case it likely wouldn't be from the Omicron source. |
Any thoughts on G446S and F486P? Might it be a recombinant of BA.2.75 + XBB.1.5 + BA.2? |
More likely just a lot of convergent evolution: many of the RBD mutations are things we've seen before, just not all together. The mutations at 481-484 are the really new part. |
Note: S:H245N also appears in BA.2.3.20* (all of them). Is it very common across saltation lineages? |
Could it be a reversion? ORF1a:L3201 might be an important residue. ORF1a:L3201P is in both Iota and Lambda. |
483- and 245N were in #1692 (a few dozen sequences from Ukraine in February). 245N has also been seen in at least BQ.1.1.48 as a point mutation. Actually most of the RBD and NTD mutations (or different ones at the same positions, ~35 in total compared to ~12 outside the S1) have been in some previous interesting variant - it's extremely improbable. I've never commented on either of these github projects before (been following since the BA.1 issue), but I do believe this thread has been shared on social media and could (depending on resharing and spread over the upcoming days) get some attention from the public. Best to be ready for that. |
I want to highlight the mutation S:S50L here. As of today, a search for S:S50L gives fewer than 1000 results on covSPECTRUM. For a C-to-U mutation, that's not much. I wondered whether there are mutations that tend to co-occur with S:S50L. I found potential candidates: S:P621X.
I think there might be something special about the S50L+P621X combination. |
S:P1143L (and P1143S) are also often associated with sequences from chronic infections. |
Thanks for opening this issue @ryhisner, and for the productive discussion everyone. In particular, thanks for getting some extra epidemiological info @shay671! It would be great if we could keep discussion here in this issue limited to new sequences and phylogenetics (putative parent lineages, discussion of recombinant nature or not), with an exception for extra epidemiological info like what Shay shared about patient history. Discussion of the putative function of individual mutations is off topic here and better placed in sars-cov-2-variants/lineage-proposals#606 or in other issues in that repo. I'll try to moderate this issue a little as it may get significant attention/readership. If I hide comments as off-topic, this is just to keep the most salient information most easily locatable. For broader discussion please use sars-cov-2-variants/lineage-proposals#606 or open another issue there, for example if you have questions that aren't directly related to this lineage (e.g. how to save Usher trees for longer than 2 days). |
@trilisser Yes, @AngieHinrichs is working on a pangolin-data release so BA.2.86 should also be called by pangolin over the next few days. Nextclade now calls it also using the main dataset (not just master). @FedeGueli It would be good to stick to verifiable data, e.g. if we don't have published SGTF data let's not speculate about it (or do so in the other repo) to keep it clean here. It would be great if you could edit previous posts rather than making lots of one sentence comments as these clog up people's email inboxes and notifications. |
I don't speculate; I asked and that is what they told me. |
S:R21T is in all the sequences. The only sequences that don't have it are the two from Denmark, both of which have no coverage in that part of spike. |
Great work. The most recent sample of each branch is only 4 mutations away from a common ancestor. |
Reports that a third sequence has been found in Denmark: https://en.ssi.dk/news/news/2023/three-cases-of-ba-2-86-have-been-detected-in-denmark No sign of it yet on GISAID, Denmark is only uploading on Mondays so it will presumably be in the next batch unless it gets uploaded early like the England sequence was. |
Here's a Nextstrain tree which shows the same result as @ryhisner's annotated Usher tree: https://nextstrain.org/groups/neherlab/ncov/BA.2.86 I'll try to keep it updated daily as new sequences are uploaded over the next few days. Just to confirm, the English sequence doesn't change the common ancestor sequence I shared yesterday. |
The third Danish sequence has been uploaded together with a batch of 29 other Danish sequences with collection dates in the past month (note: I've been told that Danish dates are all rounded to Mondays for privacy reasons - I don't know whether rounded up, down or to the nearest Monday): A very rough frequency estimate would hence put BA.2.86 at 1-5% in Denmark at the end of July. Sequence fits squarely into the Danish/English lineage and is clean (including the insertion now), with extra synonymous The (un)deletions mentioned by @Over-There-Is in the Nextstrain tree should be ignored. Unfortunately there's no such concept as an unknown deletion. I'm now masking all gaps to prevent these artefacts. Thanks for the suggestion! |
New Danish sample doesn't have the NNNs in spike, and has one additional silent mutation C7528T relative to the first two. So far in Denmark: Consistent with a rise in frequency but also not exactly suggestive of a BA.1-like explosion. |
Going from roughly 2% to 6% in a week is still a lot, but I'm unsure how confident we can be in these data without knowing the exact sampling strategy. If sampling is focused on hospitalized patients, the numbers could vary a lot if severity changes, and could also be skewed toward elderly people, so they may not represent the real prevalence in the population. If the variant is less severe, for example, it could be more prevalent than what we are seeing. |
Yeah, and the sample sizes are too small to draw any firm conclusions of course. |
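To put a rough number on how little a sample this small can tell us, here is a minimal sketch of a Wilson score interval for a proportion. The 1-in-30 input simply reuses the Danish batch mentioned above (one BA.2.86 sequence uploaded with 29 others) as an illustration; it assumes random sampling, which may well not hold, and the function name is my own.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Illustrative input only: 1 BA.2.86 sequence in a batch of 30.
low, high = wilson_interval(1, 30)
print(f"point estimate {1 / 30:.1%}, interval {low:.1%} to {high:.1%}")
```

The interval spans well over an order of magnitude, which is why week-to-week changes in these counts are hard to interpret.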
A useful negative result may be the absence of BA.2.86 in the 71 new samples from Israel collected around August 10th from a different facility. 5 of those samples remained unassigned in GISAID but don't have the new strain's mutations in any case. |
Number of BA.2.86 out of all samples collected in calendar weeks starting on Monday, per continent and also globally. Date of analysis: 2023-08-21, 10:20am UTC; GISAID data queried via the GISAID web interface.
Edit: there was a typo in 2023-07-17/Europe - I first had a 1 there due to line copying, but it is 0. Note: @theosanderson remarked that "It seems likely that the UK sequence was only deposited this early because of its genotype, so it's probably better to think of the 2023-08-07 numbers as 1 rather than 2". The UK sequence had an extremely low collection-submission delay of only ~5 days. |
(It seems likely that the UK sequence was only deposited this early because of its genotype, so it's probably better to think of the 2023-08-07 numbers as 1 rather than 2) |
One more sequence detected at a US international airport, with travel history to Japan (EPI_ISL_18121060) |
With extra C222T, C1960T, T12775C, G22200Trev(S:G213Vrev, artefact?) and without S:Ins16MPLF(artefact?) Danish branch: A6183G(Orf1a:K1973R), C12815T (Denmark 3, UK 1) |
The Michigan Department of Health, in their press release, claimed that the CDC had informed them 7 sequences had been detected worldwide. Possible the CDC were already aware of this Virginia ex-Japan sequence. |
There are NNN's in the S:211-212 area, so S:G213E is undoubtedly there, and I'm sure the insert is there as well. This is a Gingko Bozoworks sequence, so the quality is not good. I have no idea why the CDC doesn't do the travel sequences. CDC sequences are always top-notch; Ginkgo sequences never are. |
The two new South Africa sequences pointed out by @emily-smith1 in the open discussion thread (EPI_ISL_18125249 and EPI_ISL_18125259) both branch off directly from @corneliusroemer's common ancestor sequence - they don't share any extra mutations with each other or the other 7 sequences. One has 3 mutations from the common ancestor, the other 7. The collection dates are 07/24 and 07/28. South Africa has uploaded 22 sequences with collection date 07/24 onwards, the most recent being 08/10. Food for thought for country of origin speculations: Mpumalanga was one of the two provinces where BA.2-9866T+26681T+S:939F was found in Feb 2022. I will happily admit I was probably wrong earlier about the Denmark and Israel/Michigan sequences being from two distinct transmissions from the source that didn't have time to pick up any mutations yet. |
Just realized that
|
49 new sequences from Denmark, no BA.2.86. Updated ratios: |
Update from Denmark: low levels of BA.2.86 found in wastewater, one more confirmed case. |
EDIT: I almost forgot to thank @JosetteSchoenma for first calling my attention to the presence of T4579A in this new sequence! There is one fascinating aspect to the most recent Denmark sequence: the synonymous mutations A4576T and T4579A. T->A and A->T mutations are rare, as seen in the figure below from @jbloom. https://jbloomlab.github.io/SARS2-mut-spectrum/rates-by-clade.html It's therefore surprising to see that T4579A has occurred in over 45,000 sequences and A4576T in over 10,000. Even more remarkable is how often the two have appeared together. A4576T is in over 23% of sequences with T4579A, while T4579A occurs in over 98% of sequences with A4576T. I first noticed this peculiar mutational combination because it was in BA.5.2.23 (a lineage that competed surprisingly well considering its relative lack of immune-evasion spike mutations), and when I looked more closely, it became clear that the co-occurrence of these mutations is TRS-related. (TRS = transcription regulation sequence) Below is a diagram I made using Nextclade that shows the wild-type nucleotide sequence (top row), the nucleotide sequence with A4576T + T4579A (middle row), and the TRS-L from the beginning of the SARS-CoV-2 genome (bottom row). |
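For anyone wanting to reproduce this kind of co-occurrence figure from raw counts, here is a minimal sketch. The three counts are hypothetical round numbers chosen only to be consistent with the "over 45,000", "over 10,000", "over 23%" and "over 98%" statements above; they are not the actual database counts, and the function name is my own.

```python
def conditional_shares(n_a: int, n_b: int, n_both: int) -> tuple[float, float]:
    """Return (share of A-carrying sequences that also carry B,
    share of B-carrying sequences that also carry A)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both mutation counts must be positive")
    if not 0 <= n_both <= min(n_a, n_b):
        raise ValueError("n_both cannot exceed either individual count")
    return n_both / n_a, n_both / n_b


# Hypothetical counts for illustration only (not real query results).
N_T4579A = 45_000  # sequences carrying T4579A
N_A4576T = 10_600  # sequences carrying A4576T
N_BOTH = 10_400    # sequences carrying both

a4576t_given_t4579a, t4579a_given_a4576t = conditional_shares(N_T4579A, N_A4576T, N_BOTH)
print(f"A4576T among T4579A sequences: {a4576t_given_t4579a:.1%}")
print(f"T4579A among A4576T sequences: {t4579a_given_a4576t:.1%}")
```

The asymmetry between the two shares is the point: nearly every A4576T sequence also carries T4579A, but not the other way round.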
@ryhisner Good point! The silent T4579A appeared twice within XBB during diversification, which is conspicuous given the rareness of this transversion. Both clusters diverged further, meaning that the mutation might be linked to their success. |
Thanks @ryhisner. T4579A alone is defining for FL.2 and present in all FE.1.1, as noticed by @aviczhl2 here: sars-cov-2-variants/lineage-proposals#606 (comment) |
@ryhisner Any idea how a new transcription start at the beginning of Orf1a might contribute to virus fitness? Enhancement of transcription of (most) NSPs? Or would this new transcript be produced in the opposite direction (antisense)? |
Not sure if relevant, but while doing an alignment tonight I noticed the BA.2.86 silent nucleotide mutation at the end of spike, C25207T (S:Y1215Y). Interestingly, there was a small XBB.1.34.1 cluster with it (including one sample from Sudan intercepted by GBW in VA) back in March 2023. On its own this is not so relevant, since there were 12K+ sequences with it associated with C25000T (32K without), but the fact is that XBB.1.34 has S:P681R as defining and XBB.1.34.1 also has S:E554K as defining. The cluster with the Sudanese sample has no S:A570V, S:P621S or S:S939F, but it is true that it has no additional silent or non-synonymous mutation between S:E554K and S:Y1215Y. It has a silent mutation at Orf3a:8 that is absent in BA.2.86, so if any recombination has happened, it would have been between S:F486P and Orf3a:7. But with three additional non-synonymous mutations present in BA.2.86 and absent in this cluster, there is no smoking gun here. cc @silcn @corneliusroemer @thomasppeacock @ryhisner could you check please? |
From an epidemiological perspective, I think it would be beneficial to designate the European branch. Of course no advantage is expected for it, but tracing it in comparison with the other samples that stem directly from BA.2.86 is very important, I think. |
@ryhisner I think your TRS homology was misaligned. This is the corrected (shifted by 1) figure. I've also added codon boundaries so one can see that the mutations are both in the 3rd position and both synonymous. I've also marked the remaining mismatches with red boxes and labelled the rows: If this was about TRS-L/B homology, the remaining mismatch is a GAT -> GAA, which is a nonsynonymous Asp -> Glu and might hence be selected against. However, in codon 1441, CTA -> CTT is synonymous. Both mismatches in codon 1442 are non-synonymous. 4588T has not been seen much: I wonder what the mechanistic effect of this would be: does it act like a TRS-B, causing production of a truncated ORF1ab missing nsp1 and nsp2 and most of nsp3? Or as a secondary TRS-L, either guiding to the real TRS-L or acting as a drop-in if that one can't be found? Any ideas @theosanderson @thomasppeacock? 4588T might be slightly selected against on its own, and so are all the other homology-increasing nt mutations per https://raw.githubusercontent.com/jbloomlab/SARS2-mut-fitness/main/results_public_2023-10-01/nt_fitness/nt_fitness.csv
|
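As a quick check on the codon-position statements in the comment above, here is a small sketch that maps a genome coordinate to its ORF1a codon number and position within the codon. It assumes Wuhan-Hu-1 (NC_045512.2) coordinates with ORF1a starting at nucleotide 266, and it only handles positions upstream of the ribosomal frameshift site (taken here as 13468); the function and constant names are my own.

```python
ORF1A_START = 266            # first nucleotide of the ORF1a start codon (Wuhan-Hu-1 coordinates)
FRAMESHIFT_SITE = 13468      # simple arithmetic is only used up to here


def orf1a_codon(nuc_pos: int) -> tuple[int, int]:
    """Return (codon number, position within codon 1..3) for an ORF1a nucleotide."""
    if not ORF1A_START <= nuc_pos <= FRAMESHIFT_SITE:
        raise ValueError(f"{nuc_pos} is outside the ORF1a range handled here")
    offset = nuc_pos - ORF1A_START
    return offset // 3 + 1, offset % 3 + 1


for pos in (4576, 4579, 4588, 8393, 9866):
    codon, pos_in_codon = orf1a_codon(pos)
    print(f"nt {pos}: ORF1a codon {codon}, codon position {pos_in_codon}")
```

Under these assumptions 4576, 4579 and 4588 all fall in third codon positions (4588 in codon 1441), while 8393 and 9866 land in the first position of codons 2710 and 3201, matching ORF1a:A2710T and ORF1a:L3201F mentioned earlier in the thread.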
Description
Sub-lineage of: BA.2
Earliest sequence: 2023-7-24, Denmark
Most recent sequence: 2023-7-31; Denmark & Israel
Countries circulating: Denmark (2), Israel
Number of Sequences: 3
GISAID AA Query: Spike_E484K, V445H
GISAID Nucleotide Query: T22032C, C22033A, A22034G
CovSpectrum Query: T22032C & C22033A & A22034G
Substitutions/Deletions/Insertions on top of BA.2:
Spike: ins16_MPLF (ins21608_TCATGCCGCTGT), R21T, S50L, ∆69-70, V127F, ∆Y144, F157S, R158G, ∆N211, L212I, L216F, H245N, A264D, I332V, D339H, K356T, R403K, V445H, G446S, N450D, L452W (2-nuc), N460K, N481K, ∆V483, A484K (2-nuc), F486P, E554K (Denmark seq only), A570V, P621S, I670V (Israel seq only), H681R, S939F, P1143L
N: Q229K
M: D3H, T30A, A104V
ORF1a: A211D, V1056L, N2526S, A2710T, V3593F, T4175I
Nucleotide: C897A, G3431T, A7842G, C8293T, G8393A, G11042T, A12160G, C12789T, T13339C, T15756A, A18492G, ins21608TCATGCCGCTGT, C21711T, G21941T, T22032C, C22033A, A22034G, C22208T, C22295A, C22353A, A22556G, G22770A, G22895C, T22896A, G22898A, A22910G, C22916T, ∆23009-23011, G23012A, C23013A, T23018C, T23019C, C23271T, C23423T, A23604G, C24378T, C24990T, C25207T, A26529C, A26610G, C26681T, C26833T, C28958A
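For anyone assembling their own queries from a list like the one above, here is a minimal sketch that validates simple substitution tokens and joins them with " & ", following the CovSpectrum query format shown above. It handles only single-nucleotide substitutions (insertions and deletions are rejected), and the helper names are hypothetical.

```python
import re

# Matches single-nucleotide substitutions such as "C897A".
_SUBSTITUTION = re.compile(r"([ACGT])(\d+)([ACGT])")


def parse_substitution(token: str) -> tuple[str, int, str]:
    """Split a token like 'C897A' into (reference base, position, new base)."""
    match = _SUBSTITUTION.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a simple substitution: {token!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def build_query(tokens: list[str]) -> str:
    """Sort substitutions by genome position and join them with ' & '."""
    parsed = sorted((parse_substitution(t) for t in tokens), key=lambda sub: sub[1])
    return " & ".join(f"{ref}{pos}{alt}" for ref, pos, alt in parsed)


print(build_query(["A22034G", "T22032C", "C22033A"]))
```

Sorting by position also makes it easier to spot an entry that is out of order or missing when comparing two lists.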
USHER Tree (for what it's worth)
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/2nd-Gen_BA.2.json?c=gt-nuc_897&label=id:node_5341437
Evidence
One day after the first sequence in this lineage was uploaded from Israel (Sunday 13-Aug), two sequences were uploaded from Denmark (Monday 14-Aug), and one of them has a collection date a week earlier than the Israel sequence. This one's already gone international and is likely circulating in a country with little genetic surveillance. The only question at this point is whether this will be a situation like BS.1.1 or BA.2.83, where a hugely divergent, 2nd-generation lineage spreads but never has a large impact, or whether this will be closer to a BA.1-type situation.
Genomes
EPI_ISL_18096761, EPI_ISL_18097315, EPI_ISL_18097345