BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1b:P1727L circulating in Shandong (78 seq with Orf1b:S2339F) and Gansu,Hunan (3 seqs with Orf1a:T3284I) #1542

Closed
FedeGueli opened this issue Jan 10, 2023 · 11 comments
Labels
BA.5 designated UShER not clean Lineage currently not clean in UShER tree

Comments

@FedeGueli
Contributor

FedeGueli commented Jan 10, 2023

EDITED

I will completely rewrite this issue after the mass upload of sequences from Shandong last night.
Initially I proposed it separately in #1565, but then I realized that it has the same root as the small lineage proposed here.

Defining mutations: BA.5.2 > C27513T > C27012T > G12310A > Orf1b:T1050N (C16616A) > C18647T > Orf1a:R207C (C884T), Orf1a:S2488F (C7728T)
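The nucleotide positions and amino-acid labels in the path above can be cross-checked against each other. The following is a minimal sketch, assuming ORF1a starts at position 266 and ORF1b at position 13468 on the Wuhan-Hu-1 reference (coordinates that are not stated in this thread); `ORF_START` and `codon_of` are hypothetical names used only for illustration.

```python
# Assumed 1-based ORF start coordinates on the Wuhan-Hu-1 reference.
# These values are not given in the issue; treat them as assumptions.
ORF_START = {"ORF1a": 266, "ORF1b": 13468}


def codon_of(orf, nuc_pos):
    """Return (codon number, position within the codon 1..3) for a genome position."""
    offset = nuc_pos - ORF_START[orf]
    if offset < 0:
        raise ValueError(f"position {nuc_pos} lies before the start of {orf}")
    return offset // 3 + 1, offset % 3 + 1


# (ORF, nucleotide position, codon number quoted in the issue)
QUOTED = [
    ("ORF1a", 884, 207),     # Orf1a:R207C (C884T)
    ("ORF1a", 7728, 2488),   # Orf1a:S2488F (C7728T)
    ("ORF1b", 16616, 1050),  # Orf1b:T1050N (C16616A)
    ("ORF1b", 18647, 1727),  # ORF1b:P1727L (C18647T)
]

for orf, pos, quoted_codon in QUOTED:
    codon, within = codon_of(orf, pos)
    status = "consistent" if codon == quoted_codon else "MISMATCH"
    print(f"{orf} {pos}: codon {codon}, codon position {within} ({status})")
```

Under those assumed start coordinates, each quoted amino-acid position agrees with its nucleotide position.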

Gisaid query/covspectrum query: C7728T, C18647T, C16616A

Tree:
Screenshot 2023-01-21 at 09 40 44
Screenshot 2023-01-21 at 09 41 21
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_2d1fe_ba22c0.json?c=userOrOld&label=id:node_7413453

Sequences:
EPI_ISL_16390376, EPI_ISL_16458311, EPI_ISL_16458319,
EPI_ISL_16494609, EPI_ISL_16576905, EPI_ISL_16604347,
EPI_ISL_16604351-16604353, EPI_ISL_16604355-16604357, EPI_ISL_16604361-16604364,
EPI_ISL_16604366-16604370, EPI_ISL_16604372-16604373, EPI_ISL_16604375,
EPI_ISL_16604378-16604381, EPI_ISL_16604383-16604388, EPI_ISL_16604390-16604391,
EPI_ISL_16604393, EPI_ISL_16604412-16604413, EPI_ISL_16604416,
EPI_ISL_16604418, EPI_ISL_16604424, EPI_ISL_16604429,
EPI_ISL_16604453, EPI_ISL_16604467, EPI_ISL_16604474,
EPI_ISL_16604483, EPI_ISL_16604485, EPI_ISL_16604526,
EPI_ISL_16604545, EPI_ISL_16604553, EPI_ISL_16604555,
EPI_ISL_16604563, EPI_ISL_16604565-16604566, EPI_ISL_16604576,
EPI_ISL_16604579, EPI_ISL_16604612, EPI_ISL_16604620,
EPI_ISL_16604624, EPI_ISL_16604647-16604648, EPI_ISL_16604679,
EPI_ISL_16604691, EPI_ISL_16604736, EPI_ISL_16604767-16604768,
EPI_ISL_16604770-16604771, EPI_ISL_16604774, EPI_ISL_16604780-16604785,
EPI_ISL_16604789, EPI_ISL_16604885, EPI_ISL_16604889,
EPI_ISL_16604894
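The list above mixes single accessions with compact ranges such as `EPI_ISL_16604351-16604353`. A minimal sketch for expanding that notation into individual IDs, assuming a range always means consecutive integers sharing the `EPI_ISL_` prefix (`expand_accessions` is a hypothetical helper, not part of any GISAID tooling):

```python
import re


def expand_accessions(text):
    """Expand 'EPI_ISL_1-3' style ranges into a flat list of accession IDs."""
    ids = []
    # Tokens are separated by commas and/or whitespace (including newlines).
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        match = re.fullmatch(r"EPI_ISL_(\d+)(?:-(\d+))?", token)
        if match is None:
            raise ValueError(f"unrecognised accession token: {token!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        if last < first:
            raise ValueError(f"descending range: {token!r}")
        ids.extend(f"EPI_ISL_{n}" for n in range(first, last + 1))
    return ids


example = "EPI_ISL_16390376, EPI_ISL_16604351-16604353,"
print(expand_accessions(example))
```

The expanded list can then be pasted into a search box that expects one accession per entry.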

@FedeGueli FedeGueli changed the title BA.5.2 + orf1b:1050N sublineage circualting in China defined by Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L BA.5.2 + orf1b:1050N sublineage circultaing in China defined by Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L Jan 10, 2023
@FedeGueli FedeGueli changed the title BA.5.2 + orf1b:1050N sublineage circultaing in China defined by Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L BA.5.2 + orf1b:1050N sublineage circulating in China defined by Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L Jan 10, 2023
@thomasppeacock thomasppeacock added BA.5 monitor currently too small, watch for future developments labels Jan 11, 2023
@FedeGueli FedeGueli changed the title BA.5.2 + orf1b:1050N sublineage circulating in China defined by Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L circulating in Shandong (78 seq) Jan 21, 2023
@FedeGueli FedeGueli changed the title BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L circulating in Shandong (78 seq) BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L circulating in Shandong (78 seq with Orf1b:S2339F) and Gansu,Hunan (3 seqs with Orf1a:T3284I) Jan 21, 2023
@FedeGueli
Contributor Author

@thomasppeacock @InfrPopGen @corneliusroemer this new Chinese lineage will be tracked here, since #1565 is just a sublineage of this one, albeit a large one.

@thomasppeacock thomasppeacock added recommended Recommended for designation by pango team member and removed monitor currently too small, watch for future developments labels Jan 21, 2023
@thomasppeacock

Recommending this to keep on top of lineage diversity in China, thanks Fede!

@InfrPopGen InfrPopGen self-assigned this Jan 21, 2023
InfrPopGen added a commit that referenced this issue Jan 21, 2023
Added new lineage BA.5.2.50 from #1542 with 4 new sequence designations, and 0 updated
@InfrPopGen InfrPopGen added designated and removed recommended Recommended for designation by pango team member labels Jan 21, 2023
@InfrPopGen InfrPopGen added this to the BA.5.2.50 milestone Jan 21, 2023
@InfrPopGen
Contributor

Thanks for submitting. We've added lineage BA.5.2.50 with 4 newly designated sequences, and 0 updated. Defining mutations C884T (ORF1a:R207C), C7728T (ORF1a:S2488F) (following C18647T (ORF1b:P1727L)).

@FedeGueli
Contributor Author

Thank you, @InfrPopGen and @thomasppeacock!

@aviczhl2
Contributor

aviczhl2 commented Feb 8, 2023

Screen Shot 2023-02-08 at 10 26 46

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice6_genome_2967e_2e7340.json?label=id:node_7556982

It seems that UShER is no longer categorizing this branch correctly and is instead labelling another branch as BA.5.2.50.

There are a lot of sequences without orf1a:R207C on this branch. Previously those seqs were regarded as "C207R reversions"; now the algorithm has decided to flip the order of the mutations, and therefore this branch is being "thrown out" of BA.5.2.50.

@corneliusroemer
Contributor

Yep, @aviczhl2 you are right, thanks for pointing this out! This is a problem with Usher rather than pango-designations, pinging @AngieHinrichs

The defining mutations of BA.5.2.50 above BA.5.2 are:

    "C884T",
    "C7728T",
    "G12310A",
    "C16616A",
    "C18647T",
    "C27012T",
    "C27513T"

There seems to be dropout of ORF1ab:R207C, which causes the top bit with that mutation to appear as non-BA.5.2.50.
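To see which defining mutations a given sequence lacks, the seven substitutions listed above can be compared against a sample's mutation list. This is only a set-based sketch: UShER and pangolin place sequences on a tree rather than matching mutation sets, and the sample below is hypothetical.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    ref: str
    pos: int
    alt: str


def parse_mutation(text):
    """Parse a nucleotide substitution such as 'C884T' into its parts."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", text.strip().upper())
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {text!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))


# Defining mutations of BA.5.2.50 above BA.5.2, as quoted in this comment.
DEFINING = [
    parse_mutation(s)
    for s in ("C884T", "C7728T", "G12310A", "C16616A",
              "C18647T", "C27012T", "C27513T")
]


def missing_defining(sample_mutations):
    """Return the defining mutations absent from a sample, in definition order."""
    have = {parse_mutation(s) for s in sample_mutations}
    return [m for m in DEFINING if m not in have]


# Hypothetical sample without C884T (for example because of dropout there).
sample = ["C7728T", "G12310A", "C16616A", "C18647T", "C27012T", "C27513T"]
for m in missing_defining(sample):
    print(f"missing: {m.ref}{m.pos}{m.alt}")
```

A sample that lacks only C884T is the case discussed here: it carries the rest of the definition but can still end up outside the labelled branch.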

@corneliusroemer corneliusroemer added the UShER not clean Lineage currently not clean in UShER tree label Feb 8, 2023
@corneliusroemer
Contributor

I've added the "UShER not clean" label. If you spot similar issues with other lineages, let us know. Good spot :) @aviczhl2

@AngieHinrichs AngieHinrichs changed the title BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1a:P1727L circulating in Shandong (78 seq with Orf1b:S2339F) and Gansu,Hunan (3 seqs with Orf1a:T3284I) BA.5.2 + orf1b:1050N sublineage with Orf1a:T3284I,then orf1a:R207C, orf1a:S2488F and Orf1b:P1727L circulating in Shandong (78 seq with Orf1b:S2339F) and Gansu,Hunan (3 seqs with Orf1a:T3284I) Feb 8, 2023
@AngieHinrichs
Member

AngieHinrichs commented Feb 8, 2023

How important is ORF1a:R207C to the definition of BA.5.2.50? If we omit ORF1a:R207C (C884T) from the definition of BA.5.2.50, and instead define it as

... ORF1b:P1727L (C18647T) > ORF1a:S2488F (C7728T)

that would include both branches. Here is a taxonium view of the branch at ORF1a:S2488F (C7728T) that currently has both the annotated BA.5.2.50 and the other branch that gets ORF1a:R207C (C884T) after several other mutations, with red circles around the four designated BA.5.2.50 sequences in lineages.csv and nodes colored by allele at 884 (orange=C, green=T):

image

I can't blame UShER for structuring it that way because the non-Shandong sequences seem to get C884T right after C7728T, while the Shandong sequences after C7728T include sequences without C884T but with

  • ORF1b:S2339F (C20483T)
  • ORF1b:S2339F (C20483T) and G23608T
  • ORF1b:S2339F (C20483T) and G23608T and T8885C

and then the Shandong sequences that do have C884T also have all three of the above mutations. The only way to force UShER to make C884T come first would be to permanently exclude the 18 Shandong sequences that don't have C884T. Do we want to do that? Alternatively I could add a "BA.5.2.50_alt" label to the Shandong C884T branch so that both branches would be included in the minimized tree for pangolin and would result in BA.5.2.50 being assigned.

... or, again, we could simply omit ORF1a:R207C (C884T) from the definition and let BA.5.2.50 start at ORF1a:S2488F (C7728T).
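The effect of the two candidate definitions can be illustrated with a simplified set-based comparison. This is a sketch under stated assumptions, not how UShER or pangolin assign lineages: the sample names and mutation sets are hypothetical, built only from the mutation patterns described in this comment, and upstream BA.5.2 mutations are left out.

```python
# Definition as designated: includes ORF1a:R207C (C884T).
STRICT = frozenset({"C18647T", "C7728T", "C884T"})
# Proposed definition: lineage starts at ORF1a:S2488F (C7728T).
RELAXED = frozenset({"C18647T", "C7728T"})


def matches(sample, definition):
    """True if the sample carries every mutation in the definition."""
    return definition <= set(sample)


# Hypothetical samples following the patterns described above.
SAMPLES = {
    "non_shandong": {"C18647T", "C7728T", "C884T"},
    "shandong_1": {"C18647T", "C7728T", "C20483T"},
    "shandong_2": {"C18647T", "C7728T", "C20483T", "G23608T"},
    "shandong_3": {"C18647T", "C7728T", "C20483T", "G23608T", "T8885C"},
    "shandong_with_884": {"C18647T", "C7728T", "C20483T", "G23608T",
                          "T8885C", "C884T"},
}


def compare(samples):
    """Map each sample name to (matches STRICT, matches RELAXED)."""
    return {
        name: (matches(muts, STRICT), matches(muts, RELAXED))
        for name, muts in samples.items()
    }


for name, (strict, relaxed) in compare(SAMPLES).items():
    print(f"{name:18s} strict={strict!s:5s} relaxed={relaxed}")
```

In this toy setup the Shandong samples without C884T match only the relaxed definition, which is why dropping C884T brings both branches under one label.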

@aviczhl2
Contributor

aviczhl2 commented Feb 13, 2023

... or, again, we could simply omit ORF1a:R207C (C884T) from the definition and let BA.5.2.50 start at ORF1a:S2488F (C7728T).

Could you please fix it this way, @AngieHinrichs? I see that orf1a:R207C has already been removed from the defining mutations of BA.5.2.50 in lineage_notes.txt:

BA.5.2.50 Alias of B.1.1.529.5.2.50 China, ORF1a:S2488F after ORF1b:P1727L, issue #1542

@AngieHinrichs
Member

BA.5.2.50 Alias of B.1.1.529.5.2.50 China, ORF1a:S2488F after ORF1b:P1727L, issue #1542

Good point @aviczhl2! And I was about to ask Cornelius if he's OK with the change but he added more designated sequences in 3e600c1 to make it extra clear. OK, I will fix it.

@corneliusroemer
Contributor

Yes, sorry for not reporting back here. I agree that it makes sense to remove that one mutation. I'm not sure I trust the Shandong sequences, but in the end it's not a very important change.

6 participants