Proposal to split B.1.1.529 in multiple sublineages #364

Closed
08011990 opened this issue Dec 8, 2021 · 2 comments
Labels
not accepted: A proposal for a new lineage has not been accepted

Comments


08011990 commented Dec 8, 2021

Submitted by Rakesh Sarkar (Senior Research Fellow, Division of Virology, ICMR-NICED, Kolkata, West Bengal, India; Email: rakeshsarkar133@gmail.com)

I have analyzed the S glycoprotein mutations of 554 genome sequences of the Omicron variant that were deposited in GISAID from 33 different countries up to 5 December 2021 (Table 1). My analysis revealed 37 dominant mutations (A67V, ∆H69, ∆V70, T95I, G142D, ∆V143, ∆Y144, ∆Y145, ∆N211, L212I, ins214EPE, G339D, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, D796Y, N856K, Q954H, N969K, and L981F) distributed across the S glycoprotein, with frequencies ranging from 44.22% to 100% among the 554 Omicron sequences (Table 2). However, these 37 mutations occur in different combinations in different groups of sequences. On the basis of co-occurring S glycoprotein mutations, I have classified the 554 sequences into 95 groups, each representing a distinct set of S glycoprotein mutations (Table 3). Around 75% (412/554) of the Omicron sequences fall into three groups (Group 1, N=159; Group 2, N=114; Group 3, N=109). Group 1 contains all 37 mutations; Group 2 includes all the mutations except K417N, N440K, and G446S, whereas Group 3 harbours all the mutations of Group 2 except N764K. The remaining 92 groups account for only 142 sequences (Table 3). The sequence names of all the strains belonging to each group, with their S glycoprotein mutations, are listed in Supplementary File 1.
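The grouping step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the workflow actually used for the attached tables: the sequence names and mutation profiles are invented toy data, and `group_by_profile` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical toy input: sequence name -> set of S glycoprotein mutations.
# Names and profiles are invented for illustration, not taken from the tables.
profiles = {
    "seq_A": {"D614G", "N764K", "K417N", "N440K", "G446S"},
    "seq_B": {"D614G", "N764K", "K417N", "N440K", "G446S"},
    "seq_C": {"D614G", "N764K", "K417N", "N440K", "G446S"},
    "seq_D": {"D614G", "N764K"},
    "seq_E": {"D614G", "N764K"},
    "seq_F": {"D614G"},
}


def group_by_profile(profiles):
    """Count sequences per distinct mutation set, largest group first."""
    counts = Counter(frozenset(muts) for muts in profiles.values())
    return counts.most_common()


groups = group_by_profile(profiles)
reference_profile = groups[0][0]  # largest group, used as the baseline here
for number, (muts, size) in enumerate(groups, start=1):
    # Mutations present in the largest group but absent from this group.
    missing = sorted(reference_profile - muts)
    print(f"Group {number}: N={size}, missing vs. Group 1: {missing}")
```

Grouping on exact mutation sets means that a single uncalled site places a sequence in a separate group, so any missing coverage in the Spike region inflates the number of groups.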

I request that you go through all the details provided in the attached files and assign new lineage names to the different groups accordingly.

Table 1.docx
Table 2.docx
Table 3.docx
Supplementary File 1.xlsx


08011990 commented Dec 9, 2021

Please give priority to Groups 1 to 14, each of which contains at least 5 sequences, with special attention to Groups 1, 2, and 3, which include 159, 114, and 109 sequences, respectively.

@corneliusroemer
Contributor

The differences between these groups are most likely sequencing artefacts due to amplicon dropout.

I do not see evidence for a split along those lines. If there were a split, we'd expect some correlated mutations in non-Spike regions.

This tree was built by masking all sites that seemed to have quality problems. If the S:440 split were real, the sequences carrying it should cluster together. They don't; they are scattered all over the tree.
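One way to put a number on "clusters" versus "all over the tree" is to count the minimum number of state changes a site needs on the tree. The sketch below uses Fitch parsimony on a toy binary tree; this is a technique chosen here for illustration, not necessarily how the tree above was assessed, and the tip names, tree shape, and states are all hypothetical.

```python
def fitch_changes(tree, tip_states):
    """Minimum number of state changes (Fitch parsimony) on a binary tree.

    tree: nested 2-tuples with tip names (str) as leaves.
    tip_states: dict mapping tip name -> observed state at one site.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {tip_states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        common = a & b
        if common:
            return common
        changes += 1  # children disagree: at least one change is needed here
        return a | b

    visit(tree)
    return changes


# Hypothetical 8-tip tree and two invented state assignments for S:440.
tree = ((("t1", "t2"), ("t3", "t4")), (("t5", "t6"), ("t7", "t8")))
tips = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"]
clustered = dict(zip(tips, "KKKKNNNN"))  # one clade carries each state
scattered = dict(zip(tips, "KNKNKNKN"))  # states alternate across the tree

print(fitch_changes(tree, clustered), fitch_changes(tree, scattered))
```

A genuine sublineage would need few changes, ideally one, whereas a pattern produced by dropout at the site tends to need many changes spread across unrelated clades.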


https://nextstrain.org/groups/neherlab/ncov/21K-diversity/unmasked/?c=gt-S_440&gt=S.440N

corneliusroemer added the "not accepted" label Dec 9, 2021