Issues with the designation of BA.4 and BA.5 #645

Closed
shay671 opened this issue May 17, 2022 · 5 comments

Comments

@shay671

shay671 commented May 17, 2022

In an analysis I did with @talyash, we found discrepancies between the samples that Pango designates as BA.4 and BA.5 and those identified by other methods.

We took the samples designated BA.4/BA.5 in CovSpectrum and added the samples found by mutation queries we built for BA.4 and BA.5 (explained below).

All samples were run through the UShER, Nextclade, and Pango online platforms.

Full results :

BA.4 BA.5 Analysis.xlsx

Results in the combined cohort (CovSpectrum designated + particular query):

Found as BA.4:
Query: 1502 (1331 found in Pango)
UShER: 1481 (1329 found in Pango)
Nextclade: 1472 (1320 found in Pango)
Pango: 1331

Found as BA.5:
Query: 951 (747 found in Pango)
UShER: 934 (745 found in Pango)
Nextclade: 951 (748 found in Pango)
Pango: 757 (8 were exclusive to Pango, found by none of the other methods)
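
To make the size of the gap easier to read, here is a minimal sketch that turns the counts above into the share of each method's calls that Pango also assigns to the same lineage. The dictionary layout and the function name `pango_overlap` are illustrative only; the numbers are copied from the results listed above.

```python
# Counts copied from the combined-cohort results above.
# Each entry: method -> (samples called by the method, of those also found in Pango)
counts = {
    "BA.4": {"Query": (1502, 1331), "UShER": (1481, 1329), "Nextclade": (1472, 1320)},
    "BA.5": {"Query": (951, 747), "UShER": (934, 745), "Nextclade": (951, 748)},
}


def pango_overlap(total, in_pango):
    """Fraction of a method's calls that Pango also assigns to the lineage."""
    return in_pango / total


for lineage, methods in counts.items():
    for method, (total, in_pango) in methods.items():
        share = pango_overlap(total, in_pango)
        print(f"{lineage} {method}: {in_pango}/{total} = {share:.1%}")
```

With these counts the overlap comes out at roughly 89 to 90 percent for BA.4 and 79 to 80 percent for BA.5, so the shortfall in Pango is noticeably larger for BA.5.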

The queries for BA.4 and BA.5 are designed to take into account, for each variant:
Mutations found in both the variant and BA.2
Mutations that are found in BA.2 but reverted in the variant
Mutations that are found in BA.4 and BA.5 but not in BA.2
Mutations that are found only in BA.4 or only in BA.5

Query used for BA.4 :

[1-of: G27788T, C28724T, [1-of: A686-,A687-,G688-,T689-,C690-,A691-,T692-,T693-,T694-]]&[3-of: G27788T, C28724T, [1-of:A686-,A687-,G688-,T689-,C690-,A691-,T692-,T693-,T694-],G12160A,T22917G,T23018G]&![2-of:C9866T,A23040G]&[3-of:C26858T,A27259C,G27382C,A27383T,T27384C]&[33-of:C241T,G4184A,A9424G,C14408T,C23525T,A24424T,C10198T,C17410T,A18163G,A22786C,G27382C,C3037T,C12880T,T22200G,A28271T,C10449A,A27383T,A23055G,A23063T,T22679C,C10029T,C22674T,G22992A,T27384C,C15714T,G22775A,C21618T,T24469A,C26060T,C26858T,C27807T,C28311T,A29510C,T670G,C2790T,C9344T,A20055G,C22686T,C22995A,A23013C,G23948T,G28882A,C4321T,C23604A,C25584T,G28881A,A22688G,A23403G,T23075C,C23854A,C26270T,G26709A,C9534T,G22578A,T22882G,C25000T,C26577G,G10447A,G22813T,T23599G,G28883C,C19955T,G21987A,A27259C,[1-of:T11288-,C11289-,T11290-,G11291-,G11292-,T11293-,T11294-,T11295-,T11296-],[1-of:T21633-,A21634-,C21635-,C21636-,C21637-,C21638-,C21639-,T21640-,G21641-,T21765-,A21766-,C21767-,A21768-,T21769-,G21770-],[1-of:G28362-,A28363-,G28364-,A28365-,A28366-,C28367-,G28368-,C28369-,A28370-]]

Query used for BA.5 :
[1-of: G26529A,C27889T]&[3-of: G12160A,T22917G,T23018G,G26529A,C27889T]&![2-of: C9866T,A23040G,C26858T,A27259C,G27382C,A27383T,T27384C]&[30-of: C10449A,G21987A,T22882G,C26270T,A28271T,G22578A,A23403G,C25000T,C12880T,C28311T,C15714T,C17410T,G22775A,A22786C,G4184A,A9424G,A20055G,C22995A,A23063T,G23948T,C22686T,G22992A,C23525T,G28881A,T670G,A24424T,C2790T,A23055G,C9534T,C19955T,T23075C,T23599G,C26577G,G28883C,A29510C,C23854A,T24469A,C26060T,G26709A,G10447A,T22200G,C3037T,C21618T,A22688G,G22813T,C27807T,C241T,C10198T,A18163G,C25584T,C4321T,C9344T,T22679C,A23013C,G28882A,C10029T,C14408T,C22674T,C23604A,[1-of: T11288-,C11289-,T11290-,G11291-,G11292-,T11293-,T11294-,T11295-,T11296-],[1-of: T21633-,A21634-,C21635-,C21636-,C21637-,C21638-,C21639-,T21640-,G21641-,T21765-,A21766-,C21767-,A21768-,T21769-,G21770-],[1-of:G28362-,A28363-,G28364-,A28365-,A28366-,C28367-,G28368-,C28369-,A28370-]]
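
For readers unfamiliar with the query syntax, the following is a rough sketch of how such a query could be evaluated against one sample's mutation list. It assumes that `[k-of: ...]` means "at least k of the listed terms", that `&` is a logical AND, and that `!` negates a clause. The helpers `n_of`, `all_of`, and `negate` and the two sample sets are hypothetical, and only the first three clauses of the BA.5 query are reproduced (the long 30-of backbone clause is left out for brevity). This is not CovSpectrum's implementation.

```python
def n_of(k, *terms):
    """Predicate: true when at least k terms match. A term is either a
    mutation string or another predicate (for nested [1-of: ...] groups)."""
    def check(sample):
        hits = sum(1 for t in terms if (t(sample) if callable(t) else t in sample))
        return hits >= k
    return check


def all_of(*preds):
    """Predicate: logical AND of clauses (the '&' in the query)."""
    return lambda sample: all(p(sample) for p in preds)


def negate(pred):
    """Predicate: logical NOT of a clause (the '!' in the query)."""
    return lambda sample: not pred(sample)


# First three clauses of the BA.5 query above; the 30-of clause is omitted.
ba5_core = all_of(
    n_of(1, "G26529A", "C27889T"),
    n_of(3, "G12160A", "T22917G", "T23018G", "G26529A", "C27889T"),
    negate(n_of(2, "C9866T", "A23040G", "C26858T", "A27259C",
                "G27382C", "A27383T", "T27384C")),
)

# Hypothetical samples, each given as a set of nucleotide mutations.
ba5_like = {"G12160A", "T22917G", "T23018G", "G26529A", "C27889T"}
ba4_like = {"G12160A", "T22917G", "T23018G", "G27788T", "C28724T",
            "C26858T", "A27259C", "G27382C"}

print(ba5_core(ba5_like))  # True
print(ba5_core(ba4_like))  # False
```

Because the first clause is a 1-of rather than a 2-of, a sample that is missing one of the two BA.5-specific mutations (for example through a dropout at that position) still satisfies these clauses.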

@shay671
Author

shay671 commented Jun 4, 2022

The problems persist.
In Botswana, CovSpectrum based on Pango gives 0% for BA.4 in recent weeks; our query gets ~60%.
In Switzerland, CovSpectrum based on Pango gives 0% for BA.5 this week; our query gets 47.1%.

@AngieHinrichs
Member

Thanks @shay671 for looking into this and providing the queries. If possible, can you run pangolin in the default UShER mode with the --skip-scorpio flag? Scorpio is needed as a double-check on pangoLEARN assignments, but it's been very tough to get the scorpio/constellations rules just right for BA.2/BA.4/BA.5 and at the moment scorpio errs on the side of Unassigned/BA.2 vs. BA.4/BA.5. Meanwhile, UShER mode seems to do OK for BA.2/BA.4/BA.5. (See also cov-lineages/scorpio#47.)
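
As a rough sketch of the run suggested above, the command can be built and launched from Python. The only pangolin options used are the ones named in this comment: the default analysis mode (UShER in pangolin v4) and `--skip-scorpio`. The function name `pangolin_command` and the input file name `sequences.fasta` are placeholders, and the call is only executed if pangolin is installed and the file exists.

```python
import os
import shutil
import subprocess


def pangolin_command(fasta_path):
    """Build the pangolin call: default (UShER) mode with scorpio skipped."""
    return ["pangolin", "--skip-scorpio", fasta_path]


fasta = "sequences.fasta"  # placeholder input file
cmd = pangolin_command(fasta)
print(" ".join(cmd))

# Run only when pangolin is on PATH and the input file is present.
if shutil.which("pangolin") and os.path.exists(fasta):
    subprocess.run(cmd, check=True)
```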

Also, it seems that two months after the release of pangolin v4.0 with UShER as the default analysis mode and an assignment cache option to compensate for UShER being slower than pangoLEARN, GISAID is still using pangoLEARN for assignments. I'm hoping that if multiple community members ask GISAID to use pangolin v4's UShER mode and assignment caches (with --skip-scorpio until scorpio/constellations are updated to handle BA.2/BA.4/BA.5 with good specificity and sensitivity), perhaps they will listen.

@shay671
Author

shay671 commented Jun 11, 2022

Hi @AngieHinrichs,
I haven't looked into Scorpio yet (I mainly run the online Pango version), but I will now.
I work together with @talyash and @Boltyansky on the designation rationale for different variants. We would love to help the international teams (Pango/Scorpio/usher/next/etc.) with anything we can, like the work mentioned here.
One project we maintain is a detailed mutation list for variants:
bodek 09.6 .csv
We also have a derivative that focuses on convergence (across all saltation variants, not just the VOIs/VOCs).

@FedeGueli
Contributor

Hi @shay671, I noticed that your list has C27889T as a defining mutation of BA.5. I have spotted a few BA.5 sequences that carry 27889C (as in BA.4) instead of 27889T. Most of them are surely sequencing issues where the base was called as reference, but one Portuguese sublineage could be real: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3ac76_30e1d0.json?branchLabel=aa%20mutations&c=gt-nuc_27889&label=nuc%20mutations:C1912T,C2232T,C3317T,T3358C
Schermata 2022-06-13 alle 00 16 17

cc @AngieHinrichs

@corneliusroemer
Contributor

I think this has been more or less fixed recently.
