ORF1ab is listed in JSON genome_annotations, but not in dropdown for Color by: Genotype #1699

Open
AngieHinrichs opened this issue Sep 18, 2023 · 4 comments
Labels
bug Something isn't working


@AngieHinrichs

Hi! This might be an Auspice thing, but since I'm using the nextstrain.org/fetch/ function I'll file it here.

Current Behavior

When I view https://nextstrain.org/fetch/hgwdev.gi.ucsc.edu/~angie/whereIsOrf1AB.json and select Color by: Genotype, the gene menu does not include ORF1ab even though it is the first item in the JSON's genome_annotations list:

...
"genome_annotations": {
  "ORF1ab": {"start": 266, "end": 21555, "strand": "+", "type": "CDS"},
  "S": {"start": 21563, "end": 25384, "strand": "+", "type": "CDS"},
  "ORF3a": { ...

The Color by: Genotype gene menu lists nucleotide, S, ORF3a, ...:
[screenshot of the gene menu]

Expected behavior

I would expect the Color by: Genotype gene menu to list nucleotide, ORF1ab, S, ORF3a, ....

How to reproduce

  1. Go to https://nextstrain.org/fetch/hgwdev.gi.ucsc.edu/~angie/whereIsOrf1AB.json
  2. Select Color by: Genotype
  3. Try to choose ORF1ab from the gene menu... it's missing.

Possible solution

Is 'nucleotide' perhaps replacing the first element instead of being prepended to the list?

Your environment: if browsing Nextstrain online

  • Mac OS X 10.15.7
  • Browser: Chrome 114.0.5735.198

Additional context

HT @FedeGueli

@AngieHinrichs AngieHinrichs added the bug Something isn't working label Sep 18, 2023
@joverlee521
Contributor

Hi @AngieHinrichs,

When I open the console on the page, I see the following error coming from Auspice:

[Genome annotation] ORF1ab has length 21290 which is not a multiple of 3

With the latest Auspice updates made by @jameshadfield, you should be able to define the two segments separately in the genome annotations:

  "ORF1ab": {
    "strand": "+",
    "segments":[
      {"start": 266, "end": 13468},
      {"start": 13468, "end": 21555}
    ]
  },
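
The arithmetic behind the console error and the segmented fix can be checked with a short Python sketch. It assumes coordinates are 1-based and inclusive, which is consistent with the length of 21290 that Auspice reports for 266..21555; `cds_length` is a hypothetical helper written for this illustration, not an Auspice function.

```python
def cds_length(segments):
    """Total length of a CDS from 1-based, inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


# Original annotation: one range covering the whole of ORF1ab.
single = [(266, 21555)]

# Segmented annotation: position 13468 is the end of the first segment
# and the start of the second, so it is counted twice.
segmented = [(266, 13468), (13468, 21555)]

for name, segs in [("single range", single), ("segmented", segmented)]:
    n = cds_length(segs)
    print(f"{name}: length {n}, remainder mod 3 = {n % 3}")
```

Under that assumption the single range has length 21290 (remainder 2), while the two segments sum to 13203 + 8088 = 21291, which is a multiple of 3.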

@joverlee521 joverlee521 transferred this issue from nextstrain/nextstrain.org Sep 18, 2023
@joverlee521
Contributor

It would be helpful to make this error more obvious to users by dispatching a warning or error notification.

@jameshadfield
Member

Hey @AngieHinrichs - @joverlee521's summarised things perfectly, but note that the segmented annotations can't yet be produced by the augur tools, so you'll have to add them via a short Python script. Here's an example of a Python script I used in testing to manipulate the ncov JSONs to produce segmented annotations for the 2 CDSs which cover the slip site (RdRp and ORF1ab). Internally we debated changing all our ncov datasets from separate ORF1a + ORF1b CDSs to the more correct ORF1ab CDS, but I don't think we will do this, as so many people (and pango designations) are using ORF1b numbering; we will probably add the 16 proteins cleaved from the polyproteins, though.
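
The linked script is not reproduced in this thread. As a rough sketch of the same idea (not the script referred to above), the following Python rewrites a single-range ORF1ab entry into the two-segment form suggested earlier in the thread. It assumes the annotations live under `meta.genome_annotations` in the dataset JSON; the function names, the `slip_site` parameter, and the file paths passed to `rewrite_file` are all hypothetical.

```python
import json


def segment_orf1ab(dataset, slip_site=13468):
    """Replace the single-range ORF1ab annotation with two segments.

    Assumes the dataset keeps annotations under meta.genome_annotations
    and that ORF1ab currently has plain "start"/"end"/"strand" keys.
    """
    annotations = dataset["meta"]["genome_annotations"]
    old = annotations["ORF1ab"]
    # Assigning to an existing key keeps ORF1ab in its original position.
    annotations["ORF1ab"] = {
        "strand": old["strand"],
        "segments": [
            {"start": old["start"], "end": slip_site},
            {"start": slip_site, "end": old["end"]},
        ],
    }
    return dataset


def rewrite_file(src_path, dst_path):
    """Load a dataset JSON, segment ORF1ab, and write the result."""
    with open(src_path) as fh:
        dataset = json.load(fh)
    with open(dst_path, "w") as fh:
        json.dump(segment_orf1ab(dataset), fh)


# In-memory example using the coordinates from this issue.
example = {
    "meta": {
        "genome_annotations": {
            "ORF1ab": {"start": 266, "end": 21555, "strand": "+", "type": "CDS"},
            "S": {"start": 21563, "end": 25384, "strand": "+", "type": "CDS"},
        }
    }
}
print(json.dumps(segment_orf1ab(example)["meta"]["genome_annotations"], indent=2))
```

The slip-site coordinate defaults to 13468 because that is the value used in the suggested annotation; other annotations in the file are left untouched.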

@AngieHinrichs
Author

Ah, thanks @joverlee521 and @jameshadfield! I wish I'd thought to check the console. OK, I will update the ORF1ab coords in the JSON to list the segments. My code has been adding ORF1ab mutation annotations to the nodes so it didn't occur to me that the ORF1ab coords would really matter for anything besides drawing the genes down below. 🙂
