Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

map variant annotation with mutation by originalVariantQuery #3327

Merged
merged 1 commit into from
Jul 31, 2020

Conversation

leexgh
Copy link
Member

@leexgh leexgh commented Jul 23, 2020

Fix cBioPortal/cbioportal#7707

Describe changes proposed in this pull request:

  • map variant annotation with mutation by originalVariantQuery

Example:
https://deploy-preview-3327--cbioportalfrontend.netlify.app/results/mutations?Action=Submit&RPPA_SCORE_THRESHOLD=2.0&Z_SCORE_THRESHOLD=2.0&cancer_study_list=msk_impact_2017&case_set_id=msk_impact_2017_cnaseq&data_priority=0&gene_list=ABL1&geneset_list=%20&genetic_profile_ids_PROFILE_COPY_NUMBER_ALTERATION=msk_impact_2017_cna&genetic_profile_ids_PROFILE_MUTATION_EXTENDED=msk_impact_2017_mutations&mutations_transcript_id=ENST00000318560&profileFilter=0&tab_index=tab_visualize

Filter by P-0006124-T01-IM5.

The genomic location of this mutation is

9,133759441,133759501,GCGCCTTCTCCCCAAAGACAAAAAGACCAACTTGTTCAGCGCCTTGATCAAGAAGAAGAAG,GCGCCTTCTCCCCAAAGACAAAAAGACCAACTTGTTCAGCGCCTTGATCAAGAAGAAGAAGA

But in variant annotation is

9,133759501,133759502,-,A

Mapping withoriginalVariantQuery can make this variant show up in the mutation table

@leexgh leexgh requested a review from inodb July 24, 2020 18:33
Copy link
Member

@inodb inodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @leexgh !

This looks great!

This specific example case is actually interesting because of some other reasons as well unrelated to the PR. That is P-0006124-T01-IM5 ABL1 mutation is annotated as inframe in the portal, but as a frameshift in genome nexus. That mutation event in the internal mskimpact cohort for the same sample in ABL1 is different:

133759441 | 133759501 | GCGCCTTCTCCCCAAAGACAAAAAGACCAACTTGTTCAGCGCCTTGATCAAGAAGAAGAAG | CAGCCCCAACCCCTCCCAA

Anyway I just wrote it down in the review here, b/c it might be worth investigating further.

Code itself looks great to me! One question: I forget does this code also apply to how hotspots are annotated in e.g. oncoprint? Just want to make sure that we are properly annotating all hotspots as well now

@leexgh
Copy link
Member Author

leexgh commented Jul 27, 2020

@inodb good point! This pr only fix mapping in mutation table, it doesn't touch Hotspots fetching in oncoprint.

@leexgh
Copy link
Member Author

leexgh commented Jul 27, 2020

@inodb Cancer Hotspots data fetching is using POST /cancer_hotspots/genomic, but only /annotation/ endpoints have originalVariantQuery, so I think the hotspots are fine for now, at least they remain the same.

Since we use genomic location to map annotations in a lot of places, I change the implementation to have both genomic locations from mutations and originalVariantQuery as keys when they are different, so it may have two genomic locations
point to the same annotation, thus we don't need to change other logic in the codebase.

Copy link
Member

@inodb inodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this could work as temp fix but I don't fully understand yet why we can't use originalVariantQuery everywhere (see comment)

Comment on lines 334 to 339
if (key !== annotation.originalVariantQuery) {
indexAnnotationsByGenomicLocation[
annotation.originalVariantQuery
] = annotation;
}
return indexAnnotationsByGenomicLocation;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leexgh could you add some comments in the code that describe what you mentioned in the PR? From what I understand this is mostly to avoid changing the logic in other places. Using genomicLocation to map back won't always be correct, so don't you think it's best to use originalVariantQuery everywhere? E.g. we could use the annotation endpoint instead to get the hotspots. If it's too much work to change everywhere, one option is to add a // TODO: explaining what needs to be done before we can use originalVariantQuery everywhere

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@inodb I see, I found we also use genomic location to query myVariantInfo so I did this temp fix to avoid too much code changes. But yes using originalVariantQuery is more clear. I'm not exactly sure how many places we need to change, shouldn't be too much I guess. I'll try to change them all using annotation endpoint and mapping with originalVariantQuery

Copy link
Member

@inodb inodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thank you!

@inodb inodb merged commit f835205 into cBioPortal:master Jul 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants