frontend selection of mutation genetic profiles may need improvements #1646
Comments
I'm seeing this in the logs:
I have continued looking into this. I just queried genes TP53 and AKT1 across all provisional studies plus 3-4 additional high-sample breast cancer studies. Of the 34 studies queried, 15 failed to show mutations in the Mutations tab even though those mutations were visible on the oncoprint, instead reporting, for example, "There are no TP53 mutations in the selected samples." Affected study ids:
Following up on the comment from @n1zea144, I went to the public-portal-beta.log and found many errors similar to the one he found. Example:
core/src/main/java/org/mskcc/cbio/portal/util/OmaLinkUtil.java
Ok, I think I have it: the code in MutationDataUtils.java checks for the value "NA" and hardcodes an "NA" into the link response, which is then not displayed. But I am guessing that we switched away from using "NA" in our database at some point. I remember a discussion that there was a gene named "NA" (ENSG00000047597 ?) as one reason to switch away. So the "[Not Available]" string is actually a replacement for "NA", but it slips through this code because the check only looks for "NA" verbatim. This probably has nothing to do with the domain names used for mutationassessor, and the fix should be relatively easy.
Confirmed: in the latest version of cgds_public on dashi, the mutation assessor link fields in mutation_event contain the string "[Not Available]" for 101404 rows out of 2496234 total. In my local copy of cgds_public from a couple of months ago, there are no such entries. Any query which constructs a link to omaRedirect.do? on one of these mutations will throw a MalformedURLException with the current codebase, so the error occurs whenever one of these 101404 mutations is present in the query results.
One good example test case is mutation TP53 L194R. This mutation event is marked "[Not Available]" in the mutation event record links, but is present in the following profiles:
Additional searching shows that there are three "non-link" values to handle from the current databases we are using: "NA", "[Not Available]", and ""
mysql> select link_pdb, count(1) from mutation_event where link_pdb not like '%pdb%' group by link_pdb;
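For illustration, the three "non-link" values could be checked in one place before any URL is built. This is only a minimal sketch: the class LinkValues and the method isNonLink are hypothetical names, not existing code in MutationDataUtils.java or OmaLinkUtil.java, and treating null or whitespace-padded values as missing is an added assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the cBioPortal codebase.
final class LinkValues {
    // The three placeholder values reported for the link columns of mutation_event.
    private static final Set<String> NON_LINK_VALUES =
            new HashSet<>(Arrays.asList("NA", "[Not Available]", ""));

    private LinkValues() {}

    /** Returns true when the stored value should not be turned into a URL. */
    static boolean isNonLink(String value) {
        // Assumption: null means "no link"; trim so that " NA " is handled like "NA".
        return value == null || NON_LINK_VALUES.contains(value.trim());
    }
}
```

A caller would test isNonLink(link) first and only construct the omaRedirect.do link when it returns false, so that a placeholder never reaches the URL constructor.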
This issue has been partially fixed now, via:
I am changing this issue from a Bug to an Enhancement now. It is possible this never happens: if the query results page only ever loads a single mutation profile, maybe we don't need to fix anything. But in general 'getMutationProfileIds' might return more than one mutation profile, and in DataProxyFactory.js we are setting servletParams.geneticProfiles = mutation_profile_ids[0]; which uses only the first one.
adding participants: @jjgao @adamabeshouse @onursumer
I am closing this issue now, because each study currently has only a single genetic profile of type EXTENDED_MUTATION. If in the future we have multiple mutation profiles per study, we may need to revisit this issue.
Originally, this issue was a bug report about unpopulated mutation lists in the results page.
The bugs have been fixed, but there is still possible ambiguity about which genetic profile should be used for frontend visualization (such as the mutationmapper) when there is more than one mutation profile to choose from.
Below Here Is The Original Bug Report
Views accessed through the /beta deployment (rc?) show unpopulated mutation lists:
(suspect malfunctioning mutation data servlets - oncoprint looks ok)
@ersinciftci @n1zea144
PatientView (accessed through sample list in study view):
http://www.cbioportal.org/beta/case.do?cancer_study_id=brca_tcga&case_id=TCGA-3C-AAAU
Local debugging shows some exceptions:
SEVERE: Servlet.service() for servlet [MutationsJSON] in context with path [/cbioportaltest] threw exception
java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:592)
at java.lang.Integer.parseInt(Integer.java:615)
at org.mskcc.cbio.portal.servlet.MutationsJSON.getDrugs(MutationsJSON.java:425)
at org.mskcc.cbio.portal.servlet.MutationsJSON.processGetMutationsRequest(MutationsJSON.java:276)
at org.mskcc.cbio.portal.servlet.MutationsJSON.processRequest(MutationsJSON.java:126)
at org.mskcc.cbio.portal.servlet.MutationsJSON.doPost(MutationsJSON.java:786)
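The trace above shows Integer.parseInt being handed an empty string. A minimal sketch of a guard follows, assuming an empty value simply means "no ID supplied". SafeParse and parseOptionalInt are hypothetical names, and this is not the actual code in MutationsJSON.getDrugs, whose input is not shown here.

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of the cBioPortal codebase.
final class SafeParse {
    private SafeParse() {}

    /** Parses an integer, returning an empty result instead of throwing on blank or malformed input. */
    static OptionalInt parseOptionalInt(String raw) {
        // Assumption: null or blank input means the value was not supplied.
        if (raw == null || raw.trim().isEmpty()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Malformed input is treated the same as missing input in this sketch.
            return OptionalInt.empty();
        }
    }
}
```

Whether to skip such values silently or log them is a design choice; logging would make data problems like the one in this report easier to spot.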
Through Query Page: (query study brca_tcga on genes TP53 BRCA1)
There are many mutations in this query, but the "Mutations" Tab is empty for both genes.
http://www.cbioportal.org/beta/index.do?cancer_study_list=brca_tcga&cancer_study_id=brca_tcga&genetic_profile_ids_PROFILE_MUTATION_EXTENDED=brca_tcga_mutations&genetic_profile_ids_PROFILE_COPY_NUMBER_ALTERATION=brca_tcga_gistic&Z_SCORE_THRESHOLD=2.0&RPPA_SCORE_THRESHOLD=2.0&data_priority=0&case_set_id=brca_tcga_cnaseq&case_ids=&patient_case_select=sample&gene_set_choice=user-defined-list&gene_list=TP53+BRCA1&clinical_param_selection=null&tab_index=tab_visualize&Action=Submit&show_samples=false&\