Add script that allows for update to existing gene panel #7695
21b2133
to
f0d9601
Compare
These are some sketches I made of edge cases in comparing incoming genes with original genes. So far, I've tested intersection case 1 on pages 1 and 2, where A = imported (the test panel to overwrite with) and B = original (the test panel from the database). The first image corresponds to the Before and After in the PR's top comment. For the second image, I tested the TESTPANEL2 text file against an unchanged TESTPANEL2.
I've just tested cases 1-7. The only change I made was for case 1, where importing an empty gene list was previously disallowed. Case 1 now allows the user to delete all genes from the original gene panel by importing an empty gene panel.
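The cases above come down to two set differences between the incoming genes (A) and the original genes (B): genes only in A are added, genes only in B are removed, and the intersection is left untouched. A minimal sketch, assuming genes can be compared by Entrez gene ID; the class and method names here are hypothetical and are not taken from the PR's GenePanelUtil:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; genes are modelled as Entrez gene IDs.
final class GenePanelDiffSketch {

    // Genes present in the incoming panel (A) but not in the original panel (B).
    static Set<Long> toAdd(Set<Long> incoming, Set<Long> original) {
        Set<Long> result = new TreeSet<>(incoming);
        result.removeAll(original);
        return result;
    }

    // Genes present in the original panel (B) but not in the incoming panel (A).
    static Set<Long> toRemove(Set<Long> incoming, Set<Long> original) {
        Set<Long> result = new TreeSet<>(original);
        result.removeAll(incoming);
        return result;
    }

    public static void main(String[] args) {
        Set<Long> incoming = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> original = new TreeSet<>(Arrays.asList(2L, 3L, 4L));
        System.out.println("add: " + toAdd(incoming, original));       // prints "add: [1]"
        System.out.println("remove: " + toRemove(incoming, original)); // prints "remove: [4]"
    }
}
```

With an empty incoming set, toAdd is empty and toRemove equals the whole original panel, which is the "delete everything" case discussed for case 1.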
core/src/main/java/org/mskcc/cbio/portal/scripts/UpdateGenePanel.java
}
}
}
/*
I have a hunch that Windows is changing the newline characters in this file, hence the weird diff. I'll look around for solutions.
I think so too, as I get this message from doing commits:
jtquach@DESKTOP-IST3H2B:~/cbioportal_windows10$ git add core/src/main/java/org/mskcc/cbio/portal/scripts/UpdateGenePanel.java
warning: LF will be replaced by CRLF in core/src/main/java/org/mskcc/cbio/portal/scripts/UpdateGenePanel.java.
The file will have its original line endings in your working directory
bc0843e
to
e9c9eec
Compare
d591058
to
dad1b5b
Compare
9f1c44b
to
d6c2b64
Compare
This looks great. I left a few tiny changes, but for the most part, I think this is ready to ship.
c395620
to
075eb15
Compare
Nice cleanup to existing code. Some comments for consideration.
GenePanel genePanel = DaoGenePanel.getGenePanelByStableId(stableId);

if (genePanel == null) {
    ProgressMonitor.logWarning("Gene panel " + stableId + " does not exist in the database! Exiting.");
Just wondering if there was any thought to combining the update functionality into the Import script?
According to Luke, the import and update functionality should be kept separate from each other.
@n1zea144 Leaving them as separate scripts makes it harder to update an existing gene panel by mistake when you meant to upload a new version of a gene panel for new samples. I think this is an easy mistake to make: the user copies the gene panel file, vims it, updates the genes, but forgets to change the name of the panel before running the script. By keeping the two scripts separate, we make sure that a user who ignores warning messages and just presses enter a bunch cannot end up making this mistake.
524a21e
to
d2b14f7
Compare
    pstmt.setLong(2, canonicalGene.getEntrezGeneId());
    pstmt.executeUpdate();
}
for (CanonicalGene canonicalGene : toRemove) {
It looks like it's possible to remove all genes from the panel. Should we check for this and, if so, should we prevent it from happening?
In UpdateGenePanel.java, the user prompt in importData() verifies the changes that will be made to the gene panel, like so:
Other than that, I asked Luke if importing an empty gene panel should allow for the removal of all existing genes, and he stated that behavior sounds reasonable. I suppose if the user wanted to add back the genes, they could pull up an older version of their input file with the gene names.
I'll ping Luke about whether the gene panel should also be removed from gene_panel_list once all of its genes are removed.
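For readers without the screenshot, a preview-then-confirm step of the kind described above might look roughly like the sketch below. This is not the PR's importData(); the class name, method name, prompt wording, and gene symbols are all hypothetical, and it assumes that only an explicit "y" proceeds, so a user who just presses enter aborts:

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch of a preview-then-confirm step; not the PR's implementation.
final class UpdateConfirmationSketch {

    // Prints the pending changes, then returns true only on an explicit "y" or "Y".
    // Empty input or end of input counts as a refusal (an assumption of this sketch).
    static boolean confirm(List<String> toAdd, List<String> toRemove,
                           Scanner in, PrintStream out) {
        out.println("This script UPDATES an existing gene panel; it does not import a new one.");
        out.println("Genes to add: " + toAdd);
        out.println("Genes to remove: " + toRemove);
        out.print("Type y to proceed with the update: ");
        if (!in.hasNextLine()) {
            return false;
        }
        return in.nextLine().trim().equalsIgnoreCase("y");
    }

    public static void main(String[] args) {
        // Illustrative gene symbols; a real run would read from System.in.
        Scanner in = new Scanner("n\n");
        boolean proceed = confirm(Arrays.asList("TP53"), Arrays.asList("KRAS"), in, System.out);
        System.out.println(proceed ? "Updating." : "Aborted.");
    }
}
```

Passing the Scanner and PrintStream in as parameters keeps the prompt testable without touching System.in.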
ah, so removal of all genes means stop the update script and don't make the deletions
It's not meaningful to have a panel in the database, which samples refer to in sample_profile, that contains no genes. I'm also not sure what the effect is on the web page. One approach is to remove the panel and the associated records in sample_profile (via cascade), but that's a bit messy and could happen by mistake.
Spoke with @Luke-Sikina via Slack. Best approach is not to prevent an update which results in empty panel in database.
Oh, does that mean that within UpdateGenePanel, checking if the incoming set of genes is empty and aborting the script is not needed?
I've just added the check on lines 77-80 on UpdateGenePanel, after the gene panel null check:
if (genePanel == null) {
    ProgressMonitor.logWarning("Gene panel " + stableId + " does not exist in the database! Exiting.");
    return;
}
if (canonicalGenes == null || canonicalGenes.isEmpty()) {
    ProgressMonitor.logWarning("Incoming gene panel is empty, which would result in the removal of all genes from gene panel " + stableId + ". Exiting.");
    return;
}
(Edited, since I realized I should use ProgressMonitor.logWarning instead of System.err.println.)
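One way to make these two guards easy to unit test is to pull them into a small function that returns the warning to log, if any. The sketch below is only an illustration: the PR keeps the checks inline in UpdateGenePanel, and the class and method names here are hypothetical. The panel is typed as Object so the example stays self-contained:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Optional;

// Hypothetical refactoring for illustration; the PR keeps these checks inline.
final class UpdatePreconditionsSketch {

    // Returns the warning to log if the update must abort, or empty if it may proceed.
    static Optional<String> abortReason(String stableId, Object existingPanel,
                                        Collection<?> incomingGenes) {
        if (existingPanel == null) {
            return Optional.of("Gene panel " + stableId
                    + " does not exist in the database! Exiting.");
        }
        if (incomingGenes == null || incomingGenes.isEmpty()) {
            return Optional.of("Incoming gene panel is empty, which would result in the"
                    + " removal of all genes from gene panel " + stableId + ". Exiting.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Object panel = new Object(); // stand-in for a GenePanel loaded from the database
        System.out.println(abortReason("TESTPANEL2", null, Arrays.asList(1L)));
        System.out.println(abortReason("TESTPANEL2", panel, Collections.emptyList()));
        System.out.println(abortReason("TESTPANEL2", panel, Arrays.asList(1L)));
    }
}
```

The caller would log the returned message with ProgressMonitor.logWarning and return, which matches the behavior of the inline version.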
Yeah, checking/aborting when the incoming gene list from the panel file is empty should suffice, thanks for the adjustment.
Thank you for the reviews!
I've just updated the markdown file again to reflect that trying to import an empty gene panel will abort the script.
0468141
to
83c7b91
Compare
Added UpdateGenePanel, based off ImportGenePanel
Refactored and moved identical methods into new util file
Updating allows user to add/remove genes from existing gene panel in database
Allow empty gene panel to be inputted, but abort script if so
Prompt user with Y/n to notify that this is an update and not import script
Preview genes to be added/removed before finalizing Y/n prompt
Unit tests for utility methods
Updated ImportGenePanel calls
Formatted code for readability
Added javadocs
Added null check in extractPropertyValue
Added 12 testpanel genes and their aliases for testing
Added Perl script and updated markdown documentation
Edited markdown documentation on how to use the update script
Co-authored-by: Luke Sikina <lucas.sikina@gmail.com>
83c7b91
to
e483396
Compare
Fix #7286
Describe changes proposed in this pull request:
Any screenshots or GIFs?
Before:
After: