feat: add hart2015 essentiality suite #563
- GEO: GSE75189
This is awesome! A quick thought: can the xlsx file be avoided? |
no, not at the moment - would like to retain the original Hart2015 data file intact |
I understand - it makes a lot of sense. And since this file will not be modified, it's not going to continuously enlarge the repository. |
thanks - yes there's no need to change the content, any modification, such as converting into a plain-text version (which does make sense), would confuse people though |
Is the aim to run the essentiality suite manually in conjunction with every release? |
yes, this test can be applied to evaluate each release among others |
Updated essentiality evaluation using combined (
The results show that the curations accumulated since v1.15 have a positive effect on TP and FN, but adversely affect TN and FP (by which PRs?) |
now it might be better to spin out a new release, to present our recent work for community inspection. it would also be good at some point to look into which PRs led to increased FP, probably by decreasing TN |
To me, this is already done in the
Why "at some point" and not now when the changes are more fresh? |
the inspection is best achieved by using the model, and community access is mainly through the releases, instead of the
sure go ahead please |
Code example for running essentiality analysis with Hart2015 datasets:

    % load model and essential tasks
    load('Human-GEM.mat'); % or: ihuman = importYaml('Human-GEM.yml');
    taskStruct = parseTaskList('metabolicTasks_Essential.txt');

    % generate tINIT models and estimate essential genes
    eGenes = estimateEssentialGenes(ihuman, 'Hart2015_RNAseq.txt', taskStruct);

    % compare model predictions with experimental data
    results = evaluateHart2015Essentiality(eGenes);

It may take 2-4 hours or more, depending on the computer's configuration, to complete this analysis on a laptop or desktop. |
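For readers following the TP/FN/TN/FP discussion in this thread, here is a minimal sketch of how such counts can be derived when a set of predicted essential genes is compared against a set of experimentally determined fitness genes. It is written in Python purely for illustration and is not the code of `evaluateHart2015Essentiality`; the function names `confusion_counts` and `summary_metrics`, the choice of gene universe, and the reported metrics (sensitivity, specificity, MCC) are all assumptions.

```python
from math import sqrt


def confusion_counts(predicted, experimental, universe):
    """Count TP/FP/FN/TN for predicted vs. experimental essential genes.

    Hypothetical helper: `universe` is the set of genes that both the model
    and the experiment can be scored on (an assumption of this sketch).
    """
    universe = set(universe)
    predicted = set(predicted) & universe
    experimental = set(experimental) & universe
    tp = len(predicted & experimental)   # predicted essential, confirmed
    fp = len(predicted - experimental)   # predicted essential, not confirmed
    fn = len(experimental - predicted)   # missed by the model
    tn = len(universe) - tp - fp - fn    # non-essential in both
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def summary_metrics(counts):
    """Derive sensitivity, specificity and MCC from the four counts."""
    tp, fp, fn, tn = (counts[k] for k in ("TP", "FP", "FN", "TN"))
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else nan
    return {"sensitivity": sensitivity, "specificity": specificity, "MCC": mcc}


if __name__ == "__main__":
    # Toy gene identifiers, not taken from the Hart2015 data.
    genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
    counts = confusion_counts({"g1", "g2", "g3"}, {"g2", "g3", "g4"}, genes)
    print(counts, summary_metrics(counts))
```

Under this framing, a curation that makes the model predict more genes as essential can raise TP and lower FN while also raising FP and lowering TN, which is the trade-off raised in the comments above.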
This is really nice, good work @haowang-bioinfo. I agree it's a bit unfortunate with the large-ish .xlsx file, but as @mihai-sysbio says, it won't need to be changed.
Shouldn't we add this then to an action that would run for every pull request from |
not sure if this is suitable for GH action, because the computation load is very heavy. Probably we would know better after solving #635 |
The previous steps work fine, but the last one encounters an error
|
- cherry pick `adjust_pvalues` function from repo GeneSetAnalysisMatlab for convenient use
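As background on what a p-value adjustment helper typically does, the sketch below implements the Benjamini-Hochberg procedure in Python. This is only an illustration: the thread does not say which correction method the suite applies through `adjust_pvalues`, so the choice of Benjamini-Hochberg and the function name `benjamini_hochberg` are assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only; not the `adjust_pvalues` implementation.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy p-values, not results from the essentiality suite.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.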
ah, yes - now |
Ran successfully and got this result
|
LGTM!
Main improvements in this PR:
This PR provides the infrastructure for an essentiality test suite, as proposed in #390, using the Hart2015 datasets, with the following components:
- `Hart2015_RNAseq.txt`: RNA-seq sequences of Hart2015 cell lines (GEO: GSE75189)
- `Hart2015_TableS2.xlsx`: Calculated Bayes factors for CRISPR targeted genes
- `getTaskEssentialGenes`: Identify genes essential for different tasks in different tINIT models
- `estimateEssentialGenes`: Generate tINIT models and estimate essential genes
- `evaluateHart2015Essentiality`: Evaluate and compare Hart2015 experimental fitness genes with predicted results

I hereby confirm that I have:
- Selected `develop` as a target branch