Revert to use BigWig (geneBody_coverage2.py) to calculate the gene body coverage? #195
Comments
Interesting! We already removed the subsampled approach, though. Could you compare it to the latest version with |
@ewels Our currently deployed pipeline uses geneBody_coverage2.py (BigWig file as input), and I do not know when it was changed to the subsampled approach (lines 810 to 812 in 37f260d).
geneBody_coverage3.py? Did you mean geneBody_coverage2.py in RSeQC v3.0.0 (the latest version)? I have tested it on quite a few projects; it worked well and took just 1~2 hrs. Shall we revert to using BigWig? |
I'm a bit lost with this. @apeltzer - can you remember where we are with this process? Any thoughts on the above? |
I'm a bit lost too, but I'll try to summarize. 1.) Initially, this pipeline did run That's just the historical summary of what we have already tried. We can, of course, use RSeQC 3.0.0 and revert to subsampled + bigWig on these, which should be by far the fastest way of doing this. I haven't tested this so far, though. |
@apeltzer yes, if we use bigwig as the input of |
Hey both, if it is fast indeed, do we need to change anything then? Or just revert back to BAM -> bigWig -> |
This just came in on the Slack channel:
|
So maybe we could actually do it in the way that we subsample and use |
I think "just revert back to BAM -> bigWig -> geneBody_coverage2.py", without the subsampling step, is the fastest way, since subsampling itself takes some time but won't speed things up much. |
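For anyone following along, the BAM -> bigWig -> geneBody_coverage2.py route could be sketched roughly as below. This is only an illustration, not what the pipeline actually runs: the thread does not say which tool does the BAM to bigWig conversion, so deepTools `bamCoverage` is an assumption here, and the file names and the `coverage_commands` helper are hypothetical. The script only builds and prints the two command lines; please check each tool's `--help` before running them.

```python
import shlex


def coverage_commands(bam, bed12, prefix):
    """Build the two command lines for BAM -> bigWig -> gene body coverage.

    Assumptions (not taken from the thread): deepTools `bamCoverage` does the
    conversion, and `bam` is coordinate-sorted and indexed.
    """
    bigwig = f"{prefix}.bigWig"
    # Step 1: convert the BAM file to a bigWig coverage track.
    to_bigwig = ["bamCoverage", "--bam", bam, "--outFileName", bigwig]
    # Step 2: run RSeQC on the bigWig file instead of the BAM file.
    gene_body = ["geneBody_coverage2.py", "-i", bigwig, "-r", bed12, "-o", prefix]
    return [to_bigwig, gene_body]


# Hypothetical file names, for illustration only.
commands = coverage_commands("sample.bam", "genes.bed12", "sample")
for cmd in commands:
    print(shlex.join(cmd))
```

Note that there is no subsampling step in this sketch, which is the point being made above.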
Because it just came up in the Slack channel: Could you test using the |
|
This has been removed in PR #195, and will be fixed when the |
The geneBody_coverage2.py script (which takes a BigWig file as input) has been improved a lot in RSeQC v3.0.0 by the addition of the pyBigWig package. It took ~50 min for a 12 GB BAM file (pipeline duration: 6h 33m 33s).
In contrast, geneBody_coverage.py on a subsampled BAM (7.6 GB) took ~17 hrs (pipeline duration: 1d 15h 44m 56s), which is much longer.
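To put the two runs side by side, the reported durations can be converted to seconds and compared. This is just a quick sketch using the numbers quoted above; the `duration_to_seconds` helper is made up for illustration, and the step-level times are the approximate values (~50 min and ~17 hrs) from the report.

```python
import re

SECONDS_PER_UNIT = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def duration_to_seconds(text):
    """Convert a duration such as '1d 15h 44m 56s' to a number of seconds."""
    parts = re.findall(r"(\d+)\s*([dhms])", text)
    return sum(int(value) * SECONDS_PER_UNIT[unit] for value, unit in parts)


# Whole-pipeline durations as reported.
bigwig_pipeline = duration_to_seconds("6h 33m 33s")
subsampled_pipeline = duration_to_seconds("1d 15h 44m 56s")
pipeline_ratio = subsampled_pipeline / bigwig_pipeline

# Gene body coverage step only (approximate values from the report).
step_ratio = duration_to_seconds("17h") / duration_to_seconds("50m")

print(f"pipeline: {pipeline_ratio:.2f}x longer with subsampled BAM")
print(f"coverage step: {step_ratio:.1f}x longer with subsampled BAM")
```

With these figures, the whole pipeline took roughly 6 times longer with the subsampled BAM, and the coverage step alone roughly 20 times longer.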