delly output from bcbio does not include translocations #2793

Closed
ekarlins opened this issue Apr 18, 2019 · 3 comments

Comments

@ekarlins

I've started testing the bcbio structural variant pipeline using these callers:
svcaller: [lumpy, manta, cnvkit, metasv, wham, delly]

One thing I noticed is that the output files from delly don't have any inter-chromosomal translocations. I think this is because bcbio runs delly one chromosome at a time and uses an exclusion file that excludes all other chromosomes. I recognize that this was done for speed, but it would be nice to have the translocation output from delly to compare to the translocations from the other callers (lumpy and manta seem to have translocations).
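To illustrate why a per-chromosome run with an exclusion file would drop these calls, here is a toy sketch. It is not bcbio or delly code: the candidate list and the `call_per_chromosome` helper are hypothetical, and each candidate is reduced to a pair of chromosome names. The assumption is that a breakpoint is only reported when both ends fall on chromosomes that are not excluded.

```python
# Toy model: each SV candidate is a (chrom, chrom2) pair for its two breakpoints.
# Real delly works from read alignments; this only mimics the exclusion logic.
CANDIDATES = [
    ("chr1", "chr1"),  # intra-chromosomal (e.g. a deletion)
    ("chr1", "chr5"),  # inter-chromosomal translocation
    ("chr2", "chr2"),
    ("chr5", "chr9"),  # inter-chromosomal translocation
]
CHROMS = ["chr1", "chr2", "chr5", "chr9"]


def call_whole_genome(candidates):
    """One run over the whole genome: nothing is excluded."""
    return list(candidates)


def call_per_chromosome(candidates, chroms):
    """One run per chromosome, excluding every other chromosome each time."""
    calls = []
    for target in chroms:
        excluded = set(chroms) - {target}
        for chrom, chrom2 in candidates:
            # A candidate survives only if neither end is on an excluded chromosome.
            if chrom not in excluded and chrom2 not in excluded:
                calls.append((chrom, chrom2))
    return calls


def inter_chromosomal(calls):
    """Keep only calls whose two ends are on different chromosomes."""
    return [c for c in calls if c[0] != c[1]]


whole = inter_chromosomal(call_whole_genome(CANDIDATES))
split = inter_chromosomal(call_per_chromosome(CANDIDATES, CHROMS))
print(len(whole), len(split))
```

In this toy model, any candidate with ends on two different chromosomes always has one end excluded, whichever chromosome is the target, so the per-chromosome mode can never report it.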

I ran delly on NA12878 outside of bcbio and got ~2800 translocations in the output. Delly output from the same sample from bcbio has 0 translocations.
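For anyone wanting to reproduce this comparison, a minimal sketch of one way to count inter-chromosomal calls in an uncompressed VCF follows. It assumes the records carry `SVTYPE` and `CHR2` keys in the INFO column and that translocations are tagged `BND` or `TRA` depending on the delly version; check the header of your own output before relying on it. The function name `count_translocations` is made up for this example.

```python
def count_translocations(vcf_lines):
    """Count records whose second breakpoint is on a different chromosome.

    Assumes INFO carries SVTYPE and CHR2 keys (an assumption about the
    caller's output, not something verified against every delly version).
    """
    count = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, info_str = fields[0], fields[7]
        # Parse key=value INFO entries; flag-only entries are ignored.
        info = dict(
            item.split("=", 1) for item in info_str.split(";") if "=" in item
        )
        is_tra_type = info.get("SVTYPE") in ("BND", "TRA")
        if is_tra_type and info.get("CHR2", chrom) != chrom:
            count += 1
    return count


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tDEL1\tN\t<DEL>\t.\tPASS\tPRECISE;SVTYPE=DEL;CHR2=chr1;END=2000",
    "chr1\t5000\tBND1\tN\t<BND>\t.\tPASS\tIMPRECISE;SVTYPE=BND;CHR2=chr5;END=700",
    "chr2\t8000\tTRA1\tN\t<TRA>\t.\tPASS\tSVTYPE=TRA;CHR2=chr9;END=900",
]
print(count_translocations(example))
```

For a real file, pass an open file handle, e.g. `count_translocations(open("delly.vcf"))`; a bgzipped VCF would need `gzip.open(path, "rt")` instead.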

Thanks!
Eric

@chapmanb
Member

Eric;
Thanks for the question. You're exactly right in your assessment of the implementation. When we ran delly without parallelization and with translocations enabled, it was too slow for inclusion: runtimes were extremely long and it didn't seem practical to include it that way. We haven't investigated recently, so we don't know the status of the latest version; it may have improved. If you're finding the runtimes reasonable and are willing to submit a way to do this inside of bcbio, we'd be happy to work with you on including them. Thanks again for the discussion.

@ekarlins
Author

@chapmanb Thanks for your reply!
My recent run of delly (v0.7.9) on NA12878, without excluding chromosomes, took ~3.5 hours to complete using 1 core and ~9GB of memory. That seems like a reasonable run time to me.
Do you think we'd want an option to run it either way? We could leave the current svcaller "delly" to run each chromosome in parallel and have another option to the svcaller list, "delly-full", which would include the whole genome and emit translocation calls.
Or, if you think 3.5 hours is in line with the other callers, we could just replace the way delly is run within bcbio.

@chapmanb
Member

Eric;
Thanks for the additional details and for looking into this. The way we typically approach changes like this is to add an option, so you could specify tools_on: [delly-full] and folks could use that to turn on the new behavior. That would allow us to test runtimes across a range of samples to see how they vary with different types of inputs, and eventually we could make it the new default if all works well. Thanks again for thinking through this, and let us know if we can help with any pointers or suggestions.
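A rough sketch of how such a toggle might be read follows. This is not existing bcbio code: the function name `delly_run_mode`, the returned labels, and the shape of the `algorithm_config` dictionary are all assumptions made for illustration; only the `tools_on: [delly-full]` setting comes from the discussion above.

```python
def delly_run_mode(algorithm_config):
    """Pick a delly run mode from a sample's algorithm settings.

    Hypothetical helper: returns "whole-genome" when delly-full is listed in
    tools_on, otherwise the current "per-chromosome" behavior.
    """
    tools_on = algorithm_config.get("tools_on") or []
    if isinstance(tools_on, str):
        tools_on = [tools_on]  # tolerate a single value instead of a list
    return "whole-genome" if "delly-full" in tools_on else "per-chromosome"


# Default behavior is unchanged unless the option is switched on.
print(delly_run_mode({"svcaller": ["lumpy", "manta", "delly"]}))
print(delly_run_mode({"svcaller": ["delly"], "tools_on": ["delly-full"]}))
```

Keeping the per-chromosome mode as the default means existing configurations behave exactly as before, and the whole-genome run is opt-in while runtimes are being evaluated.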
