Generic pre-processing and GVCF generation WDLs #104



vdauwera commented Apr 20, 2017 edited

This PR adds two WDLs:

  1. A generic, stripped-down version of the Broad production / GOTC single-sample pipeline (no QC steps included). It starts with per-read-group unmapped BAMs and ends with HaplotypeCaller GVCF generation. Input requirements are relaxed to be reference-agnostic (two JSONs are provided, for hg38 and b37), and the workflow can be applied to exomes instead of genomes by swapping out the intervals files and specifying interval padding (which is important for all exome target lists but unnecessary for the Broad's WGS intervals list).

  2. Same as above but without the GVCF generation, to serve as a common pre-processing script for all variant discovery pipelines.
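
To make the interval padding point concrete, here is a minimal sketch of what padding does to an exome target list: each target is extended by a fixed number of bases on both sides, so that reads overhanging the target edges still contribute to calling. The function name `pad_intervals`, the sample coordinates, and the overlap-merging step are assumptions for illustration only; this is not code from the WDLs in this PR, nor GATK's own implementation.

```python
from typing import Dict, List, Tuple

# (contig, start, end) with 1-based inclusive coordinates.
Interval = Tuple[str, int, int]


def pad_intervals(
    intervals: List[Interval],
    padding: int,
    contig_lengths: Dict[str, int],
) -> List[Interval]:
    """Extend each interval by `padding` bases on both sides.

    Coordinates are clamped to [1, contig length]. Intervals that overlap
    after padding are merged (an illustrative choice, see the lead-in).
    """
    padded = []
    for contig, start, end in intervals:
        new_start = max(1, start - padding)
        new_end = min(contig_lengths[contig], end + padding)
        padded.append((contig, new_start, new_end))

    # Sort by contig name, then start, so overlaps end up adjacent.
    padded.sort()

    merged: List[Interval] = []
    for contig, start, end in padded:
        if merged and merged[-1][0] == contig and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (contig, prev[1], max(prev[2], end))
        else:
            merged.append((contig, start, end))
    return merged


# Hypothetical exome targets on a toy 1000 bp contig.
targets = [("chr1", 50, 200), ("chr1", 350, 500), ("chr1", 900, 980)]
result = pad_intervals(targets, padding=100, contig_lengths={"chr1": 1000})
print(result)
```

With a padding of 0 the targets come back unchanged, which corresponds to the WGS case where the intervals list already covers the regions of interest.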

Note that the BQSR tasks have been tweaked to use only GATK3 tools instead of the more performant GATK4 tools, for benchmarking purposes.

@vdauwera vdauwera Generic pre-processing workflows
	- One "uBAM to clean BAM"; one "uBAM to HaplotypeCaller GVCF"
	- Reference-agnostic (hg38 and b37 jsons provided)
	- Can be applied to exomes instead of genomes by swapping out intervals files
fd87f5a

vdauwera changed the title from Generic pre-processing and GVCF generation WDL to Generic pre-processing and GVCF generation WDLs Apr 22, 2017

vdauwera requested a review from cjllanwarne Apr 22, 2017
