How to Prepare a GATK tool for WDL Auto Generation

Chris Norman edited this page Sep 9, 2020 · 3 revisions

The gatkWDLGen and gatkWDLGenValidation gradle tasks generate WDL and input JSON files for any tool that is annotated with both @WorkflowProperties and @CommandLineProgramProperties. For each such tool, two pairs of WDL/input-JSON files are generated:

  • A default WDL and input JSON, named toolname.wdl and toolnameInput.json, with parameters for required args only.
  • A second pair of files, named toolnameAllArgs.wdl and toolnameAllArgsInput.json, with parameters for all tool args.

(A third pair of files with names that end in Test is also generated for each tool, but these are for internal use only and are consumed by the gatkWDLGenValidation gradle test task.)
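The naming scheme above can be summarized in a few lines of Java. This is only an illustrative sketch: the class and method names are hypothetical and are not part of GATK, the second JSON name is assumed to be toolnameAllArgsInput.json, and the internal Test files are left out because their exact names are not given here.

```java
import java.util.List;

final class WdlGenFileNames {
    /** Returns the four user-facing file names described above for one tool. */
    static List<String> generatedFileNames(String toolName) {
        return List.of(
                toolName + ".wdl",               // default WDL: required args only
                toolName + "Input.json",         // inputs for the default WDL
                toolName + "AllArgs.wdl",        // WDL exposing every tool arg
                toolName + "AllArgsInput.json"); // inputs for the all-args WDL
    }

    public static void main(String[] args) {
        // "PrintReads" is used purely as an example tool name.
        generatedFileNames("PrintReads").forEach(System.out::println);
    }
}
```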

To prepare a tool for WDL autogen, use the following guidelines:

Annotate the Tool Class

Add a @WorkflowProperties annotation to the tool class (it must also have an @CommandLineProgramProperties annotation). Choose appropriate values for the attributes, such as memory, cpu, etc. The values of all @WorkflowProperties attributes must be WDL-compatible. See the @WorkflowProperties interface for the available attributes and default values.
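As a rough illustration of the "both annotations" requirement, the sketch below declares stand-in annotations and checks a class for them via reflection. The annotation definitions here are hypothetical placeholders so that the example compiles on its own. The real annotations have their own attribute lists and defaults, so the memory and cpu types and default values shown are assumptions. The isWdlGenCandidate helper is likewise invented for this example and is not the generator's actual code.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class WdlGenEligibility {
    // Stand-in for the real @CommandLineProgramProperties (hypothetical, simplified).
    @Retention(RetentionPolicy.RUNTIME)
    @interface CommandLineProgramProperties {
        String summary();
    }

    // Stand-in for the real @WorkflowProperties; attribute types and defaults are assumed.
    @Retention(RetentionPolicy.RUNTIME)
    @interface WorkflowProperties {
        String memory() default "2G";
        int cpu() default 2;
    }

    // A tool carrying both annotations, as the guidelines require.
    @CommandLineProgramProperties(summary = "Example tool")
    @WorkflowProperties(memory = "4G", cpu = 1)
    static class AnnotatedTool { }

    // A tool with only the command line annotation; it would be skipped.
    @CommandLineProgramProperties(summary = "Tool without workflow properties")
    static class PlainTool { }

    /** True only when the class carries both annotations. */
    static boolean isWdlGenCandidate(Class<?> toolClass) {
        return toolClass.isAnnotationPresent(CommandLineProgramProperties.class)
                && toolClass.isAnnotationPresent(WorkflowProperties.class);
    }

    public static void main(String[] args) {
        System.out.println(isWdlGenCandidate(AnnotatedTool.class)); // true
        System.out.println(isWdlGenCandidate(PlainTool.class));     // false
    }
}
```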

Annotate the Tool Arguments

  • All tool input and output @Arguments that are a file or directory (or a collection of files or directories) should be annotated with a @WorkflowInput or @WorkflowOutput annotation. @WorkflowInput arguments may also specify localizationOptional = true if the tool can handle non-local (cloud) inputs. Generally, this requires the argument to have type GATKPath.
  • The underlying type of these input and output @Arguments should be GATKPath, or a class derived from GATKPath (e.g., FeatureInput). The arguments should not have type String. java.io.File is acceptable, but GATKPath is preferable because arguments of type GATKPath support non-local file systems such as GCS and Hadoop.

Add Companion Resources

Companion resources are files that are not (usually) passed explicitly as tool arguments, but which must be present at runtime for the tool/workflow to succeed, such as a BAM index, reference dictionary, or reference index. The @WorkflowInput and @WorkflowOutput annotations allow companion resources to be specified as either optional, via the optionalCompanions attribute, or required, via the requiredCompanions attribute. Required companions on an argument that is itself optional are treated as optional. Specifying companion resources adds placeholder arguments for the companions to the generated workflow, task, and inputs JSON file. For example:

@WorkflowInput(optionalCompanions = { StandardArgumentDefinitions.REFERENCE_INDEX_COMPANION, StandardArgumentDefinitions.REFERENCE_DICTIONARY_COMPANION })
@Argument(fullName = StandardArgumentDefinitions.REFERENCE_LONG_NAME, shortName = StandardArgumentDefinitions.REFERENCE_SHORT_NAME, doc = "Reference sequence", optional = true)
private GATKPath referenceInputPath;

The generated WDL for a tool with this argument will have associated companion index and dictionary file arguments that share the target argument’s type (the WDL type being File in the case of GATKPath):

...
File? reference
File? referenceDictionary
File? referenceIndex
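The optionality rule for companions can be sketched as follows. This is a simplified illustration, not the generator's implementation: the class and method are hypothetical, the WDL type is fixed to File for brevity, and the order in which declarations are emitted is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class CompanionDeclarations {
    /**
     * Builds WDL-style declarations for an argument and its companions.
     * A required companion becomes optional when the argument itself is optional.
     */
    static List<String> declarations(String argName, boolean argOptional,
                                     List<String> requiredCompanions,
                                     List<String> optionalCompanions) {
        List<String> lines = new ArrayList<>();
        String argType = argOptional ? "File? " : "File ";
        lines.add(argType + argName);
        for (String companion : requiredCompanions) {
            // Required companions inherit the optionality of the target argument.
            lines.add(argType + companion);
        }
        for (String companion : optionalCompanions) {
            // Optional companions are always optional.
            lines.add("File? " + companion);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Optional reference argument with two optional companions.
        declarations("reference", true, List.of(),
                List.of("referenceDictionary", "referenceIndex"))
                .forEach(System.out::println);
    }
}
```

With an optional reference argument, every companion is declared as File? regardless of whether it was listed as required or optional, which matches the rule stated earlier in this section.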

WDL AutoGen Failures

The WDL generator contains code to translate each known Java argument type into a WDL-compatible type (e.g., List<FeatureInput<VariantContext>> must be translated into Array[File] in WDL). If a new tool is added, or an argument is added to an existing tool, with a Java type that WDL autogen does not handle, the following error can occur. It means that the WDL generator must be updated with code to handle the transformation of the new Java argument type:

"Don't know how to convert Java type %s in %s to a corresponding WDL type. The WDL generator type converter code must be updated to support this Java type."

This is most likely to be encountered as a Travis PR failure, since most developers will not run the WDL generator locally before submitting. For that reason, it is a good idea to run the gatkWDLGenValidation task before submitting whenever you add or modify a tool with @WorkflowProperties or @WorkflowInput/@WorkflowOutput annotations.
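The type translation and its failure mode can be pictured with a small lookup table. The sketch below is a simplification under stated assumptions. Only the GATKPath and List<FeatureInput<VariantContext>> mappings come from this page. The String and int entries, the use of type names as plain strings, the exception type, and the class and method names are all assumptions made for illustration.

```java
import java.util.Map;

final class WdlTypeConverterSketch {
    // Illustrative subset of Java-to-WDL type mappings (see assumptions above).
    private static final Map<String, String> JAVA_TO_WDL = Map.of(
            "GATKPath", "File",
            "List<FeatureInput<VariantContext>>", "Array[File]",
            "String", "String", // assumed mapping
            "int", "Int");      // assumed mapping

    /** Returns the WDL type for a Java type name, or fails with the error quoted above. */
    static String toWdlType(String javaType, String context) {
        String wdlType = JAVA_TO_WDL.get(javaType);
        if (wdlType == null) {
            // Exception type is an assumption; the message text is the one quoted above.
            throw new IllegalArgumentException(String.format(
                    "Don't know how to convert Java type %s in %s to a corresponding WDL type. "
                    + "The WDL generator type converter code must be updated to support this Java type.",
                    javaType, context));
        }
        return wdlType;
    }

    public static void main(String[] args) {
        // "ExampleTool" is a placeholder context name.
        System.out.println(toWdlType("List<FeatureInput<VariantContext>>", "ExampleTool"));
    }
}
```

In this picture, supporting a new argument type amounts to adding an entry to the table, which corresponds to updating the generator's type converter code.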

Notes:

WDL autogen does not handle arguments that use tags (as VQSR does, for example).