Discrepancies within WDL script in converting to CWL #118

Open
pkarnati2004 opened this Issue Jun 22, 2017 · 0 comments

We are converting the WDL scripts to CWL, and there seem to be some discrepancies in how arguments that take boolean values are formatted on the command line.

While working on converting the WDL script to CWL using a wdl2cwl converter (https://github.com/common-workflow-language/wdl2cwl), we found that if a boolean value exists for a particular argument, errors were being raised. There is an issue regarding the interpolation of the argument within the command line.

According to the WDL HaplotypeCaller_3.6 script, if an argument takes in a boolean and its default value is false, then the command line includes both the argument and the value false at the end.

For example, examining this piece of code

-allowNonUniqueKmersInRef ${default="false" allowNonUniqueKmersInRef}

With a default value of false, the script would return

-allowNonUniqueKmersInRef false

if given no value (using the default), when in reality it should return nothing, since only a true value should add anything to the command line. In this case, an "invalid argument value false" error is returned. The WDL script does not seem to reflect that. We suspect that the interpolation should instead look like

${true = '-allowNonUniqueKmersInRef', false = '', Boolean bool}

as shown in the WDL documentation (https://github.com/broadinstitute/wdl/blob/develop/SPEC.md#integer-lengtharrayx). When converted to CWL, the resulting command line does not run if the script is left as it currently is. If false, the argument should not exist on the command line, and if true, the command line should have

-allowNonUniqueKmersInRef.

The true and false values should not exist in the command line.
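
To make the behavior we expect concrete, here is a minimal Python sketch. It is only an illustration of the rendering rule described above, not code from wdl2cwl or from the WDL script, and `render_boolean_flag` is a hypothetical helper name we made up for this example.

```python
from typing import List


def render_boolean_flag(flag: str, value: bool) -> List[str]:
    """Return the command-line tokens for a presence-only boolean flag.

    Hypothetical helper: the flag appears only when the value is true,
    and the literal words "true"/"false" never reach the command line.
    """
    return [flag] if value else []


# True adds just the flag; False (the default here) adds nothing.
print(render_boolean_flag("-allowNonUniqueKmersInRef", True))
print(render_boolean_flag("-allowNonUniqueKmersInRef", False))
```

With the current script, the false case instead produces the two tokens `-allowNonUniqueKmersInRef false`, which is what triggers the error.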

There also seem to be problems with arguments that take array inputs. According to the documentation (https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_haplotypecaller_HaplotypeCaller.php#--input_prior), for something such as kmerSize with an array of [10, 25], the command line should look like

-kmerSize 10 -kmerSize 25.

However, the output is -kmerSize [10,25], which the program does not understand because it is not formatted correctly. According to the WDL documentation, the argument should use something like Array[String] prefix(String, Array[X]) so that a prefix exists for each value. In conversion from WDL to CWL, these problems produce errors that cause the program to fail.
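
As a rough illustration of the per-value prefixing we expect for array inputs, here is a small Python sketch. Again, this is not taken from wdl2cwl or the WDL script; `render_repeated_flag` is a hypothetical name used only for this example.

```python
from typing import Iterable, List


def render_repeated_flag(flag: str, values: Iterable[object]) -> List[str]:
    """Repeat the flag once per array element (hypothetical helper)."""
    tokens: List[str] = []
    for value in values:
        # Each element gets its own copy of the flag in front of it.
        tokens.extend([flag, str(value)])
    return tokens


# Joining the tokens for kmerSize = [10, 25] gives the repeated-flag form.
print(" ".join(render_repeated_flag("-kmerSize", [10, 25])))
```

An empty array yields no tokens at all, so the flag would simply be absent from the command line.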

The format of the WDL seems to be incorrect, if we are understanding the code correctly. Is it possible that there exists a different version of the WDL code for HaplotypeCaller?

Perhaps we aren't understanding the code correctly. Any clarification is appreciated! Thank you!
