# Workflow Description Language

## Table Of Contents

<!---toc start-->

* [Workflow Description Language](#workflow-description-language)
* [Table Of Contents](#table-of-contents)
* [Introduction](#introduction)
* [State of the Specification](#state-of-the-specification)
* [Language Specification](#language-specification)
* [Global Grammar Rules](#global-grammar-rules)
* [Whitespace, Strings, Identifiers, Constants](#whitespace-strings-identifiers-constants)
* [Types](#types)
* [Fully Qualified Names & Namespaced Identifiers](#fully-qualified-names--namespaced-identifiers)
* [Declarations](#declarations)
* [Expressions](#expressions)
* [Operator Precedence Table](#operator-precedence-table)
* [Member Access](#member-access)
* [Map and Array Indexing](#map-and-array-indexing)
* [Function Calls](#function-calls)
* [Array Literals](#array-literals)
* [Map Literals](#map-literals)
* [Document](#document)
* [Import Statements](#import-statements)
* [Task Definition](#task-definition)
* [Sections](#sections)
* [Command Section](#command-section)
* [Command Parts](#command-parts)
* [Command Part Options](#command-part-options)
* [sep](#sep)
* [true and false](#true-and-false)
* [default](#default)
* [Alternative heredoc syntax](#alternative-heredoc-syntax)
* [Stripping Leading Whitespace](#stripping-leading-whitespace)
* [Outputs Section](#outputs-section)
* [String Interpolation](#string-interpolation)
* [Runtime Section](#runtime-section)
* [docker](#docker)
* [memory](#memory)
* [Parameter Metadata Section](#parameter-metadata-section)
* [Metadata Section](#metadata-section)
* [Examples](#examples)
* [Example 1: Simplest Task](#example-1-simplest-task)
* [Example 2: Inputs/Outputs](#example-2-inputsoutputs)
* [Example 3: Runtime/Metadata](#example-3-runtimemetadata)
* [Example 4: BWA mem](#example-4-bwa-mem)
* [Example 5: Word Count](#example-5-word-count)
* [Example 6: tmap](#example-6-tmap)
* [Workflow Definition](#workflow-definition)
* [Call Statement](#call-statement)
* [Scatter](#scatter)
* [Loops](#loops)
* [Conditionals](#conditionals)
* [Outputs](#outputs)
* [Namespaces](#namespaces)
* [Scope](#scope)
* [Optional Parameters & Type Constraints](#optional-parameters--type-constraints)
* [Prepending a String to an Optional Parameter](#prepending-a-string-to-an-optional-parameter)
* [Scatter / Gather](#scatter--gather)
* [Variable Resolution](#variable-resolution)
* [Task-Level Resolution](#task-level-resolution)
* [Workflow-Level Resolution](#workflow-level-resolution)
* [Computing Inputs](#computing-inputs)
* [Task Inputs](#task-inputs)
* [Workflow Inputs](#workflow-inputs)
* [Specifying Workflow Inputs in JSON](#specifying-workflow-inputs-in-json)
* [Type Coercion](#type-coercion)
* [Standard Library](#standard-library)
* [File stdout()](#file-stdout)
* [File stderr()](#file-stderr)
* [Array\[String\] read_lines(String|File)](#arraystring-read_linesstringfile)
* [Array\[Array\[String\]\] read_tsv(String|File)](#arrayarraystring-read_tsvstringfile)
* [Map\[String, String\] read_map(String|File)](#mapstring-string-read_mapstringfile)
* [Object read_object(String|File)](#object-read_objectstringfile)
* [Array\[Object\] read_objects(String|File)](#arrayobject-read_objectsstringfile)
* [mixed read_json(String|File)](#mixed-read_jsonstringfile)
* [Int read_int(String|File)](#int-read_intstringfile)
* [String read_string(String|File)](#string-read_stringstringfile)
* [Float read_float(String|File)](#float-read_floatstringfile)
* [Boolean read_boolean(String|File)](#boolean-read_booleanstringfile)
* [File write_lines(Array\[String\])](#file-write_linesarraystring)
* [File write_tsv(Array\[Array\[String\]\])](#file-write_tsvarrayarraystring)
* [File write_map(Map\[String, String\])](#file-write_mapmapstring-string)
* [File write_object(Object)](#file-write_objectobject)
* [File write_objects(Array\[Object\])](#file-write_objectsarrayobject)
* [File write_json(mixed)](#file-write_jsonmixed)
* [Data Types & Serialization](#data-types--serialization)
* [Serialization of Task Inputs](#serialization-of-task-inputs)
* [Primitive Types](#primitive-types)
* [Compound Types](#compound-types)
* [Array serialization](#array-serialization)
* [Array serialization by expansion](#array-serialization-by-expansion)
* [Array serialization using write_lines()](#array-serialization-using-write_lines)
* [Array serialization using write_json()](#array-serialization-using-write_json)
* [Map serialization](#map-serialization)
* [Map serialization using write_map()](#map-serialization-using-write_map)
* [Map serialization using write_json()](#map-serialization-using-write_json)
* [Object serialization](#object-serialization)
* [Object serialization using write_object()](#object-serialization-using-write_object)
* [Object serialization using write_json()](#object-serialization-using-write_json)
* [Array\[Object\] serialization](#arrayobject-serialization)
* [Array\[Object\] serialization using write_objects()](#arrayobject-serialization-using-write_objects)
* [Array\[Object\] serialization using write_json()](#arrayobject-serialization-using-write_json)
* [De-serialization of Task Outputs](#de-serialization-of-task-outputs)
* [Primitive Types](#primitive-types)
* [Compound Types](#compound-types)
* [Array deserialization](#array-deserialization)
* [Array deserialization using read_lines()](#array-deserialization-using-read_lines)
* [Array deserialization using read_json()](#array-deserialization-using-read_json)
* [Map deserialization](#map-deserialization)
* [Map deserialization using read_map()](#map-deserialization-using-read_map)
* [Map deserialization using read_json()](#map-deserialization-using-read_json)
* [Object deserialization](#object-deserialization)
* [Object deserialization using read_object()](#object-deserialization-using-read_object)
* [Array\[Object\] deserialization](#arrayobject-deserialization)
* [Object deserialization using read_objects()](#object-deserialization-using-read_objects)

<!---toc end-->

## Introduction

WDL is meant to be a *human readable and writable* way to express tasks and workflows. The "Hello World" tool in WDL would look like this:

```wdl
task hello {
  String pattern
  File in

  command {
    egrep '${pattern}' '${in}'
  }

  runtime {
    docker: "broadinstitute/my_image"
  }

  output {
    Array[String] matches = read_lines(stdout())
  }
}

workflow wf {
  call hello
}
```

This describes a task, called 'hello', which has two parameters (`String pattern` and `File in`). A `task` definition is a way of **encapsulating a UNIX command and environment and presenting them as functions**. Tasks have both inputs and outputs. Inputs are written as declarations at the top of the `task` definition, while outputs are defined in the `output` section.

The user must provide a value for these two parameters in order for this task to be runnable. Implementations of WDL should accept their inputs in [JSON format](#specifying-workflow-inputs-in-json). For example, the above task needs values for two parameters: `String pattern` and `File in`:

|Variable           |Value    |
|-------------------|---------|
|wf.hello.pattern   |^[a-z]+$ |
|wf.hello.in        |/file.txt|

Or, in JSON format:

```json
{
  "wf.hello.pattern": "^[a-z]+$",
  "wf.hello.in": "/file.txt"
}
```

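As a rough illustration of the naming convention used in this JSON, the following Python sketch builds such an inputs document by prefixing each input name with its workflow and call name. The helper `qualify_inputs` is hypothetical and is not part of any WDL implementation.

```python
import json


def qualify_inputs(workflow, call, inputs):
    """Prefix each input name with '<workflow>.<call>.' (hypothetical helper)."""
    return {f"{workflow}.{call}.{name}": value for name, value in inputs.items()}


# Values taken from the table above.
inputs = qualify_inputs("wf", "hello", {"pattern": "^[a-z]+$", "in": "/file.txt"})
print(json.dumps(inputs, indent=2))
```
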
Running the `wf` workflow with these parameters would yield a command line from the `call hello`:

```
egrep '^[a-z]+$' '/file.txt'
```

A simple workflow that runs this task in parallel would look like this:

```wdl
workflow example {
  Array[File] files
  scatter(path in files) {
    call hello {input: in=path}
  }
}
```

The inputs to this workflow would be `example.files` and `example.hello.pattern`.

## State of the Specification

**17 August 2015**

* Added concept of fully-qualified-name as well as namespace identifier.
* Changed task definitions to have all inputs as declarations.
* Changed command parameters (`${`...`}`) to accept expressions and fewer "declarative" elements.
* Command parameters are also required to evaluate to primitive types.
* Added an `output` section to workflows.
* Added a lot of functions to the standard library for serializing/deserializing WDL values.
* Specified scope, namespace, and variable resolution semantics.

# Language Specification

## Global Grammar Rules

### Whitespace, Strings, Identifiers, Constants

These rules are shared by many of the following sections.

```
$ws = (0x20 | 0x9 | 0xD | 0xA)+
$identifier = [a-zA-Z][a-zA-Z0-9_]*
$string = "([^\\\"\n]|\\[\\"\'nrbtfav\?]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|\\[uU]([0-9a-fA-F]{4})([0-9a-fA-F]{4})?)*"
$string = '([^\\\'\n]|\\[\\"\'nrbtfav\?]|\\[0-7]{1,3}|\\x[0-9a-fA-F]+|\\[uU]([0-9a-fA-F]{4})([0-9a-fA-F]{4})?)*'
$boolean = 'true' | 'false'
$integer = [1-9][0-9]*|0[xX][0-9a-fA-F]+|0[0-7]*
$float = (([0-9]+)?\.([0-9]+)|[0-9]+\.|[0-9]+)([eE][-+]?[0-9]+)?
```

`$string` can accept the following between single or double quotes:

* Any character not in the set: `\`, `"` (or `'` for a single-quoted string), `\n`
* An escape sequence starting with `\`, followed by one of the following characters: `\`, `"`, `'`, `[nrbtfav]`, `?`
* An escape sequence starting with `\`, followed by 1 to 3 digits of value 0 through 7 inclusive. This specifies an octal escape code.
* An escape sequence starting with `\x`, followed by hexadecimal characters `0-9a-fA-F`. This specifies a hexadecimal escape code.
* An escape sequence starting with `\u` or `\U`, followed by either 4 or 8 hexadecimal characters `0-9a-fA-F`. This specifies a Unicode code point.

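To make the token rules concrete, here is a minimal Python sketch that transcribes the `$identifier`, `$integer`, and `$float` patterns and classifies a token by whole-string match. The function `classify` and the order in which the patterns are tried are assumptions for illustration. The identifier pattern here allows a single character, on the assumption that one-letter names such as the `x` in `File x` are meant to be valid.

```python
import re

# Patterns transcribed from the grammar. IDENTIFIER uses `*` (an assumption)
# so that one-letter names are accepted.
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")
INTEGER = re.compile(r"[1-9][0-9]*|0[xX][0-9a-fA-F]+|0[0-7]*")
FLOAT = re.compile(r"(([0-9]+)?\.([0-9]+)|[0-9]+\.|[0-9]+)([eE][-+]?[0-9]+)?")


def classify(token):
    """Return the first token class whose pattern matches the whole token."""
    # INTEGER is tried before FLOAT because every decimal integer also
    # matches the $float pattern.
    for name, pattern in (("integer", INTEGER), ("float", FLOAT), ("identifier", IDENTIFIER)):
        if pattern.fullmatch(token):
            return name
    return None


for token in ["42", "0x1F", ".14", "1e10", "my_var", "9lives"]:
    print(token, classify(token))
```
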
### Types

All inputs and outputs must be typed.

```
$type = ($primitive_type | $array_type | $map_type | $object_type) $type_postfix_quantifier?
$primitive_type = ('Boolean' | 'Int' | 'Float' | 'File' | 'String')
$array_type = 'Array' '[' ($primitive_type | $object_type | $array_type) ']'
$object_type = 'Object'
$map_type = 'Map' '[' $primitive_type ',' ($primitive_type | $array_type | $map_type | $object_type) ']'
$type_postfix_quantifier = '?' | '+'
```

Some examples of types:

* `File`
* `Array[File]`
* `Map[String, String]`
* `Object`

Types can also have a `$type_postfix_quantifier` (either `?` or `+`):

* `?` means that the value is optional. Any expressions that fail to evaluate because this value is missing will evaluate to the empty string.
* `+` can only be applied to `Array` types, and it signifies that the array is required to have one or more values in it.

For more details on the `$type_postfix_quantifier`, see the section on [Optional Parameters & Type Constraints](#optional-parameters--type-constraints).

For more information on types and how they are used to construct commands and define outputs of tasks, see the [Data Types & Serialization](#data-types--serialization) section.

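The type grammar is small enough to check with a short recursive-descent routine. The Python sketch below follows the grammar exactly as written (for example, array elements may not be maps and inner types take no quantifier). All function names are hypothetical, and this is an illustration rather than a reference parser.

```python
import re

PRIMITIVES = {"Boolean", "Int", "Float", "File", "String"}
TOKEN = re.compile(r"\s*([A-Za-z]+|[\[\],?+])")


def tokenize(text):
    """Split a type string into names and punctuation."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect(tokens, symbol):
    if not tokens or tokens.pop(0) != symbol:
        raise ValueError(f"expected {symbol!r}")


def parse_base(tokens, allowed):
    """Consume one type without a quantifier and return its outermost name."""
    if not tokens:
        raise ValueError("unexpected end of type")
    name = tokens.pop(0)
    if name not in allowed:
        raise ValueError(f"{name!r} is not allowed here")
    if name == "Array":
        expect(tokens, "[")
        parse_base(tokens, PRIMITIVES | {"Object", "Array"})
        expect(tokens, "]")
    elif name == "Map":
        expect(tokens, "[")
        parse_base(tokens, PRIMITIVES)  # keys must be primitive
        expect(tokens, ",")
        parse_base(tokens, PRIMITIVES | {"Array", "Map", "Object"})
        expect(tokens, "]")
    return name


def validate_type(text):
    """Return True if text matches $type; raise ValueError otherwise."""
    tokens = tokenize(text)
    name = parse_base(tokens, PRIMITIVES | {"Array", "Map", "Object"})
    quantifier = tokens.pop(0) if tokens else None
    if tokens or quantifier not in (None, "?", "+"):
        raise ValueError("unexpected tokens after type")
    if quantifier == "+" and name != "Array":
        raise ValueError("'+' can only be applied to Array types")
    return True
```

For instance, `validate_type("Map[String, Array[Int]]")` and `validate_type("Array[File]+")` succeed, while `validate_type("Int+")` raises `ValueError`.
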
### Fully Qualified Names & Namespaced Identifiers

```
$fully_qualified_name = $identifier ('.' $identifier)*
$namespaced_identifier = $identifier ('.' $identifier)*
```

A fully qualified name is the unique identifier of any particular `call` or call input or output. For example:

other.wdl
```wdl
task foobar {
  File in
  command {
    sh setup.sh ${in}
  }
  output {
    File results = stdout()
  }
}
```

main.wdl
```wdl
import "other.wdl" as other

task test {
  String my_var
  command {
    ./script ${my_var}
  }
  output {
    File results = stdout()
  }
}

workflow wf {
  Array[String] arr = ["a", "b", "c"]
  call test
  call test as test2
  call other.foobar
  output {
    test.results
    foobar.results
  }
  scatter(x in arr) {
    call test as scattered_test {
      input: my_var=x
    }
  }
}
```

The following fully-qualified names would exist within `workflow wf` in main.wdl:

* `wf` - References top-level workflow
* `wf.test` - References the first call to task `test`
* `wf.test2` - References the second call to task `test` (aliased as test2)
* `wf.test.my_var` - References the `String` input of first call to task `test`
* `wf.test.results` - References the `File` output of first call to task `test`
* `wf.test2.my_var` - References the `String` input of second call to task `test`
* `wf.test2.results` - References the `File` output of second call to task `test`
* `wf.foobar.results` - References the `File` output of the call to `other.foobar`
* `wf.foobar.in` - References the `File` input of the call to `other.foobar`
* `wf.arr` - References the `Array[String]` declaration on the workflow
* `wf.scattered_test` - References the scattered version of `call test`
* `wf.scattered_test.my_var` - References an `Array[String]` holding each element used as `my_var` when running the scattered version of `call test`
* `wf.scattered_test.results` - References an `Array[File]` holding the accumulated results from scattering `call test`
* `wf.scattered_test.1.results` - References a `File` from the second invocation (0-indexed) of `call test` within the scatter block. This particular invocation used value "b" for `my_var`

A namespaced identifier has the same syntax as a fully-qualified name. It is interpreted as the left-hand side being the name of a namespace and the right-hand side being the name of a workflow, task, or namespace within that namespace. Consider this workflow:

```wdl
import "other.wdl" as ns
workflow wf {
  call ns.ns2.task
}
```

Here, `ns.ns2.task` is a namespaced identifier (see the [Call Statement](#call-statement) section for more details). Namespaced identifiers, like fully-qualified names, are left-associative, which means `ns.ns2.task` is interpreted as `((ns.ns2).task)`. Therefore `ns.ns2` would have to resolve to a namespace so that `.task` could be applied. If `ns2` were a task definition within `ns`, then this namespaced identifier would be invalid.

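The left-to-right resolution described here can be sketched in a few lines of Python. The data model is an assumption made for illustration: a namespace is a `dict`, a task is the string `"task"`, and the names `namespaces` and `resolve` are hypothetical.

```python
# Hypothetical model: a namespace is a dict; a task is the string "task".
namespaces = {"ns": {"ns2": {"task": "task"}, "t": "task"}}


def resolve(identifier, root):
    """Resolve a dotted identifier left to right, i.e. as ((ns.ns2).task)."""
    current = root
    parts = identifier.split(".")
    for i, part in enumerate(parts):
        # Everything to the left of the current part must be a namespace.
        if not isinstance(current, dict):
            raise LookupError(f"{'.'.join(parts[:i])!r} is not a namespace")
        if part not in current:
            raise LookupError(f"{part!r} not found")
        current = current[part]
    return current


print(resolve("ns.ns2.task", namespaces))
```

Resolving `ns.t.task` against the same model raises `LookupError`, because `ns.t` is a task rather than a namespace.
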
### Declarations

```
$declaration = $type $identifier ('=' $expression)?
```

Declarations appear at the top of any [scope](#scope).

In a [task definition](#task-definition), declarations are interpreted as inputs to the task that are not part of the command line itself.

If a declaration does not have an initialization, then the value is expected to be provided by the user before the workflow or task is run.

Some examples of declarations:

* `File x`
* `String y = "abc"`
* `Float pi = 3 + .14`
* `Map[String, String] m`

A declaration may also refer to elements that are outputs of tasks. For example:

```wdl
task test {
  String var
  command {
    ./script ${var}
  }
  output {
    String value = read_string(stdout())
  }
}

task test2 {
  Array[String] array
  command {
    ./script ${write_lines(array)}
  }
  output {
    Int value = read_int(stdout())
  }
}

workflow wf {
  call test as x {input: var="x"}
  call test as y {input: var="y"}
  Array[String] strs = [x.value, y.value]
  call test2 as z {input: array=strs}
}
```

`strs` in this case would not be defined until both `call test as x` and `call test as y` have successfully completed. Until then, `strs` is undefined. If either of the two tasks fails, then evaluation of `strs` should return an error to indicate that the `call test2 as z` operation should be skipped.

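One way to picture this ordering constraint is as a dependency graph over calls and declarations. The Python sketch below uses the standard-library `graphlib` module (Python 3.9+) to derive a valid evaluation order for the workflow above. Modelling the workflow as a dictionary of dependencies is an assumption for illustration and says nothing about how any particular engine schedules work.

```python
from graphlib import TopologicalSorter  # available from Python 3.9

# Each key depends on the names in its value set (taken from the workflow above).
dependencies = {
    "x": set(),        # call test as x
    "y": set(),        # call test as y
    "strs": {"x", "y"},  # Array[String] strs = [x.value, y.value]
    "z": {"strs"},     # call test2 as z {input: array=strs}
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

In any order this produces, `x` and `y` precede `strs`, and `strs` precedes `z`.
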
### Expressions

```
$expression = '(' $expression ')'
$expression = $expression '.' $expression
$expression = $expression '[' $expression ']'
$expression = $expression '(' ($expression (',' $expression)*)? ')'
$expression = '!' $expression
$expression = '+' $expression
$expression = '-' $expression
$expression = $expression '*' $expression
$expression = $expression '%' $expression
$expression = $expression '/' $expression
$expression = $expression '+' $expression
$expression = $expression '-' $expression
$expression = $expression '<' $expression
$expression = $expression '<=' $expression
$expression = $expression '>' $expression
$expression = $expression '>=' $expression
$expression = $expression '==' $expression
$expression = $expression '!=' $expression
$expression = $expression '&&' $expression
$expression = $expression '||' $expression
$expression = '{' ($expression ':' $expression)* '}'
$expression = '[' $expression* ']'
$expression = $string | $integer | $float | $boolean | $identifier
```

Below are the valid results for operators on types. Any combination not in the list will result in an error.

|LHS Type   |Operators  |RHS Type         |Result   |Semantics|
|-----------|-----------|-----------------|---------|---------|
|`Boolean`|`==`|`Boolean`|`Boolean`||
|`Boolean`|`!=`|`Boolean`|`Boolean`||
|`Boolean`|`>`|`Boolean`|`Boolean`||
|`Boolean`|`>=`|`Boolean`|`Boolean`||
|`Boolean`|`<`|`Boolean`|`Boolean`||
|`Boolean`|`<=`|`Boolean`|`Boolean`||
|`Boolean`|`\|\|`|`Boolean`|`Boolean`||
|`Boolean`|`&&`|`Boolean`|`Boolean`||
|`File`|`+`|`File`|`File`|Append file paths|
|`File`|`==`|`File`|`Boolean`||
|`File`|`!=`|`File`|`Boolean`||
|`File`|`+`|`String`|`File`||
|`File`|`==`|`String`|`Boolean`||
|`File`|`!=`|`String`|`Boolean`||
|`Float`|`+`|`Float`|`Float`||
|`Float`|`-`|`Float`|`Float`||
|`Float`|`*`|`Float`|`Float`||
|`Float`|`/`|`Float`|`Float`||
|`Float`|`%`|`Float`|`Float`||
|`Float`|`==`|`Float`|`Boolean`||
|`Float`|`!=`|`Float`|`Boolean`||
|`Float`|`>`|`Float`|`Boolean`||
|`Float`|`>=`|`Float`|`Boolean`||
|`Float`|`<`|`Float`|`Boolean`||
|`Float`|`<=`|`Float`|`Boolean`||
|`Float`|`+`|`Int`|`Float`||
|`Float`|`-`|`Int`|`Float`||
|`Float`|`*`|`Int`|`Float`||
|`Float`|`/`|`Int`|`Float`||
|`Float`|`%`|`Int`|`Float`||
|`Float`|`==`|`Int`|`Boolean`||
|`Float`|`!=`|`Int`|`Boolean`||
|`Float`|`>`|`Int`|`Boolean`||
|`Float`|`>=`|`Int`|`Boolean`||
|`Float`|`<`|`Int`|`Boolean`||
|`Float`|`<=`|`Int`|`Boolean`||
|`Float`|`+`|`String`|`String`||
|`Int`|`+`|`Float`|`Float`||
|`Int`|`-`|`Float`|`Float`||
|`Int`|`*`|`Float`|`Float`||
|`Int`|`/`|`Float`|`Float`||
|`Int`|`%`|`Float`|`Float`||
|`Int`|`==`|`Float`|`Boolean`||
|`Int`|`!=`|`Float`|`Boolean`||
|`Int`|`>`|`Float`|`Boolean`||
|`Int`|`>=`|`Float`|`Boolean`||
|`Int`|`<`|`Float`|`Boolean`||
|`Int`|`<=`|`Float`|`Boolean`||
|`Int`|`+`|`Int`|`Int`||
|`Int`|`-`|`Int`|`Int`||
|`Int`|`*`|`Int`|`Int`||
|`Int`|`/`|`Int`|`Int`|Integer division|
|`Int`|`%`|`Int`|`Int`|Integer division, return remainder|
|`Int`|`==`|`Int`|`Boolean`||
|`Int`|`!=`|`Int`|`Boolean`||
|`Int`|`>`|`Int`|`Boolean`||
|`Int`|`>=`|`Int`|`Boolean`||
|`Int`|`<`|`Int`|`Boolean`||
|`Int`|`<=`|`Int`|`Boolean`||
|`Int`|`+`|`String`|`String`||
|`String`|`+`|`Float`|`String`||
|`String`|`+`|`Int`|`String`||
|`String`|`+`|`String`|`String`||
|`String`|`==`|`String`|`Boolean`||
|`String`|`!=`|`String`|`Boolean`||
|`String`|`>`|`String`|`Boolean`||
|`String`|`>=`|`String`|`Boolean`||
|`String`|`<`|`String`|`Boolean`||
|`String`|`<=`|`String`|`Boolean`||
||`-`|`Float`|`Float`||
||`+`|`Float`|`Float`||
||`-`|`Int`|`Int`||
||`+`|`Int`|`Int`||
||`!`|`Boolean`|`Boolean`||

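An implementation might encode this table as a lookup keyed by operand types and operator. The Python sketch below covers only a handful of rows to show the idea. The names `RESULT_TYPES` and `result_type`, and the choice of `TypeError` for combinations that are not listed, are assumptions for illustration.

```python
# A small subset of the table above, keyed by (LHS type, operator, RHS type).
RESULT_TYPES = {
    ("Int", "+", "Int"): "Int",
    ("Int", "/", "Int"): "Int",        # integer division
    ("Int", "+", "Float"): "Float",
    ("Float", "+", "Int"): "Float",
    ("Int", "+", "String"): "String",
    ("String", "+", "String"): "String",
    ("Int", "<", "Float"): "Boolean",
    ("File", "+", "String"): "File",
    ("Boolean", "&&", "Boolean"): "Boolean",
}


def result_type(lhs, op, rhs):
    """Return the result type for a binary operation, or raise if not listed."""
    try:
        return RESULT_TYPES[(lhs, op, rhs)]
    except KeyError:
        raise TypeError(f"{lhs} {op} {rhs} is not a valid operation") from None


print(result_type("Int", "+", "Float"))
```
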
### Operator Precedence Table

| Precedence | Operator type         | Associativity | Example              |
|------------|-----------------------|---------------|----------------------|
| 12         | Grouping              | n/a           | (x)                  |
| 11         | Member Access         | left-to-right | x.y                  |
| 10         | Index                 | left-to-right | x[y]                 |
| 9          | Function Call         | left-to-right | x(y,z,...)           |
| 8          | Logical NOT           | right-to-left | !x                   |
|            | Unary Plus            | right-to-left | +x                   |
|            | Unary Negation        | right-to-left | -x                   |
| 7          | Multiplication        | left-to-right | x*y                  |
|            | Division              | left-to-right | x/y                  |
|            | Remainder             | left-to-right | x%y                  |
| 6          | Addition              | left-to-right | x+y                  |
|            | Subtraction           | left-to-right | x-y                  |
| 5          | Less Than             | left-to-right | x<y                  |
|            | Less Than Or Equal    | left-to-right | x<=y                 |
|            | Greater Than          | left-to-right | x>y                  |
|            | Greater Than Or Equal | left-to-right | x>=y                 |
| 4          | Equality              | left-to-right | x==y                 |
|            | Inequality            | left-to-right | x!=y                 |
| 3          | Logical AND           | left-to-right | x&&y                 |
| 2          | Logical OR            | left-to-right | x\|\|y               |
| 1          | Assignment            | right-to-left | x=y                  |

### Member Access

The syntax `x.y` refers to member access. `x` must be an object or a task call in a workflow. A task can be thought of as an object whose attributes are the outputs of the task.

```wdl
workflow wf {
  Object obj
  Object foo

  # This would cause a syntax error,
  # because foo is defined twice in the same namespace.
  call foo {
    input: var=obj.attr # Object attribute
  }

  call foo as foo2 {
    input: var=foo.out # Task output
  }
}
```

### Map and Array Indexing

The syntax `x[y]` is for indexing maps and arrays. If `x` is an array, then `y` must evaluate to an integer. If `x` is a map, then `y` must evaluate to a key in that map.

### Function Calls

Function calls, in the form of `func(p1, p2, p3, ...)`, are either [standard library functions](#standard-library) or engine-defined functions.

In this current iteration of the spec, users cannot define their own functions.

### Array Literals

Array values can be specified using Python-like syntax, as follows:

```
Array[String] a = ["a", "b", "c"]
Array[Int] b = [0,1,2]
```

### Map Literals

Map values can be specified using a similar Python-like syntax:

```
Map[Int, Int] x = {1: 10, 2: 11}
Map[String, Int] y = {"a": 1, "b": 2}
```

## Document

```
$document = ($import | $task | $workflow)+
```

`$document` is the root of the parse tree, and it consists of one or more import statements, task definitions, or workflow definitions.

## Import Statements

A WDL file may contain import statements to include WDL code from other sources.

```
$import = 'import' $ws+ $string ($ws+ 'as' $ws+ $identifier)?
```

The import statement specifies a `$string` that is to be interpreted as a URI pointing to a WDL file. The engine is responsible for resolving the URI and downloading the contents. The contents of the document at each URI must be WDL source code.

If a namespace identifier (via the `as $identifier` syntax) is specified, then all the tasks and workflows imported will only be accessible through that [namespace](#namespaces). If no namespace identifier is specified, then all tasks and workflows from the URI are imported into the current namespace.

```wdl
import "http://example.com/lib/stdlib"
import "http://example.com/lib/analysis_tasks" as analysis

workflow wf {
  File bam_file

  # file_size is from "http://example.com/lib/stdlib"
  call file_size {
    input: file=bam_file
  }
  call analysis.my_analysis_task {
    input: size=file_size.bytes, file=bam_file
  }
}
```

Engines should at the very least support the following protocols for import URIs:

* `http://` and `https://`
* `file://`
* no protocol (which should be interpreted as `file://`)

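As a small illustration of the protocol rules above, the following Python sketch uses the standard `urllib.parse.urlparse` function to find the scheme of an import URI and treats a missing scheme as `file`. The names `REQUIRED_SCHEMES` and `import_scheme` are hypothetical, and the sketch does not attempt to fetch anything.

```python
from urllib.parse import urlparse

# The minimum set of protocols named in the list above.
REQUIRED_SCHEMES = {"http", "https", "file"}


def import_scheme(uri):
    """Return the scheme of an import URI, treating no scheme as 'file'."""
    return urlparse(uri).scheme or "file"


for uri in ["http://example.com/lib/stdlib", "file:///lib/tasks.wdl", "other.wdl"]:
    scheme = import_scheme(uri)
    print(uri, scheme, scheme in REQUIRED_SCHEMES)
```
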
Apr 29, 2015 Add language specification
610 ## Task Definition
611
612 A task is a declarative construct with a focus on constructing a command from a template. The command specification is interpreted in an engine specific way, though a typical case is that a command is a UNIX command line which would be run in a Docker image.
613
614 Tasks also define their outputs, which is essential for building dependencies between tasks. Any other data specified in the task definition (e.g. runtime information and meta-data) is optional.
615
616 ```
Aug 17, 2015 @scottfrazer minor changes
617 $task = 'task' $ws+ $identifier $ws* '{' $ws* $declaration* $task_sections $ws* '}'
Apr 29, 2015 Add language specification
618 ```
619
620 For example, `task name { ... }`. Inside the curly braces defines the sections.
621
622 ### Sections
623
624 The task has one or more sections:
625
626 ```
Jul 31, 2015 @scottfrazer Update the specification
627 $task_sections = ($command | $runtime | $task_output | $parameter_meta | $meta)+
628 ```
629
630 > *Additional requirement*: Exactly one `$command` section needs to be defined, preferably as the first section.
631
632 ### Command Section
633
634 ```
635 $command = 'command' $ws* '{' (0xA | 0xD)* $command_part+ $ws+ '}'
636 $command = 'command' $ws* '<<<' (0xA | 0xD)* $command_part+ $ws+ '>>>'
637 ```
638
A command is a *task section* that starts with the keyword `command` and is enclosed in curly braces or `<<<` `>>>`. The body of the command specifies the literal command line to run, with placeholders (`$command_part_var`) for the parts of the command line that need to be filled in.
640
641 #### Command Parts
642
643 ```
644 $command_part = $command_part_string | $command_part_var
645 $command_part_string = ^'${'+
646 $command_part_var = '${' $var_option* $expression '}'
647 ```
648
The parser should read characters from the command line until it reaches a `${` character sequence. The characters read up to that point are interpreted as a literal string (`$command_part_string`).
650
651 The parser should interpret any variable enclosed in `${`...`}` as a `$command_part_var`.
652
653 The `$expression` usually references declarations at the task level. For example:
654
655 ```wdl
656 task test {
657 String flags
658 command {
659 ps ${flags}
660 }
661 }
662 ```
663
664 In this case `flags` within the `${`...`}` is an expression. The `$expression` can also be more complex, like a function call: `write_lines(some_array_value)`
665
> **NOTE**: the `$expression` in this context can only evaluate to a primitive type (e.g. not `Array`, `Map`, or `Object`). The only exception to this rule is when `sep` is specified as one of the `$var_option` fields.
667
668 As another example, consider how the parser would parse the following command:
669
670 ```
671 grep '${start}...${end}' ${input}
672 ```
673
674 This command would be parsed as:
675
676 * `grep '` - command_part_string
677 * `${start}` - command_part_var
678 * `...` - command_part_string
679 * `${end}` - command_part_var
680 * `' ` - command_part_string
681 * `${input}` - command_part_var
682
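The splitting shown above can be sketched as a small Python scanner. This is a simplified illustration rather than a real parser: the function name `split_command_parts` is hypothetical, and it assumes that the expression inside `${`...`}` never contains a `}` character.

```python
def split_command_parts(command):
    """Split a command template into (kind, text) pairs.

    Assumption of this sketch: a placeholder ends at the first '}' after
    its opening '${', so expressions containing '}' are not handled.
    """
    parts = []
    pos = 0
    while pos < len(command):
        start = command.find("${", pos)
        if start == -1:
            # No more placeholders: the rest is one literal string.
            parts.append(("command_part_string", command[pos:]))
            break
        if start > pos:
            parts.append(("command_part_string", command[pos:start]))
        end = command.find("}", start)
        if end == -1:
            raise ValueError("unterminated ${ in command")
        parts.append(("command_part_var", command[start:end + 1]))
        pos = end + 1
    return parts
```

Applied to `grep '${start}...${end}' ${input}`, the function yields the same six parts as the list above, in the same order.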
683 #### Command Part Options
684
685 ```
686 $var_option = $var_option_key $ws* '=' $ws* $var_option_value
687 $var_option_key = 'sep' | 'true' | 'false' | 'quote' | 'default'
688 $var_option_value = $expression
689 ```
690
691 The `$var_option` is a set of key-value pairs for any additional and less-used options that need to be set on a parameter.
692
693 ##### sep
694
695 'sep' is interpreted as the separator string used to join multiple parameters together. `sep` is only valid if the expression evaluates to an `Array`.
696
For example, if there were a declaration `Array[Int] numbers = [1,2,3]`, the command `python script.py ${sep=',' numbers}` would yield the command line:
698
699 ```
700 python script.py 1,2,3
701 ```
702
Alternatively, if the command were `python script.py ${sep=' ' numbers}` it would yield:
704
705 ```
706 python script.py 1 2 3
707 ```
708
709 > *Additional Requirements*:
710 >
711 > 1. sep MUST accept only a string as its value
712
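As a rough sketch of how an engine might apply `sep`, the following Python joins the elements of an array value with the separator string. The function name `render_with_sep` is hypothetical, and representing a WDL `Array` as a Python list is an assumption of this sketch.

```python
def render_with_sep(values, sep):
    """Join the elements of an Array value using the separator string."""
    if not isinstance(sep, str):
        raise TypeError("sep must be a string")
    if not isinstance(values, list):
        # sep is only valid if the expression evaluates to an Array.
        raise TypeError("sep is only valid for Array values")
    return sep.join(str(value) for value in values)
```

With `values = [1, 2, 3]`, a separator of `','` produces `1,2,3` and a separator of `' '` produces `1 2 3`, matching the two command lines above.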
713 ##### true and false
714
May 6, 2015 change types from lower case to upper-case first letter
`true` and `false` are only used with type `Boolean`; they specify what the parameter evaluates to when the Boolean is true or false, respectively.
716
For example, `${true='--enable-foo' false='--disable-foo' yes_or_no}` would evaluate to either `--enable-foo` or `--disable-foo` based on the value of the `Boolean` variable `yes_or_no`.

If either value is left out, then it's equivalent to specifying the empty string. If the parameter is `${true='--enable-foo' yes_or_no}` and `yes_or_no` is false, then the parameter will evaluate to the empty string.
720
> *Additional Requirements*:
722 >
723 > 1. `true` and `false` values MUST be strings.
724 > 2. `true` and `false` are only allowed if the type is `Boolean`
725
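The selection rule for `true` and `false` can be sketched in Python as follows. The function name `render_boolean` is hypothetical; the empty-string defaults follow the rule that an omitted value is equivalent to the empty string.

```python
def render_boolean(value, true="", false=""):
    """Pick the 'true' or 'false' option string for a Boolean value.

    An option that is left out defaults to the empty string.
    """
    if not isinstance(value, bool):
        raise TypeError("true and false are only allowed for Boolean values")
    if not isinstance(true, str) or not isinstance(false, str):
        raise TypeError("true and false values must be strings")
    return true if value else false
```

For example, `render_boolean(False, true="--enable-foo")` returns the empty string, because no `false` option was given.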
726 ##### default
727
728 This specifies the default value if no other value is specified for this parameter.
729
730 > *Additional Requirements*:
731 >
732 > 1. The type of the expression must match the type of the parameter
733 > 2. If 'default' is specified, the `$type_postfix_quantifier` for the variable's type MUST be `?`
734
735 #### Alternative heredoc syntax
736
Sometimes a command is long enough, or uses enough `{` characters, that a different set of delimiters would make it clearer. In this case, enclose the command in `<<<`...`>>>`, as follows:
738
Aug 15, 2015 @scottfrazer Move spec changes
```wdl
task heredoc {
  File in

  command<<<
  python <<CODE
  with open("${in}") as fp:
      for line in fp:
          if not line.startswith('#'):
              print(line.strip())
  CODE
  >>>
}
```
753
This command is parsed in the same way as described in the [Command Parts](#command-parts) section.
755
756 #### Stripping Leading Whitespace
757
Any text inside of the `command` section, once instantiated, should have all *common leading whitespace* removed. In the `task heredoc` example in the previous section, if the user specifies a value of `/path/to/file` as the value for `File in`, then the command should be:
759
```
python <<CODE
with open("/path/to/file") as fp:
    for line in fp:
        if not line.startswith('#'):
            print(line.strip())
CODE
```
768
The two spaces that were common to each line were removed.
770
771 If the user mixes tabs and spaces, the behavior is undefined. A warning is suggested, and perhaps a convention of 4 spaces per tab. Other implementations might return an error in this case.
772
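One possible way to implement the whitespace stripping is sketched below in Python. The function name `strip_common_leading_whitespace` is hypothetical. Since the behavior for mixed tabs and spaces is undefined, this sketch takes the option of returning an error; blank lines are assumed not to count towards the common prefix.

```python
import os


def strip_common_leading_whitespace(command):
    """Remove the whitespace prefix shared by every non-blank line."""
    lines = command.split("\n")
    # Leading whitespace of each line that has visible content.
    indents = [line[:len(line) - len(line.lstrip(" \t"))]
               for line in lines if line.strip()]
    uses_tabs = any("\t" in indent for indent in indents)
    uses_spaces = any(" " in indent for indent in indents)
    if uses_tabs and uses_spaces:
        # Undefined behavior per the text above; this sketch chooses to fail.
        raise ValueError("command mixes tabs and spaces in leading whitespace")
    # commonprefix compares the strings character by character.
    common = os.path.commonprefix(indents)
    return "\n".join(line[len(common):] for line in lines)
```

Lines that are indented more deeply than the common prefix keep their extra indentation, which is what preserves the structure of the embedded Python script in the example above.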
773 ### Outputs Section
774
The outputs section defines which of the files and values should be exported after a successful run of this task.
776
777 ```
778 $task_output = 'output' $ws* '{' ($ws* $task_output_kv $ws*)* '}'
$task_output_kv = $type $ws+ $identifier $ws* '=' $ws* $expression
780 ```
781
782 The outputs section contains typed variable definitions and a binding to the variable that they export.
783
784 The left-hand side of the equality defines the type and name of the output.
785
786 The right-hand side defines the path to the file that contains that variable definition.
787
788 For example, if a task's output section looks like this:
789
790 ```
791 output {
792 Int threshold = read_int("threshold.txt")
793 }
794 ```
795
796 Then the task is expecting a file called "threshold.txt" in the current working directory where the task was executed. Inside of that file must be one line that contains only an integer and whitespace. See the [Data Types & Serialization](#data-types--serialization) section for more details.
797
798 The filename strings may also contain variable definitions themselves (see the [String Interpolation](#string-interpolation) section below for more details):
799
800 ```
801 output {
802 Array[String] quality_scores = read_lines("${sample_id}.scores.txt")
803 }
804 ```
805
806 If this is the case, then `sample_id` is considered an input to the task.
807
Dec 23, 2015 @cjllanwarne Modified spec to mention outputs referencing other outputs
808 As with inputs, the outputs can reference previous outputs in the same block. The only requirement is that the output being referenced must be specified *before* the output which uses it.
809
810 ```
811 output {
812 String a = "a"
813 String ab = a + "b"
814 }
815 ```
816
817
818 Globs can be used to define outputs which contain many files. The glob function generates an array of File outputs:
819
820 ```
821 output {
822 Array[File] output_bams = glob("*.bam")
823 }
824 ```
825
826 ### String Interpolation
827
Within tasks, any string literal can use string interpolation to access the value of any of the task's inputs. The most obvious example of this is being able to define an output file which is named as a function of the task's inputs. For example:
829
830 ```wdl
831 task example {
832 String prefix
833 File bam
834 command {
835 python analysis.py --prefix=${prefix} ${bam}
836 }
837 output {
838 File analyzed = "${prefix}.out"
839 File bam_sibling = "${bam}.suffix"
840 }
841 }
842 ```
843
844 Any `${identifier}` inside of a string literal must be replaced with the value of the identifier. If prefix were specified as `foobar`, then `"${prefix}.out"` would be evaluated to `"foobar.out"`.
845
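The replacement rule can be sketched in Python with a regular expression. The function name `interpolate` and the identifier pattern are assumptions of this sketch; only plain `${identifier}` placeholders are handled, not arbitrary expressions.

```python
import re

# Assumed identifier shape: a letter followed by letters, digits or underscores.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z][A-Za-z0-9_]*)\}")


def interpolate(template, inputs):
    """Replace every ${identifier} in the string with the input's value."""
    def substitute(match):
        name = match.group(1)
        if name not in inputs:
            raise KeyError("no value for input: " + name)
        return str(inputs[name])

    return _PLACEHOLDER.sub(substitute, template)
```

With `inputs = {"prefix": "foobar"}`, `interpolate("${prefix}.out", inputs)` evaluates to `foobar.out`, as in the example above.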
846 ### Runtime Section
847
848 ```
849 $runtime = 'runtime' $ws* '{' ($ws* $runtime_kv $ws*)* '}'
Jan 5, 2016 @scottfrazer Updating specification
$runtime_kv = $identifier $ws* ':' $ws* $expression
851 ```
852
853 The runtime section defines key/value pairs for runtime information needed for this task. Individual backends will define which keys they will inspect so a key/value pair may or may not actually be honored depending on how the task is run.
Jun 12, 2015 @geoffjentry Update the runtime section
854
855 Values can be any expression and it is up to the engine to reject keys and/or values that do not make sense in that context. For example, consider the following WDL:
856
857 ```wdl
858 task test {
859 command {
860 python script.py
861 }
862 runtime {
863 docker: ["ubuntu:latest", "broadinstitute/scala-baseimage"]
864 }
865 }
866 ```
867
868 The value for the `docker` runtime attribute in this case is an array of values. The parser should accept this. Some engines might interpret it as an "either this image or that image" or could reject it outright.
869
870 Since values are expressions, they can also reference variables in the task:
871
872 ```wdl
873 task test {
874 String ubuntu_version
875
876 command {
877 python script.py
878 }
879 runtime {
880 docker: "ubuntu:" + ubuntu_version
881 }
882 }
883 ```
884
885 Most key/value pairs are arbitrary. However, the following keys have recommended conventions:
886
887 #### docker
888
Location of a Docker image in which this task ought to be run. This can have a format like `ubuntu:latest` or `broadinstitute/scala-baseimage`, in which case it should be interpreted as an image on DockerHub (i.e. it is valid to use in a `docker pull` command).
890
891 ```wdl
892 task docker_test {
893 String arg
894
895 command {
896 python process.py ${arg}
897 }
898 runtime {
899 docker: "ubuntu:latest"
900 }
901 }
902 ```
903
904 #### memory
905
Memory requirements for this task. This should be an integer value followed by a suffix like `B`, `KB`, `MB`, ... or a binary suffix like `KiB`, `MiB`, ...
907
908 ```wdl
909 task docker_test {
910 String arg
911
912 command {
913 python process.py ${arg}
914 }
915 runtime {
916 memory: "2GB"
917 }
918 }
919 ```
920
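An engine has to turn a value such as `"2GB"` into a number of bytes. The Python sketch below shows one way to do that. The function name `parse_memory` is hypothetical, and the multipliers are an assumption: decimal suffixes are read as powers of 1000 and binary suffixes as powers of 1024, which the text above does not state explicitly.

```python
import re

# Assumed multipliers: decimal suffixes use 1000, binary suffixes use 1024.
_UNITS = {
    "B": 1,
    "KB": 1000, "MB": 1000 ** 2, "GB": 1000 ** 3, "TB": 1000 ** 4,
    "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4,
}


def parse_memory(value):
    """Convert a memory string such as '2GB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", value)
    if match is None:
        raise ValueError("not a valid memory value: " + value)
    amount, unit = match.groups()
    if unit not in _UNITS:
        raise ValueError("unknown memory suffix: " + unit)
    return int(amount) * _UNITS[unit]
```

Under these assumptions, `parse_memory("2GB")` is 2,000,000,000 bytes, while `parse_memory("2GiB")` is 2,147,483,648 bytes.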
921 ### Parameter Metadata Section
922
923 ```
924 $parameter_meta = 'parameter_meta' $ws* '{' ($ws* $parameter_meta_kv $ws*)* '}'
$parameter_meta_kv = $identifier $ws* ':' $ws* $string
926 ```
927
928 This purely optional section contains key/value pairs where the keys are names of parameters and the values are string descriptions for those parameters.
929
930 > *Additional requirement*: Any key in this section MUST correspond to a parameter in the command line
931
932 ### Metadata Section
933
934 ```
935 $meta = 'meta' $ws* '{' ($ws* $meta_kv $ws*)* '}'
$meta_kv = $identifier $ws* ':' $ws* $string
937 ```
938
This purely optional section contains key/value pairs for any additional metadata that should be stored with the task, for example the author or a contact email.
940
941 ### Examples
942
943 #### Example 1: Simplest Task
944
945 ```wdl
946 task hello_world {
947 command {echo hello world}
948 }
949 ```
950
951 #### Example 2: Inputs/Outputs
952
953 ```wdl
954 task one_and_one {
955 String pattern
956 File infile
957
958 command {
959 grep ${pattern} ${infile}
960 }
961 output {
962 File filtered = stdout()
963 }
964 }
965 ```
966
967 #### Example 3: Runtime/Metadata
968
969 ```wdl
970 task runtime_meta {
971 String memory_mb
972 String sample_id
973 String param
975
976 command {
977 java -Xmx${memory_mb}M -jar task.jar -id ${sample_id} -param ${param} -out ${sample_id}.out
978 }
979 output {
May 7, 2015 Fix type names
980 File results = "${sample_id}.out"
981 }
982 runtime {
983 docker: "broadinstitute/baseimg"
984 }
985 parameter_meta {
986 memory_mb: "Amount of memory to allocate to the JVM"
987 param: "Some arbitrary parameter"
988 sample_id: "The ID of the sample in format foo_bar_baz"
989 }
990 meta {
991 author: "Joe Somebody"
992 email: "joe@company.org"
993 }
994 }
995 ```
996
997 #### Example 4: BWA mem
998
```wdl
task bwa_mem_tool {
  Int threads
  Int min_seed_length
  Array[Int]+ min_std_max_min
  File reference
  Array[File]+ reads

  command {
    bwa mem -t ${threads} \
            -k ${min_seed_length} \
            -I ${sep=',' min_std_max_min} \
            ${reference} \
            ${sep=' ' reads} > output.sam
  }
  output {
    File sam = "output.sam"
  }
  runtime {
    docker: "broadinstitute/baseimg"
  }
}
```

A notable piece of this example is `${sep=',' min_std_max_min}`. Because `min_std_max_min` is declared as `Array[Int]+`, it can hold one or more integers (the `+` after the type indicates one or more). When the command is instantiated, the array is flattened by joining its elements with the separator string (`sep=','`).
1024
1025 This task also defines that it exports one file, called 'sam', which is the stdout of the execution of bwa mem.
1026
The 'docker' portion of this task definition specifies that this task must only be run on the Docker image specified.
1028
1029 #### Example 5: Word Count
1030
1031 ```wdl
1032 task wc2_tool {
1033 File file1
1034 command {
1035 wc ${file1}
1036 }
1037 output {
1038 Int count = read_int(stdout())
1039 }
1040 }
1041
1042 workflow count_lines4_wf {
Jul 6, 2015 Minor spec modification
1043 Array[File] files
1044 scatter(f in files) {
1045 call wc2_tool {
May 3, 2015 @scottfrazer Remove references to CWL
1046 input: file1=f
1047 }
1048 }
1049 output {
1050 wc2_tool.count
1051 }
1052 }
1053 ```
1054
In this example, most of the code is declarative boilerplate. The exceptions are function-like features such as `read_int(stdout())`. Functions like these can be implemented fairly easily if custom function definitions are allowed: parsing them is no problem, and new functions would not be hard to add. Alternatively, they could be written as JavaScript or Python snippets that the engine runs.
1056
1057 #### Example 6: tmap
1058
1059 This task should produce a command line like this:
1060
1061 ```
1062 tmap mapall \
1063 stage1 map1 --min-seq-length 20 \
1064 map2 --min-seq-length 20 \
1065 stage2 map1 --max-seq-length 20 --min-seq-length 10 --seed-length 16 \
1066 map2 --max-seed-hits -1 --max-seq-length 20 --min-seq-length 10
1067 ```
1068
The task definition would look like this:
1070
1071 ```wdl
1072 task tmap_tool {
1073 Array[String] stages
1074 File reads
1075
1076 command {
1077 tmap mapall ${sep=' ' stages} < ${reads} > output.sam
1078 }
1079 output {
1080 File sam = "output.sam"
1081 }
1082 }
1083 ```
1084
For this particular case, where the command line is *itself* a mini DSL, the best option is to allow the user to type in the rest of the command line, which is what `${sep=' ' stages}` is for. This allows the user to specify an array of strings as the value for `stages`, which are then concatenated together with a space character.
1086
1087 |Variable|Value|
1088 |--------|-----|
1089 |reads |/path/to/fastq|
1090 |stages |["stage1 map1 --min-seq-length 20 map2 --min-seq-length 20", "stage2 map1 --max-seq-length 20 --min-seq-length 10 --seed-length 16 map2 --max-seed-hits -1 --max-seq-length 20 --min-seq-length 10"]|
1091
1092 ## Workflow Definition
1093
1094 ```
$workflow = 'workflow' $ws+ $identifier $ws* '{' $ws* $workflow_element* $ws* '}'
$workflow_element = $call | $loop | $conditional | $declaration | $scatter | $workflow_output
1097 ```
1098
A workflow is defined by the keyword `workflow`, followed by the workflow's name and a body enclosed in curly braces.
1100
1101 An example of a workflow that runs one task (not defined here) would be:
1102
1103 ```wdl
1104 workflow wf {
1105 Array[File] files
1106 Int threshold
1107 Map[String, String] my_map
1108
1109 call analysis_job {
1110 input: search_paths=files, threshold=threshold, gender_lookup=my_map
1111 }
1112 }
1113 ```
1114
1115 ### Call Statement
1116
1117 ```
1118 $call = 'call' $ws* $namespaced_identifier $ws+ ('as' $identifier)? $ws* $call_body?
1119 $call_body = '{' $ws* $inputs? $ws* '}'
1120 $inputs = 'input' $ws* ':' $ws* $variable_mappings
1121 $variable_mappings = $variable_mapping_kv (',' $variable_mapping_kv)*
1122 $variable_mapping_kv = $identifier $ws* '=' $ws* $expression
1123 ```
1124
1125 A workflow may call other tasks/workflows via the `call` keyword. The `$namespaced_identifier` is the reference to which task to run. Most commonly, it's simply the name of a task (see examples below), but it can also use `.` as a namespace resolver.
1126
See the section on [Fully Qualified Names & Namespaced Identifiers](#fully-qualified-names--namespaced-identifiers) for details about how the `$namespaced_identifier` ought to be interpreted.
1128
1129 All `call` statements must be uniquely identifiable. By default, the call's unique identifier is the task name (e.g. `call foo` would be referenced by name `foo`). However, if one were to `call foo` twice in a workflow, each subsequent `call` statement will need to alias itself to a unique name using the `as` clause: `call foo as bar`.
1130
1131 A `call` statement may reference a workflow too (e.g. `call other_workflow`). In this case, the `$inputs` section specifies a subset of the workflow's inputs and must specify fully qualified names.
1132
1133 ```wdl
1134 import "lib.wdl" as lib
1135 workflow wf {
1136 call my_task
1137 call my_task as my_task_alias
1138 call my_task as my_task_alias2 {
1139 input: threshold=2
1140 }
1141 call lib.other_task
1142 }
1143 ```
1144
The `$call_body` is optional and is meant to specify how to satisfy a subset of the task or workflow's input parameters, as well as a way to map task outputs to variables defined in the [visible scopes](#scope).
1146
A `$variable_mapping_kv` in the `$inputs` section maps a parameter in the task to an expression. These expressions usually reference outputs of other tasks, but they can be arbitrary expressions.
1148
1149 As an example, here is a workflow in which the second task requires an output from the first task:
1150
1151 ```wdl
1152 task task1 {
1153 command {
1154 python do_stuff.py
1155 }
1156 output {
1157 File results = stdout()
1158 }
1159 }
1160 task task2 {
1161 File foobar
1162 command {
1163 python do_stuff2.py ${foobar}
1164 }
1165 output {
1166 File results = stdout()
1167 }
1168 }
1169 workflow wf {
1170 call task1
1171 call task2 {
1172 input: foobar=task1.results
1173 }
1174 }
1175 ```
1176
1177 ### Scatter
1178
1179 ```
$scatter = 'scatter' $ws* '(' $ws* $scatter_iteration_statement $ws* ')' $ws* $scatter_body
$scatter_iteration_statement = $identifier $ws* 'in' $ws* $expression
1182 $scatter_body = '{' $ws* $workflow_element* $ws* '}'
1183 ```
1184
1185 A "scatter" clause defines that everything in the body (`$scatter_body`) can be run in parallel. The clause in parentheses (`$scatter_iteration_statement`) declares which collection to scatter over and what to call each element.
1186
1187 The `$scatter_iteration_statement` has two parts: the "item" and the "collection". For example, `scatter(x in y)` would define `x` as the item, and `y` as the collection. The item is always an identifier, while the collection is an expression that MUST evaluate to an `Array` type. The item will represent each item in that expression. For example, if `y` evaluated to an `Array[String]` then `x` would be a `String`.
1188
1189 The `$scatter_body` defines a set of scopes that will execute in the context of this scatter block.
1190
For example, if `$expression` is an array of integers of size 3, then the body of the scatter clause can be executed three times in parallel. `$identifier` would refer to each integer in the array.
1192
1193 ```
1194 scatter(i in integers) {
1195 call task1{input: num=i}
1196 call task2{input: num=task1.output}
1197 }
1198 ```
1199
In this example, `task2` depends on `task1`. Variable `i` has an implicit `index` attribute to make sure we can access the right output from `task1`. Since both `task1` and `task2` run N times, where N is the length of the array `integers`, any scalar outputs of these tasks are now arrays.
1201
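The gathering behavior can be illustrated with a Python sketch. It is not how any particular engine works: `task1` and `task2` are hypothetical stand-ins that do simple arithmetic, and a thread pool is just one assumed way to run the iterations in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def task1(num):
    # Hypothetical stand-in for the real task: doubles its input.
    return num * 2


def task2(num):
    # Hypothetical stand-in for the real task: adds one to its input.
    return num + 1


def run_scatter(integers):
    """Run the scatter body once per element and gather outputs as arrays."""
    def body(i):
        out1 = task1(i)      # call task1{input: num=i}
        out2 = task2(out1)   # task2 depends on task1 within one iteration
        return out1, out2

    with ThreadPoolExecutor() as pool:
        # map() returns results in the order of the input collection.
        results = list(pool.map(body, integers))
    return {
        "task1.output": [out1 for out1, _ in results],
        "task2.output": [out2 for _, out2 in results],
    }
```

Each iteration produces scalar values, but seen from outside the scatter block every output is an array with one element per item of `integers`, in the same order.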
1202 ### Loops
1203
1204 > **TODO**: This section is not complete
1205
1206 ```
1207 $loop = 'while' '(' $expression ')' '{' $workflow_element* '}'
1208 ```
1209
Loops are distinct from scatter clauses because the body of a while loop needs to be executed to completion before another iteration is considered for execution. The `$expression` condition is evaluated only when the iteration count is zero or when all `$workflow_element`s in the body have completed successfully for the current iteration.
1211
1212 ### Conditionals
1213
1214 ```
1215 $conditional = 'if' '(' $expression ')' '{' $workflow_element* '}'
1216 ```
1217
Conditionals only execute the body if the expression evaluates to true.
1219
1220 ### Outputs
1221
1222 ```
Dec 23, 2015 @mcovarr Remove commas from another example and the BNF.
$workflow_output = 'output' '{' ($workflow_output_fqn ($workflow_output_fqn)*)? '}'
1224 $workflow_output_fqn = $fully_qualified_name '.*'?
1225 ```
1226
Dec 1, 2015 @cjllanwarne Edited the workflow outputs description to be clearer.
1227 Each `workflow` definition can specify an optional `output` section. This section lists outputs from individual `call`s that you also want to expose as outputs to the `workflow` itself. Replacing call output names with a `*` acts as a match-all wildcard.
1228
1229 If the `output {...}` section is omitted, then the workflow includes all outputs from all calls in its final output.
1230
1231 The output names in this section must be qualified with the call which created them, as in the example below.
1232
1233 ```
1234 task task1 {
1235 command { ./script }
1236 output { File results = stdout() }
1237 }
1238
1239 task task2 {
1240 command { ./script2 }
1241 output {
1242 File results = stdout()
1243 String value = read_string("some_file")
1244 }
1245 }
1246
1247 workflow wf {
1248 call task1
1249 call task2 as altname
1250 output {
1251 task1.*
1252 altname.value
1253 }
1254 }
1255 ```
1256
In this example, the fully-qualified names that would be exposed as workflow outputs are `wf.task1.results` and `wf.altname.value`.
1258
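The wildcard expansion can be sketched in Python. The function name `expand_workflow_outputs` and the dictionary layout are assumptions of this sketch, and it only handles entries of the form `call.output` or `call.*`, not deeper namespaces.

```python
def expand_workflow_outputs(workflow_name, call_outputs, patterns=None):
    """Return the fully-qualified names exposed as workflow outputs.

    call_outputs maps each call name (or alias) to its output names.
    patterns is the content of the output section; None models an
    omitted section, which exposes every output of every call.
    """
    if patterns is None:
        patterns = [call_name + ".*" for call_name in call_outputs]
    exposed = []
    for pattern in patterns:
        call_name, _, output_name = pattern.partition(".")
        if call_name not in call_outputs:
            raise KeyError("unknown call: " + call_name)
        if output_name == "*":
            names = call_outputs[call_name]
        elif output_name in call_outputs[call_name]:
            names = [output_name]
        else:
            raise KeyError("unknown output: " + pattern)
        exposed.extend(
            "{}.{}.{}".format(workflow_name, call_name, name) for name in names
        )
    return exposed
```

Called with the calls and output section of the example above, it returns `wf.task1.results` and `wf.altname.value`.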
1259 # Namespaces
1260
1261 Import statements can be used to pull in tasks/workflows from other locations as well as create namespaces. In the simplest case, an import statement adds the tasks/workflows that are imported into the current namespace. For example:
1262
1263 tasks.wdl
1264 ```
1265 task x {
1266 command { python script.py }
1267 }
1268 task y {
1269 command { python script2.py }
1270 }
1271 ```
1272
1273 workflow.wdl
1274 ```
1275 import "tasks.wdl"
1276
1277 workflow wf {
1278 call x
1279 call y
1280 }
1281 ```
1282
Tasks `x` and `y` are in the same namespace as workflow `wf`. However, workflow.wdl could instead put all of those tasks behind a namespace:
1284
1285 workflow.wdl
1286 ```
1287 import "tasks.wdl" as ns
1288
1289 workflow wf {
1290 call ns.x
1291 call ns.y
1292 }
1293 ```
1294
1295 Now everything inside of `tasks.wdl` must be accessed through the namespace `ns`.
1296
1297 Each namespace contains: namespaces, tasks, and workflows. The names of these needs to be unique within that namespace. For example, there cannot be a task named `foo` and also a namespace named `foo`. Also there can't be a task and a workflow with the same names, or two workflows with the same name.
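
The uniqueness rule can be checked mechanically. The following is a minimal Python sketch, not part of the specification; the function name `find_name_collisions` and its list-of-names arguments are assumptions made for illustration:

```python
from collections import Counter


def find_name_collisions(namespaces, tasks, workflows):
    """Return the names used more than once among the members of one namespace."""
    counts = Counter(list(namespaces) + list(tasks) + list(workflows))
    return sorted(name for name, count in counts.items() if count > 1)


# No collisions: namespace `ns`, tasks `x` and `y`, workflow `wf`.
print(find_name_collisions(["ns"], ["x", "y"], ["wf"]))
# A task and a namespace both named `foo`, plus two workflows named `wf`.
print(find_name_collisions(["foo"], ["foo"], ["wf", "wf"]))
```

An engine would presumably report each returned name as an error when loading the document.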

# Scope

Scopes are defined as:

* `workflow {...}` blocks
* `call` blocks
* `while(expr) {...}` blocks
* `if(expr) {...}` blocks
* `scatter(x in y) {...}` blocks

Inside of any scope, variables may be [declared](#declarations). The variables declared in that scope are visible to any sub-scope, recursively. For example:

```
task my_task {
  Int x
  Int var
  File f
  command {
    my_cmd --integer=${var} ${f}
  }
}

workflow wf {
  Array[File] files
  Int x = 2
  scatter(file in files) {
    Int x = 3
    call my_task {
      Int x = 4
      input: var=x, f=file
    }
  }
}
```

`my_task` will use `x=4` to set the value for `var` in its command line. However, `my_task` also needs a value for `x`, which is declared at the task level. Since `my_task` declares the inputs `x`, `var`, and `f`, and the `call my_task` block sets only `var` and `f`, the value for `my_task.x` still needs to be provided by the user when the workflow is run.

# Optional Parameters & Type Constraints

[Types](#types) can be optionally suffixed with a `?` or `+` in certain cases.

* `?` means that the parameter is optional. A user does not need to specify a value for the parameter in order to satisfy all the inputs to the workflow.
* `+` applies only to `Array` types and represents a constraint that the `Array` value must contain one or more elements.

```
task test {
  Array[File]  a
  Array[File]+ b
  Array[File]? c
  #File+ d <-- can't do this, + only applies to Arrays

  command {
    /bin/mycmd ${sep=" " a}
    /bin/mycmd ${sep="," b}
    /bin/mycmd ${write_lines(c)}
  }
}

workflow wf {
  call test
}
```

If you provided these values for inputs:

|var      |value|
|---------|-----|
|wf.test.a|["1", "2", "3"]|
|wf.test.b|[]|

The workflow engine should reject this because `wf.test.b` is required to have at least one element. If we change it to:

|var      |value|
|---------|-----|
|wf.test.a|["1", "2", "3"]|
|wf.test.b|["x"]|

This would be valid input because `wf.test.c` is not required. Given these values, the command would be instantiated as:

```
/bin/mycmd 1 2 3
/bin/mycmd x
/bin/mycmd
```

If our inputs were:

|var      |value|
|---------|-----|
|wf.test.a|["1", "2", "3"]|
|wf.test.b|["x","y"]|
|wf.test.c|["a","b","c","d"]|

Then the command would be instantiated as:

```
/bin/mycmd 1 2 3
/bin/mycmd x,y
/bin/mycmd /path/to/c.txt
```
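
The validation described in this section could be sketched in Python as follows. This is an illustration only: `check_input` is a hypothetical helper, it inspects just the `?` and `+` suffixes of a type written as a string, and it uses `None` to mean that the user supplied no value.

```python
def check_input(type_str, value):
    """Return an error message, or None if the value satisfies the type suffix."""
    if value is None:
        # Only types suffixed with `?` may be left unspecified.
        return None if type_str.endswith("?") else "missing required input"
    if type_str.endswith("+") and len(value) == 0:
        return "array must contain at least one element"
    return None


print(check_input("Array[File]+", []))      # rejected: empty array for a `+` type
print(check_input("Array[File]+", ["x"]))   # accepted
print(check_input("Array[File]?", None))    # accepted: optional and omitted
```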

## Prepending a String to an Optional Parameter

Sometimes, optional parameters need a string prefix. Consider this task:

```wdl
task test {
  String? val
  command {
    python script.py --val=${val}
  }
}
```

Since `val` is optional, this command line can be instantiated in two ways:

```
python script.py --val=foobar
```

Or

```
python script.py --val=
```

The latter is very likely an error: the `--val=` part should be left off entirely when no value for `val` is provided. To solve this problem, modify the expression inside the template tag as follows:

```
python script.py ${"--val=" + val}
```
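
One way to picture this behavior is the Python sketch below. It assumes, as the example above implies but does not spell out, that concatenating a string with an undefined optional value yields an empty string. The helper names `concat` and `instantiate` are hypothetical, and `None` stands in for an omitted value.

```python
def concat(prefix, value):
    """Model `prefix + value` inside ${...} when `value` may be undefined."""
    return "" if value is None else prefix + value


def instantiate(val):
    """Build the command line for `python script.py ${"--val=" + val}`."""
    return ("python script.py " + concat("--val=", val)).rstrip()


print(instantiate("foobar"))
print(instantiate(None))
```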

# Scatter / Gather

The `scatter` block is meant to parallelize a series of identical tasks but give them slightly different inputs. The simplest example is:

```wdl
task inc {
  Int i

  command <<<
  python -c "print(${i} + 1)"
  >>>

  output {
    Int incremented = read_int(stdout())
  }
}

workflow wf {
  Array[Int] integers = [1,2,3,4,5]
  scatter(i in integers) {
    call inc{input: i=i}
  }
}
```

Running this workflow (which needs no inputs) would yield a value of `[2,3,4,5,6]` for `wf.inc.incremented`. While `task inc` itself returns an `Int`, when it is called inside a scatter block, that type becomes an `Array[Int]`.

Any task that's downstream from the call to `inc` and outside the scatter block must accept an `Array[Int]`:

```wdl
task inc {
  Int i

  command <<<
  python -c "print(${i} + 1)"
  >>>

  output {
    Int incremented = read_int(stdout())
  }
}

task sum {
  Array[Int] ints

  command <<<
  python -c "print(${sep="+" ints})"
  >>>

  output {
    Int sum = read_int(stdout())
  }
}

workflow wf {
  Array[Int] integers = [1,2,3,4,5]
  scatter (i in integers) {
    call inc {input: i=i}
  }
  call sum {input: ints = inc.incremented}
}
```

This workflow will output a value of `20` for `wf.sum.sum`. This works because `call inc`, being inside the scatter block, outputs an `Array[Int]`.

However, from inside the scope of the scatter block, the output of `call inc` is still an `Int`. So the following is valid:

```wdl
workflow wf {
  Array[Int] integers = [1,2,3,4,5]
  scatter(i in integers) {
    call inc {input: i=i}
    call inc as inc2 {input: i=inc.incremented}
  }
  call sum {input: ints = inc2.incremented}
}
```

In this example, `inc` and `inc2` are called in series, with the output of one fed to the other. `inc2` would output the array `[3,4,5,6,7]`.
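
The gather behavior can be mimicked with ordinary list operations. The following Python sketch is only an analogy, not how an engine is required to work: `scatter` is a hypothetical helper that runs its body once per element and collects the results, and it runs sequentially where a real engine may run the calls in parallel.

```python
def inc(i):
    """Stand-in for `task inc`: returns a single Int."""
    return i + 1


def scatter(items, body):
    """Run `body` once per item and gather the results into a list."""
    return [body(item) for item in items]


integers = [1, 2, 3, 4, 5]

# Outside the scatter block, the output of `inc` is seen as an array.
incremented = scatter(integers, inc)

# Inside the block each call still sees a single Int, so calls can be chained.
chained = scatter(integers, lambda i: inc(inc(i)))

print(incremented, sum(incremented))
print(chained)
```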

# Variable Resolution

Inside of [expressions](#expressions), variables are resolved differently depending on whether the expression is in a `task` declaration or a `workflow` declaration.

## Task-Level Resolution

Inside a task, resolution is trivial: the variable referenced MUST be a [declaration](#declarations) of the task. For example:

```wdl
task my_task {
  Array[String] strings
  command {
    python analyze.py --strings-file=${write_lines(strings)}
  }
}
```

Inside of this task, there exists only one expression: `write_lines(strings)`. When the expression evaluator tries to resolve `strings`, it must find a declaration of the task with that name (in this case it does).

## Workflow-Level Resolution

In a workflow, resolution works by traversing the scope hierarchy, starting from the expression that references the variable.

```wdl
workflow wf {
  String s = "wf_s"
  String t = "t"
  call my_task {
    String s = "my_task_s"
    input: in0 = s+"-suffix", in1 = t+"-suffix"
  }
}
```

In this example, there are two expressions: `s+"-suffix"` and `t+"-suffix"`. `s` is resolved as `"my_task_s"` and `t` is resolved as `"t"`.
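
The traversal can be sketched as a walk up a chain of scopes. The Python below is a minimal illustration under stated assumptions: the `Scope` class, its `declarations` dictionary, and its `parent` link are hypothetical names, not structures defined by this specification.

```python
class Scope:
    """A scope holding its own declarations and a link to the enclosing scope."""

    def __init__(self, declarations, parent=None):
        self.declarations = declarations
        self.parent = parent

    def resolve(self, name):
        """Return the value from the nearest enclosing scope that declares `name`."""
        scope = self
        while scope is not None:
            if name in scope.declarations:
                return scope.declarations[name]
            scope = scope.parent
        raise KeyError(name)


# Mirrors the workflow above: the call block shadows `s` but not `t`.
wf = Scope({"s": "wf_s", "t": "t"})
call = Scope({"s": "my_task_s"}, parent=wf)

print(call.resolve("s") + "-suffix")
print(call.resolve("t") + "-suffix")
```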

# Computing Inputs

Both tasks and workflows have typed inputs that must be satisfied in order to run. The following sections describe how to compute inputs for `task` and `workflow` declarations.

## Task Inputs

Tasks define all their inputs as declarations at the top of the task definition.

```wdl
task test {
  String s
  Int i
  Float f

  command {
    ./script.sh -i ${i} -f ${f}
  }
}
```

In this example, `s`, `i`, and `f` are inputs to this task, even though the command line does not reference `${s}`. Implementations of WDL engines may display a warning or report an error in this case, since `s` isn't used.

## Workflow Inputs

Workflows have declarations, like tasks, but a workflow must also account for all calls to sub-tasks when determining inputs.

Workflows also return their inputs as fully qualified names. Tasks only return the names of the variables as inputs (as they're guaranteed to be unique within a task). However, since workflows can call the same task twice, names might collide. The general algorithm for computing inputs goes something like this:

* Take all inputs to all `call` statements in the workflow
* Subtract out all inputs that are satisfied through the `input: ` section
* Add in all declarations which don't have a static value defined

Consider the following workflow:

```wdl
task t1 {
  String s
  Int x

  command {
    ./script --action=${s} -x${x}
  }
  output {
    Int count = read_int(stdout())
  }
}

task t2 {
  String s
  Int t
  Int x

  command {
    ./script2 --action=${s} -x${x} --other=${t}
  }
  output {
    Int count = read_int(stdout())
  }
}

task t3 {
  Int y
  File ref_file # Do nothing with this

  command {
    python -c "print(${y} + 1)"
  }
  output {
    Int incr = read_int(stdout())
  }
}

workflow wf {
  Int int_val
  Int int_val2 = 10
  Array[Int] my_ints
  File ref_file

  call t1 {
    input: x=int_val
  }
  call t2 {
    input: x=int_val, t=t1.count
  }
  scatter(i in my_ints) {
    call t3 {
      input: y=i, ref_file=ref_file
    }
  }
}
```

The inputs to `wf` would be:

* `wf.t1.s` as a `String`
* `wf.t2.s` as a `String`
* `wf.int_val` as an `Int`
* `wf.my_ints` as an `Array[Int]`
* `wf.ref_file` as a `File`
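
The three steps of the algorithm can be sketched in Python. This is a rough illustration rather than a reference implementation: the function `workflow_inputs` and the tuple layouts it accepts are assumptions, and the data below assumes that the call to `t3` supplies both of that task's inputs.

```python
def workflow_inputs(wf_name, declarations, calls):
    """Compute fully-qualified workflow inputs.

    declarations: list of (name, wdl_type, has_static_value)
    calls: list of (call_name, {input_name: wdl_type}, set of names set via `input:`)
    """
    inputs = {}
    for call_name, task_inputs, supplied in calls:
        for name, wdl_type in task_inputs.items():
            # Keep only the call inputs not satisfied in the `input:` section.
            if name not in supplied:
                inputs[f"{wf_name}.{call_name}.{name}"] = wdl_type
    for name, wdl_type, has_static_value in declarations:
        # Add workflow declarations that have no static value.
        if not has_static_value:
            inputs[f"{wf_name}.{name}"] = wdl_type
    return inputs


declarations = [
    ("int_val", "Int", False),
    ("int_val2", "Int", True),
    ("my_ints", "Array[Int]", False),
    ("ref_file", "File", False),
]
calls = [
    ("t1", {"s": "String", "x": "Int"}, {"x"}),
    ("t2", {"s": "String", "t": "Int", "x": "Int"}, {"x", "t"}),
    ("t3", {"y": "Int", "ref_file": "File"}, {"y", "ref_file"}),
]
result = workflow_inputs("wf", declarations, calls)
for name, wdl_type in result.items():
    print(name, wdl_type)
```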

## Specifying Workflow Inputs in JSON

Once workflow inputs are computed (see previous section), the value for each of the fully-qualified names needs to be specified per invocation of the workflow. Workflow inputs are specified in JSON or YAML format. In JSON, the inputs to the workflow in the previous section can be:

```
{
  "wf.t1.s": "some_string",
  "wf.t2.s": "some_string",
  "wf.int_val": 3,
  "wf.my_ints": [5,6,7,8],
  "wf.ref_file": "/path/to/file.txt"
}
```

It's important to note that the type in JSON must be coercible to the WDL type. For example, `wf.int_val` expects an integer, but if we specified it in JSON as `"wf.int_val": "3"`, this coercion from string to integer is not valid and would result in a type error. See the section on [Type Coercion](#type-coercion) for more details.

# Type Coercion

WDL values can be created from either JSON values or from native language values. The table below uses String-like, Integer-like, etc. to refer to values in a particular programming language. For example, "String-like" could mean a `java.lang.String` in the Java context or a `str` in Python. An "Array-like" could refer to a `Seq` in Scala or a `list` in Python.

|WDL Type |Can Accept |Notes / Constraints|
|---------|-------------|-------------------|
|`String` |JSON String||
| |String-like||
| |`String`|Identity coercion|
| |`File`||
|`File` |JSON String|Interpreted as a file path|
| |String-like|Interpreted as a file path|
| |`String`|Interpreted as a file path|
| |`File`|Identity coercion|
|`Int` |JSON Number|Use floor of the value for non-integers|
| |Integer-like||
| |`Int`|Identity coercion|
|`Float` |JSON Number||
| |Float-like||
| |`Float`|Identity coercion|
|`Boolean`|JSON Boolean||
| |Boolean-like||
| |`Boolean`|Identity coercion|
|`Array[T]`|JSON Array|Elements must be coercible to `T`|
| |Array-like|Elements must be coercible to `T`|
|`Map[K, V]`|JSON Object|Keys and values must be coercible to `K` and `V`, respectively|
| |Map-like|Keys and values must be coercible to `K` and `V`, respectively|
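
The JSON rows of the table can be illustrated with a small Python sketch. It is deliberately partial and hedged: `coerce` is a hypothetical function, it operates on values already parsed from JSON, and it covers only the primitive types and `Array[T]` (`Map[K, V]` is omitted for brevity).

```python
import math


def is_number(value):
    """True for parsed JSON numbers (bool is excluded because it subclasses int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def coerce(wdl_type, value):
    """Coerce a parsed JSON value to the given WDL type or raise TypeError."""
    if wdl_type in ("String", "File") and isinstance(value, str):
        return value
    if wdl_type == "Int" and is_number(value):
        return math.floor(value)  # floor of the value for non-integers
    if wdl_type == "Float" and is_number(value):
        return float(value)
    if wdl_type == "Boolean" and isinstance(value, bool):
        return value
    if wdl_type.startswith("Array[") and wdl_type.endswith("]") and isinstance(value, list):
        element_type = wdl_type[len("Array["):-1]
        return [coerce(element_type, item) for item in value]
    raise TypeError(f"cannot coerce {value!r} to {wdl_type}")


print(coerce("Int", 3))
print(coerce("Array[Int]", [5, 6, 7, 8]))
```

With this sketch, `coerce("Int", "3")` raises a `TypeError`, matching the rule that a JSON string is not a valid source for an `Int`.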

# Standard Library

## File stdout()

Returns a `File` reference to the stdout that this task generated.

## File stderr()

Returns a `File` reference to the stderr that this task generated.

## Array[String] read_lines(String|File)

Given a file-like object (`String`, `File`) as a parameter, this will read each line as a string and return an `Array[String]` representation of the lines in the file.

The order of the lines in the returned `Array[String]` must be the order in which the lines appear in the file-like object.

This task would `grep` through a file and return all strings that matched the pattern:

```wdl
task do_stuff {
  String pattern
  File file
  command {
    grep '${pattern}' ${file}
  }
  output {
    Array[String] matches = read_lines(stdout())
  }
}
```
1721
Jan 5, 2016 @scottfrazer Updating specification
1722 ## Array[Array[String]] read_tsv(String|File)
Jul 17, 2015 @scottfrazer Documenation updates
1723
Jan 5, 2016 @scottfrazer Updating specification
1724 the `read_tsv()` function takes one parameter, which is a file-like object (`String`, `File`) and returns an `Array[Array[String]]` representing the table from the TSV file.
Apr 29, 2015 Add language specification
1725
May 20, 2015 spec and grammar changes
1726 If the parameter is a `String`, this is assumed to be a local file path relative to the current working directory of the task.
Apr 29, 2015 Add language specification
1727
1728 For example, if I write a task that outputs a file to `./results/file_list.tsv`, and my task is defined as:
1729
Aug 15, 2015 @scottfrazer Fixing some syntax bugs
1730 ```wdl
Apr 29, 2015 Add language specification
1731 task do_stuff {
Aug 15, 2015 @scottfrazer Fixing some syntax bugs
1732 File file
Apr 29, 2015 Add language specification
1733 command {
Aug 15, 2015 @scottfrazer Fixing some syntax bugs
1734 python do_stuff.py ${file}
Apr 29, 2015 Add language specification
1735 }
1736 output {
Jul 17, 2015 @scottfrazer Documenation updates
1737 Array[Array[String]] output_table = read_tsv("./results/file_list.tsv")
Apr 29, 2015 Add language specification
1738 }
1739 }
1740 ```
1741
Jul 17, 2015 @scottfrazer Documenation updates
1742 Then when the task finishes, to fulfull the `outputs_table` variable, `./results/file_list.tsv` must be a valid TSV file or an error will be reported.
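
As a rough picture of the conversion, the Python sketch below parses TSV text into nested lists. It is an assumption-laden illustration: `parse_tsv` is a hypothetical name, it works on file contents already read into a string, and it does no quoting or escaping.

```python
def parse_tsv(text):
    """Split TSV text into rows, and each row into tab-separated fields."""
    return [line.split("\t") for line in text.splitlines()]


table = parse_tsv("a.fastq\t100\nb.fastq\t250\n")
print(table)
```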

## Map[String, String] read_map(String|File)

Given a file-like object (`String`, `File`) as a parameter, this will read each line from a file and expect the line to have the format `col1\tcol2`. In other words, the file-like object must be a two-column TSV file.

The following task would write a two-column TSV to standard out and that would be interpreted as a `Map[String, String]`:

```wdl
task do_stuff {
  String flags
  File file
  command {
    ./script --flags=${flags} ${file}
  }
  output {
    Map[String, String] mapping = read_map(stdout())
  }
}
```
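
The two-column requirement could be enforced along these lines. This Python sketch is illustrative only: `parse_map` is a hypothetical name, it takes the file contents as a string, and raising `ValueError` is just one possible way to report a malformed line.

```python
def parse_map(text):
    """Turn two-column TSV text into a dict, rejecting malformed lines."""
    mapping = {}
    for number, line in enumerate(text.splitlines(), start=1):
        columns = line.split("\t")
        if len(columns) != 2:
            raise ValueError(f"line {number}: expected 2 columns, found {len(columns)}")
        mapping[columns[0]] = columns[1]
    return mapping


mapping = parse_map("key1\tvalue1\nkey2\tvalue2\n")
print(mapping)
```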

## Object read_object(String|File)

Given a file-like object that contains a 2-row and n-column TSV file, this function will turn that into an Object.

```wdl
task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    CODE
  >>>
  output {
    Object my_obj = read_object(stdout())
  }
}
```

The command will output to stdout the following:

```
key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
```

Which would be turned into an `Object` in WDL that would look like this:

|Attribute|Value|
|---------|-----|
|key_1 |"value_1"|
|key_2 |"value_2"|
|key_3 |"value_3"|
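
The row-to-attribute mapping can be sketched in Python, using a `dict` as a stand-in for a WDL `Object`. The name `parse_object` is hypothetical, and the shape check reflects one reasonable reading of "2-row and n-column", not a rule stated by this specification.

```python
def parse_object(text):
    """Pair the header row with the value row of a two-row TSV."""
    rows = [line.split("\t") for line in text.splitlines()]
    if len(rows) != 2 or len(rows[0]) != len(rows[1]):
        raise ValueError("expected two rows with the same number of columns")
    return dict(zip(rows[0], rows[1]))


my_obj = parse_object("key_1\tkey_2\tkey_3\nvalue_1\tvalue_2\tvalue_3\n")
print(my_obj)
```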

## Array[Object] read_objects(String|File)

Given a file-like object that contains a TSV file with a header row followed by one or more rows of values, this function will turn that into an `Array[Object]`, with one `Object` per row of values.

```wdl
task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    print('\t'.join(["value_{}".format(i) for i in range(1, 4)]))
    CODE
  >>>
  output {
    Array[Object] my_obj = read_objects(stdout())
  }
}
```

The command will output to stdout the following:

```
key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
value_1\tvalue_2\tvalue_3
value_1\tvalue_2\tvalue_3
```

Which would be turned into an `Array[Object]` in WDL that would look like this:

|Index|Attribute|Value|
|-----|---------|-----|
|0 |key_1 |"value_1"|
| |key_2 |"value_2"|
| |key_3 |"value_3"|
|1 |key_1 |"value_1"|
| |key_2 |"value_2"|
| |key_3 |"value_3"|
|2 |key_1 |"value_1"|
| |key_2 |"value_2"|
| |key_3 |"value_3"|

## mixed read_json(String|File)

The `read_json()` function takes one parameter, which is a file-like object (`String`, `File`), and returns a data type which matches the data structure in the JSON file. The mapping of JSON type to WDL type is:

|JSON Type|WDL Type|
|---------|--------|
|object|`Map[String, ?]`|
|array|`Array[?]`|
|number|`Int` or `Float`|
|string|`String`|
|boolean|`Boolean`|
|null|???|

If the parameter is a `String`, this is assumed to be a local file path relative to the current working directory of the task.

For example, if I write a task that outputs a file to `./results/file_list.json`, and my task is defined as:

```wdl
task do_stuff {
  File file
  command {
    python do_stuff.py ${file}
  }
  output {
    Map[String, String] output_table = read_json("./results/file_list.json")
  }
}
```

Then when the task finishes, to fulfill the `output_table` variable, `./results/file_list.json` must be a valid JSON file or an error will be reported.

## Int read_int(String|File)

The `read_int()` function takes a file path which is expected to contain 1 line with 1 integer on it. This function returns that integer.

## String read_string(String|File)

The `read_string()` function takes a file path which is expected to contain 1 line with 1 string on it. This function returns that string.

The returned string should not include any trailing newline characters.

## Float read_float(String|File)

The `read_float()` function takes a file path which is expected to contain 1 line with 1 floating point number on it. This function returns that float.

## Boolean read_boolean(String|File)

The `read_boolean()` function takes a file path which is expected to contain 1 line with 1 Boolean value on it (either "true" or "false"). This function returns that Boolean value.

## File write_lines(Array[String])

Given something that's compatible with `Array[String]`, this writes each element to its own line in a file, with newline `\n` characters as line separators.

```wdl
task example {
  Array[String] array = ["first", "second", "third"]
  command {
    ./script --file-list=${write_lines(array)}
  }
}
```

If this task were run, the command might look like:

```
./script --file-list=/local/fs/tmp/array.txt
```

And `/local/fs/tmp/array.txt` would contain:

```
first
second
third
```
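
A minimal Python sketch of this serialization follows. It makes several assumptions not stated above: the helper is named `write_lines_to_temp`, the file is created with the standard `tempfile` module, and a newline is also written after the last element.

```python
import os
import tempfile


def write_lines_to_temp(strings):
    """Write one element per line to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", newline="\n") as handle:
        for item in strings:
            handle.write(item + "\n")
    return path


path = write_lines_to_temp(["first", "second", "third"])
# The returned path is what would be substituted into the command line.
print("./script --file-list=" + path)
```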

## File write_tsv(Array[Array[String]])

Given something that's compatible with `Array[Array[String]]`, this writes a TSV file of the data structure.

```wdl
task example {
  Array[Array[String]] array = [["one", "two", "three"], ["un", "deux", "trois"]]
  command {
    ./script --tsv=${write_tsv(array)}
  }
}
```

If this task were run, the command might look like:

```
./script --tsv=/local/fs/tmp/array.tsv
```

And `/local/fs/tmp/array.tsv` would contain:

```
one\ttwo\tthree
un\tdeux\ttrois
```

## File write_map(Map[String, String])

Given something that's compatible with `Map[String, String]`, this writes a TSV file of the data structure.

```wdl
task example {
  Map[String, String] map = {"key1": "value1", "key2": "value2"}
  command {
    ./script --map=${write_map(map)}
  }
}
```

If this task were run, the command might look like:

```
./script --map=/local/fs/tmp/map.tsv
```

And `/local/fs/tmp/map.tsv` would contain:

```
key1\tvalue1
key2\tvalue2
```

## File write_object(Object)

Given any `Object`, this will write out a 2-row, n-column TSV file with the object's attributes and values.

```
task test {
  Object input
  command <<<
    /bin/do_work --obj=${write_object(input)}
  >>>
  output {
    File results = stdout()
  }
}
```

If `input` were to have the value:

|Attribute|Value|
|---------|-----|
|key_1 |"value_1"|
|key_2 |"value_2"|
|key_3 |"value_3"|

The command would instantiate to:

```
/bin/do_work --obj=/path/to/input.tsv
```

Where `/path/to/input.tsv` would contain:

```
key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
```

## File write_objects(Array[Object])

Given any `Array[Object]`, this will write out a TSV file with 2 or more rows and n columns: a header row of attribute names followed by one row of values per object.

```wdl
task test {
  Array[Object] in
  command <<<
    /bin/do_work --obj=${write_objects(in)}
  >>>
  output {
    File results = stdout()
  }
}
```

If `in` were to have the value:

|Index|Attribute|Value|
|-----|---------|-----|
|0 |key_1 |"value_1"|
| |key_2 |"value_2"|
| |key_3 |"value_3"|
|1 |key_1 |"value_4"|
| |key_2 |"value_5"|
| |key_3 |"value_6"|
|2 |key_1 |"value_7"|
| |key_2 |"value_8"|
| |key_3 |"value_9"|

The command would instantiate to:

```
/bin/do_work --obj=/path/to/input.tsv
```

Where `/path/to/input.tsv` would contain:

```
key_1\tkey_2\tkey_3
value_1\tvalue_2\tvalue_3
value_4\tvalue_5\tvalue_6
value_7\tvalue_8\tvalue_9
```
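
The layout of that file can be sketched in Python, with each `Object` modeled as a `dict`. The function name `objects_to_tsv` is hypothetical, and two choices here are assumptions: an empty array is rejected, and every object must have the same attributes as the first one.

```python
def objects_to_tsv(objects):
    """Render a list of dicts as TSV text: one header row, then one row per object."""
    if not objects:
        raise ValueError("at least one object is required")
    header = list(objects[0])
    for obj in objects:
        if set(obj) != set(header):
            raise ValueError("all objects must have the same attributes")
    rows = [header] + [[obj[key] for key in header] for obj in objects]
    return "".join("\t".join(row) + "\n" for row in rows)


objects = [
    {"key_1": "value_1", "key_2": "value_2", "key_3": "value_3"},
    {"key_1": "value_4", "key_2": "value_5", "key_3": "value_6"},
    {"key_1": "value_7", "key_2": "value_8", "key_3": "value_9"},
]
tsv_text = objects_to_tsv(objects)
print(tsv_text, end="")
```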

## File write_json(mixed)

Given something with any type, this writes the JSON equivalent to a file. See the table in the definition of [read_json()](#mixed-read_jsonstringfile).

```wdl
task example {
  Map[String, String] map = {"key1": "value1", "key2": "value2"}
  command {
    ./script --map=${write_json(map)}
  }
}
```

If this task were run, the command might look like:

```
./script --map=/local/fs/tmp/map.json
```

And `/local/fs/tmp/map.json` would contain:

```
{
  "key1": "value1",
  "key2": "value2"
}
```

2080 # Data Types & Serialization
2081
Jan 5, 2016 @scottfrazer Updating specification
Tasks and workflows are given values for their input parameters in order to run. The types of those input parameters are declared on the `task` or `workflow`. An input parameter can be of any [valid type](#types):

Primitive Types:

* String
* Int
* Float
* File
* Boolean

Compound Types:

* Array[T] (e.g. `Array[String]`)
* Map[K, V] (e.g. `Map[Int, Int]`)
* Object

When a WDL workflow engine instantiates a command specified in the `command` section of a `task`, it must serialize all `${...}` tags in the command into primitive types.

For example, if I'm writing a tool that operates on a list of FASTQ files, there are a variety of ways that this list can be passed to that task:

* A file containing one file path per line (e.g. `Rscript analysis.R --files=fastq_list.txt`)
* A file containing a JSON list (e.g. `Rscript analysis.R --files=fastq_list.json`)
* Enumerated on the command line (e.g. `Rscript analysis.R 1.fastq 2.fastq 3.fastq`)

Each of these methods has its merits, and which one is best depends on the tool.

On the other end, tasks need to be able to communicate data structures back to the workflow engine. For example, let's say this same tool that takes a list of FASTQs wants to return a `Map[File, Int]` representing the number of reads in each FASTQ. A tool might choose to output it as a two-column TSV or as a JSON object, and WDL needs to know how to convert that to the proper data type.

WDL provides some [standard library functions](#standard-library) for converting compound types like `Array` into primitive types, like `File`.

When a task finishes, the `output` section defines how to convert the files and stdout/stderr into WDL types. For example,

```wdl
task test {
  Array[File] files
  command {
    Rscript analysis.R --files=${sep=',' files}
  }
  output {
    Array[String] strs = read_lines(stdout())
  }
}
```

Here, the expression `read_lines(stdout())` says "take the output from stdout, break into lines, and return that result as an Array[String]". See the definition of [read_lines](#arraystring-read_linesstringfile) and [stdout](#file-stdout) for more details.

## Serialization of Task Inputs

### Primitive Types

Serializing primitive inputs into strings is intuitively easy because the value is just turned into a string and inserted into the command line.

Consider this example:

```wdl
task output_example {
  String s
  Int i
  Float f

  command {
    python do_work.py ${s} ${i} ${f}
  }
}
```

If I provide values for the declarations in the task as:

|var|value|
|---|-----|
|s  |"str"|
|i  |2    |
|f  |1.3  |

Then, the command would be instantiated as:

```
python do_work.py str 2 1.3
```

### Compound Types

Compound types, like `Array` and `Map`, must be converted to a primitive type before they can be used in the command. There are many ways to turn a compound type into a primitive type, as laid out in the following sections.

#### Array serialization

Arrays can be serialized in two ways:

* **Array Expansion**: elements in the list are flattened to a string with a separator character.
* **File Creation**: create a file with the elements of the array in it and pass that file as the parameter on the command line.

##### Array serialization by expansion

Array expansion is used when a parameter is specified as `${sep=' ' my_param}`. `my_param` must be declared as an `Array` of primitive types. When the command is instantiated, the elements of `my_param` are joined together with the separator character (a space in this case). For example, with a comma as the separator:

```wdl
task test {
  Array[File] bams
  command {
    python script.py --bams=${sep=',' bams}
  }
}
```

If passed an array for the value of `bams`:

|Element       |
|--------------|
|/path/to/1.bam|
|/path/to/2.bam|
|/path/to/3.bam|

Would produce the command `python script.py --bams=/path/to/1.bam,/path/to/2.bam,/path/to/3.bam`
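
The joining step can be illustrated outside of WDL. The following Python is only a sketch of what array expansion amounts to, not how any particular engine implements it; `expand_array` is a hypothetical helper name introduced here for illustration.

```python
def expand_array(values, sep=" "):
    """Join array elements into a single command-line fragment.

    Hypothetical helper: mirrors what ${sep=',' bams} is described as doing.
    """
    return sep.join(str(value) for value in values)


bams = ["/path/to/1.bam", "/path/to/2.bam", "/path/to/3.bam"]

# Substitute the expanded array into the command template.
command = "python script.py --bams=" + expand_array(bams, sep=",")
print(command)
```

Note that this sketch does no quoting or escaping, so an element that contains the separator would not be recoverable by the receiving tool.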

##### Array serialization using write_lines()

An array may be turned into a file with each element in the array occupying a line in the file.

```wdl
task test {
  Array[File] bams
  command {
    sh script.sh ${write_lines(bams)}
  }
}
```

If `bams` is given this array:

|Element       |
|--------------|
|/path/to/1.bam|
|/path/to/2.bam|
|/path/to/3.bam|

Then, the resulting command line could look like:

```
sh script.sh /jobs/564758/bams
```

Where `/jobs/564758/bams` would contain:

```
/path/to/1.bam
/path/to/2.bam
/path/to/3.bam
```
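
As a rough illustration of the file-creation approach, the Python sketch below writes one element per line and substitutes the resulting path into the command. The helper name `write_lines_sketch`, the use of a temporary directory in place of an engine-chosen job directory, and the trailing newline after the last element are all assumptions made for this example.

```python
import os
import tempfile


def write_lines_sketch(values, directory):
    """Write one element per line and return the path of the new file.

    Hypothetical helper; a real engine chooses its own file location.
    """
    fd, path = tempfile.mkstemp(dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        for value in values:
            handle.write(f"{value}\n")
    return path


bams = ["/path/to/1.bam", "/path/to/2.bam", "/path/to/3.bam"]

with tempfile.TemporaryDirectory() as job_dir:
    path = write_lines_sketch(bams, job_dir)
    # The command receives the file path, not the array elements.
    command = f"sh script.sh {path}"
    with open(path) as handle:
        contents = handle.read()
```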

##### Array serialization using write_json()

The array may be turned into a JSON document with the file path for the JSON file passed in as the parameter:

```wdl
task test {
  Array[File] bams
  command {
    sh script.sh ${write_json(bams)}
  }
}
```

If `bams` is given this array:

|Element       |
|--------------|
|/path/to/1.bam|
|/path/to/2.bam|
|/path/to/3.bam|

Then, the resulting command line could look like:

```
sh script.sh /jobs/564758/bams.json
```

Where `/jobs/564758/bams.json` would contain:

```
[
  "/path/to/1.bam",
  "/path/to/2.bam",
  "/path/to/3.bam"
]
```

#### Map serialization

Map types cannot be serialized on the command line directly and must be serialized through a file.

##### Map serialization using write_map()

The map type can be serialized as a two-column TSV file using the `write_map()` function. The parameter on the command line is given the path to that file:

```wdl
task test {
  Map[String, Float] sample_quality_scores
  command {
    sh script.sh ${write_map(sample_quality_scores)}
  }
}
```

If `sample_quality_scores` is given this `Map[String, Float]`:

|Key    |Value |
|-------|------|
|sample1|98    |
|sample2|95    |
|sample3|75    |

Then, the resulting command line could look like:

```
sh script.sh /jobs/564757/sample_quality_scores.tsv
```

Where `/jobs/564757/sample_quality_scores.tsv` would contain:

```
sample1\t98
sample2\t95
sample3\t75
```
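
The two-column layout can be sketched in a few lines of Python. This is an illustration of the file format only; `map_to_tsv` is a hypothetical name, and rejecting keys or values that contain tabs or newlines is an assumption of this sketch rather than something the text above specifies.

```python
def map_to_tsv(mapping):
    """Render a mapping as two-column TSV text, one key/value pair per line.

    Hypothetical helper illustrating the layout described for write_map().
    """
    lines = []
    for key, value in mapping.items():
        key_text, value_text = str(key), str(value)
        for text in (key_text, value_text):
            # Assumption: a tab or newline inside a field would corrupt the table.
            if "\t" in text or "\n" in text:
                raise ValueError(f"field cannot be written as TSV: {text!r}")
        lines.append(f"{key_text}\t{value_text}\n")
    return "".join(lines)


scores = {"sample1": 98, "sample2": 95, "sample3": 75}
tsv = map_to_tsv(scores)
```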

##### Map serialization using write_json()

The map type can also be serialized as a JSON file using the `write_json()` function. The parameter on the command line is given the path to that file:

```wdl
task test {
  Map[String, Float] sample_quality_scores
  command {
    sh script.sh ${write_json(sample_quality_scores)}
  }
}
```

If `sample_quality_scores` is given this map:

|Key    |Value |
|-------|------|
|sample1|98    |
|sample2|95    |
|sample3|75    |

Then, the resulting command line could look like:

```
sh script.sh /jobs/564757/sample_quality_scores.json
```

Where `/jobs/564757/sample_quality_scores.json` would contain:

```
{
  "sample1": 98,
  "sample2": 95,
  "sample3": 75
}
```

#### Object serialization

An object is a more general case of a map where the keys are strings and the values are of arbitrary types and treated as strings. Objects can be serialized with either the `write_object()` or the `write_json()` function.

##### Object serialization using write_object()

```wdl
task test {
  Object sample
  command {
    perl script.pl ${write_object(sample)}
  }
}
```

If `sample` is provided as:

|Attribute|Value |
|---------|------|
|attr1    |value1|
|attr2    |value2|
|attr3    |value3|
|attr4    |value4|

Then, the resulting command line could look like:

```
perl script.pl /jobs/564759/sample.tsv
```

Where `/jobs/564759/sample.tsv` would contain:

```
attr1\tattr2\tattr3\tattr4
value1\tvalue2\tvalue3\tvalue4
```

##### Object serialization using write_json()

```wdl
task test {
  Object sample
  command {
    perl script.pl ${write_json(sample)}
  }
}
```

If `sample` is provided as:

|Attribute|Value |
|---------|------|
|attr1    |value1|
|attr2    |value2|
|attr3    |value3|
|attr4    |value4|

Then, the resulting command line could look like:

```
perl script.pl /jobs/564759/sample.json
```

Where `/jobs/564759/sample.json` would contain:

```
{
  "attr1": "value1",
  "attr2": "value2",
  "attr3": "value3",
  "attr4": "value4"
}
```

#### Array[Object] serialization

All objects in an `Array[Object]` must have the same set of attributes. Such an array can be serialized with either the `write_objects()` or the `write_json()` function, as described in the following sections.

##### Array[Object] serialization using write_objects()

An `Array[Object]` can be serialized using `write_objects()` into a TSV file:

```wdl
task test {
  Array[Object] sample
  command {
    perl script.pl ${write_objects(sample)}
  }
}
```

If `sample` is provided as:

|Index|Attribute|Value  |
|-----|---------|-------|
|0    |attr1    |value1 |
|     |attr2    |value2 |
|     |attr3    |value3 |
|     |attr4    |value4 |
|1    |attr1    |value5 |
|     |attr2    |value6 |
|     |attr3    |value7 |
|     |attr4    |value8 |

Then, the resulting command line could look like:

```
perl script.pl /jobs/564759/sample.tsv
```

Where `/jobs/564759/sample.tsv` would contain:

```
attr1\tattr2\tattr3\tattr4
value1\tvalue2\tvalue3\tvalue4
value5\tvalue6\tvalue7\tvalue8
```
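
The header-plus-rows layout, together with the requirement that every object share the same attributes, can be sketched in Python as follows. `objects_to_tsv` is a hypothetical name, objects are modelled as plain dictionaries, and taking the column order from the first object is an assumption of this sketch.

```python
def objects_to_tsv(objects):
    """Render a list of same-shaped objects as TSV text: header row, then one row each.

    Hypothetical helper illustrating the layout described for write_objects().
    """
    if not objects:
        raise ValueError("need at least one object to derive the header row")
    header = list(objects[0].keys())  # assumption: first object fixes column order
    for index, obj in enumerate(objects):
        if set(obj.keys()) != set(header):
            raise ValueError(f"object {index} does not have the same attributes")
    rows = [header] + [[str(obj[name]) for name in header] for obj in objects]
    return "".join("\t".join(row) + "\n" for row in rows)


sample = [
    {"attr1": "value1", "attr2": "value2", "attr3": "value3", "attr4": "value4"},
    {"attr1": "value5", "attr2": "value6", "attr3": "value7", "attr4": "value8"},
]
tsv = objects_to_tsv(sample)
```

A single `Object` is the one-element case of the same layout: a header row followed by exactly one value row.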

##### Array[Object] serialization using write_json()

An `Array[Object]` can be serialized using `write_json()` into a JSON file:

```wdl
task test {
  Array[Object] sample
  command {
    perl script.pl ${write_json(sample)}
  }
}
```

If `sample` is provided as:

|Index|Attribute|Value  |
|-----|---------|-------|
|0    |attr1    |value1 |
|     |attr2    |value2 |
|     |attr3    |value3 |
|     |attr4    |value4 |
|1    |attr1    |value5 |
|     |attr2    |value6 |
|     |attr3    |value7 |
|     |attr4    |value8 |

Then, the resulting command line could look like:

```
perl script.pl /jobs/564759/sample.json
```

Where `/jobs/564759/sample.json` would contain:

```
[
  {
    "attr1": "value1",
    "attr2": "value2",
    "attr3": "value3",
    "attr4": "value4"
  },
  {
    "attr1": "value5",
    "attr2": "value6",
    "attr3": "value7",
    "attr4": "value8"
  }
]
```

## De-serialization of Task Outputs

A task's command can only output data as files. Therefore, every de-serialization function in WDL takes a file input and returns a WDL type.

### Primitive Types

De-serialization of primitive types is done through a `read_*` function, for example `read_int("file/path")` or `read_string("file/path")`.

For example, if I have a task that outputs a `String` and an `Int`:

```wdl
task output_example {
  String param1
  String param2
  command {
    python do_work.py ${param1} ${param2} --out1=int_file --out2=str_file
  }
  output {
    Int my_int = read_int("int_file")
    String my_str = read_string("str_file")
  }
}
```

Both files `int_file` and `str_file` should contain one line with the value on that line. This value is then validated against the type of the variable. If `int_file` contains a line with the text "foobar", the workflow must fail this task with an error.
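
The validation step described above can be sketched in Python. This is only an illustration of the rule (one line, which must parse as the declared type); `read_int_sketch` is a hypothetical name, and raising `ValueError` stands in for whatever error an engine reports when it fails the task.

```python
def read_int_sketch(text):
    """Parse file contents that are expected to hold a single integer line.

    Hypothetical helper: raises ValueError where an engine would fail the task.
    """
    lines = text.splitlines()
    if len(lines) != 1:
        raise ValueError(f"expected exactly one line, found {len(lines)}")
    # int() raises ValueError for text such as "foobar".
    return int(lines[0].strip())


my_int = read_int_sketch("42\n")

try:
    read_int_sketch("foobar\n")
    task_failed = False
except ValueError:
    task_failed = True
```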

### Compound Types

Tasks can also output an `Array`, `Map`, or `Object` data structure to a file or stdout/stderr in two major formats:

* JSON - because it fits naturally with the types within WDL
* Text based / TSV - These are usually simple table and text-based encodings (e.g. `Array[String]` could be serialized by having each element be a line in a file)

#### Array deserialization

Arrays are deserialized from:

* Files that contain a JSON Array as their top-level element.
* Any file where it is desirable to interpret each line as an element of the `Array`.

##### Array deserialization using read_lines()

`read_lines()` will return an `Array[String]` where each element in the array is a line in the file.

This return value can be auto converted to other `Array` types. For example:

```wdl
task test {
  command <<<
    python <<CODE
    import random
    for i in range(10):
      print(random.randrange(10))
    CODE
  >>>
  output {
    Array[Int] my_ints = read_lines(stdout())
  }
}
```

`my_ints` would contain ten random integers ranging from 0 to 9.

##### Array deserialization using read_json()

`read_json()` will return whatever data type resides in that JSON file.

```wdl
task test {
  command <<<
    echo '["foo", "bar"]'
  >>>
  output {
    Array[String] my_array = read_json(stdout())
  }
}
```

This task would assign the array with elements `"foo"` and `"bar"` to `my_array`.

If the echo statement was instead `echo '{"foo": "bar"}'`, the engine MUST fail the task for a type mismatch.

#### Map deserialization

Maps are deserialized from:

* Files that contain a JSON Object as their top-level element.
* Files that contain a two-column TSV file.

##### Map deserialization using read_map()

`read_map()` will return a `Map[String, String]` where the keys are the first column in the TSV input file and the corresponding values are the second column.

This return value can be auto converted to other `Map` types. For example:

```wdl
task test {
  command <<<
    python <<CODE
    for i in range(3):
      print("key_{idx}\t{idx}".format(idx=i))
    CODE
  >>>
  output {
    Map[String, Int] my_ints = read_map(stdout())
  }
}
```

This would put a map containing three keys (`key_0`, `key_1`, and `key_2`) and three respective values (`0`, `1`, and `2`) as the value of `my_ints`.
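
The parse-then-convert sequence can be sketched in Python: first read the two columns as strings, then coerce the values to the declared type. `read_map_sketch` is a hypothetical name, and failing on any line that does not have exactly two columns is an assumption of this sketch.

```python
def read_map_sketch(text):
    """Parse two-column TSV text into a dict of string keys to string values.

    Hypothetical helper illustrating the layout described for read_map().
    """
    result = {}
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 2 columns, found {len(fields)}")
        key, value = fields
        result[key] = value
    return result


# Same text the Python heredoc in the task above prints to stdout.
raw = read_map_sketch("key_0\t0\nkey_1\t1\nkey_2\t2\n")

# Conversion from Map[String, String] to Map[String, Int].
my_ints = {key: int(value) for key, value in raw.items()}
```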

##### Map deserialization using read_json()

`read_json()` will return whatever data type resides in that JSON file. If that file contains a JSON object with homogeneous key/value pair types (e.g. `string -> int` pairs), then the `read_json()` function would return a `Map`.

```wdl
task test {
  command <<<
    echo '{"foo":"bar"}'
  >>>
  output {
    Map[String, String] my_map = read_json(stdout())
  }
}
```

This task would assign the one key-value pair map in the echo statement to `my_map`.

If the echo statement was instead `echo '["foo", "bar"]'`, the engine MUST fail the task for a type mismatch.

#### Object deserialization

Objects are deserialized from files that contain a two-row, n-column TSV file. The first row contains the object attribute names and the corresponding entries on the second row are the values.

##### Object deserialization using read_object()

`read_object()` will return an `Object` where the keys are the first row in the TSV input file and the corresponding values are the second row (corresponding column).

```wdl
task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(3)]))
    print('\t'.join(["value_{}".format(i) for i in range(3)]))
    CODE
  >>>
  output {
    Object my_obj = read_object(stdout())
  }
}
```

This would put an object containing three attributes (`key_0`, `key_1`, and `key_2`) and three respective values (`value_0`, `value_1`, and `value_2`) as the value of `my_obj`.

#### Array[Object] deserialization

`Array[Object]` MUST assume that all objects in the array are homogeneous (they have the same attributes, but the attributes don't have to have the same values).

An `Array[Object]` is deserialized from a file that contains a uniform n-column TSV with at least 2 rows. The first row contains the object attribute names and the corresponding entries on each subsequent row are the values of one object.

##### Object deserialization using read_objects()

`read_objects()` will return an `Array[Object]` where the keys of every object are the first row in the TSV input file and each subsequent row provides the corresponding values (by column) for one object.

```wdl
task test {
  command <<<
    python <<CODE
    print('\t'.join(["key_{}".format(i) for i in range(3)]))
    print('\t'.join(["value_{}".format(i) for i in range(3)]))
    print('\t'.join(["value_{}".format(i) for i in range(3)]))
    print('\t'.join(["value_{}".format(i) for i in range(3)]))
    CODE
  >>>
  output {
    Array[Object] my_obj = read_objects(stdout())
  }
}
```

This would create an array of **three identical** `Object`s containing three attributes (`key_0`, `key_1`, and `key_2`) and three respective values (`value_0`, `value_1`, and `value_2`) as the value of `my_obj`.
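
The row-to-object mapping can be sketched in Python, with objects modelled as dictionaries of strings. `read_objects_sketch` is a hypothetical name, and rejecting rows whose column count differs from the header is an assumption that follows from the "uniform n-column" requirement above.

```python
def read_objects_sketch(text):
    """Parse TSV text (header row plus value rows) into a list of dicts.

    Hypothetical helper illustrating the layout described for read_objects().
    """
    rows = [line.split("\t") for line in text.splitlines()]
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one value row")
    header, value_rows = rows[0], rows[1:]
    objects = []
    for number, row in enumerate(value_rows, start=2):
        if len(row) != len(header):
            raise ValueError(f"row {number}: expected {len(header)} columns")
        # Pair each attribute name with the value in the same column.
        objects.append(dict(zip(header, row)))
    return objects


# Same text the task above prints: one header row and three identical value rows.
header_line = "\t".join(f"key_{i}" for i in range(3))
value_line = "\t".join(f"value_{i}" for i in range(3))
my_obj = read_objects_sketch("\n".join([header_line] + [value_line] * 3) + "\n")
```

Reading a single `Object` corresponds to the case where exactly one value row follows the header.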