Document more features of cwltool (#246)
* Show cwltool --make-template to users in the chapter where we introduce the inputs objects
* Add a troubleshooting section for the cachedir docs

Co-authored-by: Michael R. Crusoe <1330696+mr-c@users.noreply.github.com>
kinow and mr-c committed Oct 10, 2022
1 parent df4a7a2 commit 611c806
Showing 8 changed files with 174 additions and 1 deletion.
2 changes: 1 addition & 1 deletion .github/workflows/gh-pages.yaml
@@ -17,7 +17,7 @@ jobs:

 - name: Install apt packages
   run: |
-    sudo apt-get install -y graphviz
+    sudo apt-get install -y graphviz tree
 - name: Set up Python
   uses: actions/setup-python@v4
1 change: 1 addition & 0 deletions .gitignore
@@ -10,6 +10,7 @@ _site
.Rhistory
.RData
_build/
build/
*.egg-info/

src/_includes/cwl/**/output.txt
1 change: 1 addition & 0 deletions .readthedocs.yaml
@@ -9,6 +9,7 @@ build:
    nodejs: "16"
  apt_packages:
    - graphviz
    - tree

sphinx:
  configuration: src/conf.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
cwlVersion: v1.2
class: Workflow

inputs:
  text:
    type: string
    default: 'Hello World'
outputs:
  reversed_message:
    type: string
    outputSource: step_b/reversed_message

steps:
  step_a:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        text: string
      outputs:
        step_a_stdout:
          type: File
          outputBinding:
            glob: 'stdout.txt'
      baseCommand: echo
      arguments: [ '-n', '$(inputs.text)' ]
    in:
      text: text
    out: [step_a_stdout]
  step_b:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        step_a_stdout: File
      outputs:
        reversed_message:
          type: string
          outputBinding:
            glob: stdout.txt
            loadContents: true
            outputEval: $(self[0].contents)
      baseCommand: rev
      arguments: [ $(inputs.step_a_stdout) ]
    in:
      step_a_stdout:
        source: step_a/step_a_stdout
    out: [reversed_message]
47 changes: 47 additions & 0 deletions src/_includes/cwl/troubleshooting/troubleshooting-wf1.cwl
@@ -0,0 +1,47 @@
cwlVersion: v1.2
class: Workflow

inputs:
  text:
    type: string
    default: 'Hello World'
outputs:
  reversed_message:
    type: string
    outputSource: step_b/reversed_message

steps:
  step_a:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        text: string
      outputs:
        step_a_stdout:
          type: File
          outputBinding:
            glob: 'stdout.txt'
      arguments: ['echo', '-n', '$(inputs.text)']
    in:
      text: text
    out: [step_a_stdout]
  step_b:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        step_a_stdout: File
      outputs:
        reversed_message:
          type: string
          outputBinding:
            glob: stdout.txt
            loadContents: true
            outputEval: $(self[0].contents)
      baseCommand: revv
      arguments: [ $(inputs.step_a_stdout) ]
    in:
      step_a_stdout:
        source: step_a/step_a_stdout
    out: [reversed_message]
1 change: 1 addition & 0 deletions src/topics/index.md
@@ -21,4 +21,5 @@ best-practices.md
file-formats.md
metadata-and-authorship.md
specifying-software-requirements.md
troubleshooting.md
```
12 changes: 12 additions & 0 deletions src/topics/inputs.md
@@ -29,6 +29,18 @@ Create a file called `inp-job.yml`:
:name: inp-job.yml
```

````{note}
You can use `cwltool` to create a template input object. That saves you from having
to type all the input parameters in an input object file:
```{runcmd} cwltool --make-template inp.cwl
:working-directory: src/_includes/cwl/inputs
```
You can redirect the output to a file, e.g. `cwltool --make-template inp.cwl > inp-job.yml`,
and then replace the default values with your desired input values.
````

Notice that "example_file", as a `File` type, must be provided as an
object with the fields `class: File` and `path`.

63 changes: 63 additions & 0 deletions src/topics/troubleshooting.md
@@ -0,0 +1,63 @@
# Troubleshooting

This section describes ways to troubleshoot problems when executing CWL.
We focus on `cwltool` here, but some of these techniques may apply to other CWL runners.

## Run `cwltool` with `cachedir`

You can use the `--cachedir` option when running a workflow to tell `cwltool` to
cache intermediate files (files that are neither inputs nor outputs, but are created
while your workflow is running). By default, these files are created in a
temporary directory, but writing them to a separate directory makes them
easier to access.

The following example, `troubleshooting-wf1.cwl`, has two steps, `step_a` and `step_b`.
The workflow is equivalent to `echo "Hello World" | rev`, which prints the message
"Hello World" reversed, i.e. "dlroW olleH". However, the second step, `step_b`, **has a typo**:
instead of executing the `rev` command it tries to execute `revv`, which
fails.

```{literalinclude} /_includes/cwl/troubleshooting/troubleshooting-wf1.cwl
:language: cwl
:name: "`troubleshooting-wf1.cwl`"
:caption: "`troubleshooting-wf1.cwl`"
:emphasize-lines: 42
```
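
The pipeline that this workflow models can also be tried directly in a shell, which helps separate problems in the CWL document from problems with the underlying commands. The following is a minimal sketch, assuming `rev` is installed and that nothing named `revv` exists on your machine; the variable names are chosen for illustration only.

```shell
# Reproduce the intended pipeline outside CWL (requires the `rev` utility).
message='Hello World'
reversed=$(printf '%s' "$message" | rev)
echo "$reversed"   # prints: dlroW olleH

# Check whether the misspelled executable can be found on the PATH.
# (Assumption: no program called `revv` is installed.)
if command -v revv >/dev/null 2>&1; then
  revv_status='found'
else
  revv_status='missing'
fi
echo "revv is $revv_status"
```

If the second check reports `missing`, any step that uses `revv` as its command cannot succeed, regardless of how the rest of the workflow is written.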

Let's execute this workflow with `/tmp/cachedir/` as the `--cachedir` value (`cwltool` will
create the directory for you if it does not exist already):

```{runcmd} cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1.cwl
:working-directory: src/_includes/cwl/troubleshooting
:emphasize-lines: 12-14, 19-21
```

The workflow ends in the `permanentFail` status because `step_b` failed to execute the
non-existent `revv` command. `step_a` executed successfully and its output
has been cached in your `cachedir` location. You can inspect the intermediate files
that were created:

```{runcmd} tree /tmp/cachedir
:emphasize-lines: 4
```

Each workflow step has received a unique ID (the long value that looks like a hash).
The `${HASH}.status` files record the status of each step executed by the workflow.
The `step_a` output file `stdout.txt` is also visible in the output of the command above.
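
To read the status files without opening each one by hand, a small shell function can print them together. This is only a sketch: `show_step_status` is a hypothetical helper, not part of `cwltool`, and it assumes each `.status` file holds a short status string (such as `permanentFail`).

```shell
# Print the recorded status of every cached step in a cache directory.
# `show_step_status` is a hypothetical helper, not a cwltool command.
show_step_status() {
  cache_dir=$1
  for status_file in "$cache_dir"/*.status; do
    # Skip the unexpanded pattern when the directory has no .status files.
    [ -e "$status_file" ] || continue
    step_id=$(basename "$status_file" .status)
    printf '%s: %s\n' "$step_id" "$(cat "$status_file")"
  done
}

# Uses the cache directory from the example above; prints nothing if it is absent.
show_step_status /tmp/cachedir
```

Each output line pairs a step's ID with its recorded status, which makes it easier to spot the step that failed.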

Now fix the typo so `step_b` executes `rev` (i.e. replace `revv` with `rev` in
`step_b`). After fixing the typo, execute `cwltool` with the same arguments
as before. Note that the `cwltool` output now contains information about
pre-cached outputs for `step_a`, and about a new cache entry for the output of `step_b`.
Also note that `step_b` now reports success.

```{runcmd} cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1-stepb-fixed.cwl
:working-directory: src/_includes/cwl/troubleshooting
:emphasize-lines: 12, 16-18
```
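
If you prefer to keep the broken workflow for comparison, the correction can be written to a separate copy with a one-line substitution. The sketch below is one possible approach: `fix_step_b_typo` and the output name `my-fixed.cwl` are hypothetical, and the resulting file differs from the original only on the `baseCommand` line (the `troubleshooting-wf1-stepb-fixed.cwl` file in this repository also writes `step_a` slightly differently).

```shell
# Write a corrected copy of a workflow, leaving the original untouched.
# `fix_step_b_typo` is a hypothetical helper for this example only.
fix_step_b_typo() {
  src=$1
  dest=$2
  # Replace the misspelled executable on the baseCommand line only.
  sed 's/^\([[:space:]]*baseCommand:[[:space:]]*\)revv$/\1rev/' "$src" > "$dest"
}

# Example (run in src/_includes/cwl/troubleshooting):
#   fix_step_b_typo troubleshooting-wf1.cwl my-fixed.cwl
#   diff troubleshooting-wf1.cwl my-fixed.cwl
```

Anchoring the pattern to the `baseCommand` line avoids touching any other text that happens to contain `revv`.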

In this example the workflow step `step_a` was not re-evaluated because it had been cached and
there was no change in its execution or output. Furthermore, `cwltool` was able to recognize
that it had to re-evaluate `step_b` after we fixed the executable name. This technique is
useful for troubleshooting your CWL documents and also as a way to prevent `cwltool` from
re-evaluating steps unnecessarily.
