Python Utilities and Autogenerated Classes for loading and parsing CWL v1.0, CWL v1.1, and CWL v1.2 documents.

Requires Python 3.8+


pip3 install cwl-utils

To install from source:

git clone
cd cwl-utils
pip3 install .


Pull all the referenced software container images

cwl-docker-extract is useful to cache or pre-pull all software container images referenced in a CWL CommandLineTool or CWL Workflow (including all referenced CommandLineTools and sub-Workflows and so on).

The default behaviour is to use the Docker engine to download and save the software container images in Docker format.

cwl-docker-extract path_to_my_workflow.cwl
cwl-docker-extract --dir DIRECTORY path_to_my_workflow.cwl

Or you can use the Singularity software container engine to download and save the software container images and convert them to the Singularity format at the same time.

cwl-docker-extract --singularity --dir DIRECTORY path_to_my_workflow.cwl
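
To illustrate what "referenced (recursively)" means here, the following is a minimal sketch, not the tool's actual implementation. It assumes the CWL document has already been loaded into plain Python dictionaries and that steps embed their tools inline; the helper name iter_docker_pulls is hypothetical. Steps whose run field is a file path are skipped rather than resolved.

```python
from typing import Any, Dict, Iterator


def iter_docker_pulls(process: Dict[str, Any]) -> Iterator[str]:
    """Yield every dockerPull value in a CWL process and its embedded steps."""
    for section in ("requirements", "hints"):
        entries = process.get(section, [])
        # CWL accepts these sections as a list of objects or as a map keyed by class name.
        if isinstance(entries, dict):
            entries = [{"class": name, **(body or {})} for name, body in entries.items()]
        for entry in entries:
            if entry.get("class") == "DockerRequirement" and "dockerPull" in entry:
                yield entry["dockerPull"]
    steps = process.get("steps", [])
    if isinstance(steps, dict):
        steps = list(steps.values())
    for step in steps:
        run = step.get("run")
        if isinstance(run, dict):  # embedded tool or sub-workflow; file references are ignored
            yield from iter_docker_pulls(run)


# Hypothetical workflow used only for illustration.
workflow = {
    "class": "Workflow",
    "hints": [{"class": "DockerRequirement", "dockerPull": "debian:stable-slim"}],
    "steps": {
        "count": {
            "run": {
                "class": "CommandLineTool",
                "requirements": {"DockerRequirement": {"dockerPull": "alpine:3.19"}},
            }
        }
    },
}
images = sorted(set(iter_docker_pulls(workflow)))
print(images)
```

The image names are deduplicated before use, since the same container is often referenced by several steps.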

Print all referenced software packages

cwl-cite-extract prints all software packages found (recursively) in the specified CWL document.

Currently, the package name and any listed specs and version fields are printed for every SoftwareRequirement found.

cwl-cite-extract path_to_my_workflow.cwl
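
As a rough picture of the information involved, the sketch below walks a CWL document held as plain Python dictionaries and collects the package, version, and specs fields of each SoftwareRequirement. It is an assumption-laden illustration rather than the tool's code: the helper name iter_packages and the example package are hypothetical, only the list form of packages is handled, and steps that reference external files are not followed.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_packages(process: Dict[str, Any]) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (package, version, specs) for each SoftwareRequirement package found."""
    for section in ("requirements", "hints"):
        entries = process.get(section, [])
        if isinstance(entries, dict):  # map form keyed by class name
            entries = [{"class": name, **(body or {})} for name, body in entries.items()]
        for entry in entries:
            if entry.get("class") != "SoftwareRequirement":
                continue
            # Assumption: "packages" is given in list form.
            for pkg in entry.get("packages", []):
                yield pkg["package"], pkg.get("version", []), pkg.get("specs", [])
    steps = process.get("steps", [])
    if isinstance(steps, dict):
        steps = list(steps.values())
    for step in steps:
        run = step.get("run")
        if isinstance(run, dict):  # only embedded processes are visited
            yield from iter_packages(run)


# Hypothetical tool description used only for illustration.
tool = {
    "class": "CommandLineTool",
    "hints": [
        {
            "class": "SoftwareRequirement",
            "packages": [
                {
                    "package": "example-tool",
                    "version": ["1.0"],
                    "specs": ["https://example.org/example-tool"],
                }
            ],
        }
    ],
}
packages = list(iter_packages(tool))
for name, version, specs in packages:
    print(name, version, specs)
```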

Replace CWL Expressions with concrete steps

cwl-expression-refactor refactors CWL documents so that any CWL Expression evaluations become separate steps (either CWL ExpressionTools or CWL CommandLineTools). This allows execution by CWL engines that do not want to support inline expression evaluation outside of concrete steps, or do not want to directly support CWL's optional InlineJavascriptRequirement at all.

cwl-expression-refactor directory/path/to/save/outputs path_to_my_workflow.cwl [more_workflows.cwl]

Split a packed CWL document

cwl-graph-split splits a packed CWL document file into multiple files.

Packed CWL documents use the $graph construct to contain multiple CWL Process objects (Workflow, CommandLineTool, ExpressionTool, Operation). Typically packed CWL documents contain a CWL Workflow under the name "main" and the workflow steps (including any sub-workflows).

cwl-graph-split --outdir optional/directory/path/to/save/outputs path_to_my_workflow.cwl
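
The structure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not what cwl-graph-split actually does: the function name split_graph and the file-naming rule are hypothetical, and references between the split documents (for example the run fields of workflow steps) are left untouched.

```python
from typing import Any, Dict


def split_graph(packed: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map a hypothetical output filename to each process in a packed $graph."""
    version = packed.get("cwlVersion")
    documents: Dict[str, Dict[str, Any]] = {}
    for process in packed["$graph"]:
        # Packed ids look like "#main" or "#md5sum.cwl"; drop the leading "#".
        name = process["id"].lstrip("#")
        filename = name if name.endswith(".cwl") else f"{name}.cwl"
        document = {key: value for key, value in process.items() if key != "id"}
        if version is not None:
            # Each standalone document needs its own cwlVersion.
            document = {"cwlVersion": version, **document}
        documents[filename] = document
    return documents


# Hypothetical packed document used only for illustration.
packed = {
    "cwlVersion": "v1.2",
    "$graph": [
        {"id": "#main", "class": "Workflow", "inputs": {}, "outputs": {}, "steps": {}},
        {
            "id": "#md5sum.cwl",
            "class": "CommandLineTool",
            "baseCommand": "md5sum",
            "inputs": {},
            "outputs": {},
        },
    ],
}
files = split_graph(packed)
print(sorted(files))
```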

Normalize a CWL document

cwl-normalizer normalizes one or more CWL documents so that, for each document, a JSON-format CWL document is produced with it and all of its dependencies packed together, upgrading to CWL v1.2 as needed. It can optionally refactor CWL Expressions into separate steps in the manner of cwl-expression-refactor.

cwl-normalizer directory/path/to/save/outputs path_to_my_workflow.cwl [more_workflows.cwl]

Using the CWL Parsers

from pathlib import Path

from cwl_utils.parser import load_document_by_uri, save

# File Input - This is the only thing you will need to adjust or take in as an input to your function:
cwl_file = Path("testdata/md5sum.cwl")  # or a plain string works as well

# Import CWL Object
cwl_obj = load_document_by_uri(cwl_file)

# View CWL Object
print("List of object attributes:\n{}".format("\n".join(map(str, dir(cwl_obj)))))

# Export CWL Object into a built-in typed object
saved_obj = save(cwl_obj)
print(f"Export of the loaded CWL object: {saved_obj}.")


Regenerate parsers

To regenerate the parsers, install the schema_salad package and run:

cwl_utils/parser/ was created via schema-salad-tool --codegen python --codegen-parser-info "org.w3id.cwl.v1_0" > cwl_utils/parser/

cwl_utils/parser/ was created via schema-salad-tool --codegen python --codegen-parser-info "org.w3id.cwl.v1_1" > cwl_utils/parser/

cwl_utils/parser/ was created via schema-salad-tool --codegen python --codegen-parser-info "org.w3id.cwl.v1_2" > cwl_utils/parser/


To release CWLUtils, bump the version in cwl_utils/, and tag that commit with the new version. The gh-action-pypi-publish should release that tag.