WES Tutorial for running CWL workflows through the Toil Engine on AWS AGC


The process for running WDL workflows through the Cromwell engine on AWS AGC is very similar to the one described in this tutorial, but there are a few differences. For information on how to launch a WDL workflow on AWS AGC, please consult this tutorial <wes-wdl-agc-tutorial>.

Amazon Web Services' (AWS) Amazon Genomics CLI (AGC) is a command line tool for launching cloud infrastructure within AWS accounts that can be used to execute genomics workflows. The infrastructure deployed by AGC implements the WES standard, and thus can be directly used by the Dockstore CLI.

Check out the official AGC Info page.

For developers, check out the official AGC GitHub page.

Download and Install AWS AGC

AGC provides a quick-start guide for initial setup and getting familiar with the tool. The following workflow execution tutorial will cover all steps for using AGC once it has been installed.

Tutorial Topics

The following CWL workflow tutorials will cover:

  1. Deploying an AGC project and context
  2. Configuring the Dockstore CLI to communicate with AGC infrastructure
  3. Launching a workflow

Configuring AGC and the Dockstore CLI

  1. Create a file named agc-project.yaml that contains:

    name: dockstoreAgcTutorialProject
    schemaVersion: 1
    contexts:
      ctx2:
        engines:
          - type: cwl
            engine: toil

This will create an AGC project named dockstoreAgcTutorialProject, with a single context named ctx2.


For AGC infrastructure to interact with an S3 resource, the desired S3 bucket must be specified in the project's agc-project.yaml file, and your AWS account must already have access to the S3 resource. For more information on how to do this, please click here.

  2. Activate AGC on your account. If this is your first time running AGC on an account, this may take a few minutes.

    agc account activate

  3. Deploy an AGC context by running the command below in the same directory as agc-project.yaml. This will take approximately 10 minutes.

    agc context deploy ctx2

  4. Retrieve the WES endpoint created by the context. This will return a few values; the WES endpoint is the value under WESENDPOINT:

    agc context describe ctx2

  5. Copy the WES endpoint into the Dockstore CLI config file located at ~/.dockstore/config and append ga4gh/wes/v1 to the end of the URL. Your Dockstore CLI config file should also include a named AWS profile so that the CLI can authorize requests to AWS. The resulting config file will look similar to:

    authorization: aws-wes-profile
    type: aws

  6. To verify that the Dockstore CLI is communicating with the AGC infrastructure, list the WES server info. A JSON response will be printed to your terminal with the server's configuration.

    dockstore workflow wes service-info
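Step 5 amounts to joining the WESENDPOINT value with the fixed ga4gh/wes/v1 suffix. The following is a minimal Python sketch of that join; it is not part of the Dockstore CLI, and the function name `wes_url` and the example endpoint are hypothetical, chosen only for illustration.

```python
def wes_url(wes_endpoint: str) -> str:
    """Append the WES API path to an AGC WESENDPOINT value."""
    suffix = "ga4gh/wes/v1"
    # Drop surrounding whitespace and any trailing slash before joining.
    base = wes_endpoint.strip().rstrip("/")
    # Avoid appending the suffix twice if it is already present.
    if base.endswith(suffix):
        return base
    return f"{base}/{suffix}"


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the value printed by
    # `agc context describe ctx2`.
    print(wes_url("https://example.invalid/prod/"))
```

Stripping the trailing slash first keeps the result free of a doubled `//` regardless of how the endpoint was copied.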


At this point, the AGC infrastructure is deployed and the Dockstore CLI has been configured.
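The JSON printed by the service-info command in step 6 can also be checked programmatically. The sketch below assumes the response follows the GA4GH WES 1.0 ServiceInfo shape, in which `workflow_type_versions` maps a workflow language to its supported versions. The sample response is made up for illustration and is not actual AGC output, and `supports_workflow` is a hypothetical helper name.

```python
import json


def supports_workflow(service_info: dict, wf_type: str, version: str) -> bool:
    """Return True if a WES service-info document lists the given type and version."""
    type_versions = service_info.get("workflow_type_versions", {})
    entry = type_versions.get(wf_type, {})
    return version in entry.get("workflow_type_version", [])


if __name__ == "__main__":
    # Made-up response for illustration; a real one comes from
    # `dockstore workflow wes service-info`.
    sample = json.loads(
        '{"workflow_type_versions": {"CWL": {"workflow_type_version": ["v1.0"]}}}'
    )
    print(supports_workflow(sample, "CWL", "v1.0"))
```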

The AGC context and Dockstore configuration file do not need to be modified for the remainder of these examples, and will continue to function until the resources are modified and/or destroyed.

Words Workflow

The Dockstore entry associated with this workflow can be found here words.


cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
inputs:
  words: File
  vowels: string[]
outputs:
  summaryFile:
    type: File
    outputSource: sumWords/summaryFile

steps:
  countWordsWithLetter:
    scatter: vowel
    in:
      words: words
      vowel: vowels
    out: [countFile]
    run:
      class: CommandLineTool
      baseCommand: grep
      inputs:
        words: File
        vowel: string
      arguments:
        - $(inputs.vowel)
        - $(inputs.words.path)
        - --count
      outputs:
        countFile:
          type: stdout
      stdout: count.txt
  sumWords:
    in:
      countFiles: [countWordsWithLetter/countFile]
    out: [summaryFile]
    run:
      class: CommandLineTool
      baseCommand: ["awk", "{ sum += $1 } END { print sum }"]
      inputs:
        countFiles:
          type: File[]
          inputBinding:
            position: 1
      outputs:
        summaryFile:
          type: stdout
      stdout: summary.txt
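
To see what the workflow computes before launching it on AGC, its logic can be mirrored locally. This is a rough Python sketch, not how Toil executes the CWL: `grep --count` is approximated by counting lines that contain the vowel as a plain substring (regular-expression semantics are ignored), and the function names are hypothetical.

```python
from typing import Sequence


def count_lines_containing(text: str, letter: str) -> int:
    """Approximate `grep --count letter`: number of lines containing the letter."""
    return sum(1 for line in text.splitlines() if letter in line)


def summarize(text: str, vowels: Sequence[str]) -> int:
    """Mirror the workflow: one count per vowel (the scatter), then sum them."""
    counts = [count_lines_containing(text, v) for v in vowels]
    return sum(counts)


if __name__ == "__main__":
    # Tiny made-up word list, one word per line.
    sample = "apple\nsky\nqueue\n"
    print(summarize(sample, ["a", "e", "i", "o", "u"]))
```

A line containing several vowels is counted once per vowel, so the total is not the number of distinct lines.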

  1. This workflow takes a file and an array of strings as inputs. Create a file named input.json in your working directory with the contents:

    {
      "words": {
        "class": "File",
        "path": ""
      },
      "vowels": ["a","e","i","o","u"]
    }

  2. Since this workflow is publicly posted on Dockstore, we can quickly launch it by passing the Dockstore CLI the entry and input files.

    dockstore workflow wes launch --entry --json input.json

  3. The above command will return a unique run ID, similar to:


    Copy the run ID and run the following to get the workflow run logs:

    dockstore workflow wes logs --id run-00000000000000000000000000000000

    The logs returned will look similar to:

    {
      "run_id" : "run-00000000000000000000000000000000",
      "request" : {
        "workflow_params" : {
          "words" : {
            "class" : "File",
            "path" : ""
          },
          "vowels" : [ "a", "e", "i", "o", "u" ]
        },
        "workflow_type" : "CWL",
        "workflow_type_version" : "v1.0",
        "tags" : {
          "Client" : "Dockstore"
        },
        "workflow_engine_parameters" : { },
        "workflow_url" : ""
      },
      "state" : "COMPLETE",
      "run_log" : {
        "name" : null,
        "cmd" : [ "<CENSORED>" ],
        "start_time" : "2023-04-20T21:15:35.906100",
        "end_time" : "2023-04-20T21:19:37.501446",
        "stdout" : "../../../../toil/wes/v1/logs/run-00000000000000000000000000000000/stdout",
        "stderr" : "../../../../toil/wes/v1/logs/run-00000000000000000000000000000000/stderr",
        "exit_code" : 0
      },
      "task_logs" : [ ],
      "outputs" : {
        "summaryFile" : {
          "location" : "s3://<CENSORED FILE LOCATION>",
          "basename" : "summary.txt",
          "nameroot" : "summary",
          "nameext" : ".txt",
          "class" : "File",
          "checksum" : "sha1$ce1e58dd77758f13b49d2ef4c33a651e353fe074",
          "size" : 7
        }
      }
    }

  4. The output of this workflow is a text file containing a number. To retrieve the file's contents, you can navigate to the S3 URL via the AWS console, or copy the file contents using the AWS CLI:

    aws s3 cp s3://<CENSORED FILE LOCATION> -

  5. When you are finished running workflows on your AGC context, destroy it by running the command below in the same directory as agc-project.yaml. This will take approximately 20 minutes.

    agc context destroy ctx2
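
Steps 3 and 4 can be combined in a script: given the JSON printed by the logs command, read the run state and the S3 location of the output file. The following is a minimal Python sketch, assuming the response carries the `state` and `outputs.summaryFile.location` fields shown in the example logs. The helper name `summarize_run` and the sample values, including the bucket path, are hypothetical.

```python
import json
from typing import Optional, Tuple


def summarize_run(log_json: str) -> Tuple[str, Optional[str]]:
    """Return (state, output location) from a WES run log document."""
    log = json.loads(log_json)
    state = log.get("state", "UNKNOWN")
    # The output entry may be absent until the run has completed.
    summary = (log.get("outputs") or {}).get("summaryFile") or {}
    return state, summary.get("location")


if __name__ == "__main__":
    # Hypothetical, trimmed-down response; the bucket path is a placeholder.
    sample = json.dumps({
        "run_id": "run-00000000000000000000000000000000",
        "state": "COMPLETE",
        "outputs": {"summaryFile": {"location": "s3://example-bucket/summary.txt"}},
    })
    state, location = summarize_run(sample)
    print(state, location)
```

Returning `None` for the location while the run is still in progress lets a caller poll the logs and fetch the file only once a location is present.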