I'm setting out to use the SevenBridges WES client to run samtools stats on a CRAM file. The instructions at https://docs.cancergenomicscloud.org/docs/run-a-workflow are the starting point for how to do this.


In [1]:
from fasp.workflow import sbWESClient
cl = sbWESClient('cgc','forei/gecco','~/.keys/sbcgc_key.json')

The client created above is used below to get the details of a task that was run from the SevenBridges CGC user interface.

The getTaskStatus function below is simply a wrapper around https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs/{run_id} that handles authentication, sending the request and retrieving the response. The response gives some clues about how to fill out a request to submit the same task via WES instead of the UI.
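To make the wrapper less of a black box, here is a minimal sketch of the kind of GET request it presumably builds. The function name `build_status_request` is hypothetical, and the `X-SBG-Auth-Token` header is an assumption based on the Seven Bridges platform API; I have not checked what sbWESClient actually sends. The request is only constructed here, not sent.

```python
import urllib.request

WES_BASE = "https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1"


def build_status_request(run_id, token):
    """Build (but do not send) the GET request for a run's status."""
    url = f"{WES_BASE}/runs/{run_id}"
    # Assumption: the token travels in the Seven Bridges auth header.
    headers = {"X-SBG-Auth-Token": token}
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_status_request("0a528553-1292-493c-8db6-db1c3ce7831b", "<your-token>")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req) and parse the JSON reply.
```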

It's worth noting that, although DRS was not used at all to create the task within the UI, the file paths in the WES response use DRS notation.

In [2]:
cl.getTaskStatus('0a528553-1292-493c-8db6-db1c3ce7831b', verbose=True)

Get request sent to: https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs/0a528553-1292-493c-8db6-db1c3ce7831b
{
  "request": {
    "tags": {},
    "workflow_params": {
      "name": "SAMtools Stats 1.8 run - 01-09-21 17:44:31",
      "project": "forei/gecco",
      "inputs": {
        "total_memory_GB": null,
        "coverage_limit": null,
        "include_only_read_group": null,
        "remove_duplicates": null,
        "max_insert_size": null,
        "reference_file": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/5bad6c83e4b0abc138917143",
          "name": "references-hs37d5-hs37d5.fasta",
          "class": "File"
        },
        "alignment_input_file": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/5ba9223ee4b0abc138883360",
          "name": "117438.recal.cram",
          "class": "File"
        }
      }
    },
    "workflow_type": "CWL",
    "workflow_engine_params": {}
  },
  "state": "COMPLETE",
  "outputs": {
    "statistics": {
      "path": "drs

'COMPLETE'

Looking at that response gives some clues about how to edit the example provided in the documentation.

How the task looks in the UI is also helpful.
![alt text](SAMToolsTask.png "samtools task as shown in SevenBridges CGC UI")


Filling out the body for a WES request to run the same thing, the project information is easy to work out. The inputs also seem pretty straightforward. Even though it's not present in the status above, it's also pretty obvious that workflow_url should be the URI for the samtools stats app in my gecco project. The only tricky one was workflow_type_version. The log for the task run via the UI gives a clue for that: job.json contains "cwlVersion" : "sbg:draft-2".
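The reasoning above can be sketched as a small helper that turns the "request" part of a status response into a submission body. The name `body_from_status` is hypothetical, and it assumes the status is available as a parsed dict shaped like the output printed earlier (trimmed here to the relevant fields). It drops the auto-generated task name and the inputs that were left null, and adds the two fields the status does not carry.

```python
import copy


def body_from_status(status, workflow_url, workflow_type_version):
    """Build a WES submission body from the 'request' part of a run status."""
    request = status["request"]
    params = copy.deepcopy(request["workflow_params"])
    # Drop the auto-generated task name and any inputs left unset (null).
    params.pop("name", None)
    params["inputs"] = {k: v for k, v in params["inputs"].items() if v is not None}
    return {
        "workflow_params": params,
        "workflow_type": request["workflow_type"],
        "workflow_type_version": workflow_type_version,
        "workflow_url": workflow_url,
    }


# Trimmed copy of the status shown earlier.
example_status = {
    "request": {
        "workflow_params": {
            "name": "SAMtools Stats 1.8 run - 01-09-21 17:44:31",
            "project": "forei/gecco",
            "inputs": {
                "total_memory_GB": None,
                "reference_file": {
                    "path": "drs://cgc-ga4gh-api.sbgenomics.com/5bad6c83e4b0abc138917143",
                    "name": "references-hs37d5-hs37d5.fasta",
                    "class": "File",
                },
                "alignment_input_file": {
                    "path": "drs://cgc-ga4gh-api.sbgenomics.com/5ba9223ee4b0abc138883360",
                    "name": "117438.recal.cram",
                    "class": "File",
                },
            },
        },
        "workflow_type": "CWL",
    }
}

example_body = body_from_status(
    example_status,
    workflow_url="sbg://forei/gecco/samtools-stats-1-8/10",
    workflow_type_version="sbg:draft-2",
)
print(sorted(example_body["workflow_params"]["inputs"]))
```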

With all that we come up with the following body for the request.

In [4]:
body = {
    "workflow_params": {
        "project": "forei/gecco",
        "inputs": {
            "alignment_input_file": {
                "path": "drs://cgc-ga4gh-api.sbgenomics.com/5ba9223ee4b0abc138883360",
                "name": "117438.recal.cram",
                "class": "File"
            },
            "reference_file": {
                "path": "drs://cgc-ga4gh-api.sbgenomics.com/5bad6c83e4b0abc138917143",
                "name": "references-hs37d5-hs37d5.fasta",
                "class": "File"
            }
        }
    },
    "workflow_type": "CWL",
    "workflow_type_version": "sbg:draft-2",
    "workflow_url": "sbg://forei/gecco/samtools-stats-1-8/10"
}

In [5]:
response = cl.runWorkflow(body,verbose=True)
print(response)

sending to https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs
{"status":"UNKNOWN","message":"HTTP 415 Unsupported Media Type"}
<Response [415]>
WES run submission failed. Response status:415


SystemExit: 1

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


For confirmation, the same 415 response is obtained if I submit the same details via Postman. In both cases the content-type is application/json.
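One thing that may be worth trying: the GA4GH WES specification defines POST /runs as multipart/form-data, with workflow_params passed as a JSON-encoded string, not as a single JSON document. Whether that is what the SevenBridges endpoint expects is an assumption I have not verified, so treat this as a sketch. The helper name `to_wes_form_fields` is hypothetical; it only flattens a body into string form fields and sends nothing.

```python
import json


def to_wes_form_fields(body):
    """Flatten a JSON-style WES body into string-valued form fields."""
    fields = {}
    for key, value in body.items():
        # Nested objects such as workflow_params travel as JSON strings.
        fields[key] = value if isinstance(value, str) else json.dumps(value)
    return fields


# Small stand-in for the request body, using values from this notebook.
sample_body = {
    "workflow_params": {"project": "forei/gecco", "inputs": {}},
    "workflow_type": "CWL",
    "workflow_type_version": "sbg:draft-2",
    "workflow_url": "sbg://forei/gecco/samtools-stats-1-8/10",
}

fields = to_wes_form_fields(sample_body)
print(fields["workflow_params"])
# If the requests library is used for sending, passing
# files={k: (None, v) for k, v in fields.items()} to requests.post
# makes it encode the fields as multipart/form-data.
```

If the 415 persists with a multipart body as well, the content type is probably not the only thing the endpoint objects to.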