## Running a Workflow on a Seven Bridges WES server
I'm setting out to use the Seven Bridges WES client to run samtools stats on a CRAM file. The instructions at https://docs.cancergenomicscloud.org/docs/run-a-workflow are the starting point for how to do this.


In [7]:
#from fasp.workflow import sbWESClient
from fasp.workflow import sbcgcWESClient

cl = sbcgcWESClient('forei/CNest', debug=True)

The above instantiates a client for the Seven Bridges Cancer Genomics Cloud (CGC).

### Checking a previous run
For reference, we'll first use the client to get the details of a task that was run from the CGC user interface.
The getTaskStatus function below is simply a wrapper around https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs/{run_id} that handles authentication, sending the request, and retrieving the response. The response gives some clues about how to fill out a request to submit the same task via WES instead of the UI.

It's worth noting that although DRS was not used at all to create the task within the UI, the file paths in the WES response are expressed as DRS URIs.
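To make the wrapper description concrete, here is a minimal sketch of the underlying GET request using only the standard library. The function names are hypothetical, and the `X-SBG-Auth-Token` header is an assumption based on how other Seven Bridges APIs authenticate; the fasp client may do this differently.

```python
import json
import urllib.request

WES_BASE = "https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1"


def build_run_status_request(run_id, token):
    """Build (but do not send) the GET request for a single WES run."""
    url = f"{WES_BASE}/runs/{run_id}"
    # Assumption: the developer token goes in the X-SBG-Auth-Token header.
    headers = {"X-SBG-Auth-Token": token, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_run(run_id, token):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(build_run_status_request(run_id, token)) as resp:
        return json.load(resp)


req = build_run_status_request("266de7e2-613f-4545-9dec-67fe89fc43b8", "<your-token>")
print(req.get_method(), req.full_url)
```

Calling `fetch_run(...)["state"]` would then give the same kind of state string that getTaskStatus returns, provided the authentication assumption holds.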

In [3]:
cl.getTaskStatus('266de7e2-613f-4545-9dec-67fe89fc43b8', verbose=True)

Get request sent to: https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs/266de7e2-613f-4545-9dec-67fe89fc43b8
{
  "request": {
    "tags": {},
    "workflow_params": {
      "name": "cnest-step1 run - 05-12-22 15:11:14",
      "project": "forei/cnest",
      "inputs": {
        "bed": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/626bfb1bf26c93517368984e",
          "basename": "hg38.1kb.baits.bed",
          "nameext": ".bed",
          "class": "File",
          "nameroot": "hg38.1kb.baits"
        },
        "project": "test_proj10"
      }
    },
    "workflow_type": "CWL",
    "workflow_engine_params": {}
  },
  "state": "COMPLETE",
  "outputs": {
    "output": {
      "path": "drs://cgc-ga4gh-api.sbgenomics.com/627d244df26c935173c4201c",
      "basename": "test_proj10",
      "class": "File",
      "nameroot": "test_proj10"
    }
  },
  "run_id": "266de7e2-613f-4545-9dec-67fe89fc43b8",
  "run_log": {
    "name": "cnest-step1 run - 05-12-22 15:11:14",
    "cmd": null,

'COMPLETE'

### CNest Step 1 via WES
Reverse engineering what we can see above, we can run CNest via WES as follows:

In [5]:
params = {
    "project": "forei/cnest",
    "inputs": {
        "bed": {
            "path": "drs://cgc-ga4gh-api.sbgenomics.com/626bfb1bf26c93517368984e",
            "name": "hg38.1kb.baits.bed",
            "class": "File"
        },
        "project": "test_proj"
    }
}
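The reverse-engineering step can also be sketched as a small helper that trims the `workflow_params` of a previous run down to the fields used in the hand-written body: `path`, `name` and `class` for each file, plus any `secondaryFiles`. The helper names are hypothetical, and mapping `basename` to `name` is an assumption drawn only from comparing the response with the hand-written body in this notebook.

```python
def to_submit_value(value):
    """Reduce a File object from a WES response to path/name/class."""
    if isinstance(value, dict) and value.get("class") == "File":
        slim = {
            "path": value["path"],
            # Assumption: the response's "basename" can be resubmitted as "name".
            "name": value.get("name") or value.get("basename"),
            "class": "File",
        }
        if "secondaryFiles" in value:
            slim["secondaryFiles"] = [
                to_submit_value(f) for f in value["secondaryFiles"]
            ]
        return slim
    # Plain values (strings, numbers) are passed through unchanged.
    return value


def to_submit_params(workflow_params):
    """Keep the project and inputs; drop the run name and derived file fields."""
    return {
        "project": workflow_params["project"],
        "inputs": {
            key: to_submit_value(val)
            for key, val in workflow_params["inputs"].items()
        },
    }


previous = {
    "name": "cnest-step1 run - 05-12-22 15:11:14",
    "project": "forei/cnest",
    "inputs": {
        "bed": {
            "path": "drs://cgc-ga4gh-api.sbgenomics.com/626bfb1bf26c93517368984e",
            "basename": "hg38.1kb.baits.bed",
            "nameext": ".bed",
            "class": "File",
            "nameroot": "hg38.1kb.baits",
        },
        "project": "test_proj10",
    },
}
derived = to_submit_params(previous)
print(derived)
```

The output project name (`test_proj10` here) would still need to be edited by hand if the new run should write somewhere else.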


The body is now in a form that can be passed to a client function as follows.

In [6]:
import json
run_id = cl.runGenericWorkflow(
    workflow_url='sbg://forei/cnest/cnest-step1',
    workflow_params=json.dumps(params),
    workflow_type="CWL",
    workflow_type_version="v1.1",
    verbose=False
)
run_id

'1c344a90-0e97-4309-baeb-1b367a4098af'

In [11]:
cl.getTaskStatus(run_id)

'QUEUED'
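Since a freshly submitted run starts out as QUEUED, it can be convenient to poll until it reaches a terminal state. The sketch below assumes the status function returns the state as a plain string, as getTaskStatus does in the cells above. `wait_for_run` is a hypothetical helper, and the terminal states are taken from the GA4GH WES state list.

```python
import time

# Terminal states as defined by the GA4GH WES specification.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wait_for_run(get_status, run_id, poll_seconds=30, timeout_seconds=3600,
                 sleep=time.sleep):
    """Poll get_status(run_id) until a terminal state or the timeout is reached."""
    waited = 0
    while True:
        state = get_status(run_id)
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"run {run_id} still {state} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Offline demonstration with a fake status sequence instead of a real server.
fake_states = iter(["QUEUED", "RUNNING", "COMPLETE"])
final_state = wait_for_run(lambda _run_id: next(fake_states), "demo-run",
                           poll_seconds=0, sleep=lambda seconds: None)
print(final_state)
```

Against the real service this would be called as `wait_for_run(cl.getTaskStatus, run_id)`.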

### Running Step 2
Get the details of the manual run of step 2.

In [8]:
cl.getTaskStatus('84904f51-04a0-426d-850d-9fb0f1b0b331', verbose=True)

Get request sent to: https://cgc-ga4gh-api.sbgenomics.com/ga4gh/wes/v1/runs/84904f51-04a0-426d-850d-9fb0f1b0b331
{
  "request": {
    "tags": {},
    "workflow_params": {
      "name": "CNest step2 run - 05-08-22 21:38:11",
      "project": "forei/cnest",
      "inputs": {
        "index_txt": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ac",
          "name": "index.txt",
          "class": "File"
        },
        "index_bed": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ae",
          "name": "index.bed",
          "class": "File"
        },
        "project": "test_proj",
        "index_tab": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92af",
          "name": "index_tab.txt",
          "class": "File"
        },
        "sample": "test_bam2",
        "bam": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/6272e873d125a52cff9b0247",
          "name": "TCGA-3X-AAVA-01A-11R-A41D-

'COMPLETE'

### Run CNest Step 2 via WES

Set up the parameters as above.

In [9]:
params = {
    "project": "forei/cnest",
    "inputs": {
        "index_txt": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ac",
          "name": "index.txt",
          "class": "File"
        },
        "index_bed": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ae",
          "name": "index.bed",
          "class": "File"
        },
        "project": "test_proj",
        "index_tab": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92af",
          "name": "index_tab.txt",
          "class": "File"
        },
        "sample": "test_bam",
        "bam": {
          "path": "drs://cgc-ga4gh-api.sbgenomics.com/6272e873d125a52cff9b0247",
          "name": "TCGA-3X-AAVA-01A-11R-A41D-13_mirna_gdc_realn.bam",
          "secondaryFiles": [
            {
              "path": "drs://cgc-ga4gh-api.sbgenomics.com/6272ec5df26c93517378730b",
              "name": "TCGA-3X-AAVA-01A-11R-A41D-13_mirna_gdc_realn.bam.bai",
              "class": "File"
            }
          ],
          "class": "File"
        }
      }
    }



In [10]:
#import json
run_id = cl.runGenericWorkflow(
    workflow_url='sbg://forei/cnest/cnest-step2/14',
    workflow_params=json.dumps(params),
    workflow_type="CWL",
    workflow_type_version="sbg:draft-2",
    verbose=False
)
run_id

'1bb836cb-7905-476d-8dbf-278a8fbf6394'

Can we pass a BioData Catalyst (BDC) file directly to the workflow via DRS?

In [13]:
params['inputs']['bam'] = {
          "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1",
          "name": "HG00445.final.cram",
          "secondaryFiles": [
            {
              "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671cf",
              "name": "HG00445.final.cram.crai",
              "class": "File"
            }
          ],
          "class": "File"
        }

In [14]:
params

{'project': 'forei/cnest',
 'inputs': {'index_txt': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ac',
   'name': 'index.txt',
   'class': 'File'},
  'index_bed': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ae',
   'name': 'index.bed',
   'class': 'File'},
  'project': 'test_proj',
  'index_tab': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92af',
   'name': 'index_tab.txt',
   'class': 'File'},
  'sample': 'test_bam',
  'bam': {'path': 'drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1',
   'name': 'HG00445.final.cram',
   'secondaryFiles': [{'path': 'drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671cf',
     'name': 'HG00445.final.cram.crai',
     'class': 'File'}],
   'class': 'File'}}}

In [15]:
run_id = cl.runGenericWorkflow(
    workflow_url='sbg://forei/cnest/cnest-step2/14',
    workflow_params=json.dumps(params),
    workflow_type="CWL",
    workflow_type_version="sbg:draft-2",
    verbose=False
)
run_id

Full response status:
<Response [400]>
Full response content:
b'{"msg":"Following file references can not be resolved: drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1","status_code":400}'
Full response headers:
{'Server': 'nginx', 'Date': 'Wed, 18 May 2022 12:54:11 GMT', 'Content-Type': 'application/json', 'Content-Length': '148', 'Connection': 'keep-alive', 'X-Frame-Options': 'DENY', 'X-Xss-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'X-Download-Options': 'noopen', 'Content-Security-Policy': "frame-ancestors 'none'; report-uri https://sbgenomics.report-uri.com/r/d/csp/enforce", 'Strict-Transport-Security': 'max-age=63072000'}


RuntimeError: WES run submission failed. Response status:400

So we cannot pass a BDC DRS id to a WES task run on CGC.

I validated that CGC is capable of "importing" the file using the same DRS id as above. The import validates that I have access to the file (though note this is a public file). It's just passing it via WES that doesn't work.
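Given the 400 response above, a pre-submission check can flag inputs whose DRS host differs from the WES server's own DRS host. This is only a sketch based on the single failure observed in this notebook, not on documented server behaviour, and the helper names are hypothetical.

```python
from urllib.parse import urlparse


def iter_file_paths(value):
    """Yield the path of a File input and of each of its secondaryFiles."""
    if isinstance(value, dict):
        if value.get("class") == "File" and "path" in value:
            yield value["path"]
        for secondary in value.get("secondaryFiles", []):
            yield from iter_file_paths(secondary)


def foreign_drs_paths(params, expected_host):
    """Return DRS paths in params['inputs'] that are not served by expected_host."""
    foreign = []
    for value in params["inputs"].values():
        for path in iter_file_paths(value):
            parsed = urlparse(path)
            if parsed.scheme == "drs" and parsed.netloc != expected_host:
                foreign.append(path)
    return foreign


example_params = {
    "project": "forei/cnest",
    "inputs": {
        "index_txt": {
            "path": "drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ac",
            "name": "index.txt",
            "class": "File",
        },
        "sample": "test_bam",
        "bam": {
            "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1",
            "name": "HG00445.final.cram",
            "secondaryFiles": [
                {
                    "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671cf",
                    "name": "HG00445.final.cram.crai",
                    "class": "File",
                }
            ],
            "class": "File",
        },
    },
}
flagged = foreign_drs_paths(example_params, "cgc-ga4gh-api.sbgenomics.com")
print(flagged)
```

Any path this flags would, on the evidence above, need to be imported into the CGC project first so that it gets a CGC DRS id.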

### Running via a signed URL obtained from DRS

In [None]:
# drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1

Can we run the above with a BAM file from a signed URL obtained via DRS?

We'll try with the Gen3 id of the same file as above.
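As background for obtaining a URL via DRS: under the GA4GH DRS convention, a hostname-based URI of the form `drs://<host>/<id>` maps to `https://<host>/ga4gh/drs/v1/objects/<id>`. The sketch below performs only that mapping and makes no network call. `drs_uri_to_object_url` is a hypothetical helper, not part of the fasp clients.

```python
from urllib.parse import urlparse


def drs_uri_to_object_url(drs_uri):
    """Map a hostname-based DRS URI to its HTTPS object endpoint."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError(f"DRS URI has no object id: {drs_uri!r}")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


object_url = drs_uri_to_object_url(
    "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1"
)
print(object_url)
```

In the DRS specification, a signed URL is then requested from the object's `/access/{access_id}` endpoint using one of the `access_id` values listed in the object's `access_methods`. Whether the CGC WES endpoint accepts such a URL as an input path is not established here.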



In [20]:
params['inputs']['bam'] = {
          "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671d1",
          "name": "HG00445.final.cram",
          "secondaryFiles": [
            {
              "path": "drs://ga4gh-api.sb.biodatacatalyst.nhlbi.nih.gov/626c079e645ccb7324c671cf",
              "name": "HG00445.final.cram.crai",
              "class": "File"
            }
          ],
          "class": "File"
        }


In [21]:
params

{'project': 'forei/cnest',
 'inputs': {'index_txt': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ac',
   'name': 'index.txt',
   'class': 'File'},
  'index_bed': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92ae',
   'name': 'index.bed',
   'class': 'File'},
  'project': 'test_proj',
  'index_tab': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/627653faf26c9351737f92af',
   'name': 'index_tab.txt',
   'class': 'File'},
  'sample': 'test_bam2',
  'bam': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/6272e873d125a52cff9b0247',
   'name': 'TCGA-3X-AAVA-01A-11R-A41D-13_mirna_gdc_realn.bam',
   'secondaryFiles': [{'path': 'drs://cgc-ga4gh-api.sbgenomics.com/6272ec5df26c93517378730b',
     'name': 'TCGA-3X-AAVA-01A-11R-A41D-13_mirna_gdc_realn.bam.bai',
     'class': 'File'}],
   'class': 'File'}},
 'bam': {'path': 'drs://cgc-ga4gh-api.sbgenomics.com/6272e873d125a52cff9b0247',
  'name': 'TCGA-3X-AAVA-01A-11R-A41D-13_mirna_gdc_realn.bam',
  'secondaryFiles

In [2]:
from fasp.loc import bdcDRSClient
drs_client = bdcDRSClient("/Users/forei/.keys/bdc_credentials.json")

In [5]:
drs_id = '626c079e645ccb7324c671d1'
drs_client.getObject(drs_id)

{"msg":"No bundle found","status_code":404}



404