Workflow Runner API

Soumya Brahma edited this page Feb 5, 2016 · 57 revisions

Status
The first implementation of this API interfaces with the Taverna Server and is under development. A deployment of the latest snapshot is available at http://sandbox.wf4ever-project.org/runner/default/ which accesses http://sandbox.wf4ever-project.org/taverna-server/

API function overview

Research Objects, for the purpose of Wf4Ever, will generally contain workflows. In order to assess if a workflow is functional, it is generally useful to be able to (re)-execute a workflow.

Different workflow systems have different ways of running a workflow. For instance, Taverna has the Taverna Server, while Wings has a portal and a Pegasus/Condor engine in the backend. This API intends to provide a common lightweight interface within Wf4Ever for features such as "Run this workflow please" and "Show me the data from that workflow run".

At its heart, this API mirrors the RODL API, but the ROs exposed by this service each represent a particular workflow run, structured to show inputs, outputs, console logs, provenance and annotations containing wfprov and wfdesc mappings. It should therefore be possible to use existing RODL-compatible tools with this service, for instance adding resources with the RO command line tool, browsing with the Portal or transforming to wfdesc using the Workflow Transformer service.

API usage

Accessing the root of the service, in this specification exemplified as http://example.com/runner, SHOULD redirect to a default server runs resource. From here the client may either:

  • POST a new workflow run, providing as a minimum the workflow definition
  • GET a list of existing workflow runs
  • DELETE existing workflow runs
Navigating the workflow runs would allow inspection of workflow status, outputs and other resources exposed by the underlying workflow server.

A client may also create a new run by uploading a workflow definition, providing inputs and initiating the workflow run.

See the Resources and formats below for details.

Link relations

Resources are located using specific properties in the RO manifest for the workflow run.

| Property | Description |
|---|---|
| runner:workflow | Used in the workflow run description to link the workflow run with the main workflow to run, such as uploaded on RO creation. It is a subproperty of ore:aggregates. |
| runner:inputs | Used in the workflow run description to link the workflow run with the list of required workflow inputs, if any. It is a subproperty of ore:aggregates. |
| runner:outputs | Used in the workflow run description to link the workflow run with the list of (expected or actual) workflow outputs, if any. It is a subproperty of ore:aggregates. |
| runner:logs | Used in the workflow run description to link the workflow run with the list of logs, such as stdout, if any. It is a subproperty of ore:aggregates. |
| runner:provenance | Used in the workflow run description to link the workflow run with the list of provenance related resources, if any. It is a subproperty of ore:aggregates. |
| runner:workingDirectory | Used in the workflow run description to link the workflow run with the list of working directory and its files, if any. It is a subproperty of ore:aggregates. |

Resources and formats

All formats are based on RDF in text/turtle and application/rdf+xml (by content negotiation) unless noted otherwise.

The resource types are listed below. Specifically, a compliant implementation of the Workflow runner API SHOULD support:

  • Finding default workspace to redirect to the RO workspace of the default server
  • Retrieve runs in workspace to see current runs
  • Submit new run to workspace to create a new run
  • Retrieve run to view a run and its resources
  • Retrieving the workflow status to check the current status
  • Changing the workflow status to initiate running of the workflow
  • Retrieving the outputs when the workflow has status http://purl.org/wf4ever/runner#Finished

| Resource type | Description |
|---|---|
| Workspace | Represents a list of workflow runs, similarly to how an RO service specifies a list of research objects. The only format available is text/uri-list, which returns a list of URIs that SHOULD point to research objects representing workflow runs. |
| Workflow run | A workflow run is represented as a research object and as such it shares the format of the research object as defined in the RO API. The preferred format is RDF; support for ZIP and HTML formats is optional. The RDF format may be subject to content negotiation. |
| Workflow | The workflow as posted by the creator. It may be a workflow description as an RDF file (format subject to content negotiation) or the actual workflow file, such as application/vnd.taverna.t2flow+xml in the case of a Taverna 2 workflow. |
| Workflow status | A one-element list of URIs, in which the URI is one of a set of predefined values indicating the status of the workflow run. The format is text/uri-list. |
| Inputs | Any resource that has been submitted as an input to the workflow run. When submitting an input, it is possible to specify an external reference by using the text/uri-list format. |
| Outputs | Any outputs generated by the workflow run. Special formats can be used to indicate an error in generating a specific output, such as application/vnd.wf4ever.runner.error. |
| Provenance | An ro:Folder aggregating provenance resources. |
| Working Directory | An ro:Folder whose content will be (or was) the current directory (./) when running the workflow. |
| Logs | An ro:Folder aggregating the log files. |
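As an illustration of the workflow status resource described above, the following is a minimal Python sketch of how a client might check whether a run has finished. The function names `read_status` and `is_finished` are hypothetical, and the sketch assumes the status body has already been fetched as text; only the `http://purl.org/wf4ever/runner#Finished` URI comes from this specification.

```python
FINISHED = "http://purl.org/wf4ever/runner#Finished"


def read_status(body: str) -> str:
    """Return the single status URI from a text/uri-list body.

    Blank lines and comment lines (starting with '#') are ignored.
    Raises ValueError unless exactly one URI remains, since the status
    resource is described as a one-element list.
    """
    entries = []
    for line in body.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    if len(entries) != 1:
        raise ValueError(f"expected one status URI, got {len(entries)}")
    return entries[0]


def is_finished(body: str) -> bool:
    """True when the status body names the Finished status."""
    return read_status(body) == FINISHED
```

A client would typically call `is_finished` on the status body before attempting to retrieve the outputs.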

Finding default workspace

HEAD or GET on this entry point SHOULD redirect to a workspace of workflow runs on the default server:

C: HEAD http://example.com/runner HTTP/1.1
C: Accept: text/turtle

S: HTTP/1.1 303 See Other
S: Location: http://example.com/runner/default/
The returned location MUST point to a workspace (see Retrieve runs in workspace below).

The service MAY return 405 Method Not Allowed if it has no default server, in which case it MUST support browsing of explicit servers (see below).
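The two outcomes described above can be sketched from the client side as follows. This is a minimal Python sketch, not part of the specification: the function name `default_workspace` is hypothetical, it assumes the HTTP response has already been obtained without following redirects, and it only handles the 303 and 405 cases shown here.

```python
from typing import Mapping, Optional


def default_workspace(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Interpret the response to HEAD/GET on the service root.

    Returns the workspace URL on a 303 redirect, or None when the service
    answers 405 (no default server). Anything else raises ValueError.
    """
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 303:
        location = lowered.get("location")
        if not location:
            raise ValueError("303 response without a Location header")
        return location
    if status == 405:
        # No default server: the client has to name a server explicitly.
        return None
    raise ValueError(f"unexpected status {status}")
```

A `None` result tells the client to fall back to browsing an explicit workflow server.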

Browsing other workflow servers

The service MAY support browsing workflow servers other than the default, by means of POSTing a text/uri-list specifying the server.

C: POST http://example.com/runner HTTP/1.1
C: Content-Type: text/uri-list
C:
C: http://galaxy.example.net/server/


S: HTTP/1.1 303 See Other
S: Location: http://example.com/runner,galaxy=/server/
The returned location MUST point to a workspace of workflow runs.

The service SHOULD return 400 Bad Request if more than one URI was included, or the URI was malformed.

This specification does not require any particular URI templates for the redirection. It is an implementation detail how the Workflow Runner service relates the request to the actual, underlying workflow execution service.
⚠️ Clients MUST ensure that the submitted URI is encoded according to RFC 3986, for instance http://example.net/fred%20and%20me/ rather than http://example.net/fred and me/.
Servers MAY use the submitted URI as a basis for constructing the returned URL, but MUST then ensure that it is likewise properly escaped.
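The encoding requirement above can be illustrated with a short Python sketch using the standard library's `urllib.parse.quote`. The helper names `encode_uri` and `uri_list_body` are hypothetical. The sketch assumes that any `%` already present in the input starts a valid escape sequence, so it is left untouched rather than re-encoded.

```python
from urllib.parse import quote

# RFC 3986 reserved characters plus "%", so that existing percent-escapes
# are preserved. Assumption: every "%" in the input is part of a valid escape.
_SAFE = ":/?#[]@!$&'()*+,;=%"


def encode_uri(uri: str) -> str:
    """Percent-encode characters that may not appear literally in a URI."""
    return quote(uri, safe=_SAFE)


def uri_list_body(uri: str) -> bytes:
    """Build a single-entry text/uri-list request body."""
    return (encode_uri(uri) + "\r\n").encode("ascii")
```

For instance, `encode_uri("http://example.net/fred and me/")` yields `http://example.net/fred%20and%20me/`, matching the example in the warning above.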

Retrieve runs in workspace



The list of server runs is represented as a RODL workspace, where each RO represents a run.

C: GET http://example.com/runner/default/ HTTP/1.1
C: Accept: text/uri-list
C: Authorization: Bearer h480djs93hd8

S: HTTP/1.1 200 OK
S: Content-Type: text/uri-list
S: 
S: http://example.com/runner/default/1/
S: http://example.com/runner/default/2/
S: http://example.com/runner/default/4/
Each URI returned, if any, SHOULD point to a research object representing a workflow run.
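A client reading such a response only needs to split the text/uri-list body into its entries. The following Python sketch shows one way to do this; the function name `parse_uri_list` is hypothetical, and the body is assumed to have been decoded to text already.

```python
from typing import List


def parse_uri_list(body: str) -> List[str]:
    """Return the URIs in a text/uri-list body, in order.

    Blank lines and comment lines (starting with '#') are skipped.
    """
    uris = []
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris
```

An empty workspace simply produces an empty list.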

Submit new run to workspace

Creating a new run is similar to creating a new research object, but requires a request body of content type text/uri-list containing the URL of the workflow definition to run.

C: POST http://example.com/runner/default/ HTTP/1.1
C: Content-Type: text/uri-list
C: Content-Length: ...
C: Slug: 1337
C: Authorization: Bearer h480djs93hd8
C:
C: http://example.net/workflow.t2flow

S: HTTP/1.1 201 Created
S: Location: http://example.com/runner/default/1337/
The returned location refers to a research object representing the run.

The client MAY provide the Slug: header to suggest a name to include in the created run, which the service MAY support. The service SHOULD ensure the returned run URI is unique, even if multiple POSTs submit the same workflow URL.

The service SHOULD attempt to retrieve the provided workflow definition before responding to the request.

The service SHOULD NOT start running the workflow immediately, but wait for the client to modify its status (see below).

The service SHOULD fail with 502 Bad Gateway if it is unable to retrieve the submitted workflow definition due to network issues or HTTP errors (including 404), or 504 Gateway Timeout if the request for the definition timed out. The service SHOULD include an error message in the response body to indicate the nature of this failure.

The service SHOULD fail with 501 Not Implemented if the service did successfully retrieve the workflow definition, but the underlying workflow server does not support its format. The server MAY include an error message in the response body to indicate supported workflow definition formats and/or media types.
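The submission request and the failure codes above could be handled by a client roughly as follows. This Python sketch uses `urllib.request.Request` from the standard library to build (but not send) the POST. The names `new_run_request`, `SUBMIT_FAILURES` and `describe_submit_status` are hypothetical, the workflow URL is assumed to be already encoded, and the messages are paraphrases of this section rather than text a service is required to return.

```python
from typing import Optional
from urllib.request import Request


def new_run_request(workspace: str, workflow_url: str,
                    slug: Optional[str] = None,
                    token: Optional[str] = None) -> Request:
    """Build the POST that asks a workspace to create a new run."""
    headers = {"Content-Type": "text/uri-list"}
    if slug is not None:
        headers["Slug"] = slug  # only a suggestion; the service MAY ignore it
    if token is not None:
        headers["Authorization"] = "Bearer " + token
    body = (workflow_url + "\r\n").encode("ascii")
    return Request(workspace, data=body, headers=headers, method="POST")


# Failure statuses named in this section, with paraphrased meanings.
SUBMIT_FAILURES = {
    501: "workflow retrieved, but its format is not supported",
    502: "service could not retrieve the workflow definition",
    504: "request for the workflow definition timed out",
}


def describe_submit_status(status: int) -> str:
    """Summarise the outcome of a submission from its HTTP status."""
    if status == 201:
        return "run created; see the Location header"
    return SUBMIT_FAILURES.get(status, f"unexpected status {status}")
```

The returned `Request` can be passed to `urllib.request.urlopen`, which fills in Content-Length when sending.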

Retrieve run

A workflow run is represented as a research object, thus retrieving it will redirect to a manifest listing its constituent resources.

C: GET http://example.com/runner/default/1337/ HTTP/1.1
C: Accept: text/turtle

S: HTTP/1.1 303 See Other
S: Location: http://example.com/runner/default/1337/manifest

C: GET http://example.com/runner/default/1337/manifest HTTP/1.1
C: Accept: text/turtle

S: HTTP/1.1 200 OK
S: Content-Type: text/turtle
S:
S: @base <http://example.com/runner/default/1337/> .
S: @prefix ro: <http://purl.org/wf4ever/ro#> .
S: @prefix ore: <http://www.openarchives.org/ore/terms/> .
S: # ..
S: <http://example.com/runner/default/1337/> a ro:ResearchObject ;
S:     ore:aggregates <workflow>, <status> .
S: # ...
The manifest MUST include Workflow Runner specific extensions to indicate the corresponding Workflow Runner API specific resources that are supported by the service. These are declared in the namespace http://purl.org/wf4ever/runner# (prefix runner: below) and associated with the research object, which MUST be of the type runner:WorkflowRun.
@prefix runner: <http://purl.org/wf4ever/runner#> .
@prefix wfdesc: <http://purl.org/wf4ever/wfdesc#> .
@prefix wf4ever: <http://purl.org/wf4ever/wf4ever#> .
@base <http://example.com/runner/default/1337/> .
# ..
<> a runner:WorkflowRun, ro:ResearchObject, wf4ever:WorkflowResearchObject ;
    ore:aggregates <workflow>, <status>, <inputs>, <outputs>, <logs> ;
    runner:workflow <workflow> ;
    runner:status <status> ;
    runner:inputs <inputs> ;
    runner:outputs <outputs> ;
    runner:logs <logs> .

<workflow> a runner:Workflow, ro:Resource .
<status> a runner:Status, ro:Resource .
<inputs> a runner:Inputs, ro:Folder ;
    ore:isDescribedBy <inputs/> .
<outputs> a runner:Outputs, ro:Folder ;
    ore:isDescribedBy <outputs/> .
<logs> a runner:Logs, ro:Folder ;
    ore:isDescribedBy <logs/> .

# ... proxies, annotations et al.
Supported properties and types:
| Property | Type | Superclass | Description |
|---|---|---|---|
| | runner:WorkflowRun | wf4ever:WorkflowResearchObject | A research object that represents a particular workflow run |
| runner:workflow | runner:Workflow | wfdesc:Workflow | (Required) The main workflow to run, such as uploaded on RO creation |
| runner:status | runner:Status | ro:Resource | (Required) The status of the workflow, such as 'Running' or 'Finished' |
| runner:inputs | runner:Inputs | ro:Folder | List of required workflow inputs, if any |
| runner:outputs | runner:Outputs | ro:Folder | List of (expected or actual) workflow outputs, if any |
| runner:logs | runner:Logs | ro:Folder | List of logs, such as stdout, if any |
| runner:provenance | runner:Provenance | ro:Folder | List of provenance related resources, if any |
| runner:workingDirectory | runner:WorkingDirectory | ro:Folder | List of working directory and its files, if any |
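
The relative references in the example manifest (such as `<workflow>` and `<status>`) resolve against its `@base`. The Python sketch below shows that resolution with the standard library's `urljoin`. The names `EXAMPLE_REFERENCES` and `resolve_references` are hypothetical, and the mapping is copied from the example manifest only; a real client should read the `runner:` properties from the manifest it retrieves instead of assuming these names.

```python
from urllib.parse import urljoin

# Relative references copied from the example manifest; an assumption for
# illustration, since a service may choose different resource names.
EXAMPLE_REFERENCES = {
    "runner:workflow": "workflow",
    "runner:status": "status",
    "runner:inputs": "inputs",
    "runner:outputs": "outputs",
    "runner:logs": "logs",
}


def resolve_references(base: str, references: dict) -> dict:
    """Resolve each relative reference against the manifest's @base."""
    return {prop: urljoin(base, ref) for prop, ref in references.items()}
```

With the base `http://example.com/runner/default/1337/`, the `runner:status` entry resolves to `http://example.com/runner/default/1337/status`.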
