The Application Package
defines the internal script definition and configuration that will be executed by a Process
. This package is based on the Common Workflow Language (CWL
). Using the extensive _ as the backbone for internal execution of the process allows it to run multiple types of applications, whether they are referenced by a Docker
image, scripts (bash, python, etc.), some remote Process
and more.
Note
The large community and use cases covered by CWL
make it extremely versatile. If you encounter any issue running your Application Package
in Weaver (such as file permissions for example), chances are that there exists a workaround somewhere in the _. Most typical problems are usually handled by some flag or argument in the CWL
definition, so this reference should be explored first. Please also refer to the FAQ
section as well as existing _. Ultimately, if no solution can be found, open a new issue about your specific problem.
All processes deployed locally into Weaver using a CWL
package definition will have their full package definition available with _ request.
Note
The package request is a Weaver-specific implementation and is therefore not necessarily available on other ADES
/EMS
implementations, as this feature is not part of the _ specification.
The following CWL
package definition represents the weaver.processes.builtin.jsonarray2netcdf
process.
../../weaver/processes/builtin/jsonarray2netcdf.cwl
The first main component is the class: CommandLineTool
that tells Weaver it will be an atomic process (as opposed to the CWL Workflow presented later).
The other important sections are inputs
and outputs
. These define which parameters will be expected and produced by the described application. Weaver supports most formats and types as specified by _. See Inputs/Outputs Type for more details.
When deploying a CommandLineTool
that only needs to execute script or shell commands, it is recommended to define an appropriate _ to containerize the Process
, even though no advanced operation is needed. The reason is that Weaver otherwise has no way to know for sure how to provide all the dependencies that this operation might need. In order to keep the processing environment and results separate between any Process
and Weaver itself, the executions will either be automatically containerized (with some default image), or blocked entirely when Weaver cannot resolve the appropriate execution environment. Therefore, it is recommended that the Application Package
provider defines a specific image to avoid unexpected failures if this auto-resolution changes across versions.
Below are minimalistic Application Package
samples that make use of a shell command and a custom Python script for quickly running some operations, without actually needing to package any specialized Docker
image.
The first example simply outputs the contents of a file
input using the cat
command. Because the Docker
image debian:stretch-slim
is specified, we can guarantee that the command will be available within its containerized environment. In this case, we also take advantage of the stdout.log
which is always collected by Weaver (along with the stderr
) in order to obtain traces produced by any Application Package
when performing Job
executions.
../examples/docker-shell-script-cat.cwl
The second example takes advantage of the _ to generate a Python script dynamically (i.e.: script.py
), prior to executing it to process the received inputs and produce the output file. Because a Python runner is required, the _ specification defines a basic Docker
image that meets our needs. Note that in this case, special interpretation of $(...)
entries within the definition can be provided to tell CWL
how to map Job
input values to the dynamically created script.
../examples/docker-python-script-report.cwl
When advanced processing capabilities and more complicated environment preparation are required, it is recommended to package and push pre-built Docker
images to a remote registry. In this situation, just like for app_pkg_script
examples, the _ is needed. The definitions would also be essentially the same as in the previous examples, but with more complicated operations and possibly a larger number of inputs or outputs.
Whenever a Docker
image reference is detected, Weaver will ensure that the application will be pulled using CWL
capabilities in order to run it.
Because Application Package
providers could desire to make use of Docker
images hosted on private registries, Weaver offers the capability to specify an authorization token through HTTP request headers during the Process
deployment. More specifically, the following definition can be provided during a Deploy <proc_op_deploy>
request.
POST /processes HTTP/1.1
Host: weaver.example.com
Content-Type: application/json;charset=UTF-8
X-Auth-Docker: Basic <base64_token>
{ "processDescription": { }, "executionUnit": { } }
The X-Auth-Docker
header should be defined exactly like any typical Authorization
headers (_). The name X-Auth-Docker
is inspired from existing implementations that employ X-Auth-Token
in a similar fashion. The reason why Authorization
and X-Auth-Token
headers are not themselves employed in this case is to ensure that they do not interfere with any proxy or server authentication mechanism, which Weaver could be located behind.
For the moment, only Basic
(RFC 7617
) authentication is supported. To generate the base64 token, the following methods can be used:
echo -n "<username>:<password>" | base64
import base64
base64.b64encode(b"<username>:<password>")
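The token and header can also be assembled in one step. The following is a minimal sketch, assuming a hypothetical helper named `make_docker_auth_header` (it is not part of Weaver); it only illustrates that the encoded bytes must be decoded to text before being placed after the `Basic` scheme name.

```python
import base64


def make_docker_auth_header(username: str, password: str) -> dict:
    """Build the X-Auth-Docker header using the Basic scheme (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    # b64encode returns bytes, so decode to obtain a header-safe string
    token = base64.b64encode(raw).decode("ascii")
    return {"X-Auth-Docker": f"Basic {token}"}


# Hypothetical credentials, for illustration only
headers = make_docker_auth_header("user", "pass")
print(headers)
```

The resulting mapping can be merged with the other headers of the deployment request using any HTTP client.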
When the HTTP X-Auth-Docker
header is detected in combination with a _ entry within the Application Package
of the Process
being deployed, Weaver will parse the targeted Docker
registry defined in dockerPull
and will attempt to identify it for later authentication towards it with the provided token. Given a successful authentication, Weaver should then be able to pull the Docker
image whenever required for launching new Job
executions.
Note
Weaver only attempts to authenticate itself temporarily at the moment when the Job
is submitted to retrieve the Docker
image, and only if the image is not already available locally. Because of this, the provided authentication token should have a sufficient lifetime to run the Job
at later times, considering any retention time of cached Docker
images on the server. If the cache is cleaned, and the Docker
image is made unavailable, Weaver will attempt to authenticate itself again when receiving the new Job
. It is left up to the developer and Application Package
provider to manage expired tokens in Weaver according to their needs. To resolve such cases, the _ request or an entire re-deployment of the Process
could be accomplished, whichever is more convenient for them.
New in version 4.5.0: Specification and handling of the X-Auth-Docker
header for providing an authentication token.
Weaver also supports CWL
class: Workflow
. When an Application Package
is defined this way, the _ will attempt to resolve each step
as another process. The reference to the CWL
definition can be placed in any location that is also supported for atomic processes (see details about supported package locations <WPS-REST>
).
The following CWL
definition demonstrates an example Workflow
process that would resolve each step
with local processes of matching IDs.
../../tests/functional/application-packages/WorkflowSubsetIceDays/package.cwl
For instance, the jsonarray2netcdf
(Builtin
) middle step in this example corresponds to the CWL CommandLineTool process presented in the previous section. Other processes referenced in this Workflow
can be found in Weaver Test Resources.
Step process names are resolved using the variations presented below. Particular care must also be given to the input and output definitions between each step.
In order to resolve referenced processes as steps, Weaver supports 3 formats.
1. Process ID explicitly given.
   Any visible process from _ response should be resolved this way
   (e.g.: jsonarray2netcdf resolves to pre-deployed weaver.processes.builtin.jsonarray2netcdf).
2. Full URL to the process description endpoint, provided that it also offers a _ endpoint (Weaver-specific).
3. Full URL to the explicit CWL file (usually corresponding to (2) or the href provided in deployment body).
When a URL to the CWL
process "file" is provided with an extension, it must be one of the supported values defined in weaver.processes.wps_package.PACKAGE_EXTENSIONS
. Otherwise, Weaver will refuse it as it cannot figure out how to parse it.
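To illustrate the three reference forms, here is a rough sketch of how a step reference could be classified. The function `classify_step_reference`, the returned labels, and the extension set `ASSUMED_PACKAGE_EXTENSIONS` are all assumptions made for this example; the authoritative extension list is the one defined by `PACKAGE_EXTENSIONS` in Weaver, and the real resolution logic may differ. The sketch also naively treats any dot in the last URL path segment as a file extension.

```python
from urllib.parse import urlparse

# Assumed extension list, for illustration only. The authoritative values
# are defined in weaver.processes.wps_package.PACKAGE_EXTENSIONS.
ASSUMED_PACKAGE_EXTENSIONS = {"cwl", "json", "yaml", "yml"}


def classify_step_reference(reference: str) -> str:
    """Classify a step reference as one of the three supported forms."""
    parsed = urlparse(reference)
    if parsed.scheme not in ("http", "https"):
        # Not a URL: interpret the value as a local process ID
        return "process-id"
    last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if "." not in last_segment:
        # URL without a file extension: process description endpoint
        return "process-description-url"
    extension = last_segment.rsplit(".", 1)[-1].lower()
    if extension not in ASSUMED_PACKAGE_EXTENSIONS:
        raise ValueError(f"unsupported package extension: .{extension}")
    return "cwl-file-url"


print(classify_step_reference("jsonarray2netcdf"))
print(classify_step_reference("https://weaver.example.com/processes/jsonarray2netcdf"))
print(classify_step_reference("https://weaver.example.com/files/package.cwl"))
```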
Because Weaver and the underlying CWL executor must resolve all steps to validate that their input and output definitions correspond (id, format, type, etc.) before chaining them, all intermediate processes MUST be available. This means that you cannot Deploy <proc_op_deploy>
nor Execute <proc_op_execute>
a Workflow
-flavored Application Package
until all referenced steps have themselves been deployed and made visible.
Warning
Because Weaver needs to convert given CWL
documents into equivalent WPS
process definition, embedded CWL
processes within a Workflow
step are not supported currently. This is a known limitation of the implementation, but not much can be done about it without major modifications to the code base. See also issue #56.
See also
- weaver.processes.wps_package.get_package_workflow_steps
- Deploy <proc_op_deploy> request details.
Inputs and outputs of connected steps are required to match types and formats in order for the workflow to be valid. This means that a process that produces an output of type String
cannot be directly chained to a process that takes as input a File
, even if the String
of the first process represents a URL that could be resolved to a valid file reference. In order to chain two such processes, an intermediate operation would need to be defined to explicitly convert the String
input to the corresponding File
output. This is usually accomplished using Builtin
processes, such as in the previous example.
Since formats must also match (e.g.: a process producing application/json
cannot be mapped to one expecting application/x-netcdf
), all mismatching formats must also be converted with an intermediate step if such an operation is desired. This ensures that workflow definitions are always explicit and leave as little room as possible for interpretation, variation or assumptions between executions. Because of this, all applications generated by Weaver will attempt to preserve and enforce matching input/output format
definition in both CWL
and WPS
as long as it does not introduce ambiguous results (see File Format
for more details).
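The chaining rule can be summarized with a small check. This is only a sketch under simplifying assumptions: the hypothetical `can_chain` function compares the `type` and `format` of a step output against the next step input by strict equality, whereas the actual validation is performed by Weaver and the CWL executor with more nuance.

```python
def can_chain(step_output: dict, step_input: dict) -> bool:
    """Return True only when both the type and the format match exactly."""
    same_type = step_output.get("type") == step_input.get("type")
    same_format = step_output.get("format") == step_input.get("format")
    return same_type and same_format


# A String output cannot feed a File input, even if it contains a URL
print(can_chain({"type": "string"}, {"type": "File"}))

# Matching File type, but mismatching formats
print(can_chain(
    {"type": "File", "format": "iana:application/json"},
    {"type": "File", "format": "edam:format_3650"},
))

# Matching type and format
print(can_chain(
    {"type": "File", "format": "edam:format_3650"},
    {"type": "File", "format": "edam:format_3650"},
))
```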
Because CWL
definition and WPS
process description inherently provide "duplicate" information, many fields can be mapped between one another. In order to handle any provided metadata in the various supported locations by both specifications, as well as to extend details of deployed processes, each Application Package
gets its details merged with the complementary WPS
description.
In some cases, complementary details are only documentation-related, but some information directly affects the format or execution behaviour of some parameters. A common example is the maxOccurs
field provided by WPS
that does not have an exactly corresponding specification in CWL
(any-sized array). On the other hand, CWL
also provides data preparation steps such as initial staging (i.e.: InitialWorkDirRequirement
) that do not have an equivalent under the WPS
process description. For this reason, complementary details are merged and reflected on both sides (as applicable), when non-ambiguous resolution is possible.
In case of conflicting metadata, the CWL
specification will most of the time prevail over the WPS
metadata fields simply because it is expected that a strict CWL specification is provided upon deployment. The only exceptions to this situation are when WPS
specification helps resolve some ambiguity or when WPS
enforces the parametrisation of some elements, such as with maxOccurs
field.
Note
Metadata merge operation between CWL
and WPS
is accomplished on a per-mapped-field basis. In other words, more explicit details such as maxOccurs
could be obtained from WPS
and simultaneously the same input's format
could be obtained from the CWL
side. Merge occurs bidirectionally for corresponding information.
The merging strategy of process specifications also implies that some details can be omitted from one context if they can be inferred from corresponding elements in the other. For example, the CWL
and WPS
context both define keywords
(with minor naming variation) as a list of strings. Specifying this metadata in both locations is redundant and only makes the process description longer. Therefore, the user is allowed to provide only one of the two and Weaver will take care to propagate the information to the lacking location.
To help understand the resolution methodology between the contexts, the following sub-sections cover the supported mapping between the two specifications, and more specifically, how each field impacts the mapped equivalent metadata.
Warning
Merging of corresponding fields between CWL
and WPS
is a Weaver-specific implementation. The same behaviour is not necessarily supported by other implementations. For this reason, any converted information between the two contexts will be transferred to the other context if missing, so that both specifications reflect the same details as closely as possible, whichever context the metadata originated from.
Inputs and outputs (I/O
) id
from the CWL
context will be respectively matched against corresponding id
or identifier
field from I/O of WPS
context. In the CWL
definition, all of the allowed I/O structures are supported, whether they are specified using an array list with explicit definitions, using "shortcut" variant (i.e.: <type>[]
), or using key-value pairs (see _ for more details). Regardless of array or mapping format, CWL
requires that all I/O have unique id
. On the WPS
side, either a mapping or list of I/O are also expected with unique id
.
Changed in version 4.0: Previous versions only supported WPS
I/O using the listing format. Both can be used interchangeably in both CWL
and WPS
contexts as of this version.
To summarize, the following CWL
and WPS
I/O definitions are all equivalent and will result in the same process definition after deployment. For simplicity, the examples below omit all but the mandatory fields (only of the inputs
and outputs
portion of the full deployment body) to produce the same result. Other fields are discussed afterward in specific sections.
{
"inputs": [
{
"id": "single-str",
"type": "string"
},
{
"id": "multi-file",
"type": "File[]"
}
],
"outputs": [
{
"id": "output-1",
"type": "File"
},
{
"id": "output-2",
"type": "File"
}
]
}
{
"inputs": {
"single-str": {
"type": "string"
},
"multi-file": {
"type": "File[]"
}
},
"outputs": {
"output-1": {
"type": "File"
},
"output-2": {
"type": "File"
}
}
}
{
"inputs": [
{
"id": "single-str"
},
{
"id": "multi-file",
"formats": []
}
],
"outputs": [
{
"id": "output-1",
"formats": []
},
{
"id": "output-2",
"formats": []
}
]
}
The WPS
example above requires a formats
field for the corresponding CWL
File
type in order to distinguish it from a plain string. More details are available in Inputs/Outputs Type below about this requirement.
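The equivalence between the array and key-value representations can be sketched as a pair of conversions. The helper names `io_list_to_mapping` and `io_mapping_to_list` are hypothetical and only illustrate the idea; they are not Weaver functions.

```python
def io_list_to_mapping(io_list: list) -> dict:
    """Convert an I/O array with explicit 'id' fields to a key-value mapping."""
    mapping = {}
    for item in io_list:
        if item["id"] in mapping:
            raise ValueError(f"duplicate I/O id: {item['id']}")
        mapping[item["id"]] = {k: v for k, v in item.items() if k != "id"}
    return mapping


def io_mapping_to_list(io_mapping: dict) -> list:
    """Convert a key-value I/O mapping back to an array of definitions."""
    return [{"id": key, **value} for key, value in io_mapping.items()]


inputs_as_list = [
    {"id": "single-str", "type": "string"},
    {"id": "multi-file", "type": "File[]"},
]
inputs_as_mapping = io_list_to_mapping(inputs_as_list)
print(inputs_as_mapping)
print(io_mapping_to_list(inputs_as_mapping) == inputs_as_list)
```

The duplicate check reflects the requirement that every I/O has a unique id in both representations.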
Finally, it is to be noted that above CWL
and WPS
definitions can be specified in the Deploy <proc_op_deploy>
request body with any of the following variations:
- Both are simultaneously fully specified (valid although extremely verbose).
- Both partially specified as long as sufficient complementary information is provided.
- Only CWL I/O is fully provided (with empty or even unspecified inputs or outputs section from WPS).
Warning
Weaver assumes that its main purpose is to eventually execute an Application Package
and will therefore prioritize specification in CWL
over WPS
to infer types. Because of this, any unmatched id
from the WPS
context against provided CWL
id
s of the same I/O section will be dropped, as they ultimately would have no purpose during CWL
execution.
This does not apply in the case of referenced WPS-1/2
processes since no CWL
is available in the first place.
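The effect of this rule can be pictured with the following sketch. The function `drop_unmatched_wps_io` is a hypothetical illustration, assuming both I/O sections are given as arrays; it is not the actual Weaver implementation.

```python
def drop_unmatched_wps_io(cwl_io: list, wps_io: list) -> list:
    """Keep only the WPS I/O whose id (or identifier) matches a CWL I/O id."""
    cwl_ids = {item["id"] for item in cwl_io}
    kept = []
    for item in wps_io:
        # The WPS context may use either 'id' or 'identifier'
        wps_id = item.get("id", item.get("identifier"))
        if wps_id in cwl_ids:
            kept.append(item)
    return kept


cwl_inputs = [{"id": "single-str", "type": "string"}]
wps_inputs = [
    {"id": "single-str"},
    {"identifier": "not-in-cwl", "formats": []},
]
print(drop_unmatched_wps_io(cwl_inputs, wps_inputs))
```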
In the CWL
context, the type
field indicates the type of I/O
. Available types are presented in the _ portion of the specification.
Warning
Weaver has two unsupported CWL
types
, namely Any
and Directory
. This limitation is intentional as WPS
does not offer equivalents. Furthermore, both of these types make the process description too ambiguous. For instance, most processes expect remote file references, and providing a Directory
doesn't indicate an explicit reference to which files to retrieve during stage-in operation of a Job
execution.
In the WPS
context, three data types exist, namely Literal
, BoundingBox
and Complex
data.
As presented in the example of the previous section, I/O
in the WPS
context does not require an explicit indication of the type from one of Literal
, BoundingBox
and Complex
data. Instead, WPS
type is inferred using the matched API schema of the I/O. For instance, Complex
I/O (e.g.: file reference) requires the formats
field to distinguish it from a plain string
. Therefore, specifying either format
in CWL
or formats
in WPS
immediately provides all needed information for Weaver to understand that this I/O is expected to be a file reference. A combination of bbox
and crs
fields would otherwise indicate a BoundingBox
I/O (see note <bbox-note>
). If neither of the two previous schemas is matched, the I/O type resolution falls back to Literal
data of string
type. To employ another primitive data type such as Integer
, an explicit indication needs to be provided as follows.
{
"id": "input",
"literalDataDomains": [
{"dataType": {"name": "integer"}}
]
}
Obviously, the equivalent CWL
definition is simpler in this case (i.e.: only type: int
is required). It is therefore recommended to take advantage of Weaver's merging strategy in this case by providing only the details through the CWL
definition and have the corresponding WPS
I/O type automatically deduced by the generated process. If desired, literalDataDomains
can still be explicitly provided as above to ensure that it gets parsed as the intended type.
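The inference order described in this section can be sketched as follows. The function `infer_wps_data_type` is hypothetical and deliberately simplified: it only checks for the presence of the fields named above, whereas Weaver matches the I/O against its full API schemas.

```python
def infer_wps_data_type(io_def: dict) -> tuple:
    """Return (WPS data category, literal type name or None) for a WPS I/O."""
    if "formats" in io_def:
        # Presence of 'formats' marks a Complex I/O (e.g.: file reference)
        return ("Complex", None)
    if "bbox" in io_def and "crs" in io_def:
        return ("BoundingBox", None)
    # Fall back to Literal data, of 'string' type unless stated otherwise
    domains = io_def.get("literalDataDomains") or [{}]
    name = domains[0].get("dataType", {}).get("name", "string")
    return ("Literal", name)


print(infer_wps_data_type({"id": "multi-file", "formats": []}))
print(infer_wps_data_type({"id": "single-str"}))
print(infer_wps_data_type(
    {"id": "input", "literalDataDomains": [{"dataType": {"name": "integer"}}]}
))
```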
With more recent versions of Weaver, it is also possible to employ OpenAPI
schema definitions provided in the WPS
I/O to specify the explicit structure that applies to Literal
, BoundingBox
and Complex
data types. When OpenAPI
schema are detected, they are also considered in the merging strategy along with other specifications provided in CWL
and WPS
contexts. More details about OAS
context are provided in the OpenAPI Schema
section.
An input or output resolved as CWL
File
type, equivalent to a WPS
ComplexData
, supports format
specification. Every mimeType
field nested under formats
entries of the WPS
definition will be mapped against corresponding namespaced format
of CWL
.
Note
For OGC API - Processes
conformance and backward compatible support, both mimeType
and mediaType
can be used interchangeably for Process Deployment <proc_op_deploy>
. For Process Description <proc_op_describe>
, the employed name depends on the requested schema
as query parameter, defaulting to OGC API - Processes
mediaType
representation if unspecified.
Following is an example where input definitions are equivalent in both CWL
and WPS
contexts.
{
"id": "input",
"formats": [
{"mimeType": "application/x-netcdf"},
{"mimeType": "application/json"}
]
}
{
"inputs": [
{
"id": "input",
"format": [
"edam:format_3650",
"iana:application/json"
]
}
],
"$namespaces": {
"edam": "http://edamontology.org/",
"iana": "https://www.iana.org/assignments/media-types/"
}
}
As demonstrated, both contexts accept multiple formats for inputs. These effectively represent supported formats by the underlying application. The two Media-Types
selected for this example are chosen specifically to demonstrate how CWL
formats must be specified. More precisely, CWL
requires a real schema definition referencing an existing ontology to validate formats, specified through the $namespaces
section. Each format entry is then defined as a mapping of the appropriate namespace to the identifier of the ontology. Alternatively, you can also provide the full URL of the ontology reference in the format string.
Like many other fields, this information can quickly become redundant and difficult to maintain. For this reason, Weaver will automatically fill in the missing details if only one of the two corresponding definitions between CWL
and WPS
is provided. In other words, an application developer could only specify the I/O
's formats
in the WPS
portion during process deployment, and Weaver will take care to update the matching CWL
definition without any user intervention. This also makes it easier for the user to specify supported formats, since it is generally easier to remember names of Media-Types
than full ontology references. Weaver has a large set of commonly employed Media-Types
that it knows how to convert to corresponding ontologies. Also, Weaver will look for new Media-Types
it doesn't explicitly know about in either the IANA
or the EDAM
ontologies in order to attempt automatically resolving them.
When formats are resolved between the two contexts, Weaver applies information in a complementary fashion. This means for example that if the user provided application/x-netcdf
on the WPS
side and iana:application/json
on the CWL
side, both resulting contexts will have both of those formats combined. Weaver will not favour one location over the other, but will rather merge them if they can be resolved into different and valid entities.
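A rough sketch of this complementary merge is shown below. The lookup table `KNOWN_FORMATS` only contains the two pairs used in the example above, and `merge_formats` is a hypothetical helper; Weaver itself knows many more Media-Types and can also query the ontologies.

```python
# Only the two pairs from the example above; illustrative, not exhaustive
KNOWN_FORMATS = {
    "application/x-netcdf": "edam:format_3650",
    "application/json": "iana:application/json",
}


def merge_formats(wps_media_types: list, cwl_formats: list) -> tuple:
    """Combine formats from both contexts and return (WPS list, CWL list)."""
    to_media_type = {ref: media for media, ref in KNOWN_FORMATS.items()}
    merged = list(wps_media_types)
    for cwl_format in cwl_formats:
        media_type = to_media_type[cwl_format]  # KeyError if unknown here
        if media_type not in merged:
            merged.append(media_type)
    return merged, [KNOWN_FORMATS[media] for media in merged]


wps_side, cwl_side = merge_formats(
    ["application/x-netcdf"], ["iana:application/json"]
)
print(wps_side)
print(cwl_side)
```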
Since formats
is a required field for WPS
ComplexData
definitions (see Inputs/Outputs Type
) and because Media-Types
are easier to provide in this context, it is recommended to provide all of them in the WPS
definition. Alternatively, the Inputs/Outputs Schema
representation also located within the WPS
I/O definitions can be used to provide contentMediaType
.
Above examples present the minimal content of formats
JSON
objects (i.e.: mimeType
or mediaType
value), but other fields, such as encoding
and schema
can be provided as well to further refine the specific format supported by the corresponding I/O
definition. These fields are directly mapped, merged and combined against complementary details provided with contentMediaType
, contentEncoding
and contentSchema
within an OAS
schema (see Inputs/Outputs Schema
).
Warning
Format specification differs between CWL
and WPS
in the case of outputs.
Although WPS
definition allows multiple supported formats for an output, which are later resolved to the one actually applied to the result produced by the job, CWL
only considers the output format
that directly indicates the applied schema. There is no concept of supported format in the CWL
world. This is simply because CWL
cannot predict nor reliably determine which output will be produced by a given application execution without running it, and therefore cannot expose consistent output specification before running the process. Because CWL
must validate the full process integrity before it can be executed, only a single output format is permitted in its context (providing many will raise a validation error when parsing the CWL
definition).
To ensure compatibility with multiple supported formats outputs of WPS
, any output that has more than one format will have its format
field dropped in the corresponding CWL
definition. Without any format
on the CWL
side, the validation process will ignore this specification and will effectively accept any type of file. This will not break any execution operation with CWL
, but it will remove the additional validation layer of the format (which especially deteriorates process resolution when chaining processes inside a CWL Workflow
).
If the WPS
output only specifies a single MIME-type, then the equivalent format (after being resolved to a valid ontology) will be preserved on the CWL
side since the result is ensured to be the unique one provided. For this reason, processes with a specific single-format output should be preferred whenever possible. This also removes ambiguity in the expected output format, which usually requires a toggle input specifying the desired type for processes providing a multi-format output. It is instead recommended to produce multiple processes with a fixed output format for each case.
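The output rule can be condensed into a few lines. This is a sketch only: `cwl_output_format` and the `ONTOLOGY_REFERENCES` table are hypothetical names, and the table holds just the two Media-Types used earlier in this section.

```python
from typing import Optional

# Illustrative lookup limited to the Media-Types used in this section
ONTOLOGY_REFERENCES = {
    "application/x-netcdf": "edam:format_3650",
    "application/json": "iana:application/json",
}


def cwl_output_format(wps_formats: list) -> Optional[str]:
    """Return the CWL 'format' for an output, or None when it is dropped."""
    media_types = [
        fmt.get("mediaType") or fmt.get("mimeType") for fmt in wps_formats
    ]
    if len(media_types) != 1:
        # Several (or no) supported formats: no single format can be enforced
        return None
    return ONTOLOGY_REFERENCES.get(media_types[0])


print(cwl_output_format([{"mimeType": "application/x-netcdf"}]))
print(cwl_output_format([
    {"mimeType": "application/x-netcdf"},
    {"mediaType": "application/json"},
]))
```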
Allowed values in the context of WPS
Literal
data provide a means for the application developer to restrict inputs to a specific set of values. In CWL
, the same can be achieved using an enum
definition. Therefore, the following two variants are equivalent and completely interchangeable.
{
"id": "input",
"literalDataDomains": [
{"allowedValues": ["value-1", "value-2"]}
]
}
{
"id": "input",
"type": {
"type": "enum",
"symbols": ["value-1", "value-2"]
}
}
Weaver will propagate such definitions bidirectionally in order to update the CWL
or WPS
correspondingly with the provided information in the other context if missing. The primitive type to apply to a missing WPS
specification when resolving it from a CWL
definition is automatically inferred with the best matching type from provided values in the enum
list.
Note that enum
such as these will also be applied on top of Multiple and Optional Values
definitions presented next.
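The two equivalent variants above can be converted into one another with a short sketch. The helpers `wps_allowed_values_to_cwl` and `cwl_enum_to_wps` are hypothetical names used for illustration, and they ignore the primitive type inference that Weaver performs.

```python
def wps_allowed_values_to_cwl(wps_input: dict) -> dict:
    """Convert WPS 'allowedValues' to an equivalent CWL enum definition."""
    values = wps_input["literalDataDomains"][0]["allowedValues"]
    return {
        "id": wps_input["id"],
        "type": {"type": "enum", "symbols": list(values)},
    }


def cwl_enum_to_wps(cwl_input: dict) -> dict:
    """Convert a CWL enum definition to equivalent WPS 'allowedValues'."""
    symbols = cwl_input["type"]["symbols"]
    return {
        "id": cwl_input["id"],
        "literalDataDomains": [{"allowedValues": list(symbols)}],
    }


wps_definition = {
    "id": "input",
    "literalDataDomains": [{"allowedValues": ["value-1", "value-2"]}],
}
cwl_definition = wps_allowed_values_to_cwl(wps_definition)
print(cwl_definition)
print(cwl_enum_to_wps(cwl_definition) == wps_definition)
```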
Inputs that take multiple values or references can be specified using minOccurs
and maxOccurs
in WPS
context, while they are specified using the array
type in CWL. While the same minOccurs
parameter with a value of zero (0
) can be employed to indicate an optional input, CWL
requires the type to specify "null"
or to use the shortcut ?
character suffixed to the base type to indicate optional input. Resolution between WPS
and CWL
for the merging strategy implies all corresponding parameter combinations and checks in this case.
Warning
Make sure to specify "null"
with quotes when working with JSON
, YAML
and CWL
file formats and/or contents submitted to API
requests or with the CLI
. Using an unquoted null
will result in a parsed None
value which will not be detected as nullable CWL
type.
Because CWL
does not take an explicit maximum number of occurrences, the information in this case is not necessarily completely interchangeable. In fact, WPS
is slightly more verbose and easier to define in this case than CWL
because all details are contained within the same two parameters. Because of this, it is often preferable to provide the minOccurs
and maxOccurs
in the WPS
context, and let Weaver infer the array
and/or "null"
type requirements automatically. Also, because several parameters can express similar details in this situation, it is important to avoid providing contradicting specifications, as Weaver will have trouble guessing the intended result when merging them. If an unambiguous guess can be made, CWL
will be employed as the deciding definition to resolve erroneous mismatches (as for any other corresponding fields).
Todo: update the following warning according to Weaver issue #25.
Warning
Parameters minOccurs
and maxOccurs
are not permitted for outputs in the WPS
context. Native WPS
therefore does not permit multiple output reference files. This can be worked around using a _ file, but this use case is not covered by Weaver yet as it requires special mapping with CWL
that does support array
type as output (see issue #25).
Note
Although WPS
multi-value inputs are defined as a single entity during deployment, special care must be taken with the format in which these values are specified during execution. Please refer to the Multiple Inputs
section of Execute <proc_op_execute>
request.
Following are a few examples of equivalent WPS
and CWL
definitions to represent multiple values under a given input. Some parts of the following definitions are purposely omitted to better highlight the concise details of multiple and optional information.
{
"id": "input-multi-required",
"format": "application/json",
"minOccurs": 1,
"maxOccurs": "unbounded"
}
{
"id": "input-multi-required",
"format": "iana:application/json",
"type": {
"type": "array", "items": "File"
}
}
It can be noted from the examples that minOccurs
and maxOccurs
can be either an integer
or a string
representing one. This is to support backward compatibility of older WPS
specification that always employed strings although representing numbers. Weaver understands and handles both cases. Also, maxOccurs
can have the special string value "unbounded"
, in which case the input is considered to be allowed an unlimited number of entries (although often capped by another implicit machine-level limitation such as memory capacity). In the case of CWL
, an array
is always considered as unbounded, therefore WPS
is the only context that can limit this amount.
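As a simplified sketch of how the WPS occurrence parameters could translate into a CWL type, consider the hypothetical function `occurs_to_cwl_type` below. It accepts integers or numeric strings as described above, and it intentionally ignores finer points of Weaver's actual merging strategy.

```python
def occurs_to_cwl_type(base_type, min_occurs=1, max_occurs=1):
    """Derive a CWL type from WPS minOccurs/maxOccurs (int or str values)."""
    minimum = int(min_occurs)
    unbounded = max_occurs == "unbounded"
    maximum = None if unbounded else int(max_occurs)
    cwl_type = base_type
    if unbounded or maximum > 1:
        # CWL arrays are always unbounded: the upper limit is lost here
        cwl_type = {"type": "array", "items": base_type}
    if minimum == 0:
        # Optional input: the quoted "null" string, never an unquoted null
        cwl_type = ["null", cwl_type]
    return cwl_type


print(occurs_to_cwl_type("File", 1, "unbounded"))
print(occurs_to_cwl_type("string", "0", "1"))
print(occurs_to_cwl_type("File", 0, 3))
```

The first call corresponds to the `input-multi-required` example shown above.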
New in version 4.16.
As an alternative to the parameters presented in previous sections, and employed for representing Multiple and Optional Values
, Allowed Values
specifications, supported File Format
definitions and/or Inputs/Outputs Type
identification, the OpenAPI
specification can be employed to entirely define the I/O
schema. More specifically, this is accomplished by providing an OAS
-compliant structure under the schema
field of each corresponding I/O
. This capability allows each Process
to be compliant with OGC API - Processes
specification that requires this detail in the Process Description <proc_op_describe>
. The same kind of schema
definitions can be used for the Deploy <proc_op_deploy>
operation.
For example, the below representations are equivalent between WPS
, OAS
and CWL
definitions. Obviously, corresponding definitions can become more or less complicated with multiple combinations of corresponding parameters presented later in this section. Some definitions are also not completely portable between contexts.
{
"id": "input",
"literalDataDomains": [
{
"allowedValues": [
"value-1",
"value-2"
]
}
],
"minOccurs": 2,
"maxOccurs": 4
}
{
"id": "input",
"schema": {
"type": "array",
"items": {
"type": "string",
"enum": [
"value-1",
"value-2"
]
},
"minItems": 2,
"maxItems": 4
}
}
{
"id": "input",
"type": {
"type": "array",
"items": {
"type": "enum",
"symbols": [
"value-1",
"value-2"
]
}
}
}
An example with extensive variations of supported I/O
definitions with OAS
is available in tests/functional/application-packages/EchoProcess/describe.yml. This is also the corresponding example provided by OGC API - Processes
standard to ensure Weaver complies with its specification.
As per all previous parameters in CWL
and WPS
contexts, details provided in OAS
schema are complementary and Weaver will attempt to infer, combine and convert between the various representations as best as possible according to the level of details provided.
Furthermore, Weaver will extend (as needed) any provided schema
during Process Deployment <proc_op_deploy>
if it can identify that the specific OAS
definition is inconsistent with other parameters. For example, if minOccurs
/maxOccurs
were provided, indicating that the I/O
must have between 2 and 4 elements (i.e.: [2-4]), but only a single OAS
object was defined under schema
, that OAS
definition would be converted to the corresponding array, as single values are not permitted in this case. Similarly, if the range of items was instead [1-4], the OAS
definition would be adjusted with oneOf
keyword, allowing both single value and array representation of those values, when submitted for Process Execution <proc_op_execute>
.
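The two adjustments described above can be sketched as follows. The function `apply_occurs_to_schema` is a hypothetical illustration that only covers required I/O (minOccurs of at least 1); how Weaver represents optional values in the schema is not covered here.

```python
def apply_occurs_to_schema(item_schema: dict, min_occurs: int, max_occurs):
    """Adjust a single-value OAS schema to reflect minOccurs/maxOccurs."""
    if min_occurs < 1:
        raise ValueError("optional I/O (minOccurs=0) is outside this sketch")
    if max_occurs == 1:
        return item_schema
    array_schema = {"type": "array", "items": item_schema, "minItems": min_occurs}
    if max_occurs != "unbounded":
        array_schema["maxItems"] = max_occurs
    if min_occurs == 1:
        # A single value and an array of values are both acceptable
        return {"oneOf": [item_schema, array_schema]}
    return array_schema


single = {"type": "string", "enum": ["value-1", "value-2"]}
print(apply_occurs_to_schema(single, 2, 4))
print(apply_occurs_to_schema(single, 1, 4))
```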
Below is a summary of fields that are equivalent or considered to identify similar specifications (corresponding fields are aligned in the table). Note that all OAS
elements are always nested under the schema
field of an I/O
, with parameters located where appropriate as per OpenAPI
specification. Other OAS
fields are still permitted, but are not explicitly handled to search for corresponding definitions in WPS
and CWL
contexts.
Parameters in WPS Context |
Parameters in OAS Context |
Parameters in CWL Context |
---|---|---|
|
|
|
|
|
|
|
|
|
In order to be OGC
-compliant, any previously deployed Process
will automatically generate any missing schema
specification for all I/O
it employs when calling its Process Description <proc_op_describe>
. Similarly, a deployed Process
that did not make use of the schema
representation method to define its I/O
will also generate the corresponding OAS
definitions from other WPS
and CWL
contexts, provided those definitions offered sufficiently descriptive and valid I/O
parameters for deployment.
Along with the above parameter combinations, the OAS
context also performs auto-detection of common JSON
structures to convert between raw-data string formatted as JSON
, literal JSON
object embedded in the body, and application/json
file references toward the corresponding Complex
WPS
input or output. When any of those three JSON
definitions is detected, other equivalent representations will be added using a oneOf
keyword if they were not already explicitly provided in schema
. When analyzing and combining those definitions, any OAS
$ref
or contentSchema
specifications will be used to resolve the corresponding type: object
with the most explicit schema
definition available. If this cannot be achieved, a generic object
allowing any additionalProperties
(i.e.: no JSON
schema variation) will be used instead. External URIs pointing to an OAS
schema formatted either as JSON
or YAML
are resolved and fetched inline as needed during I/O
merging strategy to interpret specified references.
The following is a sample representation of equivalent JSON
variant definitions, which would be automatically expanded using the oneOf
structure with other missing components if applicable.
{
"id": "input",
"schema": {
"oneOf": [
{
"type": "string",
"contentMediaType": "application/json",
"contentSchema": "http://host.com/schema.json"
},
{
"$ref": "http://host.com/schema.json"
}
]
}
} |
{
"id": "input",
"schema": {
"oneOf": [
{
"type": "string",
"contentMediaType": "application/json"
},
{
"type": "object",
"additionalProperties": true
}
]
}
} |
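As a rough sketch of the expansion described above, the following function adds whichever JSON variant is missing from a schema. It is an assumption-laden illustration, not Weaver's code: the name `expand_json_variants` is hypothetical, external references are not fetched, and a string `contentSchema` is simply reused as a `$ref`.

```python
from typing import Any, Dict

JSON_MEDIA_TYPE = "application/json"


def _is_json_string(variant: Dict[str, Any]) -> bool:
    # Raw-data string holding JSON contents.
    return variant.get("type") == "string" and variant.get("contentMediaType") == JSON_MEDIA_TYPE


def _is_json_object(variant: Dict[str, Any]) -> bool:
    # Literal JSON object, either inline or by schema reference.
    return variant.get("type") == "object" or "$ref" in variant


def expand_json_variants(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: add the missing JSON representation under 'oneOf'."""
    variants = [dict(v) for v in schema.get("oneOf", [schema])]
    has_string = any(_is_json_string(v) for v in variants)
    has_object = any(_is_json_object(v) for v in variants)
    if not (has_string or has_object):
        return schema  # not a JSON definition, leave untouched
    if not has_string:
        variants.append({"type": "string", "contentMediaType": JSON_MEDIA_TYPE})
    if not has_object:
        # Prefer the most explicit schema available, else a generic object.
        ref = next((v["contentSchema"] for v in variants if isinstance(v.get("contentSchema"), str)), None)
        variants.append({"$ref": ref} if ref else {"type": "object", "additionalProperties": True})
    return {"oneOf": variants}


generic = expand_json_variants({"type": "string", "contentMediaType": JSON_MEDIA_TYPE})
referenced = expand_json_variants({
    "type": "string",
    "contentMediaType": JSON_MEDIA_TYPE,
    "contentSchema": "http://host.com/schema.json",
})
```

Under these assumptions, `generic` and `referenced` reproduce the `schema` contents of the two samples above.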
Special handling of well-known OAS
type: object
structures is also performed to convert them to more specific and appropriate WPS
types intended for their purpose. For instance, a measurement value provided along with a Unit of Measure (UoM
) is converted to a WPS
Literal
. An object containing bbox
and crs
fields with the correct schema is converted to the WPS
BoundingBox
type. Except for these special cases, all other OAS
type: object
are otherwise converted to WPS
Complex
type, which in turn is communicated to the CWL
application using a File
I/O
. Other non-JSON
definitions are also converted using the same WPS
Complex
/CWL
File
, but their values cannot be submitted with literal JSON
structures during Process Execution <proc_op_execute>
, only using raw-data (i.e.: an encoded string) or a file reference.
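The special cases above amount to a small decision rule, sketched below. The function name is hypothetical, and the property names `measurement` and `uom` used to detect a measurement with its UoM are assumptions made for illustration. Only the `bbox` and `crs` fields are named in this section, and a real check would also validate the schema of each field.

```python
from typing import Any, Dict


def wps_type_for_object(schema: Dict[str, Any]) -> str:
    """Hypothetical helper: pick the WPS type for an OAS 'type: object' schema."""
    properties = set(schema.get("properties", {}))
    if {"bbox", "crs"} <= properties:
        return "BoundingBox"
    if {"measurement", "uom"} <= properties:  # assumed property names
        return "Literal"
    # Any other object is handled as a Complex I/O (a File on the CWL side).
    return "Complex"


bbox_type = wps_type_for_object({"type": "object", "properties": {"bbox": {}, "crs": {}}})
measure_type = wps_type_for_object({"type": "object", "properties": {"measurement": {}, "uom": {}}})
other_type = wps_type_for_object({"type": "object", "additionalProperties": True})
```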
File tests/functional/application-packages/EchoProcess/describe.yml provides example I/O
definitions for the mentioned special OAS
interpretations and more advanced JSON
schemas with nested OAS
keywords.
It is important to consider that all OAS
schema
that can be provided during a Deploy <proc_op_deploy>
request or retrieved from a Process Description <proc_op_describe>
only define the expected value representations of the I/O
data to be submitted for Execution <proc_op_execute>
request. In other words, an I/O
typed as Complex
that can be submitted using any of the supported file_reference_types
to be forwarded to CWL
SHOULD NOT add any URI-related definition in schema
. It is implicit for every Process
that an I/O
of given supported Media-Types
can be submitted by reference using a link pointing to contents of such types. This implicit file reference interpretation serves multiple purposes.
- Using only the expected value definition and leaving out the by-reference equivalent greatly simplifies the schema definitions, since each Complex I/O does not need to provide a very verbose schema containing a oneOf(file-ref,raw-data) representation to indicate that data can be submitted either by value or by reference.
- Using a generic {"type": "string", "format": "uri"} OAS schema does not convey the Media-Types requirements as well as inferring them "link-to" {"type": "string", "contentMediaType": <format>}. It is therefore better to omit them entirely as they do not add any I/O descriptive value.
- Because the above string-formatted uri schemas are left out from definitions, they can instead be used explicitly in an I/O specification to indicate to Weaver that the Process uses a Literal URI string that must not be fetched by Weaver, and that must be passed down as a plain string URI directly, without modification or interpretation, to the underlying CWL Application Package.
To summarize, strings with format: uri
will NOT be considered as Complex
I/O
by Weaver. They will be seen as any other string Literal
, but this allows a Process
to describe its I/O
as an external URI reference. This can be useful for an application that itself handles the retrieval of the resource referred to by this URI. To represent supported formats of Complex
file references, the schema
should be represented using the following structures. If the contentMediaType
happens to be JSON
, then the explicit OAS
object
schema can be added as well, as presented in oas_json_types
section.
{
"id": "input",
"schema": {
"type": "string",
"contentMediaType": "image/png",
"contentEncoding": "base64"
}
} |
{
"id": "input",
"schema": {
"oneOf": [
{
"type": "string",
"contentMediaType": "application/gml+xml"
},
{
"type": "string",
"contentMediaType": "application/kml+xml"
}
]
}
} |
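The distinction between a Literal URI string and a Complex file reference can be illustrated with a minimal sketch. The helper name `complex_media_types` is hypothetical and does not correspond to a Weaver function. It only inspects `contentMediaType`, following the rule that `format: uri` alone never makes an I/O Complex.

```python
from typing import Any, Dict, List


def complex_media_types(schema: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the media types of a Complex I/O, or [] for a Literal."""
    variants = schema.get("oneOf", [schema])
    return [
        variant["contentMediaType"]
        for variant in variants
        if variant.get("type") == "string" and "contentMediaType" in variant
    ]


png = complex_media_types({"type": "string", "contentMediaType": "image/png", "contentEncoding": "base64"})
vectors = complex_media_types({
    "oneOf": [
        {"type": "string", "contentMediaType": "application/gml+xml"},
        {"type": "string", "contentMediaType": "application/kml+xml"},
    ]
})
plain_uri = complex_media_types({"type": "string", "format": "uri"})
```

An empty result, as for `plain_uri`, corresponds to a string handled as a Literal and passed to the application unchanged.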
Metadata fields are transferred between WPS
(from Process
description) and CWL
(from Application Package
) when a match is possible. Metadata fields of each I/O
definition that supports them (notably descriptions) are also transferred.
Note
Because the schema
(OAS
) definitions are embedded within WPS
I/O definitions, corresponding metadata fields ARE NOT transferred. This choice is made in order to keep schema
succinct such that they only describe the structure of the expected data type and format, and to avoid too much metadata duplication for each I/O
in the resulting Process
description.
Below is a list of compatible elements.
Parameters in WPS Context |
Parameters in CWL Context |
---|---|
keywords |
s:keywords (expecting s in $namespaces referring to http://schema.org [1]) |
metadata (using title and href fields) |
$schemas/$namespaces (using namespace name and HTTP references) |
title |
label |
abstract /description |
doc |
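The simple one-to-one rows of this table can be expressed as a lookup, sketched below. The mapping and the function `wps_metadata_to_cwl` are illustrative assumptions only. The `metadata` row is left out because it needs namespace handling, and the `s` prefix used by `s:keywords` is assumed to be declared by the package as noted in the table.

```python
from typing import Any, Dict

# WPS field -> CWL field, following the table above (metadata row omitted).
WPS_TO_CWL_METADATA = {
    "title": "label",
    "abstract": "doc",
    "description": "doc",
    "keywords": "s:keywords",
}


def wps_metadata_to_cwl(wps_fields: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy compatible WPS metadata fields to their CWL names."""
    cwl_fields: Dict[str, Any] = {}
    for wps_name, cwl_name in WPS_TO_CWL_METADATA.items():
        value = wps_fields.get(wps_name)
        # The first non-empty source wins when two WPS fields share a CWL target.
        if value and cwl_name not in cwl_fields:
            cwl_fields[cwl_name] = value
    return cwl_fields


converted = wps_metadata_to_cwl({
    "title": "Echo",
    "description": "Returns its inputs.",
    "keywords": ["test", "echo"],
})
```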
Footnotes