refactor wps-process for workflow steps #372

Merged
merged 43 commits into from
Jan 12, 2022
Changes from 41 commits
540d8c2
refactor wps-process in workflow (reuse staging operations) + fix wor…
fmigneault Nov 24, 2021
33491e6
apply refactoring of workflow step for WPS-1 remote process
fmigneault Dec 9, 2021
84f5726
fix wps-1 refactored input parsing
fmigneault Dec 9, 2021
9391267
create test workflow to evaluate fix #371
fmigneault Dec 9, 2021
310b1fd
test workflow working for nested output globs (resolves #371)
fmigneault Dec 9, 2021
0a0e3b7
experimental test workflows
fmigneault Dec 9, 2021
76fdba1
Merge branch 'master' into fix-workflow-output-binding
fmigneault Dec 15, 2021
2bf232b
add builtin file-index-selector + add test workflow wps1+builtin-sele…
fmigneault Dec 16, 2021
6367bea
refactor app package file contents / retrieval
fmigneault Dec 16, 2021
5203c76
add validation of resolved CWL app pkg after schema validation
fmigneault Dec 16, 2021
bd641c7
minor updates to docs + continue test file contents refactor
fmigneault Dec 16, 2021
8b39492
Merge remote-tracking branch 'origin/master' into fix-workflow-output…
fmigneault Dec 16, 2021
1e25950
fix broken docs references + validate CWL app pkg reported in docs us…
fmigneault Dec 16, 2021
18c5591
fix test file references
fmigneault Dec 16, 2021
fe603ae
Merge branch 'cli-literal-body' into fix-workflow-output-binding
fmigneault Dec 16, 2021
b982f70
Merge branch 'cli-literal-body' into fix-workflow-output-binding
fmigneault Dec 16, 2021
eb7ab5b
fix literalinclude in docs
fmigneault Dec 16, 2021
249face
fix incorrect provider field schema in CWL WPS1Requirement and ESGF_C…
fmigneault Dec 16, 2021
0051091
add implementations of WPS1Requirement detection directly provided as…
fmigneault Dec 16, 2021
8452ddc
fixes for WPS1Requirement CWL execution unit/href/owsContext
fmigneault Dec 18, 2021
7f79945
patch test workflows
fmigneault Dec 18, 2021
7d92a55
minor updates
fmigneault Dec 18, 2021
568d32e
fix XML status location URL when process is started from WPS-1 endpoi…
fmigneault Dec 18, 2021
53eeeaa
Merge branch 'master' into fix-workflow-output-binding
fmigneault Dec 18, 2021
f2c576e
working workflow of ogc-api, builtin, wps-1 processes chaining
fmigneault Dec 21, 2021
a716bca
remove obsolete glob mapping
fmigneault Dec 21, 2021
24af314
Merge remote-tracking branch 'origin/master' into fix-workflow-output…
fmigneault Dec 21, 2021
90fc162
remove test workflow process ID rename to simplify operations
fmigneault Dec 21, 2021
b08c334
add reference to tmp test for scatter (relates to https://github.com/…
fmigneault Dec 21, 2021
b6c8574
adjust logging level for potential failing causes in Wps3Process step
fmigneault Dec 21, 2021
5daa46d
move workflow operations detail to doc + fix openapi WPS endpoint in doc
fmigneault Dec 21, 2021
89d0331
fix conflicts between doc CLI autogen method sections and process ope…
fmigneault Dec 21, 2021
b1ca8c6
fixes for docs
fmigneault Dec 21, 2021
d8a29a2
update changelog
fmigneault Dec 21, 2021
47b62c8
fix lint
fmigneault Dec 21, 2021
d9037f4
fix WPS-1 XML status location
fmigneault Dec 22, 2021
e38dd8b
move media-type CWL prefix cleanup downward to handle format list as …
fmigneault Dec 22, 2021
8d0a382
fix failing href package from unknown Content-Type
fmigneault Dec 22, 2021
b87bc89
adjust doc deploy/describe/execute references
fmigneault Dec 22, 2021
4a9222f
fix deploy executionUnit assuming CWL package when href possible
fmigneault Dec 22, 2021
31d5e80
fix lint
fmigneault Dec 22, 2021
d127ffe
fix Wps3Process.monitor returned bool
fmigneault Jan 11, 2022
407c0de
ignore celery 5.2.2 (CVE-2021-23727) warning by safety (hardcode unti…
fmigneault Jan 11, 2022
33 changes: 30 additions & 3 deletions CHANGES.rst
@@ -10,11 +10,38 @@ Changes

Changes:
--------
- No change.
- Refactor Workflow operation flow to reuse shared input and output staging operations between implementations.
Each new step process implementation now only needs to implement the operations specific to deployment,
execution, monitoring and result retrieval for its process, without having to consider the intermediate Workflow
staging operations that transfer files between steps.
- Refactor ``Wps1Process`` and ``Wps3Process`` step processes to follow new workflow operation flow.
- Add ``builtin`` process ``file_index_selector`` that allows the selection of a specific file within an array of files.
- Add tests to validate chaining of Workflow steps using different combinations of process types
including `WPS-1`, `OGC-API` and ``builtin`` implementations.
- Move `CWL` script examples in documentation to separate package files in order to directly reference them in
tests validating their deployment and execution requests.
- Move all ``tests/functional/application-packages`` definitions into distinct directories to facilitate categorization
of corresponding deployment, execution and package contents, and to better support the various Workflow testing
locations of those files with backward compatibility.
- Add a final log entry after the retrieved internal `CWL` application logs to help delimit them from the following
entries of the parent `Process`.
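The selection performed by the ``file_index_selector`` entry above can be pictured with a minimal sketch. This is not the ``builtin`` process code; the helper name ``select_file_by_index`` and the sample file names are hypothetical and only illustrate picking one file out of an array of files.

```python
from typing import List


def select_file_by_index(files: List[str], index: int) -> str:
    """Return the file reference located at ``index`` (hypothetical helper, not Weaver code)."""
    # Reject negative or too-large indices explicitly instead of relying on Python wrap-around.
    if not 0 <= index < len(files):
        raise IndexError(f"index {index} is out of range for {len(files)} file(s)")
    return files[index]


# Hypothetical array of files produced by a previous Workflow step.
step_outputs = ["a.nc", "b.nc", "c.nc"]
selected = select_file_by_index(step_outputs, 1)
print(selected)
```

A step placed after an array-producing step could use this kind of selection to feed a single file to a process that only accepts one.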

Fixes:
------
- No change.
- Fix handling of `CWL` Workflow outputs between steps when nested glob output bindings are employed
(resolves `#371 <https://github.com/crim-ca/weaver/issues/371>`_).
- Fix resolution of ``builtin`` process Python reference when executed locally within a Workflow step.
- Fix resolution of process type `WPS-1` from its package within a Workflow step executed as `OGC-API` process.
- Fix resolution of ``WPS1Requirement`` directly provided as `CWL` execution unit within the deployment body.
- Fix deployment body partially dropping invalid ``executionUnit`` sub-fields causing potential misinterpretation
of the intended application package.
- Fix resolution of package or `WPS-1` reference provided by ``href`` with erroneous ``Content-Type`` reported by the
returned response. Attempts auto-resolution of detected `CWL` (as `JSON` or `YAML`) and `WPS-1` (as `XML`) contents.
- Fix resolution of ``format`` reference within `CWL` I/O record after interpretation of the loaded application package.
- Fix missing `WPS` endpoint responses in generated `OpenAPI` for `ReadTheDocs` documentation.
- Fix reporting of `WPS-1` status location as the `XML` file URL instead of the `JSON` `OGC-API` endpoint when `Job`
was originally submitted through the `WPS-1` interface.
- Fix and improve multiple typing definitions.
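The ``href`` auto-resolution mentioned in the fixes above (detecting `CWL` as `JSON` or `YAML` and `WPS-1` as `XML` when the ``Content-Type`` is unreliable) could look roughly like the following sketch. It is an assumption-laden illustration using only the standard library, not the detection logic merged in this PR: the function name ``guess_reference_kind`` and its return labels are hypothetical, any well-formed `XML` is treated as `WPS-1`, and `YAML` is recognized by a simple ``cwlVersion:`` line check rather than a real parser.

```python
import json
import xml.etree.ElementTree as ET


def guess_reference_kind(text: str) -> str:
    """Guess the kind of document from its contents (hypothetical helper, not Weaver code)."""
    stripped = text.strip()
    # 1. JSON object carrying a 'cwlVersion' key is assumed to be a CWL package.
    try:
        data = json.loads(stripped)
        if isinstance(data, dict) and "cwlVersion" in data:
            return "cwl-json"
    except ValueError:
        pass
    # 2. Any well-formed XML is assumed to be a WPS-1 response in this sketch.
    try:
        ET.fromstring(stripped)
        return "wps1-xml"
    except ET.ParseError:
        pass
    # 3. Crude YAML heuristic: a top-level 'cwlVersion:' line (no YAML parser involved).
    if any(line.startswith("cwlVersion:") for line in stripped.splitlines()):
        return "cwl-yaml"
    return "unknown"


print(guess_reference_kind('{"cwlVersion": "v1.0", "class": "CommandLineTool"}'))
print(guess_reference_kind("cwlVersion: v1.0\nclass: CommandLineTool\n"))
```

Inspecting the body instead of trusting the header is what lets a reference served with a wrong ``Content-Type`` still be resolved.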

`4.7.0 <https://github.com/crim-ca/weaver/tree/4.7.0>`_ (2021-12-21)
========================================================================
@@ -592,7 +619,7 @@ Fixes:
Changes:
--------
- Add ``weaver.wps.utils.get_wps_client`` function to handle the creation of ``owslib.wps.WebProcessingService`` client
with appropriate request options configuration from application settings.
with appropriate *request options* configuration from application settings.

Fixes:
------
27 changes: 27 additions & 0 deletions docs/examples/docker-python-script-report.cwl
@@ -0,0 +1,27 @@
cwlVersion: v1.0
class: CommandLineTool
baseCommand:
  - python3
  - script.py
inputs:
  - id: amount
    type: int
  - id: cost
    type: float
outputs:
  - id: quote
    type: File
    outputBinding:
      glob: report.txt
requirements:
  DockerRequirement:
    dockerPull: "python:3.7-alpine"
  InitialWorkDirRequirement:
    listing:
      # below script is generated dynamically in the working directory, and then called by the base command
      - entryname: script.py
        entry: |
          amount = $(inputs.amount)
          cost = $(inputs.cost)
          with open("report.txt", "w") as report:
              report.write(f"Order Total: {amount * cost:0.2f}$\n")
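To make the effect of the ``$(inputs.amount)`` and ``$(inputs.cost)`` substitutions concrete, the sketch below reproduces what the generated ``script.py`` would do once `CWL` has replaced both expressions. The values ``3`` and ``4.5`` are hypothetical inputs chosen for illustration, and a temporary directory stands in for the working directory of the container.

```python
import os
import tempfile

# Hypothetical values standing in for $(inputs.amount) and $(inputs.cost).
amount = 3
cost = 4.5

# A temporary directory plays the role of the CWL working directory.
workdir = tempfile.mkdtemp()
report_path = os.path.join(workdir, "report.txt")

# Same write as in the generated script: total formatted with two decimals.
with open(report_path, "w") as report:
    report.write(f"Order Total: {amount * cost:0.2f}$\n")

with open(report_path) as report:
    content = report.read()
print(content, end="")  # prints "Order Total: 13.50$"
```

The ``report.txt`` file written here is the one matched by the ``glob`` of the ``quote`` output.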
16 changes: 16 additions & 0 deletions docs/examples/docker-shell-script-cat.cwl
@@ -0,0 +1,16 @@
cwlVersion: v1.0
class: CommandLineTool
baseCommand: cat
requirements:
  DockerRequirement:
    dockerPull: "debian:stretch-slim"
inputs:
  - id: file
    type: File
    inputBinding:
      position: 1
outputs:
  - id: output
    type: File
    outputBinding:
      glob: stdout.log
5 changes: 5 additions & 0 deletions docs/source/appendix.rst
@@ -125,6 +125,11 @@ Glossary
Entity that offers an ensemble of :term:`Process` under it. It is typically a reference to a remote service,
where any :term:`Process` it provides is fetched dynamically on demand.

Request Options
Configuration settings that can be defined for `Weaver` in order to automatically insert additional
HTTP request parameters, authentication or any other relevant rules when target URLs are matched.
See also :ref:`conf_request_options`.

S3
Simple Storage Service (:term:`AWS` S3), bucket file storage.

4 changes: 2 additions & 2 deletions docs/source/cli.rst
@@ -9,8 +9,8 @@ Once `Weaver` package is installed (see :ref:`installation`), it provides a comm
as well as a :py:class:`weaver.cli.WeaverClient` to allow simplified interactions through shell calls or Python scripts.

This offers to the user methods to use file references (e.g.: local :term:`CWL` :term:`Application Package` definition)
to rapidly operate with functionalities such as :ref:`Deploy`, :ref:`Describe`, :ref:`Execute` and any other operation
described in :ref:`proc_operations` section.
to rapidly operate with functionalities such as :ref:`Deploy <proc_op_deploy>`, :ref:`Describe <proc_op_describe>`,
:ref:`Execute <proc_op_execute>` and any other operation described in :ref:`proc_operations` section.

For details about using the Python :py:class:`weaver.cli.WeaverClient`, please refer directly to its documentation
and its underlying methods.
6 changes: 3 additions & 3 deletions docs/source/conf.py
@@ -95,7 +95,7 @@ def doc_redirect_include(file_path):
# note:
# setting 'weaver.build_docs' allows to ignore part of code that cause problem or require unnecessary
# configuration for the purpose of parsing the source to generate the OpenAPI
config = Configurator(settings={"weaver.wps": False, "weaver.wps_restapi": True, "weaver.build_docs": True})
config = Configurator(settings={"weaver.wps": True, "weaver.wps_restapi": True, "weaver.build_docs": True})
config.include("weaver") # need to include package to apply decorators and parse routes
api_spec_file = os.path.join(DOC_BLD_ROOT, "api.json")
# must disable references when using redoc (alpha version not rendering them correctly)
@@ -168,8 +168,8 @@ def doc_redirect_include(file_path):

# allow conversion of quotes and repeated dashes to other representation characters
# https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smartquotes
# disable to avoid problems with '--param' employed in document of CLI.
smartquotes = False
# To avoid problems with '--param' employed in document of CLI, provide them as ``--param``.
smartquotes = True

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
12 changes: 10 additions & 2 deletions docs/source/configuration.rst
@@ -82,9 +82,12 @@ they are optional and which default value or operation is applied in each situat
|
| Enables the WPS-1/2 endpoint.

.. seealso::
:ref:`wps_endpoint`

.. warning::

At the moment, this setting must be ``true`` to allow job execution as the worker monitors this endpoint.
At the moment, this setting must be ``true`` to allow :term:`Job` execution as the worker monitors this endpoint.
This could change with future developments (see issue `#21 <https://github.com/crim-ca/weaver/issues/21>`_).

- | ``weaver.wps_path = <url-path>``
@@ -206,7 +209,8 @@ they are optional and which default value or operation is applied in each situat
|
| Encryption settings as well as custom email templates are available. Default email template defined in
`email-template`_ is employed if none is provided. Email notifications are sent only on job
completion if an email was provided in the :ref:`Execute` request body (see also: :ref:`Email Notification`).
completion if an email was provided in the :ref:`Execute <proc_op_execute>` request body
(see also: :ref:`Email Notification`).


.. note::
@@ -333,11 +337,15 @@ simply set setting ``weaver.wps_processes_file`` as *undefined* (i.e.: nothing a
- `wps_processes.yml.example`_


.. _conf_request_options:

Configuration of Request Options
=======================================

.. todo:: complete docs

:term:`Request Options`

``weaver.ssl_verify``


2 changes: 1 addition & 1 deletion docs/source/faq.rst
@@ -47,7 +47,7 @@ Please refer to below references for more details.
.. seealso::

- Supported :term:`Application Package` definitions in :ref:`process-wps-rest` deployment.
- :ref:`Deploy` request.
- :ref:`Deploy <proc_op_deploy>` request.


Fixing permission error on input files
75 changes: 16 additions & 59 deletions docs/source/package.rst
@@ -77,64 +77,20 @@ available within its containerized environment. In this case, we also take advan
is always collected by `Weaver` (along with the ``stderr``) in order to obtain traces produced by any
:term:`Application Package` when performing :term:`Job` executions.

.. code-block:: yaml
.. literalinclude:: ../examples/docker-shell-script-cat.cwl
:language: yaml
:caption: Sample CWL definition of a shell script

    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: cat
    requirements:
      DockerRequirement:
        dockerPull: "debian:stretch-slim"
    inputs:
      - id: file
        type: File
        inputBinding:
          position: 1
    outputs:
      - id: output
        type: File
        outputBinding:
          glob: stdout.log


The second example takes advantage of the |cwl-workdir-req|_ to generate a Python script dynamically
(i.e.: ``script.py``), prior to executing it to process the received inputs and produce the output file.
Because a Python runner is required, the |cwl-docker-req|_ specification defines a basic :term:`Docker` image that
meets our needs. Note that in this case, special interpretation of ``$(...)`` entries within the definition can be
provided to tell :term:`CWL` how to map :term:`Job` input values to the dynamically created script.

.. code-block:: yaml
.. literalinclude:: ../examples/docker-python-script-report.cwl
:language: yaml
:caption: Sample CWL definition of a Python script

    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand:
      - python3
      - script.py
    inputs:
      - id: amount
        type: int
      - id: cost
        type: float
    outputs:
      - id: quote
        type: File
        outputBinding:
          glob: report.txt
    requirements:
      DockerRequirement:
        dockerPull: "python:3.7-alpine"
      InitialWorkDirRequirement:
        listing:
          # below script is generated dynamically in the working directory, and then called by the base command
          entryname: script.py
          entry: |
            amount = $(inputs.amount)
            cost = $(inputs.cost)
            with open("report.txt", "w") as report:
                report.write(f"Order Total: {amount * cost}$\n")

.. _app_pkg_docker:

Dockerized Applications
Expand All @@ -151,7 +107,7 @@ using :term:`CWL` capabilities in order to run it.
Because :term:`Application Package` providers could desire to make use of :term:`Docker` images hosted on private
registries, `Weaver` offers the capability to specify an authorization token through HTTP request headers during
the :term:`Process` deployment. More specifically, the following definition can be provided during a
:ref:`Deploy` request.
:ref:`Deploy <proc_op_deploy>` request.

.. code-block:: http

@@ -211,7 +167,7 @@ definition can be placed in any location supported as for the case of atomic pro
The following :term:`CWL` definition demonstrates an example ``Workflow`` process that would resolve each ``step`` with
local processes of matching IDs.

.. literalinclude:: ../../tests/functional/application-packages/workflow_subset_ice_days.cwl
.. literalinclude:: ../../tests/functional/application-packages/WorkflowSubsetIceDays/package.cwl
:language: JSON
:linenos:

@@ -243,8 +199,9 @@ figure out how to parse it.

Because `Weaver` and the underlying `CWL` executor need to resolve all steps in order to validate their input and
output definitions correspond (id, format, type, etc.) in order to chain them, all intermediate processes **MUST**
be available. This means that you cannot :ref:`Deploy` nor :ref:`Execute` a ``Workflow``-flavored
:term:`Application Package` until all referenced steps have themselves been deployed and made visible.
be available. This means that you cannot :ref:`Deploy <proc_op_deploy>` nor :ref:`Execute <proc_op_execute>`
a ``Workflow``-flavored :term:`Application Package` until all referenced steps have themselves been deployed and
made visible.
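The requirement that every referenced step be deployed before the ``Workflow`` itself can be illustrated with a small pre-check. This is only a sketch under stated assumptions, not how `Weaver` resolves steps: the helper ``find_missing_steps`` is hypothetical, steps are assumed to be a mapping whose ``run`` field is a plain string reference, and a trailing ``.cwl`` extension is assumed to be the only difference between the reference and the process ID.

```python
from typing import Any, Dict, Iterable, List


def find_missing_steps(workflow: Dict[str, Any], deployed_ids: Iterable[str]) -> List[str]:
    """List step process IDs referenced by ``workflow`` that are not deployed (hypothetical helper)."""
    deployed = set(deployed_ids)
    missing = []
    for step in workflow.get("steps", {}).values():
        run_ref = step["run"]
        # Assumption: the process ID is the 'run' reference without its '.cwl' extension.
        process_id = run_ref[:-4] if run_ref.endswith(".cwl") else run_ref
        if process_id not in deployed:
            missing.append(process_id)
    return missing


# Hypothetical two-step workflow and set of already deployed processes.
workflow = {
    "cwlVersion": "v1.0",
    "class": "Workflow",
    "steps": {
        "subset": {"run": "subsetter.cwl"},
        "count": {"run": "ice_days.cwl"},
    },
}
missing = find_missing_steps(workflow, ["subsetter"])
print(missing)
```

Running such a check before sending the deployment request shows which step processes still have to be deployed first.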

.. warning::

@@ -256,7 +213,7 @@ be available. This means that you cannot :ref:`Deploy` nor :ref:`Execute` a ``Wo
.. seealso::

- :py:func:`weaver.processes.wps_package.get_package_workflow_steps`
- :ref:`Deploy` request details.
- :ref:`Deploy <proc_op_deploy>` request details.

Step Inputs/Outputs
~~~~~~~~~~~~~~~~~~~~~
@@ -331,8 +288,8 @@ structures are supported, whether they are specified using an array list with ex
variant, or using key-value pairs (see |cwl-io-map|_ for more details). Regardless of array or mapping format,
:term:`CWL` requires that all I/O have unique ``id``. On the :term:`WPS` side, a list of I/O is *always* expected.
This is because :term:`WPS` I/O with multiple values (array in :term:`CWL`) are specified by repeating the ``id`` with
each value instead of defining the value as a list of those values during :ref:`Execute` request (see also
:ref:`Multiple Inputs`).
each value instead of defining the value as a list of those values during :ref:`Execute <proc_op_execute>` request
(see also :ref:`Multiple Inputs`).
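The difference between a `CWL`-style array value and the `WPS`-style repetition of the ``id`` can be sketched as follows. The helper ``expand_repeated_inputs`` and the ``value`` field name are assumptions made for illustration; the exact fields expected in a request body are described in the referenced sections, not here.

```python
from typing import Any, Dict, List


def expand_repeated_inputs(values: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a mapping of inputs into a list where array values repeat their ``id`` (hypothetical helper)."""
    inputs = []
    for input_id, value in values.items():
        # A single value is handled like a one-item array.
        items = value if isinstance(value, list) else [value]
        for item in items:
            inputs.append({"id": input_id, "value": item})
    return inputs


# Hypothetical inputs: one array input and one single-value input.
listing = expand_repeated_inputs({"files": ["a.nc", "b.nc"], "threshold": 0.5})
for entry in listing:
    print(entry)
```

Each array item becomes its own entry with the same ``id``, which is why a list of I/O is always expected on the `WPS` side.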

To summarize, the following :term:`CWL` and :term:`WPS` I/O definitions are all equivalent and will result in the
same process definition after deployment. For simplification purposes, the examples below omit all but mandatory fields
@@ -375,8 +332,8 @@ Other fields are discussed afterward in specific sections.
The :term:`WPS` example above requires a ``format`` field for the corresponding :term:`CWL` ``File`` type in order to
distinguish it from a plain string. More details are available in `Inputs/Outputs Type`_ below about this requirement.

Finally, it is to be noted that above :term:`CWL` and :term:`WPS` definitions can be specified in the :ref:`Deploy`
request body with any of the following variations:
Finally, it is to be noted that above :term:`CWL` and :term:`WPS` definitions can be specified in
the :ref:`Deploy <proc_op_deploy>` request body with any of the following variations:

1. Both are simultaneously fully specified (valid although extremely verbose).
2. Both partially specified as long as sufficient complementary information is provided.
@@ -405,7 +362,7 @@ In the :term:`CWL` context, the ``type`` field indicates the type of I/O. Availa
**intentional** as :term:`WPS` does not offer equivalents. Furthermore, both of these types make the process
description too ambiguous. For instance, most processes expect remote file references, and providing a
``Directory`` doesn't indicate an explicit reference to which files to retrieve during stage-in operation of
a job execution.
a :term:`Job` execution.


In the :term:`WPS` context, three data types exist, namely ``Literal``, ``BoundingBox`` and ``Complex`` data.
@@ -604,7 +561,7 @@ employed as deciding definition to resolve erroneous mismatches (as for any othe
.. note::
Although :term:`WPS` multi-value inputs are defined as a single entity during deployment, special care must be taken
with the format in which these values are specified during execution. Please refer to :ref:`Multiple Inputs` section
of :ref:`Execute` request.
of :ref:`Execute <proc_op_execute>` request.

Following are a few examples of equivalent :term:`WPS` and :term:`CWL` definitions to represent multiple values under
a given input. Some parts of the following definitions are purposely omitted to better highlight the concise details