Option to choose between synchronous and asynchronous mode for job creation #30

Closed
gfenoy opened this issue Jun 20, 2019 · 9 comments
Labels
OGC API Hackathon 2019

Comments

@gfenoy
Contributor

gfenoy commented Jun 20, 2019

While I can find information about the jobControlOptions when listing /processes, I cannot find a way to ask the OGC API - Processing server to run a job asynchronously, as I could using the mode attribute in WPS 2.0.

One option may be to add a mode parameter (in: "query") that can take sync or async as value (with sync as the default value).

Another option may be to consider using the "respond-async" Preference header defined in RFC 7240. If we use it, we can then add the following parameter to the /processes/{id}/jobs path for the post request:

{
 "required":false,
 "in":"header",
 "name":"Prefer",
 "schema":
  {
   "type":"string",
   "enum": ["respond-async"]
  }
}

Hence, a client can request that the job be run asynchronously, if possible, by passing the "respond-async" Preference header; otherwise the job is run synchronously.
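To illustrate how a server might detect this preference, here is a minimal sketch. The function name prefers_async is hypothetical, and the parsing is deliberately simplified: it splits on commas and semicolons and does not handle quoted strings that contain those characters.

```python
from typing import Optional


def prefers_async(prefer_header: Optional[str]) -> bool:
    """Return True if a Prefer header value contains the respond-async token."""
    if not prefer_header:
        return False
    # RFC 7240 allows several comma-separated preferences, each optionally
    # followed by ';'-separated parameters, e.g. "respond-async, wait=10".
    for preference in prefer_header.split(","):
        # Keep only the token before any parameters; compare case-insensitively.
        token = preference.split(";", 1)[0].strip().lower()
        if token == "respond-async":
            return True
    return False
```

With this sketch, prefers_async("respond-async, wait=10") is True, while a missing header or a value such as "return=minimal" gives False.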

Also, I don't see any response body definition for an execute request, so the client has to extract the created {jobID} from the Location response header.
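As a sketch of what that means for a client, the snippet below pulls the job identifier out of a Location value. It assumes the Location points at /processes/{id}/jobs/{jobID}; the function name job_id_from_location and the example URL are made up for illustration.

```python
from urllib.parse import urlparse


def job_id_from_location(location: str) -> str:
    """Extract {jobID} from a Location value ending in /jobs/{jobID}."""
    # Works for absolute URLs and for relative paths; a trailing slash is ignored.
    path = urlparse(location).path.rstrip("/")
    segments = path.split("/")
    # Assumption: the last two segments are "jobs" and the job identifier.
    if len(segments) < 2 or segments[-2] != "jobs" or not segments[-1]:
        raise ValueError(f"unexpected Location format: {location!r}")
    return segments[-1]


# Hypothetical Location header value returned after POST /processes/buffer/jobs
print(job_id_from_location("https://example.org/processes/buffer/jobs/abc123"))
```

Having the {jobID} in the response body would spare the client this kind of path parsing.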

One option may be to return a jobInfo object with {jobID} as its id and a statusInfo object as its infos, since the Location header basically provides one of the links available from the /processes/{id}/jobs/{jobID} path, namely the status location.

On the other hand, to get something similar to WPS 2.0, we may consider that when the client asks the Processing server to execute a job asynchronously, the server response is a jobInfo (corresponding to the StatusInfo document in WPS 2.0).

If the request carries no Preference header, the server can use the auto mode (meaning that the server decides whether the job is run synchronously or not) or the sync mode (in which case we may consider returning the /processes/{id}/jobs/{jobID}/result path in the Location response header and providing an outputInfo).
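The decision logic described above could look roughly like the following sketch. The names choose_mode and location_for are hypothetical, and the "async-execute" value is assumed to carry over from the WPS 2.0 jobControlOptions; nothing here is taken from the draft specification.

```python
from typing import Optional, Sequence


def choose_mode(prefer_header: Optional[str],
                job_control_options: Sequence[str],
                default_mode: str = "auto") -> str:
    """Pick an execution mode from the Prefer header and the process options."""
    if prefer_header is None:
        # No preference sent: the server falls back to "auto" or "sync".
        return default_mode
    tokens = {p.split(";", 1)[0].strip().lower()
              for p in prefer_header.split(",")}
    # Run asynchronously only if requested AND supported by the process.
    if "respond-async" in tokens and "async-execute" in job_control_options:
        return "async"
    return "sync"


def location_for(mode: str, process_id: str, job_id: str) -> str:
    """Return the Location path for a mode already resolved to sync or async."""
    base = f"/processes/{process_id}/jobs/{job_id}"
    if mode == "sync":
        return base + "/result"  # the result is available right away
    if mode == "async":
        return base              # status location, to be polled by the client
    raise ValueError("resolve 'auto' to 'sync' or 'async' first")
```

For example, choose_mode("respond-async", ["sync-execute"]) falls back to "sync" because the process does not advertise asynchronous execution.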

@bpross-52n added the OGC API Hackathon 2019 label Jun 20, 2019
@francbartoli
Contributor

@gfenoy shouldn't this be part of the execute model?

@bpross-52n
Contributor

@gfenoy is there another reason for the jobInfo wrapper as the additional id? We could also extend the status info with the id and also with an exceptions element, as proposed in #32

@christophenoel
Contributor

christophenoel commented Jun 20, 2019

This is indeed missing in the schema 👍

execute:
      type: object
      required:
        - outputs
        - mode
        - response
      properties:
        inputs:
          type: array
          items:
            $ref: '#/components/schemas/input'
        outputs:
          type: array
          items:
            $ref: '#/components/schemas/output'
        mode:
          type: string
          enum:
            - sync
            - async
            - auto
        response:
          type: string
          enum:
            - raw
            - document
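To make the constraints of this schema concrete, here is a small hand-written check of an execute body, as a sketch only. The function name validate_execute and the error messages are made up, and a real implementation would more likely rely on an OpenAPI or JSON Schema validator.

```python
# Values copied from the execute schema above.
REQUIRED = ("outputs", "mode", "response")
ALLOWED_MODES = ("sync", "async", "auto")
ALLOWED_RESPONSES = ("raw", "document")


def validate_execute(body: dict) -> list:
    """Return a list of error messages; an empty list means the body passes."""
    errors = []
    for key in REQUIRED:
        if key not in body:
            errors.append(f"missing required property: {key}")
    for key in ("inputs", "outputs"):
        if key in body and not isinstance(body[key], list):
            errors.append(f"{key} must be an array")
    if "mode" in body and body["mode"] not in ALLOWED_MODES:
        errors.append("mode must be one of: " + ", ".join(ALLOWED_MODES))
    if "response" in body and body["response"] not in ALLOWED_RESPONSES:
        errors.append("response must be one of: " + ", ".join(ALLOWED_RESPONSES))
    return errors


# Hypothetical request body asking for asynchronous execution.
print(validate_execute({"outputs": [], "mode": "async", "response": "document"}))
```

Note that inputs is optional in the schema, so a body without it still passes this check.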

@bpross-52n
Contributor

The thing with requesting raw data output in WPS is that it only worked for exactly one output. Maybe it is more useful to follow the approach outlined here: #21. Introducing ../result/{resultID} to get raw data for a specific output.

@gfenoy
Contributor Author

gfenoy commented Jun 20, 2019

> @gfenoy shouldn't this be part of the execute model?

Actually, this is where I am lost personally. I am unable to identify clearly what should be included in the model and what shall not be.

Typically here, the definition of these parameters (both mode and response) can be declared from within the /api.

For example, in this link you can see the declaration of both mode (oas-header1) and response (oas-header2).

I should mention that using the definition from the /api without any modification of the model seemed to be a solution at the beginning. Nevertheless, I realized when implementing it that it would make a client somewhat hard to implement, since the client would then have to parse any possible parameter defined in the /api. On the other hand, there seem to be plenty of OpenAPI clients able to do many things already...

@gfenoy
Contributor Author

gfenoy commented Jun 20, 2019

> @gfenoy is there another reason for the jobInfo wrapper as the additional id? We could also extend the status info with the id and also with an exceptions element, as proposed in #32

@bpross-52n you are totally right: after checking the XSD schemas again, it should definitely go into the statusInfo object.

@christophenoel
Contributor

> The thing with requesting raw data output in WPS is that it only worked for exactly one output. Maybe it is more useful to follow the approach outlined here: #21. Introducing ../result/{resultID} to get raw data for a specific output.

After adding back the "href" attribute in the outputs, the proposed solution is redundant and does not make sense (and would have to deal with output arrays): to access the various outputs, you simply read all the links that you requested by reference in your execute request.

@christophenoel
Contributor

Quoting chat:

Matthias Mohr @m-mohr 09:05
How would chaining without synchronous execution work?

Brad Hards @bradh 09:07
The async notification would invoke the next step in the chain?

Matthias Mohr @m-mohr 09:09
Once notifications are included, yes.

Francesco Bartoli @francbartoli 09:23
How would chaining without synchronous execution work?

once you submit a DAG/workflow then your machinery should know which is the state and the next step

Spacebel sa (Christophe Noel) @spacebel 09:23
It depends what you call "chaining". I believe any serious workflow/business process shall be handled by a workflow engine / DAG engine / script engine (allowing parallel executions etc.). We have prototyped chaining using a CWL engine (TB14), a BPMN engine (ESE), and a BPEL engine (SSEGrid). When using simple scripting, the WPS client used in the script simply needs to support either a polling strategy or notification (callback).

Marian Neagul @mneagul 09:28
@spacebel I agree, for current serious scenarios we need a Workflow engine. Allowing triggers would open the opportunity for future scenarios and coupling WPS Servers or Processes without requiring a separate middleware.

Spacebel sa (Christophe Noel) @spacebel 09:28
For those interested in sync mode (and raw single output), are you not interested in having the HTTP GET KVP parameters from the XML spec for submitting jobs? Using that strategy (which I personally have not used), you can submit to a process an input by reference, with an href which itself generates a processing output (chaining). I believe this is a pro of sync mode. Note also that we can make use of CONFORMANCE: some implementations may support sync mode, others async, or both (I think it is reported in the capabilities in XML).

Marian Neagul @mneagul 09:28
Also triggers would allow different scenarios not related to triggering other jobs

Brad Hards @bradh 09:29
I don't want the core of OGC API for Processes to be as complex as current WPS.

Francesco Bartoli @francbartoli 09:29
I don't want the core of OGC API for Processes to be as complex as current WPS.

Totally agree
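For the simple scripting case mentioned in the chat above, a polling strategy could be sketched as follows. This is only an illustration: poll_until_done is a hypothetical helper, get_status stands for whatever call fetches the job status, and the status names are assumed to follow the WPS 2.0 values (Accepted, Running, Succeeded, Failed).

```python
import time
from typing import Callable


def poll_until_done(get_status: Callable[[], str],
                    interval_s: float = 1.0,
                    max_attempts: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Call get_status repeatedly until the job succeeds or fails."""
    for _ in range(max_attempts):
        status = get_status().lower()
        if status in ("succeeded", "failed"):
            return status
        # Job still accepted or running: wait before asking again.
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")


# Stand-in for real status requests, so the sketch runs without a server.
fake_statuses = iter(["Accepted", "Running", "Succeeded"])
print(poll_until_done(lambda: next(fake_statuses), interval_s=0.0))
```

Once the first job reports success, a script can submit the next step of the chain; a notification (callback) approach would replace the loop with a server-initiated request.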

@bpross-52n
Contributor

Mode attribute is back in the current execute.yaml
