interventions: agree on the set-outputs/remove/forget discussions #172

oliver-sanders · 2023-04-19T15:44:48Z

Proposal cylc set
Proposal task expiration

oliver-sanders · 2023-05-15T14:06:03Z

Clarification on use cases 6 & 7 in the cylc set proposal...

6.?? Set jobs to failed when a job platform is known to be down

I don’t think this case is valid (unless I’ve misunderstood the requirement?).

This case describes the scenario where a job has successfully submitted (or even started?) on a remote platform which subsequently becomes uncontactable, leaving us with a job "stuck" in the submitted(/running?) state. We cannot poll or kill these tasks, so at Cylc 7 this could stall workflows. The cylc reset command was used to work around the issue, allowing us to disown job submissions in situations where Cylc could not confirm that they had failed.

I think this issue can still occur at Cylc 8. If so, we need a mechanism for telling Cylc to disown these job submissions so that we can re-submit on another system and continue.

Suggested solutions:

cylc forget <job>
cylc message <id> <task> -- failed
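
To make the "disown" requirement concrete, here is a toy model in Python. It is only a sketch of the idea, not Cylc code: the job dictionary, `poll`, `disown` and `resubmit` are all hypothetical names invented for illustration, and the real scheduler tracks far more state.

```python
class PlatformUnreachable(Exception):
    """Raised when the (hypothetical) job platform cannot be contacted."""


def poll(job, platform_up):
    """Ask the platform for the job state; impossible if the platform is down."""
    if not platform_up:
        raise PlatformUnreachable(job["platform"])
    return job["state"]


def disown(job):
    """Operator intervention: record the submission as failed locally,
    without contacting the platform at all."""
    return {**job, "state": "failed", "disowned": True}


def resubmit(job, new_platform):
    """Submit the same task again elsewhere, but only once the old
    submission has been given up on."""
    if job["state"] != "failed":
        raise RuntimeError("refusing to resubmit a job that may still be live")
    return {
        "task": job["task"],
        "platform": new_platform,
        "state": "submitted",
        "submit_num": job["submit_num"] + 1,
    }


# A job stuck in "submitted" on a platform that has gone away.
stuck = {"task": "foo", "platform": "hpc1", "state": "submitted", "submit_num": 1}

try:
    poll(stuck, platform_up=False)
except PlatformUnreachable:
    # We cannot confirm failure, so the operator disowns the submission.
    stuck = disown(stuck)

retry = resubmit(stuck, new_platform="hpc2")
print(retry)
```

The point of the sketch is that the state change happens entirely on the scheduler side, which is what either suggested command would need to achieve.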

7.?? Set switch tasks at an optional branch point, to direct the future flow

I’m not sure this is valid either. Why would we need to do this?

Sometimes tasks act as "if" statements in workflows, governing graph branching. With optional outputs these branching patterns are likely to become more common as people pull these "if" statements out of task logic and into the workflow graph.

a => b
a:x? => x => b
a:y? => y => b
a:z? => z => b

E.g. we have a few workflows where the first task in every cycle yields an output which decides which data source to use based on runtime conditions. Users might want to intervene in this decision, rather than leaving it up to the automatic logic, in order to work around unexpected issues, for development, or to test recovery logic manually. They may want --wait behaviour for this case.

To do this they need to be able to set the desired output on the switch task (covered by the cylc set proposal), but will probably also want to remove/expire the task to prevent it from being re-run and potentially triggering another branch in doing so.
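
The concern about a second branch triggering can be illustrated with a small Python model of the switch task `a` and its optional outputs `x`, `y`, `z`. This is a sketch under simplifying assumptions, not how the Cylc scheduler works: `SwitchTask`, `set_output`, `run` and `triggered_branches` are hypothetical names, and each completed output is assumed to trigger the branch task of the same name.

```python
from dataclasses import dataclass, field


@dataclass
class SwitchTask:
    """Toy stand-in for a task whose optional outputs select a graph branch."""
    name: str
    optional_outputs: tuple
    completed: set = field(default_factory=set)
    removed: bool = False

    def set_output(self, output):
        """Manual intervention: mark an optional output as completed."""
        if output not in self.optional_outputs:
            raise ValueError(f"{self.name} has no output {output!r}")
        self.completed.add(output)

    def run(self, chosen_output):
        """Simulate the task running and picking a branch by itself.
        A removed/expired task never runs, so it picks nothing."""
        if self.removed:
            return
        self.completed.add(chosen_output)


def triggered_branches(task):
    """Branch tasks triggered downstream (one per completed output)."""
    return sorted(task.completed)


# Intervention only: the user forces branch "y", but "a" later runs anyway
# and its own logic picks "x", so two branches end up triggered.
a = SwitchTask("a", ("x", "y", "z"))
a.set_output("y")
a.run("x")
print(triggered_branches(a))

# Intervention plus remove/expire: only the user's chosen branch triggers.
a2 = SwitchTask("a", ("x", "y", "z"))
a2.set_output("y")
a2.removed = True
a2.run("x")
print(triggered_branches(a2))
```

In this model, setting the output alone is not enough to pin the flow to one branch; the task also has to be prevented from running.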

hjoliver commented May 18, 2023

oliver-sanders commented Jul 25, 2023
