Spawning parentless clock-xtriggered task #5572

Closed
dpmatthews opened this issue Jun 8, 2023 · 10 comments · Fixed by #5658 · May be fixed by #5585

@dpmatthews
Contributor

Reported by @hjoliver on Element:

Consider a cycling graph with a parentless clock-xtriggered task foo at the top of each cycle. Currently (perhaps pending a future capability for clock triggers to spawn dependent tasks on demand):

  • the clock-xtrigger requires a waiting foo in the active pool, to wait on it
  • successive instances of foo get auto-spawned out to the runahead limit (there are no parent tasks to do it)

At normal start-up, the first instance of foo gets spawned by the scheduler, then (having no parents to do so) it spawns future instances of itself.
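
The auto-spawning described above can be pictured with a small toy model. This is only a sketch, not Cylc's scheduler code: the integer cycle points, the `spawn_parentless` function and its `base_point` and `runahead_limit` parameters are all made up for illustration. It shows why a flow started elsewhere in the graph never gets a waiting foo: with no instance in the pool, there is nothing to spawn the next one.

```python
def spawn_parentless(pool, base_point, runahead_limit):
    """Toy model of a parentless task spawning its own successors.

    pool: set of integer cycle points holding a waiting instance of foo.
    base_point: the oldest active cycle point in the workflow.
    runahead_limit: how many cycles ahead of base_point may be active.
    All names here are hypothetical, not Cylc internals.
    """
    pool = set(pool)
    while pool:
        next_point = max(pool) + 1
        if next_point > base_point + runahead_limit:
            break  # the successor would be beyond the runahead limit
        pool.add(next_point)
    return pool


# Normal start-up: the scheduler spawns the first foo, which spawns the rest.
normal = spawn_parentless({1000}, base_point=1000, runahead_limit=3)

# Flow started part way through the graph: no foo in the pool, so none appear.
mid_graph = spawn_parentless(set(), base_point=1000, runahead_limit=3)
```

In this model `normal` ends up holding one waiting foo per cycle out to the limit, while `mid_graph` stays empty, which is the gap this issue is about.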

However, when triggering a new flow, or when starting up with --start-tasks, at some other point in the graph, we likely need to arrange for the "next" instance of foo to be spawned as waiting on the clock-xtrigger. At the moment this can't be done (since cylc spawn was removed):

  • triggering a previous instance of foo will spawn the correct "next" one, but it will cause unwanted action downstream of the previous instance
  • triggering the "next" instance will cause it to run before its clock-xtrigger is satisfied
  • setting outputs or prerequisites won't help
@dpmatthews dpmatthews added this to the cylc-8.x milestone Jun 8, 2023
@dpmatthews
Contributor Author

We already have the case that trigger actually just queues a task and you need a second trigger to make it run immediately.
We could extend this to:

  • Triggering a task that doesn't have all its prerequisites met marks it as ready to run subject to any clock trigger (in the absence of any clock trigger it will be queued).
  • Triggering a task that is purely waiting on a clock trigger queues it.
  • Triggering a task which is queued submits it regardless of any queue limits.

I quite like this because it avoids the need to remember to use a different command when dealing with a clock triggered task.
In the vast majority of cases (tasks that don't have clock triggers or have clock triggers which have already passed) it would be no different to now.

@dpmatthews
Contributor Author

See also cylc/cylc-doc#476

@hjoliver
Member

hjoliver commented Jun 8, 2023

Hmmm, on reflection, there is a problem for my use case above.

Going by your rules, a clock-triggered parent-less task should be queued to run if manually triggered, because it has no prerequisites, even if it is an abstract task in the future graph (i.e. not in n=0 yet).

The problem is I start the flow part way through the graph in a particular cycle, so none of these parent-less clock-triggered tasks get spawned at all. I want to spawn them manually into n=0 as waiting on their clock-triggers, starting at the next cycle, in order to set up the rest of the workflow to run normally.

I guess the first rules would have to be:

  • Triggering an abstract task spawns it into n=0 with prerequisites (but not clock-triggers) satisfied
  • Triggering a waiting n=0 task satisfies any prerequisites, and queues it if not waiting on a clock-trigger
  • Triggering an n=0 task waiting only on a clock trigger, queues it
  • ... (same from here)

As a consequence, to make a parent-less clock-triggered task queue immediately you will have to trigger it twice if abstract but only once if in n=0.
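
One way to read the rules above is as a small state machine. The sketch below is only an interpretation for discussion, not the implementation: the `State` names, `trigger` and `triggers_until_queued` are hypothetical, and it assumes that a task with no clock trigger goes straight to queued once its prerequisites are satisfied.

```python
from enum import Enum


class State(Enum):
    ABSTRACT = "abstract"                  # not yet in the n=0 window
    WAITING = "waiting"                    # in n=0, prerequisites unsatisfied
    WAITING_ON_CLOCK = "waiting on clock"  # in n=0, only the clock trigger left
    QUEUED = "queued"
    SUBMITTED = "submitted"


def trigger(state, has_clock_trigger):
    """Return the state after one manual trigger, per the proposed rules."""
    if state in (State.ABSTRACT, State.WAITING):
        # Spawn into n=0 if needed and satisfy prerequisites,
        # but leave any clock trigger unsatisfied.
        return State.WAITING_ON_CLOCK if has_clock_trigger else State.QUEUED
    if state is State.WAITING_ON_CLOCK:
        return State.QUEUED
    if state is State.QUEUED:
        return State.SUBMITTED  # run now, regardless of queue limits
    return state  # already submitted: nothing more to do in this model


def triggers_until_queued(state, has_clock_trigger):
    """Count the manual triggers needed to get a task into the queue."""
    if state is State.SUBMITTED:
        raise ValueError("task is already past the queue")
    count = 0
    while state is not State.QUEUED:
        state = trigger(state, has_clock_trigger)
        count += 1
    return count


from_abstract = triggers_until_queued(State.ABSTRACT, has_clock_trigger=True)
from_n0 = triggers_until_queued(State.WAITING_ON_CLOCK, has_clock_trigger=True)
```

Here `from_abstract` counts the triggers for a parent-less clock-triggered task that is still abstract, and `from_n0` counts them for the same task already waiting in n=0.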

On the other hand, we already need to understand the difference between waiting n=0 tasks and "waiting" abstract future tasks for a couple of other reasons (e.g. n=0 already have flow numbers assigned, and can get flow-merged).

What do you think?

@hjoliver
Member

hjoliver commented Jun 8, 2023

As a consequence, to make a parent-less clock-triggered task queue immediately you will have to trigger it twice if abstract but only once if in n=0

Actually, that's similar to triggering unqueued vs queued tasks too. So maybe that's OK.

@dpmatthews
Contributor Author

Yes, sorry. What you describe is what I had in mind. It's just difficult to know the best way to describe this without referring to spawning and n=0 and without inventing new terminology (e.g "abstract task"!). The current help for cylc trigger says "Force tasks to run despite unsatisfied prerequisites" which already doesn't make a lot of sense when applied to new flows or to parentless tasks so we could do with improving this.

@hjoliver
Member

hjoliver commented Jun 12, 2023

and without inventing new terminology (e.g "abstract task"!).

I have used that terminology on and off, to distinguish a "concrete" task (which is represented by a task proxy object in the scheduler) from one that only exists in an abstract sense - somewhere ahead in the potentially infinite graph.

But yes, it is frickn difficult to write clearly about this stuff. For users, I'm hoping n=0 or active task window will do.

@oliver-sanders
Member

I hate to throw a spanner in the works here, but I can't say I'm especially fond of adding an extra layer to cylc trigger for this purpose:

  • Trigger isn't a natural home for this behaviour.
  • Adding an extra layer to trigger functionality makes more common use cases confusing/awkward.
  • This isn't just about clock-triggers but xtriggers in general.
  • It's just difficult to know the best way to describe this without referring to spawning and n=0 and without inventing new terminology (e.g "abstract task"!)

Since the proposed cylc set command will also require the same implicit spawning behaviour as cylc trigger (i.e. if I set prereqs or outputs on a task not in the pool, it will be automatically spawned for me), cylc set --pre=all should spawn a parentless task whether it has any prerequisites to satisfy or not, and may be a better home for this behaviour.

@hjoliver
Member

hjoliver commented Jun 21, 2023

@oliver-sanders -

I tend to agree that ideally "trigger" should mean "run now", end of story. But the fact is trigger does have this "layered" behaviour already - for queued tasks. And for the moment, we don't have an alternative place to put this.

Also, although I didn't actually say it above, I have been assuming that:

  • we'll need "run now despite all constraints" capability, as well as this layered approach
  • the user interface for this capability is still subject to revision under the cylc set discussion (and it may indeed end up under cylc set rather than cylc trigger).

However, in the interim (cylc set has been a long time coming!), I've implemented it (#5585) in the only currently suitable command because I need this functionality pretty quickly for a certain DR project (which involves starting up part way through a cycle point). (And it helped that @dpmatthews agreed/contributed too).

Anyhow, it sounds as if you agree that we need the underlying capability. We just need to figure out the best place to put it. If we don't get cylc set done and dusted soon, let's get it in cylc trigger with a warning about future interface change.

This isn't just about clock-triggers but xtriggers in general.

We need to be able to "trigger" any xtriggered task to start checking on its external condition now, rather than to run now, in order to set up future xtriggered tasks to run at the right time, if the workflow was not started in the normal way.

@hjoliver
Member

hjoliver commented Jun 21, 2023

It's just difficult to know the best way to describe this without referring to spawning and n=0 and without inventing new terminology (e.g "abstract task"!)

That's not really an argument against this or any other functionality unless there is another way to do it that avoids any need to understand these concepts (whatever names we give them). That we haven't settled on the terminology doesn't necessarily imply that it is possible to avoid the concepts altogether.

Unlike in Cylc 7, the Cylc 8 scheduler task pool makes good intuitive sense as the "n=0 active window" on the graph (which only contains "active tasks", with some provisos on exactly what "active" means). "Spawning" now just refers to how a task gets into that window. Users have to understand what the n=0 window is, in order to understand what they are seeing in the UIs. But that is not IMO leakage of internal implementation, unless you think it is merely implementation that we can't treat the entirety of an infinite graph equally all at once. (Isn't that a recent movie...?)

@wxtim
Member

wxtim commented Jan 24, 2024

@dpmatthews - Can I check my interpretation of this:

srcdir=${HOME}/cylc-src/delme
mkdir -p ${srcdir}
cd ${srcdir}

cat > flow.cylc <<__ICI__
[scheduler]
    allow implicit tasks = True
    cycle point format = %Y

[scheduling]
    runahead limit = P1Y

    initial cycle point = 1000
    hold after cycle point = 1005
    [[xtriggers]]
        xt = xrandom(24, 0, 'some label'):PT1S
    [[graph]]
        P1Y = @xt => foo => bar
__ICI__

cylc vip . 

# Watch until flow has decent head start/hits the hold after then ctrl+c
cylc log -m t delme

cylc set delme//1000/bar --pre=all --flow=new --meta='should-this-spawn-1000/bar'

cylc log -m t delme

Would we expect the trigger command to lead to the spawning of 1001/foo etc.?

@dpmatthews dpmatthews modified the milestones: cylc-8.x, cylc-8.3.0 Mar 13, 2024