Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate ICP and FCP graphs using the runahead limit #5036

Closed
wants to merge 5 commits into from

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Aug 2, 2022

A bit of late night fun (sad!).

Implement "start-up and shutdown graphs" (effectively isolated from the main graph so that perpetual dependence on R1 & $ tasks can be avoided) ... by manipulating the runahead limit.

These changes close #4912

Imagine a task prep is supposed to prepare the run directory for all tasks; and clean is supposed to tidy up at the end after all tasks have finished:

[scheduling]
    cycling mode = integer
    final cycle point = 5
    isolate initial cycle point = True
    isolate final cycle point = True
    [[graph]]
        R1 = "prep => foo"
        P1 = "foo"
        R1/$ = "foo => clean"
[runtime]
    [[prep, clean, foo]]
        script = sleep 10

With all tasks having the same run length, on master (without the new config items) 1/prep runs at the same time as 2/foo, 3/foo, 4/foo, 5/foo, then 1/foo runs at the same time as 5/clean. Which is clearly not the intention, so we'd need additional ugly dependencies to make it work.

Or ... on this PR branch, 1/prep => 1/foo runs to completion, then 2/foo, 3/foo, 4/foo run to completion, then 5/foo => 5/clean runs. 馃挜

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg and conda-environment.yml.
  • Appropriate tests are included (unit and/or functional).
  • Appropriate change log entry included.

@hjoliver hjoliver added this to the cylc-8.1.0 milestone Aug 2, 2022
@hjoliver hjoliver self-assigned this Aug 2, 2022
@hjoliver hjoliver marked this pull request as ready for review August 2, 2022 23:33
@hjoliver
Copy link
Member Author

hjoliver commented Aug 3, 2022

Assigning you as first reviewer @oliver-sanders, since it is was your idea to get "start-up graphs" cheaply and safely by manipulating the runahead limit (and a fine idea it was).

Copy link
Member

@dwsutherland dwsutherland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good feature, I might implore our operations team to switch to this.
Would be good if there was a short hand way of specifying the ICP without knowing it via the CLI (i.e. if we had to reflow the prep tasks).

Tested locally before and after (True/False), and found working 馃憤

@@ -34,10 +34,13 @@ ones in. -->

### Enhancements

[#5036](https://github.com/cylc/cylc-flow/pull/5036) - optional isolation of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalise "optional" (and 5032 below it)?

@dpmatthews
Copy link
Contributor

    [[graph]]
        R1 = "prep => foo"
        P1 = "foo"
        R1/$ = "foo => clean"

Not sure I like this. One of the key features of Cylc is the support for parallel cycles. In this example, if foo is a very long task (or series of tasks) and you want to run lots of cycles in parallel, it's going to be very frustrating if you have to spend a long waiting for the first cycle to complete before the workflow can start running cycles in parallel as intended.

One way around this is to make the start and end tasks run on separate cycles:

    [[graph]]
        R1 = "prep"
        P1!(^,$) = "foo" # or R/+P1/P1!$
        R1/$ = "clean"

(syntax provided by @oliver-sanders !)
However, the syntax is horrible and you also have to adjust the start and stop points.

It would be much nicer if we could do something like:

    [[graph]]
        start = "prep"
        P1 = "foo"
        finish = "clean"

I guess that's a lot more work?

@hjoliver
Copy link
Member Author

hjoliver commented Aug 5, 2022

@dpmatthews -

It would be much nicer if we could do something like:

Haha, that was my original suggestion, albeit explicitly restricted to start and finish graphs - you might have chimed in back on #4903!!

I guess that's a lot more work?

Well, I can think about it, if you and @oliver-sanders agree - he didn't like the separate graphs approach, which is how we ended up here. But maybe the restriction to "start" and "finish" graphs would make that acceptable - what do you think Oliver?

I think the runahead limit implementation is OK. It's opt-in, we can explain the potential downside and that prep[^] => foo) is an alternative. And perhaps worth noting that in a workflow with many cycles, less than optimal scheduling in the first won't matter much.

However, the syntax is horrible and you also have to adjust the start and stop points.

Cool that our syntax support that though. But inter-cycle triggers would make that even worse (the offset would break pre-initial ignore).

@hjoliver
Copy link
Member Author

hjoliver commented Aug 5, 2022

Long story short, I also think this is better:

    [[graph]]
        start = "prep"
        P1 = "foo"
        finish = "clean"

If you look at the description of #4903 that's what I was aiming at, but maybe I shot myself in the foot there by trying to generalize it!

@dpmatthews
Copy link
Contributor

you might have chimed in back on #4903!!

Sorry!

But inter-cycle triggers would make that even worse (the offset would break pre-initial ignore).

Yuck - hadn't thought of that.

@oliver-sanders
Copy link
Member

oliver-sanders commented Aug 5, 2022

@oliver-sanders agree - he didn't like the separate graphs approach

I'm ok with separate recurrences for the ICP/FCP. I think a fully abstracted multi-graph solution would require a lot more thought than we can reasonably provide at this point in time.

Long term it might also be the case that workflow modules / composition (e.g. the advanced handling of sub-workflows) may provide a simpler solution to this sort of problem.

Long story short, I also think this is better:

[[graph]]
start = "prep"
P1 = "foo"
finish = "clean"

It's definitely better, the issue is trying to work out how to implement it! The "start" and "finish" cycles need to be assigned a "cycle point" but what should that be? We could do something like subtract PT1S from the ICP at configuration time, but that's a bit icky (and won't necessarily work with cycle point format). Or we could implement start and finish as special cycle points, but that requires patching all cycling interfaces to handle these special keywords.

@hjoliver
Copy link
Member Author

hjoliver commented Aug 5, 2022

Or we could implement start and finish as special cycle points, but that requires patching all cycling interfaces to handle these special keywords.

I was thinking along those lines, but I haven't thought through it in depth yet.

@hjoliver
Copy link
Member Author

Superseded by #5090

@hjoliver hjoliver closed this Sep 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Variable runahead limit, for start-up graphs.
4 participants