Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task proxy spawn on demand. #3515

Merged
merged 29 commits into from Jul 29, 2020
Merged

Task proxy spawn on demand. #3515

merged 29 commits into from Jul 29, 2020

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Mar 9, 2020

Spawn on Demand implementation as per proposal https://cylc.github.io/cylc-admin/proposal-spawn-on-d.html

See #3474 (superseded) for earlier discussion and some proof of efficiency.

This branch:

  • correctly runs the "complex" test workflow for multiple cycles (last I checked, recently)
  • handles reflow properly as described in the proposal
  • seems to work for all manner of manual test cases

What's still to do:

  • restart: need to record upstream output status for each task (for prerequisites) in the DB, not just task status
  • refine the cycle point based housekeeping to account for inter-cycle dependence (no longer needed)
  • unit and functional tests (many mods needed)
  • implement reflow stop mechanisms as described in the proposal
  • implement cylc spawn for specific individual outputs

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Appropriate tests are included (unit and/or functional).
  • Already covered by existing tests.
  • Appropriate change log entry included.

@hjoliver hjoliver added efficiency For notable efficiency improvements WIP POC Proof of Concept labels Mar 9, 2020
@hjoliver hjoliver added this to the cylc-8.0.0 milestone Mar 9, 2020
@hjoliver hjoliver self-assigned this Mar 9, 2020
@hjoliver hjoliver mentioned this pull request Mar 9, 2020
7 tasks
@hjoliver hjoliver force-pushed the spawn-on-demand branch 2 times, most recently from 05ca046 to 5d134a5 Compare March 16, 2020 06:30
@hjoliver
Copy link
Member Author

Restart now implemented, and tests/restart (non-remote) functional tests working. (Description updated above)

@hjoliver hjoliver force-pushed the spawn-on-demand branch 3 times, most recently from 1add368 to 98c7ec3 Compare April 8, 2020 00:07
@hjoliver hjoliver force-pushed the spawn-on-demand branch 2 times, most recently from 9150a0d to 92c293e Compare April 16, 2020 12:56
cylc/flow/task_pool.py Outdated Show resolved Hide resolved
@hjoliver
Copy link
Member Author

hjoliver commented Apr 17, 2020

Latest Travis-CI results on this branch:

  • All unit tests pass
  • Failed 2 functional tests out of 1867 (in 2 test scripts out of 577)
    • graphql/02-root-queries.t DONE
    • offset/04-cycle-offset-chain.t DONE
    • flakytests/ DONE

@hjoliver hjoliver force-pushed the spawn-on-demand branch 2 times, most recently from ee7fb4c to 578107b Compare April 21, 2020 10:51
@dwsutherland
Copy link
Member

@hjoliver - Got graphql/02-root-queries.t to pass with:

(flow) sutherlander@cortex-vbox:cylc-flow$ git diff
diff --git a/tests/graphql/02-root-queries.t b/tests/graphql/02-root-queries.t
index 02fc49682..386d769bf 100755
--- a/tests/graphql/02-root-queries.t
+++ b/tests/graphql/02-root-queries.t
@@ -144,18 +144,12 @@ cat > expected << __HERE__
         {
             "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}baa"
         },
-        {
-            "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}bar"
-        },
         {
             "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}foo"
         },
         {
             "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}qar"
         },
-        {
-            "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}qaz"
-        },
         {
             "id": "${USER}${ID_DELIM}${SUITE_NAME}${ID_DELIM}20190101T00${ID_DELIM}qux"
         }

Ran it 3 times to check consistency (but haven't run it on a different/faster machine).

@hjoliver
Copy link
Member Author

Thanks @dwsutherland - if you're happy that's the correct expected result, I'll make your change and leave the test like that for now.

I see my latest change has caused the unit tests to fail 😬 (I know why though, easy to fix though).

@hjoliver
Copy link
Member Author

Unit tests fixed, 3 or 4 "absolute offset" tests broken now, as a result of previous change. Damn it, almost there.

@hjoliver hjoliver force-pushed the spawn-on-demand branch 2 times, most recently from 9bf6097 to 7535e12 Compare April 23, 2020 08:10
@hjoliver
Copy link
Member Author

All tests passing 💥

@hjoliver hjoliver removed POC Proof of Concept WIP labels Apr 30, 2020
@hjoliver hjoliver marked this pull request as ready for review April 30, 2020 07:09
No implicit task creation by ICP offsets.
@hjoliver
Copy link
Member Author

So I think we just need to consider absolute triggers in that bit of the validation code. I'll fix it on this branch if easy to do...

Done in latest commit. Other issues raised all addressed except for the "absolute triggers + sequential tasks" one. (I'll comment on that soon...)

hjoliver and others added 4 commits July 28, 2020 10:27
Update tests/functional/api-suite-info/01-get-graph-raw-2/suite.rc
Update tests/functional/api-suite-info/02-get-graph-raw-3/suite.rc
Update tests/functional/api-suite-info/03-get-graph-raw-4/suite.rc

Co-authored-by: Melanie Hall <37735232+datamel@users.noreply.github.com>
@hjoliver
Copy link
Member Author

hjoliver commented Jul 28, 2020

@dwsutherland - feedback addressed. Your examples above work, just need to add a test for the improved absolute trigger handling [done]

Copy link
Member

@dwsutherland dwsutherland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjoliver - Great, when trying to run the suite:

[scheduling]
    initial cycle point = 20200101T00
    [[special tasks]]
        sequential = foo
    [[dependencies]]
        #R1 = wiz
        P1M = """
#foo[-P1M] => foo
wiz[^] => foo
"""

I get the following:

(flow) sutherlander@cortex-vbox:wiz$ cylc validate wiz
TaskDefError: No cycling sequences defined for wiz
2020-07-29T12:34:40+12:00 INFO - Cold Start 20200101T0000+12
2020-07-29T12:34:40+12:00 INFO - Suite server: url=tcp://cortex-vbox:43093/ pid=10184
2020-07-29T12:34:40+12:00 INFO - Suite publisher: url=tcp://cortex-vbox:43064
2020-07-29T12:34:40+12:00 INFO - Run: (re)start=0 log=1
2020-07-29T12:34:40+12:00 INFO - Cylc version: 8.0a2
2020-07-29T12:34:40+12:00 INFO - Run mode: live
2020-07-29T12:34:40+12:00 INFO - Initial point: 20200101T0000+12
2020-07-29T12:34:40+12:00 INFO - Final point: None
2020-07-29T12:34:40+12:00 INFO - Suite shutting down - AUTOMATIC
2020-07-29T12:34:41+12:00 INFO - [client-command] identify sutherlander@cortex-vbox:cylc-uiserver
2020-07-29T12:34:41+12:00 INFO - DONE

(branch needs de-conflicted again)

Nice work! 🥇

@hjoliver hjoliver merged commit e0e356a into cylc:master Jul 29, 2020
@kinow
Copy link
Member

kinow commented Jul 29, 2020

Impressive work @hjoliver !!! 🍾 🍻

@TomekTrzeciak
Copy link
Contributor

This one's big! I'm really happy to see this happening already for Cylc 8 🎉 🍾 🚀

(name, offset, output,
offset_is_from_icp, offset_is_irregular, offset_is_absolute)

TODO: offset_is_from_icp is '^' or None - should be boolean?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjoliver - does this need addressing still?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spotting. The variable name suggests we thought the value is boolean. It's actually of no consequence as (I checked) the variable is only used in if tests (where '^' vs None behaves just like True vs `False). However, I'll follow-up with a small change to make the code less obtuse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
db change Change to the workflow database structure efficiency For notable efficiency improvements
Projects
None yet
8 participants