1792 - task polling logic and state reset. #2396

hjoliver · 2017-08-12T20:48:54Z

Allow reset to 'submitted' or 'running'.
Allow polling of 'succeeded' or 'failed' tasks (but not 'succeeded' by default).
Poll to confirm if a message implies a state reversal (the message could be, e.g., a late, and therefore out-of-order, poll result).
Remove 'enable resurrection' - any task can return from failed.
Document how to handle preemption in light of these changes.
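
A minimal sketch of the polling rule in the second item, read as: active and failed tasks are pollable by default, and succeeded tasks only on request. The names `pollable_statuses`, `select_pollable` and the `poll_succeeded` flag are hypothetical and are not taken from the cylc code base.

```python
# Illustrative only: hypothetical helper names, not the cylc API.

def pollable_statuses(poll_succeeded=False):
    """Return the set of task statuses that a poll request should cover.

    Assumption: 'submitted', 'running' and 'failed' are polled by default;
    'succeeded' is included only when explicitly requested.
    """
    statuses = set(["submitted", "running", "failed"])
    if poll_succeeded:
        statuses.add("succeeded")
    return statuses


def select_pollable(tasks, poll_succeeded=False):
    """Filter (name, status) pairs down to the names that should be polled."""
    allowed = pollable_statuses(poll_succeeded)
    return [name for name, status in tasks if status in allowed]


tasks = [("a", "running"), ("b", "succeeded"), ("c", "failed"), ("d", "waiting")]
default_selection = select_pollable(tasks)
explicit_selection = select_pollable(tasks, poll_succeeded=True)
```

With the sample list above, the default call leaves out the succeeded task `b`, while the explicit call includes it.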

~~Not ready for review yet. TODO~~

~~tidy up code - callback args complicated by is_second_poll flag~~
~~test: e.g. manually reset a long-running task to succeeded, then poll to return it to running~~
~~don't automatically poll succeeded tasks~~

hjoliver · 2017-08-17T12:18:22Z

A bit of an efficiency gain? (somewhat surprisingly):

Version                        Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
reset-to-submitted-or-running  complex suite  1021.6            492.1                 86108.0        
master                         complex suite  1004.9            499.2                 87652.0

oliver-sanders · 2017-08-17T14:15:46Z

> somewhat surprisingly

Indeed yes but I can confirm it is repeatable:

Version                                 Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
hjoliver/reset-to-submitted-or-running  complex suite  1066.5            310.8                 70096.0
master                                  complex suite  1070.5            308.4                 71948.0

I'd be interested in finding out what causes the discrepancy between the memory usage on these two platforms.

hjoliver · 2017-08-22T08:16:41Z

> ... what causes the discrepancy between the memory usage on these two platforms.

Probably(?) differences in the Python interpreter itself and/or std lib modules loaded?
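
One way to check that guess, as a rough sketch: record the interpreter version and the set of loaded modules on each platform, then compare the two records. The helper names `interpreter_fingerprint` and `diff_modules` are made up for illustration, and nothing here measures memory directly.

```python
import platform
import sys


def interpreter_fingerprint():
    """Capture interpreter details and the names of currently loaded modules."""
    return {
        "implementation": platform.python_implementation(),
        "version": platform.python_version(),
        "modules": sorted(sys.modules),
    }


def diff_modules(fp_a, fp_b):
    """Return (modules only in fp_a, modules only in fp_b)."""
    set_a = set(fp_a["modules"])
    set_b = set(fp_b["modules"])
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Run interpreter_fingerprint() on each platform (e.g. at suite shutdown),
# save the results, and compare them offline with diff_modules().
here = interpreter_fingerprint()
```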

matthewrmshin · 2017-09-11T07:54:40Z

Branch in conflicts. (Sorry.)

* Allow reset to 'submitted' or 'running'.
* Allow polling of succeeded or failed tasks (but not succeeded by default).
* Poll to confirm, if a message implies a state reversal.
* Remove 'enable resurrection' - all tasks can return from failed.
* Document how to handle preemption in light of these changes.

hjoliver · 2017-09-11T10:15:37Z

Deconflicted.

matthewrmshin · 2017-09-11T10:53:57Z

Extra memory usage likely to be caused by the new attribute in each task state object?

hjoliver · 2017-09-11T11:08:25Z

No, it uses less memory according to the results above. Oliver was referring to platform differences, with the same branch. (or have I misunderstood you?)

hjoliver · 2017-09-11T11:10:38Z

Looks like my deconfliction was not entirely successful...

matthewrmshin · 2017-09-11T11:59:07Z

OK.

hjoliver · 2017-09-12T01:27:29Z

Fixed.

matthewrmshin

A few comments on style. Change is otherwise OK.

matthewrmshin · 2017-09-12T08:08:06Z

lib/cylc/task_outputs.py

+    def set_incomplete(self, message):
+        """Set output message to incomplete."""
+        if message in self._by_message:
+            self._by_message[message][_IS_COMPLETED] = False


Should just use self.set_completed(message, is_completed=False)?

OK I'll refactor these completion methods slightly...
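
A sketch of what the suggested refactor might look like, with a single `set_completed` taking an `is_completed` flag so that no separate `set_incomplete` is needed. The class below is a cut-down stand-in: the value of `_IS_COMPLETED` and the layout of `_by_message` are assumptions, not the real `lib/cylc/task_outputs.py`.

```python
# Stand-in for illustration; the index value and entry layout are assumed.
_IS_COMPLETED = 0


class TaskOutputsSketch(object):
    """Cut-down model of a task outputs container keyed by message."""

    def __init__(self, messages):
        # Each entry is a mutable list holding just the completion flag.
        self._by_message = dict((msg, [False]) for msg in messages)

    def set_completed(self, message, is_completed=True):
        """Set the completion flag of an output message, if it is known."""
        if message in self._by_message:
            self._by_message[message][_IS_COMPLETED] = is_completed

    def is_completed(self, message):
        """Return True if the message is known and flagged completed."""
        entry = self._by_message.get(message)
        return bool(entry and entry[_IS_COMPLETED])


outputs = TaskOutputsSketch(["started", "succeeded"])
outputs.set_completed("succeeded")
outputs.set_completed("succeeded", is_completed=False)  # replaces set_incomplete
```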

matthewrmshin · 2017-09-12T08:16:49Z

lib/cylc/task_events_mgr.py

+            if (itask.state.is_greater_than(TASK_STATUS_RUNNING) and not
+                    itask.state.confirming_with_poll):
+                itask.state.confirming_with_poll = True
+                poll_func(self.suite, [itask], msg=poll_msg)


Do we need a return here as well?

matthewrmshin · 2017-09-12T08:22:26Z

lib/cylc/task_events_mgr.py

+                    itask.state.confirming_with_poll):
+                itask.state.confirming_with_poll = True
+                poll_func(self.suite, [itask], msg=poll_msg)
+                return


We have a block here that is repeated multiple times with only 1 difference. I wonder if it is worth moving this little bit of logic into a separate private method or not.

Yes, agreed (I recall considering that at the time, but forgot to come back to it...)
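
For illustration, the repeated block could be pulled into a private method along these lines. This is a sketch under assumptions: the stub classes, the status ordering, the method name `_poll_to_confirm` and the clearing of the flag in `handle_started` are invented here, and only the condition and the `poll_func` call mirror the diff above.

```python
TASK_STATUS_RUNNING = "running"
# Assumed status ordering, for illustration only.
_ORDER = ["waiting", "submitted", "running", "succeeded", "failed"]


class StubState(object):
    def __init__(self, status):
        self.status = status
        self.confirming_with_poll = False

    def is_greater_than(self, status):
        return _ORDER.index(self.status) > _ORDER.index(status)


class StubTask(object):
    def __init__(self, status):
        self.state = StubState(status)


class EventsManagerSketch(object):
    def __init__(self, suite):
        self.suite = suite

    def _poll_to_confirm(self, itask, status_gt, poll_func, poll_msg):
        """Poll itask if a message implies a state reversal.

        Return True if a confirming poll was issued, in which case the
        caller should return without applying the message.
        """
        if (itask.state.is_greater_than(status_gt) and
                not itask.state.confirming_with_poll):
            itask.state.confirming_with_poll = True
            poll_func(self.suite, [itask], msg=poll_msg)
            return True
        return False

    def handle_started(self, itask, poll_func):
        """Handle a 'started' message; other handlers would vary status_gt."""
        if self._poll_to_confirm(
                itask, TASK_STATUS_RUNNING, poll_func, "confirming state"):
            return
        # Assumption: clear the flag once the message has been accepted.
        itask.state.confirming_with_poll = False
        itask.state.status = TASK_STATUS_RUNNING
```

Each handler then differs only in the `status_gt` argument it passes, and the early `return` asked about in the previous comment lives in one place per handler.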

hjoliver · 2017-09-12T10:20:50Z

Feedback addressed, let's see if the tests all pass...

matthewrmshin · 2017-09-12T11:30:20Z

Something has gone wrong with these:

./tests/restart/30-outputs.t
./tests/suite-state/05-message.t

hjoliver · 2017-09-12T23:57:12Z

All tests good now.

oliver-sanders · 2017-09-13T09:12:30Z

lib/cylc/cfgspec/suite.py

@@ -548,6 +547,7 @@ def upg(cfg, descr):
    u.obsolete('7.2.2', ['cylc', 'simulation mode'])
    u.obsolete('7.2.2', ['runtime', '__MANY__', 'dummy mode'])
    u.obsolete('7.2.2', ['runtime', '__MANY__', 'simulation mode'])
+    u.obsolete('7.5.0', ['runtime', '__MANY__', 'enable resurrection'])


Do we need to update this to 7.6.0?

Yes, hang on a minute...

oliver-sanders · 2017-09-21T10:28:14Z

This pull appears to have had a detrimental impact on CPU for the diamond suite, though a positive impact on memory.

hjoliver added this to the soon milestone Aug 12, 2017

hjoliver self-assigned this Aug 12, 2017

hjoliver force-pushed the reset-to-submitted-or-running branch 3 times, most recently from 5c45b73 to cc9514f Compare August 16, 2017 07:50

hjoliver requested review from oliver-sanders and dvalters August 16, 2017 07:55

matthewrmshin requested review from matthewrmshin and removed request for dvalters August 17, 2017 10:08

oliver-sanders added the efficiency For notable efficiency improvements label Aug 17, 2017

matthewrmshin modified the milestones: soon, next release Sep 5, 2017

hjoliver added 2 commits September 11, 2017 21:17

Improve poll logging.

54d628d

hjoliver force-pushed the reset-to-submitted-or-running branch from cc9514f to 54d628d Compare September 11, 2017 10:14

hjoliver added 2 commits September 12, 2017 13:16

pep8

f5be17b

post-rebase fix2

31d249b

matthewrmshin reviewed Sep 12, 2017

Address PR review feedback.

740c212

hjoliver added 2 commits September 13, 2017 08:58

Fix previous, after test failures.

82c2a43

Command help tidy [skip ci]

0fe2dc8

matthewrmshin approved these changes Sep 13, 2017

oliver-sanders approved these changes Sep 13, 2017

Updated deprecation warning. [skip ci]

f124bcf

oliver-sanders merged commit d658be2 into cylc:master Sep 13, 2017

hjoliver deleted the reset-to-submitted-or-running branch December 4, 2017 09:15

dpmatthews mentioned this pull request Nov 15, 2021

Polling can incorrectly return a failed task to the running state #4513

Open
