Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1792 - task polling logic and state reset. #2396

Merged
merged 8 commits into from Sep 13, 2017

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Aug 12, 2017

Close #1792.

  • Allow reset to 'submitted' or 'running'.
  • Allow polling of 'succeeded' or 'failed' tasks (but not 'succeeded' by default).
  • Poll to confirm, if a message implies a state reversal (the message could be, e.g. a late - and therefore out-of-order - poll result)
  • Remove 'enable resurrection' - any task can return from failed.
  • Document how to handle preemption in light of these changes.

Not ready for review yet. TODO

  • tidy up code - callback args complicated by is_second_poll flag
  • test: e.g. manually reset a long-running task to succeeded, then poll to return it to running
  • don't automatically poll succeeded tasks

@hjoliver hjoliver added this to the soon milestone Aug 12, 2017
@hjoliver hjoliver self-assigned this Aug 12, 2017
@hjoliver hjoliver force-pushed the reset-to-submitted-or-running branch 3 times, most recently from 5c45b73 to cc9514f Compare August 16, 2017 07:50
@matthewrmshin matthewrmshin requested review from matthewrmshin and removed request for dvalters August 17, 2017 10:08
@hjoliver
Copy link
Member Author

A bit of an efficiency gain? (somewhat surprisingly):

Version                        Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
reset-to-submitted-or-running  complex suite  1021.6            492.1                 86108.0        
master                         complex suite  1004.9            499.2                 87652.0   

@oliver-sanders
Copy link
Member

oliver-sanders commented Aug 17, 2017

somewhat surprisingly

Indeed yes but I can confirm it is repeatable:

Version Run Elapsed Time (s) CPU Time - Total (s) Max Memory (kb)
hjoliver/reset-to-submitted-or-running complex suite 1066.5 310.8 70096.0
master complex suite 1070.5 308.4 71948.0

I'd be interested in finding out what causes the discrepancy between the memory usage on these two platforms.

@oliver-sanders oliver-sanders added the efficiency For notable efficiency improvements label Aug 17, 2017
@hjoliver
Copy link
Member Author

... what causes the discrepancy between the memory usage on these two platforms.

Probably(?) differences in the Python interpreter itself and/or std lib modules loaded?

@matthewrmshin matthewrmshin modified the milestones: soon, next release Sep 5, 2017
@matthewrmshin
Copy link
Contributor

Branch in conflicts. (Sorry.)

 * Allow reset to 'submitted' or 'running'.
 * Allow polling of succeeded or failed tasks (but not succeeded by default).
 * Poll to confirm, if a message implies a state reversal.
 * Remove 'enable resurrection' - all tasks can return from failed.
 * Document how to handle preemption in light of these changes.
@hjoliver
Copy link
Member Author

Deconflicted.

@matthewrmshin
Copy link
Contributor

Extra memory usage likely to be caused by the new attribute in each task state object?

@hjoliver
Copy link
Member Author

No, it uses less memory according to the results above. Oliver was referring to platform differences, with the same branch. (or have I misunderstood you?)

@hjoliver
Copy link
Member Author

Looks like my deconfliction was not entirely successful...

@matthewrmshin
Copy link
Contributor

OK.

@hjoliver
Copy link
Member Author

Fixed.

Copy link
Contributor

@matthewrmshin matthewrmshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments on style. Change is otherwise OK.

def set_incomplete(self, message):
"""Set output message to incomplete."""
if message in self._by_message:
self._by_message[message][_IS_COMPLETED] = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just use self.set_completed(message, is_completed=False)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'll refactor these completion methods slightly...

if (itask.state.is_greater_than(TASK_STATUS_RUNNING) and not
itask.state.confirming_with_poll):
itask.state.confirming_with_poll = True
poll_func(self.suite, [itask], msg=poll_msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a return here as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good spot

itask.state.confirming_with_poll):
itask.state.confirming_with_poll = True
poll_func(self.suite, [itask], msg=poll_msg)
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a block here that is repeated multiple times with only 1 difference. I wonder if it is worth moving this little bit of logic into a separate private method or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed (I recall considering that at the time, but forgot to come back to it...)

@hjoliver
Copy link
Member Author

hjoliver commented Sep 12, 2017

Feedback addressed, let see if the tests all pass...

@matthewrmshin
Copy link
Contributor

Something has gone wrong with these:

./tests/restart/30-outputs.t
./tests/suite-state/05-message.t

@hjoliver
Copy link
Member Author

All tests good now.

@@ -548,6 +547,7 @@ def upg(cfg, descr):
u.obsolete('7.2.2', ['cylc', 'simulation mode'])
u.obsolete('7.2.2', ['runtime', '__MANY__', 'dummy mode'])
u.obsolete('7.2.2', ['runtime', '__MANY__', 'simulation mode'])
u.obsolete('7.5.0', ['runtime', '__MANY__', 'enable resurrection'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to update this to 7.6.0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, hang on a minute...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@oliver-sanders oliver-sanders merged commit d658be2 into cylc:master Sep 13, 2017
@oliver-sanders
Copy link
Member

This pull appears to have had a detrimental impact on CPU for the diamond suite, though a positive impact on memory.

memory
cpu-time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
efficiency For notable efficiency improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants