New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-34076: Improve timeout handling by MPGraphExecutor #174
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks okay. My main confusion is why something that is timed out but still finishes before the time out completes can't be treated as a success.
|
||
@property | ||
def state(self): | ||
"""Job processing state (JobState)""" | ||
return self._state | ||
|
||
@property | ||
def terminated(self): | ||
"""Return True if job was killed by stop() method (`bool`)""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but also if the exit code was negative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sort of implied, if stop
was called but did not kill the process then it was not killed by stop. I'll expand the docstring.
jobs.setJobState(job, JobState.FAILED) | ||
if job.terminated: | ||
# Was killed due to timeout. | ||
if self.report.status == ExecutionStatus.SUCCESS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is killed due to time out and reports success why can't that be treated as success? How can you be timed out and successful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is treated as success, terminated
is False for that case, and check for exitcode == 0
is in the if
branch above. This branch is only for the case when it was actually killed by a signal (had non-zero exitcode
)
Detect the case when a task manages to finish successfully before it gets killed due to timeout. This guarantees more consistent status in a quantum report.
Running time for timing-out task increased to 100 seconds, this should reduce the chance for it to finish successfully before timeout is detected by a parent.
63d7cb8
to
19f7295
Compare
Detect the case when a task manages to finish successfully before it
gets killed due to timeout. This guarantees more consistent status in
a quantum report.
Increase running time for timed out task in a unit test.
Running time for timing-out task increased to 100 seconds, this should
reduce the chance for it to finish successfully before timeout is
detected by a parent.
Checklist
doc/changes