Fix worker-level stuck job timeout #1133
Merged
+5
−3
Fix a bug introduced in #1126 in which we were correctly calculating
the timeout, but then not passing it down to the stuck job function
when starting the stuck detection goroutine.
There is a test that was checking this worked, but due to the nature of
the bug, it was in effect detecting a stuck job after 0s and therefore
passing by accident. I looked into ways to add additional testing here,
but elected not to add more because they'd involve the sort of test I
really hate, one that has to wait an arbitrary amount of time to check that
something did not happen, introducing both slowness and intermittency.
After the fix here lands, this is the sort of thing that's not too
likely to regress, and should be noticed quickly in case it does.
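
For readers who want to see the failure mode, here is a minimal sketch of it. It is not the code from this PR: `resolveTimeout`, `watchStuck`, and `defaultTimeout` are hypothetical names invented for illustration, and the real worker code is structured differently. The sketch only shows why a correctly calculated timeout that never reaches the watcher goroutine makes a job look stuck after 0s.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTimeout is a hypothetical client-level fallback, not a real setting.
const defaultTimeout = time.Hour

// resolveTimeout is a hypothetical helper: prefer the worker-level timeout
// when one is set, otherwise fall back to the client default.
func resolveTimeout(workerTimeout, clientDefault time.Duration) time.Duration {
	if workerTimeout > 0 {
		return workerTimeout
	}
	return clientDefault
}

// watchStuck starts a goroutine that sends true if the job has not finished
// (done is not closed) before timeout elapses, and false otherwise.
func watchStuck(timeout time.Duration, done <-chan struct{}) <-chan bool {
	stuck := make(chan bool, 1)
	go func() {
		timer := time.NewTimer(timeout)
		defer timer.Stop()
		select {
		case <-timer.C:
			stuck <- true
		case <-done:
			stuck <- false
		}
	}()
	return stuck
}

func main() {
	timeout := resolveTimeout(0, defaultTimeout) // calculated correctly
	done := make(chan struct{})                  // open: the job is still running

	// Shape of the bug: the calculated timeout is dropped and a zero
	// duration reaches the watcher, so the timer fires immediately.
	fmt.Println("zero timeout reports stuck:", <-watchStuck(0, done))

	// Shape of the fix: pass the calculated timeout through. Closing done
	// simulates the job finishing well before the timeout.
	close(done)
	fmt.Println("passed-through timeout reports stuck:", <-watchStuck(timeout, done))
}
```

A test that only asserts "a stuck job is eventually detected" passes against the zero-timeout call too, which is how the existing test passed by accident.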
Fixes #1125.