task/runner: Fix NRE in publishing task result #92
After a recent change that turned `ErrYieldExecution` into a `ContinueAsync` output, I introduced a null reference exception on line 202 of the `runner` any time a task failed: the output was `nil`, so it panicked.

A panic at that point in the code won't restart the process; instead it is recovered by the AMQP consumer itself, which will `nack` the event instead of `ack`ing it. This means the event goes back to the front of the queue, so any failing task would keep one task-runner goroutine stuck.

After 15 failed tasks in each region, the `task-runner` would stop executing any tasks at all! It just entered an endless loop of picking up a task, failing, panicking, putting the task back at the front of the queue, and repeating.

We probably need an alert for increased panics as well. We had that on Papertrail, but I don't think we have created one on Loki yet?