We hit a pretty unfortunate edge case in this library where one broken process corrupts the entire pool. When that happens, the entire process pool (`self._executor`) is broken and the worker does not recover, but it also does not terminate. As a result, it keeps picking up new jobs and submitting them to the process pool, where they immediately raise a BrokenExecutor exception. For us, this meant the corrupt worker essentially drained our queue and funneled every job directly to the dead queue.

It does look like there is an older attempt to handle this here; however, it is unreachable because this line catches all exceptions first.
The fix in this PR, which we've been using for a while now, recovers from this state by throwing the broken process pool away and letting the next tick create a fresh one. This has been working really well for us, since broken processes are very rare and all the initial jobs impacted by the broken process are retried.
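For reviewers who want the shape of the recovery without reading the diff, here is a minimal sketch of the idea. It is not the code in this PR: the `Worker` class, `run_job`, `_get_executor`, and the `executor_factory` parameter are hypothetical names chosen for illustration. Only `self._executor` and the BrokenExecutor exception come from the description above. The important detail is that the `BrokenExecutor` handler sits before any generic `except Exception` handler, so it is reachable.

```python
from concurrent.futures import BrokenExecutor, Executor, ProcessPoolExecutor
from typing import Any, Callable, Optional


class Worker:
    """Hypothetical worker; names are illustrative, not the library's API."""

    def __init__(
        self, executor_factory: Callable[[], Executor] = ProcessPoolExecutor
    ) -> None:
        self._executor_factory = executor_factory
        self._executor: Optional[Executor] = None

    def _get_executor(self) -> Executor:
        # Create the pool lazily so a discarded pool is replaced on the next tick.
        if self._executor is None:
            self._executor = self._executor_factory()
        return self._executor

    def run_job(self, fn: Callable[..., Any], *args: Any) -> Any:
        try:
            return self._get_executor().submit(fn, *args).result()
        except BrokenExecutor:
            # Must come before any broad `except Exception` handler.
            # Throw the broken pool away; the next call builds a fresh one.
            broken, self._executor = self._executor, None
            if broken is not None:
                broken.shutdown(wait=False)
            # Re-raise so the caller can retry the job (assumed retry policy).
            raise
```

In this sketch the job that hit the broken pool still fails once, and retrying it is left to the caller. What changes is that later jobs get a healthy pool instead of failing immediately.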
This PR also includes a small, useful change that lets the proto client accept all the kwargs that the connection accepts and pass them along. We use that client directly in a few places and need to forward some options to the connection.
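The kwargs forwarding follows the usual pass-through pattern, sketched below. Everything here is an assumption made for illustration: `Connection` is a stand-in class, and its `host`, `timeout`, and extra options are hypothetical, not the library's real signature.

```python
from typing import Any


class Connection:
    """Stand-in for the real connection class; its options are hypothetical."""

    def __init__(
        self, host: str = "localhost", timeout: float = 5.0, **options: Any
    ) -> None:
        self.host = host
        self.timeout = timeout
        self.options = options


class ProtoClient:
    """Illustrative client that forwards every argument to the connection."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # The client does not interpret the options itself; it passes them on.
        self.connection = Connection(*args, **kwargs)


# Usage: options given to the client end up on the connection.
client = ProtoClient("example.internal", timeout=1.5, retries=3)
```

Forwarding `*args` and `**kwargs` untouched means the client does not need to be updated whenever the connection gains a new option.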