-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
_result_handler dies on raised exceptions [multiprocessing] #83371
Comments
Raising an Exception in a callback handler of apply_async and/or map_async will kill the _result_handler thread. This causes unexpected behavior as all subsequent callbacks won't be called and worse, pool.join will deadlock. The documentation states that callbacks should return immediately but it does not warn the user against raising an exception. Attached are steps to reproduce. |
…cess pools User-supplied callbacks are called from an internal pool management thread. At present any exceptions they raise are not caught and so propagate out and kill the thread. This then causes problems for subsequent pool operations, including joining the pool hanging. As a QoL improvement, catch and handle any such exceptions using the system exception hook. Thus by default details of the exception will be printed to stderr, but the pool's integrity will remain intact.
could we instead raise the exception out of AsyncResult.get ? |
I don't think that would be appropriate: a result has been successfully computed with no exception, IMO it should be returned from
Whoops, I was totally mistaken about there being multiple callbacks, sorry! However I still don't think a callback failing should change the computed result. |
it doesn't change the computed result, only the exception. because this bug totally hoses the multiprocessing Pool were sort of free to change the behaviour however we want. It doesn't matter if code could break it's already broken! So we could raise an ExceptionGroup out of AsyncResult.get() or perhaps we could mark the Pool as broken so all methods and AsyncResult.get() raise |
Sure, but from the point of view of the calling code it is the same thing. They call
Right. I've never used exception groups but AIUI starting to throw them from a method that previously didn't is a breaking API change. However, it doesn't really matter in this case, since if it happens the program would have been badly broken previously anyway. Is that correct? So, the semantics of
Existing exception handling code will not catch these new exception groups, which is good: we wouldn't want them to be treated as "normal" failures.
I like the sound of this, but it does get a bit complicated. If we are E.g. consider exiting with a "normal" error (note if we didn't do the with Pool() as pool:
res = pool.async_apply(foo, callback=raising)
# Ensure the task has run and set the error, but don't let it propagate out
try:
res.get()
except CallbackError:
pass
raise ValueError("error") ...and exiting via callback error: with Pool() as pool:
pool.apply_async(foo, callback=raising).get() We can't avoid the duplication by checking if it is already in the list of sub-exceptions: that fails for managers which use proxies and unpickles the same exception group into two separate instances. Instead I've skipped adding the exception if it is a I Edited to add: looks like it is not working properly after all, I will continue to debug tomorrow, but let me know if you have any comments on the approach in general in the meantime. |
(the interpreter doesn't crash per se, just that the interpreter hangs; but no segmentation fault occurs, which is why I changed the labels) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: