-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
asyncio: BaseEventLoop.close() shutdowns the executor without waiting causing leak of dangling threads #78218
Comments
AMD64 FreeBSD 10.x Shared 3.7: ... Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2) ok |
It's easy to reproduce the issue on Linux: diff --git a/Lib/test/test_asyncio/test_events.py b/Lib/test/test_asyncio/test_events.py
index 11cd950df1..df4c2b9849 100644
--- a/Lib/test/test_asyncio/test_events.py
+++ b/Lib/test/test_asyncio/test_events.py
@@ -359,7 +359,7 @@ class EventLoopTestsMixin:
called = True
def run():
- time.sleep(0.05)
+ time.sleep(1.0)
f2 = self.loop.run_in_executor(None, run)
f2.cancel() The problem is that BaseEventLoop.close() shutdowns its default executor without waiting: def close(self):
...
executor = self._default_executor
if executor is not None:
self._default_executor = None
executor.shutdown(wait=True) I fixed a similar issue in socketserver:
I suggest to wait by default, but maybe also add a block_on_close attribute to BaseEventLoop (default: False) just for backward compatibility. What do you think Yury, Andrew, and Guido? |
Yury, Andrew: Do you know if the executor doesn't wait on purpose? Would it be possible to change that in Python 3.8? |
Maybe. At least we need to add a "timeout" argument to asyncio.run() to let it wait for executor jobs. I'm also thinking about making OS threads cancellable/interruptable in Python 3.8 (if they run pure Python code). |
The shutdown() method of concurrent.futures.Executor API doesn't accept a timeout. It waits for multiple things. I added "block_on_close = True" class attribute to socketserver.ForkingMixIn and socketserver.ThreadingMixIn. By default, server_close() waits until all children complete, but the wait is non-blocking if block_on_close is false. |
I wrote PR 13786 to fix this issue. |
"executor.shutdown(wait=False)" was added by this change: commit 4ca7355
Guido van Rossum proposed to wait for the executor. Guido proposed " executor.shutdown(wait=False)", but he didn't comment why wait=False: |
Łukasz Langa reverted my change (PR 13786). I pushed my change without any review to try to fix https://bugs.python.org/issue37137 which blocked the Python 3.8 beta1. Sadly, my change didn't fix this issue. The root cause was differently (and it's now fixed). Andrew Svetlov asked to make some changes on my API. Yury Selivanov added "I don't like this last minute api decision. Please don't merge this.". See PR 13786 for the discussion. In short, my change was reverted because of the API. But I still consider that calling executor.shutdown(wait=False) in loop.close() in wrong: we must wait until it finishes. Otherwise, the code "leaks" threads which continue to run and so can cause troubles. I would prefer to wait by default. Andrew, Yury: Would you be interested to rewrite my change in the "asyncio way"? IMHO this change can wait Python 3.9, it's not a critical bug. |
I still think that this is a valuable change but we need to discuss public API. In fact, in my job we used to override default executor and wait for its shutdown with explicit waiting to make the system robust. The problem is that loop.close() can "hang" for minutes if there are pending background jobs or slow DNS resolution requests. We controlled our tasks running in the thread pool carefully and used aiodns for DNS resolving. |
Maybe concurrent.futures should be enhanced for that? Or from asyncio, is it possible to poll the executor and emit a warning if some threads take longer than <threshold in esconds>? It would be great if we could display repr() of the slow job in that case. Yury also asked for a timeout, but it's not supported by concurrent.futures yet, and I'm not sure what should be done in case of timeout? Retry? Log a warning? |
I prefer postponing to think about it until all required 3.8 tasks are done :) |
This bug still occurs randomly on the CI. Example on AppVeyor: test_reader_callback (test.test_asyncio.test_events.SelectEventLoopTests) ... ok Any plan to reapply my change, or fix it? |
I can try to help with this. I'm not the most familiar with the internals of asyncio, but I think it would provide a good learning experience. |
I've opened PR-15735 which applies the same functionality as Victor's PR-13786, but adds the public getter and setter methods (for both AbstractEventLoop and BaseEventLoop) as requested by Andrew. Since this is still causing intermittent CI failures from test_asyncio, I think it's important to implement these changes in some form in the near future. |
I change the version to Python 3.9: the 3.8 branch don't accept new features anymore. |
Here's the API I propose to solve this problem: #15735 (review) Summary:
|
"shutdown_threadpool()" name What do you think of the "shutdown_default_executor()" name? The default executor can be overriden by set_default_executor(): def set_default_executor(self, executor):
if not isinstance(executor, concurrent.futures.ThreadPoolExecutor):
warnings.warn(
'Using the default executor that is not an instance of '
'ThreadPoolExecutor is deprecated and will be prohibited '
'in Python 3.9',
DeprecationWarning, 2)
self._default_executor = executor The default executor should always be a thread pool, so "shutdown_threadpool()" name is also correct. I have no strong preference. |
Yeah, I think it's a better name! |
Thnks, Kyle! |
asyncio.run() now shutdowns the default executor: nice, thanks! That was my request. IMHO it will make asyncio more reliable, especially for tests on the CI. If it becomes an issue in Python 3.9 (executor hangs forever), we can add new parameters (to run()?) to put a timeout for example. |
No problem, and thanks for all of the help from Andrew, Yury, and Victor!
Awesome, that was my primary intention. (:
Feel free to add me to the nosy list if you have to open an issue for it, I'd be glad to help out with this if it becomes an issue. |
I got a report from a library that ties together asyncio and some other async libraries, getting errors like this: tests/test_taskgroups.py:60: in test_run_natively (more at: https://bugzilla.redhat.com/show_bug.cgi?id=1817681#c1 ) I'm not all that familiar with asyncio, but it seems to me that all event loops inherited from BaseEventLoop must be updated to implement this new method to correctly work with run() in Python 3.9. Is that right? If it is, this needs at least a much more prominent What's New entry. Or the hard NotImplementedError should turn into a warning. |
I reopen the issue.
That means that the project (python-anyio) implements its own event loop which inherits from AbstractEventLoop, it doesn't use BaseEventLoop which implements shutdown_default_executor(). AbstractEventLoop documentation says: https://docs.python.org/dev/library/asyncio-eventloop.html#asyncio.AbstractEventLoop "The Event Loop Methods section lists all methods that an alternative implementation of AbstractEventLoop should have defined." It points to the following documentation which lists shutdown_asyncgens(): It points to https://docs.python.org/dev/library/asyncio-eventloop.html#asyncio-event-loop
Yes
Raising NotImplementedError is a deliberate design choice. Documentation the new requirement in the following section sounds like a good idea. https://docs.python.org/dev/whatsnew/3.9.html#changes-in-the-python-api Kyle: Can you please add a short sentence there to document the new requirement? |
Yep, I'll open a PR soon. I'm glad this is coming up now rather than post-release, thanks for bringing attention to it. |
PR-19634 should address the primary issue of documenting in the 3.9 whatsnew that alternative event loops should implement loop.shutdown_default_executor(). For reference, here's the implementation in BaseEventLoop: cpython/Lib/asyncio/base_events.py Lines 556 to 574 in 9c82ea7
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: