New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-38267: Add thread timeout parameter to loop.shutdown_default_executor()
#16360
bpo-38267: Add thread timeout parameter to loop.shutdown_default_executor()
#16360
Conversation
Adding documentation changes and Misc/NEWS next. Before doing so, I wanted to make sure the implementation and API were correct. |
Lib/asyncio/base_events.py
Outdated
@@ -66,6 +66,8 @@ | |||
# Maximum timeout passed to select to avoid OS limitations | |||
MAXIMUM_SELECT_TIMEOUT = 24 * 3600 | |||
|
|||
# Default timeout for joining each thread in `shutdown_default_executor()` | |||
THREAD_JOIN_TIMEOUT = 300 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if this name is suitable, or if anyone has a better suggestion.
Lib/asyncio/runners.py
Outdated
@@ -21,6 +25,10 @@ def run(main, *, debug=False): | |||
It should be used as a main entry point for asyncio programs, and should | |||
ideally only be called once. | |||
|
|||
Each thread within the threadpool is given a timeout duration of 5 | |||
minutes. If the thread hasn't finishing joining within that duration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@1st1 While writing the documentation, I was thinking that @asvetlov's suggestion to add timeout as parameter for asyncio.run()
might be beneficial. Although it would add an additional knob for users to adjust, forcing them to strictly use a 5 minute timeout duration seems a bit restrictive.
Do you think it might be useful to give users the option to adjust the timeout duration by adding a timeout parameter to asynio.run()
? Since you regularly interact with daily users of asyncio, you might have a better idea of whether or not that would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that adding knobs is only justified when there's request for that from users. In this particular case it's a bit tricky.
The way I think about it is that it's easy to implement whatever timeout is necessary with the following pattern:
asyncio.run(main())
# versus
async def runner():
try:
return await main()
finally:
loop = asyncio.get_running_loop()
await loop.shutdown_default_executor(3600)
asyncio.run(runner())
The argument for this pattern is that in more or less advanced asyncio applications there's always a "runner()" like in the above. It usually sets up signal handlers, manages global state, etc. asyncio.run()
in those cases is simply providing a set of reasonable defaults on how you run your event loop and how you shut it down. And 5 minutes is a reasonable default in this case, I guess.
Long story short, I'd go with this design and learn from our users if we need to make it configurable.
This is just my opinion on this and I'd love to hear what @asvetlov thinks about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument for this pattern is that in more or less advanced asyncio applications there's always a "runner()" like in the above.
Ah, I actually hadn't thought as much about how relatively simple it would be for the users to create their own runner. I wonder if it would be worthwhile to document this pattern somewhere in asyncio-task.rst
, for those who desire further customization.
And 5 minutes is a reasonable default in this case, I guess.
Yeah 5 minutes seems reasonable for most purposes. If it's an issue, we can easily adjust it by modifying the constant.
Long story short, I'd go with this design and learn from our users if we need to make it configurable.
Good point. It might be a better idea to leave out the parameter initially and then add it in later if the users request it. It would be easy to add the new parameter to the API for asyncio.run()
if it's desired, but it's incredibly difficult to remove parameters in the future.
This is just my opinion on this and I'd love to hear what @asvetlov thinks about it.
Personally I have no issue with either of the options, I think both of them would be viable. After reading your latest response, I'm leaning more in favor of keeping the current iteration though.
The more I've been involved with API design, I've increasingly come to realize that there's frequently some degree of disconnect between what the users actually want and what the developers think the users want. It can definitely be worthwhile to start with a minimalist approach and add features based on user feedback, rather than making assumptions. Making too many assumptions can easily lead to feature bloat.
Do we really have it configured to wait for 600 seconds? Wasn't it 5 seconds?
How? It's a threadpool. From the standpoint of the user all threads in the pool are equal. I also think you're confused here. When you join a threadpool with an N seconds timeout it will simultaneously join all of its threads, giving all of them N seconds. The only difference is if we join all threads simultaneously or join them one by one. Doing the latter isn't a commonly accepted practice, and in reality is just wasteful.
@asvetlov will Python actually terminate a thread if we try to join it and it hits a timeout? IIRC the thread will simply continue to run, no? |
Oops, 600 was a typo, I mean 300 seconds/5 minutes (which is what it currently is). In the original comment, you had suggested 5 minutes:
No, not from thread.join(timeout), I was just looking over the implementation yesterday in threading.py. When timeout is not None for Instead, we would have to do something like this, to shutdown the async def shutdown_default_executor(self, timeout=None):
self._executor_shutdown_called = True
if self._default_executor is None:
return
future = self.create_future()
thread = threading.Thread(target=self._do_shutdown, args=(future,))
thread.start()
try:
await future
finally:
thread.join(timeout)
if (thread.is_alive()):
# Timeout was reached before threads finished joining
# emit warning?
self._default_executor.shutdown(wait=False)
def _do_shutdown(self, future):
try:
self._default_executor.shutdown(wait=True)
self.call_soon_threadsafe(future.set_result, None)
except Exception as ex:
self.call_soon_threadsafe(future.set_exception, ex) |
I was under the impression that the threads were being joined concurrently rather than in parallel. Meaning that there's cooperative multitasking being done with overlapping time periods from the threads joining, but each thread is not being joined simultaneously with true parallelism. Am I misunderstanding something? |
Oh well, in CPython it's grey area :) Most of the time all threads work cooperatively, i.e. that's how the interpreter executes them. The GIL does serialize how opcodes are executed forcing threaded code to cooperate. But thinking about threads in Python in that way isn't productive. Just like it's not really productive to think about parallel vs concurrent threading in languages without the GIL -- ideally your code should be written in such a way that it doesn't care (i.e. access to shared data is properly guarded and serialized). Trying to give all threads an equal opportunity to finalize & close isn't the job of asyncio code though. The GIL is a limitation that imposes certain restrictions on how threads work in CPython, but we shouldn't care. We should assume that threads in Python work in parallel otherwise it would be insanely hard to reason about them. Moreover, what if you write code that assumes that threads in CPython run cooperatively and then we ship CPython without GIL? :) [that won't happen anytime soon] Therefore I see no need for a loop like I hope I made myself clear here, feel free to ask more questions. |
Ah, I might have had the difference between concurrency vs parallelism a bit excessively drilled into my head then over the last year. It's been a strong area of focus for me recently and I've tried to ensure that I don't mix up the two areas, but perhaps it's been to a fault.
I can definitely understand this point from the perspective of structuring the code, but from the real-time perspective of the actual amount of time it will take to finish joining the threads, doesn't the amount of time scale with each additional thread that needs to be joined? I.E. if you have a thread X that takes N amount of time to finishing joining, increasing the count of X to join in the thread pool will increase N. From my understanding, it's not exactly linear, but the total amount of time does increase with each additional thread to join in the current implementation of the GIL. But perhaps this is an unneeded layer of complexity that most users wouldn't consider when specifying a timeout, so from an API perspective it makes more sense for the timeout to be for the thread pool? I think we should think about it somewhat for creating a reasonable default timeout constant though. I'm not sure if the currently 300 seconds is enough, but if you think it will fit most general use cases for
I certainly hope that will happen some day! It's actually been a goal of mine to be able to contribute towards that effort, believe it or not. I still have a lot to learn about the GIL and the underlying interpreter implementation before I could even consider trying to help make the GIL unneeded, but I hope that I'm knowledgeable enough to do so at some point in time. (:
Thanks, this is a fairly complex area and there's definitely a lot that I have to learn. I think that I understand it from an abstract perspective, but sometimes it can be difficult to separate details from the categories of:
Optimally while developing asyncio, I should have an understanding of 2 and 3 while providing an API that focuses on 1. But it's certainly easier said than done sometimes. Do you typically have a strategy for not getting lost in the details and focusing on the productive user experience? |
Still need to implement "a warning is emitted and the threadpool is closed"
No strategy, just experience of dealing with code and people. But that's highly subjective and thus I can be wrong. That's why you always want to gather opinions from multiple engineers. In the case of asyncio those multiple engineers is @asvetlov :) |
Lib/asyncio/base_events.py
Outdated
if (thread.is_alive()): | ||
warnings.warn("The ThreadPoolExecutor did not finishing joining" | ||
f"its threads within {timeout} seconds.", | ||
RuntimeWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@1st1 Couple of questions regarding the warning:
-
Should it be a
RuntimeWarning
orResourceWarning
? -
Should I add a test totest_asyncio/test_base_events.py
to make sure the the warning is triggered in an appropriate scenario?
Edit: Regarding the test, I don't think there's a way to test for this without creating a dangling thread due to the nature of the warning. Personally, I don't think it's worth adding an intentional resource leak to the asyncio regression tests just to test the functionality of a single warning. If there's another way to test for the warning without creating a dangling thread, let me know.
Set the stacklevel to 2 and fix a spacing issue in the message
@1st1 Is there anything that needs to be changed in this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for two style nits. I'll try to merge my suggestions.
@kumaraditya303 Do you have time to fix the merge conflicts? |
I don't have permission to push to this PR so I created #97561. |
/cc @1st1 @asvetlov
https://bugs.python.org/issue38267