bpo-38267: Add thread timeout parameter to `loop.shutdown_default_executor()` #16360

aeros · 2019-09-24T20:23:26Z

https://bugs.python.org/issue38267

aeros · 2019-09-24T20:24:41Z

Adding documentation changes and Misc/NEWS next. Before doing so, I wanted to make sure the implementation and API were correct.

Lib/asyncio/base_events.py

aeros · 2019-09-24T20:27:19Z

Lib/asyncio/base_events.py

@@ -66,6 +66,8 @@
 # Maximum timeout passed to select to avoid OS limitations
 MAXIMUM_SELECT_TIMEOUT = 24 * 3600

+# Default timeout for joining each thread in `shutdown_default_executor()`
+THREAD_JOIN_TIMEOUT = 300


Let me know if this name is suitable, or if anyone has a better suggestion.

Lib/asyncio/base_events.py

aeros · 2019-09-25T00:44:09Z

Lib/asyncio/runners.py

@@ -21,6 +25,10 @@ def run(main, *, debug=False):
    It should be used as a main entry point for asyncio programs, and should
    ideally only be called once.

+    Each thread within the threadpool is given a timeout duration of 5
+    minutes. If the thread hasn't finishing joining within that duration,


@1st1 While writing the documentation, I was thinking that @asvetlov's suggestion to add timeout as parameter for asyncio.run() might be beneficial. Although it would add an additional knob for users to adjust, forcing them to strictly use a 5 minute timeout duration seems a bit restrictive.

Do you think it might be useful to give users the option to adjust the timeout duration by adding a timeout parameter to asynio.run()? Since you regularly interact with daily users of asyncio, you might have a better idea of whether or not that would be useful.

I think that adding knobs is only justified when there's request for that from users. In this particular case it's a bit tricky.

The way I think about it is that it's easy to implement whatever timeout is necessary with the following pattern:

asyncio.run(main()) # versus async def runner(): try: return await main() finally: loop = asyncio.get_running_loop() await loop.shutdown_default_executor(3600) asyncio.run(runner())

The argument for this pattern is that in more or less advanced asyncio applications there's always a "runner()" like in the above. It usually sets up signal handlers, manages global state, etc. asyncio.run() in those cases is simply providing a set of reasonable defaults on how you run your event loop and how you shut it down. And 5 minutes is a reasonable default in this case, I guess.

Long story short, I'd go with this design and learn from our users if we need to make it configurable.

This is just my opinion on this and I'd love to hear what @asvetlov thinks about it.

The argument for this pattern is that in more or less advanced asyncio applications there's always a "runner()" like in the above.

Ah, I actually hadn't thought as much about how relatively simple it would be for the users to create their own runner. I wonder if it would be worthwhile to document this pattern somewhere in asyncio-task.rst, for those who desire further customization.

And 5 minutes is a reasonable default in this case, I guess.

Yeah 5 minutes seems reasonable for most purposes. If it's an issue, we can easily adjust it by modifying the constant.

Long story short, I'd go with this design and learn from our users if we need to make it configurable.

Good point. It might be a better idea to leave out the parameter initially and then add it in later if the users request it. It would be easy to add the new parameter to the API for asyncio.run() if it's desired, but it's incredibly difficult to remove parameters in the future.

This is just my opinion on this and I'd love to hear what @asvetlov thinks about it.

Personally I have no issue with either of the options, I think both of them would be viable. After reading your latest response, I'm leaning more in favor of keeping the current iteration though.

The more I've been involved with API design, I've increasingly come to realize that there's frequently some degree of disconnect between what the users actually want and what the developers think the users want. It can definitely be worthwhile to start with a minimalist approach and add features based on user feedback, rather than making assumptions. Making too many assumptions can easily lead to feature bloat.

aeros · 2019-09-25T22:36:13Z

I can work on resolving the conflicts later tonight or tomorrow after someone gets the chance to look over #16403. The conflicts were probably created from the recently merged #16337. They should be fairly straightforward to resolve though.

1st1 · 2019-10-03T21:23:59Z

Do we really have it configured to wait for 600 seconds? Wasn't it 5 seconds?

I think it might be more straightforward from a design standpoint to have a specific timeout per-thread.

How? It's a threadpool. From the standpoint of the user all threads in the pool are equal.

I also think you're confused here. When you join a threadpool with an N seconds timeout it will simultaneously join all of its threads, giving all of them N seconds. The only difference is if we join all threads simultaneously or join them one by one. Doing the latter isn't a commonly accepted practice, and in reality is just wasteful.

Each thread within the threadpool is given a timeout duration of 5
minutes. If the thread hasn't finishing joining within that duration,
the lock is released and the thread is terminated.

@asvetlov will Python actually terminate a thread if we try to join it and it hits a timeout? IIRC the thread will simply continue to run, no?

aeros · 2019-10-03T21:38:56Z

@1st1

Do we really have it configured to wait for 600 seconds? Wasn't it 5 seconds?

Oops, 600 was a typo, I mean 300 seconds/5 minutes (which is what it currently is). In the original comment, you had suggested 5 minutes:

We can have a global asyncio "reasonable" timeout for shutting down threads -- say 5 minutes

will Python actually terminate a thread if we try to join it and it hits a timeout?

No, not from thread.join(timeout), I was just looking over the implementation yesterday in threading.py. When timeout is not None for thread.join(), the join operation stops blocking once the duration has reached, but the thread remains alive. The way to check if the timeout was reached is with thread.is_alive(). I had misunderstood this part when I initially opened the PR.

Instead, we would have to do something like this, to shutdown the ThreadPoolExecutor when the thread did not join within the timeout duration. I think a warning of some form should be emitted, but that's optional:

    async def shutdown_default_executor(self, timeout=None):
        self._executor_shutdown_called = True
        if self._default_executor is None:
            return
        future = self.create_future()
        thread = threading.Thread(target=self._do_shutdown, args=(future,))
        thread.start()
        try:
            await future
        finally:
            thread.join(timeout)
        if (thread.is_alive()):
            # Timeout was reached before threads finished joining
            # emit warning?
            self._default_executor.shutdown(wait=False)

    def _do_shutdown(self, future):
        try:
            self._default_executor.shutdown(wait=True)
            self.call_soon_threadsafe(future.set_result, None)
        except Exception as ex:
            self.call_soon_threadsafe(future.set_exception, ex)

aeros · 2019-10-03T21:42:50Z

@1st1:

I also think you're confused here. When you join a threadpool with an N seconds timeout it will simultaneously join all of its threads, giving all of them N seconds.

I was under the impression that the threads were being joined concurrently rather than in parallel. Meaning that there's cooperative multitasking being done with overlapping time periods from the threads joining, but each thread is not being joined simultaneously with true parallelism. Am I misunderstanding something?

1st1 · 2019-10-03T21:59:22Z

I was under the impression that the threads were being joined concurrently rather than in parallel.

Oh well, in CPython it's grey area :) Most of the time all threads work cooperatively, i.e. that's how the interpreter executes them. The GIL does serialize how opcodes are executed forcing threaded code to cooperate. But thinking about threads in Python in that way isn't productive. Just like it's not really productive to think about parallel vs concurrent threading in languages without the GIL -- ideally your code should be written in such a way that it doesn't care (i.e. access to shared data is properly guarded and serialized).

Trying to give all threads an equal opportunity to finalize & close isn't the job of asyncio code though. The GIL is a limitation that imposes certain restrictions on how threads work in CPython, but we shouldn't care. We should assume that threads in Python work in parallel otherwise it would be insanely hard to reason about them. Moreover, what if you write code that assumes that threads in CPython run cooperatively and then we ship CPython without GIL? :) [that won't happen anytime soon]

Therefore I see no need for a loop like [th.join(timeout=N) for th in executor.threads]. Simply waiting on executor.join_all_threads(timeout=N) (the code snippets are figurative) is enough.

I hope I made myself clear here, feel free to ask more questions.

aeros · 2019-10-03T22:42:07Z

@1st1

But thinking about threads in Python in that way isn't productive.

Ah, I might have had the difference between concurrency vs parallelism a bit excessively drilled into my head then over the last year. It's been a strong area of focus for me recently and I've tried to ensure that I don't mix up the two areas, but perhaps it's been to a fault.

We should assume that threads in Python work in parallel otherwise it would be insanely hard to reason about them.

I can definitely understand this point from the perspective of structuring the code, but from the real-time perspective of the actual amount of time it will take to finish joining the threads, doesn't the amount of time scale with each additional thread that needs to be joined?

I.E. if you have a thread X that takes N amount of time to finishing joining, increasing the count of X to join in the thread pool will increase N. From my understanding, it's not exactly linear, but the total amount of time does increase with each additional thread to join in the current implementation of the GIL. But perhaps this is an unneeded layer of complexity that most users wouldn't consider when specifying a timeout, so from an API perspective it makes more sense for the timeout to be for the thread pool?

I think we should think about it somewhat for creating a reasonable default timeout constant though. I'm not sure if the currently 300 seconds is enough, but if you think it will fit most general use cases for asyncio.run() and we emit some form of warning when it times out it should be fine. This would make it easier for easier to realize that they may have to use their own custom implementation for edge cases.

Moreover, what if you write code that assumes that threads in CPython run cooperatively and then we ship CPython without GIL? :) [that won't happen anytime soon]

I certainly hope that will happen some day! It's actually been a goal of mine to be able to contribute towards that effort, believe it or not. I still have a lot to learn about the GIL and the underlying interpreter implementation before I could even consider trying to help make the GIL unneeded, but I hope that I'm knowledgeable enough to do so at some point in time. (:

I hope I made myself clear here, feel free to ask more questions.

Thanks, this is a fairly complex area and there's definitely a lot that I have to learn. I think that I understand it from an abstract perspective, but sometimes it can be difficult to separate details from the categories of:

Productive - what developers/users of asyncio are/should be thinking about
Underlying implementation - how it actually works, such as GIL limitations
Theoretical - abstract ideas, such as concurrency vs parallelism

Optimally while developing asyncio, I should have an understanding of 2 and 3 while providing an API that focuses on 1. But it's certainly easier said than done sometimes. Do you typically have a strategy for not getting lost in the details and focusing on the productive user experience?

Still need to implement "a warning is emitted and the threadpool is closed"

Doc/library/asyncio-eventloop.rst

1st1 · 2019-10-03T23:06:24Z

Do you typically have a strategy for not getting lost in the details and focusing on the productive user experience?

No strategy, just experience of dealing with code and people. But that's highly subjective and thus I can be wrong. That's why you always want to gather opinions from multiple engineers. In the case of asyncio those multiple engineers is @asvetlov :)

Lib/asyncio/base_events.py

aeros · 2019-10-05T19:33:43Z

Lib/asyncio/base_events.py

+        if (thread.is_alive()):
+            warnings.warn("The ThreadPoolExecutor did not finishing joining"
+                             f"its threads within {timeout} seconds.",
+                             RuntimeWarning)


@1st1 Couple of questions regarding the warning:

Should it be a RuntimeWarning or ResourceWarning?

~~Should I add a test to test_asyncio/test_base_events.py to make sure the the warning is triggered in an appropriate scenario?~~

Edit: Regarding the test, I don't think there's a way to test for this without creating a dangling thread due to the nature of the warning. Personally, I don't think it's worth adding an intentional resource leak to the asyncio regression tests just to test the functionality of a single warning. If there's another way to test for the warning without creating a dangling thread, let me know.

Set the stacklevel to 2 and fix a spacing issue in the message

aeros · 2020-01-20T23:24:48Z

@1st1 Is there anything that needs to be changed in this PR?

gvanrossum

LGTM except for two style nits. I'll try to merge my suggestions.

Lib/asyncio/base_events.py

Doc/library/asyncio-task.rst

gvanrossum · 2022-09-25T18:33:20Z

@kumaraditya303 Do you have time to fix the merge conflicts?

kumaraditya303 · 2022-09-26T10:53:49Z

Do you have time to fix the merge conflicts?

I don't have permission to push to this PR so I created #97561.

aeros added 2 commits September 24, 2019 15:40

Add thread timeout constant

700cf86

Add timeout param to loop.shutdown_default_executor()

1e7794b

aeros requested review from 1st1 and asvetlov as code owners September 24, 2019 20:23

the-knights-who-say-ni added the CLA signed label Sep 24, 2019

bedevere-bot added the awaiting review label Sep 24, 2019

aeros commented Sep 24, 2019

View reviewed changes

Lib/asyncio/base_events.py Outdated Show resolved Hide resolved

aeros commented Sep 24, 2019

View reviewed changes

aeros added the topic-asyncio label Sep 24, 2019

1st1 added needs backport to 3.8 only security fixes and removed needs backport to 3.8 only security fixes labels Sep 24, 2019

1st1 reviewed Sep 24, 2019

View reviewed changes

Lib/asyncio/base_events.py Outdated Show resolved Hide resolved

aeros and others added 12 commits September 24, 2019 19:25

Set timeout param default to None

a262c45

Update `loop.shutdown_default_executor() docstring

b1bb810

Move constant to runners.py

8a4c30c

Add thread timeout for asyncio.run()

ebc1b99

Update asyncio.run() docstring for shutdown_default_executor()

5b90766

Update asyncio.run() docstring for thread timeout

f189f04

Update asyncio.run() docs

16921a1

Fix whitespace in asyncio.run() docstring

b678818

Update shutdown_default_executor() docs

e332257

Patchcheck

7f8428b

📜🤖 Added by blurb_it.

577b2b2

Fix Misc/NEWS Sphinx role

b673080

aeros commented Sep 25, 2019

View reviewed changes

aeros requested a review from 1st1 September 25, 2019 00:44

aeros added the type-feature A feature request or enhancement label Sep 25, 2019

aeros mentioned this pull request Sep 30, 2019

Fix and improve asyncio.run() docs #16403

Merged

aeros added 2 commits October 3, 2019 17:15

Fix documentation for timeout parameter

f7c623c

Fix documentation for timeout parameter

169d2d4

aeros added 3 commits October 3, 2019 18:50

Fix asyncio.run() docs

f0dde2d

Still need to implement "a warning is emitted and the threadpool is closed"

Fix loop.shutdown_default_executor() docstring

b7c0608

Fix loop.shutdown_default_executor docs

1e95afb

aeros commented Oct 3, 2019

View reviewed changes

Doc/library/asyncio-eventloop.rst Show resolved Hide resolved

aeros added 3 commits October 3, 2019 20:48

Fix loop.shutdown_default_executor()

2b1f35d

Fix asyncio.run() docstring

e549996

Fix runners.py comment

8a5121c

1st1 reviewed Oct 4, 2019

View reviewed changes

Lib/asyncio/base_events.py Outdated Show resolved Hide resolved

Fix warning

4264c04

aeros commented Oct 5, 2019

View reviewed changes

aeros requested a review from 1st1 October 5, 2019 22:13

Fix RuntimeWarning

e9a0f19

Set the stacklevel to 2 and fix a spacing issue in the message

aeros mentioned this pull request Apr 10, 2022

Add thread timeout for loop.shutdown_default_executor #82448

Closed

ezio-melotti removed the CLA signed label Jul 13, 2022

gvanrossum approved these changes Sep 25, 2022

View reviewed changes

Lib/asyncio/base_events.py Outdated Show resolved Hide resolved

Doc/library/asyncio-task.rst Outdated Show resolved Hide resolved

bedevere-bot added awaiting merge and removed awaiting review labels Sep 25, 2022

Fix style nits

d9c2e37

kumaraditya303 closed this Sep 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bpo-38267: Add thread timeout parameter to `loop.shutdown_default_executor()` #16360

bpo-38267: Add thread timeout parameter to `loop.shutdown_default_executor()` #16360

aeros commented Sep 24, 2019 •

edited by bedevere-bot

aeros commented Sep 24, 2019 •

edited

aeros Sep 24, 2019

aeros Sep 25, 2019 •

edited

1st1 Sep 25, 2019 •

edited

aeros Sep 25, 2019 •

edited

aeros commented Sep 25, 2019

1st1 commented Oct 3, 2019

aeros commented Oct 3, 2019 •

edited

aeros commented Oct 3, 2019 •

edited

1st1 commented Oct 3, 2019 •

edited

aeros commented Oct 3, 2019

1st1 commented Oct 3, 2019 •

edited

aeros Oct 5, 2019 •

edited

aeros commented Jan 20, 2020

gvanrossum left a comment

gvanrossum commented Sep 25, 2022

kumaraditya303 commented Sep 26, 2022

bpo-38267: Add thread timeout parameter to loop.shutdown_default_executor() #16360

bpo-38267: Add thread timeout parameter to loop.shutdown_default_executor() #16360

Conversation

aeros commented Sep 24, 2019 • edited by bedevere-bot

aeros commented Sep 24, 2019 • edited

aeros Sep 24, 2019

Choose a reason for hiding this comment

aeros Sep 25, 2019 • edited

Choose a reason for hiding this comment

1st1 Sep 25, 2019 • edited

Choose a reason for hiding this comment

aeros Sep 25, 2019 • edited

Choose a reason for hiding this comment

aeros commented Sep 25, 2019

1st1 commented Oct 3, 2019

aeros commented Oct 3, 2019 • edited

aeros commented Oct 3, 2019 • edited

1st1 commented Oct 3, 2019 • edited

aeros commented Oct 3, 2019

1st1 commented Oct 3, 2019 • edited

aeros Oct 5, 2019 • edited

Choose a reason for hiding this comment

aeros commented Jan 20, 2020

gvanrossum left a comment

Choose a reason for hiding this comment

gvanrossum commented Sep 25, 2022

kumaraditya303 commented Sep 26, 2022

bpo-38267: Add thread timeout parameter to `loop.shutdown_default_executor()` #16360

bpo-38267: Add thread timeout parameter to `loop.shutdown_default_executor()` #16360

aeros commented Sep 24, 2019 •

edited by bedevere-bot

aeros commented Sep 24, 2019 •

edited

aeros Sep 25, 2019 •

edited

1st1 Sep 25, 2019 •

edited

aeros Sep 25, 2019 •

edited

aeros commented Oct 3, 2019 •

edited

aeros commented Oct 3, 2019 •

edited

1st1 commented Oct 3, 2019 •

edited

1st1 commented Oct 3, 2019 •

edited

aeros Oct 5, 2019 •

edited