Avoid daemon threads in concurrent.futures #83993
Since bpo-37266 (which forbade daemon threads in subinterpreters), we probably want to forgo daemon threads in concurrent.futures. This means we also need a way to run an atexit-like hook before non-daemon threads are joined on (sub)interpreter shutdown. See discussion below.
I spent some further time considering the solution to the problem, and I still think something like an atexit-like hook at the threading level would work. However, I'm not certain regarding the exact details. The simplest pure Python implementation I can think of would be through a global list in threading that holds the registered callbacks, which would be invoked during interpreter shutdown before the non-daemon threads are joined.
Could something like the above pure Python implementation be adequate for our purposes? It seems like it would be to me, but I could very well be missing something. I'll of course have to test whether it works as intended for replacing the daemon threads in concurrent.futures. Another factor to consider is whether or not something like this would be widely useful enough to consider adding as a public API. Thoughts?
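Roughly, the idea could look like the following minimal sketch. The names (`_threading_atexits`, `_register_atexit`, `_run_threading_atexits`) are illustrative only, not a settled API, and the code is imagined as living in `Lib/threading.py`:

```python
# Minimal sketch of a threading-level atexit hook (illustrative names).
# Imagine this living in Lib/threading.py.

_threading_atexits = []  # global list of (func, args, kwargs) callbacks


def _register_atexit(func, *args, **kwargs):
    """Register a callback to run at interpreter shutdown, *before*
    the remaining non-daemon threads are joined (unlike the atexit
    module, whose callbacks only run afterwards)."""
    _threading_atexits.append((func, args, kwargs))


def _run_threading_atexits():
    # Would be called by the interpreter's shutdown logic, just before
    # it joins the remaining non-daemon threads.
    for func, args, kwargs in _threading_atexits:
        func(*args, **kwargs)
```

concurrent.futures could then register a callback that signals and joins its worker threads, instead of relying on daemon threads.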
I reopen the issue since it introduced a regression on buildbots. If no fix is found soon (let's say in 1 or 2 days), I will revert the change to get more time to investigate. If buildbots remain red, we will be other regressions which would make the situation even worse. |
(Oops typo) If buildbots remain red, we will *miss* other regressions which would make the situation even worse. |
Thanks for bringing attention to it, Victor. It seems like a rather odd side effect, considering that PR-19149 had no C code and was internal to concurrent.futures and threading. I did not expect asyncio to depend on the executors using atexit and daemon threads. I'll see if I can figure it out, but this one might be a bit tricky to troubleshoot.
Oh yes, I know that fixing such an issue can be very tricky. For example, there is no threading._unregister_atexit() function. Maybe the callback stays alive after the test completes.
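Building on the earlier sketch, a hypothetical counterpart (again, an illustrative name only) could let a test unregister its callback so nothing keeps references alive after the test finishes:

```python
# Hypothetical counterpart to the sketch above: remove a previously
# registered callback so it no longer holds references after use.
def _unregister_atexit(func):
    global _threading_atexits
    _threading_atexits = [
        (f, args, kwargs) for f, args, kwargs in _threading_atexits
        if f is not func
    ]
```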
I'm currently busy today, but I'll have some time to explore a few different potential solutions tomorrow. If I can't find anything to fix it by the end of the day, I'll prepare a PR to revert PR-19149 and re-apply it later (after more thorough testing to ensure it doesn't cause refleaks). Since I authored the PR, I'd like to take responsibility for addressing the refleak, whether that's reverting the PR or adding a fix on top of it. Would that be acceptable, Victor, or will it need to be reverted sooner than tomorrow? If it needs to be done sooner, I can open a PR to revert it later tonight (in ~4 hours or so, when I'm available to do so). This is the first time a patch of mine has introduced a regression, so I wanted to double-check that tomorrow was okay.
Kyle fixed bpo-40115, so I'm closing this issue again. Thanks Kyle for the test_asyncio fix!
No problem! Thanks for the review. :-) |
Hi all! While browsing StackOverflow I came across this question: https://stackoverflow.com/q/67273533/2111778 The user created a ThreadPoolExecutor which started a Process using multiprocessing. The Process produces an exitcode of 0 in Python 3.8 but an exitcode of 1 in Python 3.9. I'm really not familiar with Python internals, but through monkey-patching Lib/concurrent/futures/thread.py I was able to pin this regression down to the change introduced by this issue.
I know that multiprocessing in Python is a little funky, since I worked with it on my Master's thesis. I think the entire process gets forked (i.e. copied), so the child process also gets a copy of the active ThreadPoolExecutor, which I think causes some sort of problem there. Note that this behavior seems to differ based on OS: I can confirm the issue on Linux with the 'fork' start method, but cannot reproduce it with the 'spawn' and 'forkserver' methods. Could someone with more insight kindly take a look at this? Greetings, Jan <3
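For reference, a minimal sketch of the reported scenario (reconstructed from the linked StackOverflow question, so the exact code is an assumption) looks roughly like this:

```python
# Rough reconstruction of the reported scenario (assumed, not the
# exact code from the StackOverflow question).
import multiprocessing as mp
from concurrent.futures import ThreadPoolExecutor


def child():
    pass  # with "fork", the child inherits the parent's executor state


def start_process():
    p = mp.Process(target=child)
    p.start()
    p.join()
    return p.exitcode


if __name__ == "__main__":
    mp.set_start_method("fork")  # the issue was observed with "fork" on Linux
    with ThreadPoolExecutor() as pool:
        # reported: 0 on Python 3.8, 1 on Python 3.9
        print(pool.submit(start_process).result())
```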
@jan, without taking a look, I'd answer that indeed you should avoid using the "fork" method if you're doing any kind of multithreading in the parent process. "forkserver" is a good choice nowadays on Linux and will result in more robust code. |
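As a sketch of that suggestion (using only standard multiprocessing APIs), selecting the start method explicitly might look like:

```python
# Sketch of the suggested mitigation: choose a safer start method
# explicitly instead of relying on the platform default.
import multiprocessing as mp

if __name__ == "__main__":
    ctx = mp.get_context("forkserver")  # or "spawn"
    p = ctx.Process(target=print, args=("hello from the child",))
    p.start()
    p.join()
```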
The regression that @janfrederik.konopka points out also has its own open issue: bpo-43944. I'm trying to work on a fix for this regression, slowly but surely. Now that I've finally found these threads, this information will be a big help! Any tips are appreciated.
ThreadPoolExecutor always runs its worker threads with daemon=False and even installs an atexit handler to join all of the workers. So, as far as I can tell, it is impossible to exit the process if a thread has hung forever due to a bug; you need a separate sidecar/supervisor process to SIGKILL it. But we actually want to die on a watchdog timeout (see the sketch below). https://stackoverflow.com/questions/49992329/the-workers-in-threadpoolexecutor-is-not-really-daemon python/cpython#83993 We can revisit this if we ever become CPU-bound on spawning a new thread each time versus reusing them. Adjacent changes:
- Executor.submit() no longer calls check_watchdog()
- the router and gateway install heartbeat functions with hypercorn_main to do that instead
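A minimal sketch of the watchdog idea described above; the heartbeat/watchdog names and the 30-second deadline are assumptions for illustration, not the original project's code:

```python
# Minimal watchdog sketch: since ThreadPoolExecutor's non-daemon workers
# keep the process alive, a daemon watchdog thread force-exits the
# process when no heartbeat has been seen within the deadline.
import os
import threading
import time

_last_heartbeat = time.monotonic()


def heartbeat():
    """Call periodically from healthy code paths (e.g. request handlers)."""
    global _last_heartbeat
    _last_heartbeat = time.monotonic()


def _watchdog(deadline_seconds):
    while True:
        time.sleep(1)
        if time.monotonic() - _last_heartbeat > deadline_seconds:
            # os._exit() skips interpreter shutdown, so it does not block
            # on joining hung executor threads the way sys.exit() would.
            os._exit(1)


threading.Thread(target=_watchdog, args=(30,), daemon=True).start()
```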