test_concurrent_futures: reap_children() warnings on RHEL7 and RHEL8 buildbots #82727
Comments
https://buildbot.python.org/all/#/builders/254/builds/162
test_no_stale_references (test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest) ... 0.56s ok
It looks similar to bpo-38448: "test_concurrent_futures: reap_children() reaped child process 26487 on AMD64 RHEL8 Refleaks 3.x"
I ran "./python -m test --fail-env-changed -w -j50 -F test_concurrent_futures" on my laptop for 5 minutes: system load of 220.57 with a peak of 403 processes... but I failed to reproduce the issue :-(
The system load was at 10.02 when the bug occurred: ...
New warning on AMD64 RHEL8 LTO + PGO 3.x: ...
I closed bpo-38448 as a duplicate of this issue. Copy of its only message: AMD64 RHEL8 Refleaks 3.x: 0:27:13 load avg: 4.88 [416/419/1] test_concurrent_futures failed (env changed) (17 min 11 sec) -- running: test_capi (7 min 28 sec), test_gdb (8 min 49 sec), test_asyncio (23 min 23 sec)
AMD64 Debian PGO 3.x: test_submit_keyword (test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest) ... 0.62s Warning -- reap_children() reaped child process 32627
Ran 168 tests in 160.415s
Another failure on AMD64 RHEL8 LTO + PGO 3.x: https://buildbot.python.org/all/#/builders/284/builds/300/steps/5/logs/stdio
New failure on AMD64 RHEL7 LTO 3.x: test_interpreter_shutdown
I managed to reproduce the issue on the RHEL8 worker using:
I can still reproduce the issue with the PR 17641 fix, so it's not enough. I also applied PR 17640 and 2 extra changes:
diff --git a/Lib/multiprocessing/managers.py b/Lib/multiprocessing/managers.py
index 1f9c2daa25..dbbf84b3d2 100644
--- a/Lib/multiprocessing/managers.py
+++ b/Lib/multiprocessing/managers.py
@@ -661,7 +661,7 @@ class BaseManager(object):
             except Exception:
                 pass
-            process.join(timeout=1.0)
+            process.join(timeout=0.1)
             if process.is_alive():
                 util.info('manager still alive')
                 if hasattr(process, 'terminate'):
diff --git a/Lib/test/support/__init__.py b/Lib/test/support/__init__.py
index 215bab8131..110c7f945e 100644
--- a/Lib/test/support/__init__.py
+++ b/Lib/test/support/__init__.py
@@ -2399,6 +2399,7 @@ def reap_children():
     # Reap all our dead child processes so we don't leave zombies around.
     # These hog resources and might be causing some of the buildbots to die.
+    time.sleep(0.5)
     while True:
         try:
             # Read the exit status of any child process which already completed
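For context, the reaping loop touched by the second hunk can be sketched roughly as below. This is a simplified illustration, not the actual `test.support.reap_children()`: the name `reap_dead_children` and the returned list are assumptions made for the example, and the real helper reports each reaped PID as a warning rather than returning it. The point is that `os.waitpid(-1, os.WNOHANG)` only collects children that have already exited, so a child that exits slightly late shows up as a leftover in a later call.

```python
import os
import time


def reap_dead_children():
    """Collect already-terminated child processes without blocking.

    Hypothetical helper for illustration; returns the list of reaped PIDs.
    """
    reaped = []
    # os.waitpid(-1, os.WNOHANG) is POSIX-only; do nothing elsewhere.
    if not (hasattr(os, "waitpid") and hasattr(os, "WNOHANG")):
        return reaped
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            break  # no child processes left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__" and hasattr(os, "fork"):
    child = os.fork()
    if child == 0:
        os._exit(0)  # child: exit immediately
    time.sleep(0.5)  # give the child time to exit, as the debug patch does
    print("reaped:", reap_dead_children())
```

A test that leaves a child running when it returns will pass, and the leftover child is only noticed when a loop like this runs afterwards, which is consistent with the "env changed" result in the logs above.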
Example of failure, with additional debug prints:
0:01:02 load avg: 42.13 [ 38/1] test_concurrent_futures failed (env changed)
----------------------------------------------------------------------
Ran 6 tests in 17.210s
OK
I can reproduce the issue with the following match file: test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest.test_no_stale_references
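One way to drive such a match file from a script is sketched below. It assumes regrtest's `--matchfile` option together with the `--fail-env-changed` flag quoted earlier in the thread; the helper name `build_regrtest_command`, the file name `match.txt`, and the use of `sys.executable` instead of `./python` are choices made for this example, not something taken from the issue.

```python
import sys
import tempfile
from pathlib import Path


def build_regrtest_command(test_ids, match_path,
                           test_module="test_concurrent_futures"):
    """Write one test ID per line to match_path and return the regrtest argv.

    Hypothetical helper: it only prepares the command, it does not run it.
    """
    Path(match_path).write_text("\n".join(test_ids) + "\n")
    return [
        sys.executable, "-m", "test",
        "--fail-env-changed",            # treat "env changed" as a failure
        "--matchfile", str(match_path),  # only run tests listed in the file
        test_module,
    ]


if __name__ == "__main__":
    ids = [
        "test.test_concurrent_futures."
        "ProcessPoolSpawnProcessPoolExecutorTest.test_no_stale_references",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        cmd = build_regrtest_command(ids, Path(tmp) / "match.txt")
        print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run()` from a CPython checkout; extra options such as `-j50 -F` from the earlier stress run would be appended to it in the same way.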
I tested manually on the RHEL8 worker and I confirm that this change prevents the reap_children() warning. I close the issue. The change will be shortly backported to 3.7 and 3.8. -- I'm not sure if PR 17640 is useful, let's discuss it on the PR directly.