-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_pty fails when using setsid() #82728
Comments
regrtest has been modified in bpo-38502 to use setsid() when using multiprocessing mode (-jN command line option). Problem: David Bolen identified that test_pty started to fail on his bolen-ubuntu worker (Ubuntu 18.04.3) since my commit ecb035c. https://buildbot.python.org/all/#/builders/141/builds/2679 0:19:05 load avg: 1.81 [234/419/1] test_pty crashed (Exit code -1) -- running: test_unicodedata (55.5 sec) I can reproduce the issue locally: --- $ ./python -m test -j2 test_pty -v
== CPython 3.9.0a0 (heads/urlparse_ipv6:cc733a8cb6, Oct 21 2019, 11:34:36) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]
== Linux-5.2.18-200.fc30.x86_64-x86_64-with-glibc2.29 little-endian
== cwd: /home/vstinner/python/master/build/test_python_20242
== CPU count: 8
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 0.70 Run tests in parallel using 2 child processes
0:00:00 load avg: 0.70 [1/1/1] test_pty crashed (Exit code -1)
test_basic (test.test_pty.PtyTest) ... == Tests result: FAILURE == 1 test failed: Total duration: 383 ms It's surprising that there is no output! I would prefer to keep process groups in regrtest, it's really helpful to be able to kill all processes spawned by a test worker process. I'm not sure how/why PTY depends is incompatible with setsid(). |
This also happens when running the test suite with high parallelism: ./python -m test -j 20 This fails with: == Tests result: FAILURE == 398 tests OK. 2 tests failed: |
Indeed, almost all buildbots have to repeat the test_pty. I think this needs to be fixed or process groups in regrtest should be limited or reverted. |
What do you mean by "limited"? Process groups really help to prevent to leak grandchild processes in multiprocessing tests, when tests are interrupted manually by CTRL+C or by a timeout (sadly, only when the timeout is handled by regrtest, not when it's handled by faulthandler). See bpo-38502 for the rationale. |
I mean to deactivate it by default and make opt-in.
I love process groups and they are awesome. But having these test being re-run on every buildbot and failing on my machine when just running test with -j is very annoying. |
Can't we fix test_pty? |
I think fixing the underlying pty issue should certainly be the goal, but the question is whether the process group change should remain active in the meantime, as its presence is causing a regression in the tests. I think such cases in the past are usually rolled back, right? I was originally on the fence since process groups address a real problem, especially in interactive testing, while creating an arguably aesthetic issue for my case of the buildbots (a warning rather than failure). But Pablo's point about a normal manual full test run failing (not a warning as with the buildbots) feels persuasive since that's probably as common as the issue being addressed by the change. Even if pre-existing, the pty failure is exposed by the process group change, so it might be best for the process group change to wait on fixing the pty issue. I don't know how to weigh the relative impact though, e.g,. how many people are likely to run into each failure case. There's probably more people doing a normal test run than breaking out of such tests though. At the least, it's a worst impact than just the warnings on the buildbots. Perhaps an intermediate fallback could be gating the process group change behind a regrtest option (opt-in) which could then preserve its benefits upon request, without negatively impacting the default test process, whether manual or on the buildbots. At least until resource is available to resolve the pty issue. |
I wrote PR 17519 which fix the bug. We just have to ignore SIGHUP signal. |
Ok, I merged my fix to master. The backport to 3.7 and 3.8 will follow quickly. I close the issue. Sorry for the inconvenience ;-) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: