-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
subprocess fails when used as init, vfork() returns EINVAL if PID=1 #91307
Comments
The bug introduced here: bpo-35823 If os.gepid() == 1, vfork() does not work and returns EINVAL. Please add check for the current pid in condition where CPython chooses between fork() and vfork(). |
Any possibility that you can test the attached PR as pid 1? |
Python should be built as a static binary and can be used as the init process on the kernel command line, no? I'm not sure that "static binary" is a requirement, since I'm commonly using init=/usr/bin/bash to fix a broken Linux, and on my Fedora, this program is "dynamically linked". Who uses Python as pid 1? I'm now curious :-) |
Well.
P.S. Program may or may not be static in order to work as PID=1. |
Thanks. I had wondered if this was really a pid=1 restriction or not, but I could definitely imagine kernel scenarios where vfork is simply forbidden regardless. There are a variety of policy mechanisms in kernels, mainline Linux or not, that _could_ do that kind of thing. As much as I'd like to expose that the fallback happened, emitting to stderr isn't friendly and using warnings from this code is complicated so I'm inclined to keep the silent fallback on failure simple as is until someone can demonstrate of it causing a practical problem. Outside of unusual configurations, if this were ever happening when it is not expected, people observing low subprocess performance could strace and witness the vfork syscall failure. I'll merge once our CI is happy. |
The reason for this particular EINVAL is explained in https://bugzilla.kernel.org/show_bug.cgi?id=215769#c6. (I suspect the underlying reason is that the parent and the child need different VDSOs if they are in separate time namespaces, and this implies that they can't share the address space). In hindsight, I should have realized that vfork()'s address space sharing could cause extra failure modes that plain fork() doesn't have. I was aware that many clone() flags are forbidden in conjunction with CLONE_VM, and it's a small step from there to realize that unshare() + vfork() could cause trouble. Sorry about that, and thanks for adding the fallback! Of course, since posix_spawn() is implemented via vfork() in modern glibc, it suffers from the same issue. For example, on Linux >= 5.6 (for time namespaces):
(EPERM instead of EINVAL is due to a bug in glibc's posix_spawn() error reporting: https://sourceware.org/bugzilla/show_bug.cgi?id=29109). Adding a fallback for posix_spawn() path seems more problematic because we can't detect vfork() failure precisely and hence we'd risk unnecessarily retrying in case of unrelated (and probably more widespread) errors. I think it's better to guide users to opt-out from both vfork() and posix_spawn() in this case. Alternatively, we could disable posix_spawn() on Linux entirely: its benefits are only theoretical, we don't control it and have to work around its bugs, and using different subprocess backends depending on Popen() arguments can only cause hard-to-understand behavioral differences. |
If an application calls directly vfork(), I understand that the syscall can fail with EINVAL in some special case about namespaces. But for posix_spawn(), maybe tomorrow the glibc can detect this case to decide if it can use vfork() or not, no? I don't think that the glibc can "fix" the vfork() function, but it can fix posix_spawn(), no? |
Glibc can't easily switch from vfork() to fork() in posix_spawn() because it currently relies on address space sharing to report any child-setup/execve() errors back to the parent (unlike subprocess, which uses a pipe for that). (Relying on AS sharing also breaks error reporting under qemu-user because it implements vfork() as fork(), but glibc sadly doesn't seem to care about this use case, despite that qemu-user is not uncommonly used in cross-compilation environments). Of course glibc can fix that if it wants, but let's ask it instead of guessing: https://sourceware.org/bugzilla/show_bug.cgi?id=29115. The general problem with posix_spawn() will remain in any case. It's not clear what the practical benefits of having two vfork()-based subprocess backends are (one of which we don't control), and so far we have only seen downsides. |
On macOS, posix_spawn() is a syscall. |
Sure, my comment is in the context of Linux only. We don't currently use vfork() on other platforms, and without vfork() the only efficient alternative to fork() is posix_spawn(). |
Good news: the glibc bug report led to the kernel fix, so since Linux |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: