Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for posix_spawn call in subprocess.Popen with close_fds=True #113117

Closed
kulikjak opened this issue Dec 14, 2023 · 8 comments
Closed
Assignees
Labels
3.13 bugs and security fixes type-feature A feature request or enhancement

Comments

@kulikjak
Copy link
Contributor

kulikjak commented Dec 14, 2023

Feature or enhancement

Proposal:

One of the reasons posix_spawn is rarely used with subprocess functions is that it doesn't support close_fds=True (which is the default value).

Previously, it was deemed infeasible to support this (#79718 (comment)), but since then, posix_spawn_file_actions_addclosefrom_np() has become available on many systems (I know about Solaris, Linux and FreeBSD), which should make it pretty simple to support it.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

These two are related:
#79718 was the original issue for using os.posix_spawn() in subprocess.
#86904 proposed changing close_fds to default to False, so posix_spawn() could be used more often

Linked PRs

@kulikjak kulikjak added the type-feature A feature request or enhancement label Dec 14, 2023
@kulikjak kulikjak changed the title Add support for posix_spawn call with close_fds=True Add support for posix_spawn call in subprocess.Popen with close_fds=True Dec 14, 2023
@ronaldoussoren
Copy link
Contributor

On macOS something this can be accomplished through posix_spawn_file_actions_addinherit_np:

Normally, for posix_spawn(2) and posix_spawnp(2), all file descriptors are inherited from the parent process into the spawned process, except for those explicitly
marked as close-on-exec. However if the flag POSIX_SPAWN_CLOEXEC_DEFAULT is set, then during the spawn operation, all pre-existing file descriptors in the parent
process are treated as if they had been marked close-on-exec i.e. none of them are automatically inherited. See posix_spawnattr_setflags(3). Only file
descriptors explicitly manipulated via file_actions are made available in the spawned process. In that case, posix_spawn_file_actions_addinherit_np() can be used
to make specific pre-existing file descriptors from the parent process be available in the spawned process.

@gpshead
Copy link
Member

gpshead commented Dec 15, 2023

I like your PR and think it makes sense. And would welcome another one for macOS's _np API flavor as well.

I'm curious what the fall-out from making this change could be. We may not actually want to use posix_spawn on Linux by default. Doing some quick testing with this PR, it appears a little slower than our vfork code:

bin/python3 -m timeit --setup 'import os; import subprocess; subprocess._HAVE_POSIX_SPAWN_CLOSEFROM=False; memory=1_000_000 * [os.open("/dev/null", 0) for _ in range(50)]' 'subprocess.run(["/bin/true"])'
1000 loops, best of 5: 290 usec per loop
bin/python3 -m timeit --setup 'import os; import subprocess; subprocess._HAVE_POSIX_SPAWN_CLOSEFROM=True; memory=1_000_000 * [os.open("/dev/null", 0) for _ in range(50)]' 'subprocess.run(["/bin/true"])'
1000 loops, best of 5: 350 usec per loop

That was run on Ubuntu 22.04. (turning off both posix_spawn & vfork in the above for a traditional fork takes 5600 usec)

While performance is still "good", and the time difference appears constant regardless of tweaks to the process memory footprint or number of open file descriptors. It's still a 20% performance hit vs our Linux default subprocess vfork code-path. Performance of subprocess using posix_spawn presumably depends on the libc implementation. Replacing the subprocess.py os.posix_spawn and _posixsubprocess.fork_exec calls with a dummy, the pure Python overhead is clearly larger for the fork_exec vfork case (16us) than it is for the posix_spawn case (4us). That points to the internals of my systems glibc posix_spawn implementation just being a bit slower (which I believe just uses vfork itself).

This is something to keep an eye on. We could enhance the conditional logic on Linux used to determine when to use posix_spawn or not if desired. I don't know if that is worth doing.

@gpshead gpshead self-assigned this Dec 15, 2023
@ronaldoussoren
Copy link
Contributor

Let's use #109154 for the macOS part. That issue does not have a PR yet, but implementing this should be too hard.

@kulikjak
Copy link
Contributor Author

On macOS something this can be accomplished through posix_spawn_file_actions_addinherit_np:

I see - sadly, I cannot help much with that as I don't have access to Mac, and it's a little bit more complex than just calling a different function - the Mac API seems more complicated (but also potentially more powerful).

I'm curious what the fall-out from making this change could be. We may not actually want to use posix_spawn on Linux by default.

Interesting - I guess we can always modify it to prefer vfork over posix_spawn on Linux when it's available. On other platforms, I think that posix_spawn should always be better than standard fork/exec.

@ronaldoussoren
Copy link
Contributor

On macOS something this can be accomplished through posix_spawn_file_actions_addinherit_np:

I see - sadly, I cannot help much with that as I don't have access to Mac, and it's a little bit more complex than just calling a different function - the Mac API seems more complicated (but also potentially more powerful).

I'll implement the macOS part in #109154 once your PR is resolved, either by emulating the Linux API or by exposing something closer to the native API.

@gpshead gpshead added the 3.13 bugs and security fixes label Dec 17, 2023
gpshead added a commit that referenced this issue Dec 17, 2023
#113118)

Add support for `os.POSIX_SPAWN_CLOSEFROM` and
`posix_spawn_file_actions_addclosefrom_np` and have the `subprocess` module use
them when available.  This means `posix_spawn` can now be used in the default
`close_fds=True` situation on many platforms.

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
@gpshead
Copy link
Member

gpshead commented Dec 17, 2023

Thanks a lot for this contribution! Merged. It'll be interesting to see how this works out. The buildbots are all happy with it.

The performance being 5-10% lower than our vfork codepath on Linux is annoying only because I bothered to measure it, it shouldn't be a big deal, but if it turns out to cause anyone issues we can revisit the Linux default behavior and just do that within subprocess._use_posix_spawn() in a future PR.

I'm not 100% confident in libc posix_spawn implementations getting everything right, at least on Linux, given the history of problems in much older glibc versions. But it sounds like recent versions are okay. I hope that holds up in practice.

I don't know if posix_spawn_file_actions_addclosefrom_np() existed on Linux when we were originally getting os.posix_spawn support working (a quick search suggests it may be fairly new to Linux?), but regardless it still is not meaningfully documented on Linux despite having been implemented (ugh). So I'm not surprised it was overlooked. The Solaris and FreeBSD docs help.

We'll use other issues to track the macOS things, as well as the potential for posix_spawn chdir support (which I think while available as a non-portable _np extension on some platforms is becoming a standard and will drop the _np suffix in some distant future? https://www.austingroupbugs.net/view.php?id=1208)

@gpshead gpshead closed this as completed Dec 17, 2023
@ned-deily
Copy link
Member

ned-deily commented Dec 19, 2023

PR #113118 breaks macOS framework builds such as those used by the python.org installer:

$ ./configure --enable-framework && make -j
[...]
./Modules/posixmodule.c:7170:19: error: use of undeclared identifier 'environ'
        envlist = environ;
                  ^
./Modules/posixmodule.c:7238:31: error: use of undeclared identifier 'environ'
    if (envlist && envlist != environ) {
                              ^

Note the comments elsewhere in posixmodule.c such as at line 1582:

#elif defined(WITH_NEXT_FRAMEWORK) || (defined(__APPLE__) && defined(Py_ENABLE_SHARED))
    /* environ is not accessible as an extern in a shared object on OSX; use
       _NSGetEnviron to resolve it. The value changes if you add environment
       variables between calls to Py_Initialize, so don't cache the value. */
    e = *_NSGetEnviron();

@gpshead
Copy link
Member

gpshead commented Dec 19, 2023

this was not what broke the macos build. that was #113119. see #113268 for a fix.

@gpshead gpshead closed this as completed Dec 19, 2023
ryan-duve pushed a commit to ryan-duve/cpython that referenced this issue Dec 26, 2023
…ds=True (python#113118)

Add support for `os.POSIX_SPAWN_CLOSEFROM` and
`posix_spawn_file_actions_addclosefrom_np` and have the `subprocess` module use
them when available.  This means `posix_spawn` can now be used in the default
`close_fds=True` situation on many platforms.

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
…ds=True (python#113118)

Add support for `os.POSIX_SPAWN_CLOSEFROM` and
`posix_spawn_file_actions_addclosefrom_np` and have the `subprocess` module use
them when available.  This means `posix_spawn` can now be used in the default
`close_fds=True` situation on many platforms.

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants