-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for settting umask in subprocess children #82598
Comments
Another use of the deprecated unsafe preexec_fn was to call os.umask in the child prior to exec. As seen in freeipa/freeipa#3769 (see the code in there). We should add an explicit feature for this to avoid people's desire for preexec_fn or for the heavyweight workaround of an intermediate shell calling umask before doing another exec. Any common preexec_fn uses that we can encode into supported parameters will help our ability to remove the ill fated preexec_fn misfeature in the future. |
If we need to write a wrapper program for that, I would say that no, we don't "have to" provide something in the stdlib. In OpenStack, I wrote prlimit.py which is a preexec-like wrapper program to apply resource limits when calling a program. It's just a pure Python implementation of the Unix prlimit program. The Python implementation is used when the prlimit progrma is not available. https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/prlimit.py IMHO it's perfectly fine to explain that a wrapper program is needed to implement preexec-like features. |
We don't have to for all possible things, doing this just reduced friction for existing uses. In this case, calling umask in our child ourselves would be easy to support. (easier than the more important setuid/sid/gid/groups-ish stuff that recently went in) I'm trying to make sure we track what is blocking people from getting rid of preexec_fn in their existing code so that we can actually deprecate and get rid of the API entirely. |
If you consider posix_spawn(), I think that a convenient replacement for preexec_fn function would be a wrapper process which would execute *arbitrary Python code* before spawning the program. It would not only cover umask case, but also prlimit, and another other custom code. Pseudo-code of the wrapper: import sys
code = sys.argv[1]
argv = sys.argv[2:]
eval(code)
os.execv(argv[0], argv) The main risk is that the arbitrary code could create an inheritable file descriptor (not all C extensions respect the PEP-446) which would survive after replacing the process memory with the new program. Such design would allow to implement it in a third party package (on PyPI) for older Python versions as well. -- Currently, preexec_fn is a direct reference to a callable Python object in the current process. preexec_fn calls it just after fork(). Here I'm talking about running arbitrary Python code in a freshly spawned Python process. It's different. -- The new problems are:
This wrapper uses os.execv(). |
We should not provide such an "run arbitrary python code before execing the ultimate child" feature in the standard library. It is complicated, and assumes you have an ability to execute a new interpreter with its own slow startup time as an intermediate in the child in the first place. (embedded pythons do not have that ability) Leave that to someone to implement on top of subprocess as a thing on PyPI. |
What do you mean by "deprecated"? The parameter doesn't seem to be deprecated in the documentation: And when I use it, it doesn't emit any warning: import os, subprocess, sys
def func(): print(f"func called in {os.getpid()}")
argv = [sys.executable, "-c", "pass"]
print(f"parent: {os.getpid()}")
subprocess.run(argv, preexec_fn=func) Output: $ ./python -Werror x.py
parent: 22264
func called in 22265 If you want to deprecate it, it should be documented as deprecated and emit a DeprecatedWarning, no? |
pylint emits a warning on subprocess.Popen(preexec_fn=func): W1509: Using preexec_fn keyword which may be unsafe in But not when using subprocess.run(preexec_fn=func). Maybe a check is missing in pylint. Note: pyflakes doesn't complain about preexec_fn. |
Do you want to modify _posixsubprocess to call umask() between fork() and exec()? As it has been done for uid, gid and groups: call setreuid(), setregid() and setgroups()? If so, it means that posix_spawn() couldn't be used when the new umask parameter would be used, right? |
preexec_fn does not work in subinterpreters, which (amongst others) affects mod_wsgi and similar WSGI implementations. Therefore portable software must not use preexec_fn at all. |
Changed in version 3.8: The preexec_fn parameter is no longer supported in subinterpreters. The use of the parameter in a subinterpreter raises RuntimeError. The new restriction may affect applications that are deployed in mod_wsgi, uWSGI, and other embedded environments. |
preexec_fn has been mentally and advisability deprecated for years. :) As far as posix_spawn goes, I expect these kinds of between fork+exec features to be something that prevents posix_spawn from being usable. As are many other things. People who want to use posix_spawn will need to know that and seek to avoid those. That's a documentation issue first and foremost. Our existing POpen API is very old and wasn't designed to make that nice. A new API could be made that *only* supports posix_spawn available features if you want an entrypoint that encourages the generally lower latency posix_spawn path. (A subprocess.spawn function similar to subprocess.run perhaps?) That should be taken up within its own enhancement issue. |
Now to see if the more esoteric config buildbots find any platform issues to address... |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: