New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-35537: use os.posix_spawn in subprocess #11242
Conversation
@nanjekyejoannah: Thanks! Can you please open an issue on bugs.python.org to get a bug number? Then add "bpo-xxx: " to the title of your PR. Can you also please add a NEWS entry? You can use the "blurb" tool: https://devguide.python.org/committing/#what-s-new-and-news-entries |
@vstinner ack. Am doing this shortly. |
Lib/subprocess.py
Outdated
and not pass_fds | ||
and cwd is None | ||
and env is None | ||
and restore_signals is False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"and not restore_signals" would be better. I prefer to avoid "is" to compare with True/False.
Lib/subprocess.py
Outdated
@@ -1390,6 +1390,11 @@ def _get_handles(self, stdin, stdout, stderr): | |||
errread, errwrite) | |||
|
|||
|
|||
def _posix_spawn(self, args, executable): | |||
"""Execute program using os.posix_spawn().""" | |||
env = dict(os.environ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really needeed to copy os.environ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to pass directly os.environ to os.posix_spawn().
@@ -0,0 +1 @@ | |||
subprocess.Popen can now use posix_spawn() in some cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better rendering (create links), I suggest:
subprocess.Popen can now use posix_spawn() in some cases. | |
:class:`subprocess.Popen` can now use :func:`os.posix_spawn` in some cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I will wait until minor remarks are addressed.
Lib/subprocess.py
Outdated
and env is None | ||
and restore_signals is False | ||
and not start_new_session): | ||
return self._posix_spawn(args, executable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to pass (executable, args) to follow os.posix_spawn() API. The function has not return value, so I suggest to write instead:
return self._posix_spawn(args, executable) | |
self._posix_spawn(executable, args) | |
return |
Lib/subprocess.py
Outdated
@@ -1390,6 +1390,11 @@ def _get_handles(self, stdin, stdout, stderr): | |||
errread, errwrite) | |||
|
|||
|
|||
def _posix_spawn(self, args, executable): | |||
"""Execute program using os.posix_spawn().""" | |||
env = dict(os.environ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to pass directly os.environ to os.posix_spawn().
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a sentence to the Optimization section of Doc/whatsnew/3.8.rst? Something like:
"subprocess.Popen is now up to 60x faster on Linux and 106x faster on FreeBSD when os.posix_spawn() can be used."
60x and 106x are from particular examples. If you use different examples (allocate 1 GiB or 4 GiB of memory in the parent) you can get different numbers. Concrete numbers depend on a program and do not have meaning without referring to concrete program. |
It's hard to measure the average speedup, but IMHO it's nice to say that it can be way faster in some cases. So do you want to explain it differently? For example: "subprocess.Popen is now up to 60x faster on Linux and 106x faster on FreeBSD when os.posix_spawn() can be used, the parent has large memory footprint (ex: 2 GiB), and the child process execution time is short." |
@nanjekyejoannah: Until we agree on the sentence to be added to What's New in Python 3.8, don't add it. We can add it in the following PR. |
I suggest to not use any precise numbers: "subprocess.Popen is now significantly faster" or "subprocess.Popen can be now tens times faster". |
@nanjekyejoannah It seems like you closed the PR. Was it on pupose? I'm unable to reopen the issue. |
@vstinner I will reopen did it by mistake. |
@nanjekyejoannah: this PR is now empty, it doesn't contain any commit :-( It seems like your removed your commit. You can use "git reflog" or open .git/logs/HEAD to recover your removed commit. |
Lib/subprocess.py
Outdated
@@ -1390,6 +1390,11 @@ def _get_handles(self, stdin, stdout, stderr): | |||
errread, errwrite) | |||
|
|||
|
|||
def _posix_spawn(self, args, executable): | |||
"""Execute program using os.posix_spawn().""" | |||
env = os.environ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: this variable is useless, just pass directly os.environ to posix_spawn().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I will merge this PR tomorrow except if someone sees a good reason to not merge it: https://bugs.python.org/issue35537#msg332239 |
and not start_new_session | ||
and os.path.dirname(executable) | ||
and self.stdin is self.stdout is self.stderr is None): | ||
self._posix_spawn(args, executable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking self.std*
is still not enough. Those fields are for the parent's ends of pipes, but if a caller redirects to a descriptor, there is no such end. For example, subprocess.call(['/bin/sleep', '1000'], stdout=fd, close_fds=False, restore_signals=False)
doesn't work properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@izbyshev: Oh right! I fixed the test: test (p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite) instead of (self.stdin, self.stdout, self.stderr). Is it better now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, LGTM now.
There is another potential problem with posix_spawn
: asynchronous error notification. _posixsubprocess
uses a pipe to notify the parent about all errors up to and including execve
errors, so its style is synchronous. Glibc and musl use the pipe-based notification too, but they are not required to do that, and I don't know what other implementations do.
Test (p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite) instead of (self.stdin, self.stdout, self.stderr).
@nanjekyejoannah: I pushed directly a fix into your branch to fix the latest reported bug about standard streams ;-) |
Hum, @izbyshev found another issue (already now fixed) with this PR. Since I will be offline next 2 weeks, I prefer to not merge yet since I would be unable to handle any issue caused by the change. I now prefer to wait until I'm back from holiday to merge the PR. In the meanwhile, if another core dev feels brave enough to merge it, please go ahead ;-) |
okay noted. |
I expect that posix_spawn() sets errno as does execv(). I tested a non
existent executable and I got the expected exception and traceback.
The pipe matters more when preexec_fn is used to return the Python
traceback of the child and the exception type. But preeexec_fn doesn't make
sense with (cannot be supported by) posix_spawn().
…--
Night gathers, and now my watch begins. It shall not end until my death.
|
As I explained, this behavior is not required by POSIX, and the opposite behavior is allowed. Recent glibc- and musl-based Linux behave as you expect, but for glibc, this was added in 2017. In my test on Ubuntu 14.04, FreeBSD also seems not to use any synchronous notification. Other systems where |
I merged PR #11452 instead. |
This patch uses os.posix_spawn in subprocess.
posix_spawn() is faster but it's also safer: it allows us to do a lot of "actions" before exec(), before executing the new program. For example, you can close files and control signals.
@vstinner this is the patch without any options.
https://bugs.python.org/issue35537