-
-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Popen.communicate() breaks when child closes its side of pipe but not exits #79363
Comments
When communicate() is called in a loop, it crashes when the child process has already closed any piped standard stream, but still continues to be running. How this happens:
I think there may be a simple solution: before registering file descriptors in epoll, we may check whether any of them is already closed, and don't register it in that case. Here is a simple reproducible example, ran on Linux 4.15.0-1021-aws x86_64: import subprocess
child = subprocess.Popen(
['/usr/local/bin/python3.7', '-c', 'import os, time; os.close(1), time.sleep(30)'],
stdout=subprocess.PIPE,
)
while True:
try:
child.communicate(timeout=3)
break
except subprocess.TimeoutExpired:
# do something useful here
pass Here is a stacktrace: Traceback (most recent call last):
File "test.py", line 10, in <module>
child.communicate(timeout=3)
File "/usr/local/lib/python3.7/subprocess.py", line 933, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/local/lib/python3.7/subprocess.py", line 1666, in _communicate
selector.register(self.stdout, selectors.EVENT_READ)
File "/usr/local/lib/python3.7/selectors.py", line 352, in register
key = super().register(fileobj, events, data)
File "/usr/local/lib/python3.7/selectors.py", line 238, in register
key = SelectorKey(fileobj, self._fileobj_lookup(fileobj), events, data)
File "/usr/local/lib/python3.7/selectors.py", line 225, in _fileobj_lookup
return _fileobj_to_fd(fileobj)
File "/usr/local/lib/python3.7/selectors.py", line 40, in _fileobj_to_fd
"{!r}".format(fileobj)) from None
ValueError: Invalid file object: <_io.BufferedReader name=3> |
Sounds like the solution you'd want here is to just change each if check in _communicate, so instead of: if self.stdout:
selector.register(self.stdout, selectors.EVENT_READ)
if self.stderr:
selector.register(self.stderr, selectors.EVENT_READ) it does: if self.stdout and not self.stdout.closed:
selector.register(self.stdout, selectors.EVENT_READ)
if self.stderr and not self.stderr.closed:
selector.register(self.stderr, selectors.EVENT_READ) The `if self.stdin and input:` would also have to change. Right now it's buggy in a related, but far more complex way. Specifically if you call it with input the first time:
The fix for this would be to either:
or
Either way, the caching behavior for input should be properly documented; we clearly specify that output is preserved after a timeout and retrying communicate ("If the process does not terminate after timeout seconds, a TimeoutExpired exception will be raised. Catching this exception and retrying communication will not lose any output."), but we don't say anything about input, and right now, the behavior is the somewhat odd and hard to express: "Retrying a call to communicate when the original call was passed non-None/non-empty input requires subsequent call(s) to pass non-None, non-empty input. The input on said subsequent calls is otherwise ignored; only the unsent remainder of the original input is sent. Also, it will just fail completely if you pass non-empty input and it turns out the original input was sent completely on the previous call, in which case you *must* call it with input=None." It might also be worth changing the selectors module to raise a more obvious exception when register is passed a closed file-like object, but given it only requires non-integer fileobjs to have a .fileno() method, adding a requirement for a "closed" attribute/property could break other code. |
Hmm... Correction to my previous post. communicate itself has a test for: "if self._communication_started and input:" that raises an error if it passes, so the second call to communicate can only be passed None/empty input. And _communicate only explicitly closes self.stdin when input is falsy and _communication_started is False, so the required behavior right now is:
So I think we're actually okay on the code for stdin, but it would be a good idea to document that input *must* be None on all but the first call, and that the input passed to the first call is cached such that as long as at least one call to communicate completes without a TimeoutError (and the stdin isn't explicitly closed), it will all be sent. Sorry for the noise; I should have rechecked communicate itself, not just _communicate. |
@josh.r but you’re correct regarding cached data that isn’t sent on subsequent communicate() calls. If the child consumes the input too slowly, and timeout occurs before sending all input, the remaining part will be lost. Maybe it is not a bug, but it’s quite a confusing behavior, and I think it should be mentioned in the doc. |
backport automation appears unhappy at the moment. I'm keeping this open and assigned to me to manually run cherry_picker on this for 3.8 and 3.7 (if still open for non-security fixes). |
thanks everyone! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: