POpen bufsize=0 ignored with universal_newlines=True #85394
On a Popen object created with bufsize=0, stdout.readline() does a buffered read with python3, whereas in 2.7 it read char-by-char. See attached example. As a result, a poll including the stdout object suffers a behaviour change when stdout is ready for reading and more than one line of data is available. In both cases we get notified by poll() that data is available on the fd, we can call stdout.readline(), and we get back to our polling loop. Then:
Running the attached example under strace reveals the underlying difference:
write(4, "go\n", 3) = 3
poll([{fd=5, events=POLLIN|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=5, revents=POLLIN}])
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "x", 1) = 1
-read(5, "\n", 1) = 1
-fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x2), ...}) = 0
+read(5, "xxxxxxxxxxxx\nyyyyyyyyyyyyyyy\naaa"..., 8192) = 74
write(1, ">xxxxxxxxxxxx\n", 14) = 14
We can see a buffered read, which explains the behaviour difference. After changing to bufsize=1, strace does not show a difference here. This is especially troubling, as the first note in https://docs.python.org/3/library/io.html#class-hierarchy mentions that even in buffered mode there is an unoptimized readline() implementation.
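The effect on a polling loop can also be reproduced without strace. The following is only a minimal sketch and not the attached example: the child script, the `demo` name and the line contents are made up for illustration, and it assumes a POSIX system where select() works on pipes.

```python
import select
import subprocess
import sys

# Hypothetical child: emits three lines in a single write(), then blocks
# on stdin so the pipe stays open and never reports EOF by itself.
CHILD = r'import os, sys; os.write(1, b"xxx\nyyy\nzzz\n"); sys.stdin.read()'


def demo():
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,
        universal_newlines=True,
    )
    try:
        first = proc.stdout.readline()
        # Ask the kernel whether the pipe still holds data (timeout 0).
        readable, _, _ = select.select([proc.stdout], [], [], 0)
        # This call does not need the pipe if the text wrapper already
        # holds the remaining lines in its own buffer.
        second = proc.stdout.readline()
    finally:
        proc.stdin.close()   # lets the child see EOF and exit
        proc.stdout.close()
        proc.wait()
    return first, bool(readable), second


if __name__ == "__main__":
    print(demo())
```

If the first readline() swallows the whole chunk, as the strace output above suggests, the second element of the result is False even though two lines are still pending on the Python side. That is exactly the state in which a poll() loop stalls.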
With the upcoming 3.10 phasing out 2.7 compatibility, I have to find a solution to this, so I'm back digging here. Even .read(1) on a subprocess pipe causes an underlying buffered read, so working around the problem with a loop of 1-byte reads has to be done with os.read(), though its usage on file-like objects is discouraged. It looks like one of those would be needed, depending on the expected semantics of
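For reference, the os.read() workaround mentioned above could look roughly like the sketch below. `readline_unbuffered` is a hypothetical helper name, it works on bytes only (no decoding, no newline translation), and it assumes a blocking file descriptor.

```python
import os


def readline_unbuffered(fd):
    """Read one line from fd with 1-byte os.read() calls.

    Returns the line as bytes including the trailing b"\n", or a
    shorter (possibly empty) result if EOF is hit first.
    """
    line = bytearray()
    while True:
        byte = os.read(fd, 1)   # one read(2) syscall per byte
        if not byte:            # EOF
            break
        line += byte
        if byte == b"\n":
            break
    return bytes(line)
```

With a Popen object this would be called as `readline_unbuffered(process.stdout.fileno())`. Mixing it with `process.stdout.readline()` on the same pipe is not safe, since the file object may already have consumed data that the fd-level reads will then never see.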
Relevant commits include this one from v3.1.4: commit 877766d
I can't use that commit without cherry-picking this one from v3.2.2, though: commit e96ec68
And my test script still shows the same behaviour, with select.poll() or select.select(). The fact that my stdout object has no read1() and needs the above patch looks like a good lead for further investigation?
That's linked to universal_newlines: the bug only shows when that flag is set. Test cases are provided in #25859.
Hmm, sorry for not responding earlier. Buffering is necessary for implementing the universal_newlines behaviour (I don't know how we could do otherwise?). This has the unavoidable side effect that the Python buffered file object is not in sync with the underlying file descriptor, so that using
So it seems like this is perhaps a documentation issue. What do you think?
Why is that? I can see that it requires newline state tracking, and the allowance to make two read(fd, &c, 1) system calls for a single read(1) method call, in case a "\n" has to be ignored.
testproc-unbuffered.py runs to completion in 3.11 if the following statement that changes the text wrapper's chunk size is added right after creating the Popen() instance:
if sys.version_info[0] > 2:
    process.stdout._CHUNK_SIZE = 1
The initial chunk size for a text wrapper is hard coded as 8192 bytes. For some reason the constructor has no parameter for it.
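To illustrate the newline state tracking idea, here is a rough sketch of universal-newline translation done with 1-byte reads. It is not how io.TextIOWrapper is implemented: `NewlineTrackingReader` and its methods are hypothetical names, it works on bytes without decoding, and it assumes a blocking file descriptor.

```python
import os


class NewlineTrackingReader:
    """Translate b"\r\n" and b"\r" to b"\n" using 1-byte reads only."""

    def __init__(self, fd):
        self.fd = fd
        self._skip_lf = False   # True right after a b"\r" was translated

    def read_byte(self):
        byte = os.read(self.fd, 1)
        if self._skip_lf:
            self._skip_lf = False
            if byte == b"\n":
                # Second half of b"\r\n": drop it and read again, so one
                # call may cost two read(2) syscalls.
                byte = os.read(self.fd, 1)
        if byte == b"\r":
            # Report the newline now; a following b"\n" is dropped lazily
            # on the next call, so this call never waits for it.
            self._skip_lf = True
            return b"\n"
        return byte             # b"" means EOF

    def readline(self):
        line = bytearray()
        while True:
            byte = self.read_byte()
            if not byte:
                break
            line += byte
            if byte == b"\n":
                break
        return bytes(line)
```

Because nothing is read ahead, the file descriptor stays in sync with what the caller has consumed, apart from at most one pending b"\n" after a b"\r".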
Where can we go from here?
I must say I'm puzzled by this statement as well, since readline did work together with universal newlines in python2.
Let's summarize my understanding of the situation (don't hesitate to correct any misinterpretation):
@pitrou python2 does not have such a discrepancy, yet it seems it had no problem implementing the universal_newlines behaviour. Can you elaborate? Do I just avoid universal_newlines altogether, since I'm essentially only running on Linux?
The
This can be worked around by piling another kludge:
A longer-term solution would be nicer, obviously :)
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.