Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-42044: Write all bytes to the console in unbuffered mode on Windows #26678

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

fabioz
Copy link
Contributor

@fabioz fabioz commented Jun 11, 2021

@Ringdingcoder
Copy link

I am only a moderately interested observer, and from the information in issue 11395 it seems to be fair to remove the capping, now that Windows 7 is not supported anymore. But: When a function returns the number of bytes written, as this one does, shouldn’t the caller be fixed to actually handle partial writes?

@fabioz
Copy link
Contributor Author

fabioz commented Jul 2, 2021

I am only a moderately interested observer, and from the information in issue 11395 it seems to be fair to remove the capping, now that Windows 7 is not supported anymore. But: When a function returns the number of bytes written, as this one does, shouldn’t the caller be fixed to actually handle partial writes?

I think that the caller can be fixed, but for me that's a separate (and bigger) issue (i.e.: writing of all bytes to the console in an unbuffered mode can be done in a single call in Windows and it's being artificially constrained from working due to historical issues supporting older versions of Windows).

As a note, I actually took a look at fixing the caller, but I don't understand some of the decisions in textio.c (so, I think I could inadvertently break something).

Also, it'd need to be fixed more places -- _PyIO_str_write (which in windows maps to _io__WindowsConsoleIO_write) is called from a few places, all with the pattern:

    do {
        errno = 0;
        res = PyObject_CallMethodOneArg(self->raw, _PyIO_str_write, memobj);
        errnum = errno;
    } while (res == NULL && _PyIO_trap_eintr());

So, the fix would have change all those places to do the rewrite, but the memobj seems to be opaque at that place, making it hard to know how to retry from a given index in it...

@gvanrossum
Copy link
Member

I don’t see the point of leaving a long comment that essentially apologizes for some deleted code. The commit message is supposed to cover that. Instead state that for long strings this will fail (and how — that’s still unclear to me) on Windows 7, but that W7 is not supported.

@ammaraskar
Copy link
Member

/cc @eryksun

Note: a fix for issue11395 started to write bytes capped at 32k and mandated
the caller to handle partial writes, but when the text io flush was called,
it only does only a single call in unbuffered mode. This made it so that the
bytes unwritten were actually lost (and not printed) in the process.
The reason for that was historical to support older versions of Windows. As
of Windows 10, this is no longer needed and thus given issue42044, the number
of bytes written to the console is no longer capped.
Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better, but I still don't understand the failure mode. Will it return a correct indicator of how much was written, or will it do something else (then what)?

PS. Once a PR is in review, it's better not to amend your commits -- just add new commits, this is easier for the reviewers. We'll squash when we merge anyway.

@fabioz
Copy link
Contributor Author

fabioz commented Jul 2, 2021

Much better, but I still don't understand the failure mode. Will it return a correct indicator of how much was written, or will it do something else (then what)?

It should fail (i.e.: return false and then GetLastError() should get the error later on in the function -- at least according to: https://docs.microsoft.com/en-us/windows/console/writeconsole)

So, if unable to print it could fail with something as:
IOError: [Errno 12] Not enough space
Also see: https://bugs.python.org/issue11395 (this is the original error fixed which introduced the breakage in unbuffered mode).

Note that with this fix this can happen again if printing a string big enough that it wouldn't fit in the current heap memory (vs silently dropping everything above 32KiB which was happening until now).

Given that unbuffered was requested, this seems ok to me (but if this limitation is too strong, another option could be doing multiple calls to WriteConsoleW inside of this function).

PS. Once a PR is in review, it's better not to amend your commits -- just add new commits, this is easier for the reviewers. We'll squash when we merge anyway.

Agreed (I thought that this was so small that it'd be ok, but I see how it can be confusing..)

@eryksun
Copy link
Contributor

eryksun commented Jul 2, 2021

Instead state that for long strings this will fail (and how — that’s still unclear to me) on Windows 7, but that W7 is not supported.

Windows NT uses local procedure call (LPC) ports for message-based communication between client processes and system components (e.g. subsystems, services, and even kernel components). LPC ports do not directly support arbitrarily large messages. Instead, large messages are exchanged via shared memory, which is typically managed as a heap.

Prior to Windows 8, console files are implemented by the console host process (e.g. conhost.exe in Windows 7, and csrss.exe before that). Handle values for console files are tagged by setting the lower two bits (e.g. handle values 3, 7, 11, etc). When ReadFile and WriteFile are passed a console pseudohandle, they redirect to the internal implementation of, respectively, ReadConsoleA and WriteConsoleA. The request is implemented by exchanging LPC messages with the console host. Data is sent via shared memory if it's too big to fit in a message, but the window of shared memory is only 64 KiB. To avoid a failed write, it was decided to limit writes to 32 KiB, which assumes that at least half of the shared memory is always available as a contiguous block. Console reads have the same size limit, but large reads aren't a practical concern in line-input mode.

This is no longer an issue in Windows 8+. The system console API was redesigned to use a console device instead of LPC. Thus ReadFile and WriteFile on a console file now take the normal I/O request path via the NtReadFile and NtWriteFile system calls, and ReadConsole and WriteConsole are implemented as IOCTLs via NtDeviceIoControlFile. A device I/O request is only limited by available address space and virtual memory. Even in a 32-bit process, one should be able to read and write multiple MiB in a single request, which presents no practical limit for console I/O.

@eryksun
Copy link
Contributor

eryksun commented Jul 2, 2021

For consistency, you can also remove the size limit from _Py_write_impl in "Python/fileutils.c".

Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Callers still have to handle partial writes, so I don't consider this to fix the issue.

But it certainly raises the number of bytes you have to write until it becomes obvious.

Modules/_io/winconsoleio.c Outdated Show resolved Hide resolved
@fabioz
Copy link
Contributor Author

fabioz commented Jul 5, 2021

For consistency, you can also remove the size limit from _Py_write_impl in "Python/fileutils.c".

Done.

@fabioz
Copy link
Contributor Author

fabioz commented Jul 5, 2021

Callers still have to handle partial writes, so I don't consider this to fix the issue.

But it certainly raises the number of bytes you have to write until it becomes obvious.

One thing I considered was changing the semantics so that _io__WindowsConsoleIO_write_impl always wrote everything (even if it meant doing multiple system calls) -- if the call to WriteConsoleW failed if the contents were too large, by retrying with a smaller buffer and doing multiple calls.

-- the way things are structured, the actual number of bytes written never gets reported to the initial caller. It just expects that everything has been properly written (for instance, when write calls flush in unbuffered mode, flush doesn't report the number of bytes written, it just returns whether it was written or not and write will say that everything was written (even if it wasn't -- which is what's happening now).

So, retrying at the caller (flush in this case, but there are others too), is a bit strange implementation wise (also, if all callers will retry, doing an auto-retry at the write implementation would be much more straightforward as the types are already concrete, whereas the callers don't really have enough information to retry from some index).

Personally, I think that reporting a failure -- as will happen with this patch -- is a reasonable approach, but if you deem worthy to fix the issue by writing in chunks at _io__WindowsConsoleIO_write_impl, I'm OK in doing that change too (although if that's not reasonable, I'd like to get the change which raises the limit and throws an error if unable to do it and leave it to someone with more background on the code to do the change to retry at a caller as I can't see how that'd work properly under the current structure).

@fabioz fabioz requested a review from zooba July 8, 2021 11:54
Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has merge conflicts now.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@bersbersbers
Copy link

@fabioz are still working on this? I would be interested in seeing #86210 fixed, and from skimming the comments of this PR, it looks like it might be close to solving this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet