Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: support blocking functions on Windows #101881

Closed
RayyanAnsari opened this issue Feb 13, 2023 · 6 comments
Closed

os: support blocking functions on Windows #101881

RayyanAnsari opened this issue Feb 13, 2023 · 6 comments
Labels
3.12 bugs and security fixes extension-modules C modules in the Modules dir OS-windows topic-IO type-feature A feature request or enhancement

Comments

@RayyanAnsari
Copy link
Contributor

RayyanAnsari commented Feb 13, 2023

The os.get_blocking and os.set_blocking functions are only currently supported on Unix.

Linked PRs

@eryksun
Copy link
Contributor

eryksun commented Feb 13, 2023

When WinAPI ReadFile() is called to read from an empty pipe that's in non-blocking mode, the call fails with ERROR_NO_DATA (232). The C runtime's read() function maps this error code to the errno value EINVAL (invalid argument). We would have to update all code that calls read() to clarify this case by checking _doserrno. This is the Windows error code that was mapped to the C errno value. We can manually map ERROR_NO_DATA to EAGAIN (resource temporarily unavailable) in the read() context1 and possibly raise BlockingIOError, unless EAGAIN should be ignored.

When WinAPI WriteFile() is called to write to a pipe that's in non-blocking mode, if the size of the write exceeds the available space, then the call succeeds with 0 bytes written2. The C runtime's write() function usually handles a 'successful' write with nothing written by failing with ENOSPC (no space left on device), except it's not considered an error when writing a string that starts with Ctrl+Z to a character device3. For POSIX compatibility, the errno value that we want in this case is EAGAIN.

Footnotes

  1. The error mapping depends on context. When writing to a pipe, if the read end is closed, this also fails with ERROR_NO_DATA, which we need to map to EPIPE and raise BrokenPipeError. These cases are distinct at the lower level of the NT API. When NtReadFile() is called to read from an empty pipe in non-blocking mode, the call fails with STATUS_PIPE_EMPTY. When NtWriteFile() is called to Write to a pipe for which the read end is closed, the call fails with STATUS_PIPE_CLOSING. Unfortunately the Windows API maps both of these status codes to ERROR_NO_DATA.

  2. It's documented that as much as possible will be written if a byte-type pipe (as opposed to a message-type pipe) is in non-blocking mode. However, in my testing back to Windows NT 4.0 (1996), no data is written to a pipe in non-blocking mode if the write size exceeds the available space, even if the pipe is empty. It's all or nothing.

  3. This is bizarre because nothing nowadays implements special behavior for writing Ctrl+Z to a character device. Reading from console input via ReadFile() is special cased to return 0 bytes read if the line read from the console begins with Ctrl+Z.

@RayyanAnsari
Copy link
Contributor Author

@eryksun thanks for the detailed information about the Win32 API, I've implemented your suggestions in #101882. Though I'm not sure why test_nonblock_pipe_write_bigbuf is failing when test_nonblock_pipe_write_smallbuf is not.

@eryksun
Copy link
Contributor

eryksun commented Feb 14, 2023

Usage in the io module and test cases will have to be examined for assumptions that partial writes are supported up to the available space in the pipe. On Windows, it's all or nothing in non-blocking mode. If the pipe has 1024 bytes available, then trying to write 2048 bytes will write nothing at all and fail.

@eryksun
Copy link
Contributor

eryksun commented Feb 14, 2023

Here's an idea to get around the annoying behavior in non-blocking mode. Modify _Py_write_impl() to check for this case. If so, cap the write to the size of the pipe, from GetNamedPipeInfo(). Eventually the reader will empty the pipe, and a write of the full pipe size will succeed. This allows test_nonblock_pipe_write_bigbuf() to succeed.

For example, append the following code to the first MS_WINDOWS section in _Py_write_impl().

    // A write that exceeds the available size of a pipe never succeeds in
    // non-blocking mode. Limiting writes to the pipe size in this case allows
    // a buffered write to succeed eventually, as the pipe is read.
    DWORD mode, pipe_size;
    HANDLE hfile = _Py_get_osfhandle(fd);
    if (hfile == INVALID_HANDLE_VALUE) {
        return -1;
    }
    if (gil_held) {
        Py_BEGIN_ALLOW_THREADS
        if (GetFileType(hfile) == FILE_TYPE_PIPE &&
            GetNamedPipeHandleStateW(hfile, &mode,
              NULL, NULL, NULL, NULL, 0) &&
            mode & PIPE_NOWAIT)
        {
            // GetNamedPipeInfo() requires FILE_READ_ATTRIBUTES access.
            // CreatePipe() includes this access for the write handle.
            if (!GetNamedPipeInfo(hfile, NULL, NULL, &pipe_size, NULL)) {
                pipe_size = 4096;
            }
            if (count > pipe_size) {
                count = pipe_size;
            }
        }
        Py_END_ALLOW_THREADS
    }
    else {
        if (GetFileType(hfile) == FILE_TYPE_PIPE &&
            GetNamedPipeHandleStateW(hfile, &mode,
              NULL, NULL, NULL, NULL, 0) &&
            mode & PIPE_NOWAIT)
        {
            if (!GetNamedPipeInfo(hfile, NULL, NULL, &pipe_size, NULL)) {
                pipe_size = 4096;
            }
            if (count > pipe_size) {
                count = pipe_size;
            }
        }
    }

@eryksun
Copy link
Contributor

eryksun commented Feb 15, 2023

The documentation in "Doc/library/os.rst" needs to be updated to indicate the new Windows support in 3.12, i.e. .. availability:: and .. versionchanged::, plus a brief description that Windows support is limited to pipes.

For example:

.. function:: get_blocking(fd, /)

   Get the blocking mode of the file descriptor: ``False`` if the
   :data:`O_NONBLOCK` flag is set, ``True`` if the flag is cleared.

   See also :func:`set_blocking` and :meth:`socket.socket.setblocking`.

   .. availability:: Unix, Windows.

      The function is limited on Emscripten and WASI, see
      :ref:`wasm-availability` for more information.

      On Windows, this function is limited to pipe files.

   .. versionadded:: 3.5
   
   .. versionchanged:: 3.12
      Added support for Windows
.. function:: set_blocking(fd, blocking, /)

   Set the blocking mode of the specified file descriptor. Set the
   :data:`O_NONBLOCK` flag if blocking is ``False``, clear the flag otherwise.

   See also :func:`get_blocking` and :meth:`socket.socket.setblocking`.

   .. availability:: Unix, Windows.

      The function is limited on Emscripten and WASI, see
      :ref:`wasm-availability` for more information.

      On Windows, this function is limited to pipe files.

   .. versionadded:: 3.5

   .. versionchanged:: 3.12
      Added support for Windows

@eryksun eryksun added the 3.12 bugs and security fixes label Feb 15, 2023
RayyanAnsari added a commit to RayyanAnsari/cpython that referenced this issue Feb 15, 2023
Use the GetNamedPipeHandleState and SetNamedPipeHandleState Win32 API functions to add support for os.get_blocking and os.set_blocking.
@RayyanAnsari
Copy link
Contributor Author

done, anything else needed?

RayyanAnsari added a commit to RayyanAnsari/cpython that referenced this issue Feb 16, 2023
Use the GetNamedPipeHandleState and SetNamedPipeHandleState Win32 API functions to add support for os.get_blocking and os.set_blocking.
RayyanAnsari added a commit to RayyanAnsari/cpython that referenced this issue Feb 16, 2023
Use the GetNamedPipeHandleState and SetNamedPipeHandleState Win32 API functions to add support for os.get_blocking and os.set_blocking.
zooba pushed a commit that referenced this issue Feb 16, 2023
…es (GH-101882)

* fileutils: handle non-blocking pipe IO on Windows

Handle erroring operations on non-blocking pipes by reading the _doserrno code.
Limit writes on non-blocking pipes that are too large.

* Support blocking functions on Windows

Use the GetNamedPipeHandleState and SetNamedPipeHandleState Win32 API functions to add support for os.get_blocking and os.set_blocking.
@zooba zooba closed this as completed Feb 16, 2023
dscho added a commit to dscho/git that referenced this issue Dec 3, 2023
Since c6d3cce (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
python/cpython#101881 (comment),
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write failed because more data was
provided than the internal pipe buffer can handle. Which can be somewhat
surprising, considering that `write()` is allowed to write less than the
specified amount, e.g. by writing only as much as fits in that buffer.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what should be written, and re-try using
the buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because that Git for Windows added an ugly work-around specifically to
avoid a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit to dscho/git that referenced this issue Jan 30, 2024
Since c6d3cce (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
python/cpython#101881 (comment),
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write _fails_. Completely. Because
more data was provided than the internal pipe buffer can handle.
Somewhat surprising, considering that `write()` is allowed to write less
than the specified amount, e.g. by writing only as much as fits in that
buffer. But it doesn't, it writes no byte at all in that instance.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what needs to be written, and re-try
using the pipe's buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again. And the whole point of a non-blocking pipe is to be
non-blocking.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls. It's the best we can do,
though, so it has to be good enough.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project: Without this
patch, the failed write would result in an infinite loop. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because Git for Windows added an ugly work-around specifically to avoid
a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
gitster pushed a commit to git/git that referenced this issue Jan 30, 2024
Since c6d3cce (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
python/cpython#101881 (comment),
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write _fails_. Completely. Because
more data was provided than the internal pipe buffer can handle.
Somewhat surprising, considering that `write()` is allowed to write less
than the specified amount, e.g. by writing only as much as fits in that
buffer. But it doesn't, it writes no byte at all in that instance.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what needs to be written, and re-try
using the pipe's buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again. And the whole point of a non-blocking pipe is to be
non-blocking.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls. It's the best we can do,
though, so it has to be good enough.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project: Without this
patch, the failed write would result in an infinite loop. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because Git for Windows added an ugly work-around specifically to avoid
a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
maryis pushed a commit to maryis/git that referenced this issue Feb 9, 2024
Since c6d3cce (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
python/cpython#101881 (comment),
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write _fails_. Completely. Because
more data was provided than the internal pipe buffer can handle.
Somewhat surprising, considering that `write()` is allowed to write less
than the specified amount, e.g. by writing only as much as fits in that
buffer. But it doesn't, it writes no byte at all in that instance.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what needs to be written, and re-try
using the pipe's buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again. And the whole point of a non-blocking pipe is to be
non-blocking.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls. It's the best we can do,
though, so it has to be good enough.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project: Without this
patch, the failed write would result in an infinite loop. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because Git for Windows added an ugly work-around specifically to avoid
a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes extension-modules C modules in the Modules dir OS-windows topic-IO type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants