Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid socketserver.StreamRequestHandler.wfile doing partial writes #70908

Closed
vadmium opened this issue Apr 9, 2016 · 11 comments
Closed

Avoid socketserver.StreamRequestHandler.wfile doing partial writes #70908

vadmium opened this issue Apr 9, 2016 · 11 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@vadmium
Copy link
Member

vadmium commented Apr 9, 2016

BPO 26721
Nosy @vstinner, @vadmium
Files
  • buffered-wfile.patch
  • buffered-wfile.v2.patch
  • buffered-wfile.v3.patch
  • buffered-wfile.v4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-06-29.12:06:14.319>
    created_at = <Date 2016-04-09.12:19:55.693>
    labels = ['type-feature', 'library']
    title = 'Avoid socketserver.StreamRequestHandler.wfile doing partial writes'
    updated_at = <Date 2016-06-29.12:06:14.318>
    user = 'https://github.com/vadmium'

    bugs.python.org fields:

    activity = <Date 2016-06-29.12:06:14.318>
    actor = 'martin.panter'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-06-29.12:06:14.319>
    closer = 'martin.panter'
    components = ['Library (Lib)']
    creation = <Date 2016-04-09.12:19:55.693>
    creator = 'martin.panter'
    dependencies = []
    files = ['42417', '42843', '43365', '43366']
    hgrepos = []
    issue_num = 26721
    keywords = ['patch']
    message_count = 11.0
    messages = ['263084', '263120', '263400', '263401', '263402', '263437', '265507', '268384', '268386', '269474', '269481']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'python-dev', 'martin.panter']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue26721'
    versions = ['Python 3.6']

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 9, 2016

    This is a follow-on from bpo-24291. Currently, for stream servers (as opposed to datagram servers), the wfile attribute is a raw SocketIO object. This means that wfile.write() is a simple wrapper around socket.send(), and can do partial writes.

    There is a comment inherited from Python 2 that reads:

    # . . . we make
    # wfile unbuffered because (a) often after a write() we want to
    # read and we need to flush the line; (b) big writes to unbuffered
    # files are typically optimized by stdio even when big reads
    # aren't.

    Python 2 only has one kind of “file” object, and it seems partial writes are impossible: <https://docs.python.org/2/library/stdtypes.html#file.write\>. But in Python 3, unbuffered mode means that the lower-level RawIOBase API is involved rather than the higher-level BufferedIOBase API.

    I propose to change the “wfile” attribute to be a BufferedIOBase object, yet still be “unbuffered”. This could be implemented with a class that looks something like

    class _SocketWriter(BufferedIOBase):
        """Simple writable BufferedIOBase implementation for a socket
        
        Does not hold data in a buffer, avoiding any need to call flush()."""
        def write(self, b):
            self._sock.sendall(b)
            return len(b)

    @vadmium vadmium added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Apr 9, 2016
    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 10, 2016

    Here is a patch with my proposal.

    @vstinner
    Copy link
    Member

    Hum, since long time ago, Python has issues with partial write. It's hard to guess if a write will always write all data, store the data on partial write, or simply ignore remaining data on partial write.

    I recall a "write1" function which was well defined: limited to 1 syscall, don't try (or maybe only on the very specific case of EINTR). But I'm not sure that it still exists in the io module of Python 3.

    asyncio has also issues with the definition of "partial write" in its API.

    You propose to fix the issue in socketserver.

    socket.makefile(bufsize=0).write() uses send() and so use partial write. Are you sure that users are prepared for that? Maybe SocketIO must be modified to use sendall() when bufsize=0?

    @vstinner
    Copy link
    Member

    FYI I recently worked on a issue with partial write in eventlet on Python 3:

    By the way, I opened bpo-26292 "Raw I/O writelines() broken for non-blocking I/O" as a follow-up of the eventlet issue.

    @vstinner
    Copy link
    Member

    I recall a "write1" function which was well defined: limited to 1 syscall, don't try (or maybe only on the very specific case of EINTR). But I'm not sure that it still exists in the io module of Python 3.

    Oops, in fact it is read1:
    https://docs.python.org/dev/library/io.html#io.BufferedIOBase.read1
    "Read and return up to size bytes, with at most one call to the underlying raw stream’s read()"

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 15, 2016

    If a user calls makefile(bufsize=0), they may have written that based on Python 2’s behaviour of always doing full writes. But in Python 3 this is indirectly documented as doing partial writes: makefile() args interpreted as for open, and open() buffering disabled returns a RawIOBase subclass.

    People porting code from Python 2 may be unprepared for partial writes. Just another subtle Python 2 vs 3 incompatibility. People using only Python 3 might be unprepared if they are not familar with the intricacies of the Python API, but then why are they using bufsize=0? On the other hand, they might require partial writes if they are using select() or similar, so changing it would be a compatibility problem.

    You could use the same arguments for socketserver. The difference is that wfile being in raw mode is not documented. Almost all of the relevant builtin library code that I reviewed expects full blocking writes, and I did not find any that requires partial writes. So I think making the change in socketserver is less likely to introduce compatibility problems, and is much more beneficial.

    @vadmium
    Copy link
    Member Author

    vadmium commented May 14, 2016

    Merged with current code, and copied the original wsgiref test case from bpo-24291, since this patch also fixes that bug.

    @vadmium
    Copy link
    Member Author

    vadmium commented Jun 12, 2016

    Merged with current code, and migrated the interrupted-write test from test_wsgiref into test_socketserver.

    @vadmium
    Copy link
    Member Author

    vadmium commented Jun 12, 2016

    Forgot to remove the workaround added to 3.5 for wsgiref in bpo-24291

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 29, 2016

    New changeset 4ea79767ff75 by Martin Panter in branch 'default':
    Issue bpo-26721: Change StreamRequestHandler.wfile to BufferedIOBase
    https://hg.python.org/cpython/rev/4ea79767ff75

    @vadmium
    Copy link
    Member Author

    vadmium commented Jun 29, 2016

    Committed for 3.6.

    @vadmium vadmium closed this as completed Jun 29, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Jackie14 added a commit to Jackie14/dpu-update that referenced this issue Oct 25, 2024
    Before this patch, we are using a new thread to
    host the http server. It works in most cases, but hit
    the python3.5 partial writes bug (if the script is
    run by python3.5), which lead to firmware image
    downloading failure on BMC side.
    The python3.5 partial writes bug:
      Issue: python/cpython#70908
      Fix: python/cpython@34eeed4#diff-c00d56a0c132ee4bdf79f55a4b43643cf314df1fe122c07c539e120c7ec98b5e
    
    That python3.5 bug can be reproduced in a very rigorous env.
    In our script case, there are two kinds of operations:
       1: Script have a thread to host http server, BMC will http download
       Firmware image from that http server.
       2: At the same time, Script will also send redfish(http) request
       to BMC, to poll the downloading status.
    
       These two operations(traffic) need:
       a: happen at the same time.
       b: happen within the same process (different thread).
    
       For #a: That's why downloading image manually by wget or curl
               do not reproduce the downloading failure issue
       For #b: That's how we fix the problem. We host the http server
               in a new process, to avoid hitting that python3.5 bug.
    Jackie14 added a commit to Jackie14/dpu-update that referenced this issue Oct 25, 2024
    Before this patch, we are using a new thread to
    host the http server. It works in most cases, but hit
    the python3.5 partial writes bug (if the script is
    run by python3.5), which lead to firmware image
    downloading failure on BMC side.
    The python3.5 partial writes bug:
      Issue: python/cpython#70908
      Fix: python/cpython@34eeed4#diff-c00d56a0c132ee4bdf79f55a4b43643cf314df1fe122c07c539e120c7ec98b5e
    
    That python3.5 bug can be reproduced in a very rigorous env.
    In our script case, there are two kinds of operations:
       1: Script have a thread to host http server, BMC will http download
       Firmware image from that http server.
       2: At the same time, Script will also send redfish(http) request
       to BMC, to poll the downloading status.
    
       These two operations(traffic) need:
       a: happen at the same time.
       b: happen within the same process (different thread).
    
       For #a: That's why downloading image manually by wget or curl
               do not reproduce the downloading failure issue
       For #b: That's how we fix the problem. We host the http server
               in a new process, to avoid hitting that python3.5 bug.
    Jackie14 added a commit to Jackie14/dpu-update that referenced this issue Oct 25, 2024
    Before this patch, we are using a new thread to
    host the http server. It works in most cases, but hit
    the python3.5 partial writes bug (if the script is
    run by python3.5), which lead to firmware image
    downloading failure on BMC side.
    The python3.5 partial writes bug:
      Issue: python/cpython#70908
      Fix: python/cpython@34eeed4#diff-c00d56a0c132ee4bdf79f55a4b43643cf314df1fe122c07c539e120c7ec98b5e
    
    That python3.5 bug can be reproduced in a very rigorous env.
    In our script case, there are two kinds of operations:
       1: Script have a thread to host http server, BMC will http download
       Firmware image from that http server.
       2: At the same time, Script will also send redfish(http) request
       to BMC, to poll the downloading status.
    
       These two operations(traffic) need:
       a: happen at the same time.
       b: happen within the same process (different thread).
    
       For #a: That's why downloading image manually by wget or curl
               do not reproduce the downloading failure issue
       For #b: That's how we fix the problem. We host the http server
               in a new process, to avoid hitting that python3.5 bug.
    Jackie14 added a commit to Jackie14/dpu-update that referenced this issue Oct 25, 2024
    Before this patch, we are using a new thread to
    host the http server. It works in most cases, but hit
    the python3.5 partial writes bug (if the script is
    run by python3.5), which lead to firmware image
    downloading failure on BMC side.
    The python3.5 partial writes bug:
      Issue: python/cpython#70908
      Fix: python/cpython@34eeed4#diff-c00d56a0c132ee4bdf79f55a4b43643cf314df1fe122c07c539e120c7ec98b5e
    
    That python3.5 bug can be reproduced in a very rigorous env.
    In our script case, there are two kinds of operations:
       1: Script have a thread to host http server, BMC will http download
       Firmware image from that http server.
       2: At the same time, Script will also send redfish(http) request
       to BMC, to poll the downloading status.
    
       These two operations(traffic) need:
       a: happen at the same time.
       b: happen within the same process (different thread).
    
       For #a: That's why downloading image manually by wget or curl
               do not reproduce the downloading failure issue
       For #b: That's how we fix the problem. We host the http server
               in a new process, to avoid hitting that python3.5 bug.
    Jackie14 added a commit to Mellanox/dpu-update that referenced this issue Oct 25, 2024
    Before this patch, we are using a new thread to
    host the http server. It works in most cases, but hit
    the python3.5 partial writes bug (if the script is
    run by python3.5), which lead to firmware image
    downloading failure on BMC side.
    The python3.5 partial writes bug:
      Issue: python/cpython#70908
      Fix: python/cpython@34eeed4#diff-c00d56a0c132ee4bdf79f55a4b43643cf314df1fe122c07c539e120c7ec98b5e
    
    That python3.5 bug can be reproduced in a very rigorous env.
    In our script case, there are two kinds of operations:
       1: Script have a thread to host http server, BMC will http download
       Firmware image from that http server.
       2: At the same time, Script will also send redfish(http) request
       to BMC, to poll the downloading status.
    
       These two operations(traffic) need:
       a: happen at the same time.
       b: happen within the same process (different thread).
    
       For #a: That's why downloading image manually by wget or curl
               do not reproduce the downloading failure issue
       For #b: That's how we fix the problem. We host the http server
               in a new process, to avoid hitting that python3.5 bug.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants