Patch selectmodule.c to support WSAPoll on Windows #60711

Closed
tpn opened this issue Nov 18, 2012 · 32 comments
Labels
type-feature A feature request or enhancement

Comments

@tpn
Member

tpn commented Nov 18, 2012

BPO 16507
Nosy @gvanrossum, @jcea, @pitrou, @giampaolo, @tpn
Files
  • wsapoll.patch
  • miminal-wsapoll.patch
  • runtime_wsapoll.patch
  • runtime_wsapoll.patch
  • runtime_wsapoll.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-06-20.13:53:30.295>
    created_at = <Date 2012-11-18.22:32:54.167>
    labels = ['type-feature']
    title = 'Patch selectmodule.c to support WSAPoll on Windows'
    updated_at = <Date 2013-06-20.13:53:30.283>
    user = 'https://github.com/tpn'

    bugs.python.org fields:

    activity = <Date 2013-06-20.13:53:30.283>
    actor = 'sbt'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-06-20.13:53:30.295>
    closer = 'sbt'
    components = []
    creation = <Date 2012-11-18.22:32:54.167>
    creator = 'trent'
    dependencies = []
    files = ['28038', '28201', '28207', '28341', '28799']
    hgrepos = []
    issue_num = 16507
    keywords = ['gsoc']
    message_count = 32.0
    messages = ['175927', '175929', '175948', '176864', '176917', '177109', '177634', '180256', '180309', '180315', '180317', '180318', '180322', '180325', '180327', '180328', '180345', '180349', '180350', '180353', '180358', '180360', '180386', '180393', '180396', '180397', '180406', '180407', '180410', '180412', '180422', '180424']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'jcea', 'pitrou', 'giampaolo.rodola', 'trent', 'neologix', 'sbt']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue16507'
    versions = ['Python 3.4']

    @tpn
    Member Author

    tpn commented Nov 18, 2012

    Attached patch adds select.poll() support on Windows via WSAPoll.

    It's hacky; I was curious to see whether or not it could be done, and whether or not tulip's pollster would work with it.

    It compiles and works, but doesn't play very nicely with tulip. Also, just about every lick of code that tests poll() does so in a UNIX-specific way, so it's hard to test.

    As with select, WSAPoll() will barf if you feed it anything other than SOCKETs (i.e. it doesn't work against non-socket file descriptors).
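For readers unfamiliar with the API being ported, here is a minimal sketch of how `select.poll()` is used on platforms that already provide it. It is not taken from the patch; the helper name `wait_readable` is made up for illustration, and the `hasattr` guard assumes a build (such as stock CPython on Windows) where `select.poll` is missing.

```python
import select
import socket


def wait_readable(sock, timeout_ms=1000):
    """Return the (fd, eventmask) pairs poll() reports for one socket."""
    if not hasattr(select, "poll"):
        # Assumption: this platform has no poll() (for example Windows).
        raise NotImplementedError("select.poll() is not available here")
    poller = select.poll()
    # register() accepts an integer fd or any object with a fileno() method.
    poller.register(sock, select.POLLIN)
    try:
        return poller.poll(timeout_ms)
    finally:
        poller.unregister(sock)
```

Calling `wait_readable()` on one end of a `socket.socketpair()` after the other end has sent data should report that descriptor with `POLLIN` set.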

    @pitrou
    Member

    pitrou commented Nov 18, 2012

    Related post:
    http://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/

    @tpn
    Member Author

    tpn commented Nov 19, 2012

    On Sun, Nov 18, 2012 at 03:19:19PM -0800, Antoine Pitrou wrote:

    > Antoine Pitrou added the comment:
    >
    > Related post:
    > http://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/

    Yeah, came across that yesterday.  A few other relevant links, for the
    record:
    
        http://social.msdn.microsoft.com/Forums/en/wsk/thread/18769abd-fca0-4d3c-9884-1a38ce27ae90 (has a code example of what doesn't work)
    
        http://www.codeproject.com/Articles/140533/The-Differences-Between-Network-Calls-in-Windows-a
    
        http://blogs.msdn.com/b/wndp/archive/2006/10/26/wsapoll.aspx
    
        http://curl.haxx.se/mail/lib-2012-10/0038.html
    

    @sbt
    Mannequin

    sbt mannequin commented Dec 3, 2012

    Attached is an alternative patch which only touches selectmodule.c. It still does not support WinXP.

    Note that in this version register() and modify() do not ignore the POLLPRI flag if it was *explicitly* passed. But I am not sure how best to deal with POLLPRI.

    @sbt
    Mannequin

    sbt mannequin commented Dec 4, 2012

    Here is a version which loads WSAPoll at runtime. Still no tests or docs.

    @sbt
    Mannequin

    sbt mannequin commented Dec 7, 2012

    It seems that the return code of WSAPoll() does not include the count of array items with revents == POLLNVAL. In the case where all of them are POLLNVAL, instead of returning 0 (which usually indicates a timeout) it returns -1 and WSAGetLastError() == WSAENOTSOCK.

    This does not match the MSDN documentation which claims that the return code is the number of descriptors for which revents is non-zero. But it arguably does agree with the FreeBSD and MacOSX man pages which say that it returns the number of descriptors that are "ready for I/O".

    BTW, the implementation of select_poll() assumes that the return code of poll() (if non-negative) is equal to the number of non-zero revents fields. But select_have_broken_poll() considers a MacOSX poll() implementation to be good even in cases where this assumption is not true:

    static int select_have_broken_poll(void)
    {
        int poll_test;
        int filedes[2];
        struct pollfd poll_struct = { 0, POLLIN|POLLPRI|POLLOUT, 0 };
        if (pipe(filedes) < 0) {
            return 1;
        }
        poll_struct.fd = filedes[0];
        close(filedes[0]);
        close(filedes[1]);
        poll_test = poll(&poll_struct, 1, 0);
        if (poll_test < 0) {
            return 1;
        } else if (poll_test == 0 && poll_struct.revents != POLLNVAL) {
            return 1;
        }
        return 0;
    }

    Note that select_have_broken_poll() == FALSE if poll_test == 0 and poll_struct.revents == POLLNVAL.
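The same probe can be tried from Python. This is a rough sketch rather than anything from the patch: the function name `poll_closed_fd` is hypothetical, and it assumes a platform where `select.poll()` exists. Because the Python wrapper builds its result list from poll()'s return count, a platform that does not count POLLNVAL entries may return an empty list here even though `revents` was set.

```python
import os
import select


def poll_closed_fd():
    """Poll a descriptor that was just closed; return (fd, events)."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)
    poller = select.poll()
    # Same event mask as the C probe: POLLIN | POLLPRI | POLLOUT.
    poller.register(read_fd, select.POLLIN | select.POLLPRI | select.POLLOUT)
    # Timeout 0: report the current state without blocking.
    return read_fd, poller.poll(0)
```

Where the return count includes POLLNVAL, the list should hold a single `(fd, select.POLLNVAL)` pair.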

    @sbt
    Mannequin

    sbt mannequin commented Dec 16, 2012

    Here is a new version with tests and docs.

    Note that the docs do not mention the bug mentioned in

    http://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/

    Maybe they should?

    Note that that bug makes it a bit difficult to use poll with tulip on Windows. (But one could restrict timeouts to one second and always check outstanding connect attempts using select() when poll() returns.)

    @sbt sbt mannequin added the type-feature A feature request or enhancement label Dec 16, 2012
    @gvanrossum
    Member

    This works well enough (tested in old version of Tulip), right? What's holding it up?

    @gvanrossum
    Member

    Oh, it needs a new patch -- the patch fails to apply in the 3.4
    (default) branch.

    @gvanrossum
    Member

    Here's a new version of the patch. (Will test on Windows next.)

    @gvanrossum
    Member

    That compiles (after hacking the line endings). One Tulip test fails, PollEventLooptests.testSockClientFail. But that's probably because the PollSelector class hasn't been adjusted for Windows yet (need to dig this out of the Pollster code that was deleted when switching to neologix's Selector).

    @sbt
    Mannequin

    sbt mannequin commented Jan 20, 2013

    > That compiles (after hacking the line endings). One Tulip test fails,
    > PollEventLooptests.testSockClientFail. But that's probably because the
    > PollSelector class hasn't been adjusted for Windows yet (need to dig this
    > out of the Pollster code that was deleted when switching to neologix's
    > Selector).

    Sorry I did not deal with this earlier. I can make the modifications to PollSelector tomorrow.

    Just to describe the horrible hack: every time poll() needs to be called we first check if there are any registered async connects. If so then I first use select([], [], connectors) to detect any failed connections, and then use poll() normally.

    This does mean that to detect failed connections we must never use too large a timeout with poll() when there are outstanding connects. Of course one must decide what is an acceptable maximum timeout -- too short and you might damage battery life, too long and you will not get prompt notification of failures.
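The workaround described above could look roughly like the following. This is only a sketch of the idea, not the actual PollSelector code: the names `clamp_timeout`, `poll_with_connect_check` and `MAX_TIMEOUT_WITH_CONNECTS` are invented here, and the one-second cap is an assumption taken from the earlier suggestion to restrict timeouts to one second.

```python
import select

# Hypothetical cap; the thread leaves the acceptable maximum open.
MAX_TIMEOUT_WITH_CONNECTS = 1.0


def clamp_timeout(timeout, have_connectors):
    """Limit the poll timeout (seconds) while connects are outstanding."""
    if not have_connectors:
        return timeout
    if timeout is None or timeout > MAX_TIMEOUT_WITH_CONNECTS:
        return MAX_TIMEOUT_WITH_CONNECTS
    return timeout


def poll_with_connect_check(poller, connectors, timeout=None):
    """Return (events, failed_connectors) for one event loop iteration."""
    failed = []
    if connectors:
        # On Windows a failed connect() shows up in select()'s third list.
        _, _, failed = select.select([], [], list(connectors), 0)
    # If a failure was already found, do not sleep in poll() at all.
    timeout = 0 if failed else clamp_timeout(timeout, bool(connectors))
    timeout_ms = None if timeout is None else int(timeout * 1000)
    return poller.poll(timeout_ms), failed
```

The trade-off mentioned in the comment shows up directly in the constant: a smaller cap means more wakeups, a larger one means slower failure detection.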

    @gvanrossum
    Member

    Ow. How painful. I'll leave this for you to do. Note that this also
    requires separating EVENT_WRITE from EVENT_CONNECT -- I am looking
    into this now, but I am not sure how far I will get with this.

    @gvanrossum
    Member

    (FWIW, I've got the EVENT_CONNECT separation done.)

    @neologix
    Mannequin

    neologix mannequin commented Jan 20, 2013

    Time for a stupid question from someone who doesn't know anything about Windows: if WSAPoll() is really terminally broken, is it really worth the hassle exposing it and warping the API?
    AFAICT, FD_SETSIZE is already bumped to 512 on Windows, and Windows select() is limited by the fd_set size, not the maximum descriptor: so what exactly does WSAPoll() bring over select() on Windows?
    (Especially if there are plans to support IOCP, wouldn't that make WSAPoll() obsolete?)

    @gvanrossum
    Member

    This is a very good question to which I have no good answer. If it weren't for this, we could probably do away with the distinction between add_writer and add_connector, and a lot of code could be simpler. (Or is that distinction also needed for IOCP?)

    @sbt
    Mannequin

    sbt mannequin commented Jan 21, 2013

    On 21/01/2013 5:38pm, Guido van Rossum wrote:

    > This is a very good question to which I have no good answer.
    > If it weren't for this, we could probably do away with the distinction
    > between add_writer and add_connector, and a lot of code could be
    > simpler. (Or is that distinction also needed for IOCP?)

    The distinction is not needed by IOCP. I am also not too sure that
    running tulip on WSAPoll() is a good idea, even if the select module
    provides it.

    OFF-TOPIC: Although it is not the optimal way of running tulip with
    IOCP, I have managed to implement IocpSelector and IocpSocket classes
    well enough to pass tulip's unittests (except for the ssl one).

    I did have to make some changes to the tests: selectors have a
    wrap_socket() method which prepares a socket for use with the selector.
    On Unix it just returns the socket unchanged, whereas for IocpSelector
    it returns an IocpSocket wrapper.

    I also had to make the unittests behave gracefully if there is a
    "spurious wakeup", i.e. the socket is reported as readable, but trying
    to read fails with BlockingIOError. (Spurious wakeups are possible but
    very rare with select() etc.)

    It would be possible to make IocpSelector deal with pipe handles too.
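The `wrap_socket()` hook described above might be shaped like this. The classes below are hypothetical stand-ins written for illustration only; they are not the IocpSelector or IocpSocket code from the branch, and the wrapper here merely delegates attribute access instead of doing overlapped I/O.

```python
class PassThroughSelector:
    """Stand-in for a Unix selector: sockets are used as they are."""

    def wrap_socket(self, sock):
        return sock


class WrappedSocket:
    """Hypothetical stand-in for IocpSocket; delegates to the real socket."""

    def __init__(self, sock):
        self._sock = sock

    def __getattr__(self, name):
        # Only called for attributes not defined on the wrapper itself.
        return getattr(self._sock, name)


class WrappingSelector:
    """Stand-in for IocpSelector: hands out wrapper objects."""

    def wrap_socket(self, sock):
        return WrappedSocket(sock)
```

Code that always goes through `selector.wrap_socket(sock)` stays the same on both kinds of selector, which is presumably why the tests needed adjusting to call it.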

    @gvanrossum
    Member

    Thanks -- I am now close to rejecting the WSAPoll() patch, and even
    closer to rejecting its use for Tulip on Windows. That would in turn
    mean that we should kill add/remove_connector() and also the
    EVENT_CONNECT flag in selector.py. Anyone not in favor please speak
    up!

    Regarding your IOCP changes, that sounds pretty exciting. Richard,
    could you check those into the Tulip as a branch? (Maybe a new branch
    named 'iocp'.)

    @sbt
    Mannequin

    sbt mannequin commented Jan 21, 2013

    On 21/01/2013 7:00pm, Guido van Rossum wrote:

    > Regarding your IOCP changes, that sounds pretty exciting. Richard,
    > could you check those into the Tulip as a branch? (Maybe a new branch
    > named 'iocp'.)

    OK. It may take me a while to rebase them.

    @sbt
    Mannequin

    sbt mannequin commented Jan 21, 2013

    I have created an iocp branch.

    @neologix
    Mannequin

    neologix mannequin commented Jan 21, 2013

    > I have created an iocp branch.

    You could probably port the fixes for spurious notifications to the
    default branch.

    @sbt
    Mannequin

    sbt mannequin commented Jan 22, 2013

    It appears that Linux's "spurious readiness notifications" are a deliberate deviation from the POSIX standard. (They are mentioned in the BUGS section of the man page for select.)

    Should I just apply the following patch to the default branch?

    diff -r 3ef7f1fe286c tulip/events_test.py
    --- a/tulip/events_test.py      Mon Jan 21 18:55:29 2013 -0800
    +++ b/tulip/events_test.py      Tue Jan 22 12:09:21 2013 +0000
    @@ -200,7 +200,12 @@
             r, w = unix_events.socketpair()
             bytes_read = []
             def reader():
    -            data = r.recv(1024)
    +            try:
    +                data = r.recv(1024)
    +            except BlockingIOError:
    +                # Spurious readiness notifications are possible
    +                # at least on Linux -- see man select.
    +                return
                 if data:
                     bytes_read.append(data)
                 else:
    @@ -218,7 +223,12 @@
             r, w = unix_events.socketpair()
             bytes_read = []
             def reader():
    -            data = r.recv(1024)
    +            try:
    +                data = r.recv(1024)
    +            except BlockingIOError:
    +                # Spurious readiness notifications are possible
    +                # at least on Linux -- see man select.
    +                return
                 if data:
                     bytes_read.append(data)
                 else:

    @neologix
    Mannequin

    neologix mannequin commented Jan 22, 2013

    > It appears that Linux's "spurious readiness notifications" are a deliberate deviation from the POSIX standard. (They are mentioned in the BUGS section of the man page for select.)

    I don't think it's a deliberate deviation, but really bugs/limitations
    (I can remember at least one case where a UDP segment would
    be received, which triggered a notification, but the segment was
    subsequently discarded because of an invalid checksum). AFAICT kernel
    developers tried to fix those spurious notifications, but some of them
    were quite tricky (see e.g. http://lwn.net/Articles/318264/ for
    epoll() patches, and
    http://lists.schmorp.de/pipermail/libev/2009q1/000627.html for an
    example spurious epoll() notification scenario).

    That's something we have to live with (like pthread condition spurious
    wakeups), select()/poll()/epoll() are mere hints that the FD is
    readable/writable...

    Also, in real code you have to be prepared to catch EAGAIN regardless
    of spurious notifications: when a FD is reported as read ready, it
    just means that there are some data to read. Depending on the
    watermark, it could mean that only one byte is available.

    So if you want to receive e.g. a large amount of data and the FD is
    non-blocking, you can do something like:

    """
    buffer = []
    while True:
        try:
            data = s.recv(8096)
        except BlockingIOError:
            # Nothing left to read for now.
            break

        if not data:
            # b"" means the peer closed the connection.
            break
        buffer.append(data)
    """

    Otherwise, you'd have to read() only one byte at a time, and go back
    to the select()/poll() syscall.
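As a self-contained variant of the loop above, the sketch below reads until the socket would block and also tells the caller whether the peer closed. The function name `drain` and the `(data, eof)` return shape are choices made for this example, not something from tulip.

```python
import socket


def drain(sock, chunk_size=8192):
    """Read from a non-blocking socket until it would block or hits EOF.

    Returns (data, eof): the bytes collected and whether the peer closed.
    """
    chunks = []
    eof = False
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            # Nothing more to read for now; wait for the next readiness event.
            break
        if not data:
            # recv() returns b"" (not None) once the peer has closed.
            eof = True
            break
        chunks.append(data)
    return b"".join(chunks), eof


# Usage on a local socket pair (illustration only).
a, b = socket.socketpair()
a.setblocking(False)
b.sendall(b"x" * 3000)
data, eof = drain(a, chunk_size=1024)
```

A spurious readiness notification simply makes `drain()` return an empty result with `eof` false, so the caller needs no special case for it.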

    (For write ready, you can obviously have "spurious" notifications if
    you try to write more than what is available in the output socket
    buffer).

    > Should I just apply the following patch to the default branch?

    LGTM.

    @pitrou
    Member

    pitrou commented Jan 22, 2013

    > Also, in real code you have to be prepared to catch EAGAIN regardless
    > of spurious notifications: when a FD is reported as read ready, it
    > just means that there are some data to read. Depending on the
    > watermark, it could mean that only one byte is available.

    If only one byte is available, recv(4096) should simply return a partial result.
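A short read is easy to observe on a local socket pair. The snippet is only an illustration and uses a blocking stream socket from `socket.socketpair()`:

```python
import socket

a, b = socket.socketpair()
b.send(b"x")           # only one byte is in flight
data = a.recv(4096)    # asks for up to 4096 bytes
# data == b"x": a short read, not an error
a.close()
b.close()
```

The size passed to `recv()` is an upper bound, so the call returns as soon as any data is available.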

    @sbt
    Mannequin

    sbt mannequin commented Jan 22, 2013

    According to Alan Cox

    > It's a design decision and a huge performance win. It's one of the areas
    > where POSIX read in its strictest form cripples your performance.

    See https://lkml.org/lkml/2011/6/18/103

    > (For write ready, you can obviously have "spurious" notifications if
    > you try to write more than what is available in the output socket
    > buffer).

    Wouldn't you just get a partial write (assuming an AF_INET, SOCK_STREAM socket)?

    @neologix
    Mannequin

    neologix mannequin commented Jan 22, 2013

    > If only one byte is available, recv(4096) should simply return a partial result.

    Of course, but how do you know if there's data left to read without
    calling select() again? It's much better to call read() until you get
    EAGAIN than calling select() between each read()/write() call.

    > Wouldn't you just get a partial write (assuming an AF_INET, SOCK_STREAM socket)?

    For SOCK_STREAM, yes, not for SOCK_DGRAM (or for a pipe when trying to
    write more than PIPE_BUF, although I guess any sensible implementation
    doesn't report the pipe write ready if there's less than PIPE_BUF
    space left).
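For the stream case, a partial write can be sketched as follows. The helper name `send_some` is invented for this example, and the 16 MiB payload is just an assumption chosen to be larger than a typical local socket buffer.

```python
import socket


def send_some(sock, payload):
    """Try one non-blocking send; return how many bytes the kernel took."""
    try:
        return sock.send(payload)
    except BlockingIOError:
        # The send buffer is full; nothing was accepted this time.
        return 0


a, b = socket.socketpair()
a.setblocking(False)
payload = b"x" * (16 * 1024 * 1024)
accepted = send_some(a, payload)
# Whatever was not accepted must be retried once the socket is writable.
remaining = payload[accepted:]
```

On a stream socket `accepted` will usually be smaller than `len(payload)` here, which is the partial write being discussed rather than an EAGAIN.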

    > It's a design decision and a huge performance win. It's one of the areas
    > where POSIX read in its strictest form cripples your performance.

    Yes, he's referring to the fact that there are cases where you could
    avoid some spurious notifications, but that would incur a performance
    hit: that's exactly the same rationale as behind condition variable
    spurious wakeups: since the user code must be prepared to handle
    spurious notifications, let's take advantage of it.

    But there have been various fixes in the past years to avoid spurious
    notifications in epoll() for example, because while they allow certain
    optimizations in the kernel, spurious wakeups can be costly to user-level
    applications...

    I'm 99% sure that Linux isn't the only OS allowing spurious wakeups,
    since it's essentially an unsolvable issue (temporary shortage of
    buffer, or the example given by Alan Cox of a pipe with two
    readers...).

    @neologix
    Mannequin

    neologix mannequin commented Jan 22, 2013

    > For SOCK_STREAM, yes, not for SOCK_DGRAM (or for a pipe when trying to
    > write more than PIPE_BUF, although I guess any sensible implementation
    > doesn't report the pipe write ready if there's less than PIPE_BUF
    > space left).

    That should be of course "when trying to write LESS than PIPE_BUF",
    since it's required to be atomic.

    @gvanrossum
    Member

    Short reads/writes are orthogonal to EAGAIN. All the mainline code treats
    readiness as a hint only, so tests should too.

    --Guido van Rossum (sent from Android phone)

    @sbt
    Mannequin

    sbt mannequin commented Jan 22, 2013

    > For SOCK_STREAM, yes, not for SOCK_DGRAM

    I thought SOCK_DGRAM messages just got truncated at the receiving end.

    @neologix
    Mannequin

    neologix mannequin commented Jan 22, 2013

    > I thought SOCK_DGRAM messages just got truncated at the receiving end.

    You were referring to partial writes: for a datagram-oriented
    protocol, if the datagram can't be sent atomically (in one
    send()/write() call), the kernel will return EAGAIN. On the receiving
    side, it will get truncated if the buffer is too small.
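The receive-side truncation can be seen with a local datagram pair. This sketch assumes a Unix-like system where `socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)` is available; it does not show the EAGAIN case on the sending side, and other platforms may report an oversized datagram differently.

```python
import socket

# Assumption: a Unix-like system with AF_UNIX datagram socket pairs.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.setblocking(False)
b.send(b"0123456789")
head = a.recv(4)          # buffer smaller than the datagram
try:
    rest = a.recv(4096)
except BlockingIOError:
    rest = None           # the remaining six bytes were discarded
a.close()
b.close()
```

Unlike the stream case, the unread tail of the datagram is not kept for a later `recv()`.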

    Going back to the subject: so what do we say, let's just forget about
    supporting WSAPoll at all (both in CPython and tulip)?

    If we ever choose to export it, I think the least we should do would
    be to not export it as select.poll(): since it has - not so subtle -
    semantic differences with poll(), code that previously used select() on
    Windows may silently break when poll() is suddenly available: e.g.
    asyncore with use_poll=True would probably deadlock in case of
    unreachable host, if WSAPoll doesn't report connect() failures.

    When I see the hoops Richard had to go through to make WSAPoll usable
    in tulip, my gut feeling is that exposing it wouldn't be doing a
    favor to poor unsuspecting Windows programmers :-\

    @gvanrossum
    Member

    Agreed, it does not sound very useful to support WSAPoll(), neither in
    selector.py (which is intended to eventually be turned into
    stdlib/select.py) nor in PEP-3156. And then, what other use is there
    for it, really?

    @sbt sbt mannequin closed this as completed Jun 20, 2013
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022