libuv backend #790

Closed
jamadden opened this Issue Apr 10, 2016 · 24 comments

7 participants
Member

jamadden commented Apr 10, 2016

(copied from #580 (comment)):

I've been looking at libuv, and it's not quite the panacea one would hope.

First, and most relevant for this thread, the DNS support is sadly no better: it just shoves requests off onto its own internal threadpool, making it no different from gevent's default resolver. (Edit: though possibly with slightly less overhead?)
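
To make the "shove it onto a threadpool" approach concrete, here is a minimal sketch using only the standard library. It is not gevent's or libuv's actual resolver code; the helper name `resolve_async` and the pool size are made up for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool; the size of 4 is an arbitrary choice for this sketch.
_resolver_pool = ThreadPoolExecutor(max_workers=4)

def resolve_async(host, port):
    """Run the blocking getaddrinfo() call in a worker thread.

    The caller gets a Future back immediately, so the event loop thread
    never blocks on name resolution.
    """
    return _resolver_pool.submit(
        socket.getaddrinfo, host, port, type=socket.SOCK_STREAM
    )

# A numeric address keeps the example independent of any DNS server.
future = resolve_async("127.0.0.1", 80)
addresses = future.result(timeout=5)
_resolver_pool.shutdown()
```

Either way the lookup itself is still a blocking call; it just blocks a worker thread instead of the loop.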

Other minor problems include no support for fork watchers or child watchers; those can both be handled at the gevent level though, at least on POSIX (but the situation with children and SIGCHLD won't get any better).

The biggest impedance mismatch, though, is in the handling of IO. The libuv equivalent to libev's IO watchers, "poll" watchers, pretty much matches exactly what libev does, including only working for sockets on Windows (and not using IOCP, AFAICS). If you want the much-vaunted Windows improvements you have to use functions and types specific to TCP, UDP or pipes, and these do not have (exposed) read or write events, relying instead on a series of non-blocking callbacks that are handled internally to the event loop. This doesn't fit well with gevent's "wait for a read event then try to read again" approach (in fact, gevent wouldn't be able to use the Python-level socket object at all). It may be possible to paper over this with some higher-level abstractions, but they don't currently exist.
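
For readers unfamiliar with the readiness-based pattern, the following is a rough standard-library sketch of "wait for a read event, then try to read again". It uses `select.select` where gevent would use an IO watcher and a switch to the hub; the function name `recv_when_ready` is hypothetical.

```python
import select
import socket

def recv_when_ready(sock, nbytes, timeout=1.0):
    """Try a non-blocking read; on EWOULDBLOCK wait for readability and retry."""
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # Nothing to read yet. gevent would park the greenlet here until
            # a read watcher fires; this sketch just blocks in select().
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                raise TimeoutError("no read event within the timeout")

a, b = socket.socketpair()
a.setblocking(False)
b.sendall(b"ping")
data = recv_when_ready(a, 4)
a.close()
b.close()
```

The TCP/UDP/pipe handles in libuv invert this: you hand libuv a callback and it performs the read for you, so there is no separate "readable" event to wait on.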

So in the short term, I'm still interested in looking into a libuv port based on the poll watchers. I think that's pretty low hanging fruit, and I'm interested to compare performance. For another thing, this should make CFFI possible on Windows. But any other Windows-specific improvements that would come from redesigning the socket abstractions are further down the line.

Member

jamadden commented Apr 10, 2016

I've hit a wall with libuv: you can't use libuv in a forked child process, period, no workaround, end-of-story, the child process crashes. See 550b698#diff-89fa73d9850232e80e8a3ef604f73eaeR106 .

While it may be possible to make subprocess work with uv_spawn (I don't know, I haven't tried, but I'd guess probably not, given the complexity of subprocess), the lack of fork() will break the newly-operable multiprocessing (and of course anybody else that simply does a fork()). I'm afraid it's going to take some work on libuv to make it a replacement for gevent's libev use-cases; ref joyent/libuv#1136 and joyent/libuv#1405.

Member

jamadden commented Apr 10, 2016

I found another snag. libuv says:

It is not okay to have multiple active poll handles for the same socket, this can cause libuv to busyloop or otherwise malfunction.

In contrast, libev explicitly says:

you can register as many read and/or write event watchers per fd as you want

gevent is designed to register multiple watchers. Or at least, it doesn't prohibit it. In fact this comes up quite often in test cases as sockets are opened and closed: the same fd gets reused for each of them, and as long as the old socket objects are still around, so are the old watchers (and on PyPy they would be around for some time after the sockets went out of scope, until the GC kicked in).

This wasn't much of an issue with libuv until I started explicitly cleaning up the watchers with uv_close (which is documented as required). As soon as I did, it became a problem. libuv didn't just busyloop or misbehave, it aborted the whole process: Assertion failed: (loop->watchers[w->fd] == w), function uv__io_stop, file src/unix/core.c, line 889.

So we can't rely on a one-to-one mapping of gevent watchers to libuv watchers, like we do for libev. Instead, gevent will have to multiplex all the watchers for a single fd into a single libuv watcher. Not insurmountable, but a layer of complexity that libev is already handling for us.
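
A minimal sketch of what such multiplexing could look like, in pure Python with no libuv involved. The class `FdMultiplexer`, its method names, and the `READ`/`WRITE` constants are all hypothetical; this is not the code in the libuv branch.

```python
from collections import defaultdict

READ, WRITE = 1, 2  # illustrative event bits, not libuv's constants

class FdMultiplexer:
    """Fan many logical watchers on one fd into a single backend watcher."""

    def __init__(self):
        self._watchers = defaultdict(list)  # fd -> [(events, callback), ...]

    def backend_events(self, fd):
        """Union of events the single backend watcher must poll for."""
        mask = 0
        for events, _ in self._watchers.get(fd, ()):
            mask |= events
        return mask

    def add(self, fd, events, callback):
        self._watchers[fd].append((events, callback))
        return self.backend_events(fd)

    def remove(self, fd, events, callback):
        self._watchers[fd].remove((events, callback))
        if not self._watchers[fd]:
            # Last logical watcher gone: the backend watcher could be closed now.
            del self._watchers[fd]
        return self.backend_events(fd)

    def dispatch(self, fd, revents):
        """Called once per backend event; notify every interested watcher."""
        for events, callback in list(self._watchers.get(fd, ())):
            if events & revents:
                callback(events & revents)

mux = FdMultiplexer()
seen = []
on_read = lambda ev: seen.append(("r", ev))
on_write = lambda ev: seen.append(("w", ev))
mux.add(7, READ, on_read)
mask = mux.add(7, WRITE, on_write)  # backend must now poll for both
mux.dispatch(7, READ)               # only the read watcher is notified
```

The return value of `add` and `remove` is the mask the one real watcher should be restarted with; a mask of 0 means it can be stopped and closed.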

Member

jamadden commented Apr 14, 2016

A couple of other minor differences:

  • libuv doesn't support priorities. Notably this changes the idle watcher's semantics (gevent itself doesn't rely on this AFAICS).
  • libuv only handles timer resolution down to 1ms. The excuse for this is that syscalls and other loop overhead are likely to take at least 100 microseconds, so it's not worth supporting anything beneath 1 millisecond. This changes the semantics of several gevent APIs (libev supports whatever timer resolution it can get); here I've had to force attempted waits of less than 1ms up to 1ms (otherwise trying to wait less doesn't wait at all, which can be excessive).
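
The clamping described in the second point can be sketched as below. The function name is hypothetical, and treating a request of zero (or less) as "don't wait" is an assumption of this sketch, not something stated above.

```python
def to_uv_timeout_ms(seconds):
    """Convert a wait in seconds to whole milliseconds for a libuv timer.

    Positive waits shorter than 1ms are rounded up to 1ms so that they
    still wait, instead of collapsing to a zero timeout.
    """
    if seconds <= 0:
        return 0  # assumption: an explicit zero wait stays zero
    return max(1, round(seconds * 1000))

short_wait = to_uv_timeout_ms(0.0001)  # sub-millisecond request
normal_wait = to_uv_timeout_ms(0.25)
```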

Member

jamadden commented Apr 14, 2016

Here's another one: libuv places loops around all its blocking (polling) calls, catching and ignoring EINTR (verified for kqueue and epoll):

for (;;) {
  nfds = poll(...);
  if (nfds == -1) {
    if (errno != EINTR)
      abort(); /* libuv really likes to abort your process */
    if (timeout == -1) /* no specified timeout */
      continue; /* ignore EINTR and start over */
    if (timeout > 0)
      timeout = calculate_new_timeout();
    if (timeout == 0)
      return;
  }
  /* otherwise: event handling elided in this paraphrase */
}

This means that unless there are timers (or idle watchers) running, Python signal handlers, for important things like Ctrl-C, never get a chance to run. If there are timers running, Python signal handlers only get to run after the timeouts expire.

Not only is this the cause of the last few broken tests (other than the fork issue), it also makes for a really lousy user experience: hitting Ctrl-C while, say, socket.sendall() is in progress without a timeout does nothing.

The three options I see are:

  • Install an idle watcher. This has the significant downside of not ever allowing the loop to actually block in the OS. Instead, we're effectively constantly polling in a tight loop.
  • Install a timer with some arbitrarily "low", but acceptable, timeout, so we have a maximum upper bound on how long signals will be delayed.
  • More monkey-patches to the signal module to direct every signal through the event loop.

The idle watcher is probably the most seamless way to go, but also probably the most performance intensive.

Edit: Yeah, the idle watcher was way overkill. A timer was the only way to go. I went with one second, completely arbitrarily.
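
The effect of that timer can be modelled in a few lines. This is only a toy model of how a loop picks its blocking timeout, with hypothetical names; it is not the code in the branch.

```python
SIGNAL_CHECK_INTERVAL = 1.0  # seconds; the arbitrary value mentioned above

def poll_timeout(pending_timer_delays, signal_check_interval=SIGNAL_CHECK_INTERVAL):
    """Return how long one loop iteration may block in the OS.

    With no timers at all the loop would block indefinitely (None), and
    Python signal handlers would never get to run. A repeating
    "signal check" timer puts an upper bound on that delay.
    """
    delays = list(pending_timer_delays)
    if signal_check_interval is not None:
        delays.append(signal_check_interval)
    return min(delays) if delays else None

unbounded = poll_timeout([], signal_check_interval=None)  # blocks forever
bounded = poll_timeout([])                                # at most one second
with_long_timer = poll_timeout([30.0])                    # still at most one second
```

The trade-off is one spurious wakeup per interval in exchange for Ctrl-C being handled within that interval.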

Member

jamadden commented Apr 15, 2016

Here's a fun one: On Linux (only! not on kqueue platforms) if you pass an invalid file-descriptor to an io watcher (uv_poll), libuv will sometimes cheerfully abort your process with no indication of what went wrong:

//linux-core.c
    if (uv__epoll_ctl(loop->backend_fd, op, w->fd, &e)) {
      if (errno != EEXIST)
        abort();

This shows up in 2.7/test_asyncore.py:FileWrapperTest.test_dispatcher which attempts to pass a regular file to asyncore. A normal select or poll will just return immediate read events for the regular file, but epoll blows up with EPERM which means "the target file does not support epoll".
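
The select-versus-epoll difference is easy to observe from Python on a POSIX system. The sketch below uses only the standard library (the function name `probe_regular_file` is made up) and skips the epoll part where `select.epoll` doesn't exist, i.e. anywhere but Linux.

```python
import errno
import select
import tempfile

def probe_regular_file():
    """Report how select() and epoll (if available) treat a regular file."""
    results = {}
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # select() considers a regular file immediately readable.
        readable, _, _ = select.select([fd], [], [], 0)
        results["select_readable"] = fd in readable
        if hasattr(select, "epoll"):  # Linux only
            ep = select.epoll()
            try:
                ep.register(fd, select.EPOLLIN)
                results["epoll_errno"] = None
            except OSError as exc:
                # Python surfaces the failure as an exception; libuv aborts.
                results["epoll_errno"] = exc.errno
            finally:
                ep.close()
    return results

results = probe_regular_file()
epoll_refused = results.get("epoll_errno") == errno.EPERM
```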

On the other hand, in our own test__select.py we pass a closed file-descriptor that was previously attached to a regular file, expecting to get an error message (but not an abort, obviously). With libev, we do. With libuv, silence. No events at all are generated.

Now libuv tries to detect bad fds when you init the watcher, and simply refuses to take them. That would be nice, if it worked reliably, but it doesn't seem to (example: test_asyncore). And of course there's the problem of dealing with a mix of regular files and other streams consistently if you're not allowed to create poll watchers for them. Then we can add on our layer of multiplexing (since libuv won't support multiple watchers for the same file descriptor), which makes it entirely possible to re-use a watcher for an existing FD that then goes bad (although to be fair, libuv does warn about not closing a FD that's being used by a watcher)... I can probably set up a weakref chain to mostly deal with that, but there are bound to be race conditions.

Member

jamadden commented Apr 15, 2016

With a sleepless-night-so-why-not-code sprint that mostly handled the FD issue, I think I've gotten the bulk of the major issues ironed out on OS X and Linux. Nicely, the CFFI bindings benchmark (bench_sendall.py) as well as the Cython bindings for libev do. There are some ref cycles that have to be GC'd that it'd be nice to figure out a way around, but I gave up on that for now. It'd also be nice to have a more comprehensive benchmark suite. (And the bindings themselves are still a bit rough-and-ready; they could use some cleanup, things like improved naming conventions, etc., but I'm holding off on that until I'm done making major changes.)

The big thing remaining for POSIX is the fork support; my organization, for one, won't be able to use libuv until that can be made to work, so I'll try to work with the libuv guys on a solution to that. If we can't get a patch landed, I may either fork libuv itself (license looks compatible), or give up on the bindings altogether (pity, since the code is much easier to work with than libev's).

So then I tried to move on to Windows, where I can't get libuv to even build, despite starting with a near clone of saghul/pyuv, which does build on Windows. I tried debugging locally (in a VM) but got nowhere. The AppVeyor errors, at least for some platforms, are really weird, like the C code isn't even being parsed correctly. I'm burned out on the Windows issue for now; is any maintainer ( @denik @Ivoz ) or other party that's expressed interest ( @jgehrcke ) motivated to take a look?

In the meantime I'm moving on to fork.

Member

jamadden commented Apr 15, 2016

Oh, obviously, also, any feedback on any of my braindumps above is of course appreciated.

Member

jamadden commented Apr 21, 2016

Discussion thread on the libuv group regarding fork (no replies yet): https://groups.google.com/forum/#!topic/libuv/thZzQDO9qnQ

byaka commented May 2, 2016

I like libuv, but as I understand it, supporting libuv (without breaking libev) would need major changes to gevent's architecture?

Last year I used https://github.com/saghul/pyuv for one project; maybe it will be useful...

Member

jamadden commented May 2, 2016

No major changes are needed to do exactly what gevent does now. For more optimal Windows support some changes would be needed, which may or may not be worth it.

There is already a complete implementation in the libuv branch of this repository. It's just waiting for some fixes in libuv itself to be merged.

Member

jamadden commented Oct 27, 2016

The fork support for libuv is at libuv/libuv#846. Hopefully that will be merged for 1.10.

byaka commented Oct 27, 2016

Wow, great news!
What is needed to test this on Linux?
As I understand it, after compiling the fork's sources, I need a special version of gevent that uses libuv as its backend instead of libev?

Member

jamadden commented Oct 27, 2016

You need the libuv branch of gevent plus that branch of libuv. Note that there are still substantial issues to be resolved for parity with libev (see this discussion) and it's not clear that there's really going to be that much of a performance (or other) difference.

Member

jamadden commented Mar 21, 2017

Fork support has landed and will be part of libuv 1.12.

I need to rebase the branch.

Contributor

arcivanov commented Mar 21, 2017

Any data on the performance yet?

Member

jamadden commented Mar 21, 2017

We need better benchmarks. The simple benchmarks that currently exist were basically comparable, IIRC.

2mf commented May 3, 2017

any updates?

the1337guy commented Nov 2, 2017

Updates?

aalhour commented Nov 18, 2017

Hi @jamadden, are there any updates on this?

Member

jamadden commented Nov 18, 2017

It hasn't been a high priority for me lately, but it hasn't been forgotten. There have been several fixes in libuv for its fork support since it was first released so I've been watching how those shake out before proceeding. I hope to have the time to update and merge before the end of the year (but things get crazy this time of year).

papercuptech commented Dec 2, 2017

@jamadden I see the libuv branch's recent changes were over a year ago. Will that still kinda work with the most recent libuv release?

Looking to get gevent just working with PyPy on Windows in any fashion

Member

jamadden commented Dec 2, 2017

@papercuptech Then the libuv branch would probably not be the way to go. I haven't been able to get the libuv extension to build with CPython on Windows at all, let alone PyPy. You might have better luck debugging the current libev code...

That said, I wouldn't recommend either branch for any production usage on pure Windows. They both have the same sorts of limitations (IIRC we're using libuv's select/poll support, so it goes through the same FD mapping that libev does). There have been reports that the Windows Subsystem for Linux works pretty well with gevent, though.

jamadden added this to the 1.3a1 milestone Jan 3, 2018

Member

jamadden commented Jan 27, 2018

gevent 1.3a1 has been released with support for libuv on Unix and Windows, and PyPy on Windows.

Closing this umbrella issue. Please open new issues for specific problems.

jamadden closed this Jan 27, 2018
