Make the gevent workers handle the quit signal by deferring to a new greenlet #1128

Merged: tilgovi merged 1 commit into benoitc:master on Oct 26, 2015

4 participants

@jamadden
Contributor

As discussed in #1126. This fixes the problem interactively. However, I have no idea how to add a unit test; there doesn't seem to be anything similar in place already. Suggestions welcome!

@tilgovi
Collaborator
tilgovi commented Oct 15, 2015

Hmmm. I'm actually wondering if we shouldn't just remove the time.sleep(). I know I tried to explain why I thought we had it, but I may not actually agree with it :). What do you think, @jamadden, or anyone else?

@jamadden
Contributor

I'm +1 for simply removing time.sleep. It's not like it's been working, and we haven't noticed any particular issues simply because the worker exits without sleeping.

@xtao
xtao commented Oct 16, 2015

It works, thanks.

@benoitc
Owner
benoitc commented Oct 16, 2015

@tilgovi it's there to give the VM some time to release locks.

The patch looks good. However, I am wondering whether there is a way to patch the signal handler when launching the worker. Indeed, gevent.spawn is much the same as gevent.sleep, or time.sleep when patched. I think this possibility would be worth investigating, imo.

@tilgovi
Collaborator
tilgovi commented Oct 16, 2015

Gevent < 1.1 does not monkey patch signal, but even with 1.1, which patches signal.signal and time.sleep, the same error occurs.

@jamadden
Contributor

@tilgovi Signals are delivered to the main greenlet, which happens to be the one running the hub---so the net result is the same on 1.0. So a complete stack trace captured there actually looks like this (captured with pdb):

[2015-10-16 07:34:43 -0500] [20467] [INFO] Booting worker with pid: 20467
^C[2015-10-16 07:34:45 -0500] [20464] [INFO] Handling signal: int
> //gunicorn/gunicorn/workers/ggevent.py(175)handle_quit()
-> gevent.spawn(super(GeventWorker, self).handle_quit, sig, frame)
(Pdb) bt
  //lib/python2.7/site-packages/gevent/hub.py(371)run()
-> loop.run()
  //gunicorn/gunicorn/workers/ggevent.py(174)handle_quit()
-> pdb.set_trace()
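
For anyone who wants to see the same effect without gevent, here is a minimal standard-library sketch of the underlying behaviour: a Python signal handler runs in the main thread, on top of whatever that thread was executing. The names event_loop_stand_in and record_interrupted_frame are made up for illustration, and the loop is only a rough analogue of the hub's loop.run(), not gevent's code. It assumes a POSIX system and that it runs in the main thread.

```python
import os
import signal
import time

# Names of the functions that were interrupted by the signal handler.
seen = []


def record_interrupted_frame(sig, frame):
    # `frame` is the Python frame the main thread was executing
    # when the interpreter got around to running this handler.
    seen.append(frame.f_code.co_name)


def event_loop_stand_in():
    # Hypothetical stand-in for a long-running loop such as the hub's loop.run().
    # Send ourselves a signal, then keep "looping" until the handler has run.
    os.kill(os.getpid(), signal.SIGUSR1)
    deadline = time.time() + 5
    while not seen and time.time() < deadline:
        time.sleep(0.01)


previous = signal.signal(signal.SIGUSR1, record_interrupted_frame)
try:
    event_loop_stand_in()
finally:
    # Put the previous handler back so the sketch leaves no trace.
    signal.signal(signal.SIGUSR1, previous)
```

After this runs, seen holds the name of the loop function, which is the same shape as the backtrace above: the handler sits directly on top of the loop it interrupted.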

Under 1.1, monkey patching signal only has any effect for SIGCHLD; everything else is left untouched and behaves just as it did in 1.0.

@benoitc gevent.spawn is not the same thing as gevent.sleep or time.sleep. Both of the latter are blocking functions, while spawn is not. spawn is explicitly documented as safe to use in situations like this:

Note that the callbacks supplied to the libev API are run in the gevent.hub.Hub greenlet and thus cannot use the synchronous gevent API. It is possible to use the asynchronous API there, like gevent.spawn() and gevent.event.Event.set().

gevent does have a function gevent.signal that has the same effect as this patch (runs the handler in a new greenlet), but that API must be invoked directly---it is not monkey-patched in for compatibility.
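
To make the deferral pattern concrete, here is a rough sketch that does not need gevent installed. BaseWorker, DeferringWorker and RecordingSpawner are hypothetical stand-ins, not gunicorn or gevent classes; the spawn callable is injected so that the same handler shape could be given gevent.spawn in a real worker. The only point is that the signal handler returns immediately and the real work happens later, somewhere it is allowed to block.

```python
import signal


class BaseWorker(object):
    """Hypothetical stand-in for a base worker; not gunicorn's class."""

    def __init__(self):
        self.alive = True

    def handle_quit(self, sig, frame):
        # Pretend this is shutdown work that may block.
        self.alive = False


class DeferringWorker(BaseWorker):
    """Defers the real quit handling instead of running it in the handler."""

    def __init__(self, spawn):
        super(DeferringWorker, self).__init__()
        # `spawn` is any non-blocking "run this later" callable.
        self._spawn = spawn

    def handle_quit(self, sig, frame):
        # Same shape as the patched line in the trace above:
        # schedule the parent's handler and return right away.
        self._spawn(super(DeferringWorker, self).handle_quit, sig, frame)


class RecordingSpawner(object):
    """Hypothetical stand-in for a spawn function: queues calls, runs them later."""

    def __init__(self):
        self.pending = []

    def __call__(self, func, *args):
        self.pending.append((func, args))

    def run_pending(self):
        while self.pending:
            func, args = self.pending.pop(0)
            func(*args)


spawner = RecordingSpawner()
worker = DeferringWorker(spawner)

worker.handle_quit(signal.SIGQUIT, None)  # the "signal handler" returns at once
alive_right_after_signal = worker.alive   # the blocking work has not run yet

spawner.run_pending()                     # later, outside the handler
alive_after_deferred_work = worker.alive
```

Injecting the spawn callable is only a convenience for trying this out in isolation; the PR itself calls gevent.spawn directly.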

@jamadden
Contributor

@benoitc I'm curious, can you share some more details about what locks you're talking about? And why they wouldn't be automatically cleaned up by interpreter shut-down? Is this something gunicorn specific?

@jamadden
Contributor

Any more thoughts on this?

@tilgovi
Collaborator
tilgovi commented Oct 19, 2015

From what I can tell, gevent 1.1 does monkey patch signal.signal, so I'm a bit confused about why this problem still exists there.

@jamadden
Contributor

Yes, there is a monkey patch on signal.signal in 1.1, but it only handles the case where the first argument is SIGCHLD. Everything else is passed through to the standard library, including SIGQUIT and SIGINT. So this case is unaffected by the patch. gevent's code literally looks like this:

import signal as _signal
_signal_signal = _signal.signal # capture before patch
def signal(signalnum, handler):
    """
    Exactly the same as :func:`signal.signal` except where
    :const:`signal.SIGCHLD` is concerned.
    """
    if signalnum != _signal.SIGCHLD:
        return _signal_signal(signalnum, handler)
    ...

@tilgovi
Collaborator
tilgovi commented Oct 19, 2015

Oh, right. You said that. Sorry.

@tilgovi
Collaborator
tilgovi commented Oct 19, 2015

I think this PR looks good enough to me.

@tilgovi
Collaborator
tilgovi commented Oct 26, 2015

I'm not hearing any objection. Pushing the button.

@tilgovi merged commit 821123d into benoitc:master on Oct 26, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@simon-weber added a commit to simon-weber/gchatautorespond that referenced this pull request on Sep 2, 2016: "gunicorn 19.6.0 for gevent fix" (ecb7d48)