Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to cleanly terminate gunicorn with eventlet backend #867

Closed
sjagoe opened this issue Sep 3, 2014 · 13 comments
Closed

Unable to cleanly terminate gunicorn with eventlet backend #867

sjagoe opened this issue Sep 3, 2014 · 13 comments

Comments

@sjagoe
Copy link

sjagoe commented Sep 3, 2014

See the gist at https://gist.github.com/sjagoe/0d9ee7c9d9c034080abe for a simple flask app running in gunicorn with eventlet. The TERM works for the gevent and sync backends.

Here is my output (using the gist above) from running gunicorn with eventlet (I have to ^C the process to kill it):

$ gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app &
[1] 17196
$ curl localhost:6666
some text
$ kill -TERM %1
$ fg
gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app
...
^C
$

Here is the equivalent output using the gevent worker class:

$ gunicorn --bind 127.0.0.1:6666 --worker-class gevent app:app &
[1] 17267
$ curl localhost:6666
some text
$ kill -TERM %1
$
[1]+  Done                    gunicorn --bind 127.0.0.1:6666 --worker-class gevent app:app
$
@sjagoe
Copy link
Author

sjagoe commented Sep 3, 2014

This issue is present when using Python 2.7 or Python 3.4 (with eventlet at commit cbd404d56e231, which is Py3 compatible).

@benoitc
Copy link
Owner

benoitc commented Sep 9, 2014

I couldn't reproduce it with the latest stable of eventlet. Are you testing against the master of eventlet?

@sjagoe
Copy link
Author

sjagoe commented Sep 9, 2014

At the time, yes it was master (as the release did not support Python 3).

Using eventlet 0.15.2 and gunicorn 19.1.1 (and flask 0.10.1, if it matters), I still see the above behaviour. a SIGTERM does not cause gunicorn with eventlet to shut down, however a SIGINT does. This is tested on both Python 2.7.6 and 3.4.0.

$ gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app &
[1] 20084
$ kill -TERM %1
$ 
... # it does not terminate
$
$ kill -INT %1
$ 
[1]+  Done                    gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app
$

@benoitc
Copy link
Owner

benoitc commented Sep 9, 2014

mmm it seems to wait a little before stopping:

$ gunicorn -k eventlet flaskapp:app &
[1] 56146
$ kill -TERM %1

test if the process is still up:

$ ps ax|grep gunicorn
56153 s002  S+     0:00.00 grep gunicorn
[1]+  Done                    gunicorn -k eventlet flaskapp:app

test a new time :

$ ps ax|grep gunicorn
56155 s002  S+     0:00.01 grep gunicorn

Which seems normal for me, but I will check. Dod you checked if the process didn't really quit?

@sjagoe
Copy link
Author

sjagoe commented Sep 9, 2014

Okay, so the actual behaviour I appear to be seeing is that wIth the eventlet backend, it is taking 30s to exit on TERM. With gevent, it is immediate. Is this the expected behaviour for the eventlet backend?

$ gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app &
[1] 9172
$ kill -TERM %1; time while [[ "$(jobs)"x != ""x ]]; do echo -en .; sleep 1; done
..............................[1]+  Done                    gunicorn --bind 127.0.0.1:6666 --worker-class eventlet app:app
real    0m30.126s
$ gunicorn --bind 127.0.0.1:6666 --worker-class gevent app:app &
[1] 31792
$ kill -TERM %1; time while [[ "$(jobs)"x != ""x ]]; do echo -en .; sleep 1; done
.[1]+  Done                    gunicorn --bind 127.0.0.1:6666 --worker-class gevent app:app
real    0m1.006s

@tilgovi
Copy link
Collaborator

tilgovi commented Sep 10, 2014

We never actually shut down the acceptors and we instead wait() on them, but they don't even terminate because they'll sleep on accept() indefinitely.

Here's the beginning of a solution:

diff --git a/examples/longpoll.py b/examples/longpoll.py
index 97d6647..3e15a6b 100644
--- a/examples/longpoll.py
+++ b/examples/longpoll.py
@@ -13,7 +13,7 @@ class TestIter(object):
         lines = ['line 1\n', 'line 2\n']
         for line in lines:
             yield line
-            time.sleep(20)
+            time.sleep(5)

 def app(environ, start_response):
     """Application which cooperatively pauses 20 seconds (needed to surpass normal timeouts) before responding"""
diff --git a/gunicorn/workers/geventlet.py b/gunicorn/workers/geventlet.py
index c842bb5..c626e38 100644
--- a/gunicorn/workers/geventlet.py
+++ b/gunicorn/workers/geventlet.py
@@ -66,10 +66,10 @@ class EventletWorker(AsyncWorker):
     def run(self):
         acceptors = []
         for sock in self.sockets:
-            sock = GreenSocket(sock)
-            sock.setblocking(1)
-            hfun = partial(self.handle, sock)
-            acceptor = eventlet.spawn(eventlet.serve, sock, hfun,
+            gsock = GreenSocket(sock)
+            gsock.setblocking(1)
+            hfun = partial(self.handle, gsock)
+            acceptor = eventlet.spawn(self.serve, gsock, hfun,
                     self.worker_connections)

             acceptors.append(acceptor)
@@ -82,8 +82,23 @@ class EventletWorker(AsyncWorker):
         self.notify()
         try:
             with eventlet.Timeout(self.cfg.graceful_timeout) as t:
+                [a.kill(eventlet.StopServe()) for a in acceptors]
                 [a.wait() for a in acceptors]
         except eventlet.Timeout as te:
             if te != t:
                 raise
             [a.kill() for a in acceptors]
+
+    def serve(self, sock, handle, concurrency):
+        pool = eventlet.greenpool.GreenPool(concurrency)
+        server_gt = eventlet.greenthread.getcurrent()
+
+        while True:
+            try:
+                conn, addr = sock.accept()
+                gt = pool.spawn(handle, conn, addr)
+                gt.link(eventlet.convenience._stop_checker, server_gt, conn)
+                conn, addr, gt = None, None, None
+            except eventlet.StopServe:
+                pool.waitall()
+                return

@sjagoe
Copy link
Author

sjagoe commented Sep 10, 2014

Thanks, @tilgovi, I just tested that patch with my dummy application and it appears to fix the issue.

I'll apply it to gunicorn running a real app as well.

@sjagoe
Copy link
Author

sjagoe commented Sep 16, 2014

The patch above works. Only I do see the StopServe exceptions printed, due to the eventlet debug printing (via traceback.print_exception()), which is on by default; however, that is tangential.

@benoitc
Copy link
Owner

benoitc commented Sep 22, 2014

@tilgovi any reason no not make it as a PR ? patch looks good for me.

@benoitc benoitc added this to the R19.2 milestone Sep 22, 2014
@tilgovi
Copy link
Collaborator

tilgovi commented Sep 22, 2014

I'll do it this week.
On Sep 22, 2014 6:16 AM, "Benoit Chesneau" notifications@github.com wrote:

@tilgovi https://github.com/tilgovi any reason no not make it as a PR ?
patch looks good for me.


Reply to this email directly or view it on GitHub
#867 (comment).

@gtaylor
Copy link
Contributor

gtaylor commented Oct 8, 2014

Looks like this is also affecting us.

@sjagoe
Copy link
Author

sjagoe commented Oct 8, 2014

@tilgovi obviously hasn't had time for the pull request. I'll try to put one together tomorrow.

@tilgovi
Copy link
Collaborator

tilgovi commented Nov 18, 2014

Closed in 2f6aa75

@tilgovi tilgovi closed this as completed Nov 18, 2014
kiivihal added a commit to delving/nave that referenced this issue Jul 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants