Close listening socket immediately after stop signal received #922

Closed
tahajahangir opened this Issue Oct 25, 2014 · 14 comments

Projects

None yet

5 participants

@tahajahangir

When stopping gunicorn, server should wait for current requests to complete, but this may take several seconds (even 30 seconds, with the default timeout). Clients connecting in this interval, will wait forever, and will receive connection-reset error after server successfully stopped.

When using two or more gunicorn servers behind a reverse proxy (like nginx), this behavior leads to unreasonable raise in total upstream time (normally, real clients won't wait for 30 seconds, and will close tab, or hit refresh).

This also reduces impact of #901 (locked connections). Also related to #908

@tilgovi
Collaborator
tilgovi commented Oct 27, 2014

Short upstream timeouts reduce the impact, but is not ideal.

As @benoitc has pointed out, we cannot simply close the socket on any graceful shutdown because the worker might be shutting down after a new master is spawned with USR2.

@tahajahangir

Normally, spawning (forking) new master should be before stopping current master. After fork, it's totally safe to close listening socket. The socket will remain open in the forked child. (sample code)

@tilgovi
Collaborator
tilgovi commented Oct 27, 2014

True enough. I'll try to dig up the original conversation, in case there's
something else we've forgotten.

Thanks, tahajahangir

Do you want to try writing a patch? I'm happy to review. Otherwise, I can
maybe look next week.
On Oct 26, 2014 11:37 PM, "Taha Jahangir" notifications@github.com wrote:

Normally, spawning (forking) new master should be before stopping current
master. After fork, it's totally safe to close listening socket. The socket
will remain open in the forked child. (sample code
http://pastebin.com/g0ewbLux)


Reply to this email directly or view it on GitHub
#922 (comment).

@tahajahangir

Unfortunately, I'm not familiar with gunicorn structure, additionally we moved to uwsgi in production, and I don't have enough free time to dig into gunicorn.

@benoitc
Owner
benoitc commented Nov 23, 2014

So the socket will indeed remains in the forked child but we should make sure to kill the socket in the parent only on final exit (ie when the the master is finally asked to exit) during USR2. I will work on such patch asap.

@benoitc benoitc added this to the R19.3 milestone Nov 23, 2014
@pikeas
pikeas commented May 18, 2015

Was this ever fixed?

@benoitc
Owner
benoitc commented May 18, 2015

I din't tested it yet. thanks for the head-up it's on the todo for today
now!

On Mon, May 18, 2015 at 9:55 AM Aris Pikeas notifications@github.com
wrote:

Was this ever fixed?


Reply to this email directly or view it on GitHub
#922 (comment).

@benoitc benoitc self-assigned this May 18, 2015
@mleventi

+1 on getting this in

@benoitc benoitc modified the milestone: R19.4 Nov 23, 2015
@tilgovi tilgovi added this to the 20.0.0 milestone Dec 31, 2015
@benoitc
Owner
benoitc commented Dec 31, 2015

Re-reading this issue, I think we are missing some details, was the server closed using SIGQUIT or SIGTERM? It's totally expected to not close immediately the socket on SIGTERM while SIGQUIT close it immediately. What did I miss?

@tilgovi
Collaborator
tilgovi commented Dec 31, 2015

We're not closing the socket immediately which I think means that a
downstream proxy will keep trying to pass connections to gunicorn. I think
the request here is that during shutdown we close the listening socket
quickly.

On Thu, Dec 31, 2015, 02:16 Benoit Chesneau notifications@github.com
wrote:

Re-reading this issue, I think we are missing some details, was the server
closed using SIGQUIT or SIGTERM? It's totally expected to not close
immediately the socket on SIGTERM while SIGQUIT close it immediately. What
did I miss?


Reply to this email directly or view it on GitHub
#922 (comment).

@tilgovi tilgovi added a commit that referenced this issue Mar 13, 2016
@tilgovi tilgovi [arbiter] close sockets on shutdown
Close all the listeners when the arbiter shuts down. By doing so,
workers can close the socket at the beginning of a graceful shut
down thereby informing the operating system that the socket can
be cleaned up. With this change, graceful exits with such workers
will refuse new connections while draining, allowing load balancers
to respond more quickly and avoiding leaving connections dangling
in the listen backlog, unaccepted.

Ref #922
04d6547
@tilgovi tilgovi added a commit that referenced this issue Mar 13, 2016
@tilgovi tilgovi [arbiter] close sockets on shutdown
Close all the listeners when the arbiter shuts down. By doing so,
workers can close the socket at the beginning of a graceful shut
down thereby informing the operating system that the socket can
be cleaned up. With this change, graceful exits with such workers
will refuse new connections while draining, allowing load balancers
to respond more quickly and avoiding leaving connections dangling
in the listen backlog, unaccepted.

Ref #922
39cecbc
@tilgovi
Collaborator
tilgovi commented Mar 14, 2016
  • Close in arbiter shutdown (#1221)
  • Close in workers at graceful termination
    • sync (not applicable)
    • gevent
    • eventlet (#1222)
    • thread (#1230)
    • asyncio
@tilgovi tilgovi assigned tilgovi and unassigned benoitc Mar 14, 2016
@tilgovi tilgovi added a commit that referenced this issue Mar 21, 2016
@tilgovi tilgovi [thread] close sockets at graceful shutdown
The run loop has to change slightly to support graceful shutdown.
There is no way to interrupt a call to `futures.wait` so instead
the pattern, used by the async workers, is to sleep for only one
second at the most. The poll is extended to a one second timeout
to match.

Since threads are preemptively scheduled, it's possible that the
listener is closed when the request is actually handled. For this
reason it is necessary to slightly refactor the TConn class to store
the listening socket name. The name is checked once at the start of
the worker run loop.

Ref #922
f2418a9
@tilgovi
Collaborator
tilgovi commented Mar 21, 2016

I believe that this can be marked complete as soon as the two PRs I have open for eventlet and the threaded worker are merged.

@benoitc
Owner
benoitc commented Mar 22, 2016

@tilgovi 👍

@tilgovi
Collaborator
tilgovi commented Mar 25, 2016

Closing as all my open PRs have been merged. Opened #1236 to track future work arising from this discussion and related PRs.

@tilgovi tilgovi closed this Mar 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment