I'm working on a critical service running on gunicorn + Flask with the following configuration:
Python 3.9.0
latest gunicorn
latest flask
worker class: gthread
threads: 1
workers: 4-20, depending on the server it is deployed to
Lately I wanted to add graceful shutdown to our application, so I used the on_exit hook to register the service as 'DOWN' in our discovery service.
Everything works, but the listening port is closed before on_exit is invoked, so when I send SIGTERM what actually happens is:
1. The listening port is closed (the service is still registered as up in the discovery service)
2. The arbiter closes the listening socket
3. The arbiter waits for the graceful timeout
4. The workers are killed
5. on_exit runs
As a result, clients keep sending requests during steps 1-4, and those requests can be lost, because the service is only registered as 'DOWN' in step 5.
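For context, the hook setup described above looks roughly like the sketch below. `on_exit(server)` is gunicorn's documented server hook, called just before the master exits. Everything else here (`discovery_status`, `set_discovery_status`) is a hypothetical stand-in for the real discovery-service client, which is not shown in this issue.

```python
# gunicorn.conf.py (sketch, not the actual production config)
import logging

log = logging.getLogger(__name__)

# Hypothetical in-memory stand-in for the discovery service's view of us.
discovery_status = {"state": "UP"}


def set_discovery_status(state: str) -> None:
    """Hypothetical helper: a real one would call the discovery service API."""
    discovery_status["state"] = state
    log.info("discovery status set to %s", state)


def on_exit(server):
    """gunicorn server hook: runs just before the master process exits,
    which is after the listeners are closed and the workers are stopped."""
    set_discovery_status("DOWN")
```

Because the hook only fires at the very end of shutdown, the 'DOWN' registration cannot happen before the listening socket goes away.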
As far as I know, when a parent process closes a socket (fd) that a child process also holds, the socket should only really close once its reference count reaches 0. So why is the reference count on the listening socket already 1 before the socket is closed?
I know it is supposed to be 1 (master) + the number of workers, but in gunicorn's case it's always 1.
Is this intentional? I couldn't figure it out from sock.py, and I think the parent process should only be responsible for its own socket.
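The fork behaviour I'm expecting can be checked outside gunicorn. This is a minimal sketch assuming POSIX `os.fork` semantics. The function name is made up for illustration, and it does not reproduce anything gunicorn's workers do with their own copies of the socket.

```python
import os
import socket


def child_can_accept_after_parent_close() -> bool:
    """Return True if a forked child can still accept a connection on a
    listening socket after the parent has closed its own descriptor."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    pid = os.fork()
    if pid == 0:
        # Child: holds an inherited copy of the listening descriptor.
        code = 1
        try:
            listener.settimeout(5)
            conn, _ = listener.accept()
            conn.sendall(b"ok")
            conn.close()
            code = 0
        finally:
            os._exit(code)

    # Parent: this drops only the parent's reference to the socket.
    listener.close()

    # The port still accepts connections because the child keeps it open.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        reply = b"".join(iter(lambda: client.recv(16), b""))

    _, status = os.waitpid(pid, 0)
    child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return reply == b"ok" and child_ok
```

If this returns True, closing the descriptor in the parent alone does not stop the port from listening, which is why I expected the workers to keep serving until they exit.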
Below is the relevant code from the arbiter:
    def halt(self, reason=None, exit_status=0):
        """ halt arbiter """
        self.stop()
        self.log.info("Shutting down: %s", self.master_name)
        if reason is not None:
            self.log.info("Reason: %s", reason)
        if self.pidfile is not None:
            self.pidfile.unlink()
        self.cfg.on_exit(self)
        sys.exit(exit_status)

    def stop(self, graceful=True):
        """\
        Stop workers

        :attr graceful: boolean, If True (the default) workers will be
        killed gracefully (ie. trying to wait for the current connection)
        """
        unlink = (
            self.reexec_pid == self.master_pid == 0
            and not self.systemd
            and not self.cfg.reuse_port
        )
        sock.close_sockets(self.LISTENERS, unlink)

        self.LISTENERS = []
        sig = signal.SIGTERM
        if not graceful:
            sig = signal.SIGQUIT
        limit = time.time() + self.cfg.graceful_timeout
        # instruct the workers to exit
        self.kill_workers(sig)
        # wait until the graceful timeout
        while self.WORKERS and time.time() < limit:
            time.sleep(0.1)

        self.kill_workers(signal.SIGKILL)
Thanks