unix socket gets deleted on worker restart #1298

Closed
diwu1989 opened this Issue Jun 24, 2016 · 30 comments

@diwu1989

We have a Django app deployed on Heroku with gunicorn 19.6.0.
It binds to a unix socket that Nginx uses.

These are the options that we're using:

  --max-requests 500 \
  --max-requests-jitter 25 \
  --preload \
  --threads 2 \
  --timeout 27 \
  --workers 3 \
  --bind unix:/tmp/nginx.socket
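
For reference, the same options could also be written as a gunicorn configuration file. This is only a sketch: the file name `gunicorn.conf.py` is an assumption (the report passes everything on the command line), and the setting names are the config-file counterparts of the flags listed above.

```python
# gunicorn.conf.py (hypothetical file name; the report uses CLI flags instead)
# Each setting mirrors one command-line flag from the list above.
bind = "unix:/tmp/nginx.socket"  # --bind
workers = 3                      # --workers
threads = 2                      # --threads
timeout = 27                     # --timeout
max_requests = 500               # --max-requests
max_requests_jitter = 25         # --max-requests-jitter
preload_app = True               # --preload
```

With `max_requests` set, each worker restarts itself after roughly that many requests, which is the restart path this issue turns out to depend on.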

What we're seeing is that the socket file disappears after a while.
Downgraded to gunicorn 19.4.5 and things are working fine.

Is the nginx socket file being deleted accidentally?

benoitc Jun 25, 2016

Owner

@diwu1989 do you have any gunicorn logs from the time it happens? That would help figure out what's happening. I can't reproduce it for now.

diwu1989 Jun 26, 2016

Yes, I realize my initial report doesn't provide enough context. I'm going to try to put together repro steps to post.

diwu1989 Jun 26, 2016

Here's a log snippet I collected while rebooting a dyno:

2016-06-26T05:24:48.691401+00:00 app[web.1]: [2016-06-26 05:24:48 +0000] [23] [ERROR] Exception in worker process
2016-06-26T05:24:48.691403+00:00 app[web.1]: Traceback (most recent call last):
2016-06-26T05:24:48.691404+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/gunicorn/arbiter.py", line 557, in spawn_worker
2016-06-26T05:24:48.691405+00:00 app[web.1]:     worker.init_process()
2016-06-26T05:24:48.691406+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 109, in init_process
2016-06-26T05:24:48.691406+00:00 app[web.1]:     super(ThreadWorker, self).init_process()
2016-06-26T05:24:48.691407+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/gunicorn/workers/base.py", line 132, in init_process
2016-06-26T05:24:48.691408+00:00 app[web.1]:     self.run()
2016-06-26T05:24:48.691408+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 240, in run
2016-06-26T05:24:48.691409+00:00 app[web.1]:     s.close()
2016-06-26T05:24:48.691410+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/gunicorn/sock.py", line 123, in close
2016-06-26T05:24:48.691411+00:00 app[web.1]:     os.unlink(self.cfg_addr)
2016-06-26T05:24:48.691411+00:00 app[web.1]: OSError: [Errno 2] No such file or directory: '/tmp/nginx.socket'

diwu1989 Jun 26, 2016

this seems to be what's causing the socket to disappear

2016-06-26T05:35:11.069314+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [24] [DEBUG] GET /
2016-06-26T05:35:11.070223+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [24] [DEBUG] Closing connection.
2016-06-26T05:35:11.443232+00:00 heroku[router]: at=info method=GET path="/" host=gunicorn.herokuapp.com request_id=7cd8420c-53d5-4b27-9d17-e7793fb5b4d7 fwd="52.33.79.242" dyno=web.1 connect=0ms service=3ms status=200 bytes=162
2016-06-26T05:35:11.409075+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [DEBUG] GET /
2016-06-26T05:35:11.409329+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [INFO] Autorestarting worker after current request.
2016-06-26T05:35:11.410194+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [DEBUG] Closing connection.
2016-06-26T05:35:11.787756+00:00 heroku[router]: at=info method=GET path="/" host=gunicorn.herokuapp.com request_id=9899ce58-1450-4427-8d2a-9201949eae72 fwd="52.33.79.242" dyno=web.1 connect=0ms service=3ms status=200 bytes=162
2016-06-26T05:35:11.774464+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [DEBUG] GET /
2016-06-26T05:35:11.775451+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [DEBUG] Closing connection.
2016-06-26T05:35:11.775798+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [23] [INFO] Worker exiting (pid: 23)
2016-06-26T05:35:11.800646+00:00 app[web.1]: [2016-06-26 05:35:11 +0000] [35] [INFO] Booting worker with pid: 35
2016-06-26T05:35:12.140399+00:00 heroku[router]: at=info method=GET path="/" host=gunicorn.herokuapp.com request_id=fecd3f83-46f4-46f9-aee1-eae85f9ae626 fwd="52.33.79.242" dyno=web.1 connect=0ms service=1ms status=502 bytes=311
2016-06-26T05:35:12.138120+00:00 app[web.1]: 2016/06/26 05:35:12 [crit] 29#0: *442 connect() to unix:/tmp/nginx.socket failed (2: No such file or directory) while connecting to upstream, client: 10.93.192.30, server: _, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/nginx.socket:/", host: "gunicorn.herokuapp.com"
2016-06-26T05:35:12.522873+00:00 heroku[router]: at=info method=GET path="/" host=gunicorn.herokuapp.com request_id=bf01f7fd-fcd1-4678-b882-c5ed0b21fa23 fwd="52.33.79.242" dyno=web.1 connect=0ms service=1ms status=502 bytes=311
2016-06-26T05:35:12.488939+00:00 app[web.1]: 2016/06/26 05:35:12 [crit] 29#0: *444 connect() to unix:/tmp/nginx.socket failed (2: No such file or directory) while connecting to upstream, client: 10.123.140.213, server: _, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/nginx.socket:/", host: "gunicorn.herokuapp.com"
2016-06-26T05:35:12.843974+00:00 heroku[router]: at=info method=GET path="/" host=gunicorn.herokuapp.com request_id=bc8cc4ba-bb4b-4326-9181-742b84e3a9e6 fwd="52.33.79.242" dyno=web.1 connect=0ms service=1ms status=502 bytes=311
2016-06-26T05:35:12.840446+00:00 app[web.1]: 2016/06/26 05:35:12 [crit] 29#0: *446 connect() to unix:/tmp/nginx.socket failed (2: No such file or directory) while connecting to upstream, client: 10.179.144.75, server: _, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/nginx.socket:/", host: "gunicorn.herokuapp.com"

diwu1989 Jun 26, 2016

Since I'm running with multiple workers, it seems that whenever one of them autorestarts, the unix socket gets wiped as part of the worker exiting:
Autorestarting worker after current request.
Worker exiting (pid: 23)
Booting worker with pid: 35
*442 connect() to unix:/tmp/nginx.socket failed (2: No such file or directory) while connecting to upstream
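
The sequence described above can be imitated outside gunicorn. The following is a minimal sketch, assuming a POSIX system with `os.fork`; the function name and the temporary paths are hypothetical, and this is not gunicorn's code. A parent process stands in for the arbiter and a forked child stands in for an exiting worker.

```python
import os
import socket
import tempfile


def path_survives_worker_exit(unlink_on_close):
    """Fork a stand-in worker and report whether the socket path outlives it."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")

    # Stand-in for the arbiter: create and bind the listening socket.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)

    pid = os.fork()
    if pid == 0:
        # Stand-in for an exiting worker: close the inherited descriptor.
        listener.close()
        if unlink_on_close:
            # Mirrors the os.unlink() call seen in the traceback.
            os.unlink(path)
        os._exit(0)

    os.waitpid(pid, 0)
    survived = os.path.exists(path)

    # Clean up the demo files in the parent.
    listener.close()
    if survived:
        os.unlink(path)
    os.rmdir(tmpdir)
    return survived
```

Calling `path_survives_worker_exit(True)` should report that the path is gone even though the parent still holds the listening socket open, while `path_survives_worker_exit(False)` leaves the path in place.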

diwu1989 Jun 26, 2016

commit 118668c seems to be fine
after commit c62cf2f is bad
image

diwu1989 Jun 26, 2016

To reproduce, clone this repo: https://github.com/diwu1989/gunicorn-socket-bug
Deploy it to a Heroku app, then run Apache Bench against it to trigger the behavior.

My guess is that the bug is related to this change:
image

diwu1989 changed the title from "unix socket disappears after a while" to "unix socket gets deleted on worker restart" Jun 26, 2016

benoitc Jun 26, 2016

Owner

@diwu1989 The code you reference in the old version was not present in a stable release. Anyway, in the current version the only reason the socket can disappear is that gunicorn has quit or been terminated (i.e. the arbiter dies). Do you have any production logs that show a restart or termination of gunicorn?

I don't have a Heroku account available right now, but I will try to create one to test it on Monday.

benoitc Jun 26, 2016

Owner

Sorry, I missed the first log. I think there is a bug in the thread worker: it shouldn't close the socket there. I will have a closer look at it ASAP.

diwu1989 Jun 26, 2016

I don't think this is a Heroku-specific problem. It looks like a gthread worker ends up deleting the unix socket when it reaches its lifespan limit (if the max-requests setting is set).

tilgovi Jun 27, 2016

Collaborator

@benoitc it should close the socket there

I may have made a mistake in reviewing c62cf2f. We made the change so that only one arbiter will close the socket, and removed the protections in the socket's close method. However, workers still close the socket.

The workers should close the socket, though, because otherwise the OS still thinks there is a listener during graceful shutdown when what we want is immediate rejection of new connections so that load balancers and reverse proxies can fail over.

Maybe we should just make unlink explicit and not have it implied by close. Then the top arbiter can explicitly unlink those sockets that should be unlinked.
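
The suggestion above, separating `close` from `unlink`, could look roughly like this. It is a sketch under assumptions: `UnixListener` and its attributes are hypothetical names for illustration, not gunicorn's actual socket class or the patch that was eventually merged.

```python
import os
import socket
import tempfile


class UnixListener:
    """Hypothetical listener wrapper; not gunicorn's actual socket class."""

    def __init__(self, path):
        self.path = path
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.bind(path)
        self.sock.listen(16)

    def close(self):
        # Close the descriptor only. Any process may call this.
        self.sock.close()

    def unlink(self):
        # Remove the filesystem path. Only the owning arbiter should call this.
        os.unlink(self.path)


tmpdir = tempfile.mkdtemp()
listener = UnixListener(os.path.join(tmpdir, "demo.sock"))
listener.close()
path_after_close = os.path.exists(listener.path)   # close() left the path alone
listener.unlink()
path_after_unlink = os.path.exists(listener.path)  # explicit unlink() removed it
os.rmdir(tmpdir)
```

With this split, a worker can call `close()` during graceful shutdown without any risk of removing the path that other processes still rely on.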

diwu1989 Jul 16, 2016

when do you guys expect to have this fixed?

tilgovi Jul 16, 2016

Collaborator

I'm still waiting for a response to my last comment, @diwu1989. What do you think?

benoitc Jul 16, 2016

Owner

@tilgovi before that we only had a barrier checking whether the parent was actually the one unlinking the socket. Maybe we should just keep that? I.e. we keep the parent pid when creating the socket, and if the pid is different we don't unlink. Thoughts?

benoitc Jul 16, 2016

Owner

@diwu1989 we will try to make a release during the coming week.

tilgovi Jul 16, 2016

Collaborator

Yes, we need to keep the barrier for unlink. What I am saying is we also need to make unlink explicit so when the workers close the socket they don't unlink it.

tilgovi Jul 16, 2016

Collaborator

So then within the conditional in the arbiter only we would explicitly call unlink.

benoitc Jul 16, 2016

Owner

Why make it explicit? The socket is only created in the arbiter, so there is no chance that a worker could unlink it. Something like:

def close(self):
    # Unlink only in the process that created the socket; cfg_addr is the
    # path attribute that appears in the traceback earlier in this thread.
    if os.getpid() == self.parent_pid:
        os.unlink(self.cfg_addr)
    ...

If we make it explicit, then there is no need for the barrier I described. What did I miss?

tilgovi Jul 17, 2016

Collaborator

I think you may be forgetting again why the workers must explicitly close the socket. They do this for graceful shutdown. If the socket is closed by all processes the operating system will not hold new, half-open connections. It will reject them, letting downstream load balancers shift traffic instantly.
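
The behavior described above can be observed with a small experiment. This is a sketch, assuming a POSIX system with `os.fork`; the helper names and temporary paths are hypothetical. It checks whether a client can connect while one process still holds the listening socket, and again after every process has closed it, without ever unlinking the path in between.

```python
import os
import socket
import tempfile


def can_connect(path):
    """Return True if connect() to the unix socket path succeeds."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return True
    except ConnectionRefusedError:
        return False
    finally:
        client.close()


def probe_listener_close():
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(8)

    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child keeps its inherited copy of the listener open until told to exit.
        os.close(write_end)
        os.read(read_end, 1)
        os._exit(0)
    os.close(read_end)

    listener.close()                 # parent's copy is closed, the child's is not
    while_one_holder = can_connect(path)

    os.write(write_end, b"x")        # let the child exit, closing the last copy
    os.close(write_end)
    os.waitpid(pid, 0)
    after_all_closed = can_connect(path)

    path_still_exists = os.path.exists(path)
    os.unlink(path)
    os.rmdir(tmpdir)
    return while_one_holder, after_all_closed, path_still_exists
```

The point of the experiment is that refusal comes from the listening descriptor being closed everywhere, not from the path being removed, so closing and unlinking can be treated as separate steps.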

If the workers always exited right after they closed the socket, they could simply exit. Instead, the workers need to explicitly close the socket sometimes, but they should not unlink it.

A worker is not the only place the socket is closed before Gunicorn shuts down. The other place is where an old arbiter exits. The barrier is there to protect an exiting arbiter from closing the socket when another arbiter still exists and Gunicorn is not exiting.

The original report was about an arbiter causing the unlink. We fixed that, but in the meantime I changed every worker type to explicitly close the socket as soon as graceful shutdown is triggered. Now we have a new case where unlink is called.

I'm suggesting that we make unlink explicit. You're sort of correct that the barrier is not necessary around the close operation. It is only necessary around the unlink.

If unlink were explicit, the workers could close without worrying about unlink, the arbiters could close without unlink, and the current conditional check could ensure that only one arbiter performs an explicit unlink, separate from close.

benoitc Jul 17, 2016

Owner

I don't follow... The unlink in the given snippet only happens if the socket is closed by the parent (pid) that created it.

I'm not against having an explicit unlink, so let's do that, it's simpler.

benoitc added a commit that referenced this issue Jul 17, 2016

only the arbiter should unlink the unix socket
make unlink explicitly done by the arbiter.

 fix #1298

benoitc Jul 17, 2016

Owner

please review #1309 for that :)

tilgovi Jul 17, 2016

Collaborator

Sometimes the PID that unlinks it is not the PID that created it, such as after a USR2. Just checking the current and previous PID is not sufficient to cover all cases. That's why we introduced the explicit tracking of which arbiter is responsible for unlinking.
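
A toy model may help show the difference between the two rules being discussed. Everything here is hypothetical: the `Arbiter` class, the PIDs, and the assumption that after a USR2 upgrade the new arbiter is the one that should eventually unlink. It does not reflect how gunicorn stores this state.

```python
from dataclasses import dataclass


@dataclass
class Arbiter:
    """Toy model of an arbiter process; all names here are hypothetical."""
    pid: int
    responsible_for_unlink: bool


def unlinks_with_pid_check(exiting, creator_pid):
    # Rule 1: unlink only if the exiting process created the socket.
    return exiting.pid == creator_pid


def unlinks_with_explicit_flag(exiting):
    # Rule 2: unlink only if this arbiter was told it is responsible.
    return exiting.responsible_for_unlink


# Assumed situation after USR2: the old arbiter created the socket, the new
# arbiter inherited it and should clean up when gunicorn finally stops.
creator_pid = 100
old_arbiter = Arbiter(pid=100, responsible_for_unlink=False)
new_arbiter = Arbiter(pid=200, responsible_for_unlink=True)

pid_rule = (unlinks_with_pid_check(old_arbiter, creator_pid),
            unlinks_with_pid_check(new_arbiter, creator_pid))
flag_rule = (unlinks_with_explicit_flag(old_arbiter),
             unlinks_with_explicit_flag(new_arbiter))
```

In this model the PID rule makes the old arbiter unlink while the new one is still serving, and the new arbiter never unlinks; the explicit flag gives the opposite, intended result.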

benoitc Jul 17, 2016

Owner

@tilgovi well, the parent pid would have been set each time the socket is initialised :) But anyway, the patch above uses an explicit unlink. Let me know.

tilgovi Jul 17, 2016

Collaborator

Thanks for writing the patch. I'm mobile but will review in some hours.

tilgovi added a commit that referenced this issue Jul 19, 2016

Refactor socket activation and fd inheritance
Track the use of systemd socket activation and gunicorn socket inheritance
in the arbiter. Unify the logic of creating gunicorn sockets from each of
these sources to always use the socket name to determine the type rather
than checking the configured addresses. The configured addresses are only
used when there is no inheritance from systemd or a parent arbiter.

Fix #1298

tilgovi added 18 more commits with the same message that referenced this issue: six on Jul 19, one on Jul 20, ten on Jul 22, and one on Jul 23, 2016.

NexediGitlab pushed a commit to SlapOS/slapos that referenced this issue Oct 17, 2016

slaprunner: downgrade gunicorn to v19.4.5 because of bug in later versions

In gunicorn v19.6 and v19.5, a bug was introduced which deletes socket when nginx worker exits.
In our case, it crashes the slaprunner and prevents it from restarting by slapos.
Let's stick to v19.4.5 until this issue is closed : benoitc/gunicorn#1298

tilgovi added a commit that referenced this issue Dec 20, 2016

Refactor socket activation and fd inheritance
Track the use of systemd socket activation and gunicorn socket inheritance
in the arbiter. Unify the logic of creating gunicorn sockets from each of
these sources to always use the socket name to determine the type rather
than checking the configured addresses. The configured addresses are only
used when there is no inheritance from systemd or a parent arbiter.

Fix #1298

@diwu1989

diwu1989 commented Dec 22, 2016

Hello, I would be very happy to see this fixed and a new release out this year. Thanks.

@tilgovi

Collaborator

tilgovi commented Dec 23, 2016

@diwu1989 is it possible for you to test the branch at #1310?

@diwu1989

diwu1989 commented Dec 27, 2016

Just tested that branch, and it does fix the problem with restarts.

@tilgovi

Collaborator

tilgovi commented Dec 27, 2016

@diwu1989 thanks so much. I'm going to try to get that merged and released as soon as possible.

@diwu1989

diwu1989 commented Dec 27, 2016

Thanks, and have a good holiday.

tilgovi added a commit that referenced this issue Dec 27, 2016

Refactor socket activation and fd inheritance
Track the use of systemd socket activation and gunicorn socket inheritance
in the arbiter. Unify the logic of creating gunicorn sockets from each of
these sources to always use the socket name to determine the type rather
than checking the configured addresses. The configured addresses are only
used when there is no inheritance from systemd or a parent arbiter.

Fix #1298

@tilgovi tilgovi closed this in #1310 Dec 27, 2016

tilgovi added a commit that referenced this issue Dec 27, 2016

Refactor socket activation and fd inheritance (#1310)
Track the use of systemd socket activation and gunicorn socket inheritance
in the arbiter. Unify the logic of creating gunicorn sockets from each of
these sources to always use the socket name to determine the type rather
than checking the configured addresses. The configured addresses are only
used when there is no inheritance from systemd or a parent arbiter.

Fix #1298

@thara

thara commented Feb 14, 2017

Hi, I'm facing this problem on production servers.
Which version will include the fix?

@benoitc benoitc added this to Other in Bugs Feb 23, 2017

@benoitc benoitc moved this from Other to Acknowledged in Bugs Feb 23, 2017

@benoitc benoitc removed this from Acknowledged in Bugs Feb 23, 2017

fofanov pushed a commit to fofanov/gunicorn that referenced this issue Mar 16, 2018

Refactor socket activation and fd inheritance (#1310)
Track the use of systemd socket activation and gunicorn socket inheritance
in the arbiter. Unify the logic of creating gunicorn sockets from each of
these sources to always use the socket name to determine the type rather
than checking the configured addresses. The configured addresses are only
used when there is no inheritance from systemd or a parent arbiter.

Fix #1298