remove file locking #1270

Merged
merged 1 commit into from May 20, 2016

Projects

None yet

4 participants

@benoitc
Owner
benoitc commented May 18, 2016 edited

This changes improve the binary upgrade behaviour using USR2:

  • only one binary upgrade can happen at a time: the old arbiter needs to be
    killed to promote the new arbiter.
  • if a new arbiter is already spawned, until one is killed USR2 has no action
  • if a new arbiter has been spawned, the unix socket won't be unlinked
  • until the old arbiter have been killed the newly created pidfile has the name
    .2 and the name Master.2 .

Note: there is no dialog between both arbiters to handle this features.
Instead they will supervise each others until one is killed. So isolation is
still guaranted.

fix #1267

@benoitc benoitc referenced this pull request May 18, 2016
Closed

USR2 improvements & fixes #1267

@tilgovi tilgovi commented on an outdated diff May 20, 2016
gunicorn/arbiter.py
@@ -302,6 +299,23 @@ def handle_winch(self):
else:
self.log.debug("SIGWINCH ignored. Not daemonized")
+ def if_promoted(self):
+ if self.master_pid == 0:
+ return
+
+ if self.master_pid != os.getppid():
+ self.log.info("Master have been promoted.")
@tilgovi
tilgovi May 20, 2016 Collaborator

Grammar: "Master has been promoted."

@tilgovi tilgovi commented on an outdated diff May 20, 2016
tests/test_sock.py
@@ -9,19 +9,10 @@
@mock.patch('os.getpid')
@mock.patch('os.unlink')
@mock.patch('socket.fromfd')
-def test_unix_socket_close_delete_if_exlock(fromfd, unlink, getpid):
+def test_unix_socket_close_keep(fromfd, unlink, getpid):
@tilgovi
tilgovi May 20, 2016 Collaborator

Seems like this should read test_unix_socket_close_unlink since this test case checks that unlink is called.

@tilgovi tilgovi and 1 other commented on an outdated diff May 20, 2016
gunicorn/sock.py
@@ -20,6 +20,7 @@
class BaseSocket(object):
def __init__(self, address, conf, log, fd=None):
+ self.parent = os.getpid()
@tilgovi
tilgovi May 20, 2016 Collaborator

self.parent should just be removed. It's not doing anything anymore. It is always set to os.getpid() so it will always be equal to os.getpid() at close. This is because after USR2 exec these objects are still recreated in the new process.

If the intention is to prevent the destruction of systemd activated sockets then we need a separate flag here that we pass to the constructor only when called in the LISTEN_FDS loop, or we need to check for the environment variable here (a parameter is easier to unit test).

The arbiter is now tracking whether it's the "master" arbiter and intelligently calls close (or not), so the only information we need here is whether Gunicorn (any process) created the socket or whether the init system did.

It's not actually important which PID created the socket (the arbiter tracks whether to call close())

@benoitc
benoitc May 20, 2016 Owner

right. removing it. About systemd, we should handle it in another commit.. The genertal idea here would be keeping in env the systemd fds that were given to us. So the create_sockets method would return a the listeners and external fds. I will submit a patch.

@tilgovi tilgovi and 1 other commented on an outdated diff May 20, 2016
gunicorn/arbiter.py
@@ -302,6 +299,23 @@ def handle_winch(self):
else:
self.log.debug("SIGWINCH ignored. Not daemonized")
+ def if_promoted(self):
@tilgovi
tilgovi May 20, 2016 Collaborator

I would give this a new name. Maybe manage_arbiters for symmetry with manage_workers. Right now it looks to me like a function that returns a boolean value but doesn't have side effects.

@benoitc
benoitc May 20, 2016 Owner

I will rename it maybe_promote_master which is more explicit imo :)

@tilgovi tilgovi commented on the diff May 20, 2016
gunicorn/arbiter.py
@@ -126,28 +125,23 @@ def start(self):
"""
self.log.info("Starting gunicorn %s", __version__)
+ if 'GUNICORN_PID' in os.environ:
+ self.master_pid = int(os.environ.get('GUNICORN_PID'))
+ self.proc_name = self.proc_name + ".2"
+ self.master_name = "Master.2"
@tilgovi
tilgovi May 20, 2016 Collaborator

It's not really a problem what we do here, but I wonder if maybe we just shouldn't rename the process. It's possible to tell which is which by the process tree and I'm just a little bit afraid the complexity isn't worth it. I can imagine bugs now or later where the process name ends up as "app.2.2.2.2.2". Maybe I'm just nervous because there was the recent tornado worker issue where the process name was incorrectly set.

I can see how it might make some shell script easier if you can grep for .2$, but if you're trying to automate this I think using the pid file is probably the better choice, anyway.

@benoitc
benoitc May 20, 2016 Owner

right now it can't happen since the string is fixed. When don't do a + but set the master name.

@tilgovi
Collaborator
tilgovi commented May 20, 2016

Much better, I think.

@benoitc
Owner
benoitc commented May 20, 2016

@tilgovi thanks for the review! fixes are coming :)

@benoitc remove file locking
This changes improve the binary upgrade behaviour using USR2:

- only one binary upgrade can happen at a time: the old arbiter needs to be
  killed to promote the new arbiter.
- if a new arbiter is already spawned, until one is killed USR2 has no action
- if a new arbiter has been spawned, the unix socket won't be unlinked
- until the old arbiter have been killed the newly created pidfile has the name
  <pidfile>.2 and the name Master.2 .

Note: there is no dialog between both arbiters to handle this features.
Instead they will supervise each others until one is killed. So isolation is
still guaranted.

fix #1267
418f140
@benoitc benoitc merged commit c62cf2f into master May 20, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@simlun
Contributor
simlun commented Jun 16, 2016

Oops, it seems like the documentation still mentions the .oldbin PID file. https://github.com/benoitc/gunicorn/blob/master/docs/source/signals.rst should perhaps be updated as well? :)

@berkerpeksag berkerpeksag deleted the improve-arbiter-promotion branch Jun 16, 2016
@berkerpeksag
Collaborator

@simlun good catch! :) Would you like to send a PR?

@benoitc
Owner
benoitc commented Nov 6, 2016 edited

so this is exactly why commenting on closed ticket is really a bad behaviour... No notifications are sent, nothing is tracked... ;)

The documentation is still describing the old name of the the pid. Fixing it now.

@benoitc
Owner
benoitc commented Nov 6, 2016

never mind, it has been done but the current doc don't reflect it... So we just need to make a new release :) Will be done tomorrow. Sorry for the noise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment