fix unix socket locking #1263
Conversation
Maybe we should create a unique lock file instead of fixing it to
I think this could be a breaking change for anyone who runs multiple instances of gunicorn on the same machine at the same time, for example in testing, or to run two different applications, etc. They would be forced to create a specific lock file location for each instance (which could be annoying if the configs are auto-generated; speaking from experience here). So a unique lock file name seems required to me.

We could automatically make the lock file name unique. Since we're only locking for the sake of the unix domain socket(s), we can do that by including a portion of their path in the lockfile name. At first I thought of just using it as-is, e.g.,

So I would suggest, then, taking a hash of the unix domain socket path and including that (or some portion of it) in the lock:

And if we're doing that, we can keep the lock file confined to the unix domain socket class itself, as before. If we also keep using the shared/exclusive lock technique, then that eliminates the need for changes to the arbiter, the
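The hashed-name idea could be sketched roughly as follows. This is an illustrative sketch only, not the PR's actual code; the `lockfile_name` helper, its `addr` parameter, and the use of the temp directory are all assumptions:

```python
import hashlib
import os
import tempfile


def lockfile_name(addr):
    # Derive a per-socket lock file name by hashing the unix domain
    # socket path, so two gunicorn instances that bind different
    # sockets never contend for the same lock file, while two
    # instances binding the SAME socket path still share one lock.
    digest = hashlib.sha1(addr.encode("utf-8")).hexdigest()[:8]
    return os.path.join(tempfile.gettempdir(),
                        "gunicorn-%s.lock" % digest)
```

The name is deterministic for a given socket path, e.g. `lockfile_name("/tmp/app.sock")` always yields the same `/tmp/gunicorn-<hash>.lock` path.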
@jamadden There is only one lock file for all the sockets, and it's better that way imo, since it can later be used for other things, possibly. I agree anyway that we should make the default lock file unique. The change above should fix it.
@tilgovi @berkerpeksag any feedback is welcome :)
I guess I'm just more a fan of finer-grained, single-purpose locking, especially when it's taken and managed close to the resource it's protecting. I find that easier to understand and debug when something (inevitably) goes wrong.
@jamadden I am also a fan of simple and focused solutions, and I think that's the case here :) What we want is to prevent any unix socket file on which the app is listening from being unlinked on
@benoitc ok, thanks. I see your point. I'm not sure I'm convinced about the complexity tradeoffs, but that's a personal thing :)
This change adds proper file locking to gunicorn. By default, "gunicorn.lock" is created in the temporary directory when a unix socket is bound. In case someone wants to fix the lock file path or run multiple gunicorn instances, the "--lock-file" setting can be used to set the path of this file. fix #1259
```python
def lock(self):
    _lock(self._lockfile.fileno())
    self._locked = True
```
This should be set to the return value of the `_lock` call, or we may incorrectly report as locked when we are not.
hrm, there is no return value, either it fails or not since it's a shared lock. did I miss anything?
You're right that there's no return value from `lock`. I was thinking of `_unlock`. If the lock fails, this will bubble an error, and that seems fine.
We should do something about the naming. Calling a function
Oh. I guess I'm a little late to comment.
I also think maybe we should always take the lock, regardless of whether there is a UNIX socket or not. The branching is unnecessary complexity in the function bodies and some function call signatures, when it would be harmless to always lock.
I don't love that sock.close now has an extra argument, but I'm not sure what to do about that, if anything should be done.
@tilgovi why should we create a lock file if we don't use it, though? It's using an fd for nothing. I'm not against it, though; it would indeed simplify the code. It's just that other servers like apache or nginx don't use locks by default.

About the parameters added to the socket, why not instead add a

About method names, well, what about
I wonder if we could solve this simply without locks now that I think about it more. There would still be race conditions if USR2 is followed very quickly by QUIT, but that's true with the lock, also. The way it would work is this:
Something like this should work just as well as the lock, with no need for a lock file. It relies on the guarantee that there are never more than two arbiters, which seems reasonable, maybe.

On USR2, the arbiter compares the main arbiter PID to its parent PID and, if they match, refuses to fork. Otherwise, the arbiter declares itself the main arbiter (updating the environment) and forks.

It is not possible to be inherited by a process with a PID that matches the parent arbiter, because to inherit the current process that process would have had to exist before the parent arbiter spawned, and that would imply two processes with the same PID. Therefore, it's not even necessary to keep track of the parent, only to check it at signal time.
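The check described above could be sketched like this (purely illustrative; the function name and the way the real arbiter stores the main arbiter PID are assumptions):

```python
import os


def should_fork_new_arbiter(main_arbiter_pid):
    # Called on USR2. If our parent is the recorded main arbiter,
    # we are already the child of an in-progress upgrade: refuse to
    # fork another arbiter. Otherwise we declare ourselves the main
    # arbiter (e.g. by updating the environment) and fork.
    if os.getppid() == main_arbiter_pid:
        return False
    return True
```

Because a child arbiter's parent PID always equals the main arbiter's PID while the upgrade is in flight, this single comparison at signal time is enough to bound the system at two arbiters.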
I pushed some improvements in #1266. @tilgovi what is a child arbiter, though? The new arbiter spawned is not attached to the old arbiter, which means the current arbiter will not receive any signals associated with it. In fact, we should make sure they don't share anything, since the purpose of USR2 is to allow the upgrade of the application. It's late here, so I will think more about it in the morning. The other issue I can see right now is that the arbiter can die before control is given to the new arbiter.
I also want to test how things work with systemd socket activation. If systemd manages the socket lifecycle, we should perhaps never unlink the socket. I'll think this over more as I can, but if you need to make a hotfix release, go for it.
Hrm, I think you just found an issue. Indeed we should never unlink the socket if it was passed to us via systemd, or, more simply, whenever we don't control it ourselves.

Actually, modulo the issue above, we may not need to use a lock at all. The following code allows us to check whether a process exists:

```python
import errno
import os


def process_exists(pid):
    # Signal 0 performs error checking only: it tests whether we
    # could signal the process without actually delivering anything.
    try:
        os.kill(pid, 0)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # No such process.
            return False
        elif e.errno == errno.EPERM:
            # Process exists but belongs to another user.
            return True
        raise
    return True


if __name__ == '__main__':
    import sys
    if len(sys.argv) < 2:
        raise RuntimeError()
    pid = int(sys.argv[1])
    pid_exists = process_exists(pid)
    print("process exists: %s" % pid_exists)
```

So if we indeed keep the PID of the new arbiter somewhere, then we are good. Maybe we could open a temporary channel between the old arbiter and the new one? Say, for example, I could write on a pipe shared between the two processes via the environment, then close it once it's done. Thoughts?
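The pipe idea could look roughly like this. This is a toy sketch using `fork` in one script for illustration; gunicorn's real reexec path would pass the write end's fd number through the environment across `exec` instead:

```python
import os

# Old arbiter creates a pipe before spawning the new arbiter.
r, w = os.pipe()

pid = os.fork()
if pid == 0:
    # "New arbiter": announce its own PID on the shared pipe,
    # then close its end and exit.
    os.close(r)
    os.write(w, str(os.getpid()).encode())
    os.close(w)
    os._exit(0)

# "Old arbiter": close the unused write end, read the new
# arbiter's PID, and reap the child.
os.close(w)
new_pid = int(os.read(r, 32))
os.close(r)
os.waitpid(pid, 0)
print("new arbiter pid: %d" % new_pid)
```

Once the old arbiter knows the new arbiter's PID through the pipe, it can combine that with a `process_exists`-style check before deciding whether it is safe to unlink the socket.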
A pipe might help, but I'm not sure. For checking the PID, it's not sufficient to check whether a PID exists, because it may have been reused. We need to check whether the current parent PID is the same PID that spawned the arbiter.
@tilgovi so I reviewed your initial proposal more carefully, and you're right, it's the right way to do it. I summarised everything in #1267 with small changes to your solution. Let me know :) @jamadden your opinion is welcome also :)