New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runner: don't allow multiple instances of daemon to run #418
Conversation
main.c
Outdated
lock.l_type = F_WRLCK; | ||
if (fcntl(fd, F_SETLK, &lock) == -1) { | ||
tcmu_err("tcmu-runner is already running...\n"); | ||
close(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd better add one goto for closing the fd to make the code more readable.
Or we should it close(fd) everywhere just after this line when trying to goto error tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch @lxbsz
I will fix this part.
main.c
Outdated
tcmu_err("tcmu-runner is already running...\n"); | ||
close(fd); | ||
goto destroy_log; | ||
} | ||
|
||
ret = load_our_module(); | ||
if (ret < 0) { | ||
tcmu_err("couldn't load module\n"); | ||
goto destroy_log; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like here.
main.c
Outdated
|
||
lock.l_type = F_WRLCK; | ||
if (fcntl(fd, F_SETLK, &lock) == -1) { | ||
tcmu_err("tcmu-runner is already running...\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that only when errno == EACCES or EAGAIN then we will be sure that the tcmu-runner main process is already runing ? Or it should be other op errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will defend on EAGAIN
@lxbsz updated with the suggestions. Thanks! |
It looks good to me and test it and the logs are: /usr/bin/tcmu-runner --tcmu-log-dir=/var/log/gluster-block/ --debug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me.
main.c
Outdated
@@ -56,6 +56,8 @@ | |||
#include "libtcmu_config.h" | |||
#include "libtcmu_log.h" | |||
|
|||
# define TCMU_LOCK_FILE "/var/run/tcmu.lock" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this more normally in the lock dir like
/var/run/lock/tcmu.lock
or
/var/run/lock/tcmu/lock
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes will pick the first one.
main.c
Outdated
@@ -994,6 +996,8 @@ int main(int argc, char **argv) | |||
GIOChannel *libtcmu_gio; | |||
guint reg_id; | |||
bool new_path = false; | |||
struct flock lock = {0, }; | |||
int fd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about rename to lock_fd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change this.
main.c
Outdated
@@ -1146,6 +1173,8 @@ int main(int argc, char **argv) | |||
tcmulib_close(tcmulib_context); | |||
err_free_handlers: | |||
darray_free(handlers); | |||
close_fd: | |||
close(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why for the normal shutdown we do a F_UNLCK and a close, but for the error path we only do a close on the fd?
From the man page it sounds like the same thing happens and the lock is released due to the process exiting. Is that correct? Is it just customary to only do a close in error paths liek this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From man:
-
As well as being removed by an explicit F_UNLCK, record locks are automatically released when the process terminates.
-
If a process closes any file descriptor referring to a file, then all of the process's locks on that file are released
I just wrote more like sequence, like open, grab lock, release lock, close in the normal shutdown. And forget to add release lock at error path.
But, just closing the fd without F_UNLCK will also release the lock anyway.
@mikechristie I'm not sure what would you prefer:
- remove F_UNLCK completely and just close(fd) in the both normal & error paths (or)
- add F_UNLCK in error path, which I forget before ?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol, I do not know.
Go for the F_UNLOCK just to make it clear I guess. We seem to do that for some other things so it will at least be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - Make-sense :-)
until now there is no check to defend on multiple runs of daemon within the same node. This patch takes a non blocking lock on 'tcmu.lock' file, if it succeeds only then daemon is allowed to run, if there is already a lock on lock-file (taken by some other instance of tcmu-runner) we will exit. Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Updated with the suggested changes. Thanks! |
until now there is no check to defend on multiple runs of daemon within
the same node. This patch takes a non blocking lock on 'tcmu.lock' file,
if it succeeds only then daemon is allowed to run, if there is already a
lock on lock-file (taken by some other instance of tcmu-runner) we will exit.
Signed-off-by: Prasanna Kumar Kalever prasanna.kalever@redhat.com