New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd-nbd: support signal handle for SIGHUP, SIGINT and SIGTERM. #14079

Merged
merged 2 commits into from Mar 29, 2017

Conversation

Projects
None yet
4 participants
@liupan1111
Contributor

liupan1111 commented Mar 22, 2017

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 22, 2017

@trociny, please help take a look.

@liupan1111 liupan1111 requested a review from trociny Mar 22, 2017

@yangdongsheng

This comment has been minimized.

Member

yangdongsheng commented Mar 22, 2017

should we also handle the sighup?

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 22, 2017

@yangdongsheng , last time I talked sigint and sigterm with trociny, and I decided to support them. I think @trociny could give the answer.

@trociny

I aggree with @yangdongsheng, it looks useful to register a SIGHUP handler too. I think doing similar to rbd-mirror is good (i.e. register a default sighup_handler, which will just reopen logs).

{
assert(signum == SIGINT || signum == SIGTERM);
derr << "*** Got signal " << sig_str(signum) << " ***" << dendl;
do_unmap();

This comment has been minimized.

@trociny

trociny Mar 22, 2017

Contributor

It works, though unnecessary opens/closes nbd device. This might be also an additional source of races, when the device is already closed by this rbd-nbd process and opened by another.

I think just doing ioctl(nbd, NBD_DISCONNECT) is enough. For this the signal handler needs to know nbd though. You could just make nbd global static (initialized to -1 for safety) or use C++11 lambda function, something like below:

std::function<void()> handle_signal_callback;
static void handle_signal(int signum) {
   ...
  handle_signal_callback();
}

...
int do_map(int argc, const char *argv[]) {
  ....
  handle_signal_callback = [nbd]() { ioctl(nbd, NBD_DISCONNECT); };
  register_async_signal_handler_oneshot(SIGINT, handle_signal);
  ...

This comment has been minimized.

@liupan1111

liupan1111 Mar 22, 2017

Contributor

@trociny Sorry that I don't catch your idea very much .... could you talk more about the potential data race? Thanks!

This comment has been minimized.

@trociny

trociny Mar 22, 2017

Contributor

After you move register/unregister closer to ioctl(nbd, NBD_DO_IT) a race seams like not possible (previously a scenario was possible, when the main thread closes nbd, a new process reuses this device, and then a signal handler fires sending the disconnect to a wrong device).

So the main point now is just to make it lightweight: no need to open/close a device if you can just use a descriptor. I wont insist though if you like your version more.

This comment has been minimized.

@liupan1111

liupan1111 Mar 24, 2017

Contributor

I agree, I prefer to define a static nbd variable, and this is very clean.

@@ -543,6 +553,10 @@ static int do_map(int argc, const char *argv[])
common_init_finish(g_ceph_context);
global_init_chdir(g_ceph_context);
init_async_signal_handler();
register_async_signal_handler_oneshot(SIGINT, handle_signal);
register_async_signal_handler_oneshot(SIGTERM, handle_signal);

This comment has been minimized.

@trociny

trociny Mar 22, 2017

Contributor

I think you register too early (and unregister too late), at this moment e.g. devpath still might be unknown (and nbd not opened), so if a signal happens before we start listening on nbd, disconnect request is lost.

I suggest registering the handlers just before ioctl(nbd, NBD_DO_IT) and unregister just after.

This comment has been minimized.

@liupan1111

liupan1111 Mar 22, 2017

Contributor

reasonable, thanks.

@trociny

This comment has been minimized.

Contributor

trociny commented Mar 22, 2017

@liupan1111 I have just created a tracker ticket for this feature [1] so it could be backported. Could you please add the link to the commit log message?

[1] http://tracker.ceph.com/issues/19349

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 22, 2017

@liupan1111 I have just created a tracker ticket for this feature [1] so it could be backported. Could you please add the link to the commit log message?

[1] http://tracker.ceph.com/issues/19349

Sure.

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 22, 2017

I aggree with @yangdongsheng, it looks useful to register a SIGHUP handler too. I think doing similar to rbd-mirror is good (i.e. register a default sighup_handler, which will just reopen logs).

@yangdongsheng @trociny , why reopen logs when catch sighub, but not do that for sigint or sigterm?

@trociny

This comment has been minimized.

Contributor

trociny commented Mar 22, 2017

By convention SIGHUP is used to inform a process to renew its state. Many unix daemons reload config and/or reopen log files on this. This is used e.g by logrotate to tell a daemon to use a new log file.

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 24, 2017

@trociny Thanks, I got it!

@liupan1111 liupan1111 changed the title from rbd_nbd: support signal handle for SIGINT and SIGTERM. to rbd_nbd: support signal handle for SIGHUP, SIGINT and SIGTERM. Mar 24, 2017

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 24, 2017

@trociny updated.

unregister_async_signal_handler(SIGTERM, handle_signal);
shutdown_async_signal_handler();
server.stop();

This comment has been minimized.

@trociny

trociny Mar 24, 2017

Contributor

@liupan1111 looks like the indentation is wrong

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 25, 2017

@trociny done, i will modify My vimrc to avoid this error happen again.

@trociny trociny self-assigned this Mar 25, 2017

{
assert(signum == SIGINT || signum == SIGTERM);
derr << "*** Got signal " << sig_str(signum) << " ***" << dendl;
ioctl(nbd, NBD_DISCONNECT);

This comment has been minimized.

@trociny

trociny Mar 25, 2017

Contributor

Testing the only issue noticed is that if there are many pending I/O requests, the NBD_DISCONNECT ioctl may last long time (the same is when doing unmap via CLI). Thus it it would be nice to add some dignostic: "sending NBD_DISCONNECT" debug message before ioctl, check for return value and output error if it failed (a debug message if it succeeded).

This comment has been minimized.

@liupan1111

liupan1111 Mar 25, 2017

Contributor

@trociny, I also find such kind of issue when I use unmap. Yeah,message is useful, not only in this signal handler, also for unmap function.

Just one question, what is the situation ioctl(nbd, NBD_DISCONNECT) may fail?

This comment has been minimized.

@trociny

trociny Mar 25, 2017

Contributor

I am not sure there is a valid situation when ioctl(nbd, NBD_DISCONNECT) may fail in our case. Still there may be bugs. And if it fails we are better to know about this.

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 26, 2017

@trociny done, thanks.

@trociny

I don't see a point in 2 commit messages. Please consider squashing.

ioctl(nbd, NBD_DISCONNECT);
dout(20) << __func__ << ": " << "sending NBD_DISCONNECT" << dendl;
if (ioctl(nbd, NBD_DISCONNECT) < 0)
cerr << "rbd-nbd: the device is not used" << std::endl;

This comment has been minimized.

@trociny

trociny Mar 26, 2017

Contributor

It is not necessary "not used" in this context. I suggest:

"rbd-nbd:: disconnect failed" << cpp_error(errno)

And output to derr, not cerr.

Also, I suggest always using curly brackets ({}), even for one line blocks.

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

For curly brackets, I also like it for one line blocks, but I find it is not used in rbd-nbd.cc

if (ioctl(nbd, NBD_DISCONNECT) < 0)
cerr << "rbd-nbd: the device is not used" << std::endl;
else
dout(20) << __func__ << ": " << "succesfully disconnect NBD with rbd-nbd" << dendl;

This comment has been minimized.

@trociny

trociny Mar 26, 2017

Contributor

Please break the line to fit 80 char width. Or rather make the message shorter (e.g. "disconnected" is enough for a debug message).

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

yup, agree

cerr << "rbd-nbd: the device is not used" << std::endl;
}

This comment has been minimized.

@trociny

trociny Mar 26, 2017

Contributor

The same about curly brackets.

cerr << "rbd-nbd: the device is not used" << std::endl;
}
else
dout(20) << __func__ << ": " << "succesfully disconnect NBD with rbd-nbd" << dendl;

This comment has been minimized.

@trociny

trociny Mar 26, 2017

Contributor

the same about too long message

@dillaman

This comment has been minimized.

Contributor

dillaman commented Mar 27, 2017

@liupan1111 Nit: update your commit prefix to "rbd-nbd" instead of "rbd_nbd"

@dillaman dillaman changed the title from rbd_nbd: support signal handle for SIGHUP, SIGINT and SIGTERM. to rbd-nbd: support signal handle for SIGHUP, SIGINT and SIGTERM. Mar 27, 2017

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 27, 2017

@dillaman done, thanks.

ioctl(nbd, NBD_DISCONNECT);
dout(20) << __func__ << ": " << "sending NBD_DISCONNECT" << dendl;
if (ioctl(nbd, NBD_DISCONNECT) < 0) {
derr << "rbd-nbd: disconnect failed" << cpp_error(errno) << std::endl;

This comment has been minimized.

@trociny

trociny Mar 27, 2017

Contributor

Need ": " after "disconnect failed"

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

right.

@@ -90,6 +92,20 @@ static bool exclusive = false;
#endif
#define htonll(a) ntohll(a)
static int do_unmap();

This comment has been minimized.

@yangdongsheng

yangdongsheng Mar 27, 2017

Member

seems not needed any more.

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

right, thanks.

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

yup

derr << "rbd-nbd: disconnect failed" << cpp_error(errno) << std::endl;
} else {
dout(20) << __func__ << ": " << "disconnected" << dendl;
}

This comment has been minimized.

@yangdongsheng

yangdongsheng Mar 27, 2017

Member

same code block with the disconnect action in do_unmap(). better to implement a do_disconnect() and call it in different place.

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

not same completely, you could see the error message.

This comment has been minimized.

@yangdongsheng

yangdongsheng Mar 27, 2017

Member

So the question is, why they are different? Does ioctl() < 0 means device is not used by all means? maybe we should check the ret-val, ENOENT means not used, but others means disconnect failed., right?

This comment has been minimized.

@trociny

trociny Mar 27, 2017

Contributor

@yangdongsheng They are different, because in do_unmap case, this is the user who specifies the device name, and the most probable error in this case is that the user specified a wrong device or it had already been closed. In the signal handler if ioctl fails it is definitely not due to the user input, and is the most probably a bug manifestation either in the rbd-nbd or the kernel.

This comment has been minimized.

@liupan1111

liupan1111 Mar 27, 2017

Contributor

@yangdongsheng agree with trociny

This comment has been minimized.

@liupan1111

liupan1111 Mar 28, 2017

Contributor

@yangdongsheng Thanks for your analysis.
(1) For your first question, I think there are two different kinds of perspective: from rbd-nbd, or from nbd driver. I agree with Trociny because in this way, the user doesn't need to care the detailed implementation in nbd driver. @trociny, please correct me.

(2) From your example, I believe the second disconnect: "disconnect in sig_handler" in Thread1 will report error:
rbd-nbd: disconnect failed: " << cpp_error(errno) , and we also dump error code, so I don't think it is a big issue.

This comment has been minimized.

@yangdongsheng

yangdongsheng Mar 28, 2017

Member

@liupan1111 @trociny Let me try to make this discussion easier, Please consider this scenario, a device is mapped, but we don't have permission to unmap it.

Then a user call a do_unmap(), but we error out with the message of "the device is not used", it is not proper, do you agree?

This comment has been minimized.

@trociny

trociny Mar 28, 2017

Contributor

@yangdongsheng in your scenario do_unmap will fail earlier: "failed to open device". Still either way would be ok to me.

This comment has been minimized.

@liupan1111

liupan1111 Mar 28, 2017

Contributor

@trociny @yangdongsheng Mmm... I prefer to keep them seperately at this moment for flexible. Anyway, thanks dongsheng for the comments!

This comment has been minimized.

@yangdongsheng

yangdongsheng Mar 28, 2017

Member

@trociny @liupan1111 That's okey. I am not very insist on it. 😄

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 27, 2017

@trociny done.

@@ -92,13 +92,16 @@ static int nbd = -1;
#endif
#define htonll(a) ntohll(a)
static int do_unmap();

This comment has been minimized.

@trociny

trociny Mar 28, 2017

Contributor

You added this in the previous commit and then removed here.

I still think that just squashing two commits into one would be better. Anyway, if you want two commits, please clean up this.

This comment has been minimized.

@liupan1111

liupan1111 Mar 28, 2017

Contributor

I will remove this declaration, thanks.
I still prefer two commits, since they are different fixes, and will be clearer.

rbd-nbd: polish the output info before and after ioctl NBD_DISCONNECT.
Signed-off-by: Pan Liu <liupan1111@gmail.com>
@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Mar 28, 2017

@trociny done, thanks.

@trociny trociny merged commit 6e77724 into ceph:master Mar 29, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@liupan1111 liupan1111 deleted the liupan1111:wip-support-signal-handler branch Mar 29, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment