Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: use synchronous signal handling on unix #7050

Merged

Conversation

@stevendanna
Copy link
Contributor

stevendanna commented Oct 16, 2019

Summary

Issue #7044 describes multiple problems with the current signal
handling approach. These problems were introduced in 8e13827 and
render our current signal handling unsafe. Specifically, taking a
mutex and allocating memory from inside an signal handler can result
in a deadlock or state corruption.

When using asynchronous signal handling via signal handlers, the code
in the signal handler are restricted to limited number of safe
operations (i.e. flipping an atomic, calling async-signal-safe
functions).

This change hopes to solve the existing issues and avoid future
problems caused by signal handlers by moving us to synchronous signal
handling.

Move to Synchronous signal handling

The strategy used here is described in The Linux
Programming Interface
[0]:

  • All threads block all of the asynchronous signals that the process
    might receive. The simplest way to do this is to block the signals
    in the main thread before any other threads are created. Each
    subsequently created thread will inherit a copy of the main
    thread's signal mask.

  • Create a single dedicated thread that accepts incoming signals
    using sigwaitinfo(), sigtimedwait(), or sigwait()...

On Linux, signals::init() will now:

  • Block all signals that we plan on handling
  • Starts a thread that processes signals synchronously.

Because we want to block the signals early in the process, we move
signals::init() much earlier into main.

Move to a more explicit interface

This change also moves us from using a VecDequeue under a Mutex for
tracking signals to signal-specific AtomicBools, replacing the generic
check_for_signal() function with two more-specific
pending_sighup() and pending_sigchld() functions.

This has two advantages:

  1. No mutexes or memory allocations in our signal handling code. This
    matters less since we have also moved to synchronous handling of
    signals, bug gives us some flexibility to change our approach more
    safely in the future.

  2. Clearer alignment between this interface and the application
    behavior.

The old check_for_signal() API looks like it intended to provide the
groundwork for a variety of signal-based behavior; however, in
reality, we are handling a small number of signals. The generic nature
of that interface and the fact that the code is shared between two
processes was obscuring the fact that we actually have a small amount
of signal-related behavior across the two services:

Linux Behavior

| signal    | hab-launch behavior | hab-sup behavior     |
|-----------+---------------------+----------------------|
| SIGINT    | Graceful shutdown   | Handled but ignored  |
| SIGTERM   | Graceful shutdown   | Handled but ignored  |
| SIGHUP    | Send to hab-sup     | Shutdown for restart |
| SIGCHLD   | Reap zombies        | Handled but ignored  |
| SIGQUIT   | Handled but ignored | Handled but ignored  |
| SIGALRM   | Handled but ignored | Handled but ignored  |
| SIGUSR1   | Handled but ignored | Handled but ignored  |
| SIGUSR2   | Handled but ignored | Handled but ignored  |
| All other | Default disposition | Default disposition  |

Windows Behavior

Windows uses a library which appears to handle the CTRL_C_EVENT.

The set of changes should not substantively change the behavior on either platform,
although we are now installing the handler earlier in the startup of the relevant applications.

This more explicit interface has the disadvantage of not being
well-suited to a larger variety of signal-based behavior. However, it
seems to me that we can always implement something smarter when that
actually becomes an issue.

Resources

[0] Kerrisk, Michael. The Linux Programming Interface, 685. San
Franscisco: No Starch Press, 2010.
[1] http://man7.org/linux/man-pages/man2/sigprocmask.2.html
[2] http://man7.org/linux/man-pages/man3/sigwait.3.html
[3] http://man7.org/linux/man-pages/man3/pthread_sigmask.3.html
[4] http://man7.org/linux/man-pages/man3/sigsetops.3.html
[5] http://man7.org/linux/man-pages/man7/signal.7.html

Signed-off-by: Steven Danna steve@chef.io

@chef-expeditor

This comment has been minimized.

Copy link

chef-expeditor bot commented Oct 16, 2019

Hello stevendanna! Thanks for the pull request!

Here is what will happen next:

  1. Your PR will be reviewed by the maintainers.
  2. If everything looks good, one of them will approve it, and your PR will be merged.

Thank you for contributing!

@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 16, 2019

Note that I kept the check_for_signal function; however, I'm kinda skeptical of that function as well because the process that calls it only dequeues one signal at a time and only calls it every tick which means we can get pretty overwhelmed by signals.

components/sup/src/main.rs Outdated Show resolved Hide resolved
@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 16, 2019

Test failures is:

[2019-10-16T17:15:03Z ERROR habitat_butterfly::server::inbound] UDP Receive error: Either the application has not called WSAStartup, or WSAStartup failed. (os error 10093)
@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 16, 2019

TODO:

  • Ensure everyone who forks/exec's is using std::os. Our signal mask is shared inherited across those calls. The stdlib is careful to reset it before launching a subprocess, but anyone using this process and calling libc directly might be surprised.
Copy link
Contributor

markan left a comment

This generally looks good, but I'm still refreshing my memory on signal handling behavior in threads.

inner: libc::sigset_t
}

impl Sigset {

This comment has been minimized.

Copy link
@markan

markan Oct 16, 2019

Contributor

Is it worth looking at the nix crate (https://github.com/nix-rust/nix/blob/01f4d57eede3ed2f1051778113b1d6a7dff8e9d6/src/sys/signal.rs)
Looks like we already pull it in indirectly, but not directly.

This comment has been minimized.

Copy link
@stevendanna

stevendanna Oct 16, 2019

Author Contributor

Happy to change it over if y'all prefer. It seems the main difference in implementation (aside from them creating more types given their larger API surface area) is they use mem::MaybeUninit::uninit() whereas I use mem::zeroed() and libc::c_int. Based on reading this:

https://doc.rust-lang.org/beta/std/mem/union.MaybeUninit.html

It seems like that is a worthwhile change even if we don't move to the library.

This comment has been minimized.

Copy link
@stevendanna

stevendanna Oct 16, 2019

Author Contributor

Although the documentation here makes me think that I've used mem::zeroed() as intended:

https://doc.rust-lang.org/std/mem/fn.zeroed.html

This comment has been minimized.

Copy link
@markan

markan Oct 16, 2019

Contributor

I looked at the mem::zeroed() and mem::MaybeUninit::uninit() docs, and think that your mem::zeroed usage is correct.

This comment has been minimized.

Copy link
@christophermaier

christophermaier Oct 21, 2019

Contributor

The usage of mem::zeroed here seems OK to me too, but based on https://blog.rust-lang.org/2019/09/26/Rust-1.38.0.html#linting-some-incorrect-uses-of-mem:{uninitialized,-zeroed}, it might be a good idea to move to MaybeUninit anyway.

The nix crate does look interesting, but I don't think we necessarily have to do anything with it right now.

@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 16, 2019

@markan > but I'm still refreshing my memory on signal handling behavior in threads

If you have TLPI, checkout chapter 33.2.

@markan

This comment has been minimized.

Copy link
Contributor

markan commented Oct 16, 2019

stevendanna > Note that I kept the check_for_signal function; however, I'm kinda skeptical of that function as well because the process that calls it only dequeues one signal at a time and only calls it every tick which means we can get pretty overwhelmed by signals.

I was wondering about that as well. Is there any case where we need to resolve multiple instances of a signal, or the order of their arrival? If not could we simply have a static array of flags, one per signal type, and test/process them in some priority order. While I think it's safe now, getting rid of the alloc implicit in push_back and replacing it with updating a static mask would simplify things and make it easier to think about the code.

@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 16, 2019

Is there any case where we need to resolve multiple instances of a signal, or the order of their arrival?

I don't think so. Also, based on my understanding of how signals work, anything that is depending on this would be broken by design since the kernel will "coalesce" signals on you.

@stevendanna stevendanna force-pushed the stevendanna:ssd/sync-signal-handling branch 2 times, most recently from da61ded to 630293b Oct 17, 2019
@stevendanna stevendanna marked this pull request as ready for review Oct 17, 2019
Some(SignalEvent::Passthrough(signal)) => {
self.forward_signal(signal);

if signals::pending_sighup() {

This comment has been minimized.

Copy link
@stevendanna

stevendanna Oct 17, 2019

Author Contributor

Tecnically this is a small behavior change for the launcher. Whereas before we would only handle 1 signal on each tick, now we potentially handle two. If we are concerned we could move this to an if else.

This comment has been minimized.

Copy link
@christophermaier

christophermaier Oct 21, 2019

Contributor

I think this should be fine.

@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 17, 2019

@markan Thanks for the review! I've moved away from the VecDequeue and just added a couple more AtomicBool's. I thought about doing some bitmask type thing, but for now, I wanted the signal behavior we actually handle to be much more explicit in the code. I don't think a bitmask+mutex and a new API would be hard to wire up in the future if we needed it. Ultimately, given the interactions between process, threads, signals, and the complicated 3rd party libraries we use which launch their own threads, I'd rather keep our use of signal-based behavior to a minimum.

On your question of libraries. I defer to y'all who work on this every day. Since a huge function of launcher and sup is subprocess management, I don't think we ever escape from having to own the details, but there is still a trade-off. In this PR, I've deferred to stdlib to do all necessary cleanup before calling execvp but have taken the signal masking operations in-house. To me the big question around whether to use nix is mostly about whether we want to deal with managing the FFI bits. In the very long run, there is also a good looking library signal-hook (that we already have as a transitive dependency) that might be worth looking at as well but I think using it would require more substantive changes to the main loops so I've deferred looking into that for now.

stevendanna added 3 commits Oct 16, 2019
Issue #7044 describes multiple problems with the current signal
handling approach. These problems were introduced in 8e13827 and
render our current error handling unsafe. Specifically, taking a
mutex and allocating memory from inside an signal handler can result
in a deadlock or state corruption.

When using asynchronous signal handling via signal handlers, the code
in the signal handler are restricted to limited number of safe
operations (i.e. flipping an atomic, calling async-signal-safe
functions).

This change hopes to solve the existing issues and avoid future
problems caused by signal handlers by moving us to synchronous signal
handling.

The strategy used here is described in _The Linux
Programming Interface_[0]:

> - All threads block all of the asynchronous signals that the process
>   might receive. The simplest way to do this is to block the signals
>   in the main thread before any other threads are created. Each
>   subsequently created thread will inherit a copy of the main
>   thread's signal mask.
>
> - Create a single dedicated thread that accepts incoming signals
>   using _sigwaitinfo()_, _sigtimedwait()_, or _sigwait()_...

On Linux, signals::init() will now:

- Block all signals that we plan on handling
- Starts a thread that processes signals synchronously.

Because we want to block the signals early in the process, we move
signals::init() much earlier into main.

Resources

[0] Kerrisk, Michael. _The Linux Programming Interface_, 685. San
Franscisco: No Starch Press, 2010.
[1] http://man7.org/linux/man-pages/man2/sigprocmask.2.html
[2] http://man7.org/linux/man-pages/man3/sigwait.3.html
[3] http://man7.org/linux/man-pages/man3/pthread_sigmask.3.html
[4] http://man7.org/linux/man-pages/man3/sigsetops.3.html
[5] http://man7.org/linux/man-pages/man7/signal.7.html

Signed-off-by: Steven Danna <steve@chef.io>
This change moves us from using a VecDequeue under a Mutex for
tracking signals to signal-specific AtomicBools, replacing the generic
`check_for_signal()` function with two more-specific
`pending_sighup()` and `pending_sigchld()` functions.

This has two advantages:

1) No mutexes or memory allocations in our signal handling code. This
   matters less since we have also moved to synchronous handling of
   signals, bug gives us some flexibility to change our approach more
   safely in the future.

2) Clearer alignment between this interface and the application
   behavior.

The old check_for_signal() API looks like it intended to provide the
groundwork for a variety of signal-based behavior; however, in
reality, we are handling a small number of signals. The generic nature
of that interface and the fact that the code is shared between two
processes was obscuring the fact that we actually have a small amount
of signal-related behavior across the two services:

*Linux Behavior*

| signal    | hab-launch behavior | hab-sup behavior     |
|-----------+---------------------+----------------------|
| SIGINT    | Graceful shutdown   | Handled but ignored  |
| SIGTERM   | Graceful shutdown   | Handled but ignored  |
| SIGHUP    | Send to hab-sup     | Shutdown for restart |
| SIGCHLD   | Reap zombies        | Handled but ignored  |
| SIGQUIT   | Handled but ignored | Handled but ignored  |
| SIGALRM   | Handled but ignored | Handled but ignored  |
| SIGUSR1   | Handled but ignored | Handled but ignored  |
| SIGUSR2   | Handled but ignored | Handled but ignored  |
| All other | Default disposition | Default disposition  |

*Windows Behavior*

Windows uses a library which appears to handle the CTRL_C_EVENT. The
set of changes should not change the behavior here, although we are
now installing the handler earlier in the startup of the relevant
applications.

This more explicit interface has the disadvantage of not being
well-suited to a larger variety of signal-based behavior. However, it
seems to me that we can always implement something smarter when that
actually becomes an issue.

Signed-off-by: Steven Danna <steve@chef.io>
We were calling libc::execvp directly here. This is problematic
because ideally we would do various cleanups before calling exec,
including resetting signal masks and signal dispositions. While we
could have done this directly, the standard library already handles
the nitty-gritty details and we already have an internal function that
calls out to it.

Signed-off-by: Steven Danna <steve@chef.io>
@stevendanna stevendanna force-pushed the stevendanna:ssd/sync-signal-handling branch from c715d5c to e17c2de Oct 17, 2019
@jaym

This comment has been minimized.

Copy link
Contributor

jaym commented Oct 17, 2019

Something to discuss: I don't believe we need anything more than relaxed for our atomic memory ordering because the 2 threads are not sharing any other writes/reads. The assumption I'm making here is that the actions taken as a result of the atomic bool are not reordered to happen before the CAS.

@markan
markan approved these changes Oct 17, 2019
Copy link
Contributor

markan left a comment

I very much like these changes, especially the simplification of replacing the VecDeque with a pair of AtomicBool flags

@markan

This comment has been minimized.

Copy link
Contributor

markan commented Oct 18, 2019

This is directed to the more experienced habitat developers:

So when I look at the history of this code, it looks like circa hab 0.50 we handled signals in a similar fashion to this PR. Then in habitat-sh/core#75 we changed that to a queue, because of concerns that multiple signals might be lost or handled out of order.

But (as @stevendanna mentioned) multiple signals to a process can be freely reordered and coalesced, in particular any sent before the receiver is scheduled. It looks like under the covers it's a set of bit flags per process, indicating whether a signal has been sent to it, and when scheduled the signals are delivered in a canonical order https://github.com/torvalds/linux/blob/master/kernel/signal.c#L217. So we can't actually guarantee the behavior that #75 above is intended to provide, especially under load or other duress.

So any behavior that depends on in-order or multiple signal delivery is broken and needs to be modified. But from habitat-sh/core#11, I'm not clear on what depends on that behavior. How do we find out?

Copy link
Contributor

christophermaier left a comment

This looks like a great improvement @stevendanna; thanks very much for digging into this, and for providing such a well-documented solution.

@markan Doing a bit of code archaeology, it seems that before we had the VecDeque, we just had two atomics; a boolean that said "a signal was caught" and an integer that said which signal that was.

https://github.com/habitat-sh/core/blob/c7dff4480dcb8eca049b2ae7b12f2ad34df17656/src/os/signals/unix.rs#L38-L41

Multiple different signals arriving in succession could thus get condensed into whichever was the last one handled, resulting in the potential for missed signals, which is what habitat-sh/core#75 was trying to address (rather than ensuring that we directly handled every instance of every kind of signal). It looks like this current PR provides a more explicit way of dealing with multiple different signals, by dealing concretely with each one we care about, rather than trying to handle them all generically.

inner: libc::sigset_t
}

impl Sigset {

This comment has been minimized.

Copy link
@christophermaier

christophermaier Oct 21, 2019

Contributor

The usage of mem::zeroed here seems OK to me too, but based on https://blog.rust-lang.org/2019/09/26/Rust-1.38.0.html#linting-some-incorrect-uses-of-mem:{uninitialized,-zeroed}, it might be a good idea to move to MaybeUninit anyway.

The nix crate does look interesting, but I don't think we necessarily have to do anything with it right now.

Some(SignalEvent::Passthrough(signal)) => {
self.forward_signal(signal);

if signals::pending_sighup() {

This comment has been minimized.

Copy link
@christophermaier

christophermaier Oct 21, 2019

Contributor

I think this should be fine.

unsafe {
libc::execvp(c_cmd.as_ptr(), argv.as_mut_ptr());
}
become_command(cmd_path, &[])?;

This comment has been minimized.

Copy link
@christophermaier

christophermaier Oct 21, 2019

Contributor

Nice find!

@mwrock
mwrock approved these changes Oct 21, 2019
Copy link
Contributor

mwrock left a comment

This looks good to me and I don't see any problems from a Windows standpoint.

@baumanj

This comment has been minimized.

Copy link
Contributor

baumanj commented Oct 22, 2019

@stevendanna: excellent work, as always. I was indeed mistaken in using locking and allocating functions inside the signal handler 🤦‍♂️.

To that end, I'd suggest at least considering using the signal-hook crate, which looks like it was written with a lot of those subtleties in mind.

@christophermaier

This comment has been minimized.

Copy link
Contributor

christophermaier commented Oct 22, 2019

@stevendanna Was there anything else you wanted to do with this PR? If not, I'm happy to get it merged; I just didn't want to pull the rug out from under you if you had any other tweaks you wanted to do. 😄

I'm going to file an issue to look into signal-hook as a further evolution of this work... it's been on my radar for a little while, since signal handling has always been a bit more obfuscated and complicated in our codebase than I'd like. I agree that it would likely require more extensive refactoring to our code, though.

I think what you've done here in this PR is a very definite improvement, though!

@stevendanna

This comment has been minimized.

Copy link
Contributor Author

stevendanna commented Oct 22, 2019

@christophermaier No other urgent changes on my end. I think some of these suggestions can be done as more focused follow-ups.

@christophermaier christophermaier merged commit 46163ef into habitat-sh:master Oct 22, 2019
5 checks passed
5 checks passed
DCO This commit has a DCO Signed-off-by
Details
buildkite/habitat-sh-habitat-master-verify Build #3762 passed (53 minutes, 41 seconds)
Details
buildkite/habitat-sh-habitat-master-website Build #846 passed (1 minute, 37 seconds)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
expeditor/config-validation Validated your Expeditor config file
Details
@christophermaier

This comment has been minimized.

Copy link
Contributor

christophermaier commented Oct 22, 2019

Here's that signal-hook issue, for posterity: #7076

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.