Signal Handling #16

Closed
wycats opened this Issue Sep 22, 2014 · 15 comments

Projects

None yet

5 participants

@wycats
Contributor
wycats commented Sep 22, 2014

Signals, how do they work?

The goal of this proposal is to make it possible to write programs using mio that can successfully handle Unix signals.

Unfortunately, signal handling is somewhat tricky, made even more complex by the way Unix delivers signals to multi-threaded programs.

The basics:

  • By default, each signal has one of these effects:
    • Ignored
    • Terminate
    • Terminate and dump core
  • It is possible for a program to register a handler for most signals
    (SIGKILL and SIGSTOP cannot be caught).
  • Individual threads can choose to mask signals
  • There can only be one handler function for all threads in the entire
    program that are handling the signal.
  • For most signals, if a signal arrives and more than one thread has the
    signal unmasked, the operating system will deliver it to an arbitrary
    thread.
  • For signals that represent hardware failure (SIGSEGV, SIGFPE), the
    signal is delivered to the thread that generated the failure.
  • The sigwaitinfo function allows a thread to synchronously wait for a set
    of signals. It can atomically unmask a set of signals and wait, which
    avoids race conditions.
  • In Linux, signalfd creates a file descriptor that can be used to
    synchronously (and atomically) wait on signals, in a way that can work with
    the multiplexing APIs (epoll_wait).
  • The "self-pipe trick" can be used with sigwaitinfo to emulate signalfd,
    at the cost of an additional thread that is responsible for waiting on
    signals.

At a high level, the goal for mio is to allow a consumer of a reactor to register interest in signals, to be delivered to a mio handler.

This means that programs that want to use this facility will ignore signals on all threads, and we will use sigwaitinfo or signalfd to allow the reactor to register interest in signals. This also means that only one reactor can be interested in signals. Otherwise, an arbitrary interested reactor would receive the signals.

If a program uses this facility without successfully ignoring signals, signals may be delivered to random user threads instead of the reactor.

Initialization

To make it easy for users to successfully ignore signals across their entire program, a new function, mio::initialize, is added. Programs should run this function at the beginning of their main function, to ensure that any subsequent threads ignore signals, ensuring that the mio reactor gets notified of signals.

Handler API

mio::Handler gets a new callback:

pub trait Handler<T: Token, M: Send> {
    // ...
    fn signal(&mut self, reactor: &mut Reactor<T, M>, info: Siginfo);
}

The information provided to the handler will be the subset of siginfo_t that is reliably supported on Linux and Darwin. It will be organized by signal type; in particular fields related to SIGCHLD will be grouped together.

Control

In order to ensure that signal notifications are sent to the reactor loop, we need to:

  • At minimum, mask all signals for the reactor thread, so that signals are
    only delivered when waiting.
  • Ideally, mask all signals for all threads, so that signals do not get
    delivered to other threads.

It is in theory possible to use this facility without control over signal masking, but that will mean that signals can be missed if they get dispatched to another thread. For programs that want to handle signals, this is very unlikely to be desirable.

EINTR

When a thread recieves a signal during a blocking system call, the system call may return with an EINTR error.

Typically, this means that system calls must guard themselves against this possibility and attempt to retry the call in the case of EINTR. There is even a (not fully reliable) sigaction flag SA_RESTART that can be used to help with this problem.

For mio internals, this problem should not exist:

  • We require that the reactor thread mask all signals, and deliver signals
    using the signal handler method, not sigaction.
  • We use atomic functionality (signalfd or sigwaitinfo) to unmask the
    signals only at the same time as we know we are waiting for a signal.
  • As a result, all other blocking system calls will never be interrupted by a
    signal.

For programs that want to sanely handle signals with mio, this problem also should not exist:

  • Programs that want to handle signals through mio callbacks will need to mask
    signals process-wide.
  • As above, we use atomic functionality to unmask signals only at the same
    time that we are waiting for a signal.
  • If signals are masked process-wide, this means that signals will always be
    queued and delivered to the expected system call, and never interrupt other
    system calls.

It is true that this strategy requires some control over the process, or at > least the thread running the reactor. However, if a user of mio wants to be
notified of signals, they will have to ensure that mio, and only mio, can
receive signals. Otherwise, signals may be silently dropped.

Also, if people have a use-case for using mio in a thread that cannot have
signals masked, we can revisit handling internal-to-mio EINTRs by
retrying. At the moment, it seems very strange to need to allow the same
thread that runs the mio reactor to register a sigaction and receive the
signal through a facility other than mio, but I may be missing something
obvious!

Implementation

Linux (epoll + signalfd)

On Linux, we simply use signalfd to accept all signals, after having masked all signals on the thread.

We then register signalfd with epoll. When epoll_wait returns, if the FD was readable, we consume signals via read, get the associated signalfd_siginfo stuctures and call the handler.

Other (Self-Pipe + Emulated signalfd)

In the interest of simplicity, on platforms without signalfd, we will emulate it using a separate thread and self-pipe.

  • Create a thread that is dedicated to waiting for signals.
  • Use sigwaitinfo in a loop to listen for signals
  • When signals arrive, write their siginfo_t structures into the pipe
  • Use the receiving end of the pipe with the system's multiplexing facility
    (e.g. kqueue)

It seems possible to implement the functionality without a separate thread
using kqueue. If that proves important, we can investigate more
platform-specific options.

@wycats wycats added this to the 0.3.0 - Annouce milestone Sep 22, 2014
This was referenced Sep 22, 2014
@carllerche
Owner

Couple of questions:

  • What does mio::initialize do exactly?
  • What is the strategy for multiple mio reactors on different threads?
@wycats
Contributor
wycats commented Sep 28, 2014

@carllerche

What does mio::initialize do exactly?

It masks out all the signals on the current thread. The intent is to run it early on in main, so that subsequent threads inherit the masking and the signalfd (or polyfill) is guaranteed to get the signals.

What is the strategy for multiple mio reactors on different threads?

Each reactor could opt into receiving signals. In practice, this means you should do one of the following:

  • Dedicate a single reactor to signals, and all signals will go to it.
  • Dedicate multiple reactors to signals to load balance signals handling across them. This would mean that no one reactor would see all signals, and the kernel would deliver signals to an arbitrary reactor that opted in to signal handling.
@vbuslov
Contributor
vbuslov commented Jan 8, 2015

I've been thinking about this issue and can't decide how to implement signal handling for OS X.

kqueue supports signals. However it only provides signal code and number of times the signal was received between polls. This means its not possible to have full Siginfo for fn signal(&mut self, reactor: &mut Reactor<T, M>, info: Siginfo); method.

Emulated signalfd with separate thread should work but I'm not sure how to support multiple reactors with this implementation. We need a way for each EventLoop to register and deregister itself with one well-known signal handling thread. (Create channel in mio::initialize to send signalfd per each EventLoop to signal-handler thread?)

To summarise possible implementations:

  1. Simple(zero overhead) implementation of signal handling without Siginfo
  2. Trying to provide signal handling in fully cross platform way seems to be a "leaky abstraction" even among Unixes and MIO declares future Windows support, so make Siginfo somehow platform-specific.
  3. Try to implement "heavy" emulated signalfd hack with separate thread.
@wycats
Contributor
wycats commented Jan 12, 2015

@vbuslov I've been pretty slow on this, but here is my point of view:

  • I've been working on implementing support for signals in nix. Originally, my idea was to support the same signal API across all *Nixes.
  • I also assumed that I would use signalfd on modern Linuxes.
  • I quickly realized that the signal API in OSX is actually quite out of date. Specifically sigtimedwait and sigwaitinfo both don't exist on OSX.
  • That means that the only wait to get a siginfo_t in OSX is to use sigaction. The good news about sigaction is that it can be called without a new thread. The bad news is that you're limited to async-safe functions in a sigaction handler.

I think what I would do is to signalfd on modern Linux, and do the following on all other Unixes:

  • Create a pipe (self-pipe trick)
  • Create a new portable Rust definition of siginfo, that contains the fields you care about. We'll use the same struct when signalfd is used.
  • Register a sigaction handler (an sa_sigaction, which gets the siginfo) whenever you would register a signalfd handler.
  • Whenever the sigaction handler is called, write a copy of the Rust siginfo into the pipe (write is async-safe)
  • Now you have a pipe you can multiplex on in mio.

Does that sound plausible? Did I miss anything? Anything else I can add?

@geofft
Contributor
geofft commented May 8, 2015

The bad news is that you're limited to async-safe functions in a sigaction handler.

Sadly the Async-Signal-Safe rules only help you with libc, not libstd, jemalloc, or your own code. You're also compelled not to deadlock or otherwise reach unsafety on your own even if you're not calling libc unsafely. So, for instance, you can't do allocations except on the stack, and you can't access any data structures where code needs to complete to dynamically uphold safety invariants.

I believe you can get away with the self-pipe trick + sending Siginfo via libc write, so long as you keep a single fd in some global structure that's only modified before calling sigaction, and so long as the signal handler allocates the Siginfo itself on the stack. Anything more complicated is risky. (I think, for instance, that looping through a Vec<Fd>, is unsafe, since the signal could be delivered in the middle of Vec::push, which does not use any compiler memory barriers, so len could be updated before ptr is valid. But I don't really understand compiler reordering.)

This does seem like it makes it hard to support multiple reactors. :( Perhaps there's no use case for multiple reactors independently interested in signals (since signals are process-wide anyway), and they can all select on a single fd, or something.

BTW, do you have partial work on this? I'm interested in seeing signal and signalfd support and would be happy to try to finish up a branch.

@geofft
Contributor
geofft commented May 9, 2015

Ugh, another problem here: signal dispositions get inherited to child processes, and at the very least, std::process doesn't reset signal dispositions on fork. So the proposed mio::initialize API will cause inexplicable breakage in child processes: e.g., Ctrl-C and kill (without -9) won't work, a shell script that pipes an infinite-output command to head and waits for the command to die with SIGPIPE will hang, etc.

(Even if mio were able to conspire with std::process, this wouldn't help for processes spawned from other APIs, e.g. native code that runs helper processes, like grantpt(3) or pam_unix or utempter. See https://lwn.net/Articles/415684/ for a discussion of why everything is terrible. There's a mention at the bottom that GHC's runtime gave up on signalfd for this reason, which might be illuminating to follow up on.)

So I think a mio::initialize API isn't going to work. Even a hypothetical variant of mio::initialize that only masked out signals on other threads isn't really what you want: if you have a multithreaded program, you presumably want those other threads to be able to spawn children (either with std::process or via foreign libraries that spawn children) safely.

I'm not seeing a solution to this other than using the self-pipe trick on all platforms (where the signal could get delivered on any thread, but that's OK), with the side effect of requiring all code to be EINTR-safe, or disallowing usage of mio in nontrivially multithreaded programs. (Background threads from mio itself, for a garbage collector or something, etc. are fine, as long as they don't ever spawn children.) I'm hoping there's a better solution here that I'm just missing. :-(

@wycats
Contributor
wycats commented May 9, 2015

Hey @geofft. Are you on IRC? Would be glad to chat about this 😄

@carllerche
Owner

@geofft Yeah, ping us on the #mio channel (Mozilla IRC server: irc.mozilla.org).

@geofft
Contributor
geofft commented May 14, 2015

@wycats and I talked a bit on Sunday about the inheritance issue. To summarize how we understand things:

  1. In order to reliably deliver a signal to a single thread, every other thread in the process has to block (aka mask) that signal. Quoting Linux's signal(7) man page: "A process-directed signal may be delivered to any one of the threads that does not currently have the signal blocked. If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal." But signal masks get inherited: "A child created via fork(2) inherits a copy of its parent's signal mask; the signal mask is preserved across execve(2)." There is no API to mask a signal but have it get cleared in the child, nor an API to request that signals get delivered to a signal thread, other than masking it on all other threads.
  2. If we don't want signals to be handled all by a single thread, then we have to use a self-pipe handler on all threads. So, we have an unfortunate choice between risking problems with EINTR on all user threads, and risking problems with ignored/blocked signals getting inherited from all user threads.
  3. @wycats is worried, from experience, about C libraries that don't handle EINTR properly. It looks like native Rust code is probably going to be fine (rust-lang/rust#11214), but lots of C libraries aren't, and this is a pretty big use case. He's also worried about cases where there is no correct way to restart a syscall after EINTR, even if you wanted to, because of sharp corners in the POSIX API (close(2) has this problem, for instance).
  4. I'm worried about libraries that fork and exec children without clearing signal masks. In particular, neither Rust's std::process nor -- surprisingly -- libc's system(3) clear the signal mask before running a child. (This behavior is implied by POSIX's sample implementation, and at least glibc and musl follow POSIX.) However, I'm willing to believe that there are way fewer of these -- just about all libraries call system calls frequently, few libraries spawn processes and they do so in very few places -- and that this is much more tractable a problem than getting every third-party library to deal with EINTR correctly.

So, we need to go around to all the C libraries that spawn helper processes, and make sure that they properly unmask signals (and potentially reset all signal handlers first) between fork and exec. We also need to get the same change into Rust's std::process, so that at the very least, you can safely spawn processes from Rust code.

(Note that pthread_atfork(3) doesn't help here: POSIX doesn't require they get run from system(3) or posix_spawn(3), and neither glibc nor musl do. It might be useful for std::process to have some API for running post-fork handlers in the long term, but it won't help us right now, and a nice interface for writing post-fork-safe code is going to be complicated.)

I'm still unhappy about the mio::initialize() API that masks all signals unconditionally, because of the risk that this will interact poorly with child processes for no benefit (e.g., you might just care about SIGWINCH, but you'll break SIGINT, SIGPIPE, etc. in children from system(3)). I suppose you could pass it a set of signals you might ever care about, but I'm not sure how great an option that is.

@geofft
Contributor
geofft commented May 14, 2015

Somewhat unrelatedly, I think I'm concluding that it's strictly worse to use signalfd on Linux than a dedicated thread with reentrant, async-safe handlers. The basic trouble is that all the normal (non-real-time) signals are permitted to coalesce, if multiple of them get delivered while they're being blocked. So if you write a program that blocks SIGCHLD, opens a signalfd for SIGCHLD, spawns two children that exit immediately, and then reads from the signalfd, you will nondeterministically get either one or two signalfd_siginfos from the signalfd, depending on whether both of the processes exited before you received the first one, and the info for the second child will be lost. I've written up a test case (using a hacked-up nix) that demonstrates this. :(

See also GHC ticket #2451, comment 19 onwards:

Really? My (vague) understanding is that a single SIGCHLD might be generated even though multiple children have exited -- i.e. SIGCHLD will not be queued the way real-time signals are. I see that w/ a test program which blocks SIGCHILD w/ sigprocmask -- 2 children exit, only one signal handler gets invoked.

If we want to get one siginfo per SIGCHLD (or other normal signal), I think we'll have to register a real signal handler with SA_NODEFER, and carefully make sure that it's re-entrant. (Which will involve some hilarious framing issues if we get interrupted while writing to our own self-pipe thread, but what can you do.) Since we need the separate thread on OS X anyway, I'm not terribly sad about requiring it on Linux.

Alternatively, we could decide that the semantics for mio signal delivery involve siginfos as a best-effort thing, but acknowledge that they may get coalesced, and you may only get one siginfo if multiple signals of the same type were delivered between calls to the reactor.

@rrichardson
Collaborator

I think there are three major subtasks in this issue:

  1. Make syscalls within mio/nix-rust signal safe.
  2. Make mio event_handler deal reasonably with signals such as SIGINT
  3. Make a signal delivery system.

I am of the opinion that items 1 and 2 are worthwhile tasks. 3 is not. I might be wrong, I am willing to defer to the results of a survey of developers for the market viability of signal delivery as a feature. My instinct is that, frankly, if someone is interested in organizing their management of signals they've made some wrong architectural decisions and I am not interested in helping someone do the wrong thing. That's as much as I'll say about item 3.

Re item 1, retrying on eintr is a pretty good thing to do in general (except for close()1) While I don't think much effort spent on signals is warranted as a lib designer, I do think that we should operate correctly if someone does have a legitimate use case for SIGUSR or SIGCHLD.

Re item 2, in my career as a POSIX dev, pretty much every mature software dev organization on *nix had an application framework context that they initialized. It set a static global variable named something intelligent like "volatile int __running_g = 1;" And the application context just set a handler for SIGINT and set that __running_g to 0, then the event loop would shut down, worker threads would spin down, etc. It doesn't matter which thread gets picked to handle the signal, the handler gets run, the global atomic variable gets tripped and things begin shutting down gracefully. I'd probably recommend a configuration setting in Mio EventLoop which is handle_sigint or something to let people opt out.

[1] close() isn't really as broken as the article referenced above might suggest, the most correct behavior is to not retry on close. The worst case is very remote possibility is you do leak a file descriptor. I don't know what the exact odds are, but let's say a 1 in 1,000,000 chance, let's also put the odds of a signal occuring during a close() in your program at 1,000,000 for round numbers. Also a paranoid sysadmin put the per-user fd limit at 4096 for some silly reason. Well then you'd need to handle 4e15 signals in your process to run out of fds. If you came anywhere close to that, I'd say that you're doing something really dumb with signals in the first place :)

@geofft
Contributor
geofft commented May 18, 2015

The things that have me interested in signal handling are clean handling of SIGINT, SIGWINCH (terminal window was resized), SIGTSTP (Ctrl-Z was pressed, but you can respond to it with custom behavior), etc. There are a couple of things that are only really doable with signals, unfortunately, and there have been a handful of questions on #rust recently about how to deal with signals. Some people have custom cleanup they want to do on SIGINT, some people want to interrupt their shell prompt but not terminate the program, etc.

I'm actually less concerned about SIGUSR1/2 personally, since there other, more expressive ways to do that sort of thing (named pipes, D-Bus, etc.), although yes, it'd be nice to handle if it's easy. My personal motivation is things that the OS interface requires that you handle via signals -- that is, cases where you can't make a better architectural decision even if you wanted to.

I think retry-on-EINTR is generally correct to do, but it's a pretty massive project to get all FFI'd C libraries to do so. Unless we're worried about some side effect of spawning a signal-handling thread, we might as well avoid harm. (That is, there are lots of cases where EINTR causes problems, but apart from the signal mask getting inherited, I don't think there are any cases, at least on halfway-modern platforms, where a separate signal-handling thread causes problems.)

@rrichardson
Collaborator

Maybe we're saying the same thing here, but I don't think it's possible to generically designate a signal handling thread. Unknown future libraries in the application will create a thread and ruin everything :)

I am all for providing some cleanly wrapped library functions that let a user do it themselves, provided they know what they're doing. Unless I'm missing something, all they'd need at that point to interact with EventLoop is a Notify Sender channel. I don't even think it would require a special callback in Handler, notify should be sufficient, obviously Message would need to understand how to carry said information, but that is entirely up to the API user, IMO.

@geofft
Contributor
geofft commented May 18, 2015

By the way, it was pointed out to me that you don't get reliable signal delivery over re-entrant sigaction handlers either: two signals can be delivered before the signal-handling thread is scheduled, and the handler will only be activated once. (See this C demonstration of the issue, or more details in my blog post on the subject.) So reliable siginfo for SIGCHLD is impossible using any approach, and I think it'd be a mistake to expose the pid from SIGCHLD's siginfo, since that would encourage people to treat it as reliable when it's not. POSIX really doesn't want SIGCHLD to mean anything other than "at least one of your children exited", so we should follow that. (@wycats says he's okay dropping siginfo on cross-platform-compatibility grounds anyway, and I can't see anything other than SIGCHLD where siginfo is likely to be useful -- other than SIGSEGV and friends, which we can't handle in an evented fashion anway.)

So this simplifies the API that mio needs to expose, and gets us closer to offering a clean abstraction like we do for sockets or timers, instead of something that's closely related to OS quirks. It also lets us get away with using signalfd on Linux if we'd like (since it turns out that signal handlers aren't actually more capable than signalfd). This avoids the separate thread, but it's not clear how useful that is.

I do wonder if there's value in mio offering subprocess management in the mainloop, and offering an API that doesn't specifically reference SIGCHLD, just like it offers its own APIs for TCP sockets, etc. It seems like kqueue's EVFILT_PROC lets us notify on processes directly, and Linux might get a CLONE_FD, so it'd be nice to use those interfaces on platforms where we can bypass SIGCHLD.

@carllerche
Owner

After talking more with @geofft, I believe that signal handling would be better off as a standalone library (built on top of nix). The reasons:

  • It is not really compatible with running multiple event loops in a given process since signals are global.
  • It will be pretty specific to *nix platforms.
  • Users of a signal lib can register handlers that communicate with running event loops via the notification system.

I'm hoping that @geofft will write this library 😄

@carllerche carllerche closed this May 28, 2015
@bors bors added a commit to rust-lang/rust that referenced this issue May 28, 2015
@bors bors Auto merge of #25784 - geofft:subprocess-signal-masks, r=alexcrichton
UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:

```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```

Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.

This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
52f12c9
@bors bors added a commit to rust-lang/rust that referenced this issue May 28, 2015
@bors bors Auto merge of #25784 - geofft:subprocess-signal-masks, r=alexcrichton
UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:

```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```

Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.

This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
2b457ad
@bors bors added a commit to rust-lang/rust that referenced this issue May 29, 2015
@bors bors Auto merge of #25784 - geofft:subprocess-signal-masks, r=alexcrichton
UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:

```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```

Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.

This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
623072d
@bors bors added a commit to rust-lang/rust that referenced this issue Jun 21, 2015
@bors bors Auto merge of #25784 - geofft:subprocess-signal-masks, r=alexcrichton
UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:

```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```

Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.

This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
00b862b
@bors bors added a commit to rust-lang/rust that referenced this issue Jun 22, 2015
@bors bors Auto merge of #25784 - geofft:subprocess-signal-masks, r=alexcrichton
UNIX specifies that signal dispositions and masks get inherited to child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets `SIGPIPE` to be ignored, on the grounds that Rust code using libstd will get the `EPIPE` errno and handle it correctly. But shell pipelines are built around the assumption that `SIGPIPE` will have its default behavior of killing the process, so that things like `head` work:

```
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]
```

Here, `head` is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of `SIGPIPE` from its Rust grandparent process. So it gets a bunch of `EPIPE`s that it doesn't know what to do with, and treats it as a generic, transient error. You can see similar behavior with `find / | head`, `yes | head`, etc.

This PR resets Rust's `SIGPIPE` handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like `signalfd` to dequeue signals, is one of two reasonable ways for a library to process signals. See carllerche/mio#16 for more discussion about this approach to signal handling and why it needs a change to `std::process`. The other approach is for the library to set a signal-handling function (`signal()` / `sigaction()`): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in `libstd`. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by `stack_overflow.rs`. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable `stack_overflow.rs` on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to `stack_overflow.rs`
4e2a898
@geofft geofft referenced this issue in nix-rust/nix Sep 4, 2015
Closed

Add signalfd support #186

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment