Reset signal behavior before starting children with std::process #25784

Merged
merged 5 commits into from
Jun 22, 2015

@geofft geofft commented May 25, 2015

UNIX specifies that signal dispositions and masks are inherited by child processes, but in general, programs are not very robust to being started with non-default signal dispositions or to signals being blocked. For example, libstd sets SIGPIPE to be ignored, on the grounds that Rust code using libstd will get the EPIPE errno and handle it correctly. But shell pipelines are built around the assumption that SIGPIPE will have its default behavior of killing the process, so that things like head work:

geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
geofft@titan:/tmp$ cat bash.rs
fn main() {
        std::process::Command::new("bash").status();
}
geofft@titan:/tmp$ ./bash
geofft@titan:/tmp$ for i in `seq 1 20`; do echo "$i"; done | head -1
1
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
bash: echo: write error: Broken pipe
[...]

Here, head is supposed to terminate the input process quietly, but the bash subshell has inherited the ignored disposition of SIGPIPE from its Rust grandparent process. So it gets a bunch of EPIPEs that it doesn't know what to do with, and treats each one as a generic, transient error. You can see similar behavior with find / | head, yes | head, etc.

This PR resets Rust's SIGPIPE handler, as well as any signal mask that may have been set, before spawning a child. Setting a signal mask, and then using a dedicated thread or something like signalfd to dequeue signals, is one of two reasonable ways for a library to process signals. See tokio-rs/mio#16 for more discussion about this approach to signal handling and why it needs a change to std::process. The other approach is for the library to set a signal-handling function (signal() / sigaction()): in that case, dispositions are reset to the default behavior on exec (since the function pointer isn't valid across exec), so we don't have to care about that here.
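A rough way to observe the SIGPIPE half of this change from the outside is to spawn a shell that sends SIGPIPE to itself. This is only a sketch, not the test added in this PR: the helper name `run_sh` and the shell script are made up for illustration, and it assumes a Unix system with `sh` on the PATH, SIGPIPE numbered 13 (as on Linux and macOS), and a toolchain whose std resets SIGPIPE for children the way this PR describes.

```rust
use std::io;
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

/// Hypothetical helper: runs `sh -c script` and returns the signal that
/// terminated the shell (if any) together with whatever it wrote to stdout.
fn run_sh(script: &str) -> io::Result<(Option<i32>, String)> {
    let out = Command::new("sh").arg("-c").arg(script).output()?;
    let stdout = String::from_utf8_lossy(&out.stdout).into_owned();
    Ok((out.status.signal(), stdout))
}

fn main() -> io::Result<()> {
    // The shell signals itself. If SIGPIPE has its default disposition in the
    // child, the shell is killed before `echo` runs. If the child had inherited
    // the ignored disposition, it would carry on and print "survived".
    let (sig, stdout) = run_sh("kill -s PIPE $$; echo survived")?;
    println!("terminating signal: {sig:?}, stdout: {stdout:?}");
    Ok(())
}
```

If the child were still inheriting the ignored disposition, the shell would reach `echo` and the reported signal would be `None`.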

As part of this PR, I noticed that we had two somewhat-overlapping sets of bindings to signal functionality in libstd. One dated to old-IO and probably the old runtime, and was mostly unused. The other is currently used by stack_overflow.rs. I consolidated the two bindings into one set, and double-checked them by hand against all supported platforms' headers. This probably means it's safe to enable stack_overflow.rs on more targets, but I'm not including such a change in this PR.

r? @alexcrichton
cc @Zoxc for changes to stack_overflow.rs

@geofft geofft force-pushed the subprocess-signal-masks branch 2 times, most recently from 3617b67 to 15d62f6 Compare May 25, 2015 23:23
@pnkfelix pnkfelix added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 26, 2015
@@ -143,138 +143,6 @@ mod imp {
pub unsafe fn drop_handler(handler: &mut Handler) {
munmap(handler._data, SIGSTKSZ);
}

Member

Oh dear, thanks for the eagle eyes here! Glad to see this duplication reduced.

@alexcrichton
Member

This is some fascinating investigation @geofft, very nice work!

@geofft
Contributor Author

geofft commented May 26, 2015

> So just to check I poked around at some other libraries to see what they do. In libuv at least, nothing with signals is handled in terms of forking, but they also don't by default disable SIGPIPE

libuv appears to be processing signals via signal handlers. I can look in more detail if you think it's interesting, but if they're doing handlers instead of a thread and asking library users to ignore SIGPIPE themselves, strictly speaking none of this is their responsibility (and I'm also a little unsure if they've thought through their signals-and-subprocesses story enough to be a worthwhile reference). Rust libstd ignores SIGPIPE, so it incurs that responsibility, and while there's no signal-handling API in libstd, mio (for instance) wants to avoid handler functions to avoid EINTR problems. But without cooperation from libstd, unintentionally masking all signals in children is a bad plan.

> I couldn't make heads or tails of what Ruby does

Seems to get this right:

howe-and-ser-moving:~ geofft$ ruby -e 'open("/proc/self/status").each_line { |line| puts line if line.include? "SigIgn" }'
SigIgn: 0000000000001000
howe-and-ser-moving:~ geofft$ ruby -e 'puts `grep SigIgn /proc/self/status`'
SigIgn: 0000000000000000

> but it didn't look like Python did much with signal masks or forking despite ignoring SIGPIPE like we are.

Python 3 (at least 3.4) appears to have fixed this relative to Python 2.7:

howe-and-ser-moving:~ geofft$ python2 -c 'print [line for line in open("/proc/self/status") if "SigIgn" in line][0]'
SigIgn: 0000000001001000

howe-and-ser-moving:~ geofft$ python2 -c 'import subprocess; print subprocess.check_output(["grep", "SigIgn", "/proc/self/status"])'
SigIgn: 0000000001001000

howe-and-ser-moving:~ geofft$ python3 -c 'print([line for line in open("/proc/self/status") if "SigIgn" in line][0])'
SigIgn: 0000000001001000

howe-and-ser-moving:~ geofft$ python3 -c 'import subprocess, sys; sys.stdout.buffer.write(subprocess.check_output(["grep", "SigIgn", "/proc/self/status"]))'
SigIgn: 0000000000000000
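For reading the `SigIgn` values above: the field is a hexadecimal bitmask in which bit n (counting from 0) stands for signal number n + 1. A small decoding sketch follows; the function name `ignored_signals` is made up for illustration. With x86 Linux numbering, 13 is SIGPIPE and 25 is SIGXFSZ.

```rust
use std::num::ParseIntError;

/// Decodes a hex mask, as printed on the SigIgn line of /proc/<pid>/status,
/// into the list of signal numbers it marks. Bit n (from 0) is signal n + 1.
fn ignored_signals(mask_hex: &str) -> Result<Vec<u32>, ParseIntError> {
    let mask = u64::from_str_radix(mask_hex.trim(), 16)?;
    Ok((0u32..64)
        .filter(|&bit| mask & (1u64 << bit) != 0)
        .map(|bit| bit + 1)
        .collect())
}

fn main() {
    // The three distinct masks that appear in the transcripts above.
    for mask in ["0000000000001000", "0000000001001000", "0000000000000000"] {
        println!("{mask} -> {:?}", ignored_signals(mask));
    }
}
```

So `0000000000001000` means only signal 13 is ignored, `0000000001001000` means signals 13 and 25 are ignored, and an all-zero mask means nothing is ignored.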

> Shouldn't that be up to the application instead of the library here instead? For example this makes it so it's impossible to spawn a process with a signal mask in place through the Command API. This info also makes me uneasy about ignoring SIGPIPE by default, but I think that setting it to SIG_DFL here is fine.

Yeah, I do actually mostly agree: there is definitely a use case for intentionally starting a child with ignored signals, e.g., writing nohup in Rust. (I'm less convinced there's a use case for starting a child with a mask, but posix_spawn supports it, so whatever.) What would be nice is a method on the Command object to change signal handling, just like the methods to change standard I/O inheritance -- and a static method that allows libraries to reset signal behaviors (or perhaps run a general-purpose unsafe fn to clean things up) post-fork. But I don't think we need that immediately. I was planning on non-urgently sending in a PR or writing a third-party crate to add some methods to do these sorts of things, but this current PR fixes the immediate problem with SIGPIPE, and also enables libraries like mio to turn on signal masks safely.
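For the nohup-style use case, later Rust releases offer a general post-fork hook, `std::os::unix::process::CommandExt::pre_exec`, which was not available when this thread was written. The sketch below uses it to start a child with SIGPIPE deliberately ignored. Everything here is an assumption for illustration rather than an API proposed in this PR: the hand-written `signal` binding models `sighandler_t` as `usize`, the constants use Linux/macOS values, the helper names are hypothetical, and it relies on the closure running after std's own SIGPIPE reset in the child.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::Command;

// Assumed values: SIGPIPE is 13 and SIG_IGN is 1 on Linux and macOS.
const SIGPIPE: i32 = 13;
const SIG_IGN: usize = 1;
const SIG_ERR: usize = usize::MAX; // ~0

unsafe extern "C" {
    // Hand-written binding (assumption): sighandler_t is modeled as usize.
    fn signal(signum: i32, handler: usize) -> usize;
}

/// Runs in the forked child just before exec; signal() is async-signal-safe.
fn ignore_sigpipe() -> io::Result<()> {
    if unsafe { signal(SIGPIPE, SIG_IGN) } == SIG_ERR {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Hypothetical helper: runs `sh -c script` with SIGPIPE ignored in the child
/// and returns the child's stdout.
fn run_ignoring_sigpipe(script: &str) -> io::Result<String> {
    let mut cmd = Command::new("sh");
    cmd.arg("-c").arg(script);
    // Unsafe because the closure runs between fork and exec.
    unsafe {
        cmd.pre_exec(ignore_sigpipe);
    }
    let out = cmd.output()?;
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}

fn main() -> io::Result<()> {
    // The shell signals itself; with SIGPIPE ignored it should keep running.
    let out = run_ignoring_sigpipe("kill -s PIPE $$; echo survived")?;
    print!("{out}");
    Ok(())
}
```

An ignored disposition survives exec, which is why setting it in the child before exec is enough here. A dedicated `Command` method, as suggested above, would avoid the unsafe block.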

If you're worried about the regression possibility with signal masks (although the docs don't promise anything either way, and I can't imagine any users will ever care, let alone have cared), we could set things up to reset SIGPIPE, and also offer a static method that an external library like mio can call to undo its signal mask. That would also solve my immediate problem, although it's more complexity and unsafe code surface, and I don't think it gains us anything in practice. If you're just worried about the ability to set signal masks one way or another, I can get you a separate PR for that, though probably not immediately.

@alexcrichton
Member

Yeah I think the change here to reset SIGPIPE back to SIG_DFL is definitely the right way to go (e.g. we differ from libuv here). My only pause now is whether to reset the sigmask on a spawn. For example the test here does not require the procmask to be reset, and nor does anything else in the standard library.

Perhaps a test could be added at least which requires procmask to be reset on spawn? I think it's fine to do, but it'd be nice to have a regression test to ensure it's not accidentally removed as well!

@geofft
Contributor Author

geofft commented May 28, 2015

OK, I've added a #[test] to process.rs (so I can use all the internal bindings) that masks SIGINT, spawns cat, writes something, and makes sure that cat dies instead of replying. If I remove the five lines in that file to clear the signal mask, it fails make check-stage1-std, so it's appropriately testing for this.

Is this ready for merge? Are you happy with the changes to the bindings?

@alexcrichton
Member

Looks good to me, thanks @geofft! Can you also tag the test with #[cfg(unix)]? Other than that though, r=me.

@geofft
Contributor Author

geofft commented May 28, 2015

Doesn't its presence in libstd/sys/unix/ make that implicit, since that path is only brought in by libstd/lib.rs's #[cfg(unix)] #[path = "sys/unix/mod.rs"] mod sys;? Like, code that references libc::fork is not really going to compile on Windows.

@alexcrichton
Member

Welp, I got this.

@bors: r+ 8e2e0ee

bors added a commit that referenced this pull request May 28, 2015
@bors
Contributor

bors commented May 28, 2015

⌛ Testing commit 8e2e0ee with merge 52f12c9...

@bors
Contributor

bors commented May 28, 2015

💔 Test failed - auto-linux-64-x-android-t

@geofft
Contributor Author

geofft commented May 28, 2015

Oops, fixed. Can I test-build Android without having a local Android dev environment? make check-stage1-T-arm-linux-androideabi-H-x86_64-unknown-linux-gnu-std doesn't seem to like me.

@alexcrichton
Member

@bors: r+ 824a928

You'll need to pass --target=arm-linux-androideabi to ./configure to enable a target like that.

@bors
Contributor

bors commented May 28, 2015

⌛ Testing commit 824a928 with merge 2b457ad...

bors added a commit that referenced this pull request May 28, 2015
@bors
Contributor

bors commented May 28, 2015

💔 Test failed - auto-linux-64-x-android-t

@geofft
Contributor Author

geofft commented May 28, 2015

Bah, I think I should go and set up build toolchains for all platforms. A #![cfg_attr(target_os = "linux", allow(dead_code))] (or just reintroducing #![allow(dead_code)]...) will probably fix the immediate problem, but maybe you want a nicer fix, and maybe something's broken on Bitrig or whatever.

@alexcrichton
Member

Ah feel free to throw up just a blanket #![allow(dead_code)] on c.rs, we're not that worried about dead code in that module anyway.

@alexcrichton
Member

@bors: r+ 9a676f1

@bstrie
Contributor

bstrie commented May 28, 2015

> This probably means it's safe to enable stack_overflow.rs on more targets, but I'm not including such a change in this PR.

Can you file a bug for this so we remember to look at it later?

@bors
Contributor

bors commented May 29, 2015

⌛ Testing commit 9a676f1 with merge 623072d...

bors added a commit that referenced this pull request May 29, 2015
@bors
Contributor

bors commented May 29, 2015

💔 Test failed - auto-linux-64-x-android-t

@alexcrichton
Member

@bors: r+ 2a93dca

Thanks!

@bors
Contributor

bors commented Jun 21, 2015

🔒 Merge conflict

@bors
Contributor

bors commented Jun 21, 2015

☔ The latest upstream changes (presumably #25641) made this pull request unmergeable. Please resolve the merge conflicts.

@sfackler
Member

@bors r=alexcrichton

@bors
Contributor

bors commented Jun 21, 2015

📌 Commit 3a3d864 has been approved by alexcrichton

@bors
Contributor

bors commented Jun 21, 2015

⌛ Testing commit 3a3d864 with merge 00b862b...

@bors
Contributor

bors commented Jun 21, 2015

💔 Test failed - auto-linux-64-x-android-t

@geofft
Contributor Author

geofft commented Jun 21, 2015

Hrrm. I don't think this test failure is related to my code, despite being on Android again.

rustc: x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd
../src/libstd/lib.rs:152:21: 152:35 error: unused or unknown feature, #[deny(unused_features)] on by default
../src/libstd/lib.rs:152             feature(num_bits_bytes))]
                                             ^~~~~~~~~~~~~~
error: aborting due to previous error

Seems related to #26192, but the Android tests passed there, so I'm confused.

c::pthread_sigmask(c::SIG_SETMASK, &set, ptr::null_mut()) != 0 ||
libc::funcs::posix01::signal::signal(
libc::SIGPIPE, mem::transmute(c::SIG_DFL)
) == mem::transmute(c::SIG_ERR) {
Member

Could this use as foo instead of mem::transmute? The transmute may be a bit of a heavy hammer for this operation.

Contributor Author

Transmute checks size, and SIG_ERR is ~0. Really the problem is that liblibc and c.rs have different sighandler_t types. c.rs uses *mut c_void since that's what we need for struct sigaction (well, really a function pointer); liblibc calls it a size_t, which is probably the same but I didn't want to assume that. Ideally we should synchronize these.

(Actually what I really want is an extended nullable-pointer optimization where I can say that 0, 1, and ~0 are invalid and force an enum {SIG_DFL = 0, SIG_IGN = 1, SIG_ERR = ~0, handler(extern fn(c_int))} to be represented as a single pointer, but I assume that's super hard.)

Member

Wow that definitely sounds like a mess, this already landed so it's fine for now, but I may do a drive-by fix at some point to just cast these both to a usize and compare that (not critical at all though)
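The "cast both to a usize and compare" idea can be sketched without any compiler support for the single-pointer enum layout wished for above. This is an illustration only: the `SigHandler` type, `to_raw`, and `is_sig_err` are hypothetical names, not what libstd or liblibc define, and the constants are the conventional 0, 1, and ~0 values mentioned in this thread.

```rust
use std::os::raw::c_int;

// Conventional raw values discussed above: 0, 1, and ~0.
const SIG_DFL: usize = 0;
const SIG_IGN: usize = 1;
const SIG_ERR: usize = usize::MAX;

/// Hypothetical typed view of a signal disposition. Unlike the enum wished
/// for above, this one is not guaranteed to be pointer-sized.
#[derive(Clone, Copy)]
enum SigHandler {
    Default,
    Ignore,
    Handler(extern "C" fn(c_int)),
}

impl SigHandler {
    /// Lowers the typed value to the raw pointer-sized integer form.
    fn to_raw(self) -> usize {
        match self {
            SigHandler::Default => SIG_DFL,
            SigHandler::Ignore => SIG_IGN,
            SigHandler::Handler(f) => f as usize,
        }
    }
}

/// Compares a raw return value against SIG_ERR as plain integers,
/// with no transmute involved.
fn is_sig_err(raw: usize) -> bool {
    raw == SIG_ERR
}

// A do-nothing handler used only to obtain a real function address.
extern "C" fn on_signal(_signo: c_int) {}

fn main() {
    println!("SIG_DFL raw = {}", SigHandler::Default.to_raw());
    println!("SIG_IGN raw = {}", SigHandler::Ignore.to_raw());
    let raw = SigHandler::Handler(on_signal).to_raw();
    println!("handler raw = {raw:#x}, is SIG_ERR: {}", is_sig_err(raw));
}
```

Keeping the comparison in integer space sidesteps the mismatch between the `*mut c_void` and `size_t` handler types, at the cost of assuming both are pointer-sized.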

@alexcrichton
Member

Ah unfortunately that constant is unstable currently, and it just looks like you deleted the usage of it, so you may be able to delete the associated cfg_attr entirely.

It looks like a lot of this dated to previous incarnations of the io
module, etc., and went unused in the reworking leading up to 1.0. Remove
everything we're not actively using (except for signal handling, which
will be reworked in the next commit).

Both c.rs and stack_overflow.rs had bindings of libc's signal-handling
routines. It looks like the split dated from rust-lang#16388, when (what is now)
c.rs was in libnative but not libgreen. Nobody is currently using the
c.rs bindings, but they're a bit more accurate in some places.

Move everything to c.rs (since I'll need signal handling in process.rs,
and we should avoid duplication), clean up the bindings, and manually
double-check everything against the relevant system headers (fixing a
few things in the process).

Make sure that child processes don't get affected by libstd's desire to
ignore SIGPIPE, nor a third-party library's signal mask (which is needed
to use either a signal-handling thread correctly or to use signalfd /
kqueue correctly).

signal(), sigemptyset(), and sigaddset() are only available as inline
functions until Android API 21. liblibc already handles signal()
appropriately, so drop it from c.rs; translate sigemptyset() and
sigaddset() (which is only used in a test) by hand from the C inlines.

We probably want to revert this commit when we bump Android API level.

@sfackler
Member

@bors r=alexcrichton

@bors
Contributor

bors commented Jun 22, 2015

📌 Commit a8dbb92 has been approved by alexcrichton

@bors
Contributor

bors commented Jun 22, 2015

⌛ Testing commit a8dbb92 with merge 4e2a898...

bors added a commit that referenced this pull request Jun 22, 2015
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

8 participants