
std::os::unix::process::CommandExt::before_exec should be unsafe #39575

Closed
fweimer opened this issue Feb 6, 2017 · 73 comments


@fweimer commented Feb 6, 2017

The before_exec method should be marked unsafe because after a fork from a multi-threaded process, POSIX only allows async-signal-safe functions to be called. Otherwise, the result is undefined behavior.

The only way to enforce this in Rust is to mark the function as unsafe.

So to be clear, this should not compile:

use std::process::Command;
use std::os::unix::process::CommandExt;

fn main() {
    let status = Command::new("sh")
        .arg("-c")
             .arg("echo world")
        .before_exec(|| { println!("hello"); Ok(()) })
        .status().unwrap();
    println!("status: {}", status);
}
@alexcrichton (Member) commented Feb 6, 2017

Could you clarify the unsafety that can happen here? We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.

@fweimer (Author) commented Feb 6, 2017

It's probably best to treat anything that is undefined according to POSIX as a memory safety violation. Once POSIX says that you cannot assume anything about the process's behavior, you need to assume that memory safety is violated as well.

I'll try to give two examples why this facility is unsafe in practice. I have to argue based on the glibc implementation, but we can hopefully assume that it is a valid implementation, matching the POSIX requirements in play here.

First, assume that you have a PRNG for use for cryptography written in Rust. This PRNG uses locks or thread-local variables to protect its internal state. It does not have fork protection because there is no way to implement that in Rust. This PRNG will return the same sequence of bytes in the before_exec callback and the parent thread, violating the expectations associated with a PRNG. Maybe this does not count as a memory safety violation, but I think it is still a problem for PRNGs.
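
A minimal sketch of that first problem (hypothetical code, not from the report itself): a toy PRNG whose state sits behind a Mutex, driven through the then-safe before_exec. Parent and child print the same "random" value because fork duplicates the state:

use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::Mutex;

// Toy PRNG state; a real CSPRNG has more state, but fork duplicates it just the same.
static STATE: Mutex<u64> = Mutex::new(0x9E37_79B9_7F4A_7C15);

fn next() -> u64 {
    let mut s = STATE.lock().unwrap();
    *s = s.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
    *s
}

fn main() {
    Command::new("true")
        .before_exec(|| {
            // Runs in the forked child: the duplicated state yields the same
            // value the parent will print below.
            eprintln!("child:  {:016x}", next());
            Ok(())
        })
        .status()
        .unwrap();
    eprintln!("parent: {:016x}", next());
}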

For the second problem, we look at the implementation of recursive mutexes in libpthread. They embed a PID/TID, a small numeric value which gets reused over time. Let's look at this sequence of events:

  1. A thread in the parent process acquires a recursive mutex.
  2. Another thread invokes an external command with a before_exec callback.
  3. The before_exec callback invokes another external command with another before_exec callback.
  4. The initial parent process exits (so the thread in step 1 ceases to exist and its TID can be reused).
  5. The second before_exec callback spawns a new thread. The TID of that thread happens to match the TID of the thread in step 1.
  6. This new thread attempts to acquire the recursive mutex which has been acquired in step 1.

The attempt in step 6 will succeed because libpthread incorrectly assumes that the same thread tried to acquire the mutex. But the process was copied while the thread in step 1 was holding the mutex, so the data protected by it may not be consistent, and the thread in steps 5 and 6 may access intermediate results which are not supposed to be visible. This is a thread safety violation, and hence a memory safety violation.

I think the second example at least is quite compelling.
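
The exact TID-reuse interleaving above is hard to reproduce on demand, but the first half of the problem is easy to sketch (hypothetical code, again using the then-safe before_exec): fork copies the address space while another thread holds a lock mid-update, so in the child the lock is owned by a thread that no longer exists and the data behind it is frozen in an inconsistent state.

use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    // Invariant: both fields are equal whenever the lock is not held.
    let data = Arc::new(Mutex::new((0u64, 0u64)));

    // A parent thread that holds the lock while the invariant is temporarily broken.
    let writer = Arc::clone(&data);
    thread::spawn(move || loop {
        let mut guard = writer.lock().unwrap();
        guard.0 += 1;
        thread::sleep(Duration::from_millis(1)); // invariant broken here
        guard.1 += 1;
    });
    thread::sleep(Duration::from_millis(5)); // let the writer get going

    let child_view = Arc::clone(&data);
    Command::new("true")
        .before_exec(move || {
            // fork() copied the lock in whatever state it was in; the owning
            // thread was not copied, so a held lock stays held forever here.
            match child_view.try_lock() {
                Ok(guard) => eprintln!("child sees {:?}", *guard),
                Err(_) => eprintln!("child: lock held by a thread that no longer exists"),
            }
            Ok(())
        })
        .status()
        .unwrap();
}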

@alexcrichton (Member) commented Feb 6, 2017

That'd do it!

@Mark-Simulacrum (Member) commented May 20, 2017

This is not marked as unsafe today and to do so would be backwards incompatible; however, a quick search on GitHub didn't seem to turn up any code that uses it.

@alexcrichton (Member) commented Jun 5, 2017

The libs team discussed this a few weeks ago at the last triage, and the conclusion was that we'd prefer to see concrete evidence (e.g. a proof-of-concept) showing the memory unsafety here. At that point we'll consider remediations, but for now we're fine with just adding some documentation that "care should be taken".

@Stebalien (Contributor) commented Jun 5, 2017

In general, anything involving files may not behave as expected.

For example, a concrete memory safety issue would be a memory-mapped allocator that assumes the memory-mapped file can't be accessed by multiple processes (e.g. because the file was created with O_TMPFILE on Linux, or because exclusivity is enforced through some other means, such as by contract).

Other shenanigans include:

  • A database that locks the underlying file and thus assumes that there is at most one writer can end up corrupting the database.
  • Loggers can end up writing corrupted logs.
  • Values can be dropped twice per constructor call (although the values will be dropped by different processes).
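
A small sketch of the logger point (hypothetical code, using the then-safe before_exec and a made-up app.log file): a buffered record that exists once from the program's point of view ends up written twice, because fork duplicates the in-memory buffer and both copies get flushed through the shared file description.

use std::fs::File;
use std::io::{BufWriter, Write};
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::{Arc, Mutex};

fn main() -> std::io::Result<()> {
    // A "logger" that assumes each record reaches the file exactly once.
    let log = Arc::new(Mutex::new(BufWriter::new(File::create("app.log")?)));

    writeln!(log.lock().unwrap(), "record 1")?; // still only in the in-memory buffer

    let child_log = Arc::clone(&log);
    Command::new("true")
        .before_exec(move || {
            // The forked child owns a copy of the buffer with "record 1" still
            // pending; flushing it writes that record through the file
            // description shared with the parent.
            child_log.lock().unwrap().flush()
        })
        .status()?;

    log.lock().unwrap().flush()?; // parent flushes its own copy: "record 1" appears twice
    Ok(())
}
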
@bstrie (Contributor) commented Jun 8, 2017

Not to denigrate theoretical concerns (which is to say, just because one hasn't yet demonstrated that this violates memory safety doesn't mean that we're going to reject this out-of-hand, and it doesn't mean that it won't eventually become a severe bug somehow), but I second Alex's desire to see a working proof-of-concept in code.

@steveklabnik (comment minimized)

@Stebalien (Contributor) commented Jun 11, 2017

I honestly don't have time to write a convincing PoC, but this breaks all wayland implementations (assuming memory corruption counts as a safety bug) because wayland clients and servers communicate via memory-mapped files. Any attempt (accidental or otherwise) to update a wayland buffer from within a before_exec callback might corrupt the shared buffer. Now, this can be mitigated by taking a global lock and then checking the PID before writing to memory-mapped buffers, but that's not something one wants to do every frame.


@steveklabnik

Those aren't quite the same:

  1. They're more useful than before_exec (a reason to make an exception).
  2. The unsafety only shows up when doing FFI.

Unlike those cases, this affects a fundamental rust concept: move/copy semantics. Without before_exec (the ability to fork without unsafe code), it's impossible to (safely) copy a value of a type that doesn't implement Copy. With before_exec, this guarantee is relaxed by allowing copies (in safe code) as long as those copies reside in different processes. This prevents processes from relying on any form of "one-constructor-call -> at-most-one-value" guarantee.
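
To make the "fork without unsafe code" point concrete, here is a hypothetical sketch (again with the then-safe before_exec): the closure never returns, so the exec of "true" never happens and the child simply continues running as a forked copy of the parent, bitwise copies of every live non-Copy value included.

use std::os::unix::process::CommandExt;
use std::process::{exit, Command};

fn main() {
    let status = Command::new("true")
        .before_exec(|| {
            // Runs in the forked child. Printing/allocating here is exactly
            // what POSIX forbids after a multithreaded fork, which is the point.
            println!("child  pid: {}", std::process::id());
            exit(0) // never return, so exec("true") never happens
        })
        .status()
        .unwrap();
    println!("parent pid: {} (child exited with {})", std::process::id(), status);
}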


As before_exec can be used to emulate fork, these threads also apply:

@bstrie (Contributor) commented Jun 20, 2017

@alexcrichton I like to see all soundness issues carry an associated priority tag; does your comment above imply that this is considered P-low? Alternatively, if you don't consider this a soundness issue at all, would you like to remove the I-unsound label?

@alexcrichton (Member) commented Jun 20, 2017

Ah yes, the libs team decided this was basically P-medium, so I will tag it as such.

@briansmith commented Jul 7, 2017

Even if there were no way to trigger memory unsafety today, at any time a change to libpthreads or libc or the operating system in the future could cause memory unsafety where there was previously none. Accordingly, I don't think blocking the change on a PoC makes sense, and further I think requiring somebody to make a PoC before the change is made would be a waste of those people's time. Any time we trigger undefined behavior, memory unsafety should just be assumed.

@Kixunil commented Jul 7, 2017

BTW, has anyone considered that on macOS, if an application is using libdispatch, it must not call any libdispatch code after fork? I don't have a PoC now, but I think it wouldn't be difficult to create one.

@bstrie (Contributor) commented Oct 19, 2017

I concur with the arguments here that this is a different beast than remove_var and set_var. Thus far we have chosen to always interpret specified undefined behavior as memory unsafety, demonstrable or otherwise. It would be a unique exception to our policy of "all memory unsafety may be traced back to an unsafe block" to add "(...unless using std::os::unix::process::CommandExt::before_exec)". The longer we wait, the more painful this will be. I propose that we begin emitting a warning ASAP with a plan to mark this unsafe at a specified future date.

@vi (Contributor) commented Nov 5, 2017

With some OS help one can always cause memory unsafety:

#![feature(getpid)]

fn hacky_ptr_write<T>(ptr: &T, value: u32) {
    std::process::Command::new("gdb").arg("-batch").arg("-quiet")
        .arg("-pid").arg(format!("{}", std::process::id())).arg("-ex")
        .arg(format!("set {{unsigned long}}{:p}={}",ptr,value))
        .output() .unwrap();
}
fn main() {
    let q = &mut Box::new(55);
    hacky_ptr_write(&q, 0);
    *q = Box::new(44); // Segmentation fault
}

But this does not mean Command should be unsafe.

@Stebalien (Contributor) commented Nov 5, 2017

@vi

The problem here is that the type-based assumption that a non-Copy type can't be copied is broken (even though the copies end up in different processes). Any code relying on this assumption is potentially incorrect and any unsafe code relying on it is potentially memory unsafe. However, given the "different process" constraint, this bug necessarily involves the operating system.

Yes, you can usually shoot yourself in the foot using the OS. However, you generally have to explicitly request this behavior. In this case, you can take two apparently safe and independent APIs (e.g., wayland and before_exec) and combine them to corrupt memory.

It's actually harder than you might think as long as you aren't root. Most modern Linux distros, at least, don't allow non-root processes to debug (or access/modify the memory of) non-child processes.

@vi (Contributor) commented Nov 5, 2017

I expect there to be some unsafe{} inside that Wayland code. Can one reproduce the unsafety without using non-std unsafe{} code?

Does the question boil down to "Should every unsafe{} block also expect forks, or should the fork itself just be unsafe"?

non-Copy type can't be copied

During fork() I expect it to stay at the same virtual address, so it is as if it were not copied. !Copy from the Rust perspective does not mean non-copyable from the OS kernel's perspective.

@Stebalien (Contributor) commented Nov 5, 2017

I expect there to be some unsafe{} inside that Wayland code.

Yes. It memory maps a shared buffer.

Can one reproduce the unsafety without using non-std unsafe{} code?

Not unless you were to move the shared mmap functionality from wayland into std. It's not really about where the code lives but about which assumptions are valid and which are not (although any assumptions made by std are assumed to be valid).

Does the question boil down to "Should every unsafe{} block also expect forks, or should the fork itself just be unsafe"?

Pretty much, yes. If we do say "this API is fine", we should also provide a fork function (the same way we made mem::forget safe).

During fork() I expect it to stay at the same virtual address, so it is as if it were not copied. !Copy from the Rust perspective does not mean non-copyable from the OS kernel's perspective.

From an outside perspective, the value is copied and:

  1. Will be dropped twice.
  2. Will share any os-level objects (e.g. file descriptors).
@vi (Contributor) commented Nov 5, 2017

From an outside perspective, the value is copied and:

Will be dropped twice.
Will share any os-level objects (e.g. file descriptors).

Just what is typically needed. Consider a C program:

#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int fd = open("myfile.txt", O_RDONLY);
    void* mem = malloc(300);
    if (fork()) {
        close(fd); // first "drop" of fd
        free(mem); // first "drop" of mem
    } else {
        close(fd); // second "drop" of fd
        free(mem); // second "drop" of mem
    }
}

There are two frees, but it's not the usual "double free" error. Likewise, file descriptors should be closed twice after a fork (once in each process).

The unsafety of fork may be like the unsafety of FromRawFd: not directly messing with memory, just confusing other parts of the system with surprises. It does not cause unsafety from within Rust, but causes the operating system to behave unsafely for the Rust program. And it can be worked around with std::fs::File::open("/proc/self/fd/0") or something like that.
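
(For illustration, a hypothetical sketch of that /proc/self/fd workaround, assuming a Linux system with procfs mounted: safe code mints a second handle to a descriptor whose ownership another File was supposed to track.)

use std::fs::File;
use std::io::Read;
use std::os::unix::io::AsRawFd;

fn main() -> std::io::Result<()> {
    let owned = File::open("/etc/hostname")?;
    // Re-open the same underlying file through procfs: no from_raw_fd, no
    // unsafe, yet we now hold a second handle aliasing the owned descriptor.
    let mut alias = File::open(format!("/proc/self/fd/{}", owned.as_raw_fd()))?;
    let mut buf = String::new();
    alias.read_to_string(&mut buf)?;
    println!("read {} bytes through the alias", buf.len());
    Ok(())
}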

The relationship between processes after fork() is IPC. Shared memory buffers are also a sort of IPC. Triggering a program crash through IPC is unsafety in that program...

What exactly happens in Wayland if an evil peer (or even an evil Wayland server) deliberately corrupts our buffer? Shouldn't a Rust program distrust memory-mapped buffers that are writable from outside anyway? I expect storing pointers inside a memory-mapped file is unsafe anyway. And if no pointers are stored there, then corruption would produce wrong, but safe, behaviour.

@vi (Contributor) commented Nov 5, 2017

Not unless you were to move the shared mmap functionality from wayland into std.

What function? map? But it's unsafe anyway. Also it gives you [u8], and those bytes can change under you unpredictably, because it's IPC and somebody may be interfering. Storing safety-critical things in that [u8] is your decision, and that's where the wrong assumption ("once I have the mapping, it is safe to work with") kicks in.

@Diggsey (Contributor) commented Dec 16, 2017

There's also this comment from the implementation of Command:

    // Currently we try hard to ensure that the call to `.exec()` doesn't
    // actually allocate any memory. While many platforms try to ensure that
    // memory allocation works after a fork in a multithreaded process, it's
    // been observed to be buggy and somewhat unreliable, so we do our best to
    // just not do it at all!

If this is true it seems pretty clear that this function should be considered unsafe. If it's not true then there's a whole lot of unnecessary and complex unsafe code in the standard library.

@rkruppe (Member) commented Nov 20, 2018

There is definitely a policy against marking functions that are known to be sound but otherwise dangerous as unsafe. It's also true that functions exposed as safe should be proven (or, in the absence of a formalism for all of Rust in which such a proof can be carried out, convincingly argued) to be sound. Neither quite matches what @Diggsey is talking about: no proof that it's sound, and no non-UB footguns motivating unsafe. I don't think there is an existing policy about that.

@RalfJung (Member) commented Nov 20, 2018

you can use [patch] to depend on a git repository for a try build.

I've never used [patch] before, will see what I can do.


there is some notion against marking functions as unsafe just because it feels unsafe, or because it is bad for security, or to deter users from using it.

True. However:

there are features which are independently safe, but cause unsafety when used together

Namely, my pipe example (helpfully folded away by GitHub) is safe in a universe without a safe fork, and fork is safe in a universe that does not allow my pipe example. Making fork (and hence before_exec) unsafe would not just be a "lint" because it is "bad for security", it would actually come with a proper proof obligation for the caller: You must not use this to "use up" or even touch the same exclusive resource in both parent and child process. (Basically, shared references are okay, mutable references are not. My pipe example acquires a mutable reference from a shared reference through Mutex::lock, that is the unsound part.)
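
(Ralf's actual pipe example is folded away above; purely as an illustration of the shape he describes, where Mutex::lock turns shared access into &mut access to an exclusive out-of-process resource that fork silently duplicates, a hypothetical sketch might look like this.)

use std::io::Write;
use std::os::unix::net::UnixStream;
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::{Arc, Mutex};

// A framed writer whose protocol assumes exactly one writer: a length byte
// followed by that many payload bytes.
struct Framed(UnixStream);

impl Framed {
    fn send(&mut self, msg: &[u8]) -> std::io::Result<()> {
        self.0.write_all(&[msg.len() as u8])?;
        self.0.write_all(msg)
    }
}

fn main() -> std::io::Result<()> {
    let (writer, _reader) = UnixStream::pair()?;
    let framed = Arc::new(Mutex::new(Framed(writer)));

    let child_framed = Arc::clone(&framed);
    Command::new("true")
        .before_exec(move || {
            // Mutex::lock hands out exclusive (&mut) access, but after fork the
            // parent and the child each hold "the" exclusive writer to the same
            // socket, so their frames can interleave and violate the framing a
            // reader was entitled to trust.
            child_framed.lock().unwrap().send(b"from the child")
        })
        .status()?;

    framed.lock().unwrap().send(b"from the parent")?;
    Ok(())
}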

I do not think there can be a general policy about what to do in such cases, we must decide on a case-by-case basis.


Making before_exec an unsafe fn doesn't directly impact nix as a matter of semver due to fork being safe in the crate; it does however indicate that nix might be doing something fishy.

If we decide before_exec to be unsafe, that can only go together with deciding that we make some amount of guarantees for using ownership to track out-of-process resources -- promising that things like my pipe example are sound. Hence the justification for fork to be safe (which contains an unsafe block, so it must manually justify its safety) would no longer hold.

So, yes, this would immediately imply that a safe fork is a soundness hole in a library.

@Centril (Member) commented Nov 20, 2018

@RalfJung

If we decide before_exec to be unsafe, that can only go together with deciding that we make some amount of guarantees for using ownership to track out-of-process resources -- promising that things like my pipe example are sound. Hence the justification for fork to be safe (which contains an unsafe block, so it must manually justify its safety) would no longer hold.

So, yes, this would immediately imply that a safe fork is a soundness hole in a library.

Oh sure, the crate would have to change eventually; but it does not imply that the code would break immediately in the sense that cargo check no longer works, which is what I was trying to communicate (poorly!). :)

@vi (Contributor) commented Nov 20, 2018

pipe example is safe in a universe without a safe fork

The example looks like it trusts sockets to return some sane data. I'm not sure if OS sockets and other objects (apart from the memory manager) are in scope of Rust's safety rules.

from_raw_fd being unsafe may set a precedent that file descriptors are in scope of the safety rules, though. There are workarounds (like /proc/self/fd/... or /proc/self/mem), so maybe there is some consensus that tricky file paths are not in scope of the safety rules.

Another tricky point is shared memory. Is all shared memory inherently unpredictable (i.e. one can't store any references there or rely on any invariants about its content), or are safe facades OK, and tools that break those safe facades unsafe?


Would a hypothetical new OS function that adjusts the monotonic clock, allowing it to jump back in time (which is currently documented not to happen), be safe or unsafe according to Rust?

There should be some explicit policy about which operating system features are in scope of the safety rules and which are not. For each feature in scope, there should be a description of which invariants unsafe-using code must preserve (e.g. "don't close or duplicate file descriptors owned by Rust code").

bors added a commit that referenced this issue Nov 21, 2018

Auto merge of #56129 - RalfJung:before-exec, r=<try>
[DO NOT MERGE] make before_exec an unsafe fn

This is an experiment for #39575.
@RalfJung (Member) commented Nov 24, 2018

So crater found 32 root regressions. Looks like if we decide we want to take a stance on tracking ownership of external resources, we'd have to go the gradual-deprecation route.

@alexcrichton (Member) commented Nov 25, 2018

Thanks for gathering the data @RalfJung! This is a nominated issue for the next T-libs meeting where we can try to discuss and discover a new name here. Do others have suggestions for what to call this method alternatively? Some ideas I might have are:

  • before_exec2
  • pre_exec
  • after_fork
@Centril (Member) commented Nov 25, 2018

Nominating for discussion on next T-lang meeting which is on the 29th.
If @RalfJung and @alexcrichton can participate that'd be good.

Ralf couldn't participate on the 29th.

@Centril (Member) commented Nov 25, 2018

My preference is to deprecate the method temporarily with:

#[deprecated =
    "This method will become unsafe because $reason.\
     Check whether what you are doing is safe and then wrap the call in `unsafe { ... }`."]

And then we can let, say, 3-6 months pass while we file fixing PRs; after that we make before_exec an unsafe fn.

I don't mind having an extra method; but I would like before_exec to eventually become unsafe fn.
As for naming, all of the mentioned candidates seem fine; another potential name could be before_exec_unchecked.

@alexcrichton (Member) commented Nov 28, 2018

This was discussed briefly with the libs team during today's triage, and there was no objection to moving forward with a deprecation. The leading contender for a name was before_exec2 which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).

Moving this issue forward largely just needs someone to champion the PR and implementation to push it through!

@mitsuhiko (Contributor) commented Nov 28, 2018

Not my call to make but I really do not like the idea of just incrementing the count on methods. This made a lot of code in Python just very ugly to begin with.

If you want to find some better names, here are similar APIs in other languages: the function with the same behavior in Python's subprocess module is called preexec_fn; in Boost it's called on_exec_setup. In glib the closest equivalent is GSpawnChildSetupFunc (setup_func).

@Diggsey (Contributor) commented Nov 28, 2018

If editions could be used to solve this in the future, that would be a huge benefit. (I realise this is difficult given that libstd has to only be compiled once, but perhaps there is a way to attach some metadata to the function signature so that the compiler can do the mapping).

eg. an attribute like:

#[rename("before_exec", edition=2018)]
fn before_exec2(...)
@Centril (Member) commented Nov 28, 2018

@Diggsey ostensibly you could do that with "edition visibility", e.g. pub(>= 2018) and pub(<2018) as an unstable mechanism. I'd like to use some mechanism like that to say e.g. pub(< 2021) fn uninitialized<T>() -> T.

@RalfJung (Member) commented Nov 29, 2018

The leading contender for a name was before_exec2 which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).

This makes sense for someone coming from today's Rust, but in a future where before_exec is not used at all any more, many people will be confused by what the 2 means. Basically, before_exec2 means we will always have to explain the name as "successor of a now-deprecated function".

@oli-obk (Contributor) commented Nov 29, 2018

We can have "soft unsafe" operations which only produce a (deny by default) lint that states that one should be wrapping it in unsafe, but also mention that the only reason it's a lint is because of backwards compatibility.

Implementing such a scheme in the unsafe checker is quite simple. I am volunteering to write a PR if such a change is desired (or if we just want to see how such a change would look).

@vi (Contributor) commented Nov 29, 2018

"soft unsafe"

Like what happens when taking a reference to a field of a repr(packed) struct?

@alexcrichton (Member) commented Nov 29, 2018

I think there's definitely enough valid pushback to not call it before_exec2! I like preexec or maybe on_exec_setup myself perhaps? (I don't feel too strongly one way or another).

@oli-obk I think I'd personally prefer to see a different function, because there's also the matter of creating a function pointer to this method: today that is safe and yields a safe function pointer, whereas afterwards it would ideally need to stay safe and yield an unsafe function pointer, but would in practice have to unsafely create a safe function pointer.
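
(For readers puzzling over the function-pointer point: in general a safe fn coerces to both fn() and unsafe fn() pointers, while an unsafe fn only coerces to unsafe fn(), so flipping an existing function's safety changes which pointer types callers can form. A generic sketch, not std's actual code:)

fn safe_hook() {}
unsafe fn unsafe_hook() {}

fn main() {
    let _a: fn() = safe_hook;          // ok: safe fn as safe fn pointer
    let _b: unsafe fn() = safe_hook;   // ok: safe fn coerces to an unsafe fn pointer
    let _c: unsafe fn() = unsafe_hook; // ok
    // let _d: fn() = unsafe_hook;     // error: an unsafe fn is not a safe fn pointer
}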

@Centril removed the I-nominated label Nov 29, 2018

@joshtriplett (Member) commented Nov 29, 2018

The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

@joshtriplett (Member) commented Nov 29, 2018

Speaking only for myself: I like pre_exec.

@nikomatsakis (Contributor) commented Nov 29, 2018

I do feel like there is a meta question lurking here of unsafe composition and how we should manage it. For now, it seems good to err on the side of caution where possible and avoid thorny questions, but, as has been amply demonstrated here, it's sort of hard to tell what the limits ought to be on what unsafe code can and cannot do when it comes to, e.g., external resources. I wonder if at some point we're going to want to allow unsafely implemented libraries to declare more precisely the kinds of things they rely on other libraries not to do.

bors added a commit that referenced this issue Feb 15, 2019

Auto merge of #58059 - RalfJung:before_exec, r=alexcrichton
deprecate before_exec in favor of unsafe pre_exec

Fixes #39575

As per the [lang team decision](#39575 (comment)):

> The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

Cc @alexcrichton @rust-lang/libs how would you like to proceed?

Centril added a commit to Centril/rust that referenced this issue Feb 20, 2019

Rollup merge of rust-lang#58059 - RalfJung:before_exec, r=alexcrichton
deprecate before_exec in favor of unsafe pre_exec

Fixes rust-lang#39575

As per the [lang team decision](rust-lang#39575 (comment)):

> The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

Cc @alexcrichton @rust-lang/libs how would you like to proceed?

@bors closed this in #58059 Feb 23, 2019
