std::os::unix::process::CommandExt::before_exec should be unsafe #39575

fweimer · 2017-02-06T07:34:56Z

The before_exec method should be marked unsafe because after a fork from a multi-threaded process, POSIX allows only allows async-signal-safe functions to be called. Otherwise, the result is undefined behavior.

The only way to enforce this in Rust is to mark the function as unsafe.

So to be clear, this should not compile:

use std::process::Command;
use std::os::unix::process::CommandExt;

fn main() {
    let status = Command::new("sh")
        .arg("-c")
             .arg("echo world")
        .before_exec(|| { println!("hello"); Ok(()) })
        .status().unwrap();
    println!("status: {}", status);
}

alexcrichton · 2017-02-06T19:14:27Z

Could you clarify the unsafety that can happen here? We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.

fweimer · 2017-02-06T19:38:25Z

It's probably best to treat anything that is undefined according to POSIX as a memory safety violation. Once POSIX says that you cannot assume anything about the process's behavior, you need to assume that memory safety is violated as well.

I'll try to give two examples why this facility is unsafe in practice. I have to argue based on the glibc implementation, but we can hopefully assume that it is a valid implementation, matching the POSIX requirements in play here.

First, assume that you have a PRNG for use for cryptography written in Rust. This PRNG uses locks or thread-local variables to protect its internal state. It does not have fork protection because there is no way to implement that in Rust. This PRNG will return the same sequence of bytes in the before_exec callback and the parent thread, violating the expectations associated with a PRNG. Maybe this does not count as a memory safety violation, but I think it is still a problem for PRNGs.

For the second problem, we look at the implementation of recursive mutexes in libpthread. They embed a PID/TID, a small numeric value which gets reused over time. Let's look at this sequence of events:

A thread in the parent process acquires a recursive mutex.
Another thread invokes an external command with a before_exec callback.
The before_exec callback invokes another external command with another before_exec callback.
The initial parent process exits (so the thread in step 1 ceases to exist and its TID can be reused).
The second before_exec callback spawns a new thread. The TID of that thread happens to match the TID of the thread in step 1.
This new thread attempts to acquire the recursive mutex which has been acquired in step 1.

The attempt in step 6 will succeed because libpthread incorrectly assumes that the same thread tried to acquire the mutex. But the process was copied while the thread in step 1 was holding the mutex, so the data protected by it may not be consistent, and the thread in steps 5 and 6 may access intermediate results which are not supposed to be visible. This is a thread safety violation, and hence a memory safety violation.

I think the second example at least is quite compelling.

alexcrichton · 2017-02-06T20:37:53Z

That'd do it!

Mark-Simulacrum · 2017-05-20T15:57:34Z

This is not marked as unsafe today and to do so would be backwards incompatible; however, a quick search on GitHub didn't seem to turn up any code that uses it.

alexcrichton · 2017-06-05T21:50:36Z

The libs team discussed this a few weeks ago at the last triage and the conclusion was that we'd prefer to see concrete evidence (e.g. a proof-of-concept) showing the memory unsafety here. At that point we'll consider remediations but for now we're fine just adding some documentation that "care should be taken"

Stebalien · 2017-06-05T22:40:02Z

In general, anything involving files may not behave as expected.

For example, a concrete memory safety issue would be a memory mapped allocator and assumes that the memory mapped file can't be accessed by multiple processes (e.g., by creating a file with the O_TMPFILE on Linux or enforced through some other means (e.g., by contract)).

Other shenanigans include:

A database that locks the underlying file and thus assumes that there is at most one writer can end up corrupting the database.
Loggers can end up writing corrupted logs.
Values can be dropped twice per constructor call (although the values will be dropped by different processes).

bstrie · 2017-06-08T20:36:44Z

Not to denigrate theoretical concerns (which is to say, just because one hasn't yet demonstrated that this violates memory safety doesn't mean that we're going to reject this out-of-hand, and it doesn't mean that it won't eventually become a severe bug somehow), but I second Alex's desire to see a working proof-of-concept in code.

steveklabnik · 2017-06-08T20:47:29Z

Some prior art

Feels similar to me.

Stebalien · 2017-06-11T01:05:22Z

I honestly don't have time to write a convincing POC but this breaks all wayland implementations (assuming memory corruption counts as a safety bug) because wayland clients and servers communicate by memory mapped files. Any attempt (accidental or otherwise) to update a wayland buffer from within a before_exec callback might corrupt the shared buffer. Now, this can be mitigated by taking a global lock and then checking the PID before writing to memmapped buffers but that's not something one wants to do every frame.

@steveklabnik

Those aren't quite the same:

They're more useful than before_exec (a reason to make an exception).
The unsafety only shows up when doing FFI.

Unlike those cases, this affects a fundamental rust concept: move/copy semantics. Without before_exec (the ability to fork without unsafe code), it's impossible to (safely) copy a value of a type that doesn't implement Copy. With before_exec, this guarantee is relaxed by allowing copies (in safe code) as long as those copies reside in different processes. This prevents processes from relying on any form of "one-constructor-call -> at-most-one-value" guarantee.

As before_exec can be used to emulate fork, these threads also apply:

bstrie · 2017-06-20T19:54:27Z

@alexcrichton I like to see all soundness issues with an associated priority tag, does your comment above imply that this is considered P-low? Alternatively, if you don't consider this a soundness issue at all, would you like to remove the I-unsound label?

alexcrichton · 2017-06-20T22:12:48Z

Ah yes the libs team decided this was basically P-medium, so I will tag as such.

briansmith · 2017-07-07T03:06:06Z

Even if there were no way to trigger memory unsafety today, at any time a change to libpthreads or libc or the operating system in the future could cause memory unsafety where there was previously none. Accordingly, I don't think blocking the change on a PoC makes sense, and further I think requiring somebody to make a PoC before the change is made would be a waste of those people's time. Any time we trigger undefined behavior, memory unsafety should just be assumed.

Kixunil · 2017-07-07T13:56:33Z

BTW have anyone considered that on macOS if an application is using libdispatch, it must not call any libdispatch code after fork? I don't have PoC now but I think it'd not be difficult to create.

bstrie · 2017-10-19T12:57:57Z

I concur with the arguments here that this is a different beast than remove_var and set_var. Thus far we have chosen to always interpret specified undefined behavior as memory unsafety, demonstrable or otherwise. It would be a unique exception to our policy of "all memory unsafety may be traced back to an unsafe block" to add "(...unless using std::os::unix::process::CommandExt::before_exec)". The longer we wait, the more painful this will be. I propose that we begin emitting a warning ASAP with a plan to mark this unsafe at a specified future date.

vi · 2017-11-05T00:47:40Z

With some OS help one can always cause memory unsafety:

#![feature(getpid)]

fn hacky_ptr_write<T>(ptr: &T, value: u32) {
    std::process::Command::new("gdb").arg("-batch").arg("-quiet")
        .arg("-pid").arg(format!("{}", std::process::id())).arg("-ex")
        .arg(format!("set {{unsigned long}}{:p}={}",ptr,value))
        .output() .unwrap();
}
fn main() {
    let q = &mut Box::new(55);
    hacky_ptr_write(&q, 0);
    *q = Box::new(44); // Segmentation fault
}

But this does not mean Command should be unsafe.

Stebalien · 2017-11-05T03:02:40Z

@vi

The problem here is that the type-based assumption that a non-Copy type can't be copied is broken (even though the copies end up in different processes). Any code relying on this assumption is potentially incorrect and any unsafe code relying on it is potentially memory unsafe. However, given the "different process" constraint, this bug necessarily involves the operating system.

Yes, you can usually shoot yourself in the foot using the OS^†. However, you generally have to explicitly request this behavior. In this case, you can take two apparently safe and independent APIs (e.g., wayland and before_exec) and combine them to corrupt memory.

^†It's actually harder than you might think as long as you aren't root. Most modern Linux distros, at least, don't allow non-root processes to debug (or access/modify the memory of) non-child processes.

vi · 2017-11-05T03:58:40Z

I expect there to be some unsafe{} inside that Wayland code. Can one reproduce the unsafety without using non-std unsafe{} code?

Does the question boil down to "Should every unsafe{} block also expect forks or should the fork itself just be be unsafe"?

non-Copy type can't be copied

During fork() I expect it to stay on the same virtual address, so that it is as if not copied. !Copy from Rust perspective does not mean non-copyable from OS kernel perspective.

Stebalien · 2017-11-05T07:24:42Z

I expect there to be some unsafe{} inside that Wayland code.

Yes. It memory maps a shared buffer.

Can one reproduce the unsafety without using non-std unsafe{} code?

Not unless you were to move the shared mmap functionality from wayland into std. It's not really about where the code lives but what are valid assumptions and what are not (although any assumptions made by std are assumed to be valid).

Does the question boil down to "Should every unsafe{} block also expect forks or should the fork itself just be be unsafe"?

Pretty much, yes. If we do say "this API is fine", we should also provide a fork function (the same way we made mem::forget safe).

During fork() I expect it to stay on the same virtual address, so that it is as if not copied. !Copy from Rust perspective does not mean non-copyable from OS kernel perspective.

From an outside perspective, the value is copied and:

Will be dropped twice.
Will share any os-level objects (e.g. file descriptors).

vi · 2017-11-05T11:58:44Z

From an outside perspective, the value is copied and:

Will be dropped twice.
Will share any os-level objects (e.g. file descriptors).

Just what is typically needed. Consider a C program:

#include <stdlib.h>
#include <unistd.h>

int main() {
    int fd = open("myfile.txt", O_RDONLY);
    void* mem = malloc(300);
    if (fork()) {
        close(fd); // first "drop" of fd
        free(mem); // first "drop" of mem
    } else {
        close(fd); // second "drop" of fd
        free(mem); // second "drop" of mem
    }
}

There are two frees, but it's not that "double free" error. Likewise file descriptors should be closed twice after a fork.

Unsafety of fork may be like unsafety of FromRawFd - not directly messing with memory, just confusing other parts of system with surprises, not causing unsafety from within Rust, but causing operating system to behave unsafely for Rust program. And it's workaroundable by std::fs::File::open("/proc/self/fd/0") or something like that.

Relationship between processes after fork() is IPC. Shared memory buffers is also a sort of IPC. Triggering program crash by using IPC is unsafety of the program...

What exactly happens in Wayland if evil peer (or even evil Wayland server) deliberately corrupts our buffer? Shound't Rust program not trust memory-mapped buffers that are writable from outside anyway? I expect storing pointers inside a memory-mapped file is unsafe anyways. And if no pointers getting stored there then it would be wrong, but safe behaviour.

vi · 2017-11-05T12:05:57Z

Not unless you were to move the shared mmap functionality from wayland into std.

What function? map? But it's unsafe anyways. Also it gives you [u8], and those bytes can jump around unpredictably because of it's IPC and somebody may be interfering. Storing safety-important things in that [u8] is your decision and that's where the wrong assumption ("when I get mapping, it should be safe to work with") kicks in.

Diggsey · 2017-12-16T16:24:08Z

There's also this comment from the implementation of Command:

    // Currently we try hard to ensure that the call to `.exec()` doesn't
    // actually allocate any memory. While many platforms try to ensure that
    // memory allocation works after a fork in a multithreaded process, it's
    // been observed to be buggy and somewhat unreliable, so we do our best to
    // just not do it at all!

If this is true it seems pretty clear that this function should be considered unsafe. If it's not true then there's a whole lot of unnecessary and complex unsafe code in the standard library.

RalfJung · 2018-11-24T15:23:59Z

So crater found 32 root regressions. Looks like if we decide we want to take a stanza for tracking ownership of external resources, we'd have to go the gradual-deprecation route.

alexcrichton · 2018-11-25T09:53:30Z

Thanks for gathering the data @RalfJung! This is a nominated issue for the next T-libs meeting where we can try to discuss and discover a new name here. Do others have suggestions for what to call this method alternatively? Some ideas I might have are:

before_exec2
pre_exec
after_fork

Centril · 2018-11-25T10:32:23Z

Nominating for discussion on next T-lang meeting which is on the 29th.
If @RalfJung and @alexcrichton can participate that'd be good.

Ralf couldn't participate on the 29th.

Centril · 2018-11-25T10:49:24Z

My preference is to deprecate the method temporarily with:

#[deprecated =
    "This method will become unsafe because $reason.\
     Check whether what you are doing is safe and then wrap the call in `unsafe { ... }`."]

And then we can let say 3-6 months pass while we file fixing PRs; after that we make before_exec into unsafe fn.

I don't mind having an extra method; but I would like before_exec to eventually become unsafe fn.
As for naming, all of the mentioned candidates seem fine; another potential name could be before_exec_unchecked.

alexcrichton · 2018-11-28T22:51:04Z

This was discussed briefly with the libs team during today's triage, and there was no objection to moving forward with a deprecation. The leading contender for a name was before_exec2 which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).

Moving this issue forward largely just needs someone to champion the PR and implementation to push it through!

mitsuhiko · 2018-11-28T23:04:47Z

Not my call to make but I really do not like the idea of just incrementing the count on methods. This made a lot of code in Python just very ugly to begin with.

If you want to find some better names here are similar APIs in other languages: function with the same behavior in python is called prexec_fn for subprocesses, in boost it's called on_exec_setup. In glib the closest equivalent is GSpawnChildSetupFunc (setup_func).

Diggsey · 2018-11-28T23:47:22Z

If editions could be used to solve this in the future, that would be a huge benefit. (I realise this is difficult given that libstd has to only be compiled once, but perhaps there is a way to attach some metadata to the function signature so that the compiler can do the mapping).

eg. an attribute like:

#[rename("before_exec", edition=2018)]
fn before_exec2(...)

Centril · 2018-11-28T23:55:12Z

@Diggsey ostensibly you could do that with "edition visibility", e.g. pub(>= 2018) and pub(<2018) as an unstable mechanism. I'd like to use some mechanism like that to say e.g. pub(< 2021) fn uninitialized<T>() -> T.

RalfJung · 2018-11-29T08:17:33Z

The leading contender for a name was before_exec2 which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).

This makes sense for someone coming from today's Rust, but in a future where before_exec is not used at all any more, many people will be confused by what the 2 means. Basically, before_exec2 means we will always have to explain the name as "successor of a now-deprecated function".

oli-obk · 2018-11-29T09:38:00Z

We can have "soft unsafe" operations which only produce a (deny by default) lint that states that one should be wrapping it in unsafe, but also mention that the only reason it's a lint is because of backwards compatibility.

Implementing such a scheme in the unsafe checker is quite simple. I am volunteering to write a PR if such a change is desired (or if we just want to see how such a change would look).

vi · 2018-11-29T10:59:34Z

"soft unsafe"

Like what happens when taking a reference of a field of a repr(packed) struct?

alexcrichton · 2018-11-29T15:37:27Z

I think there's definitely enough valid pushback to not call it before_exec2! I like preexec or maybe on_exec_setup myself perhaps? (I don't feel too strongly one way or another).

@oli-obk I think I'd personally prefer to see a different function, because there's also the matter of creating a function pointer to this method which today is safe and creates a safe function pointer, but aftewards would need to ideally be safe and create an unsafe function pointer but would in practice have to unsafely create a safe function pointer

joshtriplett · 2018-11-29T21:07:10Z

The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

joshtriplett · 2018-11-29T21:13:21Z

Speaking only for myself: I like pre_exec.

nikomatsakis · 2018-11-29T21:18:05Z

I do feel like there is a meta question lurking here of unsafe composition and how we should manage it. For now, it seems good to err on the side of caution where possible, though, and avoid thorny questions -- but -- as has been amptly demonstrated here -- it's sort of hard to tell what the limits ought to be on what unsafe code can and cannot do when it comes to e.g. external resources. I wonder if at some point we're going to want to try and allow unsafely implemented libraries to declare more precisely the kinds of things they rely on other libraries not to do.

@alexcrichton

deprecate before_exec in favor of unsafe pre_exec Fixes rust-lang#39575 As per the [lang team decision](rust-lang#39575 (comment)): > The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team. Cc @alexcrichton @rust-lang/libs how would you like to proceed?

@alexcrichton

deprecate before_exec in favor of unsafe pre_exec Fixes rust-lang#39575 As per the [lang team decision](rust-lang#39575 (comment)): > The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team. Cc @alexcrichton @rust-lang/libs how would you like to proceed?

@alexcrichton

deprecate before_exec in favor of unsafe pre_exec Fixes rust-lang#39575 As per the [lang team decision](rust-lang#39575 (comment)): > The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team. Cc @alexcrichton @rust-lang/libs how would you like to proceed?

mahkoh · 2020-05-24T20:16:31Z

Here is a simple example showing memory unsafety in the presence of fork:

#![feature(thread_id_value)]

use std::sync::atomic::{AtomicU64, AtomicPtr};
use std::sync::atomic::Ordering::SeqCst;
use std::{thread, ptr};

pub fn f() {
    static I: AtomicU64 = AtomicU64::new(0);
    static D: AtomicPtr<i32> = AtomicPtr::new(ptr::null_mut());

    let data = Box::leak(Box::new(0));
    let tid = thread::current().id().as_u64().get();
    match I.load(SeqCst) {
        0 => {
            I.store(tid, SeqCst);
            /*
             * Assumption for safety: No call to `f` can have the same `tid` while we're
             * in the critical section that spans this comment.
             */
            D.store(data, SeqCst);
        },
        n if n == tid => {
			/* D has been set to a valid pointer because of the assumption above */
            let _v = unsafe { *D.load(SeqCst) };
        },
        _ => { },
    }
}

Mark-Simulacrum added I-nominated T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 20, 2017

brson added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label May 23, 2017

alexcrichton removed the I-nominated label Jun 5, 2017

alexcrichton added the P-medium Medium priority label Jun 20, 2017

jonas-schievink mentioned this issue Jul 19, 2017

Rethinking the interface for fork nix-rust/nix#586

Open

Stebalien mentioned this issue Jul 24, 2017

std::sync::Once can block forever in forked process #43448

Closed

Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 27, 2017

bstrie mentioned this issue Dec 18, 2017

std::os::unix::process::CommandExt::exec should be unsafe #46775

Closed

Centril removed the I-nominated label Nov 29, 2018

RalfJung mentioned this issue Feb 1, 2019

deprecate before_exec in favor of unsafe pre_exec #58059

Merged

bors closed this as completed in #58059 Feb 23, 2019

RalfJung mentioned this issue Feb 28, 2019

#[ffi_returns_twice] rust-lang/rfcs#2633

Closed

jorickert mentioned this issue Apr 23, 2019

mark fork as unsafe nix-rust/nix#1047

Closed

RalfJung mentioned this issue Aug 28, 2019

Review safety of all functions bytecodealliance/wasi-rs#8

Closed

leo60228 mentioned this issue Oct 29, 2021

Consider deprecating and/or modifying behavior of std::env::set_var #90308

Closed

std::os::unix::process::CommandExt::before_exec should be unsafe #39575

std::os::unix::process::CommandExt::before_exec should be unsafe #39575

Comments

fweimer commented Feb 6, 2017

alexcrichton commented Feb 6, 2017

fweimer commented Feb 6, 2017

alexcrichton commented Feb 6, 2017

Mark-Simulacrum commented May 20, 2017

alexcrichton commented Jun 5, 2017

Stebalien commented Jun 5, 2017

bstrie commented Jun 8, 2017

steveklabnik commented Jun 8, 2017

Stebalien commented Jun 11, 2017

bstrie commented Jun 20, 2017

alexcrichton commented Jun 20, 2017

briansmith commented Jul 7, 2017

Kixunil commented Jul 7, 2017

bstrie commented Oct 19, 2017

vi commented Nov 5, 2017 • edited Loading

Stebalien commented Nov 5, 2017

vi commented Nov 5, 2017 • edited Loading

Stebalien commented Nov 5, 2017

vi commented Nov 5, 2017 • edited Loading

vi commented Nov 5, 2017 • edited Loading

Diggsey commented Dec 16, 2017

RalfJung commented Nov 24, 2018

alexcrichton commented Nov 25, 2018

Centril commented Nov 25, 2018

Centril commented Nov 25, 2018

alexcrichton commented Nov 28, 2018

mitsuhiko commented Nov 28, 2018

Diggsey commented Nov 28, 2018

Centril commented Nov 28, 2018

RalfJung commented Nov 29, 2018

oli-obk commented Nov 29, 2018

vi commented Nov 29, 2018 • edited Loading

alexcrichton commented Nov 29, 2018

joshtriplett commented Nov 29, 2018

joshtriplett commented Nov 29, 2018

nikomatsakis commented Nov 29, 2018

mahkoh commented May 24, 2020 • edited Loading

vi commented Nov 5, 2017 •

edited

Loading

vi commented Nov 5, 2017 •

edited

Loading

vi commented Nov 5, 2017 •

edited

Loading

vi commented Nov 5, 2017 •

edited

Loading

vi commented Nov 29, 2018 •

edited

Loading

mahkoh commented May 24, 2020 •

edited

Loading