std::os::unix::process::CommandExt::before_exec should be unsafe #39575
Comments
Could you clarify the unsafety that can happen here? We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.
fweimer commented Feb 6, 2017
It's probably best to treat anything that is undefined according to POSIX as a memory safety violation. Once POSIX says that you cannot assume anything about the process's behavior, you need to assume that memory safety is violated as well. I'll try to give two examples of why this facility is unsafe in practice. I have to argue based on the glibc implementation, but we can hopefully assume that it is a valid implementation, matching the POSIX requirements in play here.

First, assume that you have a PRNG for use in cryptography, written in Rust. This PRNG uses locks or thread-local variables to protect its internal state. It does not have fork protection because there is no way to implement that in Rust. This PRNG will return the same sequence of bytes in the parent and in the forked child.

For the second problem, we look at the implementation of recursive mutexes in glibc. The attempt in step 6 will succeed because … I think the second example at least is quite compelling.
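fweimer's first example can be reproduced with today's std: `pre_exec` (the later, unsafe spelling of `before_exec`) runs its closure in the forked child, which inherits a bitwise copy of any in-process PRNG state. A minimal sketch under illustrative assumptions — a toy LCG stands in for a real PRNG, and the seed/constants are arbitrary:

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::{Arc, Mutex};

// Tiny LCG standing in for a stateful PRNG (Knuth's MMIX constants;
// illustrative only, not cryptographic).
struct Lcg(u64);
impl Lcg {
    fn next_byte(&mut self) -> u8 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 33) as u8
    }
}

// Returns (byte drawn by the forked child, byte drawn by the parent).
fn fork_duplicates_prng_state() -> (i32, u8) {
    let rng = Arc::new(Mutex::new(Lcg(2017)));
    let child_rng = Arc::clone(&rng);
    let status = unsafe {
        Command::new("true")
            .pre_exec(move || {
                // Runs in the forked child: the PRNG state was duplicated by
                // fork. (Locking and exit() here are themselves not
                // async-signal-safe -- exactly the hazard this issue is
                // about; in this single-threaded demo they work in practice.)
                let b = child_rng.lock().unwrap().next_byte();
                std::process::exit(b as i32); // report the child's byte
            })
            .status()
            .unwrap()
    };
    let parent_byte = rng.lock().unwrap().next_byte();
    (status.code().unwrap(), parent_byte)
}

fn main() {
    let (child_byte, parent_byte) = fork_duplicates_prng_state();
    // Two "independent" processes drew the same "random" byte.
    assert_eq!(child_byte, parent_byte as i32);
}
```

Both draws advance identical copies of the state snapshotted at fork time, so the parent and child observe the same byte from supposedly independent streams.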
That'd do it!
This is not marked as unsafe today, and to do so would be backwards incompatible; however, a quick search on GitHub didn't seem to turn up any code that uses it.
Mark-Simulacrum added the I-nominated and T-libs labels on May 20, 2017
brson added the I-unsound 💥 label on May 23, 2017
The libs team discussed this a few weeks ago at the last triage, and the conclusion was that we'd prefer to see concrete evidence (e.g. a proof-of-concept) showing the memory unsafety here. At that point we'll consider remediations, but for now we're fine just adding some documentation that "care should be taken".
alexcrichton removed the I-nominated label on Jun 5, 2017
In general, anything involving files may not behave as expected. For example, a concrete memory safety issue would be a memory-mapped allocator that assumes the memory-mapped file can't be accessed by multiple processes (e.g., by creating a file with …). Other shenanigans include: …
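The "can't be accessed by multiple processes" assumption can be illustrated even without mmap: nothing stops another process from mutating a file this process believes it exclusively owns. A sketch under illustrative assumptions (the temp-file name and the `sh`/`dd` invocation are mine, not from the thread):

```rust
use std::fs;
use std::process::Command;

// Writes zeros to a temp file, then lets *another process* flip the first
// byte behind our back.
fn overwritten_first_byte() -> u8 {
    let path = std::env::temp_dir().join("issue39575_demo.bin");
    fs::write(&path, [0u8; 8]).unwrap();

    // A second process mutates "our" file.
    let status = Command::new("sh")
        .arg("-c")
        .arg(format!(
            "printf '\\377' | dd of='{}' bs=1 count=1 conv=notrunc 2>/dev/null",
            path.display()
        ))
        .status()
        .unwrap();
    assert!(status.success());

    let first = fs::read(&path).unwrap()[0];
    let _ = fs::remove_file(&path);
    first
}

fn main() {
    // The in-process assumption "the file I initialized holds zeros" is not
    // enforced by the OS.
    assert_eq!(overwritten_first_byte(), 0xFF);
}
```

An allocator backing its heap with such a file would see its metadata change underneath it, which is the concrete memory-safety concern raised above.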
Not to denigrate theoretical concerns (which is to say, just because one hasn't yet demonstrated that this violates memory safety doesn't mean that we're going to reject this out-of-hand, and it doesn't mean that it won't eventually become a severe bug somehow), but I second Alex's desire to see a working proof-of-concept in code.
Some prior art: … Feels similar to me.
I honestly don't have time to write a convincing POC, but this breaks all wayland implementations (assuming memory corruption counts as a safety bug) because wayland clients and servers communicate via memory-mapped files. Any attempt (accidental or otherwise) to update a wayland buffer from within a …

Those aren't quite the same: unlike those cases, this affects a fundamental Rust concept: move/copy semantics. Without … As …
@alexcrichton I'd like to see all soundness issues with an associated priority tag; does your comment above imply that this is considered P-low? Alternatively, if you don't consider this a soundness issue at all, would you like to remove the I-unsound label?
Ah yes, the libs team decided this was basically P-medium, so I will tag it as such.
alexcrichton added the P-medium label on Jun 20, 2017
briansmith commented Jul 7, 2017
Even if there were no way to trigger memory unsafety today, at any time a change to libpthread or libc or the operating system in the future could cause memory unsafety where there was previously none. Accordingly, I don't think blocking the change on a PoC makes sense; further, I think requiring somebody to make a PoC before the change is made would be a waste of their time. Any time we trigger undefined behavior, memory unsafety should just be assumed.
Kixunil commented Jul 7, 2017
BTW, has anyone considered that on macOS, if an application is using libdispatch, it must not call any libdispatch code after fork? I don't have a PoC now, but I don't think it would be difficult to create one.
Stebalien referenced this issue on Jul 24, 2017: (Open) std::sync::Once can block forever in forked process #43448
Mark-Simulacrum added the C-bug label on Jul 27, 2017
I concur with the arguments here that this is a different beast than …
With some OS help one can always cause memory unsafety:

    #![feature(getpid)]

    fn hacky_ptr_write<T>(ptr: &T, value: u32) {
        std::process::Command::new("gdb").arg("-batch").arg("-quiet")
            .arg("-pid").arg(format!("{}", std::process::id())).arg("-ex")
            .arg(format!("set {{unsigned long}}{:p}={}", ptr, value))
            .output().unwrap();
    }

    fn main() {
        let q = &mut Box::new(55);
        hacky_ptr_write(&q, 0);
        *q = Box::new(44); // Segmentation fault
    }

But this does not mean …
The problem here is that the type-based assumption that a non-Copy type can't be copied is broken (even though the copies end up in different processes). Any code relying on this assumption is potentially incorrect, and any unsafe code relying on it is potentially memory unsafe. However, given the "different process" constraint, this bug necessarily involves the operating system. Yes, you can usually shoot yourself in the foot using the OS†. However, you generally have to explicitly request this behavior. In this case, you can take two apparently safe and independent APIs (e.g., wayland and …) …

†It's actually harder than you might think as long as you aren't root. Most modern Linux distros, at least, don't allow non-root processes to debug (or access/modify the memory of) non-child processes.
I expect there to be some … Does the question boil down to "Should every …"?

During …
Yes. It memory-maps a shared buffer.

Not unless you were to move the shared mmap functionality from wayland into std. It's not really about where the code lives but about which assumptions are valid and which are not (although any assumptions made by std are assumed to be valid).

Pretty much, yes. If we do say "this API is fine", we should also provide a fork function (the same way we made …).

From an outside perspective, the value is copied and: …
Just what is typically needed. Consider a C program:

    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h> /* needed for open() */

    int main() {
        int fd = open("myfile.txt", O_RDONLY);
        void* mem = malloc(300);
        if (fork()) {
            close(fd); // first "drop" of fd
            free(mem); // first "drop" of mem
        } else {
            close(fd); // second "drop" of fd
            free(mem); // second "drop" of mem
        }
    }

There are two … Unsafety of … Relationship between processes after …

What exactly happens in Wayland if an evil peer (or even an evil Wayland server) deliberately corrupts our buffer? Shouldn't a Rust program distrust memory-mapped buffers that are writable from outside anyway? I expect storing pointers inside a memory-mapped file is …
What function?
There's also this comment from the implementation of …:

    // Currently we try hard to ensure that the call to `.exec()` doesn't
    // actually allocate any memory. While many platforms try to ensure that
    // memory allocation works after a fork in a multithreaded process, it's
    // been observed to be buggy and somewhat unreliable, so we do our best to
    // just not do it at all!

If this is true, it seems pretty clear that this function should be considered unsafe. If it's not true, then there's a whole lot of unnecessary and complex unsafe code in the standard library.
bstrie referenced this issue on Dec 18, 2017: (Closed) std::os::unix::process::CommandExt::exec should be unsafe #46775
I think this approach should be re-evaluated going forwards. Whilst "being able to directly cause memory unsafety" should of course imply that a function should be unsafe, I don't think the inability to do that is a reason that a function should necessarily be stabilised as a safe function. Firstly, there are features which are independently safe but cause unsafety when used together. With the above approach, this will end up being decided just by whichever feature gets there first, whereas it would be better to make such choices explicitly, i.e. we want X to be safe at the cost of Y being unsafe. Secondly, it may be hard (as is the case here) to properly evaluate whether something is indeed safe or not, and stabilising it as an unsafe function would be more conservative. Obviously we want to avoid doing this too, but in certain cases where there is reasonable doubt (there was a lot of evidence at the time that there might be memory unsafety here), and particularly for platform-specific and infrequently used functionality, either delaying stabilisation or stabilising it as unsafe would be preferable.
I think we should measure the impact on the ecosystem, but given the results from sourcegraph it strikes me as hyperbole to state that this would "break the entire ecosystem in one go" (emphasis mine). Twice I have read RFC 1122 now, and I cannot find any paragraph or sentence which states that breakage due to soundness fixes has to be minimal. The RFC mainly talks about mitigation strategies depending on impact, severity, exploitability, etc. If you think I'm wrong about this, please show me why. A migration strategy consisting of changing key crates beforehand, plus a compatibility warning over say 2 release cycles that tells users to wrap …

Well, historical versions yes; we can at most go and introduce new patch versions …

As I noted before, this has one case of …

This has 2 mentions of …

OK; so we take a more gradual approach with a lint / warning stating that it should be wrapped in …
What would cause a breakage in Nix? I can't find any uses of …
I'm not sure how "async-signal-safe" and Rust's "safe" are connected. |
As far as I remember, there is some notion against marking functions as unsafe just because they feel unsafe, or because they are bad for security, or to deter users from using them. Each unsafe function's documentation should ideally contain the proof of unsafety and concrete instructions for how to use it without causing UB. Examples of such "unsafe-esque" safe functions are …
This is the wrong way round: functions which call unsafe functions but are themselves safe should have a "proof" (not necessarily formal, but at least some reasoning) of why they are safe. Not just "we couldn't find any unsafety".
Making …
There is definitely a policy against marking functions that are known to be sound, but otherwise dangerous, as unsafe. …
I've never used …

True. However: …

Namely, my pipe example (helpfully folded away by GitHub) is safe in a universe without a safe … I do not think there can be a general policy about what to do in such cases; we must decide on a case-by-case basis.

If we decide … So, yes, this would immediately imply that a safe …
Oh sure, the crate would have to change eventually; but it does not imply that the code would break immediately, in the sense that …
The example looks like it trusts sockets to return some sane data. I'm not sure if OS sockets and other objects (apart from the memory manager) are in scope of Rust's safety rules.

Another tricky point is shared memory. Is all shared memory inherently unpredictable (i.e. one can't store any references there or rely on any invariants about its content), or are safe facades OK, with tools that break those facades being unsafe? Would a hypothetical new OS function that adjusts the monotonic clock, allowing it to jump back in time (which is currently documented not to happen), be safe or unsafe according to Rust? There should be some explicit policy about which operating system features are in scope of the safety rules and which are not. For each feature in scope, there should be a description of which invariants unsafe-using code shall preserve (e.g. "don't close or duplicate file descriptors owned by Rust code").
RalfJung referenced this issue on Nov 21, 2018: (Closed) [DO NOT MERGE] make before_exec an unsafe fn #56129

A commit that referenced this issue was added on Nov 21, 2018
So crater found 32 root regressions. Looks like if we decide we want to take a stance on tracking ownership of external resources, we'd have to go the gradual-deprecation route.
Thanks for gathering the data @RalfJung! This is a nominated issue for the next T-libs meeting, where we can try to discuss and discover a new name here. Do others have suggestions for what to call this method alternatively? Some ideas I might have are: …
Ralf couldn't participate on the 29th. |
My preference is to deprecate the method temporarily with:

    #[deprecated =
        "This method will become unsafe because $reason. \
         Check whether what you are doing is safe and then wrap the call in `unsafe { ... }`."]

And then we can let, say, 3-6 months pass while we file fixing PRs; after that we make … I don't mind having an extra method; but I would like …
This was discussed briefly with the libs team during today's triage, and there was no objection to moving forward with a deprecation. The leading contender for a name was …

Moving this issue forward largely just needs someone to champion the PR and implementation to push it through!
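For reference, the rename eventually shipped in std as the unsafe method `pre_exec`, with `before_exec` deprecated. A sketch of what the migration looks like for a caller (the trivial closure and the `true` command are illustrative):

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

// Before the change this compiled with no `unsafe`:
//     Command::new("true").before_exec(|| Ok(())).status()
// Afterwards the caller must acknowledge the fork/exec hazard explicitly.
fn migrated_spawn() -> bool {
    let status = unsafe {
        Command::new("true")
            .pre_exec(|| {
                // Only async-signal-safe operations are sound here.
                Ok(())
            })
            .status()
            .unwrap()
    };
    status.success()
}

fn main() {
    assert!(migrated_spawn());
}
```

The `unsafe` block documents, at the call site, that the closure runs between fork and exec and must obey POSIX's async-signal-safety restrictions.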
Not my call to make, but I really do not like the idea of just incrementing a count on method names. This made a lot of code in Python very ugly. If you want to find some better names, here are similar APIs in other languages: the function with the same behavior in Python is called …
If editions could be used to solve this in the future, that would be a huge benefit. (I realise this is difficult given that libstd is only compiled once, but perhaps there is a way to attach some metadata to the function signature so that the compiler can do the mapping.) E.g. an attribute like:

    #[rename("before_exec", edition = 2018)]
    fn before_exec2(...)
@Diggsey ostensibly you could do that with "edition visibility", e.g. …
This makes sense for someone coming from today's Rust, but in a future where …
We can have "soft unsafe" operations which only produce a (deny-by-default) lint stating that one should wrap them in … Implementing such a scheme in the unsafe checker is quite simple. I am volunteering to write a PR if such a change is desired (or if we just want to see how such a change would look).
Like what happens when taking a reference to a field of a repr(packed) struct?
I think there's definitely enough valid pushback to not call it …

@oli-obk I think I'd personally prefer to see a different function, because there's also the matter of creating a function pointer to this method: today that is safe and creates a safe function pointer, but afterwards it would ideally need to be safe and create an unsafe function pointer, and would in practice have to unsafely create a safe function pointer.
Centril removed the I-nominated label on Nov 29, 2018
The language team agreed that …
Speaking only for myself: I like …
I do feel like there is a meta-question lurking here of unsafe composition and how we should manage it. For now, it seems good to err on the side of caution where possible and avoid thorny questions, but, as has been amply demonstrated here, it's sort of hard to tell what the limits ought to be on what unsafe code can and cannot do when it comes to e.g. external resources. I wonder if at some point we're going to want to try and allow unsafely implemented libraries to declare more precisely the kinds of things they rely on other libraries not to do.
fweimer commented Feb 6, 2017
The before_exec method should be marked unsafe because after a fork from a multi-threaded process, POSIX only allows async-signal-safe functions to be called. Otherwise, the result is undefined behavior.
The only way to enforce this in Rust is to mark the function as unsafe.
So to be clear, this should not compile: …
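The snippet from the original report was folded away by GitHub and is not preserved here. A hypothetical sketch of the kind of call meant: `before_exec` was stabilized as a safe method, so a hook that will run between fork and exec compiles with no `unsafe` anywhere (the no-op closure is illustrative):

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

#[allow(deprecated)] // before_exec still exists, deprecated in favor of pre_exec
fn spawn_with_hook() -> bool {
    // This closure runs between fork() and exec(), where POSIX permits only
    // async-signal-safe calls -- yet no `unsafe` block is required.
    let status = Command::new("true")
        .before_exec(|| Ok(()))
        .status()
        .unwrap();
    status.success()
}

fn main() {
    assert!(spawn_with_hook());
}
```

Under the issue's thesis, this program should fail to compile without an `unsafe` block around the `before_exec` call.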