-
Notifications
You must be signed in to change notification settings - Fork 19
various FFI cleanups and soundness fixes #307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
various FFI cleanups and soundness fixes #307
Conversation
apoelstra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On ffad4a0 successfully ran local tests
ffad4a0 to
6678851
Compare
apoelstra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 6678851 successfully ran local tests
...and remove Elements and _elements from the names (where they occured).
These don't refer to anything
I can't find any citation that this is safe to do. We don't actually use this structure in our Rust code; we populate it by calling a c_set function which is defined on the C side, and then pass it to C functions. So no code needs to be changed. However, I want to remove the c_set_* functions because they hide life information (and in fact, are unsound, as we currently use them) and serve to obfuscate code. They seem to have come from BlockstreamResearch#48 BlockstreamResearch#49 BlockstreamResearch#50 BlockstreamResearch#51 in which Sanket seems to have copied the corresponding Haskell code into Rust, where it is unnecessary and dangerous.
…erences This adds some lifetimes to our raw C structures, which in the next commits will be used to enforce that we are not giving the C code pointers to local variables or other unsound things. (In fact, our existing code does do this.) As a starting point, replace a few "easy" FFI fields with stronger types. We do not change the pointers to value types, which are listed in Rust as "NULL, or a pointer to 9 bytes, or to 33" because it was not clear that there was any better representation for this than *const u8. Through the C FFI we have structs which have listed invariants of the form "either this is NULL or it points to an array of 33 bytes". We further have a guarantee (because we are maintainers of both the C and Rust code, and we are responsible citizens) that these const pointers will never be mutated through by the C code. The rest of this comment is to justify this change: According to the nomicon, we may use references in FFI types as long as we promise they won't be NULL and that we won't violate the aliasing rules (which for const pointers, simply means we won't cast them to non-const pointers): https://doc.rust-lang.org/nomicon/ffi.html#interoperability-with-foreign-code The repr(C) docs further say that if our pointers might be NULL, we can use Option<&T>: https://doc.rust-lang.org/nomicon/other-reprs.html#reprc Finally, where the C code takes a pointer to a u8 with an implicit claim that this is actually a pointer to 33 contiguous u8s, we may use pointers or references to Rust arrays, which is permissible because: * all thin pointers have the same size and representation, justified by the Rust reference https://doc.rust-lang.org/reference/expressions/operator-expr.html#r-expr.as.pointer.sized * arrays are guaranteed to be contiguous in memory in both C and Rust
(The following commits will delete the rest of these.) The role of these functions is to marshall Rust data into C structures. This is implemented in env.c, on the C side of the FFI boundary, and use raw pointers and out-pointers. None of this is necessary because Rust has a well-defined FFI interface with C. Furthermore, it serves to obfuscate the lifetimes and provenance of the various pointers at play. We can remove this function, replace its call sites with direct construction of the Rust version of the C structure, and replace raw pointers (where possible) with references that have named lifetimes. Then our Rust code is formally safe and the compiler will give us much more help.
The existing code creates a temporary `PeginData` structure, then stores a pointer to its `genesis_hash` field in a struct that we pass to the C code. This is unsound. In general, we lack transaction environment tests where we construct different environments then run Simplicity code that exercises them.
This one was easy.
The nomicon doesn't like this. From the bottom of https://doc.rust-lang.org/nomicon/ffi.html "Notice that it is a really bad idea to use an empty enum as FFI type. The compiler relies on empty enums being uninhabited, so handling values of type &Empty is a huge footgun and can lead to buggy program behavior (by triggering undefined behavior)." They give an example that uses a phantom to also remove Send, Sync, and Unpin. I did not do this. I think all these traits are fine, because the type has no self-references and is never mutated by the C code except when it is freed, and then only through a `*mut` pointer.
These structures are private data types used ephemerally by src/jet/elements/c_env.rs. They should not be public and certainly should not be exposed in simplicity-sys.
6678851 to
405487e
Compare
|
Rebased. |
apoelstra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 405487e successfully ran local tests
| #[derive(Debug)] | ||
| pub struct RawOutputData { | ||
| pub asset: Vec<c_uchar>, | ||
| pub asset: Option<[c_uchar; 33]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: In some places you've replaced c_uchar with u8 but not others. Using u8 consistently seems better (but I can change that in another commit).
|
ACK 405487e |
Currently our transaction environment construction is quite complicated, and (as a consequence) unsound in a couple of places. This was implemented in PRs #49 through #52 in the early days of this crate with very little review, using a strategy copied from Haskell but not really appropriate for Rust.
In particular:
repr(C)structsThere was at least one instance where we were creating a temporary variable and extracting a pointer from it, which would then outlive the object it pointed into. (This occurred for the annex, though in dead code, and for the pegin genesis hash, in live code that has never been used.)
As an aside, we used empty enums in a couple places for opaque types. We should not do this, according to the nomicon.