@uncomputable
Collaborator

@uncomputable uncomputable commented Apr 13, 2022

Exposes part of Simplicity's C API in Rust. At the moment, one can decode Simplicity DAGs from bytes, and one can access test programs with hardcoded bytes and their correct Merkle roots.

Todo

  • Discussion: use minimal set of necessary C files instead of entire library?
  • Discussion: use C test files?
  • Test CMRs and IMRs against C [postponed to a later PR due to outdated jets]
  • Make sure that FFI is safe and that pointers are freed

@uncomputable
Collaborator Author

Please check whether this is the way we should expose C bindings, @apoelstra, @sanket1729. I mimicked secp256k1-sys, with two important differences:

  1. There is no renaming / tagging of functions.
  2. The crate is called simplicity_sys, since hyphens are not allowed in crate identifiers.

.into_iter()
.map(|x| simplicity_path.join(x))
.collect();
let test_files: Vec<_> = vec![
Collaborator

In cdee8cb:

I don't think we need this array (or these files, for that matter).

Collaborator

Ah -- you have unit tests in -sys that use them. In this case I think you could gate this array with #[cfg(test)] so that it's not built when simplicity-sys is used as a dependency.
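
A minimal sketch of how the test-only C files could be kept out of dependency builds. One caveat, as far as I know: a build script is compiled separately from the crate, so `#[cfg(test)]` is not set inside `build.rs`. The sketch therefore assumes a hypothetical Cargo feature named `test-utils`, which Cargo would expose to the build script as the environment variable `CARGO_FEATURE_TEST_UTILS`. Apart from `dag.c`, all file names are placeholders.

```rust
use std::path::{Path, PathBuf};

/// C sources that are always compiled. Only `dag.c` is named in this thread;
/// the real list would be longer.
const CORE_FILES: &[&str] = &["dag.c"];

/// Hypothetical test-only sources (placeholder names).
const TEST_FILES: &[&str] = &["test_program_a.c", "test_program_b.c"];

/// Returns the C files to compile, rooted at `simplicity_path`.
/// Test files are included only when `include_tests` is true.
pub fn c_sources(simplicity_path: &Path, include_tests: bool) -> Vec<PathBuf> {
    let mut names: Vec<&str> = CORE_FILES.to_vec();
    if include_tests {
        names.extend_from_slice(TEST_FILES);
    }
    names.into_iter().map(|x| simplicity_path.join(x)).collect()
}

/// In `build.rs`, the flag could come from the assumed `test-utils` feature.
pub fn tests_requested() -> bool {
    std::env::var_os("CARGO_FEATURE_TEST_UTILS").is_some()
}
```

The build script would then call `c_sources(&simplicity_path, tests_requested())` and pass the result to the C compiler invocation.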

Collaborator Author

We could reduce the built files to a minimum. For instance, in order to construct C-DAGs from binary, we definitely need decodeMallocDag() from dag.c.

Collaborator Author

The point of this PR is to expose (1) a way to decode DAGs from bytes in C, and (2) test programs with hardcoded bytes and Merkle roots. This is why we need test_files. My original plan was to test Rust against C, but this has to be done in a future PR due to outdated jets.

}

impl From<&[u8]> for Bitstream {
    fn from(bytes: &[u8]) -> Self {
Collaborator

In 8e45d76:

This entire function needs to be marked unsafe, or better, you need to tie the lifetime of bytes to the Bitstream type (perhaps with one of the types in std::marker). As written, it takes a slice bytes and returns an object that points into the slice's data and may outlive it.

Collaborator Author

@uncomputable uncomputable May 17, 2022

I updated Bitstream to Bitstream<'a> where 'a is the lifetime of &bytes.
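
A sketch of what tying the lifetime could look like, using `std::marker::PhantomData`. The field names and the `first_byte` and `len` helpers are hypothetical and only illustrate the pattern; the actual `Bitstream` wraps a C stream rather than a bare pointer.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the FFI bitstream: a raw pointer into borrowed bytes.
pub struct Bitstream<'a> {
    ptr: *const u8,
    len: usize,
    /// Tells the borrow checker that this value borrows a `&'a [u8]`.
    _bytes: PhantomData<&'a [u8]>,
}

impl<'a> From<&'a [u8]> for Bitstream<'a> {
    fn from(bytes: &'a [u8]) -> Self {
        Bitstream {
            ptr: bytes.as_ptr(),
            len: bytes.len(),
            _bytes: PhantomData,
        }
    }
}

impl<'a> Bitstream<'a> {
    /// Number of bytes in the underlying slice.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Reads the first byte, if any.
    pub fn first_byte(&self) -> Option<u8> {
        if self.len == 0 {
            None
        } else {
            // SAFETY: `ptr` points into a slice borrowed for `'a`,
            // and `self` cannot outlive `'a`.
            Some(unsafe { *self.ptr })
        }
    }
}
```

With this shape, returning a `Bitstream` that outlives its `bytes` is rejected at compile time, so the conversion no longer needs to be `unsafe`.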

jet: *const c_void,
cmr: [u32; 8],
// Assume largest union member
union_aux_types: [size_t; 2],
Collaborator

In 8e45d76:

This makes me a little nervous. Rust has a union keyword. Could you use that?

Collaborator Author

DagNode now uses unions, namely AuxTypes [1] and IndicesChildrenWitness [2].
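
For illustration, a `#[repr(C)]` union in Rust might be declared as below. The name `AuxTypes` comes from the comment above, but its members here (`pair`, `single`, and the `TypePair` struct) are hypothetical and do not reflect the actual C definition.

```rust
use std::mem::size_of;

/// Hypothetical union member: a pair of type indices.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TypePair {
    pub source: usize,
    pub target: usize,
}

/// `#[repr(C)]` gives the union C layout rules: all members start at offset 0
/// and the size is that of the largest member, rounded up to the alignment.
#[repr(C)]
#[derive(Clone, Copy)]
pub union AuxTypes {
    pub pair: TypePair,
    pub single: usize,
}

/// Reading a union field is unsafe: the caller must know which member is valid.
pub fn read_single(aux: &AuxTypes) -> usize {
    // SAFETY: every bit pattern is a valid `usize`, and callers in this sketch
    // only pass unions whose first word has been initialized.
    unsafe { aux.single }
}

/// Size of the union in bytes.
pub fn aux_types_size() -> usize {
    size_of::<AuxTypes>()
}
```

Compared with a hand-sized `[size_t; 2]` field, the compiler derives the size from the members, so the struct stays correct if a larger member is added later.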

@apoelstra
Collaborator

Regarding "should we try to minimize the set of C", I vote no. The whole C library is super small; better to use the whole thing to make review easier. Link-time optimization will strip unused symbols from the final binaries.

Done reviewing 8e45d76. Overall this looks great!

/// Returns the DAG and its length on success.
pub fn decode(bytes: &[u8]) -> Result<(&mut DagNode, usize), Error> {
    let mut bitstream = Bitstream::from(bytes);
    let mut dag: *mut DagNode = std::ptr::null_mut();
Collaborator Author

Do we have to explicitly free the DAG in the destructor of DagNode? In the C code, the DAG is freed after use.

Collaborator

Yep, you need to free dag in the Drop impl of DagNode. Good catch.
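
A sketch of the ownership pattern being discussed: a wrapper that owns the allocation and releases it in `Drop`. To keep it runnable without linking C, `c_alloc_dag` and `c_free_dag` are hypothetical stand-ins written in Rust; in the real crate they would be the C decoder and C's `free`. The `FREED` counter exists only to make the drop observable.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `c_free_dag`, so the sketch can show that `Drop` ran.
pub static FREED: AtomicUsize = AtomicUsize::new(0);

/// Opaque node type, standing in for the C struct.
pub struct RawNode {
    _cmr: [u32; 8],
}

/// Stand-in for the C decoder's allocation (hypothetical).
fn c_alloc_dag() -> *mut RawNode {
    Box::into_raw(Box::new(RawNode { _cmr: [0; 8] }))
}

/// Stand-in for C's `free` (hypothetical; must match `c_alloc_dag`).
unsafe fn c_free_dag(ptr: *mut RawNode) {
    // SAFETY: the caller guarantees `ptr` came from `c_alloc_dag`
    // and has not been freed before.
    unsafe { drop(Box::from_raw(ptr)) };
    FREED.fetch_add(1, Ordering::SeqCst);
}

/// Owning handle: exactly one `Dag` exists per allocation.
pub struct Dag {
    root: NonNull<RawNode>,
}

impl Dag {
    /// Placeholder for decoding: allocates and wraps the pointer.
    pub fn decode_stub() -> Option<Dag> {
        NonNull::new(c_alloc_dag()).map(|root| Dag { root })
    }
}

impl Drop for Dag {
    fn drop(&mut self) {
        // SAFETY: `root` came from `c_alloc_dag` and is freed only here.
        unsafe { c_free_dag(self.root.as_ptr()) }
    }
}
```

Because `Dag` is neither `Clone` nor `Copy`, the allocation is freed exactly once, when the handle goes out of scope.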

unsafe {
    let mode = "r";
    // `file` does not need to be freed as
    // `bytes` will free itself at the end of its lifetime
Collaborator

In 213c697:

This comment doesn't make sense to me. At some point you do need to fclose file, don't you?

Collaborator Author

Bitstream<'a> lives at most as long as &'a bytes. According to man pages, fmemopen does not open a file descriptor, nor does a read-only stream have any buffers that need to be flushed. fclose simply closes file descriptors and flushes. If I understand correctly, this means that the FILE pointer in the stream can safely be dropped without freeing it.

That being said, not freeing a FILE pointer is probably bad practice. I will add a destructor.

Collaborator

I'm sure that you need to actually call fclose at some point in this source file.

Otherwise, the lifetime stuff looks like it's been fixed.

Collaborator

Actually, on further reading I don't know. I can't find a citation either way. It is surprising that there is no descriptor and that fileno will just fail.

@apoelstra
Collaborator

Briefly reviewed 213c697. (Only changes from the last review were in the final commit; I think we'll need a bit more iteration on this one.)

@uncomputable
Collaborator Author

I updated how DAGs are represented, and added free and fclose for each allocation and opened stream.

Because Drop now tries to free a DAG when it goes out of scope, we cannot do pointer addition to index the DAG and return a struct of the same type. The fact that DAGs are not their root node is confusing in the first place, so I decided to hide that from the user completely: A DAG has exactly one CMR, namely its root CMR. In future PRs, we can add the notion of DAG nodes, whose lifetime will depend on that of their parent DAG.
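
One way the future "DAG node" notion could look, as a sketch only: a node view that borrows its parent DAG, so the borrow checker prevents it from outliving the allocation. The `NodeRef` type, the `Vec`-backed storage, and the assumption that the root is the last node in the array are all illustrative and not part of this PR.

```rust
/// Hypothetical owned DAG: a flat array of node CMRs whose last entry is the root.
pub struct Dag {
    cmrs: Vec<[u32; 8]>,
}

/// A view of one node. It borrows the `Dag`, so it cannot outlive it.
pub struct NodeRef<'d> {
    dag: &'d Dag,
    index: usize,
}

impl Dag {
    /// Builds a DAG from node CMRs; an empty DAG has no root and is rejected.
    pub fn from_cmrs(cmrs: Vec<[u32; 8]>) -> Option<Dag> {
        if cmrs.is_empty() {
            None
        } else {
            Some(Dag { cmrs })
        }
    }

    /// The DAG's single public CMR: that of its root (assumed to be the last node).
    pub fn cmr(&self) -> [u32; 8] {
        self.cmrs[self.cmrs.len() - 1]
    }

    /// Returns a borrowed view of node `index`, if it exists.
    pub fn node<'d>(&'d self, index: usize) -> Option<NodeRef<'d>> {
        if index < self.cmrs.len() {
            Some(NodeRef { dag: self, index })
        } else {
            None
        }
    }
}

impl<'d> NodeRef<'d> {
    /// CMR of this node.
    pub fn cmr(&self) -> [u32; 8] {
        self.dag.cmrs[self.index]
    }
}
```

Indexing then yields a distinct, non-owning type, so `Drop` on `Dag` remains the only place where memory is released.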

@uncomputable
Collaborator Author

@apoelstra This should be ready to be merged. Sorry for the many force pushes.

len: usize,
}

impl<'a> fmt::Debug for Dag {
Collaborator

In a0faaf2:

nit: this 'a shouldn't be there. I'm not sure why the compiler doesn't warn about this.
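
For reference, a sketch of the impl without the stray parameter. The `Dag` struct here is reduced to the `len` field visible in the snippet; the real type has more fields. As far as I know, rustc accepts an unused lifetime parameter on an impl, and the `unused_lifetimes` lint that would flag it is allow-by-default, which may explain the missing warning.

```rust
use std::fmt;

/// Reduced stand-in for the reviewed type (only `len` is taken from the diff).
pub struct Dag {
    len: usize,
}

impl Dag {
    /// Hypothetical constructor, only needed for this sketch.
    pub fn with_len(len: usize) -> Dag {
        Dag { len }
    }
}

// No `<'a>` here: `Dag` has no lifetime parameter, so the impl needs none.
impl fmt::Debug for Dag {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Dag").field("len", &self.len).finish()
    }
}
```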

@apoelstra
Collaborator

Could you also rebase onto master? This currently doesn't have #23 which is making it hard for me to test locally.

apoelstra previously approved these changes May 23, 2022
Collaborator

@apoelstra apoelstra left a comment

ACK 190a669

Member

@sanket1729 sanket1729 left a comment

High-level review ACK. Left a couple of questions.

}

/// Simplicity program with bytes, CMR and IMR for testing.
pub struct TestProgram {
Member

Where are we testing the decoding works?

Collaborator Author

I assume that the values from C are correct (within C's context, using its functions). Testing Rust against C is currently not feasible: because of outdated jets, these tests would fail.

fn decode_cmr() {
    let bytes = SCHNORR0.bytes;
    let dag = Dag::decode(bytes).expect("decoding");
    assert_eq!(SCHNORR0.cmr(), dag.cmr());
Member

We should also test that the CMR that rust-simplicity calculates for SCHNORR0 is the same as the one from C Simplicity. What we are testing here is only that the value exported from C matches the value that C itself calculates.

Collaborator Author

I agree; this was my original intention for this PR. Unfortunately, outdated jets etc. mean that the Rust CMR is almost always different from the C CMR (for the same program). Rust's SCHNORR0 and C's SCHNORR0 are also very different programs, the latter using jets and being much smaller. In a later PR, I will replace src/test_progs in Rust with tests that use the FFI once there is a prospect that these tests will succeed.
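
Once such a comparison becomes feasible, a helper along these lines could bridge the two representations. This is a sketch under stated assumptions: the C side exposes the root as eight 32-bit words (as in the `cmr: [u32; 8]` field earlier in this PR), the Rust side as 32 bytes, and each word is serialized big-endian as in SHA-256 output. The function names are hypothetical.

```rust
/// Converts a C-side root (eight 32-bit words) into 32 bytes.
/// Assumption: each word is serialized big-endian, as in SHA-256 output.
pub fn cmr_words_to_bytes(words: [u32; 8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (chunk, word) in out.chunks_exact_mut(4).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_be_bytes());
    }
    out
}

/// Returns true if the Rust-side CMR equals the C-side CMR.
pub fn cmr_matches(rust_cmr: &[u8; 32], c_cmr: [u32; 8]) -> bool {
    *rust_cmr == cmr_words_to_bytes(c_cmr)
}
```

A cross-check test would then assert `cmr_matches` on the CMR computed by rust-simplicity and the one exported from C for the same program bytes.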

- sub-crate contains C code
- copied from ElementsProject/simplicity/commit/35627fc49ad96fcb844cca72ffbc69b6e934cb4c
Collaborator

@apoelstra apoelstra left a comment

ACK 66135ce

@uncomputable uncomputable mentioned this pull request Jul 9, 2022
3 tasks
Member

@sanket1729 sanket1729 left a comment

utACK 66135ce. The build script seems fragile. We should have some vendoring-like script that updates the underlying library when anything changes.

@apoelstra
Collaborator

Let's merge this for now. We can improve things later by writing a revendoring script that will do codegen and maybe symbol renaming, but for now this is something we've wanted for a while.

@apoelstra apoelstra merged commit 3c12828 into BlockstreamResearch:master Aug 25, 2022
@uncomputable uncomputable deleted the simplicity-sys branch August 29, 2022 18:56