Merkle Proving #16
Conversation
crates/merkle-inclusion/src/lib.rs
Outdated
E: PrimeField,
CS: ConstraintSystem<E>,
{
    sha3(cs.namespace(|| "hash combine"), &[hash1, hash2].concat())
I may be missing context for this pull request. Can the sha3 here be replaced with a generic hash function circuit? For example, someone may want to use Poseidon to construct a Merkle tree and prove inclusion.
For sure, I'm currently working on a commit that's dedicated to it. Should be pushed quite soon 👍
@huitseeker, I added a test over the number of constraints in 437f1c8 as you suggested. I think I got it right 🤔 Edit: derived a more generalized formula for the number of constraints of the circuit. The number of constraints is as follows:
So for a 256-bit digest over a depth-3 leaf:
// Example use of the macro with OutOfCircuitHasher specified
create_gadget_digest_impl!(Sha3, sha3, 256, Sha3_256);
create_gadget_digest_impl!(Keccak, keccak256, 256, Keccak256);
It would be great to check that we're not hard-coding the length to be 256, and an easy way to do this is to test with the neighboring SHA512 crate (whose output is 512 bits).
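To make the "no hard-coded 256" property concrete, here is a hypothetical sketch (the trait and macro names are illustrative, not the crate's actual API) of how such a macro can carry the digest width as an associated constant, so a 512-bit gadget drops in unchanged:

```rust
// Illustrative only: `GadgetDigestInfo` and `create_digest_info_impl` are
// stand-ins for the crate's real trait and macro.
trait GadgetDigestInfo {
    const OUTPUT_BITS: usize;
    fn name() -> &'static str;
}

macro_rules! create_digest_info_impl {
    ($name:ident, $label:expr, $bits:expr) => {
        struct $name;
        impl GadgetDigestInfo for $name {
            // The width comes from the macro argument, never a literal 256.
            const OUTPUT_BITS: usize = $bits;
            fn name() -> &'static str {
                $label
            }
        }
    };
}

create_digest_info_impl!(Sha3, "sha3", 256);
create_digest_info_impl!(Sha512, "sha512", 512);
```

A test instantiated over the 512-bit type then exercises exactly the non-256 path the comment asks for.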
This would also help with endianness: SHA512 uses big-endian, whereas the keccak crate is little-endian.
See in the recent PR how I deal with this using Bitvec: #20
This might help.
Thanks, this is pretty helpful!
Done in 04ae409.
/// // Process each leaf and intermediary node's hash and key...
/// }
/// ```
pub fn construct_merkle_tree<D: Digest>() -> SimpleMerkleTree {
It's not needed for this iteration, but we'll probably want to open an issue to use proptest to generate this sort of test instance.
@huitseeker I just added it to have a clean tree to use for testing. Would you rather have me remove this implementation and implement a more straightforward construction of the leaves / proofs?
No, it's ok. I'm just mentioning specifically that proptest_recurse can generate random tree-structured data: https://docs.rs/proptest-recurse/latest/proptest_recurse/
This is probably something we'll want, but in the future.
/// Each leaf node contains a key-value pair, where the key is used to determine the position in the tree and the value
/// is the data stored.
pub struct Leaf {
    key: Key,
What's the expected endianness of this key? Shall we document it?
As of now, the key is always handled as little-endian. I think it would actually make more sense for it to have the same endianness as the hashing algorithm it is associated with. Will update the code to reflect that.
Did this in 775a5fa
Behavior needed correction, updated in a74aa26
Why does it make sense to have the endianness of the key be that of the hashing gadget?
I'm not against a simplifying assumption such as "key, siblings and gadget output should all have the same endianness" (which I'm not sure is required strictly speaking by the Aptos spec), but I'd expect that to be spectacularly well documented.
That is, if I pass a (key, proof) pair to the algorithm, and that proof is being iterated on in the direction of the endianness of a hashing gadget that I happen to have specified separately, I will be surprised to see a dependency come from that gadget's implementation detail unless it's well documented: the gadget's implementer and the provider of the proof may not be the same people.
To be explicit, there's good documentation about this on the conditional_reverse function, which is private. I'd expect this documentation to be on verify_proof and on the Leaf struct.
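As an illustration of the documentation being asked for, here is a sketch of how the contract could be stated on the Leaf struct itself (field types, names, and the stated contract are simplified stand-ins, not the crate's real definitions):

```rust
/// A leaf of the Merkle tree (illustrative sketch, not the crate's real type).
///
/// # Iteration order
/// The bits of `key` are expected in the same order as the siblings of any
/// proof they are verified against: `key[0]` is the bit consumed together
/// with the first (bottom-most) sibling. No relation to the hashing
/// gadget's endianness is assumed here.
pub struct Leaf {
    key: Vec<bool>,
    value_hash: Vec<u8>,
}

impl Leaf {
    pub fn new(key: Vec<bool>, value_hash: Vec<u8>) -> Self {
        Leaf { key, value_hash }
    }

    /// Key bits, ordered first-consumed-first (see the struct docs).
    pub fn key(&self) -> &[bool] {
        &self.key
    }

    /// Hash of the stored value.
    pub fn value_hash(&self) -> &[u8] {
        &self.value_hash
    }
}
```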
This is getting close, I think we need to improve how we document the assumptions we'd have on the key-siblings joint iteration. Thanks!
version = "0.1.0"
edition = "2021"
authors = ["Lurk Lab Engineering <engineering@lurk-lab.com>"]
license = "MIT OR Apache-2.0"
Nit: It would be great to take this occasion to set up a workspace-wide license field here:
https://github.com/lurk-lab/bellpepper-gadgets/blob/main/Cargo.toml#L12-L14
crates/merkle-inclusion/src/lib.rs
Outdated
let mut actual_root_hash = proof.leaf().hash().to_vec();

let key_iterator =
    conditional_reverse::<_, _, GD>(proof.leaf().key().iter().take(proof.siblings().len()));
If I have, for example, proof.siblings().len() == 5, proof.leaf().key().len() == 8, and GD::is_little_endian() == true, then the sequence of indexes of proof.leaf().key() that will be used to thread proof.siblings() is [4, 3, 2, 1, 0]. Is that what the user should expect, or should they expect [7, 6, 5, 4, 3] instead? Is the MSB of the key not at index 7 in LE order?
I think there are two ways to go about this:
- explicitly state everywhere that the key and the gadget should have the same endianness, and in that case reverse first, and .take(proof.siblings().len()) after. This is closer to the Aptos spec (note the mention of the MSB of the key there), but may be less simple for other Merkle algorithms.
- not make any explicit relation between the gadget's endianness and the key's iteration order, and say that in a proof, key and siblings should be iterable in the same order, whatever it is: to the point that the algorithm should operate on the iterator formed by proof.leaf().key().iter().zip(proof.siblings().iter()) (after checking proof.leaf().key().len() > proof.siblings().len()).
(I don't particularly care which of these two solutions we use, but I am mindful that we put the user in a situation to know precisely which approach we're enforcing.)
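The difference between the two index sequences can be reproduced with plain iterators; this is a toy demo over bit *indices* (not actual key bits), assuming an 8-bit key and 5 siblings:

```rust
// Take the first `siblings` indices, then reverse the taken prefix
// (the order the quoted code produces): 8, 5 -> [4, 3, 2, 1, 0].
fn take_then_rev(key_len: usize, siblings: usize) -> Vec<usize> {
    (0..key_len).take(siblings).rev().collect()
}

// Reverse first (start at the other end of the key), then take
// `siblings` indices: 8, 5 -> [7, 6, 5, 4, 3].
fn rev_then_take(key_len: usize, siblings: usize) -> Vec<usize> {
    (0..key_len).rev().take(siblings).collect()
}
```

Swapping the order of `take` and `rev` is exactly what distinguishes the two user-visible behaviors discussed above.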
I've gone along with solution 2; I think it should suffice in our context.
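Outside the circuit, solution 2 boils down to a length check followed by a joint iteration. A minimal sketch, with a toy `combine` function standing in for the digest gadget (not a real hash, and not the crate's API):

```rust
// Toy combiner standing in for the digest gadget; NOT a real hash.
fn combine(left: u64, right: u64) -> u64 {
    left.wrapping_mul(31).wrapping_add(right)
}

// "Solution 2": no endianness assumption; key bits and siblings are simply
// required to be iterable in the same (bottom-to-top) order.
fn fold_root(leaf_hash: u64, key_bits: &[bool], siblings: &[u64]) -> Option<u64> {
    // A well-formed proof must carry at least one key bit per sibling.
    if key_bits.len() < siblings.len() {
        return None;
    }
    let mut acc = leaf_hash;
    for (bit, sibling) in key_bits.iter().zip(siblings.iter()) {
        // Each key bit decides whether the running hash is the right
        // (bit set) or left (bit clear) child at that level.
        acc = if *bit {
            combine(*sibling, acc)
        } else {
            combine(acc, *sibling)
        };
    }
    Some(acc)
}
```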
crates/merkle-inclusion/src/lib.rs
Outdated
GD: GadgetDigest<E>,
>(
    iter: I,
) -> Box<dyn DoubleEndedIterator<Item = &'a Boolean> + 'a> {
It would be great if we could inline this and avoid the performance cost of Box<dyn MyType> for these few lines.
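One way to drop the Box is an Either-style enum that dispatches statically over the two branches. A sketch (the MaybeRev name and this conditional_reverse signature are hypothetical, simplified from the generic one above):

```rust
use std::iter::Rev;

// Either-style wrapper: both branches are concrete types, so no heap
// allocation or dynamic dispatch is needed.
enum MaybeRev<I> {
    Forward(I),
    Backward(Rev<I>),
}

impl<I: DoubleEndedIterator> Iterator for MaybeRev<I> {
    type Item = I::Item;
    fn next(&mut self) -> Option<Self::Item> {
        match self {
            MaybeRev::Forward(it) => it.next(),
            MaybeRev::Backward(it) => it.next(),
        }
    }
}

// Same observable behavior as a boxed conditional reverse, without Box<dyn>.
fn conditional_reverse<I: DoubleEndedIterator>(iter: I, reverse: bool) -> MaybeRev<I> {
    if reverse {
        MaybeRev::Backward(iter.rev())
    } else {
        MaybeRev::Forward(iter)
    }
}
```

The `itertools::Either` type offers the same pattern off the shelf if the dependency is acceptable.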
crates/merkle-inclusion/src/lib.rs
Outdated
for (i, (sibling_hash, bit)) in proof.siblings().iter().zip(key_iterator).enumerate() {
    if let Some(b) = bit.get_value() {
        if b {
            actual_root_hash = GD::digest(
                cs.namespace(|| format!("sibling {}", i)),
                &[sibling_hash.to_owned(), actual_root_hash].concat(),
            )?
        } else {
            actual_root_hash = GD::digest(
                cs.namespace(|| format!("sibling {}", i)),
                &[actual_root_hash, sibling_hash.to_owned()].concat(),
            )?
        }
    } else {
        return Err(SynthesisError::Unsatisfiable);
    }
}

hash_equality(cs, expected_root, actual_root_hash)
Aka, I believe:
for (i, (sibling_hash, bit)) in proof.siblings().iter().zip(key_iterator).enumerate() {
    let b = bit.get_value().ok_or(SynthesisError::Unsatisfiable)?;
    // Determine the order of hashing based on the bit value
    let hash_order = if b {
        vec![sibling_hash.to_owned(), actual_root_hash]
    } else {
        vec![actual_root_hash, sibling_hash.to_owned()]
    };
    // Compute the new hash
    actual_root_hash = GD::digest(cs.namespace(|| format!("sibling {}", i)), &hash_order.concat())?;
}
hash_equality(cs, expected_root, actual_root_hash)
LICENSE-APACHE
Outdated
I think we don't need the License files replicated at the root of the repo if we copy them in each crate, which we have done so far.
I'd love to address those copies, but I'd want to do it in another PR, to keep this one focused on Merkle trees => I think this should not add any license files other than the ones in its own crate.
Alright, I'll remove the license files at the root.
We now expect proof.siblings and proof.leaf.key to be ordered from first sibling to last (bottom to top of the tree).
a8b7666 to 460c8b6
LGTM!!
Description
This PR introduces a gadget to prove the inclusion of a leaf in a Merkle Tree.
Current progress
Used the Jellyfish Merkle Tree paper to implement a Proof of Inclusion algorithm.
It is generic over the hashing algorithm used, and currently natively handles the Keccak and Sha3 implementations in bellpepper-keccak.
Links to any relevant issues or pull requests
Related to linear#29
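The verification flow discussed in this thread can be sketched end to end outside any constraint system. Everything here is a toy model: `combine` is std's (non-cryptographic) SipHash-based hasher standing in for the digest gadget, and `verify` mirrors the bottom-to-top fold described above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy two-child combiner; NOT a cryptographic hash, only a stand-in.
fn combine(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

// Fold the leaf hash upward: each key bit picks whether the running hash
// is the right (bit set) or left (bit clear) child, then compare the
// result against the expected root.
fn verify(root: u64, leaf: u64, key_bits: &[bool], siblings: &[u64]) -> bool {
    let mut acc = leaf;
    for (bit, s) in key_bits.iter().zip(siblings.iter()) {
        acc = if *bit { combine(*s, acc) } else { combine(acc, *s) };
    }
    acc == root
}
```

For a depth-2 tree over leaves a, b, c, d, the proof for a consists of the key bits [false, false] and the siblings [b, combine(c, d)], read bottom to top.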