
Merkle Proving #16

Merged — 35 commits, merged Jan 22, 2024
Conversation

tchataigner (Member) commented Jan 15, 2024:

Description

This PR introduces a gadget to prove leaf inclusion in a Merkle Tree.

Current progress

Used the Jellyfish Merkle Tree paper to implement a proof-of-inclusion algorithm.

It takes a generic approach to which hashing algorithm is used, and currently natively handles the Keccak and Sha3 implementations in bellpepper-keccak.
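The generic-hasher idea can be sketched out of circuit roughly like this. Note that `NodeHasher`, `ToyHasher`, and `hash_combine` below are illustrative stand-ins, not the PR's actual `GadgetDigest` trait or the bellpepper-keccak circuits:

```rust
// Out-of-circuit sketch: the gadget is generic over a digest trait, with
// Keccak and SHA3 as instances. All names here are hypothetical.
trait NodeHasher {
    const DIGEST_LEN: usize;
    fn hash(data: &[u8]) -> Vec<u8>;
}

/// Toy 4-byte "hash" standing in for a real digest circuit.
struct ToyHasher;

impl NodeHasher for ToyHasher {
    const DIGEST_LEN: usize = 4;
    fn hash(data: &[u8]) -> Vec<u8> {
        let mut out = [0u8; 4];
        for (i, b) in data.iter().enumerate() {
            out[i % 4] = out[i % 4].wrapping_add(b.rotate_left((i % 7) as u32));
        }
        out.to_vec()
    }
}

/// Combine two child digests into their parent, generic over the hasher,
/// mirroring the "hash combine" step in the gadget.
fn hash_combine<H: NodeHasher>(left: &[u8], right: &[u8]) -> Vec<u8> {
    H::hash(&[left, right].concat())
}

fn main() {
    let left = ToyHasher::hash(b"left leaf");
    let right = ToyHasher::hash(b"right leaf");
    let parent = hash_combine::<ToyHasher>(&left, &right);
    assert_eq!(parent.len(), ToyHasher::DIGEST_LEN);
}
```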

Links to any relevant issues or pull requests

Related to linear#29

```rust
E: PrimeField,
CS: ConstraintSystem<E>,
{
    sha3(cs.namespace(|| "hash combine"), &[hash1, hash2].concat())
```
Collaborator:

I may be missing context for this pull request. Can the sha3 here be replaced with a generic hash function circuit? For example, someone may want to use Poseidon to construct a Merkle tree and prove inclusion.

Member Author:

For sure, I'm currently working on a commit that's dedicated to it. Should be pushed quite soon 👍

@tchataigner tchataigner marked this pull request as ready for review January 17, 2024 15:09
tchataigner (Member Author) commented Jan 17, 2024:

@huitseeker , I added a test over the number of constraints in 437f1c8 as you suggested. I think I got it right 🤔

Edit: generalized the formula for the number of constraints of the circuit.

The number of constraints is as follows:

```
constraints = constraints_inputs + constraints_computation
            = expected_root_digest_length + leaf_digest_length + tree_depth
              + digest_length * nbr_siblings + enforce_equal_over_digest
              + hasher_constraints * nbr_siblings
            = digest_length + digest_length + nbr_siblings
              + digest_length * nbr_siblings + digest_length
              + hasher_constraints * nbr_siblings
            = 3 * digest_length + (digest_length + hasher_constraints + 1) * nbr_siblings
```

So for a 256-bit digest length over a leaf at depth 3:

```
3 * 256 + (256 + 151424 + 1) * 3 = 455811
```
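The formula above can be sanity-checked with a small helper (illustrative only, not part of the PR):

```rust
// Closed-form constraint count from the comment above:
// 3 * digest_length + (digest_length + hasher_constraints + 1) * nbr_siblings
fn constraint_count(digest_length: u64, hasher_constraints: u64, nbr_siblings: u64) -> u64 {
    3 * digest_length + (digest_length + hasher_constraints + 1) * nbr_siblings
}

fn main() {
    // 256-bit digest, the SHA3 hasher cost quoted above, depth-3 leaf:
    assert_eq!(constraint_count(256, 151_424, 3), 455_811);
}
```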


```rust
// Example use of the macro with OutOfCircuitHasher specified
create_gadget_digest_impl!(Sha3, sha3, 256, Sha3_256);
create_gadget_digest_impl!(Keccak, keccak256, 256, Keccak256);
```
Member:

It would be great to check we're not hard-coding the length to be 256, and an easy way to do this is to test with the neighboring SHA512 crate (whose output is 512 bits).

Member:

This would also help with endianness: SHA512 uses big-endian, whereas the keccak crate is little-endian.
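The endianness difference flagged here can be made concrete with a small sketch: the same byte expanded to bits MSB-first (big-endian, SHA-512-style) versus LSB-first (little-endian, keccak-crate-style) yields different traversal orders. The helpers below are hypothetical, not the PR's code:

```rust
// Expand bytes to bits, most significant bit first (big-endian convention).
fn bits_be(bytes: &[u8]) -> Vec<bool> {
    bytes
        .iter()
        .flat_map(|b| (0..8).rev().map(move |i| (b >> i) & 1 == 1))
        .collect()
}

// Expand bytes to bits, least significant bit first (little-endian convention).
fn bits_le(bytes: &[u8]) -> Vec<bool> {
    bytes
        .iter()
        .flat_map(|b| (0..8).map(move |i| (b >> i) & 1 == 1))
        .collect()
}

fn main() {
    let byte = [0b1000_0000u8];
    assert_eq!(bits_be(&byte)[0], true); // high bit comes out first
    assert_eq!(bits_le(&byte)[0], false); // low bit comes out first...
    assert_eq!(bits_le(&byte)[7], true); // ...and the high bit last
}
```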

Member:

See in the recent PR how I deal with this using Bitvec: #20
This might help.

Member Author:

Thanks, this is pretty helpful!

Member Author:

Done in 04ae409.

````rust
/// // Process each leaf and intermediary node's hash and key...
/// }
/// ```
pub fn construct_merkle_tree<D: Digest>() -> SimpleMerkleTree {
````
Member:

It's not needed for this iteration, but we'll probably want to open an issue to use proptest to generate this sort of test instance.

Member Author:

@huitseeker I just added it to have a clean tree to use for testing. Would you rather have me remove this implementation and implement a more straightforward construction of the leaves / proofs?

Member:

No, it's ok. I'm just mentioning specifically that proptest_recurse can generate random tree-structured data:
https://docs.rs/proptest-recurse/latest/proptest_recurse/
This is probably something we'll want, but in the future.

```rust
/// Each leaf node contains a key-value pair, where the key is used to determine the position in the tree and the value
/// is the data stored.
pub struct Leaf {
    key: Key,
```
Member:

What's the expected endianness of this key? Shall we document it?

Member Author:

As of now I treat the key as little-endian at all times. I think it would actually make more sense for it to share the endianness of the hashing algorithm it is associated with. Will update the code to reflect that.

Member Author:

Did this in 775a5fa

Member Author:

Behavior needed correction, updated in a74aa26

Member:

Why does it make sense to have the endianness of the key be that of the hashing gadget?

I'm not against a simplifying assumption such as "key, siblings and gadget output should all have the same endianness" (which I'm not sure is required strictly speaking by the Aptos spec), but I'd expect that to be spectacularly well documented.

That is, if I pass a key, proof pair to the algorithm, and that proof is being iterated on in the direction of the endianness of a hashing gadget that I happen to have specified separately, I will be surprised to see a dependency come from that gadget's implementation detail unless it's well documented: the gadget's implementer and the provider of the proof may not be the same people.

To be explicit, there's good documentation about this on the conditional_reverse function, which is private. I'd expect this documentation to be on verify_proof and on the Leaf struct.

huitseeker (Member) left a comment:

This is getting close, I think we need to improve how we document the assumptions we'd have on the key-siblings joint iteration. Thanks!

```toml
version = "0.1.0"
edition = "2021"
authors = ["Lurk Lab Engineering <engineering@lurk-lab.com>"]
license = "MIT OR Apache-2.0"
```
Member:

Nit: It would be great to take this occasion to set up a workspace-wide license field here:
https://github.com/lurk-lab/bellpepper-gadgets/blob/main/Cargo.toml#L12-L14


```rust
let mut actual_root_hash = proof.leaf().hash().to_vec();

let key_iterator =
    conditional_reverse::<_, _, GD>(proof.leaf().key().iter().take(proof.siblings().len()));
```
Member:

If I have, for example:

  • proof.siblings().len() == 5,
  • proof.leaf().key().len() == 8, and
  • GD::is_little_endian() == true,

then the sequence of indexes of proof.leaf().key() that will be used to thread proof.siblings() is [4, 3, 2, 1, 0]. Is that what the user should expect, or should they expect [7, 6, 5, 4, 3] instead? Is the MSB of the key not at index 7 in LE order?

I think there are two ways to go about this:

  1. Explicitly state everywhere that key and gadget should have the same endianness, and in that case reverse first and .take(proof.siblings().len()) after. This is closer to the Aptos spec (note the mention of the MSB of the key there), but may be less simple for other Merkle algorithms.
  2. Not make any explicit relation between the gadget's endianness and the key's iteration order, and say that in a proof, key and siblings should be iterable in the same order, whatever it is: to the point that the algorithm should operate on the iterator formed by proof.leaf().key().iter().zip(proof.siblings().iter()) (after checking proof.leaf().key().len() > proof.siblings().len())

(I don't particularly care which of these two solutions we use, but I am mindful that we put the user in a situation to know precisely which approach we're enforcing.)
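The two orderings in question can be illustrated with plain indices standing in for key bits (a hypothetical sketch, not the PR's code):

```rust
// 8 key bit-positions, 5 siblings, as in the example above.
fn main() {
    let key: Vec<usize> = (0..8).collect();

    // `take` first, then reverse: indices [4, 3, 2, 1, 0].
    let take_then_rev: Vec<usize> = key.iter().take(5).rev().copied().collect();

    // Reverse first, then `take` (option 1): indices [7, 6, 5, 4, 3].
    let rev_then_take: Vec<usize> = key.iter().rev().take(5).copied().collect();

    assert_eq!(take_then_rev, vec![4, 3, 2, 1, 0]);
    assert_eq!(rev_then_take, vec![7, 6, 5, 4, 3]);
}
```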

Member Author:

I went with solution 2; I think it should suffice in our context.
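Solution 2 can be sketched out of circuit as follows. The names (`fold_root`) and the toy combine function are illustrative, not the PR's actual gadget code:

```rust
// Thread siblings with key bits by zipping the two iterators in whatever
// order the proof supplies them, after checking the lengths are compatible.
fn fold_root(leaf_hash: u64, key_bits: &[bool], siblings: &[u64]) -> Result<u64, &'static str> {
    if key_bits.len() < siblings.len() {
        return Err("key shorter than sibling path");
    }
    // Toy combine standing in for the hash gadget.
    let combine = |l: u64, r: u64| l.rotate_left(7) ^ r.wrapping_mul(31);
    Ok(key_bits
        .iter()
        .zip(siblings.iter())
        .fold(leaf_hash, |acc, (bit, sib)| {
            // Bit decides whether the accumulator is the left or right child.
            if *bit { combine(*sib, acc) } else { combine(acc, *sib) }
        }))
}

fn main() {
    assert!(fold_root(1, &[true, false, true], &[2, 3, 4]).is_ok());
    // Length check: a 1-bit key cannot thread 2 siblings.
    assert!(fold_root(1, &[true], &[2, 3]).is_err());
}
```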

```rust
    GD: GadgetDigest<E>,
>(
    iter: I,
) -> Box<dyn DoubleEndedIterator<Item = &'a Boolean> + 'a> {
```
Member:

It would be great if we could inline this and avoid the performance cost of Box<dyn MyType> for these few lines.
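One common way to drop the `Box<dyn …>` is a hand-rolled two-variant enum (the same idea as `either::Either`), which unifies the forward and reversed iterators without heap allocation or dynamic dispatch. A minimal sketch, using a plain slice where the real code iterates `&Boolean`s:

```rust
// Enum wrapper over the two iterator directions; static dispatch only.
enum EitherIter<A, B> {
    Fwd(A),
    Rev(B),
}

impl<A, B, T> Iterator for EitherIter<A, B>
where
    A: Iterator<Item = T>,
    B: Iterator<Item = T>,
{
    type Item = T;
    fn next(&mut self) -> Option<T> {
        match self {
            EitherIter::Fwd(a) => a.next(),
            EitherIter::Rev(b) => b.next(),
        }
    }
}

// Hypothetical stand-in for the PR's `conditional_reverse`, without boxing.
fn conditional_reverse<'a, T>(items: &'a [T], reverse: bool) -> impl Iterator<Item = &'a T> + 'a {
    if reverse {
        EitherIter::Rev(items.iter().rev())
    } else {
        EitherIter::Fwd(items.iter())
    }
}

fn main() {
    let fwd: Vec<&i32> = conditional_reverse(&[1, 2, 3], false).collect();
    let rev: Vec<&i32> = conditional_reverse(&[1, 2, 3], true).collect();
    assert_eq!(fwd, vec![&1, &2, &3]);
    assert_eq!(rev, vec![&3, &2, &1]);
}
```

(A `DoubleEndedIterator` impl can be added the same way if callers need it.)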

Comment on lines 97 to 121
```rust
for (i, (sibling_hash, bit)) in proof.siblings().iter().zip(key_iterator).enumerate() {
    if let Some(b) = bit.get_value() {
        if b {
            actual_root_hash = GD::digest(
                cs.namespace(|| format!("sibling {}", i)),
                &[sibling_hash.to_owned(), actual_root_hash].concat(),
            )?
        } else {
            actual_root_hash = GD::digest(
                cs.namespace(|| format!("sibling {}", i)),
                &[actual_root_hash, sibling_hash.to_owned()].concat(),
            )?
        }
    } else {
        return Err(SynthesisError::Unsatisfiable);
    }
}

hash_equality(cs, expected_root, actual_root_hash)
```
Member:

Aka, I believe:

```rust
for (i, (sibling_hash, bit)) in proof.siblings().iter().zip(key_iterator).enumerate() {
    let b = bit.get_value().ok_or(SynthesisError::Unsatisfiable)?;

    // Determine the order of hashing based on the bit value
    let hash_order = if b {
        vec![sibling_hash.to_owned(), actual_root_hash]
    } else {
        vec![actual_root_hash, sibling_hash.to_owned()]
    };

    // Compute the new hash
    actual_root_hash = GD::digest(cs.namespace(|| format!("sibling {}", i)), &hash_order.concat())?;
}

hash_equality(cs, expected_root, actual_root_hash)
```

On LICENSE-APACHE:
Member:

I think we don't need the License files replicated at the root of the repo if we copy them in each crate, which we have done so far.

I'd love to address those copies, but I'd want to do it in another PR, to keep this one focused on Merkle trees => I think this should not add any license files other than the ones in its own crate.

Member Author:

Alright, I'll remove the license files at the root.

huitseeker (Member) left a comment:

LGTM!!

huitseeker merged commit 5e93eca into lurk-lab:main on Jan 22, 2024
10 checks passed