Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add pathfinder_getProof #726

Merged
merged 54 commits into from
Dec 9, 2022

Conversation

pscott
Copy link
Contributor

@pscott pscott commented Nov 11, 2022

Implementation of #714

  • Adds a new endpoint (pathfinder_getStorageProofs)
  • Adds a new method get_proof / get_proofs for MerkleTree.
  • Adds verify_proof function in the test module, that can be used as a starting point for people wanting to implement their own verifier.

Closes #714

@Mirko-von-Leipzig
Copy link
Contributor

Mirko-von-Leipzig commented Nov 16, 2022

I've had a (non-comprehnsive) look at the code so far. I mainly focussed on the test code because

proofs::binary_root
proofs::double_binary

failed to pass. I'll look at rest of the code tomorrow. Also test code is the most important code :D

The tests failed due to bugs in the verify_proof code, so I suggest for now we focus on that for now.

If we consider what some external verify_proof should do. It has the following information:

  1. the tree root (known and trusted via some external source)
  2. the target key (known and trusted)
  3. a vector of nodes supposedly forming an acyclical graph, proving either that the target does not exist, or that the target exists and also its value.

To verify the graphness, we'll have to verify the hashes as we go along.

The current Node variants are a bit clumsy for this, because they also provide direct Rc links to other nodes and so on. Or put another way -- if we want to verify the proof from an external source, it cannot do this since we cannot form the Rc links etc.

So I would first suggest creating a new enum type (couldn't think of better names on the fly, feel free to experiment):

pub enum ProofNode {
  Edge(EdgeProofNode),
  Binary(BinaryProofNode),
}

pub struct BinaryProofNode {
  left: StarkHash,
  right: StarkHash,
}

pub struct EdgeProofNode {
  pub path: BitVec<Msb0, u8>,
  child: StarkHash
}

This will more accurately describe what you'll see. Note that a Leaf variant is not required. This is because the final non-leaf node will always contain the leaf's hash as a child -- and a leaf's value is its hash. So no leaf required since we have all the information there already (so long as we know we are hitting a leaf -- which we can if we track the height etc). You'll also need to add a hash() function to ProofNode, but you can just copy paste that from the original versions.

I would then suggest an algorithm as follows for verify_proof:

Track the path as we move through the vector of nodes -- this will allow us to check that we are okay and are following the target key's route. This also implicly will guard against a massive list of nodes in the proof.. Alternatively, nibble off the target's bits as we process through the vector (provided they match of course). I would suggest the nibble approach as it means less indexing troubles (always 0 based).

1. init expected_hash <- root hash
2. loop over nodes: current <- nodes[i]
   1. verify the current node's hash matches expected_hash (if not then we have a bad proof)
   2. move towards the target - if current is:
      1. binary node then choose the child that moves towards the target, else if
      2. edge node then check the path against the target bits
         1. If it matches then proceed with the child, else
         2. if it does not match then we now have a proof that the target does not exist
   3. nibble off target bits according to which child you got in (2). If all bits are gone then you have reached the target and the child hash is the value you wanted and the proof is complete.
   4. set expected_hash <- to the child hash

@pscott
Copy link
Contributor Author

pscott commented Nov 17, 2022

Hey @Mirko-von-Leipzig , thanks for your comment.
I've implemented your suggestion in my last commit (again, the name of the commit will be changed, and I'm using force pushes because I don't want to ruin the history later on).

Your algorithm is so much more cleaner and elegant than what I had come up with.
I still need to:

  • Address the todos left in comments
  • Add some additional tests (maybe even some that would be randomly generated).
  • Add the full RPC endpoint
  • Document code

I do already have a couple of questions though:

  1. The ProofNode enum and its associated BinaryProofNode / EdgeProofNode are declared in merkle_tree.rs, but only used in the proofs test module. This triggers a compilation warning for unused code. Should I be using [allow(dead_code] on every impl / variable, or should I ![allow(dead_code)] on the whole file, or are there other clever ways of doing this that I'm not aware of? :)
  2. In section 2.2.2 you mention if it does not match then we now have a proof that the target does not exist. Are you saying this is a proper non-inclusion proof (aka proof of non-membership), or are you just saying that this means the proof is incorrect?
  3. I'm using a fair bit of unwraps around the code. I'm not sure which ones we should keep and which ones we shouldn't?

@Mirko-von-Leipzig
Copy link
Contributor

Hey @Mirko-von-Leipzig , thanks for your comment. I've implemented your suggestion in my last commit (again, the name of the commit will be changed, and I'm using force pushes because I don't want to ruin the history later on).

I would suggest keeping the commits and just doing normal pushes. git comes with some powerful options to fix history after the fact. In other words: just keep adding commits now - do try make them sensible / understandable, so that way we as reviewers can properly track your changes. Right now we would have to re-read the code entirely and would have to try remember what the previous iteration did. And then once you're done we can work on getting the commits and the messages more sensible.

I do already have a couple of questions though:

  1. The ProofNode enum and its associated BinaryProofNode / EdgeProofNode are declared in merkle_tree.rs, but only used in the proofs test module. This triggers a compilation warning for unused code. Should I be using [allow(dead_code] on every impl / variable, or should I ![allow(dead_code)] on the whole file, or are there other clever ways of doing this that I'm not aware of? :)

You should move those definitions into the proofs test module. That way they only get compiled along with the proofs tests and you won't get those warnings.

  1. In section 2.2.2 you mention if it does not match then we now have a proof that the target does not exist. Are you saying this is a proper non-inclusion proof (aka proof of non-membership), or are you just saying that this means the proof is incorrect?

I think that its a proper non-inclusion proof. Since at that point the caller can determine that we:

  1. Correctly moved towards the target insofar as is possible, and
  2. hashing all the nodes along the path does result in the root hash, which means
  3. the target definitely does not exist in this tree
  1. I'm using a fair bit of unwraps around the code. I'm not sure which ones we should keep and which ones we shouldn't?

I'll have a look, but I would say just keep that as the last thing todo once you're sure everything is 100% correct and you're happy with the API. In general:

  1. Only allow unwrap if you can prove (logically) that it can never fail, or
  2. its some invariant that you have specified that the caller must uphold (but often this is a poor API, and should be a last resort).
  3. Everything else should map to an error.

@pscott
Copy link
Contributor Author

pscott commented Nov 18, 2022

You should move those definitions into the proofs test module. That way they only get compiled along with the proofs tests and you won't get those warnings.

So the ProofNode enum is used in the get_proof code, but the fields in the BinaryProofNode and EdgeProofNode are not. The same goes for their respective impl.

I've moved the impl in the proofs (test) mod, and added the #[allow(dead_code)] on the structs in 84000ac .

Regarding the traverse usage: I think it's a terrific idea! I've implemented this in 8b8f7db and 94d7922. There are a couple of things I'd like to point out though:

  1. I've added a Leaf struct that is checked at the end of verify_proof. We could avoid it by simply using take(len - 1) in get_proof. Don't know what's best: keeping this Leaf or removing it from the vec in get_proof.
  2. Regarding the inclusion proof: you are correct. I was missing the fact that we check that the hashes are equivalent BEFORE we check the path. Thanks for clarifying!

@Mirko-von-Leipzig
Copy link
Contributor

You should move those definitions into the proofs test module. That way they only get compiled along with the proofs tests and you won't get those warnings.

So the ProofNode enum is used in the get_proof code, but the fields in the BinaryProofNode and EdgeProofNode are not. The same goes for their respective impl.
I've moved the impl in the proofs (test) mod, and added the #[allow(dead_code)] on the structs in 84000ac .

Yes, thats true -- for now. At some point you'll have to use them in the RPC code to translate it from our pathfinder internal code (what you're busy with now) into json. At that point you'll be mapping from the these internal structs and their fields into json. But for now just slapping #[allow(dead_code)] to make the warnings shut up is fine 👍 or moving them into the test module is fine.

Regarding the traverse usage: I think it's a terrific idea! I've implemented this in 8b8f7db and 94d7922. There are a couple of things I'd like to point out though:

  1. I've added a Leaf struct that is checked at the end of verify_proof. We could avoid it by simply using take(len - 1) in get_proof. Don't know what's best: keeping this Leaf or removing it from the vec in get_proof.

I think it depends on what we want to return in the RPC.. its technically redudant information but if we include it, then we have to add a leaf type to the RPC enum. And callers still have to verify that the full path is correct - so I don't think there is any value in having it.

I would remove it - however you can't simply [len-1] it. Because if its non-inclusion proof then the final element will not be a leaf, but an edge node.

A nice functional way to remove it nicely (i.e. without raw indices):

if matches!(nodes.last(), Some(Node::Leaf(_))) {
  nodes.pop();
}
  1. Regarding the inclusion proof: you are correct. I was missing the fact that we check that the hashes are equivalent BEFORE we check the path. Thanks for clarifying!

💪 Probably means that its worth adding a comment explaining this for the next poor soul that has to try understand the intent of it all :)

And thanks for keeping the commits - as you've noticed it means you can even point me at which ones do what. Makes discussions simpler etc.

Comment on lines 33 to 58
match context
.pending_data
.ok_or_else(|| anyhow!("Pending data not supported in this configuration"))?
.state_update()
.await
{
Some(update) => {
let pending_value = update
.state_diff
.storage_diffs
.get(&input.contract_address)
.and_then(|storage| {
storage.iter().find_map(|update| {
(update.key == input.key).then_some(update.value)
})
});

// match pending_value {
// Some(value) => return Ok(value),
// None => StarknetBlocksBlockId::Latest,
// }
StarknetBlocksBlockId::Latest
}
None => StarknetBlocksBlockId::Latest,
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find a way to fix that :)

Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looking very good! Thanks for all your effort on this!

crates/pathfinder/src/state/merkle_tree.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/state/merkle_tree.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/state/merkle_tree.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/state/merkle_tree.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/state/merkle_tree.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/rpc/v02/method/get_proof.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/rpc/v02/method/get_proof.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/rpc/v02/method/get_proof.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/rpc/v02/method/get_proof.rs Outdated Show resolved Hide resolved
crates/pathfinder/src/rpc/v02/method/get_proof.rs Outdated Show resolved Hide resolved
pscott and others added 2 commits November 23, 2022 23:20
Co-authored-by: Mirko von Leipzig <48352201+Mirko-von-Leipzig@users.noreply.github.com>
Co-authored-by: Mirko von Leipzig <48352201+Mirko-von-Leipzig@users.noreply.github.com>
@CHr15F0x
Copy link
Member

CHr15F0x commented Dec 7, 2022

I'm yet to go through the tests, but overall everything looks good

Co-authored-by: CHr15F0x <42979253+CHr15F0x@users.noreply.github.com>
pscott and others added 4 commits December 8, 2022 12:31
Co-authored-by: CHr15F0x <42979253+CHr15F0x@users.noreply.github.com>
Co-authored-by: CHr15F0x <42979253+CHr15F0x@users.noreply.github.com>
@CHr15F0x
Copy link
Member

CHr15F0x commented Dec 8, 2022

I've just gone through the tests, I really like it. And I do appreciate the ascii art 👍 which always helps with understanding the concept of this tree when going through examples, which these tests in fact are.

@pscott
Copy link
Contributor Author

pscott commented Dec 8, 2022

@CHr15F0x Thanks for your review!

Addressed your comments (and replied with commit hashes for easy tracking :)).

Also created two tracking issues so we don't forget about it! #779 #778

Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - we'll need to do some follow-ups on the issues you've created to complete this entirely, but should be fairly quick.

Thanks again; really appreciate all the effort you've put in here.

Copy link
Member

@CHr15F0x CHr15F0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
Thanks for the effort!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add pathfinder_getProof RPC endpoint
4 participants