Apex merkle commitment #562

Merged 1 commit on Apr 16, 2019
Conversation

@porcuquine (Collaborator) commented Mar 18, 2019

This PR implements a component we can use to avoid rehashing the top (apex) part of a merkle tree repeatedly when performing many inclusion proofs. I will write up the larger plan (which @nicola and I have been discussing in detail since Friday) separately.

I'm submitting this PR now because what it contains is independently meaningful and can be reviewed on its own, even though it is not yet used in our proofs. We can iterate on the interface, etc. if necessary when we get to the full optimization.

The idea here is that we can create a vector commitment to a power-of-2 number of Fr values and hash them together to form a single root more cheaply than by creating a complete binary merkle tree. (If/when we have more generators, we can do this in fewer Pedersen hashes. For now, we still have to use Merkle-Damgård compression.)
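(Purely as an illustration, not code from this PR: one way to picture that Merkle-Damgård-style chaining. The fold_root helper is hypothetical, and the compress closure is a placeholder for the Pedersen hash used in this codebase.)

/// Illustrative sketch only: fold a power-of-2 number of committed values into a
/// single root by Merkle-Damgard-style chaining. `compress` stands in for a real
/// 2-to-1 compression function (Pedersen hashing here).
fn fold_root<T: Copy>(values: &[T], compress: impl Fn(T, T) -> T) -> T {
    assert!(values.len().is_power_of_two());
    // Chain left to right: root = H(...H(H(v0, v1), v2)..., v_{n-1}),
    // which costs (number of values - 1) compressions.
    values[1..].iter().fold(values[0], |acc, v| compress(acc, *v))
}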

If we implemented an equivalent vector commitment using a binary merkle tree with height L and size S (number of committed values = 2^L - 1), we would need S - 1 hashes outside the circuit to construct it, and L - 1 hashes inside the circuit to prove inclusion.

Instead, we perform S - 1 hashes outside the circuit and the same S - 1 hashes once inside the circuit (shared across all inclusion proofs), with an additional (2^L + L) - 1 constraints per inclusion proof. Since each Pedersen hash uses more than 1,000 constraints, this is a savings for L up to 10.
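(A rough, illustrative tabulation of those numbers, not code from this PR, assuming the ~1152-constraints-per-Pedersen-hash figure quoted later in the review. For each L it prints the per-proof selection cost (2^L + L) - 1, the cost of the L path hashes that selection replaces, and 2^L, roughly the cost of the widest apex row, which is the quantity the review comment below weighs against the cost of one hash.)

// Illustrative arithmetic only, assuming ~1152 constraints per 32-byte Pedersen hash.
const PEDERSEN_CONSTRAINTS: u64 = 1152;

fn main() {
    for l in 1u32..=12 {
        let apex_per_proof = (1u64 << l) + l as u64 - 1; // (2^L + L) - 1 selection constraints
        let path_per_proof = l as u64 * PEDERSEN_CONSTRAINTS; // L path hashes replaced
        let widest_row = 1u64 << l; // rough cost of the widest apex row, vs. one hash
        println!(
            "L = {:2}: apex {:>5} vs path {:>5} constraints; widest row {:>5} (one hash ~ {})",
            l, apex_per_proof, path_per_proof, widest_row, PEDERSEN_CONSTRAINTS
        );
    }
}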

In order to make this work, we use num::AllocatedNum::conditionally_reverse, which we repurpose to select one of two alternative allocated numbers at the cost of only two constraints.
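(A minimal sketch of that trick, assuming the sapling-crypto / bellman gadget API of that era; pick_num is a hypothetical helper name, not necessarily what the PR itself defines.)

use bellman::{ConstraintSystem, SynthesisError};
use pairing::Engine;
use sapling_crypto::circuit::boolean::Boolean;
use sapling_crypto::circuit::num::AllocatedNum;

/// Hypothetical helper: select `b` when `condition` is true, otherwise `a`,
/// by reusing `conditionally_reverse` and keeping only its first output.
/// The swap itself costs two constraints; nothing extra is added here.
fn pick_num<E: Engine, CS: ConstraintSystem<E>>(
    mut cs: CS,
    condition: &Boolean,
    a: &AllocatedNum<E>,
    b: &AllocatedNum<E>,
) -> Result<AllocatedNum<E>, SynthesisError> {
    // `conditionally_reverse` returns (a, b) when the condition is false
    // and (b, a) when it is true, so the first element is the selected value.
    let (selected, _other) =
        AllocatedNum::conditionally_reverse(cs.namespace(|| "pick"), a, b, condition)?;
    Ok(selected)
}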

I ended up implementing this in two different ways in order to verify the approach. FlatApexCommitment will be more efficient at synthesis time than BinaryApexCommitment. It could be refined further if need be.

If anyone knows of a way to verify positional inclusion in fewer constraints, we could also use that. For now, this stands as the best candidate.

See more design detail in #563.

@nicola (Contributor) commented Mar 18, 2019

Can you attach a very brief spec/intuition doc of what is happening?

porcuquine mentioned this pull request on Mar 18, 2019
@porcuquine (Collaborator, Author)

@nicola I wrote something very brief in #563. See if that helps you get the idea.

@porcuquine (Collaborator, Author)

@nicola @dignifiedquire

It seems that we will be able to use the fantastic optimization of challenge bucketing, with each apex leaf representing a bucket. This has some wrinkles to sort out, but it also means we will not need this code now.

However, since this is a standalone implementation and has some value to our proving toolkit, even if only informational, I'd like to merge it so it's not lost.

@porcuquine (Collaborator, Author)

Unfortunately, it seems unlikely that we can make challenge bucketing work for most inclusion proofs, since parents (which represent most inclusion proofs) are random by their nature. We could still eliminate the need for this code for data and replica inclusion proofs, but some variation would be needed for the parents.

Also note that a distinct apex is required for each merkle tree. Since committing to the apex is somewhat expensive, we should avoid repeating this unnecessarily in each partition. Although we cannot get rid of the need to commit to each apex once, we can reduce the cost of duplication by segregating partitions by layer. For example, the last (largest, with tapering) layer might have its own partition.

This would require distinct circuits for each such group of layers, but would avoid the overhead of paying setup costs for every layer in every partition.

dignifiedquire previously approved these changes Apr 16, 2019

@dignifiedquire (Contributor) left a comment


one small change requested; other than that, meeeerge it

* This sets an upper bound on the size of the apex. When the cost of including another row in the apex
* exceeds the cost (in constraints) of one hash, there's no potential savings. (Reference: 1 32-byte pedersen hash
* requires ~1152 constraints).
*/

please use module level comments //!
