Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computation of Merkle root in Solidity #1401

Closed
wants to merge 1 commit into from

Conversation

dwardu
Copy link
Contributor

@dwardu dwardu commented Oct 9, 2018

The current MerkleProof library, introduced in #260, allows a Merkle proof for a leaf hash to be verified against a pre-computed Merkle root that typically would be stored in the contract. Applications of the current MerkleProof library are described in #704 and #715.

The existent library does not contain a function to calculate a Merkle root in Solidity, as it is presumed that the Merkle root will be computed off-chain and only stored on-chain. But there is a valid use case for computing the Merkle-root on-chain. If we take e.g. OpenZeppelin Solidity’s own Escrow contract, (and we first add a function bulkDeposit(address[] payees, uint256[] values) so that a number of n deposits can be made in bulk so as to make the gas per deposit cheaper), by using MerkleTrees.computeRoot() it is possible to cut down gas consumption of the deposit to 20% of what would be consumed by implementing bulkDeposit using mapping entries (I will be publishing a gas-usage analysis later.) By being able to compute the Merkle root on-chain, it is possible to validate the leaf data blocks against a set of rules before they are combined into a Merkle root (and lost forever.) As an example, we might want to allow a bulk ETH deposit only if the individual deposit amounts add up to msg.value.

This PR is not final, as I am not certain on which level of abstraction to settle. What I have done is to publish the first commit with an arbitrary implementation, so that it may serve as a basis for discussion. The goal is that as a result of the discussion on this PR, the code is slimmed down and tidied up. In addition to the code pushed until now, I have also refactored the Merkle Tree JS library, as well as written several JavaScript tests for the new functionality, but I am not pushing them for now, and will add them to this PR when we have decided on the form of the library and I have refactored them to match.

Here is an overview of the changes committed up to now:

  • MerkleTrees library contains a generic Merkle-tree implementation, that may be configured with aribitrary leaf/node-pair hash functions. It contains two primary functions: computeRoot and verifyProof. Computing a root is split into 3 stages: 1. Allocating a Tree structure, 2. setting the leaves, and 3. computing the root. This multi-step process could have been avoided by providing a single computeRoot(bytes[] unhashedLeafDataBlocks) returns (bytes32) function, but then the user would have to allocate a array of unhashed leaf data blocks — by splitting into 3 steps, this (potentially costly) allocation is avoided.
  • LegacyMerkleTrees library contains a Merkle tree implementation that uses plain keccak256 for both leaves and nodes. The difference between this and the existing MerkleProof.verify() implementation is that the existing implementation accepts an already-hashed leaf node. But a quick search in published contracts in Etherscan shows that the leaf-hash function is almost always plain keccak256, so in the LegacyMerkleTrees library we use that.
  • MerkleProof library behaviour is unchanged, except that it now accepts the leaf data directly instead of the leaf hash (and mock/tests have been slightly modified to reflect this change). Otherwise it keeps the same interface and tests, but internally it forwards calls to LegacyMerkleTrees library.
  • SecondPreimageResistantMerkleTrees contains a Merkle-tree implementation that uses different hash functions for leaves and node-pairs, as recommended here.

For a concrete example of how to compute a Merkle root, kindly read the natspec docs and refer to TestSecondPreimageResistantMerkleTrees.sol#testComputeRootOfTreeOfSize.

I believe the core implementation of the algorithm is pretty solid. What I would like to discuss most are things like:

  • Should the library provide a generic implementation that does not specify the hashing functions (as in MerkleTrees in the current commit), and stop there?
  • Should it provide a generic implementation, and then provide specific implementations (as in LegacyMerkleTrees and SecondPreimageResistantMerkleTrees in the current commit)?
  • Or should it just provide a single implementation with pre-set hash functions (e.g. SecondPreimageResistantMerkleTrees)? (Code would be much simpler like this)
  • Should the hash functions simply be passed as parameters to createTree and verifyProof, or should the current object-oriented design be kept?
  • Should the library be split into different Solidity files or not?
  • What level of backward compatibility with the existent library should be provided?

Don’t worry about the long filenames, they’re temporary (I hope.)

I eagerly await your feedback!

@nventuro nventuro self-assigned this Oct 10, 2018
@nventuro nventuro added feature New contracts, functions, or helpers. contracts Smart contract code. labels Oct 10, 2018
@nventuro
Copy link
Contributor

Hey there @dwardu, thanks for the thorough explanation! It is always a pleasure to review these well thought-out sort of PRs.

Unfortunately, we're sort of swamped right now due to the imminent 2.0 release, Devcon, and related activities, making it very difficult for us to go through the required design and review process. Would you be OK with us putting this on hold until November, when things will have quieted down significantly?

@dwardu
Copy link
Contributor Author

dwardu commented Oct 10, 2018

Hey @nventuro, of course, first things first, and thanks for letting me know!

@nventuro nventuro added the on hold Put on hold for some reason that must be specified in a comment. label Oct 10, 2018
…n function

* MerkleTrees library contains a generic Merkle-tree implementation, that may be configured with aribitrary leaf/node-pair hash functions
* LegacyMerkleTrees library contains a Merkle tree implementation that uses plain keccak256 for both leaves and nodes.
* MerkleProof library behaviour is unchanged, except that it now accepts the leaf data directly instead of the leaf hash. Otherwise it keeps the same interface and tests, but internally it forwards calls to LegacyMerkleTrees library
* SecondPreimageResistantMerkleTrees contains a Merkle-tree implementation that uses different hash functions for leaves and node-pairs
@nventuro nventuro removed the on hold Put on hold for some reason that must be specified in a comment. label Feb 28, 2019
@nventuro
Copy link
Contributor

We may want to revisit this sort of thing soon since there are many applications that require said computation - tagging for the next release cycle for us to evaluate inclusion.

@nventuro nventuro added this to the v2.3 milestone Feb 28, 2019
@dwardu
Copy link
Contributor Author

dwardu commented Mar 14, 2019

Hey @nventuro, that’s great! I will be able to focus on it in ~10 days time.

@frangio frangio added the needs milestone Interesting features or improvements that are not yet assigned to a milestone. label Apr 8, 2019
@frangio frangio removed this from the v2.3 milestone Apr 8, 2019
@stale
Copy link

stale bot commented Apr 23, 2019

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added the stale label Apr 23, 2019
@stale
Copy link

stale bot commented May 8, 2019

Hi folks!
This Pull Request is being closed as there was no response to the previous prompt. However, please leave a comment whenever you're ready to resume, so it can be reopened.
Thanks again!

@stale stale bot closed this May 8, 2019
@dwardu
Copy link
Contributor Author

dwardu commented Jun 21, 2019

Hi @nventuro, I have brought the changes up to date.

My primary query regarding this PR is whether the library should offer a generic MerkleTree framework that can be extended easily to accommodate different hashing functions (as it is now in the pushed code), at the expense of having more complex code; or else settle on a specific MerkleTree implementation, with specific hashing functions (as in SecondPreimageResistantMerkleTrees), thus simplifying things.

What might assist making this decision is if you could kindly share with me some links to the applications that require this computation that you have mentioned above in your comment. Thanks!

P.S. I can see that there are no .sol tests in the project, apart from the one I introduced now, all tests are JS. Is there a specific reason for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contracts Smart contract code. feature New contracts, functions, or helpers. needs milestone Interesting features or improvements that are not yet assigned to a milestone. stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants