New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to a constant-space Merkle root/branch algorithm. #6508

Merged
merged 2 commits into from Nov 28, 2015

Conversation

Projects
None yet
8 participants
@sipa
Member

sipa commented Aug 3, 2015

This switches the Merkle tree logic for blocks to one that runs in logspace (effectively constant space, with a limit of 2^32 leaves). The old code is moved to tests, and a new test is added that for various combinations of block sizes, transaction positions to compute a branch for, and mutations:

  • Verifies that the old code and new code agree for the Merkle root.
  • Verifies that the old code and new code agree for the Merkle branch.
  • Verifies that the computed Merkle branch is valid.
  • Verifies that mutations don't change the Merkle root.
  • Verifies that mutations are correctly detected.

Advantages:

  • Removes the need for vMerkleTree (a validation-related data structure) from CBlock (a primitive data structure).
  • Is slightly faster due to better memory locality and avoiding all heap overhead.

Disadvantages:

  • Requires recomputing the Merkle tree any time a Merkle branch is needed. The wallet depends on this any time a transaction is added. This is not technically necessary, however.
  • Longer and harder to read code.

The code remains in primitives/block for now. This is not optimal, as one of the purposes is disentangling the Merkle calculation logic from the primitive data structures. TBD.

@dgenr8

This comment has been minimized.

Show comment
Hide comment
@dgenr8

dgenr8 Aug 3, 2015

Contributor

Great improvement over current space requirement which is O(n), yes?

Contributor

dgenr8 commented Aug 3, 2015

Great improvement over current space requirement which is O(n), yes?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 3, 2015

Member

@dgenr8 Yes, but the current code requires around 96 bytes per transaction, which is insignificant compared to the ~1 kB each transaction already requires in memory on average. Still, this PR improves it to ~1 kB total (but more importantly: entirely continuous and on the stack).

Member

sipa commented Aug 3, 2015

@dgenr8 Yes, but the current code requires around 96 bytes per transaction, which is insignificant compared to the ~1 kB each transaction already requires in memory on average. Still, this PR improves it to ~1 kB total (but more importantly: entirely continuous and on the stack).

@laanwj laanwj added the Validation label Aug 5, 2015

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Aug 5, 2015

Member

Concept ACK.
Needs to be reviewed very carefully as to not change the behavior in subtle edge cases.

Member

laanwj commented Aug 5, 2015

Concept ACK.
Needs to be reviewed very carefully as to not change the behavior in subtle edge cases.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 5, 2015

Member

Of course.

Note that the unit test creates merkle trees of all sizes up to 16, and for
each all known malleabilities and all branches.

On the other side, the performance and memory usage benefit are not
significant, but getting rid of vMerkleTree in CBlock may be worth it.

Member

sipa commented Aug 5, 2015

Of course.

Note that the unit test creates merkle trees of all sizes up to 16, and for
each all known malleabilities and all branches.

On the other side, the performance and memory usage benefit are not
significant, but getting rid of vMerkleTree in CBlock may be worth it.

@cozz

This comment has been minimized.

Show comment
Hide comment
@cozz

cozz Aug 8, 2015

Contributor

Does this mean that all transactions in a block have to be hashed together again, every time a transaction gets added to the wallet? So if there are 10 transactions in a block for the wallet, we hash it again 10 times?

Contributor

cozz commented Aug 8, 2015

Does this mean that all transactions in a block have to be hashed together again, every time a transaction gets added to the wallet? So if there are 10 transactions in a block for the wallet, we hash it again 10 times?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 8, 2015

Member
Member

sipa commented Aug 8, 2015

@cozz

This comment has been minimized.

Show comment
Hide comment
@cozz

cozz Aug 8, 2015

Contributor

Yes, I wouldnt store them either. But as long as we do, I do not like, that this pull decreases wallet performance. So I would like to see that fixed. Either by some temp-cache for the wallet, or just by removing those merkle-branches.

Contributor

cozz commented Aug 8, 2015

Yes, I wouldnt store them either. But as long as we do, I do not like, that this pull decreases wallet performance. So I would like to see that fixed. Either by some temp-cache for the wallet, or just by removing those merkle-branches.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 11, 2015

Member
Member

sipa commented Aug 11, 2015

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 11, 2015

Member

Rebased on top of #6550, to address the issue that @cozz raised.

Member

sipa commented Aug 11, 2015

Rebased on top of #6550, to address the issue that @cozz raised.

@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens Aug 12, 2015

Contributor

utACK

Diff comparison for those who want to check without including the rebased commits.

Contributor

dcousens commented Aug 12, 2015

utACK

Diff comparison for those who want to check without including the rebased commits.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Oct 6, 2015

Member

Needs rebase (after #6550, I think)

Member

laanwj commented Oct 6, 2015

Needs rebase (after #6550, I think)

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Nov 2, 2015

Member

Rebased.

Member

sipa commented Nov 2, 2015

Rebased.

@gmaxwell gmaxwell added this to the 0.12.0 milestone Nov 11, 2015

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Nov 17, 2015

Member

Moved the Merkle computation code to a separate file (independent of CBlock and CTransaction), and used that in a thin wrapper caller in primitives/block.h.

Member

sipa commented Nov 17, 2015

Moved the Merkle computation code to a separate file (independent of CBlock and CTransaction), and used that in a thin wrapper caller in primitives/block.h.

@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens Nov 18, 2015

Contributor

re-ACK

Also, awesome on moving merkle.cpp out 👍

Contributor

dcousens commented Nov 18, 2015

re-ACK

Also, awesome on moving merkle.cpp out 👍

@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon Nov 23, 2015

Member

If we make it consensus/merkle.o, that would be less code to move again later...

Member

jtimon commented Nov 23, 2015

If we make it consensus/merkle.o, that would be less code to move again later...

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Nov 26, 2015

Member

@jtimon Hmm, I was seeing this more as generic data structure code, like limitedmap.h is for example, but it's probably not useful to anything that doesn't also need the consensus logic anyway.

Member

sipa commented Nov 26, 2015

@jtimon Hmm, I was seeing this more as generic data structure code, like limitedmap.h is for example, but it's probably not useful to anything that doesn't also need the consensus logic anyway.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Nov 26, 2015

Member

Agree that Merkle trees are a general data structure - however, the specific way of doing a Merkle-tree is very bitcoin-consensus-specific, and there is the big warning that this is not recommended to use this as-is outside of bitcoin.

Member

laanwj commented Nov 26, 2015

Agree that Merkle trees are a general data structure - however, the specific way of doing a Merkle-tree is very bitcoin-consensus-specific, and there is the big warning that this is not recommended to use this as-is outside of bitcoin.

@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon Nov 26, 2015

Member

General data structure or not, this is used in checkBlock and will therefore be part of VerifyBlock and the complete libconsensus too.
Once it is complete, if we want to separate it as a subtree with its own repo, we will have to put all the files in the same dir (ie src/consensus) first.

Member

jtimon commented Nov 26, 2015

General data structure or not, this is used in checkBlock and will therefore be part of VerifyBlock and the complete libconsensus too.
Once it is complete, if we want to separate it as a subtree with its own repo, we will have to put all the files in the same dir (ie src/consensus) first.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Nov 26, 2015

Member

@jtimon I agree

Member

laanwj commented Nov 26, 2015

@jtimon I agree

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Nov 26, 2015

Member

Done.

I had to move more things around, because otherwise primitives and consensus would end up with a circular dependency on each other. Now all logic (both generic Merkle logic, and the block-specific code) are in consensus, which is better as none of it really belonged in primitives anyway.

Member

sipa commented Nov 26, 2015

Done.

I had to move more things around, because otherwise primitives and consensus would end up with a circular dependency on each other. Now all logic (both generic Merkle logic, and the block-specific code) are in consensus, which is better as none of it really belonged in primitives anyway.

@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon Nov 26, 2015

Member

Thank you.
I wouldn't worry about consensus/merkle depending on primitives/transaction while primitives/block depends on consensus/merkle. primitives is not a building package and they all three belong in the consensus building package (see WIP #7091) and eventually they should all be in the consensus dir anyway.
A circular dependency between consensus/merkle and primitives/block is another thing though.
I can't remember how the includes were before, but they look good to me now.

Member

jtimon commented Nov 26, 2015

Thank you.
I wouldn't worry about consensus/merkle depending on primitives/transaction while primitives/block depends on consensus/merkle. primitives is not a building package and they all three belong in the consensus building package (see WIP #7091) and eventually they should all be in the consensus dir anyway.
A circular dependency between consensus/merkle and primitives/block is another thing though.
I can't remember how the includes were before, but they look good to me now.

@jtimon

View changes

Show outdated Hide outdated src/test/main_tests.cpp
@@ -73,4 +74,127 @@ BOOST_AUTO_TEST_CASE(test_combiner_all)
BOOST_CHECK(Test());
}
// Older version of the merkle root computation code, for comparison.

This comment has been minimized.

@jtimon

jtimon Nov 26, 2015

Member

minor nit, is main_tests.cpp really the right place for this?

@jtimon

jtimon Nov 26, 2015

Member

minor nit, is main_tests.cpp really the right place for this?

This comment has been minimized.

@sipa

sipa Nov 27, 2015

Member

Have a better suggestion? :)

@sipa

sipa Nov 27, 2015

Member

Have a better suggestion? :)

This comment has been minimized.

@jtimon

jtimon Nov 27, 2015

Member

I don't know, checkblock_tests.cpp? maybe its own merkle_test.cpp ?
The only use in main.cpp is in CheckBlock(), which should eventually move to the consensus dir...

@jtimon

jtimon Nov 27, 2015

Member

I don't know, checkblock_tests.cpp? maybe its own merkle_test.cpp ?
The only use in main.cpp is in CheckBlock(), which should eventually move to the consensus dir...

This comment has been minimized.

@sipa

sipa Nov 27, 2015

Member

Moved!

@sipa

sipa Nov 27, 2015

Member

Moved!

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Nov 27, 2015

Member

utACK

Member

laanwj commented Nov 27, 2015

utACK

sipa added some commits Nov 17, 2015

Switch blocks to a constant-space Merkle root/branch algorithm.
This switches the Merkle tree logic for blocks to one that runs in constant (small) space.
The old code is moved to tests, and a new test is added that for various combinations of
block sizes, transaction positions to compute a branch for, and mutations:
 * Verifies that the old code and new code agree for the Merkle root.
 * Verifies that the old code and new code agree for the Merkle branch.
 * Verifies that the computed Merkle branch is valid.
 * Verifies that mutations don't change the Merkle root.
 * Verifies that mutations are correctly detected.
@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon Nov 27, 2015

Member

Oh, thought I had done it already...concept ACK

Member

jtimon commented Nov 27, 2015

Oh, thought I had done it already...concept ACK

@sipa sipa merged commit eece63f into bitcoin:master Nov 28, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

sipa added a commit that referenced this pull request Nov 28, 2015

Merge pull request #6508
eece63f Switch blocks to a constant-space Merkle root/branch algorithm. (Pieter Wuille)
ee60e56 Add merkle.{h,cpp}, generic merkle root/branch algorithm (Pieter Wuille)
@jgarzik

This comment has been minimized.

Show comment
Hide comment
@jgarzik

jgarzik Nov 28, 2015

Contributor

utACK

Contributor

jgarzik commented Nov 28, 2015

utACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment