Do not store Merkle branches in the wallet. #6550

Merged
merged 2 commits into from Sep 23, 2015

Conversation

Projects
None yet
7 participants
@sipa
Member

sipa commented Aug 11, 2015

Assume that when a wallet transaction has a valid block hash and transaction position
in it, the transaction is actually there. We're already trusting wallet data in a
much more fundamental way anyway.

To prevent backward compatibility issues, a new record is used for storing the
block locator in the wallet. Old wallets will see a wallet file synchronized up
to the genesis block, and rescan automatically.

Fixes #6536 using a suggestion by @jonasschnelli to retain backward compatibility.

@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens Aug 12, 2015

Contributor

utACK

Contributor

dcousens commented Aug 12, 2015

utACK

@@ -15,7 +15,7 @@ uint256 CBlockHeader::GetHash() const
return SerializeHash(*this);
}
-uint256 CBlock::BuildMerkleTree(bool* fMutated) const
+uint256 CBlock::ComputeMerkleRoot(bool* fMutated) const

This comment has been minimized.

@dcousens

dcousens Aug 12, 2015

Contributor

Unrelated to this PR, but is there any reason we prefer the optional arguments compared to a tuple return value?

@dcousens

dcousens Aug 12, 2015

Contributor

Unrelated to this PR, but is there any reason we prefer the optional arguments compared to a tuple return value?

This comment has been minimized.

@laanwj

laanwj Aug 12, 2015

Member

I'm not sure this is so much "preferred", it's how it happens to be written between two more-or-less equivalent ways to formulate it.

@laanwj

laanwj Aug 12, 2015

Member

I'm not sure this is so much "preferred", it's how it happens to be written between two more-or-less equivalent ways to formulate it.

This comment has been minimized.

@sipa

sipa Aug 12, 2015

Member

You're relying on an optional optimization allowed by the C++ standard (copy elision) to avoid constructing a tuple as a temporary and copying it to the return location. In C++11 it's better because you can explicitly indicate move semantics.

@sipa

sipa Aug 12, 2015

Member

You're relying on an optional optimization allowed by the C++ standard (copy elision) to avoid constructing a tuple as a temporary and copying it to the return location. In C++11 it's better because you can explicitly indicate move semantics.

@dcousens

View changes

src/wallet/wallet.cpp
@@ -2841,11 +2836,9 @@ int CMerkleTx::GetDepthInMainChainINTERNAL(const CBlockIndex* &pindexRet) const
return 0;
// Make sure the merkle branch connects to this block
- if (!fMerkleVerified)
+ if (nIndex == -1)

This comment has been minimized.

@dcousens

dcousens Aug 12, 2015

Contributor

Why is this added? Its already done above in:

 if (hashBlock.IsNull() || nIndex == -1)
         return 0;

Or is nIndex somehow modified elsewhere?

@dcousens

dcousens Aug 12, 2015

Contributor

Why is this added? Its already done above in:

 if (hashBlock.IsNull() || nIndex == -1)
         return 0;

Or is nIndex somehow modified elsewhere?

This comment has been minimized.

@jonasschnelli

jonasschnelli Aug 12, 2015

Member

Yes. I think this check can be removed completely.

@jonasschnelli

jonasschnelli Aug 12, 2015

Member

Yes. I think this check can be removed completely.

This comment has been minimized.

@sipa

sipa Aug 12, 2015

Member
@sipa

sipa via email Aug 12, 2015

Member

This comment has been minimized.

@sipa

sipa Sep 7, 2015

Member

Fixed.

@sipa

sipa Sep 7, 2015

Member

Fixed.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Aug 12, 2015

Member

Concept ACK.

Member

laanwj commented Aug 12, 2015

Concept ACK.

@laanwj laanwj added the Wallet label Aug 12, 2015

@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon Aug 12, 2015

Member

Concept ACK

Member

jtimon commented Aug 12, 2015

Concept ACK

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 12, 2015

Member

This needs testing :)

Member

sipa commented Aug 12, 2015

This needs testing :)

@jonasschnelli

This comment has been minimized.

Show comment
Hide comment
@jonasschnelli

jonasschnelli Aug 12, 2015

Member

Concept ACK. Started testing...

Member

jonasschnelli commented Aug 12, 2015

Concept ACK. Started testing...

@jonasschnelli

This comment has been minimized.

Show comment
Hide comment
@jonasschnelli

jonasschnelli Aug 12, 2015

Member

Tested ACK (lldb stepped; update of "old" wallet.dat works). Played around with some huge regtest wallets/chains. Moved wallet.dat back and forth from master bitcoin-core to master+this PR bitcoin-core.

I think this is a step forward. IMO it's totally sufficient to check if a wtx has a valid block in the active chain and the wtx transaction index matches the position in the block.

Next step could be to simplify the height calculation.
GetDepthInMainChainINTERNAL() is very ineffective and holds cs_main because it accesses mapBlockIndex. Calculating balance, UI listing of transactions will call GetDepthInMainChainINTERNAL() for each wallet transaction (! and it even gets called when the credit of a wtx is cached).

I think a height cache would perform better (together with a invalidating option down to a certain height in case of a reorg).

Member

jonasschnelli commented Aug 12, 2015

Tested ACK (lldb stepped; update of "old" wallet.dat works). Played around with some huge regtest wallets/chains. Moved wallet.dat back and forth from master bitcoin-core to master+this PR bitcoin-core.

I think this is a step forward. IMO it's totally sufficient to check if a wtx has a valid block in the active chain and the wtx transaction index matches the position in the block.

Next step could be to simplify the height calculation.
GetDepthInMainChainINTERNAL() is very ineffective and holds cs_main because it accesses mapBlockIndex. Calculating balance, UI listing of transactions will call GetDepthInMainChainINTERNAL() for each wallet transaction (! and it even gets called when the credit of a wtx is cached).

I think a height cache would perform better (together with a invalidating option down to a certain height in case of a reorg).

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 12, 2015

Member

Thanks for testing!

@jonasschnelli No need for a height cache, just a cached chainActive.Tip() inside the wallet suffices (you can do ancestor-of checks without cs_main). I can do that, but it's not for this PR.

Member

sipa commented Aug 12, 2015

Thanks for testing!

@jonasschnelli No need for a height cache, just a cached chainActive.Tip() inside the wallet suffices (you can do ancestor-of checks without cs_main). I can do that, but it's not for this PR.

@theuni

This comment has been minimized.

Show comment
Hide comment
@theuni

theuni Aug 12, 2015

Member

This will cause the merkle tree to be calculated at least 3x per accepted block, as far as I can see:
ProcessNewBlock -> CheckBlock
ProcessNewBlock -> AcceptBlock -> CheckBlock
ProcessNewBlock -> ActivateBestChain -> ... ConnectBlock -> CheckBlock

Is the calculation time significant enough to try to skip some of that?

Member

theuni commented Aug 12, 2015

This will cause the merkle tree to be calculated at least 3x per accepted block, as far as I can see:
ProcessNewBlock -> CheckBlock
ProcessNewBlock -> AcceptBlock -> CheckBlock
ProcessNewBlock -> ActivateBestChain -> ... ConnectBlock -> CheckBlock

Is the calculation time significant enough to try to skip some of that?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 12, 2015

Member

@theuni Ugh, that's bad. No, it won't make up for that. We should avoid doing those duplicated checks.

Member

sipa commented Aug 12, 2015

@theuni Ugh, that's bad. No, it won't make up for that. We should avoid doing those duplicated checks.

@theuni

This comment has been minimized.

Show comment
Hide comment
@theuni

theuni Aug 12, 2015

Member

@sipa If I'm reading correctly, it looks like the CheckBlock() in AcceptBlock() is safe to use !fCheckMerkleRoot since it's only ever called from ProcessNewBlock() where it's already been checked.

Same for the ConnectBlock() in ConnectTip(), since they're already checked before writing to disk. That seems a bit more risky, but surely a bad/corrupt read would be detected long before that point anyway?

Member

theuni commented Aug 12, 2015

@sipa If I'm reading correctly, it looks like the CheckBlock() in AcceptBlock() is safe to use !fCheckMerkleRoot since it's only ever called from ProcessNewBlock() where it's already been checked.

Same for the ConnectBlock() in ConnectTip(), since they're already checked before writing to disk. That seems a bit more risky, but surely a bad/corrupt read would be detected long before that point anyway?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 13, 2015

Member
Member

sipa commented Aug 13, 2015

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Aug 15, 2015

Member

@theuni It looks a bit more complicated, as currently we do a CheckBlock on blocks read from disk for validation, which would be harder to do if we want to avoid duplicate checks. I've made a much simpler (and slightly uglier) change for now here, which is to cache the result of CheckBlock in CBlock. That also avoids a few other checks that were being done 3 times before.

Member

sipa commented Aug 15, 2015

@theuni It looks a bit more complicated, as currently we do a CheckBlock on blocks read from disk for validation, which would be harder to do if we want to avoid duplicate checks. I've made a much simpler (and slightly uglier) change for now here, which is to cache the result of CheckBlock in CBlock. That also avoids a few other checks that were being done 3 times before.

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell Sep 6, 2015

Member

Where does this stand right now? Should I spend cycles testing it as is?

Member

gmaxwell commented Sep 6, 2015

Where does this stand right now? Should I spend cycles testing it as is?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Sep 7, 2015

Member

@gmaxwell Test away.

Member

sipa commented Sep 7, 2015

@gmaxwell Test away.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Sep 14, 2015

Member

Ping

Member

sipa commented Sep 14, 2015

Ping

@@ -2573,7 +2576,7 @@ bool CheckBlock(const CBlock& block, CValidationState& state, bool fCheckPOW, bo
// Check the merkle root.
if (fCheckMerkleRoot) {
bool mutated;
- uint256 hashMerkleRoot2 = block.BuildMerkleTree(&mutated);
+ uint256 hashMerkleRoot2 = block.ComputeMerkleRoot(&mutated);

This comment has been minimized.

@dcousens

dcousens Sep 14, 2015

Contributor

const?

@dcousens

dcousens Sep 14, 2015

Contributor

const?

This comment has been minimized.

@sipa

sipa Sep 22, 2015

Member

const what?

@sipa

sipa Sep 22, 2015

Member

const what?

@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens Sep 14, 2015

Contributor

re-utACK

Contributor

dcousens commented Sep 14, 2015

re-utACK

sipa added some commits Aug 11, 2015

Do not store Merkle branches in the wallet.
Assume that when a wallet transaction has a valid block hash and transaction position
in it, the transaction is actually there. We're already trusting wallet data in a
much more fundamental way anyway.

To prevent backward compatibility issues, a new record is used for storing the
block locator in the wallet. Old wallets will see a wallet file synchronized up
to the genesis block, and rescan automatically.
@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Sep 22, 2015

Member

Rebased.

Member

sipa commented Sep 22, 2015

Rebased.

@@ -78,7 +78,7 @@ class CBlock : public CBlockHeader
std::vector<CTransaction> vtx;
// memory only
- mutable std::vector<uint256> vMerkleTree;
+ mutable bool fChecked;

This comment has been minimized.

@laanwj

laanwj Sep 23, 2015

Member

Ideally we'd have this fChecked status (which isn't used in CBlock itself) on an administrative object that wraps a CBlock, instead of on CBlock itself.

@laanwj

laanwj Sep 23, 2015

Member

Ideally we'd have this fChecked status (which isn't used in CBlock itself) on an administrative object that wraps a CBlock, instead of on CBlock itself.

This comment has been minimized.

@sipa

sipa Sep 23, 2015

Member

Fully agree, I actually started implementing that as a follow-up, but didn't want to interfere with other refactorings.

@sipa

sipa Sep 23, 2015

Member

Fully agree, I actually started implementing that as a follow-up, but didn't want to interfere with other refactorings.

This comment has been minimized.

@laanwj

laanwj Sep 23, 2015

Member

OK, yes, I'm fine with keeping it like this for this pull

@laanwj

laanwj Sep 23, 2015

Member

OK, yes, I'm fine with keeping it like this for this pull

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Sep 23, 2015

Member

Tested forward/backward compatibility of wallet with this code change.
ACK.

Member

laanwj commented Sep 23, 2015

Tested forward/backward compatibility of wallet with this code change.
ACK.

@laanwj laanwj merged commit 3b33ec8 into bitcoin:master Sep 23, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request Sep 23, 2015

Merge pull request #6550
3b33ec8 Avoid duplicate CheckBlock checks (Pieter Wuille)
391dff1 Do not store Merkle branches in the wallet. (Pieter Wuille)

@rnicoll rnicoll referenced this pull request in dogecoin/dogecoin Feb 21, 2016

Closed

Merge AuxPoW support from Namecore #1332

@dagurval dagurval referenced this pull request in bitcoinxt/bitcoinxt Jun 13, 2018

Merged

Switch to a constant-space Merkle root/branch algorithm. #452

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment