New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce checkpoints' effect on consensus. #5927

Merged
merged 1 commit into from Jun 10, 2015

Conversation

Projects
None yet
7 participants
@sipa
Member

sipa commented Mar 19, 2015

Instead of only checking height to decide whether to disable script checks, actually check whether a block is an ancestor of a checkpoint, up to which headers have been validated. This means that we don't have to prevent accepting a side branch anymore - it will be safe, just less fast to do.

We still need to prevent being fed a multitude of low-difficulty headers filling up our memory. The mechanism for that is unchanged for now: once a checkpoint is reached with headers, no headers chain branching off before that point are allowed anymore.

@jonasschnelli

This comment has been minimized.

Show comment
Hide comment
@jonasschnelli

jonasschnelli Mar 19, 2015

Member

I'm getting EXC_BAD_ACCESS during setgenerate -> ConnectBlock().

Maybe CBlockIndex* CBlockIndex::GetAncestor(int height) was called with a CBlockIndex with a pskip NULL Poiner?

* thread #2: tid = 0x586dd, 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89
   86               (heightSkip > height && !(heightSkipPrev < heightSkip - 2 &&
   87                                         heightSkipPrev >= height))) {
   88               // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
-> 89               pindexWalk = pindexWalk->pskip;
   90               heightWalk = heightSkip;
   91           } else {
   92               pindexWalk = pindexWalk->pprev;
(lldb) bt
* thread #2: tid = 0x586dd, 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
  * frame #0: 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89
    frame #1: 0x0000000100163650 bitcoind`ConnectBlock(block=0x0000000101f04690, state=0x000000010597d5f0, pindex=0x000000010597cf38, view=0x000000010597cfc0, fJustCheck=true) + 592 at main.cpp:1761
    frame #2: 0x00000001001730a7 bitcoind`TestBlockValidity(state=0x000000010
Member

jonasschnelli commented Mar 19, 2015

I'm getting EXC_BAD_ACCESS during setgenerate -> ConnectBlock().

Maybe CBlockIndex* CBlockIndex::GetAncestor(int height) was called with a CBlockIndex with a pskip NULL Poiner?

* thread #2: tid = 0x586dd, 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89
   86               (heightSkip > height && !(heightSkipPrev < heightSkip - 2 &&
   87                                         heightSkipPrev >= height))) {
   88               // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
-> 89               pindexWalk = pindexWalk->pskip;
   90               heightWalk = heightSkip;
   91           } else {
   92               pindexWalk = pindexWalk->pprev;
(lldb) bt
* thread #2: tid = 0x586dd, 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
  * frame #0: 0x00000001000518fe bitcoind`CBlockIndex::GetAncestor(this=0x000000010597cf38, height=0) + 190 at chain.cpp:89
    frame #1: 0x0000000100163650 bitcoind`ConnectBlock(block=0x0000000101f04690, state=0x000000010597d5f0, pindex=0x000000010597cf38, view=0x000000010597cfc0, fJustCheck=true) + 592 at main.cpp:1761
    frame #2: 0x00000001001730a7 bitcoind`TestBlockValidity(state=0x000000010
@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Mar 19, 2015

Member
Member

sipa commented Mar 19, 2015

@morcos

This comment has been minimized.

Show comment
Hide comment
@morcos

morcos Mar 19, 2015

Member

Hmm.. So this requires #5875 to be safe I think. Was that the intention?

Member

morcos commented Mar 19, 2015

Hmm.. So this requires #5875 to be safe I think. Was that the intention?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Mar 21, 2015

Member

@morcos Well there are still measures in place to prevent us from accepting a long header-branch still, and we won't accept blocks for which we don't accept the header. Though the PR as-is indeed loosens some things, that are countered by #5875.

Member

sipa commented Mar 21, 2015

@morcos Well there are still measures in place to prevent us from accepting a long header-branch still, and we won't accept blocks for which we don't accept the header. Though the PR as-is indeed loosens some things, that are countered by #5875.

@morcos

This comment has been minimized.

Show comment
Hide comment
@morcos

morcos Mar 21, 2015

Member

Ok, I think my concern is that before this change, when a node starts up, once it gets through enough IBD, it's impossible to DOS it by just feeding it low difficulty full blocks, because those blocks would either have to be a fork before a checkpoint, or they wouldn't connect back to a valid chain. After this change, I think it's possible to push through a long low-difficulty fork before the node gets through the checkpoints. If the length is greater than the height of the last checkpoint, then from then on nothing prevents you from stuffing low difficulty blocks at this node forever until you fill up its disk or tie up its CPU. Once you've snuck in that low difficulty long fork at the beginning it's usable any time after.

Member

morcos commented Mar 21, 2015

Ok, I think my concern is that before this change, when a node starts up, once it gets through enough IBD, it's impossible to DOS it by just feeding it low difficulty full blocks, because those blocks would either have to be a fork before a checkpoint, or they wouldn't connect back to a valid chain. After this change, I think it's possible to push through a long low-difficulty fork before the node gets through the checkpoints. If the length is greater than the height of the last checkpoint, then from then on nothing prevents you from stuffing low difficulty blocks at this node forever until you fill up its disk or tie up its CPU. Once you've snuck in that low difficulty long fork at the beginning it's usable any time after.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Mar 23, 2015

Member

@morcos You're right, but just receiving headers up to the checkpoints suffices to prevent branches, which should be within seconds in normal circumstances (for now). Headers synchronization is however not very robust (it's easy to have a peer that just don't respond to headers request to delay the synchronization significantly), so I agree that it's better to combine this with #5875.

Member

sipa commented Mar 23, 2015

@morcos You're right, but just receiving headers up to the checkpoints suffices to prevent branches, which should be within seconds in normal circumstances (for now). Headers synchronization is however not very robust (it's easy to have a peer that just don't respond to headers request to delay the synchronization significantly), so I agree that it's better to combine this with #5875.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Mar 23, 2015

Member

@jonasschnelli Bug fixed.

Member

sipa commented Mar 23, 2015

@jonasschnelli Bug fixed.

@laanwj laanwj added the Consensus label Mar 26, 2015

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 10, 2015

Member

Rebased, and will rebase on #5975 if that gets merged first.

Member

sipa commented May 10, 2015

Rebased, and will rebase on #5975 if that gets merged first.

Reduce checkpoints' effect on consensus.
Instead of only checking height to decide whether to disable script checks,
actually check whether a block is an ancestor of a checkpoint, up to which
headers have been validated. This means that we don't have to prevent
accepting a side branch anymore - it will be safe, just less fast to
do.

We still need to prevent being fed a multitude of low-difficulty headers
filling up our memory. The mechanism for that is unchanged for now: once
a checkpoint is reached with headers, no headers chain branching off before
that point are allowed anymore.
@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 13, 2015

Member

@gavinandresen Bug fixed, I hope.

Member

sipa commented May 13, 2015

@gavinandresen Bug fixed, I hope.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj May 21, 2015

Member

Tested ACK (code reviewed, synced up to block 320700 with #5875 and #5927)

Member

laanwj commented May 21, 2015

Tested ACK (code reviewed, synced up to block 320700 with #5875 and #5927)

@jtimon

This comment has been minimized.

Show comment
Hide comment
@jtimon

jtimon May 21, 2015

Member

ut ACK

Member

jtimon commented May 21, 2015

ut ACK

@sdaftuar

This comment has been minimized.

Show comment
Hide comment
@sdaftuar

sdaftuar May 22, 2015

Member

I'm concerned about the impact a malicious peer could have on a node starting up from scratch; in particular I think a malicious peer could partition a new node running this code from honest peers.

In 0.10 and master, when a node starts up, it picks one peer to download the headers chain from. After the headers chain's best block time is within 24 hours of now, the node will start to download headers and then blocks from the other peers.

In master without this pull, if that sync peer is malicious or buggy and serves the node a bad chain, then once the node hits a checkpoint height, it'll recognize it as bad and disconnect that peer. However, with this pull, the node could process a blockchain from the bad peer past the checkpoint heights. Upon connecting to other (honest) peers, those peers will send the node a getheaders message, and it will respond with the bad chain. Those honest peers would then disconnect the node for serving headers that violate a checkpoint, preventing the node from ever joining the honest network.

FYI #6172 would limit the potential for this type of behavior by not responding to a getheaders until we're past the checkpoint height, but it is insufficient protection for this PR, because that sync node just needs to serve more blocks than the checkpoint height to create a problem (ie once the bad chain gets past the criteria for exiting IsInitialBlockDownload, the node would start inv'ing bad chain blocks to peers as the tip is updated, triggering disconnection from any honest peers who may be connected, and also preventing new connections to honest peers who would send a getheaders message upon connection).

I think the principle here is that if we are going to relax requirements so that we'd sometimes allow accepting blocks that might violate a currently-existing-checkpoint, then we first need to make peers be willing to stay connected to nodes that might serve them such blocks. That needs more thought, as we'd likely need to replace the current disconnect behavior with some other kind of DoS protection, and I'm not sure what that should look like.

One idea I had was to make IsInitialBlockDownload have some knowledge of the amount of work on the most-work-chain (rather then just use the checkpointed block height), and set a node's behavior to not deliver blocks (or headers) until exiting InitialBlockDownload -- that way, any bad chains would have to have a lot of work in order to trick a node to start trying to serve it to others and face possible disconnection. I think this is doable, but it also feels like a workaround rather than a solution to the underlying issue, so perhaps there are better ideas out there.

Member

sdaftuar commented May 22, 2015

I'm concerned about the impact a malicious peer could have on a node starting up from scratch; in particular I think a malicious peer could partition a new node running this code from honest peers.

In 0.10 and master, when a node starts up, it picks one peer to download the headers chain from. After the headers chain's best block time is within 24 hours of now, the node will start to download headers and then blocks from the other peers.

In master without this pull, if that sync peer is malicious or buggy and serves the node a bad chain, then once the node hits a checkpoint height, it'll recognize it as bad and disconnect that peer. However, with this pull, the node could process a blockchain from the bad peer past the checkpoint heights. Upon connecting to other (honest) peers, those peers will send the node a getheaders message, and it will respond with the bad chain. Those honest peers would then disconnect the node for serving headers that violate a checkpoint, preventing the node from ever joining the honest network.

FYI #6172 would limit the potential for this type of behavior by not responding to a getheaders until we're past the checkpoint height, but it is insufficient protection for this PR, because that sync node just needs to serve more blocks than the checkpoint height to create a problem (ie once the bad chain gets past the criteria for exiting IsInitialBlockDownload, the node would start inv'ing bad chain blocks to peers as the tip is updated, triggering disconnection from any honest peers who may be connected, and also preventing new connections to honest peers who would send a getheaders message upon connection).

I think the principle here is that if we are going to relax requirements so that we'd sometimes allow accepting blocks that might violate a currently-existing-checkpoint, then we first need to make peers be willing to stay connected to nodes that might serve them such blocks. That needs more thought, as we'd likely need to replace the current disconnect behavior with some other kind of DoS protection, and I'm not sure what that should look like.

One idea I had was to make IsInitialBlockDownload have some knowledge of the amount of work on the most-work-chain (rather then just use the checkpointed block height), and set a node's behavior to not deliver blocks (or headers) until exiting InitialBlockDownload -- that way, any bad chains would have to have a lot of work in order to trick a node to start trying to serve it to others and face possible disconnection. I think this is doable, but it also feels like a workaround rather than a solution to the underlying issue, so perhaps there are better ideas out there.

@luke-jr

This comment has been minimized.

Show comment
Hide comment
@luke-jr

luke-jr Jun 2, 2015

Member

utACK

Member

luke-jr commented Jun 2, 2015

utACK

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Jun 10, 2015

Member

Going ahead to merge this (into master only). I agree @sdaftuar 's issue should be addressed before the 0.12 release.

Member

laanwj commented Jun 10, 2015

Going ahead to merge this (into master only). I agree @sdaftuar 's issue should be addressed before the 0.12 release.

@laanwj laanwj merged commit dce8360 into bitcoin:master Jun 10, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request Jun 10, 2015

Merge pull request #5927
dce8360 Reduce checkpoints' effect on consensus. (Pieter Wuille)
@sdaftuar

This comment has been minimized.

Show comment
Hide comment
@sdaftuar

sdaftuar Oct 29, 2015

Member

@sipa @laanwj I just remembered this pull, and I think the concern I raised earlier is still a potential issue for 0.12. Any thoughts on the best way to mitigate?

One idea I had would be to not leave IBD (and therefore not announce new blocks or respond to a getheaders message) until your main chain builds off the last checkpointed block. When 0.12 is widely enough deployed, then in a future release we could eliminate that check (since 0.12 peers won't disconnect each other for being on the non-checkpointed chain).

An alternate idea would be to just hardcode in a value of required work on chainActive before we leave IBD, and we set that to something reasonably recent so that it would be expensive to attack. Again, when 0.12 is widely enough deployed, we can eliminate this hardcoded value and the associated check. But I don't love the idea of introducing a new hardcoded value...

Member

sdaftuar commented Oct 29, 2015

@sipa @laanwj I just remembered this pull, and I think the concern I raised earlier is still a potential issue for 0.12. Any thoughts on the best way to mitigate?

One idea I had would be to not leave IBD (and therefore not announce new blocks or respond to a getheaders message) until your main chain builds off the last checkpointed block. When 0.12 is widely enough deployed, then in a future release we could eliminate that check (since 0.12 peers won't disconnect each other for being on the non-checkpointed chain).

An alternate idea would be to just hardcode in a value of required work on chainActive before we leave IBD, and we set that to something reasonably recent so that it would be expensive to attack. Again, when 0.12 is widely enough deployed, we can eliminate this hardcoded value and the associated check. But I don't love the idea of introducing a new hardcoded value...

morcos added a commit to morcos/bitcoin that referenced this pull request Mar 15, 2016

Make GetAncestor more robust
This check for pskip != NULL was introduced in #5927 for 0.12.
It is in general safer and allows GetAncestor to be used in more places, specifically in the mining tests for the backport of BIP 68 to 0.11.

morcos added a commit to morcos/bitcoin that referenced this pull request Mar 16, 2016

Make GetAncestor more robust
This check for pskip != NULL was introduced in #5927 for 0.12.
It is in general safer and allows GetAncestor to be used in more places, specifically in the mining tests for the backport of BIP 68 to 0.11.

rebroad pushed a commit to rebroad/bitcoin that referenced this pull request Dec 7, 2016

Make GetAncestor more robust
This check for pskip != NULL was introduced in #5927 for 0.12.
It is in general safer and allows GetAncestor to be used in more places, specifically in the mining tests for the backport of BIP 68 to 0.11.

@str4d str4d referenced this pull request Feb 14, 2017

Merged

Bitcoin 0.12 misc PRs 1 #2099

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment