New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blockchain: Convert to direct single-step reorgs. #1500

Merged
merged 1 commit into from Nov 12, 2018

Conversation

Projects
None yet
4 participants
@davecgh
Copy link
Member

davecgh commented Oct 16, 2018

This requires PR #1471.

This modifies the chain reorganization logic to directly perform the reorg one block at a time with rollback in the case of failure, as opposed to the existing memory-based two-step approach, so that it is more optimized for the typical case, better handles large reorgs, gives the ability to implement better caching strategies, and helps provide a path to decouple the chain processing and connection code from the download logic. It also removes the cached stxos from the view since the aforementioned changes make them no longer necessary.

A side effect of these changes is that it is no longer possible to know if a reorg will succeed before actually performing it, so the NTReorganization notification is now sent after a successful reorg. The notification really should have been sent after the reorg before anyway.

Prior to these changes, chain reorganization used a two-step approach such that the first step involved checking all of the blocks along the reorg path in memory and then actually performing the reorg in a second step if those checks succeeded. While that approach does have some benefits in terms of avoiding any intermediate mutation to the current best chain for failed reorgs, and thus not requiring a rollback in that case, it also has some disadvantages such as not scaling well with large reorgs, being more difficult to make use of different caching strategies, and hindering the ability to decouple the connection code from the download logic.

In a certain sense, the approach this replaces assumed that a reorg would fail and took measures to detect that condition prior to performing the reorg, while the new approach assumes the reorg will succeed and rolls back the changes in the very rare case it doesn't. This is an acceptable and safe assumption because the proof-of-work requirements make it exceedingly expensive to create blocks that are valid enough to trigger a reorg yet ultimately end up failing to connect, thus miners are heavily disincentivized from creating such invalid blocks and attackers are also unable to easily create such blocks either. Even in the case of attack, the only result would be nodes performing slightly more database updates than the existing approach.

The following results show the difference between performing the large reorg full block tests before and after these changes:

before: 4.3GB memory usage, 2m18.629s to complete
after:  2.8GB memory usage, 2m04.056s to complete

As can be seen, the new approach takes much less memory and is also a bit faster as well.

This is work towards #1145.

@davecgh davecgh added this to the 1.4.0 milestone Oct 16, 2018

@davecgh davecgh referenced this pull request Oct 16, 2018

Open

Multi-peer Checklist #1145

15 of 25 tasks complete

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch from 3cd19f3 to 4371c09 Oct 16, 2018

@dajohi

dajohi approved these changes Oct 16, 2018

Copy link
Member

dajohi left a comment

testnet miner tOK

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch from 4371c09 to 49c997e Oct 16, 2018

Show resolved Hide resolved blockchain/chain.go Outdated

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch from 49c997e to aeb7351 Oct 16, 2018

@dnldd

dnldd approved these changes Oct 16, 2018

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch 4 times, most recently from de7528d to 0609124 Oct 16, 2018

@alexlyp

alexlyp approved these changes Nov 6, 2018

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch from 0609124 to 568e9a5 Nov 9, 2018

blockchain: Convert to direct single-step reorgs.
This modifies the chain reorganization logic to directly perform the
reorg one block at a time with rollback in the case of failure, as
opposed to the existing memory-based two-step approach, so that it is
more optimized for the typical case, better handles large reorgs, gives
the ability to implement better caching strategies, and helps provide a
path to decouple the chain processing and connection code from the
download logic.  It also removes the cached stxos from the view since
the aforementioned changes make them no longer necessary.

A side effect of these changes is that it is no longer possible to know
if a reorg will succeed before actually performing it, so the
NTReorganization notification is now sent after a successful reorg.  The
notification really should have been sent after the reorg before anyway.

Prior to these changes, chain reorganization used a two-step approach
such that the first step involved checking all of the blocks along the
reorg path in memory and then actually performing the reorg in a second
step if those checks succeeded.  While that approach does have some
benefits in terms of avoiding any intermediate mutation to the current
best chain for failed reorgs, and thus not requiring a rollback in that
case, it also has some disadvantages such as not scaling well with large
reorgs, being more difficult to make use of different caching
strategies, and hindering the ability to decouple the connection code
from the download logic.

In a certain sense, the approach this replaces assumed that a reorg
would fail and took measures to detect that condition prior to
performing the reorg, while the new approach assumes the reorg will
succeed and rolls back the changes in the very rare case it doesn't.
This is an acceptable and safe assumption because the proof-of-work
requirements make it exceedingly expensive to create blocks that are
valid enough to trigger a reorg yet ultimately end up failing to
connect, thus miners are heavily disincentivized from creating such
invalid blocks and attackers are also unable to easily create such
blocks either.  Even in the case of attack, the only result would be
nodes performing slightly more database updates than the existing
approach.

The following results show the difference between performing the large
reorg full block tests before and after these changes:

before: 4.3GB memory usage, 2m18.629s to complete
after:  2.8GB memory usage, 2m04.056s to complete

As can be seen, the new approach takes much less memory and is also a
bit faster as well.

@davecgh davecgh force-pushed the davecgh:blockchain_single_step_reorg branch from 568e9a5 to 9c4569a Nov 12, 2018

@davecgh davecgh merged commit 9c4569a into decred:master Nov 12, 2018

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@davecgh davecgh deleted the davecgh:blockchain_single_step_reorg branch Nov 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment