This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Add blockchain:analyze/1 and blockchain:repair/1 #324

Merged: 2 commits merged into master from adt/blockchain-repair on Dec 20, 2019

Conversation

Vagabond
Contributor

Many things can go wrong when syncing is interrupted by a hard power-down
(OTAs or unplugging). This patch adds repair facilities for the common
scenarios observed during sync corruption.

The common sync failures are:

  • Missing blocks
  • Ledger inconsistency

To detect these failures, several checks have been added:

  • Check that the delayed ledger is the right number of blocks behind
  • Check that the blockchain and the ledger agree on the blockchain height
  • Check that the delayed ledger, once advanced to the leading ledger,
    has the same fingerprint as the leading ledger
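
The three checks above can be modeled roughly as follows. This is a minimal Python sketch, not the Erlang code in this patch: `Ledger`, `EXPECTED_LAG`, `first_problem`, and the returned problem names are all hypothetical, and the sketch assumes the chain is already longer than the lag. It reports only the first failed check, in the order the checks are listed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lag between the leading and delayed ledgers;
# the real value is not stated in this PR.
EXPECTED_LAG = 50


@dataclass
class Ledger:
    height: int
    fingerprint: str  # stand-in for a digest of the ledger contents


def first_problem(chain_height: int, leading: Ledger, delayed: Ledger,
                  delayed_fp_when_advanced: str) -> Optional[str]:
    """Return the name of the first failed check, or None if all pass.

    delayed_fp_when_advanced is the fingerprint the delayed ledger would
    have after being advanced to the leading ledger (computed elsewhere).
    """
    # Check 1: the delayed ledger is the right number of blocks behind.
    if leading.height - delayed.height != EXPECTED_LAG:
        return "delayed_ledger_wrong_lag"
    # Check 2: the blockchain and the ledger agree on the height.
    if chain_height != leading.height:
        return "chain_ledger_height_mismatch"
    # Check 3: advancing the delayed ledger reproduces the leading ledger.
    if delayed_fp_when_advanced != leading.fingerprint:
        return "fingerprint_mismatch"
    return None
```

Returning one problem at a time keeps the model in line with how analyze is described: fix one thing, then look again.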

And several repair tools have been added:

  • Advance the lagging ledger
  • Re-sync a missing block (only a gap one block wide can be repaired right now)
  • Apply all the differences from an advanced lagging ledger to the
    leading ledger

To analyze the problems, use blockchain:analyze. Analyze will report a
single problem at a time.

To repair the current analyzed problem (if it is one we can repair), use
blockchain:repair. It will only repair a single problem at a time. It is
recommended you use analyze after each repair to check if there are more
problems, and to repair them if needed.
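
The recommended analyze-then-repair cycle can be sketched as a loop. This is a Python model under stated assumptions: `analyze` and `repair` are hypothetical callables standing in for blockchain:analyze and blockchain:repair, whose actual return values are not shown in this conversation, and `max_rounds` is an invented safety limit.

```python
def repair_all(analyze, repair, max_rounds=10):
    """Alternate analyze and repair until clean, stuck, or out of rounds.

    analyze() returns one problem, or None when nothing is wrong.
    repair(problem) returns True if that single problem was repaired.
    Returns (status, problems_fixed, remaining_problem).
    """
    fixed = []
    for _ in range(max_rounds):
        problem = analyze()
        if problem is None:
            return ("ok", fixed, None)
        if not repair(problem):
            # Not every problem is one we can repair.
            return ("unrepairable", fixed, problem)
        fixed.append(problem)
    return ("gave_up", fixed, analyze())
```

Re-running the analysis after every repair matters because each call reports and repairs only one problem, and fixing one may expose another.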

@Vagabond
Contributor Author

This PR also fixes a silly bug where we did not report that we were doing an 'assume valid' sync, so the sync height looked like it was going backwards.

@codecov

codecov bot commented Dec 17, 2019

Codecov Report

Merging #324 into master will decrease coverage by 2.58%.
The diff coverage is 20.31%.


@@            Coverage Diff             @@
##           master     #324      +/-   ##
==========================================
- Coverage   70.13%   67.54%   -2.59%     
==========================================
  Files          66       66              
  Lines        6304     6279      -25     
==========================================
- Hits         4421     4241     -180     
- Misses       1883     2038     +155
Impacted Files                                          Coverage Δ
src/blockchain.erl                                      67.46% <10.07%> (-14%) ⬇️
src/ledger/v1/blockchain_ledger_v1.erl                  77.03% <39.02%> (-1.2%) ⬇️
src/transactions/blockchain_txn.erl                     67.4% <75%> (-0.33%) ⬇️
src/blockchain_utils.erl                                71.64% <0%> (-11.95%) ⬇️
...transactions/v1/blockchain_txn_poc_receipts_v1.erl   31.57% <0%> (-4.61%) ⬇️
src/transactions/v1/blockchain_txn_rewards_v1.erl       89.03% <0%> (-4.25%) ⬇️
src/handlers/blockchain_sync_handler.erl                74.19% <0%> (-3.23%) ⬇️
src/blockchain_worker.erl                               46.77% <0%> (-1.21%) ⬇️
src/blockchain_poc_path_v2.erl                          0% <0%> (ø) ⬆️
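
As a quick sanity check on the totals in the coverage diff, the percentages follow from hits divided by lines. The short Python sketch below recomputes them; the helper name `coverage_pct` is made up for this illustration and is not part of Codecov.

```python
def coverage_pct(hits, lines):
    """Coverage as a percentage, rounded to two decimals."""
    return round(100 * hits / lines, 2)


master = coverage_pct(4421, 6304)   # totals from the "master" column
pr = coverage_pct(4241, 6279)       # totals from the "#324" column
delta = round(100 * (4241 / 6279 - 4421 / 6304), 2)

print(master, pr, delta)
```

The recomputed change rounds to -2.59, matching the diff table; the summary line's 2.58% differs only in the last digit. Misses are simply lines minus hits (6304 - 4421 = 1883 and 6279 - 4241 = 2038).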


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7422aee...054d846.

Vagabond and others added 2 commits December 20, 2019 14:50
@Vagabond Vagabond merged commit d9a1416 into master Dec 20, 2019
@Vagabond Vagabond deleted the adt/blockchain-repair branch December 20, 2019 23:03