Fix some chainstate-init-order bugs. #10758

Open
wants to merge 5 commits into
from

Conversation

Projects

Blockers in High-priority for review

7 participants
Contributor

TheBlueMatt commented Jul 7, 2017 edited

This does a number of things to clean up chainstate init order,
fixing some issues as it goes:

  • Order chainstate init more logically - first all of the
    blocktree-related loading, then coinsdb, then
    pcoinsTip/chainActive. Only create objects as needed.

  • More clearly document exactly what is and isn't called in
    -reindex and -reindex-chainstate both with comments noting
    calls as no-ops and by adding if guards.

  • Move the writing of fTxIndex to LoadBlockIndex - this fixes a
    bug introduced in d6af06d where
    InitBlockIndex was writing to fTxIndex which had not yet been
    checked (because LoadChainTip hadn't yet initialized the
    chainActive, which would otherwise have resulted in
    InitBlockIndex being a NOP), allowing you to modify -txindex
    without reindex, potentially corrupting your chainstate!

  • Rename InitBlockIndex to LoadGenesisBlock, which is now a more
    natural name for it. Also check mapBlockIndex instead of
    chainActive, fixing a bug where we'd write the genesis block out
    on every start.

  • Move LoadGenesisBlock further down in init. This is a more logical
    location for it, as it is after all of the blockindex-related
    loading and checking, but before any of the UTXO-related loading
    and checking.

  • Give LoadChainTip a return value - allowing it to indicate that
    the UTXO DB ran ahead of the block DB. This just provides a nicer
    error message instead of the previous mysterious
    assert(!setBlockIndexCandidates.empty()) error.

  • Calls ActivateBestChain in case we just loaded the genesis
    block in LoadChainTip, avoiding relying on the ActivateBestChain
    in ThreadImport before continuing init process.

  • Move all of the VerifyDB()-related stuff into a -reindex +
    -reindex-chainstate if guard. It couldn't do anything useful
    as chainActive.Tip() would be null at this point anyway.

Contributor

TheBlueMatt commented Jul 7, 2017

Note that the two generic "Error initializing block database" strings really should be changed, but as we are already in string freeze for 0.15, they will have to stay as-is for now :(.

fanquake added the Refactoring label Jul 7, 2017

Contributor

TheBlueMatt commented Jul 7, 2017

More explicitly fixed the bug that I accidentally fixed by being more explicit - RewindBlockIndex MUST be called even if we start with -reindex-chainstate.

Contributor

TheBlueMatt commented Jul 10, 2017

This needs an 0.15 tag as it fixes some bugs.

Contributor

TheBlueMatt commented Jul 16, 2017

Rebased and added a fix for the issue @sipa pointed out at #10770 (comment)

src/validation.cpp
+ }
+
+ if (needs_init) {
+ // Note that LoadBlockIndexDB may have set fReindex if we shut down
@sipa

sipa Jul 16, 2017

Owner

I have difficulty parsing this sentence. Can you split it up?

@TheBlueMatt

TheBlueMatt Jul 16, 2017

Contributor

OK, rewrote the comment.

Owner

sipa commented Jul 16, 2017

@TheBlueMatt When starting with -reindex, and then restart before completing without -reindex, it seems to start over from scratch.

Owner

sipa commented Jul 17, 2017

ACK

@TheBlueMatt I retract my comment, everything seems to be working like before.

Owner

sipa commented Jul 17, 2017

I'd like to see this in 0.15 as it at least turns an assert into an error message, in case your chainstate is out of sync with the block chain.

Contributor

TheBlueMatt commented Jul 17, 2017

This (or some subset of it) absolutely has to land for 15, it fixes several regressions currently on master, but its a bugfix, so doesn't need to land today (but isnt small, so should go sooner rather than later).

sipa added this to the 0.15.0 milestone Jul 17, 2017

Member

theuni commented Jul 19, 2017

I haven't finished reviewing, but this also fixes a crash when using an invalid -wallet argument. For example: ./bitcoind -wallet='with spaces', which crashes for me on master.

Contributor

morcos commented Jul 20, 2017

utACK c93db42

Strongly advocate for merging this in the near term.
Although I have not gone through testing all sequences, I think this logic is far clearer and more straight forward than the existing logic and there are known bugs in the existing startup sequence.
With all the other changes to databases recently (rewinding, replaying, reindexing), I think it makes sense to have as much testing as possible with this new code rather than waste our time potentially finding more bugs with the old logic.

Contributor

promag commented Jul 21, 2017

In current master running bitcoind -regtest '-wallet=foo%.dat' gives:

./src/bitcoind -regtest '-wallet=foo%.dat'
Error: Invalid characters in -wallet filename
Segmentation fault (core dumped)

With this branch gives

Error: Invalid characters in -wallet filename
bitcoind: scheduler.cpp:200: void SingleThreadedSchedulerClient::EmptyQueue(): Assertion `!m_pscheduler->AreThreadsServicingQueue()' failed.
Aborted (core dumped)

Reporting this here because of c93db42.

Contributor

TheBlueMatt commented Jul 23, 2017

Hmm, I went ahead and added the fix for @promag's race here, feel free to tell me it should be in a different PR. (@theuni note that it was a race, which may explain why you didnt see it).

Contributor

TheBlueMatt commented Jul 23, 2017

@sipa hmm, re: restarting-mid-reindex: your original comment that it was broken was more correct than your later retraction. Fixed and cleaned that stuff up more with more comments :).

Contributor

promag commented Jul 23, 2017

@TheBlueMatt if it's not related with your refactor then IMO a new PR would be best.

Contributor

TheBlueMatt commented Jul 24, 2017

OK, removed the top two commits so this is just what it was that got ACKs. The two commits from the top are in #10919, which should likely also be taken for 15.

Owner

sipa commented Jul 25, 2017

reutACK c93db42

laanwj added the Bug label Jul 25, 2017

src/validation.cpp
+ if (!ReceivedBlockTransactions(block, state, pindex, blockPos, chainparams.GetConsensus()))
+ return error("LoadBlockIndex(): genesis block not accepted");
+ } catch (const std::runtime_error& e) {
+ return error("LoadBlockIndex(): failed to initialize block database: %s", e.what());
@laanwj

laanwj Jul 27, 2017

Owner

This error is no longer correct, this is "failed to load genesis block" now I guess?

src/validation.cpp
+ CDiskBlockPos blockPos;
+ CValidationState state;
+ if (!FindBlockPos(state, blockPos, nBlockSize+8, 0, block.GetBlockTime()))
+ return error("LoadBlockIndex(): FindBlockPos failed");
@laanwj

laanwj Jul 27, 2017

Owner

Would be better to use error("%s: ...", __func__) here as in other places, but in any case, the function name is wrong now.

Owner

laanwj commented Jul 27, 2017

utACK apart from error message nits

TheBlueMatt added some commits Jul 6, 2017

@TheBlueMatt TheBlueMatt Fix some LoadChainTip-related init-order bugs.
* Move the writing of fTxIndex to LoadBlockIndex - this fixes a
  bug introduced in d6af06d where
  InitBlockIndex was writing to fTxIndex which had not yet been
  checked (because LoadChainTip hadn't yet initialized the
  chainActive, which would otherwise have resulted in
  InitBlockIndex being a NOP), allowing you to modify -txindex
  without reindex, potentially corrupting your chainstate!

* Rename InitBlockIndex to LoadGenesisBlock, which is now a more
  natural name for it. Also check mapBlockIndex instead of
  chainActive, fixing a bug where we'd write the genesis block out
  on every start.
eda888e
@TheBlueMatt TheBlueMatt More user-friendly error message if UTXO DB runs ahead of block DB
This gives LoadChainTip a return value - allowing it to indicate that
the UTXO DB ran ahead of the block DB. This just provides a nicer
error message instead of the previous mysterious
assert(!setBlockIndexCandidates.empty()) error.

This also calls ActivateBestChain in case we just loaded the genesis
block in LoadChainTip, avoiding relying on the ActivateBestChain
in ThreadImport before continuing init process.
b0f3249
@TheBlueMatt TheBlueMatt Call RewindBlockIndex even if we're about to run -reindex-chainstate
RewindBlockIndex works over both chainActive - disconnecting blocks
from the tip that need witness verification - and mapBlockIndex -
requiring redownload of blocks missing witness data.

It should never have been the case that the second half is skipped
if we're about to run -reindex-chainstate.
ff3a219
@TheBlueMatt TheBlueMatt Order chainstate init more logically.
* Order chainstate init more logically - first all of the
  blocktree-related loading, then coinsdb, then
  pcoinsTip/chainActive. Only create objects as needed.

* More clearly document exactly what is and isn't called in
  -reindex and -reindex-chainstate both with comments noting
  calls as no-ops and by adding if guards.

* Move LoadGenesisBlock further down in init. This is a more logical
  location for it, as it is after all of the blockindex-related
  loading and checking, but before any of the UTXO-related loading
  and checking.

* Move all of the VerifyDB()-related stuff into a -reindex +
  -reindex-chainstate if guard. It couldn't do anything useful
  as chainActive.Tip() would be null at this point anyway.
1385697
@TheBlueMatt TheBlueMatt Fix segfault when shutting down before fully loading
This was introduced by 3192975.
It can be triggered easily when canceling DB upgrade from
pre-per-utxo.
c0025d0
Contributor

TheBlueMatt commented Jul 27, 2017

Fixed error printings in LoadGenesisBlock.

Contributor

morcos commented Jul 28, 2017

re-utACK c0025d0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment