Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init, index: disallow indexes when running reindex-chainstate #24789

Merged
merged 2 commits into from Apr 26, 2022

Conversation

mzumsande
Copy link
Contributor

When started together with -reindex-chainstate, currently coinstatsindex gets corrupted and the blockfilterindex flatfiles duplicated. See the OP of #24630 for more a more detailed explanation on why this happens.

This is an alternative to #24630 which does not wipe and rebuild the indexes but returns an InitError when they are activated, thus requiring the user to deactivate them temporarily until the -reindex-chainstate run is finished.

This also disallows -reindex-chainstate in combination with -txindex, which is not leading to corruption, but currently still rebuilds the index unnecessarily and unexpectedly.

As a long-term goal, it would be desirable to have the indexes tolerate reindex-chainstate by ignoring their BlockConnected notifications (there is discussion in #24630 about this) or possibly move reindex-chainstate option into a bitcoin-chainstate executable, which could also solve the problem. But these would be larger projects - until then, it might be better to disallow the interaction than having corrupted indexes.

The first commit adjusts the -reindex doc to mention that this option does rebuild all active indexes.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 6, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #24830 (init: Allow -proxy="" setting values by ryanofsky)
  • #22072 (Add reindex=auto flag to automatically reindex corrupt data by AaronDewes)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj
Copy link
Member

laanwj commented Apr 7, 2022

Seems ok (with the noted provision that it is a temporary state of affairs). Theoretially -reindex-chainstate is supposed to be simply a faster -reindex that doesn't rebuild the block index, and work exactly the same otherwise. But if it doesn't work that way, it's better to disable the combination for now than have corruption and errors.

@luke-jr
Copy link
Member

luke-jr commented Apr 12, 2022

Ok with this temporarily, but I think we should really fix the issue instead long-term.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 93fa8f6. This looks great, improves documentation and test coverage, and avoids a corruption bug.

I agree with Luke it would be better in longer term to improve the indexes. Ideally all the indexing options should straightforwardly work well together without duplication, waste, or corruption. But as long as they don't work well together it is good to prevent using them together, and this PR does it in a helpful way, suggesting good workarounds.

It doesn't seem like it should be too hard to follow up this PR with a more permanent fix that just makes indexes ignore blockconnected notifications that aren't relevant or just delays starting them until reindex-chainstate finishes.

src/init.cpp Outdated
@@ -1031,6 +1031,19 @@ bool AppInitParameterInteraction(const ArgsManager& args, bool use_syscall_sandb
return InitError(_("No proxy server specified. Use -proxy=<ip> or -proxy=<ip:port>."));
}

if (args.IsArgSet("-reindex-chainstate")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "init: disallow reindex-chainstate with optional indexes" (93fa8f6)

It would be good to replace args.IsArgSet("-reindex-chainstate") with args.GetBoolArg("-reindex-chainstate", false) because otherwise this will wrongly trigger for -noreindex-chainstate.

(In general, it's good to avoid the IsArgSet function. It's frequently misused and leads to minor bugs like this.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as suggested, good catch! Maybe there could be a warning about this in the doc of IsArgSet (system.h).

src/init.cpp Outdated
return InitError(_("reindex-chainstate with active blockfilterindex is not possible. Please deactivate the index temporarily or perform a full -reindex, which will also rebuild the index."));
}
if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) {
return InitError(_("reindex-chainstate with active txindex is not possible. Please deactivate the index temporarily or perform a full -reindex, which will also rebuild the index."));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "init: disallow reindex-chainstate with optional indexes" (93fa8f6)

Curious, but do you know how or whether txindex gets corrupted? #24630 only mentions the other two indexes. I guess even if the txindex doesn't get corrupted, it is probably still good to prevent -reindex-chainstate and -txindex from being used together if the current implementation is wastefully slow and inefficient, until we properly make the options more compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that it will get corrupted, because it doesn't have any kind of running variable like the other two indexes. However, it would get "rebuilt" by processing BlockConnected() signals for the entire chain (with the result being identical to the previously saved index data), which is probably not be what the user expects from running a -reindex-chainstate command. That's why I disabled it as well - I don't think it is strictly necessary and can drop this, if you / others would prefer that.

src/init.cpp Outdated
if (args.IsArgSet("-reindex-chainstate")) {
// indexes that must be deactivated to prevent index corruption, see #24630
if (args.GetBoolArg("-coinstatsindex", DEFAULT_COINSTATSINDEX)) {
return InitError(_("reindex-chainstate with active coinstatsindex is not possible. Please deactivate the index temporarily or perform a full -reindex, which will also rebuild the index."));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "init: disallow reindex-chainstate with optional indexes" (93fa8f6)

Minor: Here and below I think you could make recommendations more explicit. I would say "-reindex-chainstate option is not compatible with -coinstatsindex. Please temporarily disable -coinstatsindex while using -reindex-chainstate, or replace -reindex-chainstate with -reindex to fully rebuild all indexes." But current wording is also OK if you prefer it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I took your suggestion (just didn't use a "-" after "disable" because here the index is meant, not the command option).

It currently leads to corruption (coinstatsindex) or
data duplication (blockfilterindex), so disable it.
@mzumsande
Copy link
Contributor Author

Addressed feedback by @ryanofsky

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK dac44fc. Just fixed IsArgSet call and edited error messages since last review

@fanquake fanquake merged commit 269dcad into bitcoin:master Apr 26, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 26, 2022
@@ -1031,6 +1031,19 @@ bool AppInitParameterInteraction(const ArgsManager& args, bool use_syscall_sandb
return InitError(_("No proxy server specified. Use -proxy=<ip> or -proxy=<ip:port>."));
}

if (args.GetBoolArg("-reindex-chainstate", false)) {
// indexes that must be deactivated to prevent index corruption, see #24630
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how helpful it is to write #24630. Either this should be a full URL or even better a description for devs to explain which indexes must be deactivated and which ones do not.

If all need to be deactivated it would be better to write "All index" instead of "indexes that".

return InitError(_("-reindex-chainstate option is not compatible with -blockfilterindex. Please temporarily disable blockfilterindex while using -reindex-chainstate, or replace -reindex-chainstate with -reindex to fully rebuild all indexes."));
}
if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) {
return InitError(_("-reindex-chainstate option is not compatible with -txindex. Please temporarily disable txindex while using -reindex-chainstate, or replace -reindex-chainstate with -reindex to fully rebuild all indexes."));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe this can be wrapped in a lambda that takes the index name?

maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Apr 30, 2022
…nstatsindex and feature_prune

a3cd7db test: stop node before calling assert_start_raises_init_error (Martin Zumsande)

Pull request description:

  In bitcoin#24789, I forgot to stop the node before using `assert_start_raises_init_error` in `feature_coinstatsindex`. This resulted in a bitcoind process that is not being terminated after the test finishes.
  `feature_prune` has the same problem and also creates a zombie bitcoind process.

  Also adds an assert to `assert_start_raises_init_error` to make sure the node isn't already running to prevent this sort of mistake in the future.

Top commit has no ACKs.

Tree-SHA512: 902f683ebe7b19ca32ab83ca40d9698e9d91509b1d003f21a7221f79b647e05b6ef5c0c888fbb772cbca5e641d5c9437d522b6671f446c3ab321d79f7c6d0284
@mzumsande mzumsande deleted the 202204_reindex_corrupt branch May 19, 2022 13:27
ryanofsky added a commit that referenced this pull request May 17, 2023
…w interaction with reindex-chainstate

97844d9 index: Enable reindex-chainstate with active indexes (Martin Zumsande)
60bec3c index: Use first block from locator instead of looking for fork point (Martin Zumsande)

Pull request description:

  This makes two improvements to the index init phase:

  **1) Prevent index corruption in case a reorg happens when the index was switched off**:
  This is done by reading in the top block stored in the locator instead of looking for a fork point already in `BaseIndex::Init()`.
  Before, we'd just go back to the fork point by calling `FindForkInGlobalIndex()`, which would have corrupted the coinstatsindex because its saved muhash needs to be reverted step by step by un-applying all blocks in between, which wasn't done before. This is now being done a bit later in  `ThreadSync()`, which has existing logic to call the custom `Rewind()` method when going back along the chain to the forking point (thanks ryanofsky for pointing this out to me!).

  **2) Allow using the `-reindex-chainstate` option without needing to disabling indexes**:
  With `BaseIndex::Init()` not calling `FindForkInGlobalIndex()` anymore, we can allow `reindex-chainstate` with active indexes. `reindex-chainstate` deletes the chain and rebuilds it later in `ThreadImport`, so there is no chain available during `BaseIndex::Init()`, which would lead to problems (see #24789).
  But now we'll only need the chain a bit later in `BaseIndex::ThreadSync`, which will wait for the reindex-chainstate in `ThreadImport` to finish and will continue syncing after that.

ACKs for top commit:
  ryanofsky:
    Code review ACK 97844d9. Just simple rebase since last review

Tree-SHA512: e24973fc22e0b87a49026f4820aecb0a4e415f4d381bade9969dd31cf97afecfea0449dce7fcc797343b792199cc8287276d1f5ffa4433dcb54fb24a808db6fb
@bitcoin bitcoin locked and limited conversation to collaborators May 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants