fix: allow users to optionally configure node startup even if index reconciliation fails #12930

Merged
aarshkshah1992 merged 5 commits into master from fix/backfill-missing-block-dont-panic
Mar 6, 2025


aarshkshah1992
Contributor

For #12897.

Their chain state is missing some messages, and this PR enables them to start the node even if index reconciliation fails.

@aarshkshah1992 aarshkshah1992 requested a review from rvagg March 4, 2025 05:35
@rvagg
Member

rvagg commented Mar 4, 2025

This is pretty blunt. @emmanuelm41 can you consider this change and comment on whether it makes your life any better? This gives you a flag that lets you start the node up and log an error, but the reconciliation is aborted entirely. It doesn't address the messages being missing (that's still a mystery for a full archival node, and we really need to get to the bottom of it, but it's separate from the chainindexer tripping over it). It means there may be a small gap in what's been indexed, although you can always manually run a backfill over the epochs you're concerned about with lotus index validate-backfill.

@aarshkshah1992
Contributor Author

aarshkshah1992 commented Mar 4, 2025

@emmanuelm41

  • Based on the error message in "Node crashed, and stop syncing with EOF" (#12897), that node's chain state is missing some messages at epoch 4702087.

  • This PR enables you to start your node with the ChainIndexerConfig.AllowIndexReconciliationFailure flag enabled (it is false by default, so you need to set it to true), which means that your node will boot up even if it fails to reconcile the Index with the chain state. However, note that your Index will then have no entries for epoch 4702087, nor for the epochs before it if you have NOT backfilled them already.

  • Note that the validate-backfill CLI will also fail for epoch 4702087.

  • Ideally, what you want to do is validate-backfill from the latest epoch up to epoch 4702087 and then validate-backfill from epoch 4702086 up to genesis in the background while the node is running, so you don't need the reconciliation at all to backfill the Index.
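
The two-pass backfill suggested in the last bullet can be sketched as a small Go helper that splits the chain into the ranges on either side of the problematic epoch. This is only an illustration: epochRange and splitAroundBadEpoch are hypothetical names, the head epoch used in main is made up, and the sketch assumes the bad epoch itself is left out of both ranges (since validate-backfill fails there) and that genesis is epoch 0.

```go
package main

import "fmt"

// epochRange is an inclusive range walked from From down to To (From >= To).
type epochRange struct {
	From, To int64
}

// splitAroundBadEpoch returns the epoch ranges that cover head..0 while
// skipping the single epoch `bad`. If `bad` lies outside head..0, the whole
// chain is returned as one range.
func splitAroundBadEpoch(head, bad int64) []epochRange {
	if bad < 0 || bad > head {
		return []epochRange{{From: head, To: 0}}
	}
	var out []epochRange
	if head > bad {
		// Newer side: from the head down to the epoch just after the bad one.
		out = append(out, epochRange{From: head, To: bad + 1})
	}
	if bad > 0 {
		// Older side: from the epoch just before the bad one down to genesis.
		out = append(out, epochRange{From: bad - 1, To: 0})
	}
	return out
}

func main() {
	// 4710000 is a made-up head epoch; 4702087 is the epoch from issue #12897.
	for _, r := range splitAroundBadEpoch(4710000, 4702087) {
		fmt.Printf("validate-backfill epochs %d down to %d\n", r.From, r.To)
	}
}
```

Each range returned would then be handed to a separate validate-backfill run, with the older range running in the background once the node is up.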

Let me know if you have any questions.

@aarshkshah1992 aarshkshah1992 merged commit 48cd351 into master Mar 6, 2025
89 checks passed
@aarshkshah1992 aarshkshah1992 deleted the fix/backfill-missing-block-dont-panic branch March 6, 2025 07:43
rjan90 pushed a commit that referenced this pull request Mar 6, 2025
…econciliation fails (#12930)

* optional index start on reconciliation failure

* warning

* fix ChangeLog

* fix make gen

* fix compilation