Add rollback script, fix deadline non-determinism #169

Merged
raymondjacobson merged 8 commits into main from fix/finalize-block-no-revalidation
Mar 30, 2026

@raymondjacobson raymondjacobson commented Mar 30, 2026

# dry run
docker run --rm -v /root/data:/data openaudio/go-openaudio:stable rollback -dry-run
# for real
docker run --rm -v /root/data:/data openaudio/go-openaudio:stable rollback

During block sync, FinalizeBlock re-validates attestations by calling
isValidAttestation, which queries the current validator set to compute
attestor rendezvous. Nodes that state-synced have a validator set from
a later height, causing the rendezvous to diverge and attestation
validation to fail. This produces a different ExecTxResult code than
the original validators computed, breaking the LastResultsHash check
in the next block header and halting the syncing node.

Scoped to registration/deregistration attestations and misbehavior
deregistrations — the tx types where this divergence has been observed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
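
The divergence described above can be illustrated with a small, self-contained sketch. It is not the project's code: `rendezvous`, `resultCode`, and `without` are hypothetical helpers, the hashing scheme is generic highest-random-weight hashing, and the validator names are made up. It only shows why re-running an attestation check against a validator set from a later height can yield a different result code than the one computed at the original height.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// rendezvous returns the r validators with the highest hash score for key.
// Generic highest-random-weight hashing; an assumption, not the repo's scheme.
func rendezvous(key string, validators []string, r int) []string {
	type scored struct {
		name  string
		score uint64
	}
	scores := make([]scored, 0, len(validators))
	for _, v := range validators {
		sum := sha256.Sum256([]byte(key + "|" + v))
		scores = append(scores, scored{v, binary.BigEndian.Uint64(sum[:8])})
	}
	sort.Slice(scores, func(i, j int) bool {
		if scores[i].score != scores[j].score {
			return scores[i].score > scores[j].score
		}
		return scores[i].name < scores[j].name
	})
	if r > len(scores) {
		r = len(scores)
	}
	out := make([]string, r)
	for i := range out {
		out[i] = scores[i].name
	}
	return out
}

// resultCode returns 0 if every signer is in the expected attestor set, else 1.
func resultCode(signers, attestors []string) uint32 {
	allowed := make(map[string]bool, len(attestors))
	for _, a := range attestors {
		allowed[a] = true
	}
	for _, s := range signers {
		if !allowed[s] {
			return 1
		}
	}
	return 0
}

// without returns a copy of validators with name removed.
func without(validators []string, name string) []string {
	out := make([]string, 0, len(validators))
	for _, v := range validators {
		if v != name {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	// Hypothetical validator set at the height where the tx was first executed.
	atHeight := []string{"v1", "v2", "v3", "v4", "v5", "v6"}
	signers := rendezvous("tx-123", atHeight, 3)

	// A state-synced node holds a later set in which one signer has left.
	later := without(atHeight, signers[0])

	fmt.Println(resultCode(signers, rendezvous("tx-123", atHeight, 3))) // prints 0
	fmt.Println(resultCode(signers, rendezvous("tx-123", later, 3)))    // prints 1
}
```

In this toy model the same transaction gets code 0 against the original set and code 1 against the later one. That kind of mismatch is what would surface as a different LastResultsHash on the syncing node.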
@raymondjacobson raymondjacobson force-pushed the fix/finalize-block-no-revalidation branch from a7cdd22 to e8820b4 Compare March 30, 2026 18:12
raymondjacobson and others added 7 commits March 30, 2026 12:39
Adds cmd/rollback which rolls back CometBFT state by one block and
cleans up the corresponding PG rows so the block can be replayed
with corrected code. Used to recover nodes that got stuck at block
21941479 due to the RSize 10→15 upgrade causing attestation
re-validation failures in FinalizeBlock.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Build rollback binary alongside openaudio and include it in the image
with a dedicated entrypoint script. Run via:
  docker run --entrypoint /bin/rollback-entrypoint.sh <image>

Also fix core_app_state cleanup to DELETE instead of UPDATE (avoids
primary key conflict since the previous block's row already exists).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add -blocks N flag to roll back multiple blocks in a loop.
Add command dispatch in entrypoint so rollback can be invoked as:
  docker run <image> rollback [-blocks N] [-dry-run]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
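
A minimal sketch of how the `-blocks` and `-dry-run` flags could drive a rollback loop, using only the standard `flag` package. The flag names come from the commit message. Everything else is an assumption for illustration: the default of 1 block, the `rollbackOpts` type, and the `parseRollbackArgs` and `run` helpers. The entrypoint's command dispatch is not modeled here.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// rollbackOpts holds the parsed command-line options (hypothetical type).
type rollbackOpts struct {
	blocks int
	dryRun bool
}

// parseRollbackArgs parses the arguments that follow the "rollback" command.
func parseRollbackArgs(args []string) (rollbackOpts, error) {
	fs := flag.NewFlagSet("rollback", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	blocks := fs.Int("blocks", 1, "number of blocks to roll back (default is an assumption)")
	dryRun := fs.Bool("dry-run", false, "report what would happen without changing state")
	if err := fs.Parse(args); err != nil {
		return rollbackOpts{}, err
	}
	if *blocks < 1 {
		return rollbackOpts{}, fmt.Errorf("-blocks must be at least 1, got %d", *blocks)
	}
	return rollbackOpts{blocks: *blocks, dryRun: *dryRun}, nil
}

// run calls step once per block and returns how many blocks were rolled back.
// In dry-run mode step is never called.
func run(opts rollbackOpts, step func() error) (int, error) {
	done := 0
	for i := 0; i < opts.blocks; i++ {
		if opts.dryRun {
			continue
		}
		if err := step(); err != nil {
			return done, fmt.Errorf("rollback %d of %d: %w", i+1, opts.blocks, err)
		}
		done++
	}
	return done, nil
}

func main() {
	opts, err := parseRollbackArgs([]string{"-blocks", "3", "-dry-run"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// step is a placeholder for the real single-block rollback.
	n, err := run(opts, func() error { return nil })
	fmt.Println(n, err) // prints 0 <nil> because dry-run skips every step
}
```

Stopping at the first failed step and reporting how many blocks were already rolled back keeps a partial rollback visible to the operator, which matters when each step mutates both CometBFT state and PG rows.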
@raymondjacobson raymondjacobson changed the title from "Remove re-validation from FinalizeBlock to fix LastResultsHash mismatch" to "Add rollback script, fix deadline non-determinism" Mar 30, 2026
@raymondjacobson raymondjacobson merged commit 26c02c9 into main Mar 30, 2026
5 checks passed
@raymondjacobson raymondjacobson deleted the fix/finalize-block-no-revalidation branch March 30, 2026 22:19