
Replace direct-DB validator deregistration with consensus-based removal #143

Merged
raymondjacobson merged 6 commits into main from rj-uptime-fix
Mar 13, 2026

@raymondjacobson raymondjacobson commented Mar 12, 2026

The deregisterValidator() function was deleting rows from core_validators directly, outside of CometBFT consensus, which caused state drift across nodes. Production nodes were reporting different node counts in their chain state because each node independently decided when to delete entries. Moreover, depending on which node you state sync from, you may get different initial values in the validators table.
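
To make the drift concrete, here is a minimal sketch, not the project's actual code. The `node` type, its methods, and the validator names are all hypothetical stand-ins for a node's local copy of core_validators. It contrasts a deletion applied by one node on its own with a removal that every node applies from the same block.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for one chain node's local view of the
// core_validators table, keyed by validator address.
type node struct {
	validators map[string]bool
}

func newNode(addrs ...string) *node {
	n := &node{validators: map[string]bool{}}
	for _, a := range addrs {
		n.validators[a] = true
	}
	return n
}

// deleteLocally models the old behaviour: a node removes a row on its own
// schedule, and no other node is told about it.
func (n *node) deleteLocally(addr string) {
	delete(n.validators, addr)
}

// applyBlock models the consensus path: every node applies the same ordered
// list of removals, so the resulting tables match.
func (n *node) applyBlock(removals []string) {
	for _, addr := range removals {
		delete(n.validators, addr)
	}
}

// snapshot returns the validator addresses in sorted order for comparison.
func (n *node) snapshot() []string {
	out := make([]string, 0, len(n.validators))
	for a := range n.validators {
		out = append(out, a)
	}
	sort.Strings(out)
	return out
}

// sameState reports whether two nodes hold the same validator set.
func sameState(a, b *node) bool {
	return fmt.Sprint(a.snapshot()) == fmt.Sprint(b.snapshot())
}

func main() {
	// Old path: only node a has noticed the deregistration so far.
	a, b := newNode("v1", "v2", "v3"), newNode("v1", "v2", "v3")
	a.deleteLocally("v2")
	fmt.Println("direct delete, states agree:", sameState(a, b)) // false

	// Consensus path: the removal is part of a block both nodes apply.
	c, d := newNode("v1", "v2", "v3"), newNode("v1", "v2", "v3")
	block := []string{"v2"}
	c.applyBlock(block)
	d.applyBlock(block)
	fmt.Println("consensus removal, states agree:", sameState(c, d)) // true
}
```

A node that state syncs from `a` in the first scenario would start with two validators, while one syncing from `b` would start with three, which matches the symptom described above.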

In this PR:

  • Adds a remove flag to the ValidatorDeregistration proto message to distinguish between jailing (for underperformance) and removal (on eth L1 deregistration, which deletes the row entirely)
  • Introduces removeValidator(), which submits a deregistration attestation with remove=true through CometBFT consensus, ensuring all nodes process the deletion deterministically
  • Replaces both call sites (validator warden and eth event listener) that previously called deregisterValidator() with the new consensus-based removeValidator()
  • Fixes self-unjailing: jailed nodes that come back online can now re-register through consensus to unjail themselves, which was previously blocked because isSelfAlreadyRegistered returned true for jailed nodes (this was a bug introduced in my previous PRs)
  • Fixes the uptime page, which would crash for all nodes if one node had no rollup info
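
For readers who want a picture of how the remove flag and the self-unjailing fix fit together, the following is a rough sketch under stated assumptions. Only the message name ValidatorDeregistration and its remove flag come from the description above. The `CometAddress` field, the `state` and `validatorRow` types, and the `applyDeregistration`, `applyRegistration`, and `needsRegistration` helpers are hypothetical and may not match the real implementation.

```go
package main

import "fmt"

// ValidatorDeregistration mirrors the proto message named in the description.
// Only the Remove flag is taken from the PR; CometAddress is an assumed field.
type ValidatorDeregistration struct {
	CometAddress string
	Remove       bool
}

// validatorRow is a hypothetical stand-in for a core_validators row.
type validatorRow struct {
	Jailed bool
}

// state is a hypothetical stand-in for the replicated validator table.
type state struct {
	validators map[string]*validatorRow
}

// applyDeregistration is what every node would run once the transaction is
// finalized: Remove=true deletes the row, Remove=false only jails it.
func (s *state) applyDeregistration(msg ValidatorDeregistration) error {
	row, ok := s.validators[msg.CometAddress]
	if !ok {
		return fmt.Errorf("validator %s not found", msg.CometAddress)
	}
	if msg.Remove {
		delete(s.validators, msg.CometAddress)
		return nil
	}
	row.Jailed = true
	return nil
}

// applyRegistration unjails an existing row or creates a fresh one.
func (s *state) applyRegistration(addr string) {
	if row, ok := s.validators[addr]; ok {
		row.Jailed = false
		return
	}
	s.validators[addr] = &validatorRow{}
}

// needsRegistration sketches the corrected self-check (assumed shape): a
// jailed node is treated as not registered, so it can re-register and unjail.
func (s *state) needsRegistration(addr string) bool {
	row, ok := s.validators[addr]
	return !ok || row.Jailed
}

func main() {
	s := &state{validators: map[string]*validatorRow{"v1": {}, "v2": {}}}

	// Underperformance: jail v1 but keep its row.
	_ = s.applyDeregistration(ValidatorDeregistration{CometAddress: "v1"})
	fmt.Println("v1 needs registration:", s.needsRegistration("v1")) // true

	// v1 comes back online and re-registers through consensus.
	s.applyRegistration("v1")
	fmt.Println("v1 jailed:", s.validators["v1"].Jailed) // false

	// Eth L1 deregistration: remove v2 entirely.
	_ = s.applyDeregistration(ValidatorDeregistration{CometAddress: "v2", Remove: true})
	_, stillThere := s.validators["v2"]
	fmt.Println("v2 row present:", stillThere) // false
}
```

Because both branches run inside the state transition rather than in a background task, every node that applies the same transaction ends up with the same table.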

@raymondjacobson raymondjacobson changed the title from "Fix uptime UI and validator consistency" to "Replace direct-DB validator deregistration with consensus-based removal" Mar 12, 2026
@rickyrombo rickyrombo left a comment

Was the problem that deregistered state was not going through consensus or jailing?

@raymondjacobson

> Was the problem that deregistered state was not going through consensus or jailing?

deregistered state was not going through consensus, at all! there was a function to just watch eth events coming in and then straight up remove the node
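
As a rough illustration of the call-site change, assuming names that are not in the PR (`txSubmitter`, `deregistrationTx`, `onEthDeregistered`, and `recordingSubmitter` are all hypothetical), the event handler can hand the removal to consensus instead of touching the table itself:

```go
package main

import (
	"errors"
	"fmt"
)

// deregistrationTx is a hypothetical payload; only the remove flag is taken
// from the PR description.
type deregistrationTx struct {
	CometAddress string
	Remove       bool
}

// txSubmitter is a hypothetical abstraction over "send this to consensus".
type txSubmitter interface {
	Submit(tx deregistrationTx) error
}

// recordingSubmitter is a test double that records what would be broadcast.
type recordingSubmitter struct {
	sent []deregistrationTx
}

func (r *recordingSubmitter) Submit(tx deregistrationTx) error {
	r.sent = append(r.sent, tx)
	return nil
}

// onEthDeregistered sketches the listener side: it never deletes a row
// itself, it only submits a removal for all nodes to apply.
func onEthDeregistered(s txSubmitter, cometAddress string) error {
	if cometAddress == "" {
		return errors.New("empty validator address")
	}
	return s.Submit(deregistrationTx{CometAddress: cometAddress, Remove: true})
}

func main() {
	sub := &recordingSubmitter{}
	if err := onEthDeregistered(sub, "v2"); err != nil {
		fmt.Println("submit failed:", err)
		return
	}
	fmt.Printf("queued for consensus: %+v\n", sub.sent[0])
}
```

The handler has no access to the validator table at all in this sketch, so the only way for it to affect state is through a transaction that every node sees.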

@raymondjacobson raymondjacobson merged commit 32659c3 into main Mar 13, 2026
7 of 8 checks passed
@raymondjacobson raymondjacobson deleted the rj-uptime-fix branch March 13, 2026 04:06
2 participants