fix(core): B3b — recompute total_minted from blocks when blob is stale by satyakwok · Pull Request #712 · sentrix-labs/sentrix

satyakwok · 2026-05-25T11:25:16Z

load_blockchain runs Patch B3 to overwrite stale accounts from the canonical trie. But total_minted lives only in the bincode blob — no trie key, no per-block index. So when B3 fires (trie ahead of blob), accounts snap to canonical but total_minted stays at whatever the last save_blockchain wrote.

That alone wouldn't matter if save_blockchain were always called in lock-step with apply, but there are edges where it isn't: an offline chain.db moved between hosts via cp without first halting, a save that raced a halt window, or a B2 replay path where the per-block apply updates total_minted but the corrupted-from-the-start blob doesn't get rewritten until after B3.

In the 2026-05-24 mainnet incident, treasury and beacon ended up with identical accounts (B3 reconciled both from the same trie) but differing total_minted. STATE-FP fp = SHA256(acc + total_minted) then disagreed across two otherwise-identical validators. Same block content, same state-root in the trie, but the fingerprint trace fired.

This commit adds B3b right after B3:

recompute_total_minted_from_blocks(bc) walks every persisted block 1..=tip, sums block.coinbase().amount, and adds TOTAL_PREMINE. C-04 (block_executor.rs:336) enforces coinbase.amount == reward, so the chain has no other source of newly-minted supply — this sum is the canonical value of total_minted at the tip.
load_blockchain compares the blob value against the recomputed value and warns + overwrites on mismatch.
Save-back happens via the existing atomic B1 path when either B3 repaired an account OR B3b repaired total_minted.

Cost: O(N) MDBX block loads at boot. At mainnet h≈2.2M this is ~30–60s on warm SSD. Acceptable for a once-per-boot sanity pass that only writes back when divergence is detected. Missing blocks are skipped with a warning, matching the B3 fail-soft posture earned during the 2026-05-20 testnet trie-gap incident.

Two unit tests added:

test_recompute_total_minted_matches_block_sum — ground-truth check that the helper matches TOTAL_PREMINE + N * BLOCK_REWARD for a freshly-mined chain.
test_b3b_repairs_stale_total_minted_on_load — end-to-end via the production load_blockchain path: persist a blob with intentionally- stale total_minted, reload, assert the loaded value is canonical. Fails on main; passes after this commit.

Summary

Risk tier

Check ONE:

🟢 Low — docs, tools/, tests/, CI configs, dependency patch bumps in dev-only crates, comments
🟡 Medium — non-consensus production code (RPC handlers, network plumbing, observability, ops scripts)
🟠 High — consensus-critical crates (sentrix-core, sentrix-trie, sentrix-staking, sentrix-bft), block_executor, apply_block_*, state_root path
🔴 Critical — Voyager activation, fork-height changes, hard-fork rollouts, anything that flips env vars on mainnet

Required by tier

🟢 Low — minimum bar

CI green (tests + clippy + audit + gitleaks)

🟡 Medium — adds

New public function or behavioural change has at least one corresponding #[test] in same PR
Brief description of how this was tested (manual run, integration test, etc.)

🟠 High — adds

Regression test that fails on main and passes with this change — paste test name in PR body
Designed against documented invariant (link the audit/runbook/design doc)
Fresh-brain review by someone other than the author (per the consensus-change review checklist)
Single conceptual unit per PR (no bundling — bundling consensus changes burned us on v2.1.12 → 2026-04-25 livelock)

🔴 Critical — adds

Testnet rehearsal completed with success criteria + log evidence linked here
Bake window observed: minimum 2h on testnet at the same configuration before mainnet
Coordinated rollback plan documented in PR body — exact commands operator runs if it fails
Operator sign-off at activation moment (not just PR approval — separate moment for the actual flip)

Test plan

Rollback plan

Summary by CodeRabbit

Bug Fixes
- Added automatic validation and repair of the blockchain state on startup to correct inaccurate total-minted tracking.
- Repaired state is now persisted when either account reconciliation or total-minted correction occurs; logging clarifies both outcomes.
- Startup recovery is more resilient: missing persisted blocks are skipped with warnings instead of causing startup failure.

github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'sentrix-trie benches'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

Benchmark suite	Current: `9278cb3`	Previous: `c6ae261`	Ratio
`insert_single`	`172113` ns/iter (`± 15633`)	`83470` ns/iter (`± 10721`)	`2.06`

This comment was automatically generated by workflow using github-action-benchmark.

codecov · 2026-05-25T11:27:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

coderabbitai · 2026-05-25T11:28:46Z

📝 Walkthrough

Walkthrough

This PR adds load-time repair logic to detect and fix stale total_minted snapshots during blockchain initialization. A new helper function recompute_total_minted_from_blocks computes the canonical total by summing coinbase amounts from all persisted blocks plus genesis premine. The load_blockchain function now invokes this helper after trie reconciliation, compares the computed value against the persisted blob, logs a warning when they diverge, and extends the persist-repair flow to trigger whenever either account reconciliation or total_minted was stale. Tests validate that the helper matches block-sum ground truth and that load_blockchain successfully repairs corrupted snapshots.

Sequence Diagram(s)

sequenceDiagram
  participant Loader as load_blockchain
  participant Recompute as recompute_total_minted_from_blocks
  participant BlockStore as PersistedBlocks(MDBX)
  participant Blockchain as in-memory Blockchain

  Loader->>Recompute: request recomputed_total(tip, TOTAL_PREMINE)
  Recompute->>BlockStore: load block at height i
  BlockStore-->>Recompute: Block{coinbase.amount} or missing (warn & skip)
  Recompute-->>Loader: recomputed_total
  Loader->>Blockchain: compare recomputed_total vs persisted_blob.total_minted
  alt mismatch
    Loader->>Blockchain: overwrite bc.total_minted with recomputed_total
    Loader->>BlockStore: persist repaired state (if needed)
  else match
    Loader-->>Blockchain: no repair needed
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

sentrix-labs/sentrix#696: Modifies boot-time trie reconciliation that this PR relies on; both touch load-time reconciliation and persist-repair decisions.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: implementing Patch B3b to recompute total_minted from persisted blocks when the bincode blob snapshot is stale, addressing the root cause of state-fingerprint mismatches.
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering the problem context, specific solution details, performance implications, and test coverage. However, the formal template sections (Summary, Risk tier, Test plan, Rollback plan) are incompletely filled with minimal details under 'Test plan'.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/b3-reconcile-total-minted-from-blocks

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

`load_blockchain` runs Patch B3 to overwrite stale `accounts` from the canonical trie. But `total_minted` lives only in the bincode blob — no trie key, no per-block index. So when B3 fires (trie ahead of blob), `accounts` snap to canonical but `total_minted` stays at whatever the last save_blockchain wrote. That alone wouldn't matter if save_blockchain were always called in lock-step with apply, but there are edges where it isn't: an offline chain.db moved between hosts via cp without first halting, a save that raced a halt window, or a B2 replay path where the per-block apply updates `total_minted` but the corrupted-from-the-start blob doesn't get rewritten until after B3. In the 2026-05-24 mainnet incident, treasury and beacon ended up with identical `accounts` (B3 reconciled both from the same trie) but differing `total_minted`. `STATE-FP fp = SHA256(acc + total_minted)` then disagreed across two otherwise-identical validators. Same block content, same state-root in the trie, but the fingerprint trace fired. This commit adds B3b right after B3: * `recompute_total_minted_from_blocks(bc)` walks every persisted block 1..=tip, sums `block.coinbase().amount`, and adds `TOTAL_PREMINE`. C-04 (`block_executor.rs:336`) enforces `coinbase.amount == reward`, so the chain has no other source of newly-minted supply — this sum is the canonical value of `total_minted` at the tip. * `load_blockchain` compares the blob value against the recomputed value and warns + overwrites on mismatch. * Save-back happens via the existing atomic B1 path when either B3 repaired an account OR B3b repaired `total_minted`. Cost: O(N) MDBX block loads at boot. At mainnet h~2.2M this is ~30-60s on warm SSD. Acceptable for a once-per-boot sanity pass that only writes back when divergence is detected. Missing blocks are skipped with a warning, matching the B3 fail-soft posture earned during the 2026-05-20 testnet trie-gap incident. Two unit tests added: - `test_recompute_total_minted_matches_block_sum` — ground-truth check that the helper matches `TOTAL_PREMINE + N * BLOCK_REWARD` for a freshly-mined chain. - `test_b3b_repairs_stale_total_minted_on_load` — end-to-end via the production `load_blockchain` path: persist a blob with intentionally- stale `total_minted`, reload, assert the loaded value is canonical. Fails on main; passes after this commit.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/sentrix-core/src/storage.rs (1)

272-278: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Debug log format inconsistency.

The format string "{}/{} accounts checked" uses repaired as the first argument, which will print something like "0/150 accounts checked". This reads awkwardly — when nothing is repaired, the denominator-style fraction is misleading. Consider simplifying to just show the count checked:

📝 Suggested simplification

         } else {
             tracing::debug!(
-                "load_blockchain B3: {}/{} accounts checked, total_minted matches; \
+                "load_blockchain B3: {} accounts checked, total_minted matches; \
                  nothing to reconcile at height {}",
-                repaired,
                 checked,
                 bc.height()
             );

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/sentrix-core/src/storage.rs` around lines 272 - 278, The debug line in
function load_blockchain uses the format "{}/{} accounts checked" with repaired
as the first arg which reads awkwardly; update the tracing::debug call in
load_blockchain to simplify the message (e.g., "{} accounts checked" and pass
checked) or explicitly label both values (e.g., "repaired: {} checked: {}") and
pass repaired, checked, keeping the rest of the context (total_minted/height)
unchanged so the log reads clearly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/sentrix-core/src/storage.rs`:
- Around line 891-908: After deleting the MDBX entry, add an explicit assertion
that the persisted block entry is gone to ensure the test premise is exercised:
call storage.mdbx_arc() (same as used to delete), check TABLE_META for the key
corresponding to format!("block:{}", 2) (i.e. the raw key used b"block:2" in the
diff) using the MDBX read/get API and assert that the lookup returns None or an
empty result before running recompute_total_minted_from_blocks(&bc); this makes
the test fail clearly if the key schema or storage behavior changes.

---

Outside diff comments:
In `@crates/sentrix-core/src/storage.rs`:
- Around line 272-278: The debug line in function load_blockchain uses the
format "{}/{} accounts checked" with repaired as the first arg which reads
awkwardly; update the tracing::debug call in load_blockchain to simplify the
message (e.g., "{} accounts checked" and pass checked) or explicitly label both
values (e.g., "repaired: {} checked: {}") and pass repaired, checked, keeping
the rest of the context (total_minted/height) unchanged so the log reads
clearly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: a1ae5142-27d8-412c-92a7-c4e4c66b262d

📥 Commits

Reviewing files that changed from the base of the PR and between 9278cb3 and 2866b11.

📒 Files selected for processing (1)

crates/sentrix-core/src/storage.rs

coderabbitai · 2026-05-25T13:46:48Z

+        // Yank block #2 out of MDBX. The block stays in `bc.chain`
+        // (in-memory window) but the persisted entry is gone, so the
+        // recompute loop hits `load_block(2) -> Ok(None)` and counts
+        // the gap.
+        let mdbx = storage.mdbx_arc();
+        mdbx.delete(sentrix_storage::tables::TABLE_META, b"block:2")
+            .unwrap();
+
+        // Sum should be `TOTAL_PREMINE + 2 * BLOCK_REWARD` (block 1 +
+        // block 3; block 2 was deleted).
+        let recomputed = storage
+            .recompute_total_minted_from_blocks(&bc)
+            .unwrap();
+        assert_eq!(
+            recomputed,
+            TOTAL_PREMINE + 2 * BLOCK_REWARD,
+            "missing block must be silently skipped (fail-soft per B3 precedent)"
+        );


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check how ChainStorage stores blocks — the test assumes "block:{height}" in TABLE_META rg -n --type rust -C5 'save_block|TABLE_META|"block:' crates/sentrix-storage/src/

Repository: sentrix-labs/sentrix

Length of output: 24365

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '840,930p' crates/sentrix-core/src/storage.rs | nl -ba | sed -n '1,120p'

Repository: sentrix-labs/sentrix

Length of output: 107

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '840,930p' crates/sentrix-core/src/storage.rs | nl -ba | sed -n '1,120p'

Repository: sentrix-labs/sentrix

Length of output: 107

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '840,930p' crates/sentrix-core/src/storage.rs | cat -n | sed -n '1,130p'

Repository: sentrix-labs/sentrix

Length of output: 3457

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "fn\s+load_block\s*\(|load_block\(" crates/sentrix-core/src/storage.rs crates/sentrix-core/src -S

Repository: sentrix-labs/sentrix

Length of output: 2614

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '330,420p' crates/sentrix-core/src/storage.rs | cat -n

Repository: sentrix-labs/sentrix

Length of output: 4381

Add an assertion after the MDBX delete to ensure the missing-block premise is actually exercised

sentrix_storage::ChainStorage persists blocks in TABLE_META under keys format!("block:{}", index), so b"block:2" matches the current schema.

The test should still assert the block entry is gone, so if the key/schema ever changes the failure mode is explicit.

🛡️ Proposed guard assertion

mdbx.delete(sentrix_storage::tables::TABLE_META, b"block:2") .unwrap(); + // Verify delete succeeded — if this fails, the key format/schema is wrong. + assert!( + storage.load_block(2).unwrap().is_none(), + "delete must remove block:2 from MDBX for test to be valid" + ); + // Sum should be `TOTAL_PREMINE + 2 * BLOCK_REWARD` (block 1 + // block 3; block 2 was deleted).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sentrix-core/src/storage.rs` around lines 891 - 908, After deleting the MDBX entry, add an explicit assertion that the persisted block entry is gone to ensure the test premise is exercised: call storage.mdbx_arc() (same as used to delete), check TABLE_META for the key corresponding to format!("block:{}", 2) (i.e. the raw key used b"block:2" in the diff) using the MDBX read/get API and assert that the lookup returns None or an empty result before running recompute_total_minted_from_blocks(&bc); this makes the test fail clearly if the key schema or storage behavior changes.

Ships PRs #711 (trie prune race augment-walk), #712 (B3b total_minted self-heal), #713 (pay_one_signer sorted delegators), #714 (collect_reachable depth-aware empty check). See CHANGELOG entry for the per-PR narrative and the 2026-05-24 incident root-cause analysis.

github-actions Bot enabled auto-merge (squash) May 25, 2026 11:25

satyakwok self-assigned this May 25, 2026

github-actions Bot reviewed May 25, 2026

View reviewed changes

satyakwok force-pushed the fix/b3-reconcile-total-minted-from-blocks branch from 9278cb3 to 2866b11 Compare May 25, 2026 13:41

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

github-actions Bot merged commit 7e4b8cb into main May 25, 2026
18 checks passed

satyakwok mentioned this pull request May 25, 2026

chore: bump workspace 2.2.15 -> 2.2.16 #715

Merged

15 tasks

coderabbitai Bot mentioned this pull request May 25, 2026

fix(core): b3b uses emission schedule, not per-block disk reads #716

Merged

15 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(core): B3b — recompute total_minted from blocks when blob is stale#712

fix(core): B3b — recompute total_minted from blocks when blob is stale#712
github-actions[bot] merged 1 commit into
mainfrom
fix/b3-reconcile-total-minted-from-blocks

satyakwok commented May 25, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

github-actions Bot left a comment

Uh oh!

codecov Bot commented May 25, 2026

Uh oh!

coderabbitai Bot commented May 25, 2026 •

edited

Loading

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

satyakwok commented May 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Risk tier

Required by tier

🟢 Low — minimum bar

🟡 Medium — adds

🟠 High — adds

🔴 Critical — adds

Test plan

Rollback plan

Related

Summary by CodeRabbit

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

⚠️ Performance Alert ⚠️

Uh oh!

codecov Bot commented May 25, 2026

Codecov Report

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

satyakwok commented May 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading