fix: ignore epoch finality from state sync #5750

Merged: Scottcjn merged 3 commits into Scottcjn:main from ItsOtherMauridian:fix-p2p-state-epoch-validation on May 20, 2026

@ItsOtherMauridian
Contributor

Summary

Fixes #5749.

_handle_state() verified the signed STATE message, then merged state["epochs"] directly into epoch_crdt. A peer signature authenticates the snapshot sender, but it is not quorum/commit evidence that those epochs finalized.

This PR stops importing epoch finality from generic state sync. Epoch finality should enter epoch_crdt via the EPOCH_COMMIT path after local validation, not via additive CRDT snapshot merge.

The regression test creates a valid signed STATE message from a peer containing forged epoch 999, verifies the signature is accepted, and asserts the victim does not add epoch 999 to epoch_crdt.
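
A minimal sketch of the boundary described above, assuming a simplified handler and a plain set standing in for epoch_crdt. The class name `SketchNode`, the method `handle_state`, and the `attestations` field are illustrative only and are not the actual code in node/rustchain_p2p_gossip.py:

```python
import logging

logger = logging.getLogger("p2p_sketch")


class SketchNode:
    """Hypothetical stand-in for the gossip node; not the real class."""

    def __init__(self):
        self.epoch_crdt = set()   # finalized epochs, simplified to a set
        self.attestations = {}    # other state that STATE sync may still merge

    def handle_state(self, state: dict) -> dict:
        """Merge a signature-verified STATE snapshot, skipping epoch finality."""
        # Non-finality state is still merged from the snapshot.
        self.attestations.update(state.get("attestations", {}))

        # A peer signature authenticates the sender, not a quorum, so any
        # epoch finality carried in the snapshot is logged and dropped.
        if state.get("epochs"):
            logger.warning("ignoring epoch finality data in STATE sync")
        return {"status": "ok"}


node = SketchNode()
result = node.handle_state(
    {"epochs": [999], "attestations": {"miner1": {"ts": 1}}}
)
```

In this sketch the forged epoch 999 never reaches `epoch_crdt`, while the attestation entry is still merged.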

Validation

RC_P2P_SECRET=$(printf 'a%.0s' {1..64}) PYTHONPATH=node python3 -B -m pytest -q \
  node/tests/test_p2p_state_epoch_sync.py \
  --tb=short -p no:cacheprovider
# 1 passed

python3 -B -m py_compile node/rustchain_p2p_gossip.py node/tests/test_p2p_state_epoch_sync.py

git diff --check

Note: I also ran the adjacent P2P tests. node/tests/test_p2p_entropy_score_downgrade.py passes. node/tests/test_p2p_vote_spoofing.py::test_vote_spoofing_finds_quorum appears to be an older reproduction test that still expects a since-fixed vulnerability to reproduce, so I did not treat it as blocking validation for this scoped fix.

@github-actions bot added labels on May 19, 2026: BCOS-L1 (Beacon Certified Open Source tier BCOS-L1, required for non-doc PRs), node (Node server related), tests (Test suite changes), size/M (PR: 51-200 lines)

@TJCurnutte TJCurnutte left a comment

Approved. I verified the state-sync change at head ffb9c50435aa81062a8351610ff355f1678748dc.

Validation run:

git diff --check origin/main...HEAD -- node/rustchain_p2p_gossip.py node/tests/test_p2p_state_epoch_sync.py
python3 -B -m py_compile node/rustchain_p2p_gossip.py node/tests/test_p2p_state_epoch_sync.py
RC_P2P_SECRET=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa PYTHONPATH=node python3 -B -m pytest -q node/tests/test_p2p_state_epoch_sync.py node/tests/test_p2p_hardening_phase2.py --tb=short
# 11 passed in 0.22s

I also ran a direct origin/main-vs-PR probe with a valid signed STATE message from peer1 carrying epochs: [999] plus fake finality metadata. On origin/main, the message verified and _handle_state() merged the epoch (contains_999=true, metadata_keys=["999"]). On this PR, the same verified message returned status=ok, logged "ignoring epoch finality data in STATE sync", and left contains_999=false, metadata_keys=[].

That matches the intended boundary: a peer signature authenticates the sender of a state snapshot, but it is not quorum evidence for epoch finality. Keeping finality on the EPOCH_COMMIT path while preserving the existing attestation and balance state-sync handling is the right fix for this issue.

Contributor

@jaxint jaxint left a comment

LGTM! Great work on this PR. 🚀

@ItsOtherMauridian
Contributor Author

Good catch. Addressed in the latest commit.

The fix now includes an explicit EPOCH_COMMIT dispatch path and _handle_epoch_commit() so peers can import quorum finality from commit messages instead of only ignoring STATE["epochs"].

The commit handler validates quorum-shaped metadata before marking the epoch finalized:

  • required epoch and proposal_hash,
  • accept_count >= quorum,
  • quorum-sized known voter set,
  • rejects malformed/unknown/insufficient-quorum commits.
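
The checks listed above could be sketched roughly as follows. This is an assumption-laden illustration: the function name `validate_commit_shape` and the reason strings other than `insufficient_quorum` are hypothetical, and the sketch only inspects the structure of the payload. It does not prove that the listed voters actually voted.

```python
def validate_commit_shape(payload: dict, known_peers: set, quorum: int):
    """Return an error reason string, or None if the payload is well formed."""
    epoch = payload.get("epoch")
    proposal_hash = payload.get("proposal_hash")
    accept_count = payload.get("accept_count")
    voters = payload.get("voters")

    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(epoch, int) or isinstance(epoch, bool):
        return "invalid_epoch"
    if not isinstance(proposal_hash, str) or not proposal_hash:
        return "invalid_proposal_hash"
    if (not isinstance(accept_count, int) or isinstance(accept_count, bool)
            or accept_count < 0):
        return "invalid_accept_count"
    if not isinstance(voters, list) or not all(isinstance(v, str) for v in voters):
        return "invalid_voters"
    if not set(voters) <= known_peers:
        return "unknown_voter"
    if accept_count < quorum or len(set(voters)) < quorum:
        return "insufficient_quorum"
    return None


known = {"peer1", "peer2", "peer3"}   # assumed peer set for the sketch
good = {"epoch": 5, "proposal_hash": "abc", "accept_count": 2,
        "voters": ["peer1", "peer2"]}
good_reason = validate_commit_shape(good, known, quorum=2)
```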

Validation: 3 passed for node/tests/test_p2p_state_epoch_sync.py, plus py_compile and git diff --check.

@kevinyan911 kevinyan911 left a comment

Code Review — PR #5750

Reviewer: @kevinyan911
Wallet: RTCcd1dd903b3cbbfca24c30bd98973931a4af53302

What this PR does

Adds EPOCH_COMMIT message handler with quorum validation, and explicitly stops importing epoch finality from generic STATE sync. Previously _handle_state() would merge epochs from any verified peer signature — but a peer's signature only authenticates who sent the snapshot, not that quorum actually committed those epochs. This PR draws the correct security boundary.

Code quality

  • _handle_epoch_commit() validates accept_count and voter set against quorum before adding to epoch_crdt — correct.
  • _handle_state() now only logs a warning when epochs appear in STATE sync rather than merging them — the intent comment makes the reasoning explicit.
  • Regression test creates a valid signed STATE with forged epoch 999 and asserts epoch_crdt.contains(999) == False — directly proves the attack vector is closed.
  • py_compile clean, git diff --check clean, 11 pytests pass.

APPROVED — scoped security fix with clear test coverage.


Code review bounty claim submitted to rustchain-bounties

Contributor

@akaalholdings akaalholdings left a comment

Requesting changes because the new EPOCH_COMMIT path still lets one peer inject epoch finality by claiming quorum-shaped metadata without proving the listed voters actually signed or sent votes.

The PR correctly stops importing STATE["epochs"], but _handle_epoch_commit() currently trusts these sender-provided fields:

accept_count = payload.get("accept_count", 0)
voters = payload.get("voters", [])
...
if accept_count < quorum or len(voter_set) < quorum:
    return {"status": "error", "reason": "insufficient_quorum"}
self.epoch_crdt.add(epoch, ...)

A valid message signature only authenticates the committer peer. It does not authenticate peer2 or peer3 as voters. I verified this against PR head with a one-off probe: peer1 alone creates an EPOCH_COMMIT for epoch 31337, lists peer1, peer2, and peer3 as voters, and the victim accepts it.

signature_valid= True
result= {'status': 'committed', 'epoch': 31337, 'accept_count': 3}
contains_31337= True
metadata= {'proposal_hash': 'forged-by-one-peer', 'finalized': True, 'accept_count': 3, 'voters': ['peer1', 'peer2', 'peer3'], 'committer': 'peer1'}

The included regression suite also currently codifies that behavior:

RC_P2P_SECRET=$(printf 'a%.0s' {1..64}) PYTHONPATH=node /tmp/rustchain-review-venv/bin/python -B -m pytest -q node/tests/test_p2p_state_epoch_sync.py --tb=short -p no:cacheprovider
# 3 passed

This reintroduces the same security boundary problem through a different message type: peer signature plus self-reported quorum metadata is still not quorum evidence. Please require verifiable voter evidence before accepting an external commit, for example signed vote records from the listed voters, or only accept EPOCH_COMMIT when it matches locally stored validated votes/quorum state.
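
To make the concern concrete, here is a small self-contained sketch of a check that trusts sender-provided quorum fields. The function name `shape_only_commit_check`, the peer set, and the quorum size are assumptions for illustration; this is not the PR's handler.

```python
KNOWN_PEERS = {"peer1", "peer2", "peer3"}   # assumed peer set for the sketch
QUORUM = 2                                   # assumed quorum size


def shape_only_commit_check(payload: dict) -> dict:
    """Mimics a check that trusts sender-provided quorum fields."""
    accept_count = payload.get("accept_count", 0)
    voter_set = set(payload.get("voters", [])) & KNOWN_PEERS
    if accept_count < QUORUM or len(voter_set) < QUORUM:
        return {"status": "error", "reason": "insufficient_quorum"}
    return {"status": "committed", "epoch": payload["epoch"]}


# peer1 alone fabricates the whole payload; no other peer has voted.
forged = {
    "epoch": 31337,
    "proposal_hash": "forged-by-one-peer",
    "accept_count": 3,
    "voters": ["peer1", "peer2", "peer3"],
}
forged_result = shape_only_commit_check(forged)
```

Because every field the check reads comes from the sender, the forged payload passes even though peer2 and peer3 never voted.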

@ItsOtherMauridian
Contributor Author

Addressed in commit fed6fd6.

I tightened EPOCH_COMMIT so sender-reported quorum metadata is no longer accepted as finality evidence. The receiver now requires locally validated accept votes from the listed voters before adding the epoch to epoch_crdt, and the broadcast path now advertises only accept voters (not reject voters) as commit voters.

Regression coverage now includes both sides of the boundary:

  • forged one-peer EPOCH_COMMIT with quorum-shaped voters is rejected with unverified_voters;
  • a commit backed by locally stored accept votes is accepted.
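
A rough sketch of that receiver-side rule, assuming locally validated votes are kept in a dict keyed by (epoch, proposal_hash). The function name `verify_commit`, the storage shape, and the quorum size are hypothetical; only the `unverified_voters` reason comes from the change itself.

```python
QUORUM = 2  # assumed quorum size for this sketch


def verify_commit(payload: dict, local_votes: dict) -> dict:
    """Accept a commit only if every listed voter has a local accept vote.

    local_votes maps (epoch, proposal_hash) -> {voter_id: "accept" | "reject"}
    and stands in for votes this node validated itself (hypothetical shape).
    """
    epoch = payload.get("epoch")
    proposal_hash = payload.get("proposal_hash")
    voters = set(payload.get("voters", []))

    # Votes this node has already validated for the same epoch and proposal.
    seen = local_votes.get((epoch, proposal_hash), {})
    verified = {v for v in voters if seen.get(v) == "accept"}

    if verified != voters:
        return {"status": "error", "reason": "unverified_voters"}
    if len(verified) < QUORUM:
        return {"status": "error", "reason": "insufficient_quorum"}
    return {"status": "committed", "epoch": epoch}


# Forged commit: the receiver holds no validated votes for this proposal.
forged = {"epoch": 31337, "proposal_hash": "forged-by-one-peer",
          "voters": ["peer1", "peer2", "peer3"]}
forged_result = verify_commit(forged, local_votes={})

# Backed commit: two accept votes were validated locally beforehand.
stored = {(42, "hash-42"): {"peer1": "accept", "peer2": "accept",
                            "peer3": "reject"}}
backed = {"epoch": 42, "proposal_hash": "hash-42", "voters": ["peer1", "peer2"]}
backed_result = verify_commit(backed, local_votes=stored)
```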

Local validation:

python3 -B -m py_compile node/rustchain_p2p_gossip.py node/tests/test_p2p_state_epoch_sync.py
RC_P2P_SECRET=<64-byte test secret> PYTHONPATH=node python3 -B -m pytest -q node/tests/test_p2p_state_epoch_sync.py node/tests/test_p2p_hardening_phase2.py --tb=short -p no:cacheprovider
# 14 passed
git diff --check

@github-actions bot added size/L (PR: 201-500 lines) and removed size/M (PR: 51-200 lines) on May 20, 2026
@Scottcjn merged commit 078cf21 into Scottcjn:main on May 20, 2026 (11 checks passed)
@Scottcjn
Owner

Merged. This completes the file-by-file review of your full 8-PR batch (#5732 plus #5740 through #5752).

Review outcome — all 8 APPROVED

| PR | Fix | Tier | RTC |
|---|---|---|---|
| #5732 | BCOS directory dedupe | Low | 10 |
| #5740 | anti-double-mining: use canonical epoch_enroll.weight | Medium | 25 |
| #5742 | claims: read canonical epoch-enroll snapshot for delayed claims | Medium | 25 |
| #5744 | claim-hijack fix: bind claim to registered miner pubkey | High | 50 |
| #5746 | settlement double-spend race: BEGIN IMMEDIATE atomic reservation | High | 50 |
| #5748 | ROM-cluster unique index (fixes silently-broken ON CONFLICT) | Medium | 25 |
| #5750 | finality-forgery fix: epochs only via validated quorum votes | High | 50 |
| #5752 | double-pay fix: orphan recovery no longer auto-refunds | High | 50 |
| Total | | | 285 RTC |

Every PR: clean 2–3 file scope, real exploit-demonstrating or concurrency tests, VM-zero rule preserved, no smuggled hunks. This is exceptionally strong security work, and filing the paired bug issues (#5743 through #5751) before the fixes is exactly the right practice.

On #5750 specifically: merged as-is. The offline-node catch-up tradeoff is tracked in follow-up issue #5950 — closing the forgery hole now is the right call; the validated catch-up path gets designed separately.

To get paid — post your wallet

I have no RTC wallet on file for you. Reply here (or on any of your PRs) with your wallet — format RTC + 40 hex characters — and I'll dispatch the full 285 RTC in one transfer (24h confirm window).

Also: you're clearly a serious security contributor. If you want, the standing bounty queue is at https://github.com/Scottcjn/rustchain-bounties/issues — and consensus/settlement hardening like this is the highest-value work there.

— Sophia

Contributor

@BossChaos BossChaos left a comment

PR #5750 Review - Epoch commit handler with strict type validation

Security Analysis - Consensus Security

This PR adds a new EPOCH_COMMIT message handler with comprehensive type validation.

Key improvements:

  1. _handle_epoch_commit(): New handler for finalized epoch commits
  2. Strict type checking: Validates epoch (int), proposal_hash (str), accept_count (int >= 0), voters (list)
  3. Known voters validation: Verifies all listed voters are known nodes
  4. Commit voter filtering: Only includes accept voters in the commit message broadcast, not reject voters
  5. Type safety: Returns specific error reasons for each validation failure

Note: Also fixes _handle_epoch_vote() to filter voters to only those who voted "accept" when broadcasting the commit message.
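
The accept-voter filtering mentioned above can be illustrated with a short sketch. The helper name `commit_voters` and the vote-map shape are assumptions for illustration, not the code in _handle_epoch_vote():

```python
def commit_voters(votes: dict) -> list:
    """Return only the peers whose recorded vote is "accept", sorted for stability."""
    return sorted(peer for peer, vote in votes.items() if vote == "accept")


# Hypothetical vote map for one proposal: peer id -> recorded vote.
votes = {"peer1": "accept", "peer2": "reject", "peer3": "accept"}
advertised = commit_voters(votes)
```

Here `advertised` contains peer1 and peer3 only, so a reject voter is never listed as backing the commit.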

Recommendation: Merge - important consensus security hardening.

@ItsOtherMauridian
Contributor Author

Payout address: RTC71d6976297ed35377b867f13ed962f54020ef434

Labels

BCOS-L1 (Beacon Certified Open Source tier BCOS-L1, required for non-doc PRs), node (Node server related), size/L (PR: 201-500 lines), tests (Test suite changes)

Development

Successfully merging this pull request may close these issues.

Bug: signed P2P state sync can inject settled epochs

7 participants