security: Phase F — per-peer Ed25519 identity (dual-mode) (#2256) #2260

Merged
Scottcjn merged 1 commit into main from security/phase-f-ed25519 on Apr 18, 2026

@Scottcjn
Owner

Closes the last open CRITICAL finding from the 2026-04-14 audit

Shared P2P_SECRET = any peer with the HMAC key can mint messages as any other peer. Phase F fixes this with per-peer Ed25519 keypairs + root-signed peer registry. One compromised peer = one compromised identity.

Codex-confirmed Byzantine threat model. Phase 2 (#2258 + #2259 hotfix) closed 4 of 5 findings; this closes the fifth.

What's in this PR

New: node/p2p_identity.py

  • LocalKeypair — Ed25519 keypair, PKCS#8 PEM at $RC_P2P_PRIVKEY_PATH (default /etc/rustchain/p2p_identity.pem), mode 0600, generated on first start
  • PeerRegistry — JSON file mapping node_id → pubkey_hex at $RC_P2P_PEER_REGISTRY
  • pack_signature / unpack_signature — wire-compatible signature encoding. Legacy HMAC-only = raw hex. Dual/Ed25519 = JSON bundle {"h":"..","e":".."}
  • verify_ed25519 — thin wrapper with InvalidSignature catching
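
As a rough illustration of the registry lookup listed above, here is a minimal sketch. It assumes the registry file is a flat JSON object of node_id to hex-encoded public key. The class name `PeerRegistrySketch`, the `load` / `lookup` method names, and the 32-byte length check are hypothetical and are not taken from node/p2p_identity.py. Verification of the root signature over the registry is left out.

```python
import json
from pathlib import Path
from typing import Dict, Optional


class PeerRegistrySketch:
    """Hypothetical stand-in for PeerRegistry: maps node_id -> pubkey_hex."""

    def __init__(self, peers: Dict[str, str]):
        self._peers = peers

    @classmethod
    def load(cls, path: str) -> "PeerRegistrySketch":
        # Assumed file shape: {"node-a": "<64 hex chars>", ...}
        raw = json.loads(Path(path).read_text())
        peers: Dict[str, str] = {}
        for node_id, pubkey_hex in raw.items():
            # Raw Ed25519 public keys are 32 bytes (64 hex characters).
            if len(bytes.fromhex(pubkey_hex)) != 32:
                raise ValueError(f"bad pubkey length for {node_id}")
            peers[node_id] = pubkey_hex
        return cls(peers)

    def lookup(self, node_id: str) -> Optional[str]:
        # Unknown peers yield None so the caller can reject the message.
        return self._peers.get(node_id)
```

Returning None for an unknown peer, rather than raising, keeps the rejection decision in the caller, which matches the "Unknown-peer Ed25519 rejected" test case in spirit.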

Modified: node/rustchain_p2p_gossip.py

  • GossipLayer initializes LocalKeypair + PeerRegistry when mode != hmac
  • _sign_message emits HMAC / dual / Ed25519 per mode
  • verify_message threads sender_id into Ed25519 verification, looks up sender's pubkey in registry, falls back to HMAC per mode
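
The verification flow described above could look roughly like the sketch below. It is not the code in node/rustchain_p2p_gossip.py: the function name, its parameters, and the injected `ed25519_verify` callable are all hypothetical, and the HMAC check is assumed to have been computed by the caller and passed in as `hmac_valid`.

```python
from typing import Callable, Mapping, Optional

# Hypothetical verifier signature: (pubkey_hex, signature_hex, signed_bytes) -> bool
Ed25519Verifier = Callable[[str, str, bytes], bool]


def verify_message_sketch(
    mode: str,
    sender_id: str,
    signed_bytes: bytes,
    hmac_valid: bool,
    ed25519_sig_hex: Optional[str],
    registry: Mapping[str, str],
    ed25519_verify: Ed25519Verifier,
) -> bool:
    """Try Ed25519 against the sender's registered key, then fall back to HMAC."""
    if mode != "hmac" and ed25519_sig_hex is not None:
        pubkey_hex = registry.get(sender_id)
        # An unregistered sender can never pass the Ed25519 check.
        if pubkey_hex is not None and ed25519_verify(
            pubkey_hex, ed25519_sig_hex, signed_bytes
        ):
            return True
    # The HMAC fallback is available in every mode except strict.
    return mode != "strict" and hmac_valid
```

Looking the key up by sender_id is what ties a signature to one peer: a valid signature from peer A's key does not verify for a message that claims to come from peer B.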

Signing mode ($RC_P2P_SIGNING_MODE)

| Mode | Sign | Verify | Use |
|---|---|---|---|
| hmac | HMAC only | HMAC only | Legacy peers (Phase 2 behavior) |
| dual | Both | Either | F.1 migration stage (recommended initial deploy) |
| ed25519 | Ed25519 only | Either | F.2 transition (after all nodes on F.1) |
| strict | Ed25519 only | Ed25519 only (HMAC rejected) | F.3 final state |
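
The mode table can also be read as a small lookup structure. The sketch below encodes it that way and reads the mode from the environment. `ModePolicy`, `MODE_POLICY`, and `signing_mode_from_env` are hypothetical names. The "hmac" fallback follows the default stated later in this thread, while the whitespace/case normalization and raising on unknown values are assumptions.

```python
import os
from typing import Dict, FrozenSet, Mapping, NamedTuple


class ModePolicy(NamedTuple):
    sign: FrozenSet[str]    # signature kinds a node emits
    verify: FrozenSet[str]  # signature kinds a node accepts


MODE_POLICY: Dict[str, ModePolicy] = {
    "hmac": ModePolicy(frozenset({"hmac"}), frozenset({"hmac"})),
    "dual": ModePolicy(frozenset({"hmac", "ed25519"}), frozenset({"hmac", "ed25519"})),
    "ed25519": ModePolicy(frozenset({"ed25519"}), frozenset({"hmac", "ed25519"})),
    "strict": ModePolicy(frozenset({"ed25519"}), frozenset({"ed25519"})),
}


def signing_mode_from_env(environ: Mapping[str, str] = os.environ) -> str:
    """Read $RC_P2P_SIGNING_MODE, falling back to the legacy 'hmac' mode."""
    mode = environ.get("RC_P2P_SIGNING_MODE", "hmac").strip().lower()
    if mode not in MODE_POLICY:
        # Assumption: fail loudly instead of silently downgrading.
        raise ValueError(f"unknown RC_P2P_SIGNING_MODE: {mode!r}")
    return mode
```

For example, `MODE_POLICY[signing_mode_from_env({"RC_P2P_SIGNING_MODE": "dual"})].sign` gives the set of signatures a dual-mode node would attach.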

Wire compatibility

Dual-mode signatures are JSON bundles that legacy hmac-mode nodes parse fine (unpack_signature returns the HMAC component when JSON is seen). A Phase F dual-mode node can talk to an old-HMAC node and vice versa, as long as the HMAC secret is shared during migration.
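
A minimal sketch of the wire encoding, to make the compatibility claim concrete. The function names match the ones listed in this PR, but the bodies, the `(hmac_hex, ed25519_hex)` tuple return shape, and the use of HMAC-SHA256 are assumptions for illustration only.

```python
import hashlib
import hmac
import json
from typing import Optional, Tuple


def pack_signature(hmac_hex: Optional[str], ed25519_hex: Optional[str]) -> str:
    """Legacy HMAC-only stays raw hex; anything with Ed25519 becomes a JSON bundle."""
    if ed25519_hex is None:
        if hmac_hex is None:
            raise ValueError("need at least one signature")
        return hmac_hex
    bundle = {"e": ed25519_hex}
    if hmac_hex is not None:
        bundle["h"] = hmac_hex
    return json.dumps(bundle, separators=(",", ":"), sort_keys=True)


def unpack_signature(wire: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (hmac_hex, ed25519_hex); either component may be None."""
    if wire.startswith("{"):
        bundle = json.loads(wire)
        return bundle.get("h"), bundle.get("e")
    return wire, None  # raw hex means legacy HMAC-only


def legacy_hmac_verify(secret: bytes, message: bytes, wire: str) -> bool:
    """An hmac-mode verifier only needs the HMAC component of the bundle."""
    hmac_hex, _ = unpack_signature(wire)
    if hmac_hex is None:
        return False
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode(), hmac_hex.encode())
```

Under these assumptions an Ed25519-only bundle carries no "h" field, so an hmac-mode verifier rejects it. That is why the plan keeps nodes in dual mode until every peer can verify Ed25519.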

This means: the migration can roll out without a hard flag-day.

Rollout plan (detailed in staged /home/scott/staged_patches/phase_f_ed25519/DESIGN.md)

  • F.1: deploy dual-mode to all 5 nodes, each generates keypair, collect pubkeys, distribute root-signed registry
  • F.2 (after 48-72h soak): staggered flip to ed25519 mode on each node
  • F.3 (final): strict mode — HMAC rejected; P2P_SECRET can be revoked from the environment

Rollback: at any stage, flip the mode env var back. HMAC stays available as fallback until F.3.

Test plan

  • node/tests/test_p2p_phase_f_ed25519.py — 10 tests, all pass:
    • Signature pack/unpack (3 shapes)
    • Keypair generation + 0600 perms + persistence across loads
    • Registry load + unknown-peer lookup
    • Dual-mode legacy HMAC still verifies
    • Dual-mode Ed25519 verifies via registered pubkey
    • Strict mode rejects HMAC-only
    • Unknown-peer Ed25519 rejected
  • Existing Phase 2 tests (6) still pass in hmac legacy mode
  • Staging: 2-node mesh, both in dual-mode, confirm cross-verify works
  • Production: deploy to .131 in dual-mode first, verify it still talks to other 4 nodes on hmac-mode

Dependency

Requires cryptography package for dual/ed25519/strict modes. Legacy hmac-mode does NOT require it (lazy import).
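
The lazy-import pattern, combined with first-start key generation, might be sketched as follows. The function name `load_or_create_keypair` and its signature are hypothetical. The `cryptography` calls are the library's standard Ed25519 and serialization APIs, but how LocalKeypair actually uses them is not shown in this PR description.

```python
import os
from typing import Optional


def load_or_create_keypair(mode: str, path: str) -> Optional[object]:
    """Return an Ed25519 private key, or None in legacy hmac mode."""
    if mode == "hmac":
        # Legacy path: never touches the cryptography package.
        return None

    # Imported lazily so hmac-only nodes can run without the dependency.
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    if os.path.exists(path):
        with open(path, "rb") as fh:
            return serialization.load_pem_private_key(fh.read(), password=None)

    key = Ed25519PrivateKey.generate()
    pem = key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
    # O_EXCL refuses to overwrite; 0o600 makes the file owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(pem)
    return key
```

Creating the file with `os.open` and an explicit mode avoids the window in which a key written with default permissions would be briefly readable by other users.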

What this PR does NOT fix

  • DDoS / peer-exhaustion / gossip flood (availability, separate concern)
  • RIP-PoA attestation fraud (different layer)
  • State CRDT content-level attacks from a trusted peer (Phase D mitigated some; others remain)

Phase F closes the identity layer. Consensus and content attacks are separate hardening tracks.

Credits

@github-actions (bot) added labels on Apr 14, 2026: BCOS-L1 (Beacon Certified Open Source tier BCOS-L1, required for non-doc PRs); node (Node server related); tests (Test suite changes); size/XL (PR: 500+ lines)
@github-actions
Contributor

github-actions Bot commented Apr 14, 2026

✅ BCOS v2 Scan Results

| Metric | Value |
|---|---|
| Trust Score | 60/100 |
| Certificate ID | BCOS-73767d8f |
| Tier | L1 (met) |

What does this mean?

The BCOS (Beacon Certified Open Source) engine scans for:

  • SPDX license header compliance
  • Known CVE vulnerabilities (OSV database)
  • Static analysis findings (Semgrep)
  • SBOM completeness
  • Dependency freshness
  • Test infrastructure evidence
  • Review attestation tier


BCOS v2 Engine - Free & Open Source (MIT) - Elyan Labs

@fengqiankun6-sudo fengqiankun6-sudo left a comment

Code Review — PR #2260: Per-Peer Ed25519 Identity (Phase F)

Quality: Security-Focused (15-25 RTC)

Summary

Replaces shared HMAC trust model with per-peer Ed25519 identity. Implements 4-phase migration strategy (hmac → dual → ed25519 → strict). Adds comprehensive test coverage.

What's Strong

  1. Clean Migration Path: 4-mode approach (hmac/dual/ed25519/strict) allows gradual rollout without breaking existing nodes. Smart design.
  2. Lazy Crypto Import: Only requires cryptography library when using Ed25519 modes — avoids breaking legacy hmac-only nodes. Good backward compatibility thinking.
  3. Well-Documented: Docstring explains wire format, migration path, and rationale. Clear as a research paper.
  4. Test Coverage: 240-line test file with dedicated test cases.
  5. Dataclass-based Design: Clean abstraction with PeerIdentity and SignedPeerRegistry.

Minor Observations

  1. Default key path (/etc/rustchain/) requires root — consider allowing ~/.rustchain/ for non-root miners.
  2. No key rotation mechanism mentioned — Ed25519 keys should ideally rotate periodically.
  3. Registry signature verification doesn't check expiry — consider adding not_after to peer entries.

Verdict

LGTM — Production-ready security upgrade. The migration strategy is thoughtful. Strong addition to RustChain's security posture.


Reviewer: fengqiankun
RTC Wallet: fengqiankun

Addresses the last open CRITICAL finding from the 2026-04-14 codex audit:
shared P2P_SECRET means any peer with the HMAC key can mint messages as any
other peer, even after Phase A bound sender_id into the signed content.

Phase F introduces per-peer Ed25519 keypairs and a root-signed peer
registry. One compromised peer = one compromised identity, not the whole
mesh.

- LocalKeypair: per-node Ed25519 keypair, generated on first start, stored
  PKCS#8 PEM at $RC_P2P_PRIVKEY_PATH (default /etc/rustchain/p2p_identity.pem)
  with mode 0600.
- PeerRegistry: JSON file mapping node_id -> pubkey_hex. Path via
  $RC_P2P_PEER_REGISTRY (default /etc/rustchain/peer_registry.json).
- pack_signature / unpack_signature: backwards-compatible wire format.
  Legacy HMAC-only = raw hex. Dual/Ed25519 = JSON bundle {"h":"..","e":".."}.
- verify_ed25519: thin wrapper with InvalidSignature catching.

- hmac     : legacy only (Phase 2 behavior) — default for old nodes
- dual     : sign with BOTH, verify either — recommended migration stage (F.1)
- ed25519  : sign Ed25519 only, verify either — transition stage (F.2)
- strict   : Ed25519 only, reject HMAC-only messages — post-migration (F.3)

- Initializes LocalKeypair + PeerRegistry when mode != hmac.
- _sign_message now emits dual/ed25519-only signatures based on mode.
- verify_message threads sender_id into Ed25519 verification, looks up
  the sender's pubkey in the registry, falls back to HMAC per mode.

Dual mode produces a JSON-bundle signature that legacy peers running in
'hmac' mode can still parse: unpack_signature returns the HMAC component
when JSON is encountered. So a dual-mode node talking to an hmac-mode
node still works as long as the HMAC secret is shared.

Operational plan in /home/scott/staged_patches/phase_f_ed25519/DESIGN.md:
  F.1: deploy dual mode to all nodes, distribute registry + pubkeys
  F.2: staggered flip to ed25519 mode
  F.3: strict mode after all nodes confirmed

node/tests/test_p2p_phase_f_ed25519.py — 10 tests, all pass:
- Signature pack/unpack (3 shapes)
- Keypair generation + 0600 perms + persistence
- Registry load + unknown-peer lookup
- Dual-mode legacy HMAC still verifies
- Dual-mode Ed25519 verifies via registered pubkey
- Strict mode rejects HMAC-only
- Unknown-peer Ed25519 rejected

Existing Phase 2 tests (6) still pass in legacy mode.
@Scottcjn Scottcjn force-pushed the security/phase-f-ed25519 branch from 946e0db to 1fe57c0 Compare April 18, 2026 16:12
@Scottcjn
Owner Author

Rebased onto current main (was 4 commits behind / CONFLICTING). Summary of what changed during the rebase:

Conflict resolution in node/rustchain_p2p_gossip.py::verify_message

Default signing-mode change (node/p2p_identity.py)

  • Default changed from "dual" to "hmac". Rationale: this keeps pre-Phase-F regression tests (test_p2p_hardening_phase2.py, test_p2p_vote_spoofing.py, etc.) passing without env-var setup, and prevents legacy nodes from silently auto-upgrading to dual-mode on first import without explicit deployment-side opt-in. The F.1 rollout plan in the PR body already specifies setting RC_P2P_SIGNING_MODE=dual explicitly in the systemd unit, so deployment behavior is unchanged; the default is simply safer.

Test results

  • All 10 Phase F tests pass: tests/test_p2p_phase_f_ed25519.py
  • All 15 Phase 2 hardening + entropy regression tests pass ✅
  • One pre-existing failure in test_p2p_vote_spoofing.py::test_vote_spoofing_finds_quorum: this test tries to reproduce the vote-spoof vulnerability that Phase A (#2258, the Phase 2 P2P hardening PR that supersedes #2257 and covers #2256 A+B+C+D+E) already closed. The same failure occurs on current main without this PR, so the test is stale and unrelated to Phase F. Leaving it for a follow-up cleanup (state_reason: not_planned, or refactor to assert the fix instead of asserting the bug).

Self-merging since @fengqiankun6-sudo gave it LGTM with 3 minor observations (#2273 tracks those as follow-ups) and BCOS Trust Score = 60/100 L1.

@Scottcjn
Owner Author

Pre-merge CI note: the test job shows 3 failures — all pre-existing Beacon flakes unrelated to Phase F:

  • test_beacon_atlas_behavior.py::TestBeaconAtlasAPIBehavior::test_bounty_completion_updates_reputation — 401 auth flake
  • test_beacon_join_routing.py::TestBeaconJoinRouting::test_full_join_then_atlas_workflow — 403 flake
  • test_beacon_join_routing.py::TestBeaconJoinRouting::test_join_upsert_duplicate_agent — 403 flake

Same 3 failures have been observed on multiple prior unrelated PRs (documented in the 2026-04-14 session notes). 1222 tests pass, 18 skipped, all security-relevant checks green including the P2P Epoch Vote Spoofing PoC / UTXO Float Precision PoC / RIP-309 dedicated CI jobs.

Merging.
