docs(rfc): lock RFC-21 Phase-3 design decisions #3967
Merged
mswilkison merged 1 commit into feat/frost-schnorr-migration-scaffold on May 23, 2026
Promotes the resolved Phase-3 design decisions (settled in the 2026-05-22 cross-team review) from the Open Questions section into a dedicated Resolved Decisions section. Four targeted edits:

1. Cross-process coordinator agreement -- replaces the all-to-all-with-local-union recommendation (which silently assumed synchronous gossip) with coordinator-proposed aggregation on a dedicated topic, signed with the operator key, with receiver-side bundle verification for censorship detection. Documents the rejected alternatives and the liveness/safety properties.
2. AttemptSeed source -- the DkgGroupPublicKey input to the seed derivation comes from the FFI signer material at attempt construction time, not from a wallet registry lookup. Removes hot-path async coupling and respects layering between core signing and application state.
3. SelectCoordinator seed bridging -- BeginAttempt wraps the legacy int64-seeded SelectCoordinator with a sterile, named adapter that folds the new [32]byte AttemptSeed into the legacy parameter shape. Bridge is exhaustively tested so later edits cannot accidentally desynchronise it.
4. Silence-parking transience -- Layer B exclusion policy now states explicitly that silence-based parking is single-attempt only with no escalation, so a peer falsely labelled silent (late delivery, coordinator censorship) is reinstated by the very next attempt. Permanent exclusion only follows from overflow or non-transport reject events, neither of which can fire on a slow-but-honest peer.

Also: removes a stale "(see open question 1)" reference in Layer A, and adds compact decision blocks for the remaining Phase-3 questions (signer-material binding, key reuse, JSON format, message-size budget).

Open questions reduced to three: persistence across restart (Phase 5+), FFI surface guidance (follows L5 pattern from PR #425 / #3961), and AttemptContextHash backward-compat horizon (Phase 6+).

No code changes. Implementation PRs reference these decisions in their descriptions.
Merged commit 6214eec into feat/frost-schnorr-migration-scaffold
15 checks passed
3 tasks
mswilkison added a commit that referenced this pull request on May 23, 2026
…idge (#3968)

## Summary

First Phase-3 implementation PR for **RFC-21**. Introduces the ROAST coordinator state-machine surface (`Coordinator` interface, in-memory implementation, attempt-handle identity, state enum) plus the sterile seed-folding adapter that lets the new `[32]byte` `AttemptSeed` drive the legacy `SelectCoordinator` helper without modifying it.

**No production code path uses the new `Coordinator` yet.** Phase 3 "ships unused" per the RFC. Phase 4 wires it into receivers behind the `frost_roast_retry` build tag.

## What lands

### `pkg/frost/roast/coordinator_state.go`

| Surface | Role |
|---|---|
| `AttemptState` enum | `Pending / Collecting / Aggregating / Succeeded / Transitioned` with `String()`. |
| `AttemptHandle` | Opaque per-attempt identity. `ContextHash()` accessor cross-checks the bound context. |
| `Coordinator` interface | `BeginAttempt(ctx) → handle`, `State(handle) → state`, `SelectedCoordinator(handle) → member`. Later Phase-3 PRs (3.2 / 3.3 / 3.4) extend with `TransitionMessage`, `AggregateBundle`, `VerifyBundle`, and `NextAttempt`. |
| `NewInMemoryCoordinator()` | Concurrent-safe via `sync.Mutex` + `atomic.Uint64` next-id counter. |
| `ErrUnknownAttempt` | Sentinel for handle/instance mismatch. |

### `pkg/frost/roast/seed_bridge.go`

| Surface | Role |
|---|---|
| `foldAttemptSeed(seed [32]byte) int64` | First 8 bytes BE → int64 reinterpretation. Sterile, named, non-cryptographic adapter. Documented contract: byte-identical input must produce byte-identical output on every honest signer. |

`BeginAttempt` calls `foldAttemptSeed` and forwards to the existing `SelectCoordinator` to elect the attempt's coordinator. The legacy helper itself is **not modified** -- the bridge is the only thing between RFC-21 contexts and the legacy seed format.

## Why the seed bridge

The legacy `SelectCoordinator` takes `(seed int64, attemptNumber uint)` and is correct in isolation. RFC-21 widens `AttemptSeed` to `[32]byte` for the canonical-hash binding. We could rewrite the shuffle, but rewriting cryptographic-consensus logic that already agrees across the network is the wrong trade-off; the audit and behaviour are settled. The bridge satisfies the resolved decision in RFC-21:

> "BeginAttempt wraps it with a sterile bridge that folds the new
> [32]byte AttemptSeed into the legacy parameter shape... The bridge
> is named, isolated, and exhaustively tested so later edits cannot
> accidentally desynchronise it."

## Test coverage

### `coordinator_state_test.go` (9 tests)

- `TestBeginAttempt_ReturnsHandleWithMatchingContextHash`
- `TestBeginAttempt_HandlesAreDistinctAcrossAttempts`
- `TestBeginAttempt_RejectsEmptyIncludedSet` (defence-in-depth)
- `TestState_ReturnsCollectingAfterBegin`
- `TestState_UnknownHandleReturnsSentinel`
- `TestSelectedCoordinator_ReturnsMemberFromIncludedSet`
- `TestSelectedCoordinator_IsDeterministicForSameContext` -- two independent `Coordinator` instances agree on the elected member
- `TestSelectedCoordinator_DifferentAttemptNumbersCanProduceDifferentLeaders` -- 16 attempts produce ≥2 distinct leaders, defending the ROAST leader-rotation property
- `TestSelectedCoordinator_UnknownHandleReturnsSentinel`
- `TestInMemoryCoordinator_ConcurrentBeginAttemptsAreRaceSafe` -- 16 goroutines × 50 calls each, all handles unique
- `TestAttemptState_String` -- all enum values + unknown sentinel

### `seed_bridge_test.go` (5 tests)

- `TestFoldAttemptSeed_IsDeterministic`
- `TestFoldAttemptSeed_TakesFirst8BytesBigEndian` -- specific byte pattern verified
- `TestFoldAttemptSeed_IgnoresBytesAfterIndex7` -- documents the contract: bytes 8..31 don't influence output (still bound at the `AttemptContext.Hash()` layer)
- `TestFoldAttemptSeed_FirstByteSwept` -- 256-value sweep of the high byte produces 256 distinct outputs (no collisions)
- `TestFoldAttemptSeed_GoldenFixture` -- literal int64 value locks the wire-format reduction; literal drift caught at code review

### Verification

| Command | Result |
|---|---|
| `go build ./...` | clean |
| `go test ./pkg/frost/roast/...` | pass (14 cases) |
| `go test -race ./pkg/frost/roast/...` | pass |
| `go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...` | pass (5 packages) |
| `staticcheck -checks '-SA1019' ./pkg/frost/roast/...` | silent |
| `go vet ./pkg/frost/roast/...` | clean |

## Test plan

- [ ] CI green.
- [ ] Reviewer confirms the seed bridge's discard of bytes 8..31 is acceptable. (Bytes 8..31 still appear in `AttemptContext.Hash()`, so any mutation is detected at the protocol-message layer in Phase 1B; the bridge merely reduces 256-bit input to the 64-bit width `SelectCoordinator` needs.)
- [ ] Reviewer confirms the `Coordinator` interface scope is appropriate for Phase 3.1 (state surface only). Phase 3.2 will extend with `TransitionMessage` types.

Refs RFC-21 Phase 3 (`docs/rfc/rfc-21-*`). Stacked at the integration tip after #3967 merged.
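
For readers who want to see the fold contract concretely, the following is a minimal sketch of a function with the signature and behaviour the commit message describes (first 8 bytes, big-endian, reinterpreted as `int64`). It is not copied from `seed_bridge.go`; the body, the sample seed, and the `main` driver are assumptions made for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// foldAttemptSeed reduces a 32-byte seed to the int64 shape a legacy
// selector would expect: the first 8 bytes are read big-endian and the
// resulting uint64 is reinterpreted as a signed integer.
// Bytes 8..31 are deliberately ignored at this layer.
// Sketch only: the real implementation may differ in detail.
func foldAttemptSeed(seed [32]byte) int64 {
	return int64(binary.BigEndian.Uint64(seed[:8]))
}

func main() {
	// Illustrative seed: 0x01, 0x02, ..., 0x20.
	var seed [32]byte
	for i := range seed {
		seed[i] = byte(i + 1)
	}
	fmt.Printf("%#x\n", foldAttemptSeed(seed)) // 0x102030405060708

	// Mutating a byte past index 7 leaves the folded value unchanged.
	mutated := seed
	mutated[31] ^= 0xff
	fmt.Println(foldAttemptSeed(seed) == foldAttemptSeed(mutated)) // true

	// A first byte with the top bit set yields a negative int64,
	// which is why the fold is a reinterpretation and not a range check.
	var high [32]byte
	high[0] = 0x80
	fmt.Println(foldAttemptSeed(high) < 0) // true
}
```

Because the fold is a pure function of the first 8 bytes, any two honest signers holding byte-identical seeds compute the same value, which is the property the golden-fixture test is described as locking.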
Summary
Promotes the Phase-3 design decisions settled in the 2026-05-22
cross-team review into a dedicated Resolved Decisions section
of RFC-21. Doc-only; +180/-38.
Why
The previous draft listed Phase-3 questions under "Open questions"
with a recommended-entering-Phase-3 path that turned out, on
review, to have a critical safety gap: the all-to-all signed-evidence
gossip recommendation silently assumed gossip is synchronously
consistent across the signer set. In practice gossip is eventually
consistent, so two honest signers can hold divergent evidence sets
at the moment the deterministic `NextAttempt` boundary triggers,
producing divergent next-attempt contexts and fracturing the group.
This PR locks the replacement design before Phase 3 implementation
PRs begin landing.
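
The divergence described above can be illustrated with a small sketch. Everything in it is hypothetical: `nextContextHash` stands in for whatever derivation RFC-21 actually specifies, and the evidence strings are invented labels. It only shows why hashing a locally held evidence set is unsafe under eventually consistent gossip, while hashing a single proposed bundle is not.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// nextContextHash is a hypothetical stand-in for deriving the next
// attempt's context from an evidence set. Entries are sorted first, so
// only the contents of the set matter, not the delivery order.
func nextContextHash(evidence []string) [32]byte {
	sorted := append([]string(nil), evidence...)
	sort.Strings(sorted)
	h := sha256.New()
	for _, e := range sorted {
		h.Write([]byte(e))
		h.Write([]byte{0}) // separator keeps entry boundaries unambiguous
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	// Local-union model: each signer hashes whatever gossip has delivered
	// by the time the boundary triggers.
	signerA := []string{"silent:member-3", "silent:member-5"}
	signerB := []string{"silent:member-3"} // member-5 evidence still in flight
	fmt.Println("local views agree:",
		nextContextHash(signerA) == nextContextHash(signerB)) // false

	// Coordinator-proposed model: both signers hash the same bundle,
	// even if they decode its entries in a different order.
	bundleAtA := []string{"silent:member-3", "silent:member-5"}
	bundleAtB := []string{"silent:member-5", "silent:member-3"}
	fmt.Println("bundle views agree:",
		nextContextHash(bundleAtA) == nextContextHash(bundleAtB)) // true
}
```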
What the resolved-decisions section pins
Layer-B exclusion-policy strengthening
The exclusion-policy list in Layer B is extended with explicit
"no escalation" wording for the silence/parking case. The risk
Gemini's review surfaced (late-arriving evidence weaponised into
permanent exclusion) is bounded by:
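- Silence-based parking is single-attempt only, with no escalation; a peer falsely labelled silent is reinstated by the very next attempt.
- Permanent exclusion follows only from overflow or non-transport reject (validation-blamable). Neither can trigger on a slow-but-honest peer.
- Receiver-side bundle verification, which detects a coordinator that tries to censor an honest peer's signed snapshot.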
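
A minimal sketch of the exclusion rules described above follows. The type and method names (`Policy`, `Record`, `Included`, `EventKind`) are hypothetical and do not come from the RFC; the sketch also assumes that parking applies to exactly one named attempt and that repeated silence overwrites the previous parking instead of accumulating.

```go
package main

import "fmt"

// EventKind classifies why a member was flagged (hypothetical names).
type EventKind int

const (
	Silence            EventKind = iota // nothing seen before the attempt boundary
	Overflow                            // overflow event
	NonTransportReject                  // validation-blamable reject
)

// Policy tracks transient parking and permanent exclusion separately.
type Policy struct {
	parkedFor map[string]uint64 // member -> the single attempt it sits out
	excluded  map[string]bool   // members excluded permanently
}

func NewPolicy() *Policy {
	return &Policy{parkedFor: map[string]uint64{}, excluded: map[string]bool{}}
}

// Record applies one event. Silence parks the member for one attempt only;
// there is no counter, so repeated silence never escalates.
func (p *Policy) Record(member string, kind EventKind, attempt uint64) {
	switch kind {
	case Silence:
		p.parkedFor[member] = attempt
	case Overflow, NonTransportReject:
		p.excluded[member] = true
	}
}

// Included reports whether a member participates in the given attempt.
func (p *Policy) Included(member string, attempt uint64) bool {
	if p.excluded[member] {
		return false
	}
	if a, ok := p.parkedFor[member]; ok && a == attempt {
		return false
	}
	return true
}

func main() {
	p := NewPolicy()
	p.Record("slow-peer", Silence, 2)
	p.Record("bad-peer", NonTransportReject, 2)

	fmt.Println(p.Included("slow-peer", 2)) // false: parked for attempt 2
	fmt.Println(p.Included("slow-peer", 3)) // true: reinstated next attempt
	fmt.Println(p.Included("bad-peer", 3))  // false: permanently excluded
}
```

Keeping the two maps separate makes it structurally impossible for a silence event to reach the permanent set, which mirrors the "no escalation" wording.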
Open questions reduced to three
What remains in the Open Questions section is genuinely open:
- Persistence across restart (Phase 5+).
- FFI surface guidance (follows the L5 pattern from PR #425 / #3961, "fix(frost): prefer FFI error code over substring for replay detection").
- AttemptContextHash backward-compat horizon (Phase 6+).
Test plan
documented matches the agreed design.
documentation` covers this).
No code change; no behaviour-test surface.