docs(encryption): Stage 6E design proposal: enable-raft-envelope admin RPC + Phase-2 cutover #893
…in RPC + Phase-2 cutover
Lands the design for the §7.1 Phase-2 raft cutover: admin RPC,
sidecar cutover-index recording, engine apply-hook unwrap,
coordinator wrap-on-propose, and the §7.1 6-step proposal-
quiescence barrier that prevents the unwrap path from seeing
a plaintext entry at index > cutover.
Sliced into three sequential implementation milestones:
- 6E-1: Admin RPC + sidecar plumbing (no behavior change)
- 6E-2: Engine unwrap + coord wrap + quiescence barrier
(the atomic flip — pieces cannot ship separately)
- 6E-3: 6C-4 Phase-2-specific fail-closed guards
Documents the load-bearing details from the parent design:
- Why the barrier's source-tag exemption is required (a global
gate would deadlock on the cutover proposal itself).
- Why the dispatch is strict-greater-than (the cutover entry
at index == cutover MUST NOT be unwrapped — chicken/egg).
- Why decodeProposalEnvelope MUST run before unwrap (clobbers
the proposal-ID handoff otherwise; every coordinator write
would time out forever).
- Why 6E-2's three pieces cannot ship separately (either
ordering produces a cluster-wide halt-apply at cutover).
- Why 6C-4 guards bundle into 6E-3 not 6E-1 (dead code
without an active Phase 2 to gate on).
Verification action items per milestone (8 + 8 + 3 tests
across 6E-1 / 6E-2 / 6E-3) and 5-lens self-review focus
per milestone.
Builds on Stage 8a (#886) — 8a's v2 snapshot reader already
carries the cutover across snapshot restore, so a fresh
follower joining mid-Phase-2 reconstructs the dispatch
boundary without operator intervention. Sequencing: 8a MUST
ship before 6E (already satisfied).
@claude review Specifically please verify:
Claude finished @bootjp's task in 7m 23s
Stage 6E Design Review

Summary
The core design is sound. The quiescence-barrier logic, the strict-…

Milestone Breakdown

6E-1 (safe to ship alone): ✅ correct
The reasoning holds: no coordinator wrap + no engine hook + no barrier means the cluster stays in plaintext-on-wire mode regardless of the cutover index now in the sidecar. 8a's v2 snapshot writer already reads …

6E-2 (three pieces cannot ship separately): ✅ correct
The inseparability argument is airtight. Engine-unwrap-before-coordinator-wrap → followers see GCM failures on plaintext. Coordinator-wrap-before-engine-unwrap → FSM receives opaque raft-envelope bytes and produces protobuf decode errors. Both orderings produce a cluster-wide halt at the cutover boundary. The bundle is mandatory.

6E-3 guards bundle correctly: …
Why this matters for safety: If the guard ships in 6E-3 (not 6E-1), an operator on a 6E-1 cluster without a raft DEK bootstrap (…
Fix: Move …

§7.1 6-Step Quiescence-Barrier

Sequence correctness: ✅
The 6-step sequence in §2.2 matches the parent design's §7.1 Phase 2 exactly. Step ordering is sound: block → drain → propose → await-commit-and-apply → flip-wrap → unblock.

Source-tag exemption rationale: ✅ correct and complete
A global …

Strict-…
At step 3, the cutover entry is proposed un-wrapped. It commits at …

Step 4 mechanization (flag for implementation):
Step 4 says "wait for that entry to commit AND for the local FSM apply to set …
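The six steps named in the review (block, drain, propose, await-commit-and-apply, flip-wrap, unblock) and the source-tag exemption can be sketched in a few lines of Go. This is a minimal in-memory illustration, not the design's implementation: `Gate`, `Source`, `Enter`, `RunCutover`, and `ErrGateClosed` are hypothetical names, the gate rejects client proposals instead of parking them, and steps 3 and 4 are folded into one callback that is assumed to return only after the cutover entry has committed and applied locally.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Source tags a proposal with its origin (hypothetical type).
type Source int

const (
	SourceClient  Source = iota // ordinary coordinator write
	SourceCutover               // the cutover proposal itself
)

// ErrGateClosed is returned to non-exempt proposals during the barrier.
var ErrGateClosed = errors.New("proposal gate closed: cutover in progress")

// Gate is a hypothetical proposal gate with a source-tag exemption.
type Gate struct {
	mu            sync.Mutex
	closed        bool
	inflight      sync.WaitGroup
	wrapOnPropose atomic.Bool
}

// Enter admits a proposal unless the gate is closed and the proposal is not
// the cutover. Without the exemption, step 3 would be rejected by step 1.
func (g *Gate) Enter(src Source) (func(), error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed && src != SourceCutover {
		return nil, ErrGateClosed
	}
	g.inflight.Add(1)
	return g.inflight.Done, nil
}

func (g *Gate) setClosed(v bool) {
	g.mu.Lock()
	g.closed = v
	g.mu.Unlock()
}

// RunCutover walks the six steps in order. proposeAndAwaitApply is assumed to
// return the cutover index only after commit and local apply.
func (g *Gate) RunCutover(proposeAndAwaitApply func() (uint64, error)) (uint64, error) {
	g.setClosed(true)        // step 1: block new client proposals
	defer g.setClosed(false) // step 6: unblock (runs after step 5)
	g.inflight.Wait()        // step 2: drain in-flight proposals

	done, err := g.Enter(SourceCutover) // step 3: exempt from the gate
	if err != nil {
		return 0, err
	}
	index, err := proposeAndAwaitApply() // steps 3 and 4
	done()
	if err != nil {
		return 0, err // gate reopens with wrapping still off
	}
	g.wrapOnPropose.Store(true) // step 5: flip wrap before unblocking
	return index, nil
}

func main() {
	g := &Gate{}
	index, err := g.RunCutover(func() (uint64, error) {
		_, clientErr := g.Enter(SourceClient)
		fmt.Println("client proposal during barrier:", clientErr)
		return 42, nil
	})
	fmt.Println(index, err, g.wrapOnPropose.Load())
}
```

Because the unblock is deferred, it always runs after the wrap flag is set, so no client proposal can slip through unwrapped between steps 5 and 6 in this sketch.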
Code Review
This pull request proposes the design document for Stage 6E, detailing the enable-raft-envelope admin RPC and Phase-2 raft cutover. The feedback highlights a critical design gap regarding the initialization of the wrapOnPropose flag on node startup or restart to prevent accidental plaintext proposals. Additionally, the feedback suggests correcting the proposed applyNormalEntry signature to match the existing codebase and fixing a contradictory section header in the document.
3. **Coordinator wrap-on-propose switch**:
`kv/coordinator.go` / `kv/sharded_coordinator.go` consult an
in-process `wrapOnPropose` flag (set by the cutover barrier,
reset never within a process lifetime) before calling
`engine.Propose`. Phase-2 leaders wrap; Phase-0/1 leaders
propose plaintext exactly as today.
The design states that the wrapOnPropose flag is set by the cutover barrier and never reset within a process lifetime. However, it does not specify how wrapOnPropose is initialized on startup or restart.
If a node restarts after the cutover has already occurred (meaning sidecar.RaftEnvelopeCutoverIndex > 0), its in-process wrapOnPropose flag would default to false. If this restarted node becomes the leader, it will propose plaintext entries. Since these new entries will have index > cutover_index, all replicas' apply-hooks will attempt to unwrap them, fail GCM verification, and halt the cluster.
Suggestion:
Specify that on startup, the coordinator must initialize wrapOnPropose to true if the local sidecar has a non-zero RaftEnvelopeCutoverIndex.
3. **Coordinator wrap-on-propose switch**:
kv/coordinator.go / kv/sharded_coordinator.go consult an
in-process wrapOnPropose flag (initialized to true on startup if
sidecar.RaftEnvelopeCutoverIndex > 0, and also set to true by the
cutover barrier) before calling engine.Propose. Phase-2 leaders wrap;
Phase-0/1 leaders propose plaintext exactly as today.
References
- When designing a production-grade system migration, the plan must consider and detail strategies for live or zero-downtime cutovers to avoid service interruption.
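The startup-initialization rule suggested in this comment could look roughly like the following. It is a sketch under assumptions: `Sidecar` and `Coordinator` are simplified stand-ins for the real types, and `NewCoordinator`, `MarkCutoverApplied`, and `ShouldWrap` are hypothetical names; only the `RaftEnvelopeCutoverIndex` field and the `wrapOnPropose` flag come from the discussion.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Sidecar is a stand-in for the local sidecar; only the field used here.
type Sidecar struct {
	RaftEnvelopeCutoverIndex uint64
}

// Coordinator holds the in-process wrap flag (simplified stand-in).
type Coordinator struct {
	wrapOnPropose atomic.Bool
}

// NewCoordinator hydrates the flag from the sidecar so that a node restarted
// after the cutover never proposes plaintext as a new leader.
func NewCoordinator(sc Sidecar) *Coordinator {
	c := &Coordinator{}
	if sc.RaftEnvelopeCutoverIndex != 0 {
		c.wrapOnPropose.Store(true)
	}
	return c
}

// MarkCutoverApplied is the second set-path (the barrier's flip-wrap step).
// There is deliberately no method that resets the flag.
func (c *Coordinator) MarkCutoverApplied() { c.wrapOnPropose.Store(true) }

// ShouldWrap is what the propose path would consult before wrapping.
func (c *Coordinator) ShouldWrap() bool { return c.wrapOnPropose.Load() }

func main() {
	fresh := NewCoordinator(Sidecar{})
	restarted := NewCoordinator(Sidecar{RaftEnvelopeCutoverIndex: 1200})
	fmt.Println(fresh.ShouldWrap(), restarted.ShouldWrap())
}
```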
// internal/raftengine/etcd/engine.go::applyNormalEntry
func (e *engine) applyNormalEntry(entry raftpb.Entry) error {
	id, maybeEncPayload, ok := decodeProposalEnvelope(entry.Data)
	if !ok {
		return nil // pre-envelope entry, leave intact
	}
	payload := maybeEncPayload
	// §6.3 hook: unwrap only at index strictly greater than
	// the locally-recorded cutover. The local sidecar is the
	// source of truth: every replica decides independently
	// and deterministically because the sidecar value is
	// itself replicated via the cutover entry's FSM apply.
	if entry.Index > e.encryption.RaftEnvelopeCutoverIndex() {
		var err error
		payload, err = e.encryption.RaftDEK().Unwrap(maybeEncPayload)
		if err != nil {
			// GCM tag mismatch = sidecar/keystore divergence
			// or on-disk corruption. Halt apply WITHOUT
			// setApplied so the next restart replays this
			// entry under a corrected keystore. Silent skip
			// would diverge the FSM.
			return errors.Wrap(err, "raft envelope: unwrap")
			// engine treats this as ErrRaftUnwrapFailed → fatal
		}
	}
	response := e.fsm.Apply(payload)
	e.resolveProposal(entry.Index, entry.Data, response)
	return nil
}
The proposed signature and implementation of applyNormalEntry in the design document do not match the existing codebase in internal/raftengine/etcd/engine.go. Currently, applyNormalEntry returns (any, error) (where any is the FSM response), and the responsibilities of calling resolveProposal and setApplied belong to applyNormalCommitted.
If implemented as proposed, it would break the signature contract and result in duplicate proposal resolution or panics when applyNormalCommitted attempts to resolve the already-popped proposal.
Suggestion:
Update the code snippet to align with the existing applyNormalEntry signature and return values.
// internal/raftengine/etcd/engine.go::applyNormalEntry
func (e *Engine) applyNormalEntry(entry raftpb.Entry) (any, error) {
	id, maybeEncPayload, ok := decodeProposalEnvelope(entry.Data)
	if !ok {
		return nil, nil // pre-envelope entry, leave intact
	}
	payload := maybeEncPayload
	// §6.3 hook: unwrap only at index strictly greater than
	// the locally-recorded cutover. The local sidecar is the
	// source of truth: every replica decides independently
	// and deterministically because the sidecar value is
	// itself replicated via the cutover entry's FSM apply.
	if entry.Index > e.encryption.RaftEnvelopeCutoverIndex() {
		var err error
		payload, err = e.encryption.RaftDEK().Unwrap(maybeEncPayload)
		if err != nil {
			// GCM tag mismatch = sidecar/keystore divergence
			// or on-disk corruption. Halt apply WITHOUT
			// setApplied so the next restart replays this
			// entry under a corrected keystore. Silent skip
			// would diverge the FSM.
			return nil, errors.Wrap(err, "raft envelope: unwrap")
			// engine treats this as ErrRaftUnwrapFailed → fatal
		}
	}
	return e.fsm.Apply(payload), nil
}

field has been on disk since 6D shipped; only its
load-bearingness changes per milestone.

## 5. Why the cutover entry is unwrapped at index == cutover
The section header contradicts the explanation below it. The text explains that the cutover entry itself flows through unwrap-free (i.e., it is not unwrapped) because the strict > comparison is false at entry.Index == N.
Suggestion:
Update the header to reflect that the entry is not unwrapped.
Before: ## 5. Why the cutover entry is unwrapped at index == cutover
After: ## 5. Why the cutover entry is NOT unwrapped at index == cutover
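The strict-greater-than rule this header refers to reduces to a one-line predicate. The sketch below uses a hypothetical helper name, `shouldUnwrap`, and makes one assumption the thread does not spell out for the apply hook: a cutover index of zero is treated as "no cutover recorded", so nothing is unwrapped before Phase 2.

```go
package main

import "fmt"

// shouldUnwrap reports whether an entry must be unwrapped before FSM apply.
// Assumption: cutoverIndex == 0 means the cluster has not cut over yet.
func shouldUnwrap(entryIndex, cutoverIndex uint64) bool {
	if cutoverIndex == 0 {
		return false // Phase 0/1: every entry is plaintext
	}
	// Strictly greater: the cutover entry itself (index == cutover) was
	// proposed un-wrapped, so it must pass through without unwrap.
	return entryIndex > cutoverIndex
}

func main() {
	const cutover = 10
	for _, idx := range []uint64{9, 10, 11} {
		fmt.Printf("index %d unwrap=%v\n", idx, shouldUnwrap(idx, cutover))
	}
}
```

A `>=` comparison here would try to unwrap the plaintext cutover entry, which is the chicken/egg case the header describes.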
…init) + 2 mediums
gemini HIGH on §2.1 (line 57)
The original spec said wrapOnPropose is 'set by the cutover
barrier and never reset within a process lifetime' but did
not specify how the flag is initialized on startup. A node
that restarts AFTER the cluster has cut over would default
the flag to false, propose plaintext as a new leader, and
the engine apply-hook (which uses the sidecar — NOT the
in-process flag — as the source of truth) would attempt
Unwrap on those plaintext entries, fail GCM, and halt apply
cluster-wide.
Fix: §2.1 now requires the flag to be initialized to true
iff sidecar.RaftEnvelopeCutoverIndex != 0 at coordinator
construction time. Two set-paths total (startup hydration,
barrier step 5) and zero reset-paths within a process load.
Added test TestCoordinatorWrap_StartupInitFromSidecar in
§6.2 to pin the regression.
gemini MEDIUM on §2.4 (line 143)
applyNormalEntry signature in the proposed code snippet did
not match the actual codebase. Verified at
internal/raftengine/etcd/engine.go:2226 that the real
signature is (entry raftpb.Entry) (any, error); setApplied
and resolveProposal live in applyNormalCommitted (line 2173,
2186-2187) which calls applyNormalEntry. Updated the
snippet + commentary to match: 6E-2 adds the unwrap shim
inside applyNormalEntry without changing the caller
contract; the error return is what makes
applyNormalCommitted skip setApplied (the existing
fail-closed shape).
gemini MEDIUM on §5 header (line 268)
Header said 'is unwrapped' but the body says 'is NOT
unwrapped'. Header was a typo. Now reads 'NOT unwrapped'.
Self-review (5-lens) on the doc change itself:
1. Data loss — startup-init rule directly prevents the
halt-apply-cluster-wide scenario that the original spec
left open.
2. Concurrency — the flag has two write paths (startup
hydration on a single goroutine, barrier step 5 on a
single goroutine), both serial; no race possible within
the lifetime of one process load.
3. Performance — startup-init is a single sidecar read at
coordinator construction (already happens for 6C-2);
zero hot-path cost.
4. Data consistency — the engine apply-hook still reads
the sidecar as the source of truth (per-replica
deterministic); the in-process flag is the leader's
write-side mirror, not a parallel decision oracle.
5. Test coverage — new test pinned; without it the gemini
HIGH scenario would regress silently.
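The fail-closed shape described for §2.4 (an error return from the apply step is what keeps the caller from advancing the applied index) can be illustrated with a stripped-down model. Everything here is hypothetical and much simpler than the engine: `Engine`, `applyEntry`, `applyCommitted`, and `errUnwrapFailed` are illustrative stand-ins, not the real `applyNormalEntry` / `applyNormalCommitted` pair.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnwrapFailed stands in for a GCM tag mismatch (hypothetical sentinel).
var errUnwrapFailed = errors.New("raft envelope: unwrap failed")

// Engine is a toy model: applied is the last index whose apply succeeded.
type Engine struct {
	applied    uint64
	applyEntry func(index uint64) (any, error)
}

// applyCommitted advances applied only when the apply step succeeds, so a
// failed unwrap leaves the entry to be replayed after a restart.
func (e *Engine) applyCommitted(index uint64) error {
	if _, err := e.applyEntry(index); err != nil {
		return fmt.Errorf("apply halted at index %d: %w", index, err)
	}
	e.applied = index
	return nil
}

func main() {
	e := &Engine{applied: 4, applyEntry: func(index uint64) (any, error) {
		if index == 6 {
			return nil, errUnwrapFailed
		}
		return "ok", nil
	}}
	fmt.Println(e.applyCommitted(5), e.applied)
	fmt.Println(e.applyCommitted(6), e.applied)
}
```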
Round-1 fixes in dd0beb2:
@claude review
Claude finished @bootjp's task in 5m 14s
Review of Stage 6E Design (Round-2: post dd0beb2 fixes)

Round-1 Fixes: all three correctly applied ✅

gemini HIGH (§2.1 startup init): The rule is present, correctly scoped, and its rationale is documented at doc lines 58–68. Two write paths (startup hydration, barrier step 5), zero reset paths, explicit call-site (…

gemini MEDIUM (§2.4 …
func (e *Engine) applyNormalEntry(entry raftpb.Entry) (any, error)

gemini MEDIUM (§5 header): Line 297 now reads …

Original review items

Milestone breakdown

6E-1 safe to ship alone ✅. The sidecar records the cutover index, but the engine's …

6E-2 three pieces cannot ship separately ✅. The interlocking argument in §3.2 is correct: …

The design correctly identifies that the quiescence barrier is also needed even within the same PR, because the runtime window between "cutover entry applied" and "…

6E-3 guards bundle correctly ✅. The §7.1 quiescence-barrier + source-tag exemption + strict-…
Summary
Stage 6E design proposal: lands enable-raft-envelope admin RPC + the §7.1 Phase-2 raft cutover end-to-end. Unblocked by Stage 8a (#886), which shipped the v2 snapshot reader that 6E depends on for mid-Phase-2 follower catch-up.
Why now
Every prerequisite has shipped:
6E is the missing piece between sidecar-recorded cutover and on-the-wire wrap/unwrap.
Sliced into 3 implementation milestones
Load-bearing details documented
- … Propose mutex would deadlock on the cutover's own proposal.
- … index > cutover: the cutover entry at index == cutover MUST NOT be unwrapped (chicken/egg with the sidecar update).
- decodeProposalEnvelope MUST run before unwrap: wrapping entry.Data itself would clobber the proposal-ID handoff to resolveProposal; every coordinator write would time out forever.
Verification action items
8 + 8 + 3 tests across the three milestones, with explicit 5-lens self-review focus per milestone (data loss for 6E-2's barrier window, data consistency for 6E-3's startup divergence refusal).
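The decode-before-unwrap ordering listed among the load-bearing details can be shown with a toy envelope. The layout used here (an 8-byte big-endian proposal ID followed by the payload) is an assumption made for illustration, as are the helper names `encodeEnvelope`, `decodeEnvelope`, `wrap`, and `unwrap`; the real `decodeProposalEnvelope` format is not shown in this thread. The point is only that the encrypted bytes sit inside the envelope, so the proposal ID stays readable without the key.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeEnvelope uses a hypothetical layout: [8-byte proposal ID | payload].
func encodeEnvelope(id uint64, payload []byte) []byte {
	out := make([]byte, 8, 8+len(payload))
	binary.BigEndian.PutUint64(out, id)
	return append(out, payload...)
}

// decodeEnvelope splits the ID from the (possibly encrypted) payload.
func decodeEnvelope(data []byte) (uint64, []byte, bool) {
	if len(data) < 8 {
		return 0, nil, false
	}
	return binary.BigEndian.Uint64(data[:8]), data[8:], true
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrap returns nonce || ciphertext for the payload only.
func wrap(aead cipher.AEAD, plaintext []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}

// unwrap reverses wrap and fails on a GCM tag mismatch.
func unwrap(aead cipher.AEAD, sealed []byte) ([]byte, error) {
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed payload too short")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	aead, err := newGCM(make([]byte, 32)) // all-zero demo key, never use in practice
	if err != nil {
		panic(err)
	}
	sealed, err := wrap(aead, []byte("put k=v"))
	if err != nil {
		panic(err)
	}
	entryData := encodeEnvelope(7, sealed) // wrap the payload, not the envelope

	id, inner, _ := decodeEnvelope(entryData) // decode first: ID is in the clear
	plain, err := unwrap(aead, inner)         // then unwrap the inner payload
	fmt.Println(id, string(plain), err)
}
```

If the whole of `entry.Data` were encrypted instead, the ID could not be recovered before a successful unwrap, which is the handoff problem the bullet describes.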
Test plan
>index dispatch rule.