docs(encryption): Stage 8b design proposal — WAL coverage analysis and closure by bootjp · Pull Request #897 · bootjp/elastickv

bootjp · 2026-06-01T11:29:41Z

Summary

Stage 8b design — closure-style doc that formalizes the parent §4.3 + §4.6 analyses on what stays cleartext in the etcd-raft WAL after Stage 6E's raft envelope ships.

This PR ships no code. The parent design already settled WAL file-level wrapping in §4.3 ("no direct file-level wrapping ... protection comes entirely from §4.2"); 8b consolidates the analysis, justifies the trade-off explicitly, and closes the Stage 8 row.

Why this doc exists

The parent's Stage 8 row was labeled "Snapshot header v2 + WAL coverage". 8a (feat(encryption): Stage 8a implementation — snapshot header v2 + ReadSnapshotHeader API + no-downgrade latch #886) shipped the snapshot header v2.
The WAL-coverage half has been settled in §4.3 but never formally closed. The 8a doc's (§4.5) shorthand for 8b was a stale label (corrected in PR docs(encryption): Stage 8a design proposal — snapshot header v2 #877 round-3).
Operators reasonably ask "where does the design address WAL-at-rest?" and the answer is currently scattered.

What's in the doc

§2 What 6E's raft envelope covers on disk (ASCII diagram of WAL record framing vs. ciphertext payload boundary).
§3 Residual cleartext on disk — re-stated from §4.6 with traffic-analysis framing (entry count, term changes, ConfChange topology, proposal-ID cadence).
§4 Threat-model justification for accepting the residual: defense-in-depth atop infrastructure FS encryption (LUKS, EBS, GCE).
§5 Why patching upstream etcd-raft was ruled out (maintenance burden, crash-safety semantics, perf hot-path overhead).
§6 Forward-compat hook at internal/raftengine/etcd/wal_store.go — informational only, not committed.
§7 Stage 8 row closure proposal.

Sequencing

Stage 8b is decoupled from Stage 6E (the 8a closure notes 8b is "not blocked on 6E"). Lands as a docs-only PR; the parent doc's Stage 8 row update can ride this PR or land separately.

Test plan

Design doc reviewable on its own (no code).
@claude review for the §4.3 affirmation, the §3 residual enumeration accuracy, and the §6 forward-compat-hook scope.

…d closure Closure-style design that consolidates the parent §4.3 + §4.6 analyses on what stays cleartext in the etcd-raft WAL after the §4.2 raft envelope ships (Stage 6E). Why this doc exists, not why this work exists: - The parent's Stage 8 row was labeled 'Snapshot header v2 + WAL coverage'. 8a (#886) shipped the snapshot header v2. - The WAL-coverage half has been settled in §4.3 ('no direct file-level wrapping') but never formally closed. The 8a doc's (§4.5) shorthand for 8b was a stale label. - Operators reasonably ask 'where does the design address WAL-at-rest?' and the answer is currently scattered. What 8b says: - Affirms §4.3's existing no-file-wrapping decision; lists why (upstream maintenance, crash-safety semantics, perf hot-path overhead). - Enumerates the residual cleartext on disk (term, index, type, proposal ID, ConfChange topology) and frames it as traffic-analysis metadata, not user data. - Documents the threat-model justification for accepting the residual: defense-in-depth atop infrastructure FS encryption (LUKS, EBS, GCE); high-compliance overlays deferred to Stage 9. - Carves out a forward-compat hook at internal/raftengine/etcd/wal_store.go for a hypothetical future application-level file-encryption layer. Not committed. What 8b does NOT do: - Ship any code. This is a documentation-only closure. - Re-litigate §4.3's decision (the design predates Stage 6E but the reasoning still holds). - Commit to any forward-compat work. After 8b lands, the parent Stage 8 row collapses to '8a shipped, 8b closed by docs' and the encryption roadmap moves to Stage 9 once 6E ships.

coderabbitai · 2026-06-01T11:29:51Z

Warning

Review limit reached

@bootjp, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 57 minutes and 14 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b648795b-f720-4501-9624-e0ea2bf23f0c

📥 Commits

Reviewing files that changed from the base of the PR and between 2b118a4 and 2d3f80e.

📒 Files selected for processing (1)

docs/design/2026_06_01_proposed_8b_wal_coverage_closure.md

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/encryption-8b-design

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request proposes a design document for Stage 8b, providing a documentation-only closure analysis for WAL coverage and affirming the decision to rely on the Raft envelope payload encryption rather than direct file-level wrapping. The review feedback points out discrepancies between the document's description of the proposal envelope header and threat-model details versus the actual implementation in internal/encryption/raft_envelope.go. Specifically, the diagram and text should be updated to correctly reference the actual Envelope header fields (such as KeyID and Nonce) and the leaked monotonic sequence (write_count within the deterministic nonce) instead of a stale "proposal ID" representation.

gemini-code-assist · 2026-06-01T11:30:46Z

+| Proposal envelope header (cleartext)            |
+|   - version byte (0x01), proposal ID (8B)       |


There is a discrepancy between the proposal envelope header described here and the actual Envelope structure implemented in internal/encryption/raft_envelope.go.

According to internal/encryption/raft_envelope.go, the Envelope struct consists of:

Version (1 byte)

Flag (1 byte)

KeyID (4 bytes)

Nonce (12 bytes)

Body (ciphertext)

The diagram and subsequent text refer to a proposal ID (8B) instead of the KeyID and Nonce (which contains the deterministic nonce components like write_count). To ensure high-fidelity documentation, the diagram should be updated to accurately reflect the actual Envelope header fields (KeyID and Nonce).

Suggested change

| Proposal envelope header (cleartext) |

| - version byte (0x01), proposal ID (8B) |

| Proposal envelope header (cleartext) |

| - version (1B), flag (1B), key ID (4B), |

| - nonce (12B) |

gemini-code-assist · 2026-06-01T11:30:46Z

+2. **Entry index gaps** (presence of compaction / snapshot installs).
+3. **Term changes** (visible from raft-entry framing).
+4. **Entry types** — normal entry vs `ConfChangeV2` / `ConfChange`. ConfChange entries carry node IDs and addresses (topology, not user data).
+5. **Proposal ID sequence** (8-byte counter, mostly monotonic per leader). Leaks throughput rate and the rough cadence of leadership flips.


Consistent with the actual Envelope header fields, the monotonic sequence leaked is the write_count component within the 12-byte deterministic Nonce (constructed as node_id ‖ local_epoch ‖ write_count), rather than a standalone 8-byte proposal ID. Updating this description will align the threat analysis with the actual implementation.

Suggested change

5. **Proposal ID sequence** (8-byte counter, mostly monotonic per leader). Leaks throughput rate and the rough cadence of leadership flips.

5. **Nonce sequence** (specifically the write_count component within the 12-byte deterministic nonce). Leaks throughput rate and the rough cadence of leadership flips.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9587da1018

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-01T11:31:41Z

+What the adversary **cannot** observe (because §4.2 wraps the payload):
+
+- User keys, values, operation type (PUT/DEL/GET/COMMIT/ABORT).
+- Transaction grouping or two-phase commit metadata.
+- Backup or admin operation contents.


Document pre-cutover WAL exposure

The parent design states that after raft_envelope_active is committed, the WAL still interleaves cleartext pre-flag entries with post-flag raft-envelope entries, so on any cluster that enables Stage 6E after existing traffic, raw WAL access can still reveal user keys, values, and op tags until those old entries are compacted/removed. This unconditional “cannot observe” list makes the Stage 8b closure look effective immediately after cutover and can mislead operators relying on WAL-at-rest guarantees.

Useful? React with 👍 / 👎.

…write_count nonce leakage) gemini MEDIUM #1 on §2 (line 41) Original diagram conflated two distinct envelopes that both appear cleartext on disk: - Outer 'proposal envelope' from the etcd-raft engine (proposalEnvelopeVersion 1B + proposal_id 8B); owned by internal/raftengine/etcd/engine.go:3887 encodeProposalEnvelope. - Inner 'AEAD envelope' from §4.2 (Version 1B + Flag 1B + KeyID 4B + Nonce 12B + Body); owned by internal/encryption/envelope.go:68. 6E activates the wrap-on-propose / unwrap-on-apply boundary against the INNER one. Both have cleartext headers on disk; only the AEAD body is encrypted. Updated §2's ASCII diagram to show both envelopes stacked with the field accounting matching the real implementation. gemini MEDIUM #2 on §3 (line 59) §3 item 5 (residual cleartext) listed only the 8-byte proposal ID as a monotonic-counter leak. The 12B deterministic nonce (node_id || local_epoch || write_count) ALSO leaks its write_count component on disk — both counters move under steady state but diverge under leader flip / DEK rotation in informative ways. Updated §3 to enumerate BOTH counters and added §3 item 6 for DEK rotation events (KeyID changes between consecutive entries). Both fixes improve the threat-model fidelity of the closure analysis; the underlying §4.3 decision (no file wrapping) is unaffected. No semantic change to the design's conclusions, just factual corrections to the residual-leakage enumeration.

bootjp · 2026-06-01T11:33:27Z

Round-1 fixes in 1423fa3:

gemini MEDIUM §2 (line 41) — the diagram conflated two envelopes that both appear cleartext on disk. The OUTER 'proposal envelope' (proposalEnvelopeVersion 1B + proposal_id 8B; owned by internal/raftengine/etcd/engine.go:3887 encodeProposalEnvelope) is what resolveProposal depends on; the INNER 'AEAD envelope' (Version 1B + Flag 1B + KeyID 4B + Nonce 12B + Body; owned by internal/encryption/envelope.go:68) is what 6E activates the wrap-on-propose boundary against. Both have cleartext headers on disk; only the AEAD body is encrypted. Diagram now shows both stacked with field accounting matching the real implementation.
gemini MEDIUM §3 (line 59) — the residual-cleartext enumeration listed only the 8B proposal_id as a monotonic-counter leak. The 12B deterministic nonce's write_count component (constructed as node_id ‖ local_epoch ‖ write_count) also leaks on disk; the two counters move together under steady state but diverge under leader flip / DEK rotation in informative ways. Updated §3 to enumerate BOTH counters, plus added §3 item 6 for DEK rotation events (KeyID changes between consecutive entries are visible on disk as cleartext).

Both fixes improve threat-model fidelity; the §4.3 'no file wrapping' decision is unaffected.

@claude review

claude · 2026-06-01T11:33:41Z

Claude finished @bootjp's task in 3m 14s —— View job

Review of PR #897 — Stage 8b WAL Coverage Closure (Round-1 Fixes)

Tasks

Round-1 Fixes: Verified ✅

Both round-1 changes are correct against the implementation.

§2 dual-envelope diagram — The outer proposal envelope layout (proposalEnvelopeVersion 1B + proposal ID 8B) matches engine.go:100,103,3887-3892 exactly (envelopeHeaderSize = 9). The inner AEAD envelope layout (Version 1B + Flag 1B + KeyID 4B uint32 + Nonce 12B + Body) matches envelope.go:HeaderSize = 18 exactly.

§3 dual-counter enumeration — Both the proposal ID (uint64, monotonic per leader) and the write_count component of the AEAD nonce are correctly identified as independent on-disk monotonic sequences. The divergence behavior under leader flip / DEK rotation is accurate: proposal ID resets on leader change while write_count tracks per-DEK writes per node. Item 6 (DEK rotation events via cleartext KeyID changes) is a correct and useful addition.

Issues Found

1. Pre-cutover cleartext WAL entries — `§3` "cannot observe" list is unconditional (CodeX P2 is valid)

The "cannot observe" block at the end of §3 reads:

What the adversary cannot observe (because §4.2 wraps the payload):

User keys, values, operation type (PUT/DEL/GET/COMMIT/ABORT).
...

This is only true for entries above raftCutoverIndex. The cutover semantics in engine.go:2270 are strict-greater-than:

if entry.Index > e.raftCutoverIndex() {
    // unwrap
}

Entries at or below the cutover index pass through as cleartext. On any cluster that enables Stage 6E after existing traffic, the WAL will interleave cleartext pre-cutover entries with post-cutover ciphertext entries indefinitely — until those segments are compacted away by etcd-raft's log compaction (snapshot + segment purge, which only removes segments with all entries below the latest snapshot index).

As written, §3 implies that a cluster that has applied the Stage 6E cutover immediately gains the "cannot observe" guarantee. That's misleading: an adversary with raw WAL access during the gap between cutover application and full segment compaction can still read pre-cutover entries in cleartext. The threat model in §4 doesn't account for this window.

Suggested fix: Add a qualification note to the "cannot observe" block, e.g.:

The above applies to entries with index > raftCutoverIndex. Pre-cutover entries (index ≤ cutover) remain cleartext in the WAL until the WAL segments holding them are removed by log compaction (snapshot install + segment purge). On clusters enabling Stage 6E after existing traffic, operators should treat the guarantee as "eventually true post-compaction" rather than "immediately true post-cutover."

Fix this →

2. Nonce field widths missing from §2 diagram

The diagram's nonce line reads:

|   - Nonce (12B: node_id ‖ local_epoch ‖ write_count)|

nonce_factory.go:13-15 defines the concrete layout:

bytes 0-1: node_id (BE uint16, 2B)
bytes 2-3: local_epoch (BE uint16, 2B)
bytes 4-11: write_count (BE uint64, 8B)

Since this doc is meant as a closure reference for threat-model fidelity, explicitly naming the widths would match the precision the round-1 fix applied to the envelope fields. Low-priority, but given that write_count is the 8B dominant component and node_id/local_epoch are 2B each, the asymmetry is non-obvious from the current notation.

3. §6 forward-compat hook — mild inconsistency with §5

§6 describes the implementation entry point as "a transparent os.File-wrapping layer" at wal_store.go. §5 already correctly acknowledges the difficulty: "wrapping those would require a fork or a heavy io.Writer shim that intercepts every os.File operation."

Looking at wal_store.go, the WAL interaction happens entirely through go.etcd.io/etcd/server/v3/storage/wal.WAL (wal.Create, wal.Open, persist.Save). That package manages os.File objects internally — there's no external io.Writer/os.File intercept surface exposed to callers. A wrapping layer would need to intercept at the etcd WAL package level, not at the wal_store.go call sites.

§5 describes this accurately. §6 presenting wal_store.go as "the implementation would land here as a transparent os.File-wrapping layer" slightly undersells the structural challenge. Consider adding a back-reference to §5's complexity note, or rephrasing §6's hook description to "the natural operator control surface is at wal_store.go, but the actual wrapping mechanism would require one of the approaches ruled out in §5."

Non-issues / Confirmed Correct

§4.3 affirmation — "No direct file-level wrapping … protection comes entirely from §4.2" still holds after Stage 6E. The §4.2 envelope wraps the Data []byte inside the proposal envelope, which is what gets stored verbatim in the WAL. ✅
§4 threat model — The defense-in-depth reasoning (FS-layer encryption + §4.2 application-layer envelope) is sound for the stated deployment classes. ✅
§7 Stage 8 row update — The proposed row text is correct. The WAL-coverage half being documented-but-not-implemented is well-justified. ✅
§9 staging — The Stage 6E dependency note ("not blocked on 6E") and the Stage 9 forward reference are accurate. ✅

Overall: The round-1 fixes correctly address the two gemini MEDIUM findings. Issue #1 (pre-cutover cleartext window) is the most important to address before merge — it's the CodeX P2 finding and it affects the accuracy of the threat model's "cannot observe" claim. Issues #2 and #3 are lower-priority accuracy improvements for a reference document.
| Branch: feat/encryption-8b-design

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1423fa3d3f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-01T11:35:47Z

+     the rough cadence of leadership flips (the counter resets on
+     leadership change).


Correct proposal-ID leakage description

In clusters where a node stays up across leadership changes or handles reads/admin requests, the WAL does not show a counter that resets on each leadership change: Engine.nextID() is a process-wide atomic counter shared by Propose, linearizable reads, and admin conf changes, and there is no reset on leader transitions. Documenting resets as a leader-flip signal gives operators the wrong traffic-analysis model; this residual should describe process-local request IDs with possible gaps from non-WAL reads/admin work instead of per-leader reset behavior.

Useful? React with 👍 / 👎.

codex P2 #1 + claude Issue #1 — pre-cutover cleartext window The 'cannot observe' block in §3 was unconditional; in reality it applies only to entries with index > raftEnvelopeCutoverIndex. On a cluster that enables Stage 6E after existing traffic, the WAL interleaves cleartext pre-cutover entries with post-cutover ciphertext until log compaction removes the pre-cutover segments (snapshot install + segment purge). During that window an adversary with raw WAL access can still read user data from the pre-cutover segments. Added an 'Important caveat' note that documents the 'eventually true post-compaction' framing, matching the engine's strict-> dispatch semantics. codex P2 #2 — proposal-ID leakage description corrected The original wording claimed the 8B proposal ID 'resets on leadership change' — wrong. Engine.nextID() at engine.go:3134 is Engine.nextRequestID.Add(1), a process-wide atomic shared by Propose, linearizable reads, and admin conf changes. It does NOT reset across leadership transitions and is bumped by non-WAL paths, so the on-disk WAL sequence has gaps relative to the in-process counter. Updated §3 item 5 to describe the actual semantics: process-local request IDs with gaps from non-WAL traffic, useful for throughput estimation but NOT a reliable leader-flip signal. claude Issue #2 — nonce field widths in §2 diagram Added explicit byte widths (node_id 2B + local_epoch 2B + write_count 8B = 12B total, all big-endian) with a back-reference to internal/encryption/nonce_factory.go:13-15. Matches the precision the round-1 fix applied to the envelope fields. claude Issue #3 — §6 forward-compat hook scope clarification The original wording presented wal_store.go as 'the implementation would land here as a transparent os.File-wrapping layer' — but wal_store.go's WAL operations route through the upstream go.etcd.io/etcd/server/v3/storage/wal package, which manages *os.File internally and exposes no external io.Writer intercept surface. Updated §6 to clarify that wal_store.go is the operator control surface but the actual wrapping mechanism still has to take one of the approaches §5 rules out (fork the upstream package, or move to an FS-level interposition that's effectively §4's infrastructure-FS-encryption answer). Adds a back-reference to §5 so the reader sees the structural challenge consistently. All four fixes improve threat-model fidelity without changing the §4.3 'no file wrapping' decision.

bootjp · 2026-06-01T13:42:22Z

Round-2 fixes in 2d3f80e:

codex P2 Configure Renovate #1 + claude Issue Configure Renovate #1 — §3 'cannot observe' block was unconditional; in reality entries with index <= raftEnvelopeCutoverIndex stay cleartext until log compaction removes their WAL segments. Added an 'Important caveat' note documenting the 'eventually true post-compaction' framing, matching the engine's strict-> dispatch.
codex P2 Update dependency go to v1.21.4 #2 — proposal-ID leakage description was factually wrong. Verified at engine.go:3134 that Engine.nextID() is nextRequestID.Add(1), a process-wide atomic shared by Propose, reads, and admin paths — it does NOT reset on leadership change. Rewrote §3 item 5 to describe the actual semantics: process-local IDs with gaps from non-WAL traffic, not a reliable leader-flip signal.
claude Issue Update dependency go to v1.21.4 #2 — added nonce field widths (node_id 2B + local_epoch 2B + write_count 8B) to §2 diagram with a back-reference to internal/encryption/nonce_factory.go:13-15.
claude Issue Update docker/setup-qemu-action action to v3 #3 — clarified §6's wrap-hook scope. wal_store.go is the operator control surface but the upstream WAL package manages *os.File internally; an actual wrapping layer still has to take one of the approaches §5 rules out (fork the upstream pkg, or move to an FS-level interposition that's effectively §4's infra-FS-encryption answer). Added a back-reference to §5.

All four fixes improve threat-model fidelity; the §4.3 'no file wrapping' decision is unaffected.

@claude review

claude · 2026-06-01T13:42:47Z

Claude finished @bootjp's task in 2m 51s —— View job

Review of PR #897 — Stage 8b WAL Coverage Closure (Round-2 Fixes)

Tasks

Round-2 Fixes: Verified ✅

Fix #1 — pre-cutover cleartext caveat (§3): The "Important caveat" block correctly characterizes the window. The strict-> guard at engine.go:2270 (if entry.Index > e.raftCutoverIndex()) matches the doc's framing — the cutover entry itself and everything before it is plaintext by construction. The "eventually true post-compaction" language is accurate. ✅

Fix #2 — proposal-ID semantics (§3 item 5): Verified against the implementation. Engine.nextID() at engine.go:3134-3135 returns e.nextRequestID.Add(1) on a atomic.Uint64 field that is zero-initialized at Engine construction and never reset. The three call sites are Propose() (line 741, WAL path), submitRead() (line 1174, linearizable reads — no WAL entry), and submitAdminEx() (line 1320, ConfChange admin). The §3 description — "process-wide atomic.Uint64.Add(1) shared by Propose, linearizable reads, and admin conf changes … does NOT reset across leadership transitions … on-disk WAL sequence shows gaps relative to the in-process counter" — is accurate for all three points. ✅

Fix #3 — nonce field widths in §2 diagram: The added widths (node_id(2B) ‖ local_epoch(2B) ‖ write_count(8B)) match nonce_factory.go:13-15 and the actual binary.BigEndian.PutUint16/PutUint64 calls at lines 88-90. The reference to nonce_factory.go:13-15 is precise. ✅

Fix #4 — §6 scope clarification: The revised text correctly identifies wal_store.go as the operator control surface while explaining that an actual wrapping layer must take one of the approaches §5 rules out (fork the upstream go.etcd.io/etcd/server/v3/storage/wal package or move to FS-level interposition). Confirmed: wal_store.go calls wal.Create, wal.Open, and persist.Save exclusively through the upstream package, which manages *os.File objects internally — no external intercept surface is exposed to callers. ✅

Issues Found

1. §2 diagram still says "monotonic per leader" — contradicts §3 item 5 (Fix #2 is incomplete)

docs/design/2026_06_01_proposed_8b_wal_coverage_closure.md:42:

|   - proposal ID (8B uint64, monotonic per leader)   |

Fix #2 correctly updated §3 item 5 to say the proposal ID is "monotonic per process load — NOT per leader", but the §2 diagram label was not updated. These two statements now contradict each other in the same document. The label should read "monotonic per process load" to be consistent.

Fix this →

2. §4 traffic-analysis bullet still says "proposal-ID resets" — also contradicts Fix #2

docs/design/2026_06_01_proposed_8b_wal_coverage_closure.md:135:

- Leader-flip cadence (proposal-ID resets, term increments).

Same root cause as issue #1. Fix #2 removed the "per-leader reset" framing from §3 item 5 but left it intact in §4's traffic-analysis enumeration. The phrase "proposal-ID resets" is now factually wrong per the corrected §3 item 5 (the ID is process-wide and does not reset on leader flip). Term increments remain a valid leader-flip signal; proposal-ID resets do not.

Suggested replacement: Leader-flip cadence (term increments; proposal IDs do not reset on leader flip — see §3.5).

Fix this →

Non-issues / Confirmed Correct

§3 "Important caveat" — Accurately documents the pre-cutover cleartext window and its resolution via compaction. ✅
§3 item 5 — All three claims verified: process-wide atomic, no leader-flip reset, gaps from non-WAL callers (reads + admin). ✅
§2 dual-envelope diagram nonce widths — node_id(2B) ‖ local_epoch(2B) ‖ write_count(8B) matches implementation exactly. ✅
§6 forward-compat hook — Correctly identifies wal_store.go as operator surface and back-references §5's complexity note. ✅
§4.3 affirmation — "No direct file-level wrapping … protection comes entirely from §4.2" still holds. ✅
§7 Stage 8 row closure proposal — Correct. ✅

Overall: Both remaining issues are minor textual inconsistencies introduced when fix #2 updated §3 item 5 but left the matching phrase in §2's diagram label and §4's traffic-analysis bullet unchanged. No new structural or threat-model concerns. Address issues #1 and #2 (a two-line edit) and the doc is ready to merge.
| Branch: feat/encryption-8b-design

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d3f80e427

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-01T13:44:35Z

+
+## 4. Threat-model justification for accepting the residual
+
+The threat model in §2 of the parent design protects "the persisted state of the cluster" against an adversary with disk access. The residual cleartext above does not reveal user data; it reveals **traffic-analysis metadata**:


Scope the no-user-data claim to post-compaction entries

Fresh evidence: although lines 113-127 now correctly document that pre-cutover WAL segments can still expose keys/values until compaction, this threat-model summary remains unconditional. In clusters enabling Stage 6E after existing traffic, the residual cleartext above does reveal user data from pre-cutover entries, so this sentence should be qualified to index > raftEnvelopeCutoverIndex / after old WAL segments are purged rather than presented as generally true.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-01T13:44:35Z

+
+- Cluster throughput (entries per WAL segment ≈ ops per epoch).
+- Cluster topology changes (ConfChange entries).
+- Leader-flip cadence (proposal-ID resets, term increments).


Remove proposal-ID resets from leader-flip summary

Fresh evidence: the detailed §3 text now says proposal IDs are process-local, shared with reads/admin requests, and do not reset on leadership transitions, which matches Engine.nextID() being a process-wide atomic counter used by Propose, reads, and admin paths. This summary still lists proposal-ID resets as a leader-flip signal, so operators reading only the threat-model justification get the wrong traffic-analysis model; keep term changes or another signal instead.

Useful? React with 👍 / 👎.

gemini-code-assist Bot reviewed Jun 1, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Jun 1, 2026

View reviewed changes

		\| Proposal envelope header (cleartext) \|
		\| - version byte (0x01), proposal ID (8B) \|

	5. Proposal ID sequence (8-byte counter, mostly monotonic per leader). Leaks throughput rate and the rough cadence of leadership flips.
	5. Nonce sequence (specifically the write_count component within the 12-byte deterministic nonce). Leaks throughput rate and the rough cadence of leadership flips.

		the rough cadence of leadership flips (the counter resets on
		leadership change).


		## 4. Threat-model justification for accepting the residual

		The threat model in §2 of the parent design protects "the persisted state of the cluster" against an adversary with disk access. The residual cleartext above does not reveal user data; it reveals traffic-analysis metadata:

Conversation

bootjp commented Jun 1, 2026

Summary

Why this doc exists

What's in the doc

Sequencing

Test plan

Uh oh!

coderabbitai Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented Jun 1, 2026

Uh oh!

claude Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review of PR #897 — Stage 8b WAL Coverage Closure (Round-1 Fixes)

Round-1 Fixes: Verified ✅

Issues Found

1. Pre-cutover cleartext WAL entries — §3 "cannot observe" list is unconditional (CodeX P2 is valid)

2. Nonce field widths missing from §2 diagram

3. §6 forward-compat hook — mild inconsistency with §5

Non-issues / Confirmed Correct

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented Jun 1, 2026

Uh oh!

claude Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review of PR #897 — Stage 8b WAL Coverage Closure (Round-2 Fixes)

Round-2 Fixes: Verified ✅

Issues Found

1. §2 diagram still says "monotonic per leader" — contradicts §3 item 5 (Fix #2 is incomplete)

2. §4 traffic-analysis bullet still says "proposal-ID resets" — also contradicts Fix #2

Non-issues / Confirmed Correct

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

claude Bot commented Jun 1, 2026 •

edited

Loading

1. Pre-cutover cleartext WAL entries — `§3` "cannot observe" list is unconditional (CodeX P2 is valid)

claude Bot commented Jun 1, 2026 •

edited

Loading