kv(composed1): M1 plumbing — ObservedRouteVersion in OperationGroup → pb.Request #881

Merged
bootjp merged 3 commits into main from feat/composed1-m1-observed-route-version
May 30, 2026

Conversation

@bootjp
Owner

@bootjp bootjp commented May 29, 2026

Summary

First milestone (M1) of the Composed-1 cross-group commit-time guard per docs/design/2026_05_29_proposed_composed1_cross_group_commit_guard.md §M1.

Plumbs a new ObservedRouteVersion field end-to-end so the M3 FSM apply-time gate (separate PR) can read it without further plumbing churn. Behaviour-neutral: every existing caller leaves the field at its zero value, and the FSM at this milestone ignores it.

What changes

  • proto/internal.proto — adds uint64 observed_route_version = 6 on pb.Request with an inline comment pointing at the design doc. Regenerated proto/internal.pb.go via protoc 29.6 + protoc-gen-go v1.36.11.
  • kv/transcoder.go — adds ObservedRouteVersion uint64 to OperationGroup with a doc comment that explicitly calls out the zero default and the M3 follow-up.
  • kv/coordinator.go
    • onePhaseTxnRequest gains the observedRouteVersion uint64 param and sets it on the returned pb.Request.
    • Coordinate.dispatchTxn gains the param and forwards it from the calling OperationGroup.
    • Coordinate.Dispatch already routes through dispatchTxn, so the OperationGroup field flows in.
    • Coordinate.buildRedirectRequests (follower→leader forward path) reads reqs.ObservedRouteVersion so the field survives a redirected commit too.
  • kv/sharded_coordinator.go — ShardedCoordinator.dispatchTxn and dispatchSingleShardTxn gain the same param, propagated from the calling OperationGroup.
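
The plumbing described above can be pictured with a minimal, self-contained Go sketch. The `Request` and `OperationGroup` types here are simplified stand-ins for `pb.Request` and `kv.OperationGroup`, and the function bodies and the `dispatch` entry point are assumptions made for illustration, not the repository's actual code. The only property the sketch demonstrates is that the value set on the group reaches the request unchanged and that an unset value stays zero.

```go
package main

import "fmt"

// Request is a hypothetical stand-in for pb.Request; only the fields
// needed for this sketch are modelled.
type Request struct {
	IsTxn                bool
	Ts                   uint64
	ObservedRouteVersion uint64 // 0 means "unpinned" (legacy callers)
}

// OperationGroup is a hypothetical stand-in for kv.OperationGroup.
type OperationGroup struct {
	IsTxn                bool
	StartTS              uint64
	ObservedRouteVersion uint64 // zero value by default
}

// onePhaseTxnRequest builds the request and copies the observed route
// version onto it without interpreting it.
func onePhaseTxnRequest(startTS, observedRouteVersion uint64) *Request {
	return &Request{IsTxn: true, Ts: startTS, ObservedRouteVersion: observedRouteVersion}
}

// dispatchTxn forwards the parameter it received from its caller.
func dispatchTxn(startTS, observedRouteVersion uint64) *Request {
	return onePhaseTxnRequest(startTS, observedRouteVersion)
}

// dispatch is an illustrative entry point that reads the field from
// the group and hands it to dispatchTxn.
func dispatch(g *OperationGroup) *Request {
	return dispatchTxn(g.StartTS, g.ObservedRouteVersion)
}

func main() {
	pinned := dispatch(&OperationGroup{IsTxn: true, StartTS: 10, ObservedRouteVersion: 42})
	legacy := dispatch(&OperationGroup{IsTxn: true, StartTS: 11})
	fmt.Println(pinned.ObservedRouteVersion, legacy.ObservedRouteVersion) // prints "42 0"
}
```

Passing the value as an explicit parameter, rather than reading it from shared state, keeps each hop a pure copy, which is what makes the change behaviour-neutral until a reader of the field exists.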

What does NOT change

  • The FSM (kv/fsm.go) does not read pb.Request.ObservedRouteVersion. That's the M3 milestone.
  • No new errors, no new sentinels, no new control-flow branches on the hot path.
  • No adapter-side change: every adapter calls Dispatch with an OperationGroup whose ObservedRouteVersion is zero by default, which the M3 gate will treat as "unpinned" (legacy) and skip.

Tests

  • TestCoordinateDispatchTxn_PassesObservedRouteVersionToRaftEntry — the round-trip witness: dispatchTxn(..., observedRouteVersion=42) produces pb.Request.ObservedRouteVersion == 42. This is the M1 "done when" criterion from §M1.
  • TestCoordinateDispatchTxn_UnpinnedObservedRouteVersionStaysZero — legacy-default witness: dispatchTxn(..., observedRouteVersion=0) produces pb.Request.ObservedRouteVersion == 0.
  • Existing dispatch tests updated for the new signature (one extra trailing 0 per call site).

Verification

  • go build ./... — clean
  • go test -race -count=1 ./kv — 10.0 s, pass
  • make lint — 0 issues

Self-review (5 lenses)

  1. Data loss — no new write paths; field is plumbed but not yet read by the FSM. Cannot lose a write because no new branch could refuse one.
  2. Concurrency — additive; no new locks, no new goroutines.
  3. Performance — one extra uint64 field on every pb.Request; one extra uint64 param on the dispatch path. No new atomics, no new allocations.
  4. Data consistency — the field flows OperationGroup → dispatchTxn → onePhaseTxnRequest → pb.Request; M3 will close the loop at FSM apply.
  5. Test coverage — round-trip + legacy-default tests cover the M1 contract.

Test plan

  • go test -race ./kv
  • make lint
  • Reviewer confirms the proto regen produces only the expected diff (one field added; one Getter; one descriptor entry) — git diff proto/internal.pb.go should be reviewable as a normal proto field addition
  • (Follow-up PRs) M2 — catalog version retention ring in distribution/Engine + kvFSM struct extension (routes, shardGroupID). M3 — FSM apply-time verifyComposed1 + coordinator retry on ErrComposed1Violation. M4/M5 — Jepsen workload.

Resolves

The M1 row in the Composed-1 design doc.

Summary by CodeRabbit

Release Notes

  • Chores

    • Updated internal transaction coordination and dispatch mechanisms to track and propagate route version information consistently across all transaction lifecycle phases, from preparation through final commit operations.
  • Tests

    • Added regression tests validating that route version information is properly propagated and preserved through transaction dispatch operations in both single-shard and multi-shard transaction scenarios.

… pb.Request

First milestone of the Composed-1 cross-group commit-time guard per
docs/design/2026_05_29_proposed_composed1_cross_group_commit_guard.md
§M1.  Plumbs a new ObservedRouteVersion field end-to-end so the M3
FSM apply-time gate (a separate PR) can read it without further
plumbing work.

Behaviour-neutral:

  * Every existing caller leaves ObservedRouteVersion at its zero
    value.
  * The FSM at this milestone IGNORES pb.Request.ObservedRouteVersion;
    M3 will add the apply-time check.
  * No new errors, no new sentinels, no new control-flow branches on
    the hot path.

Changes:

  * proto/internal.proto — adds `uint64 observed_route_version = 6`
    on pb.Request with an inline comment pointing at the design doc.
  * proto/internal.pb.go — regen via protoc 29.6 + protoc-gen-go
    v1.36.11.
  * kv/transcoder.go — adds ObservedRouteVersion to OperationGroup
    with a doc comment that explicitly calls out the zero default and
    the M3 follow-up.
  * kv/coordinator.go — onePhaseTxnRequest signature gains the
    observedRouteVersion uint64 param and sets it on the returned
    pb.Request.  Coordinate.dispatchTxn signature gains the param
    and forwards it.  Coordinate.Dispatch already routes through
    dispatchTxn so the OperationGroup field flows in.
    Coordinate.buildRedirectRequests (the follower→leader forward
    path's txn branch) reads reqs.ObservedRouteVersion and passes it
    so the field survives a redirected commit too.
  * kv/sharded_coordinator.go — ShardedCoordinator.dispatchTxn and
    its dispatchSingleShardTxn helper gain the same param,
    propagated from the calling OperationGroup at the dispatchTxn
    entry point.

Tests (kv/coordinator_txn_test.go):

  * TestCoordinateDispatchTxn_PassesObservedRouteVersionToRaftEntry —
    the round-trip witness: dispatchTxn(..., observedRouteVersion=42)
    produces pb.Request.ObservedRouteVersion == 42.  This is the M1
    "done when" criterion from the design doc.
  * TestCoordinateDispatchTxn_UnpinnedObservedRouteVersionStaysZero —
    legacy-default witness: dispatchTxn(..., observedRouteVersion=0)
    produces pb.Request.ObservedRouteVersion == 0, matching every
    existing caller.

  Existing dispatch tests updated for the new signature (one extra
  trailing `0` per call site).

Verification:

  * go build ./...                          — clean
  * go test -race -count=1 ./kv             — 10.0 s, pass
  * go test -race -count=1 -run ObservedRouteVersion ./kv — 1.0 s, pass

Self-review (5 lenses):

  1. Data loss — no new write paths; the new field is plumbed but
     not yet read by the FSM.  Cannot lose a write because there is
     no new branch that could refuse one.
  2. Concurrency — additive; no new locks, no new goroutines.
  3. Performance — one extra uint64 field on every pb.Request; one
     extra uint64 param on the dispatch path.  No new atomics, no
     new allocations.
  4. Data consistency — the field flows from OperationGroup →
     dispatchTxn → onePhaseTxnRequest → pb.Request; M3 will close
     the loop at FSM apply.  At this milestone the consistency
     surface is unchanged.
  5. Test coverage — round-trip + legacy-default tests cover the M1
     contract.  Existing dispatch tests adapted to the new
     signature; full ./kv suite passes under -race.

Next milestone (separate PR per design doc §6): M2 — distribution
catalog version retention ring + kvFSM struct extension
(routes / shardGroupID), unlocking the M3 apply-time gate.
@bootjp
Owner Author

bootjp commented May 29, 2026

@claude review

@coderabbitai

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

This PR threads ObservedRouteVersion from transaction request metadata through the single-shard and multi-shard Raft dispatch paths, adding it to the proto Request schema and OperationGroup data contract, updating all coordinator dispatch plumbing, and verifying end-to-end propagation with new tests.

Changes

ObservedRouteVersion propagation

| Layer / File(s) | Summary |
| --- | --- |
| Data contract and schema (kv/transcoder.go, proto/internal.proto) | OperationGroup struct gains an exported ObservedRouteVersion uint64 field documented as the durable catalog version captured at BeginTxn, with 0 treated as "unpinned". Proto Request message adds observed_route_version field (tag 6) to carry it through Raft replication. |
| Single-shard coordinator dispatch (kv/coordinator.go) | dispatchOnce, dispatchTxn, and onePhaseTxnRequestWithPrevCommit are updated to thread observedRouteVersion through the one-phase transaction request path, including follower→leader redirects. The constructed pb.Request now sets ObservedRouteVersion alongside IsTxn, Phase, and other fields. |
| Single-shard coordinator tests (kv/coordinator_txn_test.go) | Five existing test call sites are updated to pass the new observedRouteVersion parameter (set to 0 in existing tests). Two new tests verify round-trip behavior: PassesObservedRouteVersionToRaftEntry asserts non-zero values propagate, and UnpinnedObservedRouteVersionStaysZero asserts 0 is preserved. |
| Multi-shard sharded coordinator dispatch (kv/sharded_coordinator.go) | Dispatch and internal helpers (dispatchTxn, dispatchSingleShardTxn, prewriteTxn, commitPrimaryTxn, commitSecondaryTxns, buildTxnLogs) are updated to accept and propagate observedRouteVersion. All PREPARE and COMMIT pb.Request objects built across single-shard and 2PC multi-shard paths now carry ObservedRouteVersion. |
| Multi-shard sharded coordinator tests (kv/sharded_coordinator_txn_test.go) | New regression test TestShardedCoordinatorDispatchTxn_CrossShardPropagatesObservedRouteVersion dispatches a 2PC transaction across two shard groups with a pinned ObservedRouteVersion and asserts every recorded PREPARE and COMMIT request from both shards carries that pinned value. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 A version flows through the night,
From BeginTxn's starlit height,
Through shards and phases it takes flight,
Each Raft request glows bright!
The route is marked, the route is set—
No catalog drift to regret. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: plumbing the ObservedRouteVersion field from OperationGroup through to pb.Request across multiple coordinator and dispatcher functions. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions
Contributor

TLA+ spec divergence review (auto-triggered)

This PR touches files that the TLA+ safety spec has an anchor on (per
docs/design/2026_05_28_partial_tla_safety_spec.md §3),
so an AI review is requested below to verify the implementation has not drifted
from the model.

Anchored files changed in this PR head (3cbe1c3):

  • kv/coordinator.go
  • kv/sharded_coordinator.go

What to check, by subsystem:

  • kv/hlc*.go — Next() must respect the HLC-4 preconditions (i)/(ii)/(iii) from the design doc: bounded skew, logical-counter handoff on leader change (strategy (c) Observe(MaxAppliedHLC)), and the commit-time ceiling fence (fail-closed when wall_now >= physicalCeiling). Any change to the bit layout (48/16), the CAS loop, or the ceiling getter/setter is in scope.
  • kv/coordinator.go, kv/sharded_coordinator.go — RunHLCLeaseRenewal, hlcRenewalInterval, hlcPhysicalWindowMs constants, and the new-term detection that calls Observe(fsm.MaxAppliedHLC()) (strategy (c)). Any change to renewal cadence, group selection, or fail-closed behaviour is in scope.
  • kv/transaction.go, kv/lock_resolver.go — OCC commit-ts assignment, lock-map encoding (key, lock_ts) -> start_ts, and the LockResolver action OCC-3 depends on. (M2 spec will land OCC-1..OCC-5; until then the spec doc §5.2 is the contract.)
  • kv/fsm.go — FSM apply of HLC lease entries (SetPhysicalCeiling), and any future MaxAppliedHLC() accessor that strategy (c) needs.
  • store/mvcc_store.go — version visibility, snapshot install, and the MVCC-1..MVCC-4 invariants (M3 scope).
  • distribution/** — route catalog versioning, SplitRange atomicity, and CatalogWatcher async fan-out (M4 scope).

If the change is correct but requires a spec update, edit tla/hlc/HLC.tla (or the corresponding M2..M5 module once landed) and the design doc in the same PR. The tla-check workflow runs the TLC model check on the same paths.
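
For readers unfamiliar with the HLC terms in the checklist, the following Go sketch illustrates one way a 48/16 bit layout, a CAS loop, an `Observe` handoff, and a fail-closed ceiling fence can fit together. It is an illustration under stated assumptions only: the `HLC` type, its fields, and the millisecond unit are hypothetical and are not taken from `kv/hlc*.go`, and logical-counter overflow is deliberately not handled.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Assumed layout: upper 48 bits physical milliseconds, lower 16 bits logical.
const (
	logicalBits = 16
	logicalMask = (1 << logicalBits) - 1
)

var errCeiling = errors.New("hlc: wall clock at or past physical ceiling")

// HLC is a hypothetical clock; field names are illustrative only.
type HLC struct {
	last    atomic.Uint64 // last issued timestamp, packed
	ceiling atomic.Uint64 // physical ceiling in milliseconds
}

func pack(physicalMs, logical uint64) uint64 {
	return physicalMs<<logicalBits | (logical & logicalMask)
}

func unpack(ts uint64) (physicalMs, logical uint64) {
	return ts >> logicalBits, ts & logicalMask
}

// Next issues a strictly increasing timestamp. It fails closed when the
// wall clock has reached the ceiling instead of issuing a timestamp.
// Simplification: a logical-counter overflow would carry into the
// physical bits; a real implementation would need to handle that.
func (h *HLC) Next(wallNowMs uint64) (uint64, error) {
	if wallNowMs >= h.ceiling.Load() {
		return 0, errCeiling
	}
	for {
		prev := h.last.Load()
		next := pack(wallNowMs, 0)
		if next <= prev {
			next = prev + 1 // wall clock did not advance: bump the logical part
		}
		if h.last.CompareAndSwap(prev, next) {
			return next, nil
		}
	}
}

// Observe raises the clock to at least ts, for example after a leader change.
func (h *HLC) Observe(ts uint64) {
	for {
		prev := h.last.Load()
		if ts <= prev || h.last.CompareAndSwap(prev, ts) {
			return
		}
	}
}

func main() {
	h := &HLC{}
	h.ceiling.Store(2000)
	a, _ := h.Next(1000)
	b, _ := h.Next(1000)
	fmt.Println(unpack(a)) // prints "1000 0"
	fmt.Println(unpack(b)) // prints "1000 1"
	_, err := h.Next(2000)
	fmt.Println(err != nil) // prints "true"
}
```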


@claude review please verify TLA+ spec divergence per the checklist above.

@codex review please verify TLA+ spec divergence per the checklist above.

@claude

claude Bot commented May 29, 2026

Claude finished @bootjp's task in 3m 2s


Code Review — PR #881

Todo

  • Load tools
  • Read changed files
  • Review proto changes
  • Review coordinator changes
  • Review sharded_coordinator changes
  • Review transcoder changes
  • Review tests
  • Post findings

Overall

The M1 plumbing is clean and the single-shard path is correctly wired end-to-end. One medium-severity gap, one low-severity toolchain concern, and a couple of nits follow.


[Medium] Multi-shard 2PC path silently drops ObservedRouteVersion

ShardedCoordinator.dispatchTxn has two branches:

  1. Single-shard fast path — calls dispatchSingleShardTxn → onePhaseTxnRequest → correctly carries observedRouteVersion. ✅
  2. Multi-shard 2PC path — calls prewriteTxn, commitPrimaryTxn, commitSecondaryTxns. None of these take observedRouteVersion and none set it on the pb.Request they construct. ✅ for M1 (field ignored by FSM) but ❌ for the stated goal.

Concrete locations where the field is missing in the 2PC path:

  • sharded_coordinator.go:687-693 — prewriteTxn builds pb.Request{Phase: PREPARE} with no ObservedRouteVersion
  • sharded_coordinator.go:739-744 — commitPrimaryTxn builds pb.Request{Phase: COMMIT} with no ObservedRouteVersion
  • sharded_coordinator.go:770-775 — commitSecondaryTxns builds pb.Request{Phase: COMMIT} with no ObservedRouteVersion

When M3 adds verifyComposed1 to the FSM's handleTxnRequest, it must run on PREPARE and COMMIT phases of multi-shard txns too — that's exactly where cross-group route shifts are dangerous. Without the field in those entries, the M3 check will silently be a no-op for any transaction that spans more than one shard, because observedRouteVersion == 0 is the "unpinned / skip check" sentinel.

The M1 scope in the design doc lists "propagate through dispatchTxn and into onePhaseTxnRequest" — but that description implicitly scopes to the one-phase path. If the 2PC path is intentionally deferred, the design doc's M1 "done when" criterion should say so explicitly, and a new sub-milestone (or a note in M3) should track that prewriteTxn / commitPrimaryTxn / commitSecondaryTxns need the same plumbing before M3 can enforce the guard on all txns.
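
To make the failure mode concrete, here is a hedged Go sketch of why a dropped field turns the future gate into a no-op. The M3 gate does not exist in this PR, so everything here is an assumption: the `Request` type is a stand-in for `pb.Request`, the body of `verifyComposed1` is invented, and the comparison against a single "current" version is a placeholder for whatever rule the design doc specifies. Only the zero-means-unpinned sentinel and the `ErrComposed1Violation` name come from the PR text.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrComposed1Violation is named in the PR's follow-up plan; its use
// here is illustrative.
var ErrComposed1Violation = errors.New("composed-1: route version changed since BeginTxn")

// Request is a hypothetical stand-in for pb.Request.
type Request struct {
	ObservedRouteVersion uint64
}

// verifyComposed1 is an ASSUMED shape for the M3 gate: skip when the
// request is unpinned (0), otherwise reject on any version mismatch.
// The real check may compare against retained catalog versions instead.
func verifyComposed1(req *Request, currentRouteVersion uint64) error {
	if req.ObservedRouteVersion == 0 {
		return nil // unpinned / legacy: the gate does not run
	}
	if req.ObservedRouteVersion != currentRouteVersion {
		return ErrComposed1Violation
	}
	return nil
}

func main() {
	const current = 43 // the catalog moved on after the txn pinned version 42

	pinned := &Request{ObservedRouteVersion: 42}
	dropped := &Request{} // what a 2PC envelope looks like if the field is lost

	fmt.Println(verifyComposed1(pinned, current) != nil)  // prints "true": stale txn rejected
	fmt.Println(verifyComposed1(dropped, current) != nil) // prints "false": stale txn slips through
}
```

Because zero is both the legacy default and the result of a missed assignment, the gate cannot distinguish "caller opted out" from "plumbing lost the value", which is why the propagation has to be complete before the gate is enabled.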


[Low] Proto regeneration toolchain divergence

The generated header changed from:

// 	protoc        v7.34.0

to:

// 	protoc        v5.29.6

CLAUDE.md pins the toolchain to libprotoc 29.3. The PR description says protoc 29.6 was used, which is a different version. Additionally, the source path flipped from source: proto/internal.proto → source: internal.proto, indicating the regen was run from inside the proto/ directory (consistent with proto/Makefile invoking protoc with internal.proto). This caused a bulk rename of all file_proto_internal_proto_* internal symbols to file_internal_proto_*.

The rename is functionally harmless (none of those names are exported), but it:

  1. Creates diff noise that obscures the actual one-field addition.
  2. Diverges from the CLAUDE.md-pinned libprotoc 29.3 (the existing file was generated with v7.34.0, which itself appeared non-conformant — so this may be a pre-existing issue).

Worth confirming the toolchain against proto/Makefile before merge to keep regeneration reproducible.


[Nit] TestCoordinateDispatchTxn_UnpinnedObservedRouteVersionStaysZero tests proto3 defaults

coordinator_txn_test.go:170-195 — verifying that observedRouteVersion=0 produces pb.Request.ObservedRouteVersion == 0 is asserting a proto3 language guarantee (absent fields decode as zero-value), not application logic. It won't catch any real application bug. If kept, the comment that calls it the "legacy-compatible default the rest of the codebase relies on" adequately justifies its documentary value — but it's worth calling out that the test itself provides no meaningful regression protection.


[Nit] No test for ShardedCoordinator.dispatchSingleShardTxn round-trip

The new round-trip witness (TestCoordinateDispatchTxn_PassesObservedRouteVersionToRaftEntry) exercises Coordinate.dispatchTxn, not ShardedCoordinator.dispatchSingleShardTxn. Looking at kv/coordinator_txn_test.go, there's a sibling file for sharded coordinator tests. A parallel TestShardedCoordinatorDispatchTxn_PassesObservedRouteVersionToRaftEntry would provide symmetric coverage for the ShardedCoordinator single-shard path, matching the pattern established by TestShardedCoordinatorDispatchTxn_SingleShardIncludesReadKeysInRaftEntry (referenced in coordinator_txn_test.go:141).


What's correct

  • Field number 6 is correct and backward-compatible with proto3 (no existing wire encoding breaks).
  • buildRedirectRequests at coordinator.go:1030 correctly forwards the field on the follower→leader redirect path — this is easy to miss and it's done right.
  • OperationGroup.ObservedRouteVersion has a good doc comment explaining the zero/unpinned sentinel and the M3 forward-reference.
  • Both Coordinate.dispatchTxn and ShardedCoordinator.dispatchTxn are correctly updated; the dispatchOnce → dispatchTxn call site at coordinator.go:469 is also correct.
  • The TestCoordinateDispatchTxn_PassesObservedRouteVersionToRaftEntry round-trip test is the right shape for an M1 "done when" criterion.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces plumbing for ObservedRouteVersion (part of the Composed-1 cross-group commit guard design) to propagate the durable catalog version from the transaction's read set to the FSM via pb.Request. This is implemented for single-shard transactions in both the standard and sharded coordinators, with corresponding unit tests added. However, a critical correctness issue was identified where ObservedRouteVersion is not propagated in the multi-shard (2PC) transaction path in ShardedCoordinator, which would cause multi-shard transactions to bypass the safety checks.

Comment thread kv/sharded_coordinator.go Outdated
  // so that validateReadOnlyShards can issue a linearizable read barrier,
  // preserving SSI.
- return c.dispatchSingleShardTxn(ctx, startTS, commitTS, primaryKey, gids[0], elems, readKeys)
+ return c.dispatchSingleShardTxn(ctx, startTS, commitTS, primaryKey, gids[0], elems, readKeys, observedRouteVersion)
Contributor

critical

Critical Correctness Issue: Missing ObservedRouteVersion Propagation in Multi-Shard (2PC) Path

While the single-shard fast path (dispatchSingleShardTxn) correctly propagates observedRouteVersion, the multi-shard 2PC path immediately below it completely omits this propagation.

Multi-shard transactions are the most critical path for cross-group consistency and are highly susceptible to route shifts (e.g., during range splits or moves). If ObservedRouteVersion is not propagated to the PREPARE and COMMIT requests, the FSM will see a default value of 0 (unpinned) and skip the Composed-1 safety checks entirely for all multi-shard transactions.

To fix this, please propagate observedRouteVersion through the 2PC path:

  1. Update dispatchTxn calls (lines 601, 606, 620):
prepared, err := c.prewriteTxn(ctx, startTS, commitTS, primaryKey, grouped, gids, groupedReadKeys, observedRouteVersion)
...
primaryGid, maxIndex, err := c.commitPrimaryTxn(ctx, startTS, primaryKey, grouped, commitTS, observedRouteVersion)
...
maxIndex = c.commitSecondaryTxns(ctx, startTS, primaryGid, primaryKey, grouped, gids, commitTS, maxIndex, observedRouteVersion)
  2. Update prewriteTxn signature and request construction (lines 678, 687):
func (c *ShardedCoordinator) prewriteTxn(ctx context.Context, startTS, commitTS uint64, primaryKey []byte, grouped map[uint64][]*pb.Mutation, gids []uint64, groupedReadKeys map[uint64][][]byte, observedRouteVersion uint64) ([]preparedGroup, error) {
    ...
    req := &pb.Request{
        IsTxn:                true,
        Phase:                pb.Phase_PREPARE,
        Ts:                   startTS,
        Mutations:            append([]*pb.Mutation{prepareMeta}, grouped[gid]...),
        ReadKeys:             groupedReadKeys[gid],
        ObservedRouteVersion: observedRouteVersion,
    }
    ...
}
  3. Update commitPrimaryTxn signature and request construction (lines 726, 739):
func (c *ShardedCoordinator) commitPrimaryTxn(ctx context.Context, startTS uint64, primaryKey []byte, grouped map[uint64][]*pb.Mutation, commitTS uint64, observedRouteVersion uint64) (uint64, uint64, error) {
    ...
    req := &pb.Request{
        IsTxn:                true,
        Phase:                pb.Phase_COMMIT,
        Ts:                   startTS,
        Mutations:            append([]*pb.Mutation{meta}, keys...),
        ObservedRouteVersion: observedRouteVersion,
    }
    ...
}
  4. Update commitSecondaryTxns signature and request construction (lines 756, 770):
func (c *ShardedCoordinator) commitSecondaryTxns(ctx context.Context, startTS uint64, primaryGid uint64, primaryKey []byte, grouped map[uint64][]*pb.Mutation, gids []uint64, commitTS uint64, maxIndex uint64, observedRouteVersion uint64) uint64 {
    ...
    req := &pb.Request{
        IsTxn:                true,
        Phase:                pb.Phase_COMMIT,
        Ts:                   startTS,
        Mutations:            append([]*pb.Mutation{meta}, keyMutations(grouped[gid])...),
        ObservedRouteVersion: observedRouteVersion,
    }
    ...
}
  5. Update txnLogs and buildTxnLogs (lines 1269, 1352) for completeness:
func (c *ShardedCoordinator) txnLogs(reqs *OperationGroup[OP]) ([]*pb.Request, error) {
    ...
    return buildTxnLogs(reqs.StartTS, commitTS, grouped, gids, reqs.ObservedRouteVersion)
}

func buildTxnLogs(startTS uint64, commitTS uint64, grouped map[uint64][]*pb.Mutation, gids []uint64, observedRouteVersion uint64) ([]*pb.Request, error) {
    ...
    // Set ObservedRouteVersion: observedRouteVersion on both PREPARE and COMMIT requests
}

… (gemini critical on PR #881)

Closes a critical hole gemini's review surfaced on the first revision
of M1.  The single-shard fast path (`dispatchSingleShardTxn`)
correctly forwarded OperationGroup.ObservedRouteVersion into
pb.Request via `onePhaseTxnRequest`, but the multi-shard 2PC path
(`prewriteTxn` → `commitPrimaryTxn` → `commitSecondaryTxns`) and the
`buildTxnLogs` helper dropped it on the floor.  Every PREPARE and
COMMIT envelope from a cross-shard txn arrived at the FSM with
ObservedRouteVersion = 0 ("unpinned"), which would silently bypass
the M3 Composed-1 apply-time gate exactly for the workload most at
risk of a cross-group route shift.

Multi-shard txns are precisely where a `MoveRange` or cross-group
`SplitRange` between BeginTxn and Commit can land — losing the
plumbing here would have left M3's guard unable to even see those
txns to enforce on.

Sites fixed:

  * dispatchTxn — three call sites updated to forward
    observedRouteVersion into prewriteTxn / commitPrimaryTxn /
    commitSecondaryTxns.
  * prewriteTxn — signature gains observedRouteVersion; sets it on
    every PREPARE pb.Request.
  * commitPrimaryTxn — signature gains observedRouteVersion; sets
    it on the primary's COMMIT pb.Request.
  * commitSecondaryTxns — signature gains observedRouteVersion;
    sets it on every secondary COMMIT pb.Request.
  * txnLogs / buildTxnLogs — buildTxnLogs signature gains
    observedRouteVersion; sets it on both PREPARE and COMMIT
    requests it constructs; txnLogs forwards reqs.ObservedRouteVersion
    from the OperationGroup.

Tests (added before the fix per CLAUDE.md convention):

  * TestShardedCoordinatorDispatchTxn_CrossShardPropagatesObservedRouteVersion
    — routes two PUTs across two shards, sets
    ObservedRouteVersion=42 on the OperationGroup, asserts every
    PREPARE + COMMIT envelope across both shards carries the pinned
    value.  Verified failing on the parent commit (0x0 instead of
    0x2a, with the precise error message identifying which phase
    leaked) and passing on this commit.

Verification:

  * go build ./...                          — clean
  * go test -race -count=1 ./kv             — 9.9 s, pass

Self-review (5 lenses):

  1. Data loss — pure plumbing; the new field flows through the same
     Raft envelope structure that already round-trips, and the FSM
     ignores it pre-M3.  No new branch that could refuse a write.
  2. Concurrency — additive; no new locks, no new goroutines.
  3. Performance — one extra uint64 set per pb.Request in the 2PC
     path; identical to the existing single-shard cost from #881
     parent commit.  No new allocations.
  4. Data consistency — Composed-1 M1 plumbing is now consistent
     across single-shard AND multi-shard paths; M3's FSM gate will
     see the same field shape from every txn entry point.  Closes
     the gemini-critical regression.
  5. Test coverage — round-trip test covers PREPARE + COMMIT across
     both shards.  Existing 2PC tests (CrossShardPhasesAndCommitIndex,
     SingleShardIncludesReadKeysInRaftEntry) continue to pass.
@bootjp
Owner Author

bootjp commented May 30, 2026

Gemini critical addressed in the latest commit:

The single-shard fast path was forwarding ObservedRouteVersion correctly via onePhaseTxnRequest, but the multi-shard 2PC path (prewriteTxn / commitPrimaryTxn / commitSecondaryTxns) and the buildTxnLogs helper silently dropped the field — every cross-shard PREPARE and COMMIT envelope arrived at the FSM with ObservedRouteVersion = 0, silently bypassing the M3 apply-time gate exactly for the workload most at risk of a cross-group route shift.

Fixed all 5 sites per gemini's enumeration:

  • prewriteTxn — PREPARE pb.Request gains ObservedRouteVersion
  • commitPrimaryTxn — primary COMMIT gains it
  • commitSecondaryTxns — secondary COMMIT gains it
  • buildTxnLogs — both PREPARE and COMMIT it builds gain it
  • dispatchTxn (3 call sites) + txnLogs forward observedRouteVersion from the OperationGroup

Regression test added before the fix per CLAUDE.md convention:
TestShardedCoordinatorDispatchTxn_CrossShardPropagatesObservedRouteVersion — routes 2 PUTs across 2 shards, sets ObservedRouteVersion=42 on the OperationGroup, asserts every PREPARE+COMMIT envelope across both shards carries the pinned value. Verified failing on the parent commit (0x0 vs 0x2a, error message identified the leaking PREPARE phase) and passing now.

@claude review

@claude

claude Bot commented May 30, 2026

Claude finished @bootjp's task in 2m 45s


Code Review — PR #881 (follow-up: 2PC fix verification)

Todo

  • Load tools
  • Read sharded_coordinator.go changes
  • Read sharded_coordinator_txn_test.go changes
  • Review diff from origin/main
  • Verify all 5 fix sites
  • Check regression test quality
  • Post findings

Overall

The 2PC propagation fix is correct and complete. All five sites identified by Gemini are addressed, the regression test is well-constructed, and the abort path is correctly left alone. No blocking issues remain.


Fix site verification

| Site | Status | Notes |
| --- | --- | --- |
| dispatchTxn — 3 call sites (prewriteTxn / commitPrimaryTxn / commitSecondaryTxns) | | All three now forward observedRouteVersion |
| prewriteTxn — PREPARE pb.Request | | ObservedRouteVersion: observedRouteVersion set on struct literal |
| commitPrimaryTxn — primary COMMIT pb.Request | | Correctly set |
| commitSecondaryTxns — secondary COMMIT pb.Request | | Correctly set |
| txnLogs → buildTxnLogs — both PREPARE and COMMIT entries | | Both entries in the batch gain the field |

Regression test

TestShardedCoordinatorDispatchTxn_CrossShardPropagatesObservedRouteVersion is the right shape:

  • Routes "b" → shard 1, "x" → shard 2, forcing the multi-shard 2PC code path (single-shard fast path requires len(gids) == 1).
  • recordingTransactional stubs with exactly 2 responses per shard, matching PREPARE + COMMIT ordering.
  • require.Len(t, g1Txn.requests, 2) / require.Len(t, g2Txn.requests, 2) correctly asserts both phases fired — this would catch a regression where the 2PC loop short-circuited.
  • The loop over append(g1Txn.requests, g2Txn.requests...) exercises all 4 envelopes: shard-1 PREPARE, shard-1 COMMIT (primary path via commitPrimaryTxn), shard-2 PREPARE, shard-2 COMMIT (secondary path via commitSecondaryTxns). Both commitPrimaryTxn and commitSecondaryTxns are exercised.
  • The error message includes the pre-fix failure (0x0 instead of 0x2a) and names the leaking phase — sufficient for future bisection.
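
The shape of that regression test can be approximated with a small recording stub. This is a sketch, not the repository's test: `recorder`, `dispatch2PC`, and `leaked` are hypothetical names, the `Request` type is a stand-in for `pb.Request`, and the two-phase loop is a simplification of the real prewrite and commit helpers.

```go
package main

import "fmt"

type Phase int

const (
	PhasePrepare Phase = iota
	PhaseCommit
)

// Request is a hypothetical stand-in for pb.Request.
type Request struct {
	Phase                Phase
	ObservedRouteVersion uint64
}

// recorder plays the role of a recording transactional stub: it keeps
// every request a shard group receives, in order.
type recorder struct{ requests []*Request }

func (r *recorder) Commit(req *Request) { r.requests = append(r.requests, req) }

// dispatch2PC is a simplified two-phase dispatch: PREPARE on every
// shard, then COMMIT on every shard, stamping each envelope.
func dispatch2PC(shards []*recorder, observedRouteVersion uint64) {
	for _, s := range shards {
		s.Commit(&Request{Phase: PhasePrepare, ObservedRouteVersion: observedRouteVersion})
	}
	for _, s := range shards {
		s.Commit(&Request{Phase: PhaseCommit, ObservedRouteVersion: observedRouteVersion})
	}
}

// leaked lists every recorded envelope that does not carry want,
// naming the shard and phase so a failure points at the leaking site.
func leaked(want uint64, shards ...*recorder) []string {
	var out []string
	for i, s := range shards {
		for _, req := range s.requests {
			if req.ObservedRouteVersion != want {
				out = append(out, fmt.Sprintf("shard %d phase %d: got %#x, want %#x",
					i+1, req.Phase, req.ObservedRouteVersion, want))
			}
		}
	}
	return out
}

func main() {
	g1, g2 := &recorder{}, &recorder{}
	dispatch2PC([]*recorder{g1, g2}, 42)
	fmt.Println(len(g1.requests), len(g2.requests)) // prints "2 2"
	fmt.Println(len(leaked(42, g1, g2)))            // prints "0"
}
```

Asserting the per-shard request count before checking values matters: without it, a dispatch that short-circuits after PREPARE would pass a value-only check vacuously.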

[Observation, non-blocking] txnLogs/buildTxnLogs is a dead path in the current dispatch flow

txnLogs has a len(gids) != 1 guard (line 1279) that rejects multi-shard inputs, and requestLogs (its only caller) is only invoked from dispatchNonTxn, which is only reached when reqs.IsTxn == false. So in the current flow, txnLogs is never called with a transactional OperationGroup. The fix to buildTxnLogs is nonetheless correct — it avoids a latent bug if the path is ever wired up for single-shard txn replay — but the fix is not on the critical path for M3 enforcement, and a unit test for the single-shard compat path with ObservedRouteVersion would not be worth adding at this milestone.


[Observation, non-blocking] Abort path correctly omits ObservedRouteVersion

abortPreparedTxn (line 836) builds pb.Request{Phase: ABORT} without setting ObservedRouteVersion. This is correct: ABORT is a compensating cleanup action on an already-doomed transaction; the M3 gate should only enforce on PREPARE and COMMIT (the forward commit path), not on rollback cleanup.
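
As a rough sketch of the phase distinction argued for here, the predicate below decides whether a future gate would look at a request at all. It is hypothetical: `gateApplies` and the simplified `Phase` and `Request` types are not from the codebase, and what M3's gate actually compares the version against is not shown because the thread does not specify it. The zero-means-unpinned rule comes from the proto field comment.

```go
package main

import "fmt"

// Phase is a simplified stand-in for the pb.Request phase enum.
type Phase int

const (
	PhasePrepare Phase = iota
	PhaseCommit
	PhaseAbort
)

// Request keeps only the fields the predicate needs.
type Request struct {
	Phase                Phase
	ObservedRouteVersion uint64
}

// gateApplies reports whether a hypothetical apply-time gate should
// inspect the request. Zero means "unpinned", so such requests are
// skipped; ABORT is rollback cleanup and is skipped regardless.
func gateApplies(req Request) bool {
	if req.ObservedRouteVersion == 0 {
		return false
	}
	return req.Phase == PhasePrepare || req.Phase == PhaseCommit
}

func main() {
	fmt.Println(gateApplies(Request{Phase: PhaseCommit, ObservedRouteVersion: 42}))
	fmt.Println(gateApplies(Request{Phase: PhaseAbort}))
}
```

Because ABORT requests leave the field at zero, they fall out of the gate twice over in this sketch: once by phase and once by the unpinned default.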


What changed since the initial review

  • Prior state: single-shard fast path propagated ObservedRouteVersion correctly; multi-shard 2PC path silently dropped it to 0 on all PREPARE and COMMIT envelopes.
  • Now: the field propagates uniformly across both paths. M3's verifyComposed1 gate will see the pinned version from any txn type.

The M1 contract is now fully satisfied for both the single-shard and multi-shard code paths. Ready to merge pending any other reviewer sign-off.

@bootjp bootjp enabled auto-merge May 30, 2026 06:00
…erved-route-version

# Conflicts:
#	kv/coordinator.go
#	kv/sharded_coordinator.go
@github-actions

TLA+ spec divergence review (auto-triggered)

This PR touches files that the TLA+ safety spec has an anchor on (per
docs/design/2026_05_28_partial_tla_safety_spec.md §3),
so an AI review is requested below to verify the implementation has not drifted
from the model.

Anchored files changed in this PR head (9e2ce9b):

  • kv/coordinator.go
  • kv/sharded_coordinator.go

What to check, by subsystem:

  • kv/hlc*.go: Next() must respect the HLC-4 preconditions (i)/(ii)/(iii) from the design doc: bounded skew, logical-counter handoff on leader change (strategy (c) Observe(MaxAppliedHLC)), and the commit-time ceiling fence (fail-closed when wall_now >= physicalCeiling). Any change to the bit layout (48/16), the CAS loop, or the ceiling getter/setter is in scope.
  • kv/coordinator.go, kv/sharded_coordinator.go: RunHLCLeaseRenewal, hlcRenewalInterval, hlcPhysicalWindowMs constants, and the new-term detection that calls Observe(fsm.MaxAppliedHLC()) (strategy (c)). Any change to renewal cadence, group selection, or fail-closed behaviour is in scope.
  • kv/transaction.go, kv/lock_resolver.go — OCC commit-ts assignment, lock-map encoding (key, lock_ts) -> start_ts, and the LockResolver action OCC-3 depends on. (M2 spec will land OCC-1..OCC-5; until then the spec doc §5.2 is the contract.)
  • kv/fsm.go — FSM apply of HLC lease entries (SetPhysicalCeiling), and any future MaxAppliedHLC() accessor that strategy (c) needs.
  • store/mvcc_store.go — version visibility, snapshot install, and the MVCC-1..MVCC-4 invariants (M3 scope).
  • distribution/** — route catalog versioning, SplitRange atomicity, and CatalogWatcher async fan-out (M4 scope).
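
For orientation, the HLC properties named in the first checklist item can be sketched as below. This is an illustrative model only, not the code in kv/hlc*.go: it assumes the 48/16 layout means 48 bits of physical milliseconds above 16 bits of logical counter, passes the wall clock in as a parameter for determinism, and uses invented names (`HLC`, `pack`, `errCeiling`).

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

const logicalBits = 16 // assumed: low 16 bits logical, high 48 bits physical ms

var errCeiling = errors.New("hlc: wall clock at or past physical ceiling")

// HLC is a hypothetical hybrid logical clock packed into one uint64.
type HLC struct {
	last    atomic.Uint64 // last issued timestamp
	ceiling atomic.Uint64 // physical ceiling in ms, set by lease renewal
}

func pack(physMs, logical uint64) uint64 {
	return physMs<<logicalBits | logical&(1<<logicalBits-1)
}

// Next issues a timestamp strictly greater than any previously issued one.
// It fails closed when the wall clock has reached the ceiling.
func (h *HLC) Next(wallMs uint64) (uint64, error) {
	if wallMs >= h.ceiling.Load() {
		return 0, errCeiling
	}
	for {
		old := h.last.Load()
		next := pack(wallMs, 0)
		if next <= old {
			next = old + 1 // same or earlier wall time: bump the logical part
		}
		if h.last.CompareAndSwap(old, next) {
			return next, nil
		}
	}
}

// Observe raises the clock to ts if ts is ahead, as a new leader would do
// with the maximum applied timestamp.
func (h *HLC) Observe(ts uint64) {
	for {
		old := h.last.Load()
		if ts <= old || h.last.CompareAndSwap(old, ts) {
			return
		}
	}
}

func main() {
	h := &HLC{}
	h.ceiling.Store(1000)
	ts, err := h.Next(10)
	fmt.Println(ts>>logicalBits, ts&(1<<logicalBits-1), err)
}
```

The sketch makes the review scope tangible: changing `logicalBits`, the retry loop, or the ceiling comparison would each alter a property the checklist asks reviewers to re-verify.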

If the change is correct but requires a spec update, edit tla/hlc/HLC.tla (or the corresponding M2..M5 module once landed) and the design doc in the same PR. The tla-check workflow runs the TLC model check on the same paths.


@claude review please verify TLA+ spec divergence per the checklist above.

@codex review please verify TLA+ spec divergence per the checklist above.

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@proto/internal.proto`:
- Around line 47-55: The generated proto file header in proto/internal.pb.go
indicates it was built with protoc v5.29.6 despite proto/Makefile pinning
PROTOC_VERSION := libprotoc 29.3; regenerate proto/internal.pb.go using the
pinned libprotoc 29.3 (via the Makefile's proto target or the pinned toolchain)
so the file header matches the pinned version and then commit the regenerated
file (ensure the symbol observed_route_version and other messages remain
intact).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8e3d5cff-e6b9-4425-a188-24690f5645d8

📥 Commits

Reviewing files that changed from the base of the PR and between 958ddcf and 9e2ce9b.

⛔ Files ignored due to path filters (1)
  • proto/internal.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (6)
  • kv/coordinator.go
  • kv/coordinator_txn_test.go
  • kv/sharded_coordinator.go
  • kv/sharded_coordinator_txn_test.go
  • kv/transcoder.go
  • proto/internal.proto

Comment thread proto/internal.proto
Comment on lines +47 to +55
// observed_route_version is the durable catalog version the
// transaction's read set was captured at (set on BeginTxn from
// distribution.Engine.Version()). Zero means "unpinned" (legacy
// callers + read-only paths). M3 of the Composed-1 design
// (docs/design/2026_05_29_proposed_composed1_cross_group_commit_guard.md)
// will gate the FSM apply path on it; M1 (this field's introduction)
// is plumbing only — the FSM ignores the value, so all existing
// callers see no behaviour change.
uint64 observed_route_version = 6;

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the generated field exists and the regen toolchain versions match the pinned ones.
fd -t f 'internal.pb.go' --exec rg -n 'ObservedRouteVersion|protoc-gen-go v|protoc *v' {}
# Inspect proto/Makefile for the pinned toolchain versions referenced in guidelines.
fd -t f 'Makefile' proto --exec cat {}

Repository: bootjp/elastickv

Length of output: 2289


Regenerate proto/internal.pb.go with the pinned libprotoc 29.3.

proto/Makefile pins PROTOC_VERSION := libprotoc 29.3, but the generated proto/internal.pb.go header shows protoc v5.29.6 (while still containing observed_route_version = 6), so it wasn’t produced with the required toolchain.


@bootjp bootjp merged commit 146db6b into main May 30, 2026
13 checks passed
@bootjp bootjp deleted the feat/composed1-m1-observed-route-version branch May 30, 2026 06:15