feat(txn-dedup): re-land M3 EXEC reuse on main + close M2 open + multi-mop by bootjp · Pull Request #887 · bootjp/elastickv

bootjp · 2026-05-30T07:24:03Z

Summary

PR #884 was merged into the stacked branch docs/txn-idempotency-design (at cbbde3d7) but never reached main — main has only PR #796's M1 + M2 + M3 RPUSH/LPUSH content. This PR re-lands the M3 EXEC reuse code on top of main and bundles three follow-ups that move the design doc's "still open" / "follow-up" items into "landed".

Base: main.

Changes

1. Re-land M3 EXEC reuse (PR #884's content)

adapter/redis.go: txnContext.prepareDispatch() split out of commit(); reusableExecTxn; dispatchExecReuse; runTransactionWithDedup + firstExecAttempt; gate at the top of runTransaction.
adapter/redis_exec_dedup_test.go (originally added in PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884): 5 tests pinning all four reuse outcomes plus the gate-off legacy equivalence.
Cherry-pick adaptation: prepareDispatch() uses Clock().NextFenced() (uint64, error) on current main; the PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884 version targeted Clock().Next() (uint64). Same downstream semantics, error wired through preparedTxnDispatch.

2. Close M2 open item — FSM other-txn exactness test

TestOnePhaseDedup_OtherTxnVersionDoesNotMaskRetry (kv/fsm_onephase_dedup_test.go) pins exactness at the FSM apply layer: a third-party version at T_other=20 must not satisfy the FSM probe at T1=30, so the retry falls through and applies at the fresh T2=40. The store-layer pin already covers the primitive; this test covers the dispatch path that uses it.

3. M3 multi-mop EXEC dedup test

TestExecDedup_MultiMopLandedPriorAttempt_ReturnsCachedResults extends single-mop dedup to a 3-command MULTI/EXEC body (SET a + SET b + DEL c). Validates that cached results + OCC readKeys fence are mop-count-agnostic. Without dedup the DEL would re-execute to 0 on the second pass — the test rejects that.

4. Design doc updates

§M2 "still open" → "LANDED" with the new FSM test reference
§M3 "runTransaction (MULTI/EXEC) — Still open" → "LANDED via PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884" with multi-mop test reference and acknowledgement of two intentional deviations from the M1/M2 template that claude[bot] flagged on feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884 (readKeys assembly order, fresh per-attempt reuseCtx)
§M3 "standalone SET/INCR/HSET" called out as PR-B follow-up

Caller audit (per /loop semantic-change rule)

prepareDispatch (newly added): callers are commit() and firstExecAttempt; both honor defer prepared.cancel(). External behavior of commit() preserved.
commit(): internal structure changed; external behavior preserved (no test directly invokes it).
runTransactionWithDedup / firstExecAttempt / dispatchExecReuse / reusableExecTxn: all new symbols, exercised only from the gated runTransaction path.

Validation

go test ./adapter/ -run 'Dedup|Txn|MULTI|EXEC' passes
go test ./kv/ ./store/ both pass
gofmt, go vet, golangci-lint run all clean (0 issues)

Relation to prior work

PR	Merged into	Content
#796 (`f481f2b7`)	main	M1 + M2 + M3 RPUSH/LPUSH
#884 (`cbbde3d7`)	stacked branch, NOT main	M3 EXEC reuse (stranded)
This PR	main	Re-land #884 + M2 cross-txn FSM test + multi-mop test + doc updates
PR-B (next)	main	Standalone SET / INCR / HSET reuse paths
PR-C (next)	main	M4 Jepsen validation infrastructure

Summary by CodeRabbit

Release Notes

Tests
- Added comprehensive regression test suite for transaction retry and deduplication handling, covering cached result reuse, conflict detection, and legacy behavior paths
- Extended test coverage for standalone command transaction consistency
Documentation
- Updated design documentation to reflect transaction handling improvements and recent test coverage additions

…i-mop PR #884 was merged into the stacked branch docs/txn-idempotency-design (at cbbde3d) but never reached main — main has only PR #796's M1 + M2 + M3 RPUSH/LPUSH content. This PR re-lands the M3 EXEC reuse code on top of main and bundles three follow-ups that extend the design doc's "still open" / "follow-up" items into "landed": 1. Re-land M3 EXEC reuse (PR #884's content, cherry-picked + rebased) - adapter/redis.go: txnContext.prepareDispatch() split out of commit(); reusableExecTxn; dispatchExecReuse; runTransactionWithDedup + firstExecAttempt; gate at the top of runTransaction. - adapter/redis_exec_dedup_test.go (originally added in PR #884): 5 tests pinning all four reuse outcomes plus the gate-off legacy equivalence. - The cherry-pick required one small adaptation: prepareDispatch() uses Clock().NextFenced() (uint64, error) on current main; the PR #884 version targeted Clock().Next() (uint64). Same downstream semantics; the error return is wired through preparedTxnDispatch. 2. Close M2 open item — FSM other-txn exactness test (kv/fsm_onephase_dedup_test.go) - TestOnePhaseDedup_OtherTxnVersionDoesNotMaskRetry pins exactness at the FSM apply layer: a third-party version at T_other=20 must not satisfy the FSM probe at T1=30, so the retry falls through and applies at the fresh T2=40. The store-layer pin (store/committed_version_at_test.go) already covers the primitive; this test covers the dispatch path that uses it. 3. M3 multi-mop EXEC dedup test (adapter/redis_exec_dedup_test.go) - TestExecDedup_MultiMopLandedPriorAttempt_ReturnsCachedResults extends single-mop dedup to a 3-command MULTI/EXEC body (SET a + SET b + DEL c). Validates the design's claim that the mechanism works the same for multi-mop because cached results + OCC readKeys fence are mop-count-agnostic. Without dedup the DEL would re-execute to 0 on the second pass — the test rejects that. 4. Design doc updates (docs/design/2026_05_21_proposed_txn_secondary_idempotency.md) - §M2 "still open" → "LANDED" with the new FSM test reference. - §M3 "runTransaction (MULTI/EXEC) — Still open" → "LANDED via PR #884" with multi-mop test reference and the two intentional deviations from the M1/M2 template that claude[bot] flagged on #884 (readKeys assembly order, fresh per-attempt reuseCtx). - §M3 "standalone SET/INCR/HSET" called out as the next follow-up (PR-B in the user-visible roadmap). Caller audit (per /loop semantic-change rule) ============================================== - prepareDispatch (newly added, replaces commit()'s body): callers are commit() and firstExecAttempt; both honor the defer prepared.cancel() contract. External behavior of commit() preserved. - commit(): internal structure changed; external behavior preserved (no test directly invokes it; runTransaction's legacy path continues to call it through the same shape). - runTransactionWithDedup / firstExecAttempt / dispatchExecReuse / reusableExecTxn: all new symbols, exercised only from the gated runTransaction path. Validation ========== - go test ./adapter/ -run 'Dedup|Txn|MULTI|EXEC' passes. - go test ./kv/ ./store/ both pass. - gofmt, go vet, golangci-lint run all clean (0 issues across adapter/kv/store). Relation to prior work ====================== - #796 (merged f481f2b): M1 + M2 + M3 RPUSH/LPUSH on main. - #884 (merged cbbde3d into stacked branch, NOT main): M3 EXEC reuse. - THIS PR: brings #884's content into main, plus M2 cross-txn FSM test, multi-mop EXEC test, and design doc updates. - Next (PR-B): standalone SET / INCR / HSET reuse paths. - Next (PR-C): M4 Jepsen validation infra.

chatgpt-codex-connector · 2026-05-30T07:24:08Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

coderabbitai · 2026-05-30T07:24:10Z

Warning

Review limit reached

@bootjp, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 50 minutes and 16 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8a3d17e4-09d8-4334-afee-90ffb45f77bc

📥 Commits

Reviewing files that changed from the base of the PR and between 4da85dc and 69db79b.

📒 Files selected for processing (2)

adapter/redis_exec_dedup_test.go
docs/design/2026_05_21_proposed_txn_secondary_idempotency.md

📝 Walkthrough

Walkthrough

This PR implements one-phase transaction dedup for MULTI/EXEC (runTransaction) and standalone SET commands, refactoring the transaction commit path to enable dedup interception and adding comprehensive test coverage for caching, reuse, conflict detection, and legacy path preservation.

Changes

One-Phase Transaction Dedup for MULTI/EXEC and Standalone SET

Layer / File(s)	Summary
Transaction dispatch refactoring foundation `adapter/redis.go`	`preparedTxnDispatch` type and `prepareDispatch()` method pre-assemble dispatch-ready write sets, commit timestamps, and read keys under a cancelable timeout context. `txnContext.commit()` refactored to use prepared dispatch, enabling dedup logic to intercept between preparation and coordinator dispatch.
MULTI/EXEC dedup implementation `adapter/redis.go`	`reusableExecTxn` captures attempt-1 write set, read set, and cached results. `dispatchExecReuse` dispatches under fresh commitTS with `PrevCommitTS` flag and probes committed versions on conflicts. `runTransactionWithDedup` orchestrates retry loop. `firstExecAttempt` builds snapshot, applies commands, validates reads, and prepares dispatch. `runTransaction` routes to dedup path when `onePhaseTxnDedup` enabled.
Standalone SET dedup integration `adapter/redis.go`	`set()` gates standalone SET through `runTransactionWithDedup` when enabled, with pre-dedup logic extracted to `setLegacy`. `writeRedisStandaloneResult` helper translates single-element dedup results into redcon reply forms.
MULTI/EXEC dedup test coverage `adapter/redis_exec_dedup_test.go`	Six integration tests exercise landed prior attempts with cached reuse, prior attempts that did not land, genuine external conflicts triggering rebuild, self-inflicted reuse conflicts, multi-command transactions with result array reuse, and legacy path preservation when dedup is disabled.
Standalone SET dedup tests with infrastructure `adapter/redis_retry_test.go`, `adapter/redis_set_dedup_test.go`	`recordingConn` enhanced with `wroteNull` field. Four tests validate SET dedup: landed attempts return cached OK without error, NX option returns null-style response without overwrite, GET option returns cached prior value while committing new value, and legacy path when dedup disabled.
FSM exactness validation for foreign versions `kv/fsm_onephase_dedup_test.go`	New test verifies unrelated committed versions at different timestamps do not satisfy exact-timestamp probes during idempotency retry, ensuring foreign transactions do not mask dedup detection.
Design documentation updates `docs/design/2026_05_21_proposed_txn_secondary_idempotency.md`	M2 section records FSM-level "other transaction" exactness test as landed. M3 section marks MULTI/EXEC dedup as landed with cached-result behavior, multi-mop regression tests, and routing updates for standalone SET via `runTransactionWithDedup`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

bootjp/elastickv#871: Modifies txnContext.commit commitTS allocation to use Clock().NextFenced() and propagate errors alongside this PR's commit dispatch refactoring.
bootjp/elastickv#518: Modifies the Redis transaction commit flow (txnContext.commit) to remove TTL elems from Raft txn and flush TTL via ttlBuffer alongside this PR's commit dispatch changes.
bootjp/elastickv#796: Builds on shared onePhaseTxnDedup and PrevCommitTS coordinator/FSM infrastructure; this PR extends dedup to standalone SET and MULTI/EXEC, while the retrieved PR focused on list-push operations.

Poem

🐰 A rabbit hops through transactions with care,
Dedup caches results beyond compare,
MULTI/EXEC and SET now reuse with grace,
When conflicts arise, we probe the right place,
No retries needed—cached answers we share!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: re-landing M3 EXEC reuse, closing M2 gaps, and adding multi-mop test coverage.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/txn-dedup-docs-and-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request implements the Option-2 retry and deduplication path for MULTI/EXEC transactions (runTransactionWithDedup), allowing transaction write sets to be reused under a fresh commit timestamp with PrevCommitTS to probe if a prior attempt already landed. It includes comprehensive unit tests validating various deduplication scenarios, updates the design documentation, and adds FSM-level tests for exactness. The review feedback highlights three key areas for improvement: updating the timestamp allocation in dispatchExecReuse to use NextFenced() to respect HLC-4 safety limits, adding a fast-path check for parent context expiration in the retry loop to prevent resource waste, and adding a nil check on readPin to avoid a potential nil pointer dereference panic.

gemini-code-assist · 2026-05-30T07:27:03Z

+// is the current length" question; the client-visible result IS the
+// cached results array.
+func (r *RedisServer) dispatchExecReuse(ctx context.Context, pending *reusableExecTxn) (results []redisResult, drop bool, err error) {
+	commitTS := r.coordinator.Clock().Next()


The commitTS allocated here uses r.coordinator.Clock().Next(), which bypasses the HLC-4 physical-ceiling fence. According to the HLC-4 safety specification (and the documentation in kv/hlc.go), all persistence-grade allocations (including commitTS for transactions) must go through NextFenced() to ensure that an expired-ceiling allocation fails closed instead of silently issuing a timestamp that could collide with a subsequent leader's window.

Please update this to use NextFenced() and handle the returned error.

commitTS, err := r.coordinator.Clock().NextFenced() if err != nil { return nil, false, errors.Wrap(err, "dispatchExecReuse: allocate commitTS") }

gemini-code-assist · 2026-05-30T07:27:03Z

+		if pending != nil {
+			reuseCtx, reuseCancel := context.WithTimeout(r.handlerContext(), redisDispatchTimeout)


If the parent dispatchCtx has already expired or been cancelled, starting a new reuse dispatch with a fresh 10-second timeout (redisDispatchTimeout) derived from r.handlerContext() will waste resources and delay the error response to the client.

Consider checking dispatchCtx.Err() at the start of the retry block when pending != nil to fail fast if the parent context is already dead.

if pending != nil { if err := dispatchCtx.Err(); err != nil { return err } reuseCtx, reuseCancel := context.WithTimeout(r.handlerContext(), redisDispatchTimeout)

gemini-code-assist · 2026-05-30T07:27:03Z

+	readPin := r.pinReadTS(startTS)
+	defer readPin.Release()


r.pinReadTS(startTS) can return nil if r.readTracker is not initialized (which is common in minimal test setups or certain configurations). Deferring readPin.Release() directly without a nil check will cause a nil pointer dereference panic.

Please add a nil check before deferring the release.

readPin := r.pinReadTS(startTS) if readPin != nil { defer readPin.Release() }

bootjp · 2026-05-30T07:35:15Z

@claude review

PR-A: re-land M3 EXEC reuse + close M2 open + multi-mop test.

Status:

PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884 (M3 EXEC reuse) merged into stacked branch but never reached main; this PR re-lands its content on main with the Clock().NextFenced merge fix.
TestOnePhaseDedup_OtherTxnVersionDoesNotMaskRetry closes M2 open item.
TestExecDedup_MultiMopLandedPriorAttempt_ReturnsCachedResults proves multi-mop dedup works (SET+SET+DEL body).
Design doc §M2 "open" → LANDED, §M3 EXEC → LANDED with deviations doc'd.
gofmt + go vet + golangci-lint run clean.

Please verify the cherry-pick adaptation (NextFenced error wired through preparedTxnDispatch) and the multi-mop test's DEL-result invariance argument.

claude · 2026-05-30T07:35:26Z

Claude finished @bootjp's task in 4m 11s —— View job

PR #887 Review — Re-land M3 EXEC reuse + M2 close + multi-mop test

Tasks:

Read adapter/redis.go changes
Read adapter/redis_exec_dedup_test.go
Read kv/fsm_onephase_dedup_test.go
Read design doc changes
Verify NextFenced error wiring through preparedTxnDispatch
Analyze multi-mop DEL-result invariance argument
Check Gemini review comments (Next vs NextFenced, ctx check, nil readPin)
Post full review

Requested verification items

1. NextFenced error wiring through prepareDispatch() — correct

adapter/redis.go:2685 correctly uses Clock().NextFenced() and both error paths return a no-op cancel stub:

commitTS, err := t.server.coordinator.Clock().NextFenced()
if err != nil {
    return preparedTxnDispatch{cancel: func() {}}, errors.Wrap(err, "redis txn commit: allocate commitTS")
}

The buildStreamDeletionElems error path also calls cancel() before returning, avoiding a context leak. The cherry-pick adaptation is correct.

2. Multi-mop DEL=1 invariance — sound

The argument holds. applyDel calls stagedKeyType(key) → trackTypeReadKeys(key) (which tracks redisStrKey(key), list meta, hash, set, etc.) and stageKeyDeletion(key) → t.load(key) → trackReadKey(storageKey). Both paths register the key in t.readKeys. So the OCC fence covers the deleted key: any concurrent write to "c" between startTS and commitTS triggers WriteConflict → drop pending → recompute from fresh snapshot. The cached DEL=1 result is only returned when the OCC fence passes, meaning "c" was not concurrently modified — at which point either attempt 1 already deleted it (probe hits, no-op) or attempt 1 didn't land and the reuse deletes it at the fresh commitTS, again yielding count 1.

Issues

Bug (high) — `dispatchExecReuse` uses `Clock().Next()` for a persistence-grade `commitTS`

adapter/redis.go:3112:

commitTS := r.coordinator.Clock().Next()

The HLC documentation at kv/hlc.go:107-111 is explicit:

NEW persistence-grade allocations (everything that ends up as a startTS / commitTS, an MVCC write timestamp, or a lease/expiry boundary) MUST go through NextFenced instead — Next bypasses the HLC-4 (iii) ceiling fence and can therefore issue a timestamp inside a stale leader window after lease renewal stops.

This is a real allocation of a commitTS that is passed to coordinator.Dispatch, so it must use NextFenced(). The prepareDispatch() (line 2685) correctly uses NextFenced(); the reuse path should follow the same pattern:

commitTS, err := r.coordinator.Clock().NextFenced()
if err != nil {
    return nil, false, errors.Wrap(err, "dispatchExecReuse: allocate commitTS")
}

Context: The pre-existing dispatchListPushReuse at line 3508 has the same bug — commitTS := r.coordinator.Clock().Next(). The new code mirrors that pattern, but it should mirror the NextFenced() pattern from prepareDispatch() and applyListPush (line 3438) instead. Fix this →

Non-issue — Gemini's nil `readPin` concern

adapter/redis.go:3214-3215:

readPin := r.pinReadTS(startTS)
defer readPin.Release()

pinReadTS can return nil when r.readTracker == nil. However, ActiveTimestampToken.Release() at kv/active_timestamp_tracker.go:57-60 already has:

func (t *ActiveTimestampToken) Release() {
    if t == nil || t.tracker == nil || t.id == 0 {
        return
    }

Calling .Release() on a nil pointer is valid Go and returns immediately. The identical pattern exists in the legacy runTransaction at line 3032-3033 and is working correctly. Gemini's concern is a false positive.

Minor — fresh `reuseCtx` bypasses expired `dispatchCtx`

adapter/redis.go:3172-3175:

reuseCtx, reuseCancel := context.WithTimeout(r.handlerContext(), redisDispatchTimeout)
defer reuseCancel()

The fresh context from handlerContext() means an already-expired dispatchCtx is bypassed. If the outer context has already cancelled (client disconnected), a full 10-second reuse attempt will proceed on their behalf. This is documented as intentional deviation #2 in the design doc, so it's an informed choice, but checking dispatchCtx.Err() first would avoid wasted work for the common case where the client is no longer waiting:

if pending != nil {
    if err := dispatchCtx.Err(); err != nil {
        return err
    }
    reuseCtx, reuseCancel := ...

Not a blocking issue given the documentation.

Design doc & tests

The design doc updates are clean: §M2 and §M3 transitions from "Still open" to "LANDED" are accurate, and the two intentional deviations are correctly documented. The TestOnePhaseDedup_OtherTxnVersionDoesNotMaskRetry test scenario (T_other=20, T1=30, T2=40) correctly exercises exactness at the FSM dispatch layer beyond what the store-layer suite already covers. The 5 exec dedup tests cover all four reuse outcomes plus the gate-off legacy equivalence path.

Summary

The cherry-pick adaptation is correct (NextFenced properly wired in prepareDispatch()), and the multi-mop DEL invariance argument is sound. One real fix needed: dispatchExecReuse:3112 must use NextFenced() not Next() for its commitTS allocation — a persistence-grade timestamp must respect the HLC-4 ceiling fence. The nil-readPin concern from Gemini is a false positive. The fresh-context deviation is documented and acceptable but adding a pre-check on dispatchCtx.Err() is worth considering.
| Branch

gemini HIGH (adapter/redis.go:3112 dispatchExecReuse): Clock().Next() bypasses the HLC-4 physical-ceiling fence; persistence-grade commit_ts allocation must use NextFenced() to avoid issuing a timestamp that collides with a subsequent leader's window after renewal. Switched dispatchExecReuse to NextFenced() with the standard error-propagation shape (matches prepareDispatch from PR #884 cherry-pick). gemini MEDIUM (adapter/redis.go:3175 reuseCtx propagation): The original "fresh ctx from handlerContext per reuse attempt" pattern ignored outer cancellation -- a disconnected client would wait the full 10 s before the reuse returned. Derived reuseCtx from dispatchCtx (the caller's retry-loop ctx) so outer cancellation interrupts mid-attempt. Per-attempt redisDispatchTimeout still caps the dispatch the same way commit does for the first attempt. Matches listPushCoreWithDedup's caller-ctx threading. gemini MEDIUM (adapter/redis.go:3215 readPin.Release on nil): False positive. ActiveTimestampToken.Release is documented nil-safe at kv/active_timestamp_tracker.go:58. The pattern readPin := r.pinReadTS(...); defer readPin.Release() is used unchanged in the existing runTransaction (legacy path); no change needed. Caller audit per /loop semantic-change rule: - dispatchExecReuse signature unchanged; the new error return is the same (bool, error) tuple position. Single caller is runTransactionWithDedup which already returns dispErr via dropping to the retry loop's error path -- the new NextFenced error reaches the same dispErr branch. - reuseCtx parent change is local to runTransactionWithDedup; the only observable effect is faster cancellation propagation, which retryRedisWrite already gates on ctx.Done between attempts. Note on pre-existing Clock().Next() callers: dispatchListPushReuse (adapter/redis.go:3508) and the listPushCoreWithDedup first-attempt site (adapter/redis.go:3679) -- both shipped on main as part of PR #796 -- still use Clock().Next() and have the same HLC-4 ceiling-bypass exposure. Out of scope for this PR (those are pre-existing on main, not introduced by PR-A's diff), but should be fixed in a follow-up cleanup PR for parity. Design doc: M3 "fresh reuseCtx from handlerContext" deviation note is struck and replaced with the dispatchCtx-derived rationale; future readers see why the earlier framing was wrong and what the current pattern is. Validation: go test ./adapter/ -run 'ExecDedup|TxnMULTI' passes. go build ./adapter/... clean. golangci-lint run ./adapter/... 0 issues.

bootjp · 2026-05-30T07:42:56Z

@claude review

Round 1 fixes (commit dd042e9):

gemini HIGH: dispatchExecReuse now uses Clock().NextFenced() (HLC-4 ceiling honored). NextFenced error wired through (nil, false, err) tuple.
gemini MEDIUM (reuseCtx): switched from r.handlerContext() to dispatchCtx as the parent so outer cancellation propagates promptly. Per-attempt redisDispatchTimeout still caps the dispatch.
gemini MEDIUM (readPin.Release): documented as false positive — ActiveTimestampToken.Release is nil-safe at kv/active_timestamp_tracker.go:58.
Design doc §M3 deviation note updated: "fresh reuseCtx from handlerContext" struck and replaced with dispatchCtx-derived rationale.
Note: pre-existing dispatchListPushReuse + listPushCoreWithDedup also use Clock().Next() with the same HLC-4 exposure; flagged for a separate cleanup PR (out of scope for PR-A's diff).

go test ./adapter/ -run 'ExecDedup|TxnMULTI' passes; gofmt/vet/golangci-lint clean.

claude · 2026-05-30T07:43:08Z

Claude finished @bootjp's task in 2m 41s —— View job

PR #887 Round 1 Review — Verifying fixes from `dd042e9`

Tasks:

Load tools
Read adapter/redis.go — verify NextFenced() wiring in dispatchExecReuse
Read adapter/redis.go — verify dispatchCtx as parent for reuseCtx
Check for any new issues introduced by the fixes
Verify design doc update reflects actual code state
Post review

Round 1 fixes verified — all correct

HIGH fix (dispatchExecReuse → NextFenced) ✅

adapter/redis.go:3118-3120 is correct:

commitTS, allocErr := r.coordinator.Clock().NextFenced()
if allocErr != nil {
    return nil, false, errors.Wrap(allocErr, "redis exec reuse: allocate commitTS")
}

Return shape (nil, false, err) is correct: drop=false means the caller does not nil pending (nothing to drop from a timestamp allocation failure); the error is non-retryable (isRetryableRedisTxnErr only matches ErrWriteConflict / ErrTxnLocked, line adapter/redis_retry.go:44), so retryRedisWrite immediately surfaces it to the client. No retry amplification possible.

MEDIUM fix (reuseCtx parent → dispatchCtx) ✅

adapter/redis.go:3194:

reuseCtx, reuseCancel := context.WithTimeout(dispatchCtx, redisDispatchTimeout)

Go's context.WithTimeout takes the minimum of the parent's remaining deadline and the new timeout, so the total budget across all attempts is now capped by the outer dispatchCtx (itself handlerContext() + redisDispatchTimeout). Previously each reuse attempt could independently hold a fresh full 10 s even after dispatchCtx expired — this is corrected. The fix matches listPushCoreWithDedup's threading pattern.

MEDIUM (readPin nil-safety) — false positive, correctly documented ✅

ActiveTimestampToken.Release() at kv/active_timestamp_tracker.go:57-60 has an explicit if t == nil guard. The pattern defer readPin.Release() with a potentially-nil readPin is used identically at adapter/redis.go:3032-3033 in the existing (shipped) runTransaction path. No change required.

Design doc ✅

The strikethrough + "Reverted per PR #887 review" annotation correctly records why the handlerContext framing was dropped. Future readers see the rationale rather than a silent inconsistency with the implementation.

One remaining observation (informational, not blocking)

The pre-existing dispatchListPushReuse (adapter/redis.go:3527) and listPushCoreWithDedup first-attempt site (:3698) still use Clock().Next() — same HLC-4 ceiling-bypass exposure that was just fixed in dispatchExecReuse. These are correctly called out in the commit message as out-of-scope for this PR (they shipped on main in PR #796). A follow-up cleanup PR to bring them to NextFenced() parity is the right call. No action needed here.

Summary

All three issues from the prior review are correctly addressed in dd042e9. The NextFenced error propagation shape is clean, the context threading fix is semantically correct and now matches the list-push dedup pattern, and the nil readPin dismissal is well-documented. LGTM — no further changes needed for Round 1.
| Branch

Stacked on PR-A (re-land of M3 EXEC reuse + multi-mop test). Extends option-2 dedup to standalone SET (dispatched outside MULTI). When `r.onePhaseTxnDedup` is on, `r.set` wraps the single command as a 1-element redcon.Command queue and dispatches via runTransactionWithDedup — reusing the M3 EXEC machinery instead of building a per-handler reusableSetTxn + dispatchSetReuse shape. SET already has `applySet` on `txnContext`, so this is the "free" extension to any command whose txn-state-aware apply hook already exists. The standalone fast path (`trySetFastPath`) is intentionally bypassed under the gate: dedup is opt-in, and a non-dedup'd fast path under a dedup-on cluster would split the idempotency contract. Why standalone INCR / HSET are out of scope for THIS PR: ======================================================== INCR (in adapter/redis_compat_commands.go) and HSET both lack a `txnContext.applyIncr` / `applyHSet` implementation, so the "route through single-mop EXEC" pattern that worked here for SET cannot apply as-is — `runTransactionWithDedup` would reject the command at `txn.apply` ("unsupported in MULTI"). Bringing them in requires implementing `applyIncr` / `applyHSet` first (each ~30–50 LOC for the txn-state-aware read-compute-write shape), then the standalone handler routing is a one-liner via this same path. Tracked as separate follow-up PRs; until then, INCR and HSET keep today's buggy-under-churn behaviour, which is the design doc's stated default ("everything else keeps today's behaviour until its hook is added"). Caller audit (per /loop semantic-change rule): - `r.set` (handler): gate-on takes a new code path; gate-off preserved verbatim via `setLegacy`. No external callers exist (it is wired only into the redcon dispatch table). - `setLegacy` (new): byte-identical extraction of the pre-PR set() body. No external callers. - `writeRedisStandaloneResult` (new): translates a single-element `[]redisResult` from `runTransactionWithDedup` into the bare redcon reply shape (no WriteArray wrapper). Single caller in this PR; future SET-pattern callers will reuse it. Validation: - adapter/redis_set_dedup_test.go (new): TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK, TestStandaloneSetDedup_DisabledKeepsLegacyPath. Existing TestRedis_SET / dedup suites still pass. - gofmt, go vet, golangci-lint run all clean (0 issues). Design doc updated to mark standalone SET as LANDED and call out INCR/HSET as the next follow-up with the precise reason they cannot land alongside SET in this PR.

claude[bot] 🔶 (redundant double gate, adapter/redis.go:1135): set's gate-on branch called r.runTransaction which immediately re-checked the same r.onePhaseTxnDedup gate and routed to runTransactionWithDedup. The indirection was misleading -- the PR description said "dispatches via runTransactionWithDedup" but that was true only by indirection. Replaced with a direct call to runTransactionWithDedup; intent is now explicit and the double gate check is removed. claude[bot] 🔶 (test gap, adapter/redis_set_dedup_test.go): Only the resultString OK path was tested through the dedup route. Added two regression tests covering the other applySet result types that go through writeRedisStandaloneResult: - TestStandaloneSetDedup_NXMissReturnsNil pins resultNil routing: SET k v NX on an existing key returns nil; the cached attempt-1 result must round-trip through writeRedisStandaloneResult -> WriteNull, leaving conn.bulk == nil. - TestStandaloneSetDedup_GETOptionReturnsOldBulk pins resultBulk routing: SET k v GET on an existing key returns the prior value as a bulk reply; conn.bulk must be the prior bytes. Both fire the ambiguous-attempt-1-lands path (newDedupTestCoordinator ambiguousLands=true) so the result MUST come from the cached attempt-1 array, not a re-execution. claude[bot] minor observation (shallow-array constraint in writeRedisStandaloneResult): Documented in the function's doc comment that the resultArray arm flattens via WriteBulkString and is NOT safe to reuse for callers whose applyXxx emits nested arrays. Future HGETALL-pattern callers must either pre-flatten their result or extend the switch. Caller audit per /loop semantic-change rule: - set's gate-on branch is the only call site changed; the new direct call to runTransactionWithDedup uses the same []redcon.Command{cmd} shape and the same []redisResult return type. runTransactionWithDedup is exported within the package and has no other callers outside the legacy runTransaction path (which is unchanged). - writeRedisStandaloneResult signature unchanged. Documentation expanded but no behavior change. - New tests are pure additions; no existing tests modified. Validation: go test ./adapter/ -run StandaloneSetDedup passes (5 tests now). go build ./adapter/... clean. golangci-lint run ./adapter/... 0 issues.

claude[bot] (a) attribution prefix per CLAUDE.md convention: Stripped the inline "// claude[bot] PR #888 review:" prefixes from the two comments added in round 1. The substantive "why" content (double-gate rationale; shallow-array constraint) is kept as that is what future readers need. CLAUDE.md convention is to leave fix attribution in the commit message, not the source. claude[bot] (b) wroteNull bool field on recordingConn: Added wroteNull witness to recordingConn (in redis_retry_test.go, the shared test helper). Hardened TestStandaloneSetDedup_NXMissReturnsNil to require.True(conn.wroteNull, ...) so a wrong branch that wrote nothing at all (would also satisfy conn.bulk == nil) is now caught. Caller audit per /loop semantic-change rule: - recordingConn is in *_test.go and used by tests only. Adding wroteNull bool default false is strictly additive; no existing test reads it. The only WriteNull semantics change is setting the new field; conn.bulk = nil behavior is preserved. - Verified by grep: only redis_set_dedup_test.go references wroteNull. All other recordingConn uses (8 instantiations across redis_retry_test.go + redis_lua_linearizable_read_test.go) are unaffected. Validation: go test ./adapter/ -run StandaloneSetDedup passes. go test ./adapter/ -run ExecDedup passes. gofmt + go vet + golangci-lint clean.

…_test claude[bot] PR #888 round-3 minor cleanup: same CLAUDE.md convention violation that was cleaned from redis.go in round 2 was left in the test file. Removed two references on lines 56 and 78. Substantive rationale (NX semantics, zero-value aliasing problem) kept; the attribution moves to the commit message where it belongs. No functional or test behaviour change. golangci-lint clean.

) ## Summary Stacked on PR-A (#887). Extends option-2 dedup to **standalone SET** (dispatched outside MULTI). When `r.onePhaseTxnDedup` is on, `r.set` wraps the single command as a 1-element `redcon.Command` queue and dispatches via `runTransactionWithDedup` — reusing the M3 EXEC machinery instead of building a per-handler `reusableSetTxn` + `dispatchSetReuse` shape. SET already has `applySet` on `txnContext`, so this is the "free" extension to any command whose txn-state-aware apply hook already exists. The standalone fast path (`trySetFastPath`) is intentionally bypassed under the gate: dedup is opt-in, and a non-dedup'd fast path under a dedup-on cluster would split the idempotency contract. Base: `feat/txn-dedup-docs-and-tests` (PR #887). ## Why standalone INCR / HSET are out of scope for THIS PR INCR (in `adapter/redis_compat_commands.go`) and HSET both lack a `txnContext.applyIncr` / `applyHSet` implementation, so the "route through single-mop EXEC" pattern that worked here for SET cannot apply as-is — `runTransactionWithDedup` would reject the command at `txn.apply` ("unsupported in MULTI"). Bringing them in requires implementing `applyIncr` / `applyHSet` first (each ~30–50 LOC for the txn-state-aware read-compute-write shape), then the standalone handler routing is a one-liner via this same path. Tracked as separate follow-up PRs; until then, INCR and HSET keep today's buggy-under-churn behaviour — the design doc's stated default ("everything else keeps today's behaviour until its hook is added"). ## Caller audit (per /loop semantic-change rule) - `r.set` (handler): gate-on takes a new code path; gate-off preserved verbatim via `setLegacy`. No external callers exist (it is wired only into the redcon dispatch table). - `setLegacy` (new): byte-identical extraction of the pre-PR `set()` body. No external callers. - `writeRedisStandaloneResult` (new): translates a single-element `[]redisResult` from `runTransactionWithDedup` into the bare redcon reply shape (no `WriteArray` wrapper). Single caller in this PR; future SET-pattern callers will reuse it. ## Tests - `adapter/redis_set_dedup_test.go` (new): - `TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK`: gate-on + ambiguous attempt-1 land + reuse → probe hits → cached "OK" returned, dispatches=2, probeNoOps=1. - `TestStandaloneSetDedup_DisabledKeepsLegacyPath`: gate-off path runs no probe. - Existing `TestRedis_SET*` and dedup suites still pass. ## Validation - `go test ./adapter/ -run 'Dedup|Txn|MULTI|EXEC|TestRedis_SET'` passes - `gofmt`, `go vet`, `golangci-lint run` all clean (0 issues) ## Design doc `docs/design/2026_05_21_proposed_txn_secondary_idempotency.md` §M3 — standalone SET marked LANDED, standalone INCR/HSET called out as follow-up with the precise reason they cannot land alongside SET in this PR.

bootjp · 2026-05-30T08:20:04Z

@claude review

Head moved to 4da85dc after PR #888 (standalone SET dedup) was merged into this branch. The new content is the LGTM'd PR #888 work — adapter/redis.go set() routing + setLegacy + writeRedisStandaloneResult + redis_set_dedup_test.go.

The PR #887 round-1 changes (NextFenced fix in dispatchExecReuse + reuseCtx parent switch + design doc deviation note revert) are unchanged; both findings were marked LGTM previously.

Please confirm the combined diff against main is mergeable. No new code was authored — just the PR-B work folded in.

claude · 2026-05-30T08:20:19Z

Claude finished @bootjp's task in 5m 50s —— View job

PR #887 Round 2 Review — Combined diff (PR #887 + PR #888 folded in)

Load tools
Get combined diff against origin/main
Read new PR feat(txn-dedup): standalone SET routes through option-2 dedup loop #888 content: set() routing, setLegacy, writeRedisStandaloneResult
Read adapter/redis_set_dedup_test.go
Verify correctness and HLC compliance of the standalone SET dedup path
Confirm all prior LGTM'd changes (NextFenced, reuseCtx) are intact
Post review

Prior LGTM'd changes — intact

Round-1 fixes confirmed unchanged at their final positions:

dispatchExecReuse uses commitTS, allocErr := r.coordinator.Clock().NextFenced() ✓
reuseCtx, reuseCancel := context.WithTimeout(dispatchCtx, redisDispatchTimeout) (outer dispatchCtx, not r.handlerContext()) ✓
Design doc deviation note 2 reflects the fix with "~~fresh ctx from handlerContext~~ — Reverted per PR feat(txn-dedup): re-land M3 EXEC reuse on main + close M2 open + multi-mop #887 review" ✓

PR #888 content — `set()` routing + `setLegacy` + `writeRedisStandaloneResult`

set() routing logic — correct.

The gate at r.onePhaseTxnDedup routes to runTransactionWithDedup([]redcon.Command{cmd}), then writeRedisStandaloneResult. The gate-off branch falls through to setLegacy. Argument safety is guaranteed before set() is reached: redis_command_specs.go registers SET with Arity: -3 and validateCmd rejects commands with fewer than 3 args — no unsafe cmd.Args[1] access possible when set() is invoked.

setLegacy — byte-identical to the pre-PR set() body. Extraction is clean; the comment accurately describes the intent.

writeRedisStandaloneResult — correct. All result types handled:

resultNil → WriteNull() ✓
resultError → writeRedisError(conn, res.err) ✓
resultBulk → WriteBulk(res.bulk) ✓
resultString → WriteString(res.str) ✓
resultArray → WriteArray(len) + WriteBulkString per element ✓ (flat-string-array constraint documented)
resultInt → WriteInt64(res.integer) ✓
default → WriteNull() ✓

The guarded len(results) != 1 → WriteNull is the right safety valve — a standalone handler wrapping a single-mop EXEC result should always have exactly 1 element, and any misuse gets a null rather than a malformed reply.

redis_set_dedup_test.go — all four tests sound:

TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK: attempt 1 lands → pending stashed → reuse dispatch → maybeProbe fires → probeNoOps++ → cached resultString("OK") returned. WriteString("OK") → conn.bulk = []byte("OK"). Dispatch and probe counts (2 / 1) correctly describe the two-attempt flow. ✓
TestStandaloneSetDedup_NXMissReturnsNil: NX miss → applySet returns resultNil without staging any write → prepared.elems empty → firstExecAttempt returns early with nextResults=[resultNil], nil, nil (no reuse, no dispatch retry) → writeRedisStandaloneResult calls WriteNull() → wroteNull=true, bulk=nil. The wroteNull witness field (newly added to recordingConn) is the airtight check that WriteNull was actually called and not silently skipped. ✓
TestStandaloneSetDedup_GETOptionReturnsOldBulk: existing key "prior" → applySet with GET option stages the write and captures resultBulk("prior") → dispatch 1 lands then errors → reuse probe hits → cached resultBulk("prior") returned → WriteBulk([]byte("prior")) → conn.bulk = []byte("prior"). New value "v1" in store. ✓
TestStandaloneSetDedup_DisabledKeepsLegacyPath: onePhaseTxnDedup=false → setLegacy path → coord.probeNoOps == 0. ✓

recordingConn change in redis_retry_test.go — minimal and correct. Adding wroteNull bool + setting it in WriteNull() is a clean, non-breaking extension needed by the NX miss test.

Design doc — accurate. M2 open → LANDED with test reference; §M3 EXEC → LANDED with runTransactionWithDedup description; standalone SET → LANDED with TestStandaloneSetDedup_* reference; standalone INCR/HSET correctly noted as still-open follow-up. The deviation-note-2 update correctly reflects the Round-1 reuseCtx fix.

No new issues

The combined diff introduces no new HLC violations, no context threading issues, no nil-safety gaps. The self-inflicted-conflict guard in dispatchExecReuse (probe at the fresh commitTS after a WriteConflict) mirrors dispatchListPushReuse's established pattern and is correctly scoped to that function's locally-issued commitTS.

LGTM — combined diff is mergeable.
| Branch

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

adapter/redis.go (1)
2781-2809: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the outer EXEC context here.

prepareDispatch() starts a fresh timeout from handlerContext(), so the commit phase can run for another full redisDispatchTimeout after apply/validate have already consumed most of the caller's dispatchCtx budget. That also drops prompt caller cancellation during stream-deletion scans and Dispatch. Use t.ctx when it's set, and only fall back to handlerContext() for minimal test fixtures that leave txnContext.ctx nil.
Suggested change
-	ctx, cancel := context.WithTimeout(t.server.handlerContext(), redisDispatchTimeout)
+	parentCtx := t.ctx
+	if parentCtx == nil {
+		parentCtx = t.server.handlerContext()
+	}
+	ctx, cancel := context.WithTimeout(parentCtx, redisDispatchTimeout)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@adapter/redis.go` around lines 2781 - 2809, The code in prepareDispatch (the
block creating ctx, cancel via context.WithTimeout(t.server.handlerContext(),
redisDispatchTimeout)) should preserve an outer EXEC/dispatch context by using
t.ctx when present instead of always starting from t.server.handlerContext();
change the parent passed to context.WithTimeout to be t.ctx if t.ctx != nil,
falling back to t.server.handlerContext() only when t.ctx is nil (ensure cancel
is still returned and used as before); this keeps stream-deletion scans
(buildStreamDeletionElems) and the final Dispatch bounded by
redisDispatchTimeout while honoring caller cancellation.

🧹 Nitpick comments (1)

adapter/redis_set_dedup_test.go (1)

26-138: ⚡ Quick win

Prefer a table-driven structure for the SET dedup variants.

These four tests duplicate the same harness (store/coordinator/server/conn/invoke/assert baseline). Converting them into table-driven cases will make option growth (NX, XX, GET, gate on/off) easier to maintain and less error-prone.

♻️ Suggested refactor sketch

+func TestStandaloneSetDedup(t *testing.T) {
+	t.Parallel()
+	tests := []struct {
+		name string
+		dedup bool
+		args [][]byte
+		seed func(ctx context.Context, st store.MVCCStore)
+		assert func(t *testing.T, conn *recordingConn, st store.MVCCStore, coord *dedupTestCoordinator)
+	}{
+		// landed prior attempt -> OK
+		// NX miss -> WriteNull
+		// GET -> old bulk
+		// gate off -> legacy path (no probe)
+	}
+	for _, tc := range tests {
+		tc := tc
+		t.Run(tc.name, func(t *testing.T) {
+			t.Parallel()
+			ctx := context.Background()
+			st := store.NewMVCCStore()
+			if tc.seed != nil { tc.seed(ctx, st) }
+			coord := newDedupTestCoordinator(st, 1, true)
+			srv := &RedisServer{store: st, coordinator: coord, scriptCache: map[string]string{}, onePhaseTxnDedup: tc.dedup}
+			conn := &recordingConn{}
+			srv.set(conn, redcon.Command{Args: tc.args})
+			tc.assert(t, conn, st, coord)
+		})
+	}
+}

As per coding guidelines, **/*_test.go: "prefer table-driven test patterns/cases."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@adapter/redis_set_dedup_test.go` around lines 26 - 138, These four nearly
identical tests (TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK,
TestStandaloneSetDedup_NXMissReturnsNil,
TestStandaloneSetDedup_GETOptionReturnsOldBulk,
TestStandaloneSetDedup_DisabledKeepsLegacyPath) should be consolidated into a
table-driven test that iterates cases containing the command Args (e.g.,
[]byte{cmdSet, "k", "v1", optional flags}), any pre-seed actions (PutAt),
expected recordingConn results (bulk string vs nil, wroteNull bool, err empty),
and coordinator/gate expectations (probeNoOps, dispatches); in the loop
instantiate the common harness (store.NewMVCCStore, newDedupTestCoordinator,
RedisServer{...}, recordingConn), call srv.set(conn, cmd) and assert the
per-case expectations, so move shared setup into the single loop and keep
case-specific assertions in each case entry to remove the duplicated code paths.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@adapter/redis_exec_dedup_test.go`:
- Around line 116-119: The test currently asserts require.GreaterOrEqual(t,
coord.dispatches, 3) which is too permissive; change it to require.Equal(t, 3,
coord.dispatches) so the test enforces the exact expected retry sequence
(pre-reject, OCC conflict reuse, fresh-snapshot retry success), and keep the
existing require.Equal(t, 0, coord.probeNoOps, ...) check unchanged; locate the
assertions in adapter/redis_exec_dedup_test.go referencing coord.dispatches and
coord.probeNoOps and replace the GreaterOrEqual assertion with an exact equality
assertion.

In `@docs/design/2026_05_21_proposed_txn_secondary_idempotency.md`:
- Around line 481-482: Update the PR reference text for the runTransaction note
so it accurately reflects the landing PR: change the “LANDED via PR `#884`”
mention to reference PR `#887` (or both with `#887` listed as the landing PR) in the
documentation line that reads about runTransaction (MULTI/EXEC) — LANDED via PR
`#884`; ensure the adjacent description of runTransactionWithDedup mirroring
listPushCoreWithDedup remains unchanged.

---

Outside diff comments:
In `@adapter/redis.go`:
- Around line 2781-2809: The code in prepareDispatch (the block creating ctx,
cancel via context.WithTimeout(t.server.handlerContext(), redisDispatchTimeout))
should preserve an outer EXEC/dispatch context by using t.ctx when present
instead of always starting from t.server.handlerContext(); change the parent
passed to context.WithTimeout to be t.ctx if t.ctx != nil, falling back to
t.server.handlerContext() only when t.ctx is nil (ensure cancel is still
returned and used as before); this keeps stream-deletion scans
(buildStreamDeletionElems) and the final Dispatch bounded by
redisDispatchTimeout while honoring caller cancellation.

---

Nitpick comments:
In `@adapter/redis_set_dedup_test.go`:
- Around line 26-138: These four nearly identical tests
(TestStandaloneSetDedup_LandedPriorAttempt_ReturnsOK,
TestStandaloneSetDedup_NXMissReturnsNil,
TestStandaloneSetDedup_GETOptionReturnsOldBulk,
TestStandaloneSetDedup_DisabledKeepsLegacyPath) should be consolidated into a
table-driven test that iterates cases containing the command Args (e.g.,
[]byte{cmdSet, "k", "v1", optional flags}), any pre-seed actions (PutAt),
expected recordingConn results (bulk string vs nil, wroteNull bool, err empty),
and coordinator/gate expectations (probeNoOps, dispatches); in the loop
instantiate the common harness (store.NewMVCCStore, newDedupTestCoordinator,
RedisServer{...}, recordingConn), call srv.set(conn, cmd) and assert the
per-case expectations, so move shared setup into the single loop and keep
case-specific assertions in each case entry to remove the duplicated code paths.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c5daab0e-1591-4731-8bab-73a72fc03247

📥 Commits

Reviewing files that changed from the base of the PR and between 76e9151 and 4da85dc.

📒 Files selected for processing (6)

adapter/redis.go
adapter/redis_exec_dedup_test.go
adapter/redis_retry_test.go
adapter/redis_set_dedup_test.go
docs/design/2026_05_21_proposed_txn_secondary_idempotency.md
kv/fsm_onephase_dedup_test.go

coderabbitai (1) test (adapter/redis_exec_dedup_test.go:119): TestExecDedup_GenuineConflictRebuildsAndApplies asserted GreaterOrEqual(coord.dispatches, 3) but with the single injected pre-reject + single concurrent SET the retry topology is fully deterministic: attempt 1 (pre-reject), reuse (OCC-conflict), fresh retry (success). Tightened to Equal(3) so a regression that adds an extra dispatch is caught. coderabbitai (2) doc (docs/design/...:481): "LANDED via PR #884" was ambiguous after the re-land on main via PR #887. Updated to "LANDED via PR #887 (originally PR #884, re-landed against main)" so future readers see the canonical landing PR with the lineage preserved. No Go callers touched. go test ./adapter/ -run ExecDedup passes.

bootjp · 2026-05-30T08:25:26Z

@claude review

Round 2 fixes for coderabbitai 🟡 minors (commit 69db79b):

adapter/redis_exec_dedup_test.go:119: Tightened GreaterOrEqual(coord.dispatches, 3) → Equal(3). With the single injected pre-reject + single concurrent SET, the retry topology is fully deterministic (attempt 1 pre-reject → reuse OCC-conflict → fresh retry success). Any future regression that adds an extra dispatch is now caught.
docs/.../idempotency.md:481: Updated "LANDED via PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884" → "LANDED via PR feat(txn-dedup): re-land M3 EXEC reuse on main + close M2 open + multi-mop #887 (originally PR feat(txn-dedup): M3 EXEC reuse — option-2 dedup for MULTI/EXEC #884, re-landed against main)" so the lineage is preserved and the canonical landing PR is visible.

go test ./adapter/ -run ExecDedup passes. No Go callers touched.

claude · 2026-05-30T08:25:40Z

Claude finished @bootjp's task in 2m 22s —— View job

PR #887 Round 2 Review — Verifying coderabbitai minor fixes (commit `69db79b`)

Tasks:

Load tools
Read adapter/redis_exec_dedup_test.go — verify Equal(3) assertion
Read design doc — verify PR reference update
Check for any remaining open coderabbitai comments not yet addressed
Post final review

Round 2 fixes verified — both correct

Fix 1 — GreaterOrEqual → Equal(3) in TestExecDedup_GenuineConflictRebuildsAndApplies ✅

adapter/redis_exec_dedup_test.go:122:

require.Equal(t, 3, coord.dispatches)

The comment is expanded with a clear explanation of the deterministic retry topology. No GreaterOrEqual on dispatch counts remains anywhere in either dedup test file (redis_exec_dedup_test.go, redis_set_dedup_test.go).

Fix 2 — PR reference in design doc ✅

docs/design/2026_05_21_proposed_txn_secondary_idempotency.md:481:

LANDED via PR #887 (originally PR #884, re-landed against main)

The lineage is preserved (#884 still credited) and the canonical landing PR (#887) is now unambiguous. Only one occurrence of the LANDED note for runTransaction; correct.

One remaining open item (informational, not blocking this PR)

The coderabbitai "outside diff range" comment on prepareDispatch() (line 2786) — using handlerContext() as parent context instead of t.ctx — was not included in the round 2 targets, and it doesn't block merge. Summary for the record:

prepareDispatch() creates ctx from r.server.handlerContext(), which ignores the already-bounded t.ctx (= dispatchCtx). This means stream-deletion scans + Dispatch in the prepare phase can outlive a cancelled dispatchCtx by up to a full redisDispatchTimeout. The fix is symmetric with the reuseCtx correction from round 1:

parentCtx := t.ctx
if parentCtx == nil {
    parentCtx = t.server.handlerContext()
}
ctx, cancel := context.WithTimeout(parentCtx, redisDispatchTimeout)

The nil guard is needed because the test harness constructs txnContext without setting ctx. Recommend tracking this in a follow-up PR (same batch as the dispatchListPushReuse / listPushCoreWithDedup Next() → NextFenced() cleanup).

The table-driven refactor suggestion for redis_set_dedup_test.go (coderabbitai nitpick) is a style preference and not blocking.

Summary

Both round 2 fixes are correctly implemented. Combined diff against main is clean. LGTM — ready to merge.
| Branch

…t to fix CI flake (#890) ## Summary Fixes the CI flake observed in [Test workflow run 26678774684](https://github.com/bootjp/elastickv/actions/runs/26678774684): ``` --- FAIL: TestSQSServer_Throttle_SetQueueAttributesInvalidatesBucket (1.79s) sqs_throttle_integration_test.go:170: expected throttle, got 200 ``` ## Root cause Identical race to the one fixed for `TestSQSServer_Throttle_NoOpSetQueueAttributesPreservesBucket` in commit [54c6cd5](54c6cd56) (PR #819 follow-up): the 1-token-per-second refill rate races the test's own wall clock under `-race` on slow CI runners. For **this** test: 1. `mustSetQueueAttributes(Capacity=10, Refill=1)` 2. `for range 10 { send }` — drains the bucket 3. Sanity send — expects HTTP 400 (throttle) Each send goes through Raft propose+apply at ~100-250ms under `-race`. The 11 writes from steps 1-3 elapse ~1.1-2.75s. At Refill=1/sec the bucket has accumulated ≥1 token by step 3, so the sanity send returns HTTP 200 instead of 400 — **falsely** indicating a bucket-invalidation regression that does not exist. ## Fix Drop the initial Refill from `"1"` to `"0.01"` (1 token per 100 seconds) so no test-window wall-clock can accumulate to a whole token. The test's intent — *verify that a Capacity/Refill **raise** invalidates the cached bucket on the very next request* — is independent of the **initial** refill rate. The post-set assertion at line 182 is exercised against the fresh `Capacity=20/Refill=20` bucket, which is what the test actually claims to pin. ## Why only this test, not the sibling `TestSQSServer_Throttle_DeleteQueueInvalidatesBucket` has the same `Refill=1` initial config but **no post-drain sanity assertion** — it just drains without status checks, then verifies fresh capacity post-recreate. No race window there. ## Relation to other open PRs Unrelated to PRs #887/#888/#889 (option-2 dedup work). The flake surfaced on PR #889's CI run but the fix lives in pre-existing SQS test code that none of the dedup PRs touch. ## Caller audit (per /loop semantic-change rule) Test-only change. The throttle config validator (`sqs_catalog.go:163`) accepts fractional `float64 SendRefillPerSecond`; `0.01` is non-zero so `IsEmpty` (line 172) returns `false` and throttling stays enabled — the test still exercises the throttle path. Matches the prior fix's caller-audit conclusion verbatim. ## Validation - `go test ./adapter/ -run TestSQSServer_Throttle -race -count=3 -timeout 120s` passes (5.3s wall, all three iterations green) - `gofmt`, `go vet`, `golangci-lint run` all clean  ## Summary by CodeRabbit * **Tests** * Updated test configuration and expanded documentation to improve test stability on slow CI environments.  [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/bootjp/elastickv/pull/890?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

claude[bot] PR #901 minor #2 (CLAUDE.md convention): inline comments should state the underlying reason, not reference issue/PR numbers that rot as the codebase evolves. Dropped "PR #887" / "PR #887 round 1" / "PR #887 verdict" / "PR #887 review" attributions from the three doc blocks. The WHY content is kept (ctx-parent symmetry, HLC-4 fence rationale, ErrCeilingExpired non-retryable shape). The PR description is the durable home for the flake-fix lineage and the related PR refs. No behavior change. Comment-only.

## Summary Picks up the two follow-up items noted in claude[bot]'s PR #887 verdict but deferred from that PR: ### (a) `prepareDispatch()` ctx-parent symmetric fix `prepareDispatch()` took its bounded ctx from `t.server.handlerContext()` — the server-lifetime ctx, **not** the caller's `dispatchCtx`. The `reuseCtx` fix on PR #887 round 1 corrected `runTransactionWithDedup` to derive from `dispatchCtx`; this PR applies the symmetric fix to `prepareDispatch` so a disconnected client / `retryRedisWrite` timeout interrupts the prepare+dispatch promptly. Nil-guard falls back to `handlerContext` for test fixtures that construct a `txnContext` without setting `ctx`. ### (b) Last two `Clock().Next()` callers in adapter/redis.go → `NextFenced()` `dispatchListPushReuse` and the `listPushCoreWithDedup` first-attempt site both shipped before the HLC-4 physical-ceiling migration and were missed in that pass. `dispatchExecReuse` already uses `NextFenced` (PR #887 round 1) and so do all first-attempt `commit_ts` allocations through `prepareDispatch`. These two list-push dedup sites were the **last** persistence-grade `Clock().Next()` callers in `adapter/redis.go` — verified by `grep '\.Clock()\.Next\b' adapter/redis.go` now returning only `NextFenced` variants. Without HLC-4 parity here, a stale-leader window could let the list-push dedup path mint a `commit_ts` that collides with the successor's freshly-fenced range — defeating the same anomaly class option-2 was introduced to prevent. ## Caller audit (per /loop semantic-change rule) - `prepareDispatch`: sole callers are `commit()` and `firstExecAttempt` (both in adapter/redis.go). Ctx-parent change is additive; same defer-cancel discipline, same error mapping. - `dispatchListPushReuse`: sole caller `listPushCoreWithDedup`. NextFenced error wired through existing `(newLen, drop, err)` tuple; non-retryable so `retryRedisWrite` surfaces it directly — matches PR #887 round-1's `dispatchExecReuse` NextFenced wiring. - `listPushCoreWithDedup` first-attempt: surfaces NextFenced error through existing `retryRedisWrite` closure return. ## Validation - `go test ./adapter/ -run 'ListPushDedup|ExecDedup|StandaloneSetDedup' -race -count=2` → ok 1.2s - `gofmt` + `go vet` + `golangci-lint run` → 0 issues ## Scope Closes the two follow-up items explicitly noted in PR #887's round-2 verdict ("Note on pre-existing Clock().Next() callers" + `prepareDispatch` ctx deviation). No new functionality.

gemini-code-assist Bot reviewed May 30, 2026

View reviewed changes

bootjp mentioned this pull request May 30, 2026

feat(txn-dedup): standalone SET routes through option-2 dedup loop #888

Merged

bootjp added 5 commits May 30, 2026 16:44

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread adapter/redis_exec_dedup_test.go Outdated

Comment thread docs/design/2026_05_21_proposed_txn_secondary_idempotency.md Outdated

bootjp merged commit 3fc9786 into main May 30, 2026
9 checks passed

bootjp deleted the feat/txn-dedup-docs-and-tests branch May 30, 2026 08:55

bootjp mentioned this pull request May 30, 2026

test(sqs): slow throttle refill on SetQueueAttributesInvalidatesBucket to fix CI flake #890

Merged

bootjp mentioned this pull request Jun 1, 2026

fix(redis): finish option-2 dedup HLC-4 / ctx-parent parity #901

Merged

		if pending != nil {
		reuseCtx, reuseCancel := context.WithTimeout(r.handlerContext(), redisDispatchTimeout)

Conversation

bootjp commented May 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

1. Re-land M3 EXEC reuse (PR #884's content)

2. Close M2 open item — FSM other-txn exactness test

3. M3 multi-mop EXEC dedup test

4. Design doc updates

Caller audit (per /loop semantic-change rule)

Validation

Relation to prior work

Summary by CodeRabbit

Release Notes

Uh oh!

chatgpt-codex-connector Bot commented May 30, 2026

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented May 30, 2026

Uh oh!

claude Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #887 Review — Re-land M3 EXEC reuse + M2 close + multi-mop test

Requested verification items

Issues

Bug (high) — dispatchExecReuse uses Clock().Next() for a persistence-grade commitTS

Non-issue — Gemini's nil readPin concern

Minor — fresh reuseCtx bypasses expired dispatchCtx

Design doc & tests

Summary

Uh oh!

bootjp commented May 30, 2026

Uh oh!

claude Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #887 Round 1 Review — Verifying fixes from dd042e9

Round 1 fixes verified — all correct

One remaining observation (informational, not blocking)

Summary

Uh oh!

bootjp commented May 30, 2026

Uh oh!

claude Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #887 Round 2 Review — Combined diff (PR #887 + PR #888 folded in)

Prior LGTM'd changes — intact

PR #888 content — set() routing + setLegacy + writeRedisStandaloneResult

No new issues

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

bootjp commented May 30, 2026

Uh oh!

claude Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #887 Round 2 Review — Verifying coderabbitai minor fixes (commit 69db79b)

bootjp commented May 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 30, 2026 •

edited

Loading

claude Bot commented May 30, 2026 •

edited

Loading

Bug (high) — `dispatchExecReuse` uses `Clock().Next()` for a persistence-grade `commitTS`

Non-issue — Gemini's nil `readPin` concern

Minor — fresh `reuseCtx` bypasses expired `dispatchCtx`

claude Bot commented May 30, 2026 •

edited

Loading

PR #887 Round 1 Review — Verifying fixes from `dd042e9`

claude Bot commented May 30, 2026 •

edited

Loading

PR #888 content — `set()` routing + `setLegacy` + `writeRedisStandaloneResult`

claude Bot commented May 30, 2026 •

edited

Loading

PR #887 Round 2 Review — Verifying coderabbitai minor fixes (commit `69db79b`)