Skip to content

feat(redis): embed TTL inline in string values, remove TTL buffer#521

Merged
bootjp merged 14 commits intomainfrom
claude/sad-hermann-0a3a28
Apr 17, 2026
Merged

feat(redis): embed TTL inline in string values, remove TTL buffer#521
bootjp merged 14 commits intomainfrom
claude/sad-hermann-0a3a28

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 17, 2026

Summary

  • Implements docs/design/ttl-inline-value.md (Phases 0–3): TTL is now embedded directly inside Redis string values rather than stored as a separate !redis|ttl|<key> MVCC key.
  • Removes the TTL buffer (redis_ttl_buffer.go) and its flusher goroutine entirely — they were a workaround for write conflicts that this design makes unnecessary.
  • All existing tests pass; Jepsen Redis, DynamoDB, and S3 workloads report {:valid? true}.

Problem

The previous design stored TTL in a separate key, causing three structural issues:

  1. Read amplification — every read issued two store lookups per key (data + TTL).
  2. Broken MVCC snapshot semantics — data and TTL could be committed in different Raft entries, so a point-in-time read might see a data version from one commit and a TTL from another.
  3. TTL write conflicts — concurrent EXPIRE commands on the same key conflicted in the MVCC layer, requiring the in-memory buffer workaround.

Solution

String values now use a new binary encoding:

[0xFF 0x01][flags(1)][expireAtMs(8, optional)][user bytes]

The 0xFF 0x01 magic prefix distinguishes new-format values from legacy raw bytes, allowing zero-cost backward compatibility (legacy values decode as-is with no TTL).

Key changes:

File Change
redis_compat_types.go encodeRedisStr / decodeRedisStr / isNewRedisStrFormat; ttlAt reads embedded TTL for new-format strings
redis_compat_helpers.go readRedisStringAt decodes header; saveString encodes with TTL
redis.go replaceWithStringTxn, buildKeyElems, txnContext.commit, applyExpire updated; TTL buffer fields/goroutine removed
redis_compat_commands.go setExpire performs atomic read-modify-write on string anchor; INCR encodes with encodeRedisStr
redis_lua_context.go stringCommitElems encodes TTL in value; nonStringTTLElems handles non-string keys; flushTTLForKeyToBuffer removed
redis_ttl_buffer.go Deleted
redis_ttl_buffer_test.go Deleted

The !redis|ttl|<key> key is still written alongside the value as a scan index for TTL expiry sweeps, but it is no longer the authoritative source — the embedded value is.

Test plan

  • go test ./adapter/... -timeout 120s -count=1 — all pass
  • Jepsen unit tests (lein test) — 11 tests, 0 failures
  • Jepsen Redis workload — {:valid? true}
  • Jepsen DynamoDB workload — {:valid? true}
  • Jepsen S3 workload — {:valid? true}
  • golangci-lint run — 0 issues

Summary by CodeRabbit

Release Notes

  • New Features

    • TTL operations are now atomic with data writes for improved consistency
    • INCR command now preserves existing TTL on values
  • Bug Fixes

    • Improved TTL handling efficiency by eliminating separate buffer mechanism
  • Refactor

    • Simplified TTL persistence model through transactional data updates
  • Documentation

    • Added design documentation for new TTL handling architecture

bootjp added 2 commits April 17, 2026 01:52
Architecture proposal to move TTL from a separate !redis|ttl|<key> MVCC
key to an inline field in each type's anchor value.

Problems with the current design:
- Read amplification: every key-existence check issues two store reads
- Broken MVCC snapshot semantics: data and TTL committed in separate
  Raft entries can yield mismatched versions at the same readTS
- TTL write conflicts (the TTL buffer is a workaround, not a root fix)

Proposed approach:
- Embed expireAt in the anchor value for all types (string header flags,
  ListMeta/HashMeta/SetMeta/ZSetMeta struct field)
- !redis|ttl|<key> retained only as a background expiry scan index
- Four-phase migration plan for rolling upgrades without downtime
- TTL buffer and background flush goroutine removed in Phase 3
Implements the design from docs/design/ttl-inline-value.md (Phases 0–3).

Previously, TTL was stored as a separate !redis|ttl|<key> MVCC key,
which caused three structural problems:
  1. Read amplification — every read issued two store lookups per key.
  2. Broken MVCC snapshot semantics — data and TTL could come from
     different commits, making point-in-time reads inconsistent.
  3. TTL write conflicts — concurrent EXPIRE commands could conflict,
     motivating the TTL buffer workaround.

Changes:
- New binary encoding for Redis string values:
    [0xFF 0x01][flags(1)][expireAtMs(8, optional)][user bytes]
  The magic prefix allows backward compatibility: legacy raw bytes are
  decoded as-is (no TTL); new-format values carry the TTL embedded
  directly in the value.
- encodeRedisStr / decodeRedisStr / isNewRedisStrFormat added to
  redis_compat_types.go; ttlAt updated to read the embedded TTL for
  new-format strings and fall back to !redis|ttl| for legacy data and
  non-string types.
- readRedisStringAt decodes the header transparently; saveString,
  replaceWithStringTxn, and setExpire write data + TTL index in the
  same Raft entry (IsTxn=true for string keys, false for others).
- txnContext.buildKeyElems encodes TTL into the string value and emits
  the scan-index key in the same batch; buildTTLElems handles
  non-string keys. commit includes both sets atomically.
- luaScriptContext.stringCommitElems encodes TTL in the value;
  nonStringTTLElems handles non-string keys. commit includes both
  in a single Raft transaction.
- TTL buffer (redis_ttl_buffer.go, redis_ttl_buffer_test.go,
  runTTLFlusher goroutine, cancelFlush/flushWG) removed entirely.
- Tests updated to decode the new binary header before asserting user
  values read directly from the store.
- applyExpire and setExpire refactored into smaller helpers to satisfy
  cyclop/gocognit lint limits.

All unit tests pass; Jepsen Redis, DynamoDB, and S3 workloads all
report {:valid? true}.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 11 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 24 minutes and 11 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9c369be1-f674-4476-8e82-439e46c3a7ed

📥 Commits

Reviewing files that changed from the base of the PR and between e65648d and c3738fa.

📒 Files selected for processing (7)
  • adapter/redis.go
  • adapter/redis_compat_commands.go
  • adapter/redis_compat_helpers.go
  • adapter/redis_compat_types.go
  • adapter/redis_lua_context.go
  • adapter/redis_ttl_compat_test.go
  • docs/design/ttl-inline-value.md
📝 Walkthrough

Walkthrough

This PR removes the Redis TTL buffer—an in-memory write buffer with periodic flushing—and refactors TTL handling to embed TTL directly in string values via encoding and in collection metadata via fields, while committing TTL changes in the same Raft transaction as data updates.

Changes

Cohort / File(s) Summary
TTL Buffer Removal
adapter/redis.go, adapter/redis_ttl_buffer.go, adapter/redis_ttl_buffer_test.go
Deleted entire TTL buffer infrastructure including TTLBuffer type, flusher goroutine, and WithTTLFlushInterval option; removed ttlBuffer and ttlFlushInterval fields from RedisServer.
String TTL Encoding
adapter/redis_compat_types.go, adapter/redis_compat_helpers.go
Added encodeRedisStr/decodeRedisStr for embedding TTL in string values; modified readRedisStringAt to return (value, *time.Time, error) and saveString to encode TTL inline with the value.
Command Refactoring
adapter/redis_compat_commands.go
Updated INCR to preserve existing TTL instead of clearing it; refactored EXPIRE into doSetExpire, expireDeleteKey, and dispatchStringExpire helpers; updated GETDEL to unpack new readRedisStringAt signature.
Transaction & Lua Handling
adapter/redis_lua_context.go
Refactored transaction commit to combine data and TTL operations in a single Dispatch(IsTxn=true) call; added nonStringTTLElems helper and updated stringCommitElems to embed TTL in encoded values.
Test Updates
adapter/redis_lua_context_test.go, adapter/redis_ttl_compat_test.go, adapter/redis_retry_test.go, adapter/redis_txn_test.go
Removed flushTTLForKeyToBuffer tests; renamed and inverted INCR TTL test expectation (clear→preserve); updated value decoding in retry and transaction tests to work with encoded string format.
Design Documentation
docs/design/ttl-inline-value.md
New document specifying TTL embedding in string values and collection metadata, migration strategy, and removal of secondary TTL reads from the hot path.

Sequence Diagram(s)

The changes satisfy the conditions for a sequence diagram: they introduce a significant architectural modification to TTL handling with multiple component interactions (Client, Commands, Transaction Layer, MVCC/KV Store, Raft), multiple sequential operations (encode/commit/flush flow), and altered control flow. However, the core architectural change is more structural than sequential—the primary effect is moving TTL from a separate flushing mechanism to embedded encoding and inline transaction commits, which is better understood through code structure than interaction sequence. Thus, a sequence diagram would not meaningfully clarify the changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 Hops and encodes with glee,
No more buffers in the tree,
TTL nestled in the string,
One transaction—everything!
Swift as carrots, clean and lean. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(redis): embed TTL inline in string values, remove TTL buffer' directly and clearly summarizes the main change: implementing TTL embedding in string values while removing the separate TTL buffer mechanism. It is concise, specific, and accurately reflects the primary objective of the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/sad-hermann-0a3a28

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request embeds TTL metadata directly into Redis string values and collection anchor keys, eliminating the TTLBuffer and background flusher to reduce read amplification and ensure MVCC snapshot consistency. String values now use a versioned header for optional expiration times. Feedback was provided to refactor repeated type-checking logic into a shared helper function for better maintainability.

Comment thread adapter/redis_lua_context.go Outdated
keyPlans = append(keyPlans, plan)
// For non-string keys with dirty TTL: include !redis|ttl| in the same txn.
// String keys already have TTL embedded in the value via stringCommitElems.
if plan.finalType != redisTypeNone && plan.finalType != redisTypeString {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The condition plan.finalType != redisTypeNone && plan.finalType != redisTypeString is repeated. To improve maintainability and avoid code duplication, consider extracting this logic into a shared helper function like isNonStringCollectionType(typ redisValueType).

References
  1. Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.

bootjp added 2 commits April 17, 2026 22:53
- Keep TTL inline encoding; drop redis_ttl_buffer.go/test additions from main
- Restore compactor field (added independently in main, unrelated to TTL buffer)
- Remove redis_lua_context_test.go added in main that tested flushTTLForKeyToBuffer
  (the function no longer exists; TTL is now embedded in string values)
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a significant architectural change by embedding TTL metadata directly into Redis string values and collection metadata, as detailed in the new design document. By co-locating TTL with data, the change eliminates the need for the TTLBuffer and its background flusher, reduces read amplification by removing extra lookups for the !redis|ttl| key, and restores MVCC snapshot consistency. The !redis|ttl| key is now maintained solely as a secondary scan index for background expiration. Feedback was provided regarding potential race conditions in the dispatchStringExpire read-modify-write operation, which requires careful leadership verification and locking.

Comment thread adapter/redis_compat_commands.go Outdated
Comment on lines +396 to +400
elems := []*kv.Elem[kv.OP]{
{Op: kv.Put, Key: redisStrKey(key), Value: encoded},
{Op: kv.Put, Key: redisTTLKey(key), Value: encodeRedisTTL(expireAt)},
}
return r.dispatchElems(ctx, true, readTS, elems)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The dispatchStringExpire function performs a read-modify-write operation. To prevent race conditions, ensure that leadership status is verified after acquiring the lock that protects the operation. Additionally, do not release the lock between reading the state variable and performing the conditional check to prevent the use of stale data.

References
  1. To prevent race conditions in read-modify-write operations that require leadership, verify leadership status after acquiring the lock that protects the operation.
  2. When performing atomic operations based on a version or state variable, do not release the lock between reading the variable and performing the conditional check to prevent race conditions and the use of stale data.

bootjp added 2 commits April 17, 2026 23:03
Re-apply deletion of adapter/redis_ttl_buffer.go that main modified.
Auto-merged redis.go and redis_txn_test.go (new ttlStates-initialized
OCC tests from main are preserved).
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a significant architectural change by embedding TTL metadata directly into Redis string values, as detailed in the new design documentation. This change eliminates the need for the TTLBuffer and its background flusher, simplifying the codebase and improving MVCC snapshot consistency. Key updates include a new versioned encoding for string values and modifications to transaction logic to handle atomic data and TTL updates. Feedback focuses on ensuring data integrity during the migration phase by using fallback-aware helpers and maintaining consistent concurrency control for non-string expiration.

Comment thread adapter/redis_compat_commands.go Outdated
Comment on lines +394 to +404
raw, readErr := r.store.GetAt(ctx, redisStrKey(key), readTS)
if readErr != nil && !cockerrors.Is(readErr, store.ErrKeyNotFound) {
return cockerrors.WithStack(readErr)
}
var userValue []byte
if readErr == nil {
userValue, _, readErr = decodeRedisStr(raw)
if readErr != nil {
return readErr
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current implementation of dispatchStringExpire performs a direct GetAt on the prefixed key (redisStrKey). If the key is a legacy bare string key (Phase 0/1 migration), this read will return ErrKeyNotFound, causing userValue to remain nil. The subsequent encodeRedisStr will then create an encoded value with an empty user payload, which is written to the prefixed key. This effectively shadows the legacy bare key with an empty value, leading to data loss for the reader.

You should use the fallback-aware readRedisStringAt helper to ensure the current value is correctly retrieved regardless of its storage format.

Suggested change
raw, readErr := r.store.GetAt(ctx, redisStrKey(key), readTS)
if readErr != nil && !cockerrors.Is(readErr, store.ErrKeyNotFound) {
return cockerrors.WithStack(readErr)
}
var userValue []byte
if readErr == nil {
userValue, _, readErr = decodeRedisStr(raw)
if readErr != nil {
return readErr
}
}
userValue, readErr := r.readRedisStringAt(key, readTS)
if readErr != nil && !cockerrors.Is(readErr, store.ErrKeyNotFound) {
return cockerrors.WithStack(readErr)
}
encoded := encodeRedisStr(userValue, &expireAt)

Comment thread adapter/redis_compat_commands.go Outdated
Comment on lines +369 to +370
elems := []*kv.Elem[kv.OP]{{Op: kv.Put, Key: redisTTLKey(key), Value: encodeRedisTTL(expireAt)}}
return 1, r.dispatchElems(ctx, false, 0, elems)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For non-string types, EXPIRE is now dispatched with IsTxn=false and startTS=0. While this matches the 'last-writer-wins' behavior of the removed TTLBuffer, it bypasses the Optimistic Concurrency Control (OCC) provided by the retryRedisWrite loop. If the key is deleted or its type changes between the prepareExpire/rawKeyTypeAt checks and the Dispatch call, a dangling !redis|ttl| key will be created. Although the background scanner can handle this, it would be more consistent with the new design to use IsTxn=true with readTS to ensure the key still exists and is of the correct type at commit time.

Suggested change
elems := []*kv.Elem[kv.OP]{{Op: kv.Put, Key: redisTTLKey(key), Value: encodeRedisTTL(expireAt)}}
return 1, r.dispatchElems(ctx, false, 0, elems)
elems := []*kv.Elem[kv.OP]{{Op: kv.Put, Key: redisTTLKey(key), Value: encodeRedisTTL(expireAt)}}
return 1, r.dispatchElems(ctx, true, readTS, elems)

- dispatchStringExpire now uses readRedisStringAt to dual-read prefixed
  and legacy bare string keys, avoiding data loss during migration.
- Non-string EXPIRE now dispatches with IsTxn=true, StartTS=readTS so
  OCC catches type changes or deletions between the type check and
  commit, preventing dangling !redis|ttl| entries.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements inline TTL encoding for Redis string values, embedding the expiration timestamp directly within the data value to reduce read amplification and restore MVCC snapshot consistency. The changes include new encoding/decoding logic, updates to transaction and Lua script contexts, and the removal of the TTLBuffer mechanism. Feedback highlights several critical issues: the implementation incorrectly clears existing TTLs during INCR and other string modifications, and the removal of the TTLBuffer is considered premature since collection types have not yet been migrated to the inline format and still rely on separate TTL keys prone to write conflicts. Additionally, it is suggested that internal read functions return the decoded TTL to further optimize performance.

Comment thread adapter/redis.go Outdated
Comment on lines +2321 to +2325
var ttl *time.Time
if ttlSt := t.ttlStates[string(userKey)]; ttlSt != nil && ttlSt.dirty {
ttl = ttlSt.value
}
value = encodeRedisStr(tv.raw, ttl)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This logic clears the TTL for any string modification where the TTL state is not explicitly marked dirty (e.g., via EXPIRE or SET EX). In Redis, commands like INCR (when executed via MULTI/EXEC) or SETBIT must preserve the existing TTL. Since txnContext.load currently discards the TTL when reading from the store, and txnValue does not store it, any update to the value via txnContext will result in TTL loss. To fix this, txnValue should be updated to store the TTL found during load, and buildKeyElems should use that stored TTL as a fallback when ttlSt is not dirty.

Comment thread adapter/redis.go
Comment on lines 354 to 366
r := &RedisServer{
listen: listen,
store: store,
coordinator: coordinate,
redisTranscoder: newRedisTranscoder(),
redisAddr: redisAddr,
relay: relay,
leaderRedis: leaderRedis,
leaderClients: make(map[string]*redis.Client),
pubsub: newRedisPubSub(),
scriptCache: map[string]string{},
traceCommands: os.Getenv("ELASTICKV_REDIS_TRACE") == "1",
ttlBuffer: newTTLBuffer(),
ttlFlushInterval: defaultTTLFlushInterval,
listen: listen,
store: store,
coordinator: coordinate,
redisTranscoder: newRedisTranscoder(),
redisAddr: redisAddr,
relay: relay,
leaderRedis: leaderRedis,
leaderClients: make(map[string]*redis.Client),
pubsub: newRedisPubSub(),
scriptCache: map[string]string{},
traceCommands: os.Getenv("ELASTICKV_REDIS_TRACE") == "1",
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Removing the TTLBuffer and its flusher goroutine is premature because the inline TTL design has not yet been implemented for collection types (Hash, Set, List, ZSet). As these types still use the separate !redis|ttl| key, concurrent EXPIRE operations on the same collection key will once again produce write conflicts in the MVCC layer. The TTLBuffer should be retained until all types have been migrated to the inline format as described in the design document.

if typ == redisTypeString {
return 1, r.dispatchStringExpire(ctx, key, readTS, expireAt)
}
elems := []*kv.Elem[kv.OP]{{Op: kv.Put, Key: redisTTLKey(key), Value: encodeRedisTTL(expireAt)}}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

For non-string types, the implementation still writes to the separate !redis|ttl| key. This contradicts the design document (§3.1, §3.3) which states that ListMeta, HashMeta, etc., should carry the expireAt field. More importantly, it means that write conflicts on the TTL key are not resolved for collection types, which makes the removal of the TTLBuffer premature for these types.

Comment thread adapter/redis_compat_commands.go Outdated
Comment on lines +1976 to +1977
// INCR clears any TTL (PERSIST semantics); encode value with no TTL.
encoded := encodeRedisStr([]byte(strconv.FormatInt(current, 10)), nil)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

INCR must preserve the existing TTL of the key. The current implementation incorrectly clears it by passing nil to encodeRedisStr (as noted in the comment which incorrectly identifies this as PERSIST semantics).

		// INCR must preserve the existing TTL.
		ttl, err := r.ttlAt(ctx, cmd.Args[1], readTS)
		if err != nil {
			return err
		}
		encoded := encodeRedisStr([]byte(strconv.FormatInt(current, 10)), ttl)

Comment thread adapter/redis_compat_helpers.go Outdated
v, err := r.readValueAt(redisStrKey(key), readTS)
if err == nil {
return v, nil
userValue, _, decErr := decodeRedisStr(v)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

readRedisStringAt should return the decoded TTL along with the user value. This would allow callers (like incr or txnContext.load) to preserve the existing TTL during value updates without issuing an additional ttlAt store lookup, fulfilling the design goal of reducing read amplification.

Critical correctness fixes from Gemini PR review:

- readRedisStringAt now also returns the decoded TTL, eliminating the
  extra ttlAt lookup callers would otherwise need.
- txnValue caches the loaded TTL; buildKeyElems falls back to it when
  no EXPIRE/PERSIST ran in the transaction, so INCR/SETBIT/etc. inside
  MULTI/EXEC no longer silently clear TTL.
- INCR now preserves the existing TTL (Redis semantics); the test that
  asserted PERSIST semantics was wrong and is updated.

For non-string EXPIRE, OCC (IsTxn=true, StartTS=readTS) + retryRedisWrite
already gives correct last-writer-wins under concurrent EXPIRE, so the
TTLBuffer stays removed; inline TTL for collection metas remains a
follow-up per docs/design/ttl-inline-value.md.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
adapter/redis.go (1)

2318-2339: ⚠️ Potential issue | 🟠 Major

MULTI/EXEC leaves orphaned TTL indexes for strings.

This path never deletes !redis|ttl|<key> for strings: not when a string is deleted, and not when a dirty string is re-encoded with ttl == nil. Because buildTTLElems skips string keys, those orphaned scan-index entries survive the transaction and can later expire a persistent or recreated value incorrectly.

Suggested fix
 		if tv.deleted {
 			elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: storageKey})
+			if bytes.HasPrefix(storageKey, []byte(redisStrPrefix)) {
+				userKey := storageKey[len(redisStrPrefix):]
+				elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey(userKey)})
+			}
 			continue
 		}
 		value := tv.raw
 		// For string keys: embed TTL in the encoded value.
 		if bytes.HasPrefix(storageKey, []byte(redisStrPrefix)) {
@@
 			value = encodeRedisStr(tv.raw, ttl)
 			// Write !redis|ttl| scan index if TTL is set.
 			if ttl != nil {
 				elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: redisTTLKey(userKey), Value: encodeRedisTTL(*ttl)})
+			} else {
+				elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey(userKey)})
 			}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapter/redis.go` around lines 2318 - 2339, The code path for string keys
never removes the "!redis|ttl|<key>" index: when tv.deleted is true you must
append a kv.Del for redisTTLKey(userKey) before continuing, and when re-encoding
a string with ttl == nil (i.e. previously had a TTL but now cleared) you must
append a kv.Del for redisTTLKey(userKey) instead of leaving the old index;
locate the string handling block around storageKey/redisStrPrefix/userKey/tv.ttl
and add elems = append(... &kv.Elem{Op: kv.Del, Key: redisTTLKey(userKey)}) both
in the tv.deleted branch and in the branch where ttl==nil (while preserving the
existing kv.Put when ttl != nil and using encodeRedisTTL as before).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@adapter/redis_compat_commands.go`:
- Around line 393-404: dispatchStringExpire currently treats
store.ErrKeyNotFound as non-fatal and proceeds to write an empty-string anchor
and TTL; change it to treat ErrKeyNotFound as a no-op by returning early without
dispatching any kv.Elem so the key is not resurrected (update
dispatchStringExpire to detect cockerrors.Is(readErr, store.ErrKeyNotFound) and
return a “not applied” signal instead of encoding/putting), and plumb that “not
applied” result back to the caller path (doSetExpire) so the final reply becomes
0 when expire was not applied rather than unconditionally 1.

In `@adapter/redis_compat_helpers.go`:
- Around line 392-401: In saveString, when ttl is nil the code currently
overwrites the embedded value but leaves an existing TTL scan index (!redis|ttl|
via redisTTLKey) in place; change saveString (and its use of redisStrKey,
redisTTLKey, encodeRedisTTL, dispatchElems) to, when ttl==nil, append a kv.Elem
with Op: kv.Delete (or equivalent) for redisTTLKey(key) so the old TTL index is
removed; ensure dispatchElems is called with this delete element alongside the
Put for redisStrKey to atomically remove the stale TTL entry.

In `@adapter/redis_lua_context.go`:
- Around line 2562-2572: The Lua-string path currently only writes the TTL index
when finalTTL(...) returns non-nil, but it must also remove any existing TTL
index when ttl == nil to avoid stale !redis|ttl| entries; update the block in
adapter/redis_lua_context.go (around finalTTL, encodeRedisStr, redisStrKey,
redisTTLKey usage) so that when ttl == nil you append a kv.Elem with Op:
kv.Delete for redisTTLKey([]byte(key)) (instead of doing nothing), while
preserving the existing Put of the encoded value via encodeRedisStr and the Put
of the TTL when ttl != nil.

In `@adapter/redis_ttl_compat_test.go`:
- Around line 97-101: Read and store the key TTL before calling INCR (e.g.,
ttlBefore := rdb.TTL(ctx, "incr:key").Result()), then after INCR use the
existing ttlAfter check and add an assertion comparing ttlAfter to ttlBefore to
ensure the expiry was preserved (for example assert ttlAfter > 0 and ttlAfter <=
ttlBefore or assert ttlAfter is within a small delta of ttlBefore); reference
the rdb.TTL calls and the ttlAfter/ttlBefore variables in your change.

In `@adapter/redis.go`:
- Around line 742-748: replaceWithStringTxn currently only writes a TTL index
when ttl != nil, but when ttl == nil it must remove any existing
!redis|ttl|<key> entry to avoid leaving stale expiry markers; in the
replaceWithStringTxn block (where encoded := encodeRedisStr(...) and elems is
appended) add an elems append that deletes the TTL index when ttl == nil (use
the same redisTTLKey(key) and the kv delete operation consistent with other
code), while keeping the existing elems append for ttl != nil that writes
encodeRedisTTL(*ttl); then call r.dispatchElems as before.

In `@docs/design/ttl-inline-value.md`:
- Around line 316-320: The Phase 2 table row contains the literal key string
`!redis|ttl|` which Markdown treats as cell separators; update that cell to
escape the pipe characters (e.g., `!redis\|ttl\|`) or reword/move the key (e.g.,
put it in backticks outside the cell or describe it as "legacy key
`!redis|ttl|`") so the three-column table remains valid and the MD056 warning is
resolved.

---

Outside diff comments:
In `@adapter/redis.go`:
- Around line 2318-2339: The code path for string keys never removes the
"!redis|ttl|<key>" index: when tv.deleted is true you must append a kv.Del for
redisTTLKey(userKey) before continuing, and when re-encoding a string with ttl
== nil (i.e. previously had a TTL but now cleared) you must append a kv.Del for
redisTTLKey(userKey) instead of leaving the old index; locate the string
handling block around storageKey/redisStrPrefix/userKey/tv.ttl and add elems =
append(... &kv.Elem{Op: kv.Del, Key: redisTTLKey(userKey)}) both in the
tv.deleted branch and in the branch where ttl==nil (while preserving the
existing kv.Put when ttl != nil and using encodeRedisTTL as before).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f29bb93d-ef51-40f8-bb4f-7b0afbdf4739

📥 Commits

Reviewing files that changed from the base of the PR and between 5be1b5c and e65648d.

📒 Files selected for processing (12)
  • adapter/redis.go
  • adapter/redis_compat_commands.go
  • adapter/redis_compat_helpers.go
  • adapter/redis_compat_types.go
  • adapter/redis_lua_context.go
  • adapter/redis_lua_context_test.go
  • adapter/redis_retry_test.go
  • adapter/redis_ttl_buffer.go
  • adapter/redis_ttl_buffer_test.go
  • adapter/redis_ttl_compat_test.go
  • adapter/redis_txn_test.go
  • docs/design/ttl-inline-value.md
💤 Files with no reviewable changes (3)
  • adapter/redis_lua_context_test.go
  • adapter/redis_ttl_buffer_test.go
  • adapter/redis_ttl_buffer.go

Comment thread adapter/redis_compat_commands.go Outdated
Comment thread adapter/redis_compat_helpers.go
Comment thread adapter/redis_lua_context.go
Comment thread adapter/redis_ttl_compat_test.go
Comment thread adapter/redis.go
Comment thread docs/design/ttl-inline-value.md Outdated
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

CodeRabbit review follow-ups:

- All string write paths (saveString, replaceWithStringTxn,
  txnContext.buildKeyElems PERSIST, lua stringCommitElems) now emit a
  Del on redisTTLKey when the new TTL is nil, so the background sweeper
  cannot expire a key that was just made persistent.
- dispatchStringExpire treats ErrKeyNotFound as a no-op and reports back
  whether the EXPIRE was applied; doSetExpire returns 0 in that race
  instead of resurrecting the key with an empty anchor.
- INCR TTL-preservation test now asserts ttlAfter <= ttlBefore so a
  silent reset would be caught.
- Escape pipes in docs/design table to restore 3-column layout.
- Extract stringValueAndTTLElem helper to keep buildKeyElems under the
  cyclop limit.
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a significant architectural change by embedding TTL metadata directly within Redis string values, moving away from a separate TTL key and the asynchronous TTLBuffer. This change aims to reduce read amplification and restore MVCC snapshot consistency. Key updates include a new versioned string encoding format, updated transaction logic to handle embedded TTLs, and the removal of the background TTL flusher. Feedback focuses on ensuring TTL preservation for legacy keys during the migration window, optimizing the read path to avoid redundant store lookups, and properly cleaning up the secondary TTL index when an expiration is removed.

I am having trouble creating individual review comments. Click here to see my feedback.

adapter/redis_compat_helpers.go (385-389)

high

The readRedisStringAt function returns a nil TTL for legacy (unprefixed) string keys. This causes commands that preserve TTL (such as INCR or string updates within a MULTI/EXEC block) to unintentionally strip the expiration from legacy keys during their first update after migration, as these commands rely on the returned TTL to re-encode the value. The function should explicitly fetch the TTL from the secondary index for legacy keys to ensure it is preserved during the transition to the new format.

	legacy, err := r.readValueAt(key, readTS)
	if err != nil {
		return nil, nil, err
	}
	ttl, err := r.ttlAt(context.Background(), key, readTS)
	if err != nil {
		return nil, nil, err
	}
	return legacy, ttl, nil

adapter/redis_compat_helpers.go (373-375)

medium

This implementation introduces redundant read operations for string values. readRedisStringAt calls readValueAt, which invokes hasExpiredTTLAt -> ttlAt. For new-format strings, ttlAt performs a GetAt on the prefixed key to extract the embedded TTL, and then readValueAt performs another GetAt on the same key to retrieve the value. This double lookup on the hot path negates some of the performance benefits of embedding the TTL. Consider refactoring the read path to retrieve both the value and TTL in a single store operation.

adapter/redis.go (2335-2337)

medium

When a string key's TTL is removed (e.g., via PERSIST or a plain SET), the secondary !redis|ttl| index is not deleted. This results in stale entries in the TTL index. While the embedded TTL is the authoritative source for reads, these leaked index entries increase storage overhead and cause unnecessary work for the background expiration scanner. The logic should be updated to emit a Del operation for the TTL key when the TTL is cleared, especially if a TTL was previously loaded.

			if ttl != nil {
				elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: redisTTLKey(userKey), Value: encodeRedisTTL(*ttl)})
			} else if tv.ttl != nil {
				elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey(userKey)})
			}

Gemini PR review follow-ups:

- readRedisStringAt now fetches the TTL from the !redis|ttl| index when
  falling back to a legacy bare-key read, so INCR / MULTI-EXEC writes
  on unmigrated keys don't strip the TTL on first rewrite.
- stringValueAndTTLElem emits a Del on redisTTLKey whenever the effective
  TTL is nil and there was any prior TTL (loaded or dirty-cleared), not
  only the dirty-PERSIST case.

The remaining double-read suggestion (ttlAt + readValueAt both touching
the prefixed key) is left as a follow-up — the fix needs a new
GetWithTTLAt helper on the store layer.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a new storage format for Redis strings that embeds TTL metadata directly within the data value, addressing issues with read amplification and MVCC snapshot consistency. By including the expiry timestamp in the anchor key's value using a versioned header, the implementation eliminates the need for a separate TTL key lookup on the hot read path and allows data and expiry to be committed in a single Raft transaction. Consequently, the TTLBuffer and its background flusher have been removed. The changes include updated encoding/decoding logic, support for legacy data fallbacks, and adjustments to commands like INCR to preserve existing TTLs. Feedback was provided regarding an opportunity to further optimize the legacy fallback path in readRedisStringAt to reduce store read amplification.

Comment on lines +373 to 396
func (r *RedisServer) readRedisStringAt(key []byte, readTS uint64) ([]byte, *time.Time, error) {
v, err := r.readValueAt(redisStrKey(key), readTS)
if err == nil {
return v, nil
userValue, ttl, decErr := decodeRedisStr(v)
if decErr != nil {
return nil, nil, decErr
}
return userValue, ttl, nil
}
if !errors.Is(err, store.ErrKeyNotFound) {
return nil, err
return nil, nil, err
}
legacy, err := r.readValueAt(key, readTS)
if err != nil {
return nil, nil, err
}
return r.readValueAt(key, readTS)
// Legacy bare-key data still stores its TTL in the !redis|ttl| index;
// fetch it so callers that re-encode (INCR, MULTI/EXEC writes) preserve it.
legacyTTL, err := r.ttlAt(context.Background(), key, readTS)
if err != nil {
return nil, nil, err
}
return legacy, legacyTTL, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of the legacy fallback path in this function introduces significant read amplification. For a legacy key, this function and its callees (readValueAt, ttlAt) may perform up to 6 store reads, including redundant lookups for !redis|str| and !redis|ttl|. Refactor this to check for the TTL once and then read the bare key value, reducing store reads to 3. When refactoring, ensure that the timestamp used for the MVCC read in readValueAt is acquired after the linearizable read fence has completed to guarantee a consistent snapshot view.

References
  1. Linearizable reads in MVCC systems must acquire the read timestamp after the fence to prevent stale data, particularly for operations like readValueAt.

Gemini PR review: the legacy fallback path repeatedly re-entered
readValueAt → hasExpiredTTLAt → ttlAt, which could issue up to 6 GetAt
calls per string read (two redundant checks each on !redis|str| and
!redis|ttl|).

Refactor to call a new leaderAwareGetAt helper directly and perform
expiration checks locally on the TTL we already decoded:

- new-format path: 1 GetAt on !redis|str| (value + embedded TTL).
- legacy path: 1 GetAt on !redis|ttl|, then 1 GetAt on the bare key.

readTS is still acquired via snapshotTS before this function runs, so
the linearizable-read fence remains intact.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes the Redis adapter’s TTL persistence model to embed TTL metadata directly inside Redis string values (new binary header format), and removes the prior in-memory TTL buffer + flusher workaround. This is intended to make TTL updates atomic with string value writes and reduce inconsistencies caused by separate TTL-key commits.

Changes:

  • Introduces a versioned binary encoding for Redis string values with optional embedded expireAtMs, plus encode/decode helpers.
  • Updates Redis command, Lua, and MULTI/EXEC transaction paths to read/modify/write embedded TTL for strings and to write !redis|ttl| only as a scan index.
  • Removes the TTL buffer implementation and its unit tests; updates existing tests to decode new-format string payloads.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
docs/design/ttl-inline-value.md New design doc describing TTL-in-value approach and migration phases
adapter/redis_compat_types.go Adds new string encoding/decoding helpers; updates ttlAt to read embedded TTL
adapter/redis_compat_helpers.go Updates string read/save helpers to decode/encode embedded TTL and manage scan-index key
adapter/redis_compat_commands.go Refactors EXPIRE to RMW for strings; updates INCR to preserve TTL
adapter/redis_lua_context.go Updates Lua commit to encode embedded TTL for strings and include scan-index TTL elems in-txn
adapter/redis.go Removes TTL buffer lifecycle; updates txn commit to embed TTL for strings and emit scan-index mutations
adapter/redis_ttl_compat_test.go Updates TTL behavior tests (notably INCR semantics)
adapter/redis_txn_test.go Adjusts assertions to decode stored string value format
adapter/redis_retry_test.go Adjusts assertions to decode stored string value format
adapter/redis_ttl_buffer.go Deleted (TTL buffer removed)
adapter/redis_ttl_buffer_test.go Deleted (buffer tests removed)
adapter/redis_lua_context_test.go Deleted (buffer-related Lua tests removed)

Comment on lines 282 to 298
func (r *RedisServer) ttlAt(ctx context.Context, userKey []byte, readTS uint64) (*time.Time, error) {
// The buffer holds the most recent TTL write regardless of Raft flush state.
if expireAt, found := r.ttlBuffer.Get(userKey); found {
return expireAt, nil
// For string keys with new encoding: TTL is embedded in the string value.
// Trust only the embedded TTL; do not fall back to !redis|ttl| for new-format strings.
raw, err := r.store.GetAt(ctx, redisStrKey(userKey), readTS)
if err == nil {
if isNewRedisStrFormat(raw) {
_, expireAt, decErr := decodeRedisStr(raw)
return expireAt, decErr
}
// Legacy string format: fall through to !redis|ttl| key.
} else if !errors.Is(err, store.ErrKeyNotFound) {
return nil, errors.WithStack(err)
}
raw, err := r.store.GetAt(ctx, redisTTLKey(userKey), readTS)

// Fall back to !redis|ttl| key (non-string types and legacy string data).
raw, err = r.store.GetAt(ctx, redisTTLKey(userKey), readTS)
if err != nil {
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ttlAt now always does a GetAt(!redis|str|) before consulting !redis|ttl|. For non-string keys this adds an extra store read (the !redis|str| lookup will almost always miss), which is a regression vs the previous single read of !redis|ttl|. Consider passing the known type into ttlAt/hasExpiredTTLAt (call sites already compute rawKeyTypeAt), or splitting into string/non-string TTL lookups so non-strings don’t pay the extra GetAt.

Copilot uses AI. Check for mistakes.
Comment thread adapter/redis.go Outdated
Comment on lines +1006 to +1014
// For string keys with new encoding: check embedded TTL.
if raw, err := r.tryLeaderGetAt(redisStrKey(key), 0); err == nil {
if isNewRedisStrFormat(raw) {
_, expireAt, _ := decodeRedisStr(raw)
if expireAt == nil {
return false
}
return !expireAt.After(time.Now())
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isLeaderKeyExpired ignores decodeRedisStr errors for new-format strings (the error return is discarded). If a value is malformed or if a legacy payload accidentally matches the magic prefix, this will silently treat the key as non-expired and skip the TTL-key fallback. Handle the decode error explicitly (e.g., fall back to !redis|ttl| or treat it as expired/not-found) so existence checks don’t hinge on silent parse failures.

Copilot uses AI. Check for mistakes.
Comment thread docs/design/ttl-inline-value.md Outdated
Comment on lines +130 to +156
### 3.3 Collection metadata encoding

`ListMeta`, `HashMeta`, `SetMeta`, `ZSetMeta` are fixed-width binary structs.
Each gains an `ExpireAt uint64` field (0 = no TTL):

```go
// Example: ListMeta before
type ListMeta struct {
Head int64
Len int64
Tail int64 // computed field, not stored
}

// ListMeta after
type ListMeta struct {
Head int64
Len int64
ExpireAt uint64 // ms since epoch; 0 = no TTL
}
```

The on-disk format gains 8 bytes; the existing 24-byte format is distinguished
from the new 32-byte format by length during decode (backward-compatible read).

For HLL and stream types the TTL is appended to the serialised value using the
same flags+expireAt header described in §3.2.

Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design doc states that collection anchor metadata (ListMeta/HashMeta/SetMeta/ZSetMeta) gains an ExpireAt field and that TTL becomes part of each anchor value. In the current codebase, ListMeta (and related helpers) still uses the pre-existing 24-byte format with Head/Tail/Len only, and the adapter still relies on !redis|ttl| for non-string TTL. Please either scope the doc to “strings only” for this PR/phase, or update it to match the actually-implemented collection TTL representation.

Copilot uses AI. Check for mistakes.
Comment on lines +110 to +128
A leading `flags` byte equal to `0x00` (no TTL) followed by raw bytes is fully
backward-compatible with no-TTL reads once a migration is applied; the old raw
bytes that start with `0x00` are rare in practice (binary data) but must be
handled during migration (see §7).

Alternatively, a two-byte magic + version prefix (`0xFF 0x01`) avoids the
collision problem at the cost of two extra bytes per value:

```
┌─────────────────────────────────────────────────────┐
│ 2 bytes magic+version 0xFF 0x01 │
│ 1 byte flags │
│ 8 bytes expireAtMillis (present iff flags & 0x01) │
│ N bytes user value │
└─────────────────────────────────────────────────────┘
```

A raw value not starting with `0xFF 0x01` is treated as a legacy (no-TTL)
encoding during the migration window.
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc claims the 0xFF 0x01 magic prefix “avoids the collision problem”, but legacy Redis string payloads are arbitrary bytes and can still legitimately start with 0xFF 0x01, which would be misinterpreted as the new header. Please call out this remaining ambiguity explicitly and/or document the additional validation/escape strategy used to make decoding unambiguous during migration.

Copilot uses AI. Check for mistakes.
// path does a single GetAt on the prefixed key (value + TTL in one read);
// the legacy path does at most two extra reads (TTL index, then bare key).
// Expiration is checked locally from the TTL we just decoded.
func (r *RedisServer) readRedisStringAt(key []byte, readTS uint64) ([]byte, *time.Time, error) {
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readRedisStringAt currently returns (userValue, nil TTL) when the value exists under !redis|str| but is still in the legacy (no-magic) format. During the rolling-upgrade/phase-0 window, TTL for those legacy string values still lives in !redis|ttl|, so callers like INCR and MULTI/EXEC will silently drop the existing TTL when they re-encode and write back. Consider: when decodeRedisStr sees legacy bytes (or when !isNewRedisStrFormat(v)), fetch TTL via ttlAt(readTS) and return it so TTL is preserved until migration completes.

Suggested change
func (r *RedisServer) readRedisStringAt(key []byte, readTS uint64) ([]byte, *time.Time, error) {
}
if !isNewRedisStrFormat(v) {
// During the rolling-upgrade window, some values may already live under
// !redis|str| while still using the legacy payload encoding, with TTL
// remaining in the !redis|ttl| index. Fetch it so rewrites preserve TTL.
legacyTTL, ttlErr := r.ttlAt(context.Background(), key, readTS)
if ttlErr != nil {
return nil, nil, ttlErr
}
return userValue, legacyTTL, nil
}

Copilot uses AI. Check for mistakes.
Comment on lines +49 to +76
// decodeRedisStr decodes a Redis string value.
// For new format (magic prefix 0xFF 0x01): returns user value and optional expireAt.
// For legacy format (no magic): returns raw bytes as user value with nil expireAt.
func decodeRedisStr(raw []byte) (value []byte, expireAt *time.Time, err error) {
if isNewRedisStrFormat(raw) {
if len(raw) < redisStrBaseHeader {
return nil, nil, errors.New("invalid encoded string: too short")
}
flags := raw[2]
rest := raw[redisStrBaseHeader:]
if flags&redisStrHasTTL != 0 {
if len(rest) < redisUint64Bytes {
return nil, nil, errors.New("invalid encoded string: missing TTL bytes")
}
ms := min(binary.BigEndian.Uint64(rest[:redisUint64Bytes]), math.MaxInt64)
t := time.UnixMilli(int64(ms)) // #nosec G115
return rest[redisUint64Bytes:], &t, nil
}
return rest, nil, nil
}
// Legacy format: raw bytes with no TTL header.
return raw, nil, nil
}

// isNewRedisStrFormat reports whether raw uses the new magic+version prefix.
func isNewRedisStrFormat(raw []byte) bool {
return len(raw) >= 2 && raw[0] == redisStrMagic && raw[1] == redisStrVersion
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new string encoding detection is ambiguous for legitimate legacy Redis string values that begin with the chosen magic prefix (0xFF 0x01…). Those values will be mis-decoded (header stripped / TTL bytes interpreted) or hard-fail decode, which is user-visible data corruption. To make this safe, add stronger framing/validation (e.g., reject/ignore unknown flag bits and treat invalid headers as legacy, or add an integrity check such as a CRC/versioned header) so legacy payload bytes are never reinterpreted as metadata.

Copilot uses AI. Check for mistakes.
Copilot PR review follow-ups:

- ttlAt: split off legacyIndexTTLAt and add hasExpired(nonStringOnly)
  so readValueAt on non-string internal keys skips the redundant
  GetAt on !redis|str|<key>.
- isLeaderKeyExpired: handle decodeRedisStr errors for new-format
  strings by treating malformed payloads as expired rather than
  silently alive; extract leaderEmbeddedTTLExpired helper.
- readRedisStringAt: when the !redis|str| payload is still in legacy
  format during rolling upgrade, consult the secondary !redis|ttl|
  index so INCR/MULTI-EXEC rewrites preserve TTL. Split into
  decodePrefixedString / readBareLegacyString helpers to keep the
  function under cyclop/nestif limits.
- docs: scope collection-metadata encoding as future work; call out
  the 0xFF 0x01 magic-prefix ambiguity on legacy payloads and the
  mitigation (decode-error fallback, migration compactor).
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 17, 2026

/gemini review

@bootjp bootjp requested a review from Copilot April 17, 2026 16:21
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements embedding TTL timestamps directly within Redis string values to reduce read amplification and improve MVCC snapshot consistency. It removes the asynchronous TTLBuffer mechanism, opting instead to commit TTL updates in the same transaction as the data. Key changes include a new versioned string encoding, updated read/write paths that prioritize embedded TTLs while maintaining backward compatibility with the legacy !redis|ttl| index, and a fix for INCR to correctly preserve existing TTLs. Feedback suggests refining leadership checks to use logical keys, stabilizing expiration timestamps across transaction retries, and ensuring consistent cleanup of legacy TTL indices during increment operations.

Comment on lines +448 to 457
func (r *RedisServer) leaderAwareGetAt(key []byte, readTS uint64) ([]byte, error) {
if r.coordinator.IsLeaderForKey(key) {
if err := r.coordinator.VerifyLeaderForKey(key); err != nil {
return nil, errors.WithStack(err)
}
v, err := r.store.GetAt(context.Background(), key, readTS)
return v, errors.WithStack(err)
}
return r.tryLeaderGetAt(key, readTS)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The leadership check in leaderAwareGetAt is performed using the physical key. However, leadership is partitioned by the logical user key. It is safer to extract the user key using extractRedisInternalUserKey before performing leadership checks. Additionally, ensure that leadership status is verified after acquiring any locks that protect the operation to prevent race conditions.

func (r *RedisServer) leaderAwareGetAt(key []byte, readTS uint64) ([]byte, error) {
	routingKey := key
	if userKey := extractRedisInternalUserKey(key); userKey != nil {
		routingKey = userKey
	}
	if r.coordinator.IsLeaderForKey(routingKey) {
		if err := r.coordinator.VerifyLeaderForKey(routingKey); err != nil {
			return nil, errors.WithStack(err)
		}
		v, err := r.store.GetAt(context.Background(), key, readTS)
		return v, errors.WithStack(err)
	}
	return r.tryLeaderGetAt(key, readTS)
}
References
  1. To prevent race conditions in read-modify-write operations that require leadership, verify leadership status after acquiring the lock that protects the operation.

Comment thread adapter/redis_compat_commands.go Outdated
if ttl <= 0 {
return r.expireDeleteKey(ctx, key, readTS)
}
expireAt := time.Now().Add(time.Duration(ttl) * unit)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The expireAt timestamp is calculated using time.Now() inside the retryRedisWrite closure. If the transaction retries due to a write conflict, the expiration time will shift forward on each attempt. For consistency and to ensure all operations within the transaction share a consistent reference point, the expiration timestamp should be calculated once before entering the retry loop and passed into the closure.

References
  1. In an MVCC system, all operations within a single transaction must share the same start timestamp to ensure atomicity and a consistent snapshot view.

Comment on lines +1992 to +1994
if existingTTL != nil {
elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: redisTTLKey(cmd.Args[1]), Value: encodeRedisTTL(*existingTTL)})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When existingTTL is nil, the INCR command should explicitly emit a Del operation for the redisTTLKey scan index. While a key with no TTL shouldn't have an index entry, emitting an explicit Del (as done in replaceWithStringTxn and saveString) is safer to ensure any stale or legacy index entries are cleared when the key is updated.

if existingTTL != nil {
	elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Put, Key: redisTTLKey(cmd.Args[1]), Value: encodeRedisTTL(*existingTTL)})
} else {
	elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey(cmd.Args[1])})
}

…r INCR TTL index

Gemini PR review follow-ups:

- leaderAwareGetAt now strips the internal prefix before asking the
  coordinator so leadership is resolved on the logical user key.
- setExpire computes expireAt once before the retryRedisWrite loop so
  OCC retries all write the same deadline instead of shifting it
  forward on each attempt. Adds the matching overflow guard.
- INCR explicitly Dels the redisTTLKey scan index when the key has no
  TTL, defensively clearing any stale/legacy entry.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Comment on lines +361 to +372
func (r *RedisServer) doSetExpire(ctx context.Context, key []byte, ttl int64, expireAt time.Time, nxOnly bool) (int, error) {
readTS, eligible, err := r.prepareExpire(key, nxOnly)
if err != nil {
return 0, err
}
if !eligible {
return 0, nil
}
if ttl <= 0 {
return r.expireDeleteKey(ctx, key, readTS)
}
typ, err := r.rawKeyTypeAt(ctx, key, readTS)
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doSetExpire treats every redisTypeString key as a plain Redis string and routes to dispatchStringExpire (read-modify-write on !redis|str|<key>). But rawKeyTypeAt also returns redisTypeString for HyperLogLog keys stored under !redis|hll|<key>, so EXPIRE on an HLL will fail to update TTL (and may incorrectly return 0). Consider distinguishing plain strings vs HLL (e.g., check ExistsAt(redisStrKey(key)) / bare key vs ExistsAt(redisHLLKey(key))), and for HLL keep using the !redis|ttl| index-only path until inline TTL for HLL is implemented.

Copilot uses AI. Check for mistakes.
Comment thread adapter/redis.go
Comment on lines +2153 to +2158
if typ == redisTypeString {
return t.markStringDirty(key)
}
return redisResult{typ: resultInt, integer: 1}, nil
}

Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applyPositiveExpire calls markStringDirty for any key whose staged type is redisTypeString, which will load/write !redis|str|<key> so the embedded TTL can be updated. However, this codebase also reports HyperLogLog keys (!redis|hll|<key>) as redisTypeString; for those keys, markStringDirty will load a non-existent plain-string anchor and the transaction may end up creating an empty !redis|str|<key> value just to carry TTL. You likely need to only mark dirty for plain string anchors (prefixed/bare string), and for HLL keep TTL in !redis|ttl| (so buildTTLElems emits the TTL mutation).

Suggested change
if typ == redisTypeString {
return t.markStringDirty(key)
}
return redisResult{typ: resultInt, integer: 1}, nil
}
if typ == redisTypeString && isPlainStringExpireTarget(key) {
return t.markStringDirty(key)
}
return redisResult{typ: resultInt, integer: 1}, nil
}
func isPlainStringExpireTarget(key []byte) bool {
if bytes.HasPrefix(key, []byte("!redis|")) {
return bytes.HasPrefix(key, []byte("!redis|str|"))
}
return true
}

Copilot uses AI. Check for mistakes.
Comment thread adapter/redis.go
Comment on lines +2537 to +2543
// String keys encode TTL inside the value in buildKeyElems; skip them here.
if _, isString := t.working[string(redisStrKey([]byte(k)))]; isString {
continue
}
if st.value == nil {
elems = append(elems, &kv.Elem[kv.OP]{Op: kv.Del, Key: redisTTLKey([]byte(k))})
} else {
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildTTLElems skips emitting any !redis|ttl| mutations whenever the key has a !redis|str|<key> entry in t.working. That’s correct for string updates (since buildKeyElems emits the scan-index elem), but it breaks string deletes: buildKeyElems only deletes !redis|str|<key> and does not delete the !redis|ttl|<key> index, and buildTTLElems will currently skip it. This can leave a stale TTL index behind, which may later cause incorrect expiry behavior when the key is recreated as a non-string. Consider emitting a Del !redis|ttl|<key> when deleting a string key (either in buildKeyElems for the delete case, or by making buildTTLElems skip strings only when they are being put/updated, not when deleted).

Copilot uses AI. Check for mistakes.
Comment on lines 1987 to 1995
return err
}
existingTTL = ttl
current, err = strconv.ParseInt(string(raw), 10, 64)
if err != nil {
return fmt.Errorf("ERR value is not an integer or out of range")
}
}
current++
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INCR only writes the !redis|ttl|<key> scan-index when existingTTL != nil. If a stale !redis|ttl|<key> entry exists (e.g., from legacy buffered TTL writes or partial cleanup), INCR on a missing key will create a new persistent string but leave the old TTL index in place, which could later cause background TTL sweeping (or any TTL-index-based logic) to expire the new value incorrectly. Consider explicitly deleting !redis|ttl|<key> when existingTTL == nil (or generally mirroring the SET behavior of clearing the index when making a key persistent).

Copilot uses AI. Check for mistakes.
…delete

Copilot PR review follow-ups:

- rawKeyTypeAt reports HLL as redisTypeString, but HLL payloads live
  under !redis|hll|<key> and do not carry inline TTL. doSetExpire and
  applyPositiveExpire now call isPlainRedisString to distinguish plain
  Redis strings from HLLs; HLLs keep TTL in the !redis|ttl| scan index
  so EXPIRE/PEXPIRE work correctly on HyperLogLogs.
- buildKeyElems now emits Del on redisTTLKey when deleting a string
  anchor, so stale scan-index entries don't outlive the string and
  incorrectly expire a future key reincarnation.
@bootjp bootjp merged commit e08475a into main Apr 17, 2026
8 checks passed
@bootjp bootjp deleted the claude/sad-hermann-0a3a28 branch April 17, 2026 20:34
bootjp added a commit that referenced this pull request Apr 30, 2026
Three follow-up issues from Codex on top of round 1.

#619 Codex P1 -- assemble only manifest-declared parts.
filterChunksForManifest previously matched on uploadID only; the
manifest declares specific (partNo, partVersion) tuples, and S3's
overwrite-then-async-cleanup window can leave older partVersion
chunks present under the same (bucket, generation, object,
uploadID). Mixing them produces corrupted bytes.

s3ObjectState gains declaredParts map[s3PartKey]struct{}, populated
by HandleObjectManifest. filterChunksForManifest takes the set and
restricts emission to declared (partNo, partVersion) tuples. Test
TestS3_StalePartVersionExcludedFromAssembledBody asserts a stale
partVersion=7 cannot leak when the manifest declares partVersion=9.

#497 Codex P2 -- preserve dot segments.
safeJoinUnderRoot used filepath.Clean which collapses "a/../b" to
"b". S3 treats those bytes literally; "a/../b" and "b" are distinct
keys that would have silently merged into one output file. Now
explicitly rejects any object key whose segments are "." or ".."
with ErrS3MalformedKey. NUL bytes also rejected. Test
TestS3_DotSegmentObjectKeyRejected covers the four forms ("a/../b",
"a/./b", "..", ".").

#521 Codex P2 -- cross-generation collision.
s3BucketState gains activeGen captured from the bucket-meta record.
flushBucket suppresses objects whose generation differs from
activeGen (under --include-orphans, those flow to _orphans/; by
default they're dropped with an s3_stale_generation_objects
warning). Tests TestS3_StaleGenerationObjectExcluded.

flushBucket cyclomatic complexity stayed under the cap by extracting
flushBucketObjects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants