perf(redis): fast-path HGET / HEXISTS / SISMEMBER wide-column lookups by bootjp · Pull Request #565 · bootjp/elastickv

bootjp · 2026-04-20T17:18:19Z

Summary

Extends the GET / EXISTS / SET fast-path work (PRs #560, #562, #563) to three more hot type-specific reads.

Before, every HGET / HEXISTS / SISMEMBER paid the full ~17-seek rawKeyTypeAt probe before even calling loadHashAt / loadSetAt, even though the specific wide-column entry for the requested field / member is addressable in a single seek.

New flow per command

Probe the wide-column entry directly:
- HGET → store.GetAt(store.HashFieldKey(key, field))
- HEXISTS → store.ExistsAt(store.HashFieldKey(key, field))
- SISMEMBER → store.ExistsAt(store.SetMemberKey(key, member))
On hit, apply hasExpiredTTLAt(key) and return the appropriate Redis response. Expired → nil / 0, matching the pre-optimisation keyTypeAt semantics.
On miss, delegate to a per-command slow helper (hgetSlow / hexistsSlow / sismemberSlow) that preserves the exact legacy keyTypeAt + loadHashAt / loadSetAt path. Legacy-blob encodings and nil / WRONGTYPE disambiguation retain their prior behaviour.

Per-command effect on a live wide-column key

Command	Before	After
HGET	~17 seeks	2–3 seeks
HEXISTS	~17 seeks	2–3 seeks
SISMEMBER	~17 seeks	2–3 seeks

Legacy-blob-encoded hashes/sets go through the slow path (miss on the wide-column key → fallback) so behaviour there is unchanged.

Test plan

go test -race ./adapter/... -short passes
Existing HGET / HEXISTS / SISMEMBER tests cover: hit / miss / WRONGTYPE / nil-key / TTL-expired / legacy-blob
Production: deploy and verify per-command GetAt counters (once Grafana dashboard in obs(redis): add hot-path dashboard and metrics to verify PR #560 #564 lands) show HGET/HEXISTS/SISMEMBER drop into the same latency band as GET

Summary by CodeRabbit

Refactor
- Improved performance and responsiveness for HGET, HEXISTS, and SISMEMBER commands via faster lookup paths and more efficient handling of type/TTL conditions.
Tests
- Added extensive end-to-end tests covering fast-path hits, TTL-expired cases, misses, and type-mismatch behavior to ensure correctness and stability.

Extends the GET / EXISTS / SET fast-path work (PRs #560, #562, #563) to the three most common type-specific read commands. Before, every HGET / HEXISTS / SISMEMBER paid the full ~17-seek rawKeyTypeAt probe before falling through to loadHashAt / loadSetAt, even though the specific wide-column entry for the requested field/member is addressable in a single seek. Each handler now: 1. Probes the wide-column entry directly: - HGET store.GetAt(store.HashFieldKey(key, field)) - HEXISTS store.ExistsAt(store.HashFieldKey(key, field)) - SISMEMBER store.ExistsAt(store.SetMemberKey(key, member)) 2. On hit, applies hasExpiredTTLAt(key) and returns the appropriate Redis protocol response. Expired -> nil / 0, matching the pre-optimisation keyTypeAt semantics. 3. On miss, delegates to a per-command slow helper (hgetSlow / hexistsSlow / sismemberSlow) that preserves the exact legacy keyTypeAt + loadHashAt / loadSetAt path, so legacy-blob encodings and nil / WRONGTYPE disambiguation retain their prior behaviour. Effect per hit on a live wide-column key: ~17 seeks → 2-3 seeks (1 field/member lookup + 1-2 for the TTL probe). Split into fast/slow helpers per command so each handler body stays under the cyclomatic-complexity cap.

gemini-code-assist · 2026-04-20T17:18:23Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-04-20T17:18:26Z

📝 Walkthrough

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.76% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding fast-path optimizations for HGET, HEXISTS, and SISMEMBER wide-column lookups, which is the primary focus of the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch perf/redis-type-specific-fastpath

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

HGET / HEXISTS / SISMEMBER are type-specific: they only succeed on hash / set keys. Call hasExpired(..., nonStringOnly=true) directly instead of hasExpiredTTLAt, which defaults to probing !redis|str| for an embedded TTL that those keys never carry. One guaranteed-miss pebble GetAt saved per fast-path hit. Also prepares #565 for #562 dropping the unused hasExpiredTTLAt wrapper.

Copilot

Pull request overview

This PR extends the Redis adapter’s existing “avoid keyTypeAt / rawKeyTypeAt probe storm” work by adding fast paths for three hot, type-specific reads (HGET, HEXISTS, SISMEMBER) when the data is stored in wide-column format.

Changes:

Add direct wide-column probes for HGET (GetAt(HashFieldKey)), HEXISTS (ExistsAt(HashFieldKey)), and SISMEMBER (ExistsAt(SetMemberKey)), with TTL filtering.
Introduce per-command slow-path helpers (hgetSlow, hexistsSlow, sismemberSlow) to preserve legacy behavior (nil vs WRONGTYPE, legacy-blob formats) when the fast probe misses.
Refactor the handlers to reuse a shared ctx, key, and field/member variables.

Copilot · 2026-04-20T17:50:36Z

+func (r *RedisServer) hashFieldFastLookup(ctx context.Context, key, field []byte, readTS uint64) (raw []byte, hit, alive bool, err error) {
+	raw, err = r.store.GetAt(ctx, store.HashFieldKey(key, field), readTS)
+	if err != nil {
+		if errors.Is(err, store.ErrKeyNotFound) {
+			return nil, false, false, nil
+		}
+		return nil, false, false, cockerrors.WithStack(err)
+	}
+	expired, expErr := r.hasExpired(ctx, key, readTS, true)
+	if expErr != nil {
+		return nil, false, false, cockerrors.WithStack(expErr)
+	}
+	return raw, true, !expired, nil


hashFieldFastLookup returns a hash field value solely based on the wide-column field key existing. In the (documented) dual-encoding corruption scenario where the same user key has both a string/list encoding and leftover wide-column hash keys, rawKeyTypeAt/keyTypeAt would treat the key as string/list (string-first tiebreaker) and HGET should return WRONGTYPE (or nil if the winning type is TTL-expired). Consider adding a cheap guard before treating this as a hit (e.g., check for string/list encodings first, or fall back to the slow path when any higher-priority encoding is present) so HGET stays consistent with keyTypeAt semantics.

Copilot · 2026-04-20T17:50:36Z

+		conn.WriteBulkString(string(raw))
+		return


The fast path writes the value via WriteBulkString(string(raw)), which forces a []byte→string conversion/allocation. Since the value is already []byte from MVCC, prefer writing it as bulk bytes to avoid the extra allocation/copy on this hot path.

Copilot · 2026-04-20T17:50:36Z

+	if err != nil {
+		if errors.Is(err, store.ErrKeyNotFound) {
+			return nil, false, false, nil
+		}


This function uses errors.Is(err, store.ErrKeyNotFound) while most other store.ErrKeyNotFound checks in this file use cockerrors.Is. Standardizing on one (typically cockerrors.Is in this codebase) avoids subtle mismatches if error wrapping differs across layers.

Copilot · 2026-04-20T17:50:37Z

+func (r *RedisServer) setMemberFastExists(ctx context.Context, key, member []byte, readTS uint64) (hit, alive bool, err error) {
+	exists, err := r.store.ExistsAt(ctx, store.SetMemberKey(key, member), readTS)
+	if err != nil {
+		return false, false, cockerrors.WithStack(err)
+	}
+	if !exists {
+		return false, false, nil
+	}
+	expired, expErr := r.hasExpired(ctx, key, readTS, true)
+	if expErr != nil {
+		return false, false, cockerrors.WithStack(expErr)
+	}
+	return true, !expired, nil


setMemberFastExists treats any existing wide-column member row as an authoritative SISMEMBER hit. In the dual-encoding corruption scenario (string wins per rawKeyTypeAt’s documented tiebreaker), leftover set member rows alongside a live string key would cause SISMEMBER to return 1/0 instead of WRONGTYPE (or 0 if the winning string type is TTL-expired). Add a cheap guard/fallback (e.g., check for higher-priority string/list encodings before honoring the member-row hit) to keep behavior consistent with keyTypeAt.

Copilot · 2026-04-20T17:50:37Z

+func (r *RedisServer) hashFieldFastExists(ctx context.Context, key, field []byte, readTS uint64) (hit, alive bool, err error) {
+	exists, err := r.store.ExistsAt(ctx, store.HashFieldKey(key, field), readTS)
+	if err != nil {
+		return false, false, cockerrors.WithStack(err)
+	}
+	if !exists {
+		return false, false, nil
+	}
+	expired, expErr := r.hasExpired(ctx, key, readTS, true)
+	if expErr != nil {
+		return false, false, cockerrors.WithStack(expErr)
+	}
+	return true, !expired, nil
+}


hashFieldFastExists can return 1 based solely on the wide-column field row existing, even if the logical key would be classified as string/list under rawKeyTypeAt’s string-first tiebreaker (corrupt dual-encoding). That would change HEXISTS from WRONGTYPE to 1/0 in those cases, and it also bypasses embedded-TTL checks for strings. Consider guarding against higher-priority encodings (or falling back to the slow path) before treating the field-row as an authoritative hit.

…ursts Under production congestion the etcd engine's inbound stepCh and outbound heartbeat priority queue were both overflowing: - leader log: dispatchDropCount=1.6M+ on MsgHeartbeat (priority queue cap=64 filled because the transport drained slower than heartbeat tick issuance under CPU starvation) - follower log: RPC returned "etcd raft inbound step queue is full" (stepCh cap=256 filled because the follower's event loop was held up by adapter-side pebble seek storms) The pebble-seek CPU work that drove the starvation is gone in #560 through #565. This change is defence-in-depth for bursts: - defaultMaxInflightMsg: 256 → 1024 (sizes stepCh, dispatchReportCh, and the per-peer outbound "normal" queue). 4x safety margin; worst-case memory stays under a few MB per engine at defaultMaxSizePerMsg = 1 MiB. - defaultHeartbeatBufPerPeer: 64 → 512 (outbound priority queue). Heartbeats are tiny (<100 B), so 512 × numPeers is ≪ 1 MB total; a ~5 s transient pause (election-timeout scale) no longer drops heartbeats and forces peers' leases to expire. Both existing raftengine test suites (etcd + hashicorp) pass under -race with no code changes required elsewhere.

…ursts (#566) ## Summary Defence-in-depth bump for the Raft engine's step queue / heartbeat queue capacities after the production step-queue-saturation incident. ### Production evidence - **Leader logs** (192.168.0.210): ``` dropping etcd raft outbound message type=MsgHeartbeat drop_count=1,600,000+ ``` The priority queue (capacity 64) filled because the transport drained slower than heartbeat tick issuance under CPU starvation. - **Follower logs** (192.168.0.211): ``` etcd raft outbound dispatch failed err="etcd raft inbound step queue is full" ``` Peer stepCh (capacity 256) filled because the event loop was held up by adapter-side pebble seek storms. ### What changed - `defaultMaxInflightMsg`: **256 → 1024** (sizes stepCh, dispatchReportCh, and the per-peer outbound "normal" dispatch queue). 4× safety margin; worst-case memory stays under a few MB per engine at `defaultMaxSizePerMsg = 1 MiB`. - `defaultHeartbeatBufPerPeer`: **64 → 512** (outbound priority queue). Heartbeats are ~100 B, so `512 × numPeers` is ≪ 1 MB; a ~5 s transient pause (election-timeout scale) no longer drops heartbeats and forces peer leases to expire. ### Why this is only "defence in depth" The primary fix for the starvation is #560 / #562 / #563 / #565, which removed the adapter-side pebble seek storm. This PR widens the safety margin against future bursts but does not attempt to paper over any remaining throughput issue. ## Test plan - [x] `go test -race ./internal/raftengine/... -short` passes - [x] golangci-lint clean - [ ] Production: after deploy, verify `elastickv_raft_dispatch_dropped_total` (from PR #564) stays at zero across a tick cycle and `elastickv_raft_step_queue_full_total` does not increment under load

Addresses copilot review on PR #565. rawKeyTypeAt (since #563) probes string prefixes first -- in a dual-encoding corruption where the same user key carries both a string row and a wide-column collection row, rawKeyTypeAt returns String, and keyTypeAt-based HGET / HEXISTS / SISMEMBER return WRONGTYPE (or nil if the string is TTL-expired). The new fast paths treated the presence of the wide-column field / member row as authoritative. That flipped the tiebreaker under corruption (hash/set value returned instead of WRONGTYPE). Add a cheap hasHigherPriorityStringEncoding() guard (single ExistsAt on redisStrKey) in all three fast-path helpers. Hits the slow path when a string encoding exists, preserving keyTypeAt semantics. Non-corrupt collection data still has at most one encoding per user key, so the guard is a guaranteed miss and only adds 1 seek to the fast path. Also address the two remaining copilot items on this PR: - Replace stdlib errors.Is(..., store.ErrKeyNotFound) with cockerrors.Is for consistency with the surrounding file. - Replace WriteBulkString(string(raw)) with WriteBulk(raw) to avoid the []byte → string copy on every HGET fast-path hit. Rarer legacy corruption mixing the collection with HLL / bare / list encodings will still let the fast path fire; those edge cases are documented in hashFieldFastLookup's comment.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Copilot · 2026-04-20T18:44:52Z

+// reports whether it is present and TTL-alive. Returns hit=false when
+// the wide-column key is absent OR when a higher-priority encoding
+// (string / list) exists on the same user key, so the caller falls
+// through to hgetSlow and ends up with the keyTypeAt-consistent
+// WRONGTYPE / nil response.
+//
+// The priority guard costs a single ExistsAt on redisStrKey, covering
+// the most common dual-encoding corruption (SET on a key that
+// previously held a hash, where cleanup did not run). Rarer legacy
+// corruption mixing hash with HLL / bare-key / list encodings will
+// still let this fast-path hit; those workloads should take the slow
+// path too, but in normal operation at most one encoding exists per
+// user key and the guard is a guaranteed miss.


The fast-path doc says it returns hit=false when a higher-priority encoding (string/list) exists, but the implementation only guards on redisStrKey(key) (via hasHigherPriorityStringEncoding). That means dual-encoded keys involving list meta/delta keys, legacy bare string keys, or HLL keys can still return a hash/set fast-path answer instead of the keyTypeAt-consistent WRONGTYPE/nil behavior. Consider expanding the guard to match rawKeyTypeAt’s actual priority checks (string types + list), or update the doc to reflect the narrower guarantee.

Suggested change

// reports whether it is present and TTL-alive. Returns hit=false when

// the wide-column key is absent OR when a higher-priority encoding

// (string / list) exists on the same user key, so the caller falls

// through to hgetSlow and ends up with the keyTypeAt-consistent

// WRONGTYPE / nil response.

//

// The priority guard costs a single ExistsAt on redisStrKey, covering

// the most common dual-encoding corruption (SET on a key that

// previously held a hash, where cleanup did not run). Rarer legacy

// corruption mixing hash with HLL / bare-key / list encodings will

// still let this fast-path hit; those workloads should take the slow

// path too, but in normal operation at most one encoding exists per

// user key and the guard is a guaranteed miss.

// reports whether it is present and TTL-alive. It returns hit=false

// when the wide-column key is absent, or when the string-encoding

// guard in hasHigherPriorityStringEncoding fires for the same user

// key so the caller falls through to hgetSlow.

//

// This fast path does not fully mirror rawKeyTypeAt/keyTypeAt

// priority checks. In particular, it only performs the redisStrKey

// existence probe described below; other mixed-encoding corruption

// cases (for example list / HLL / bare-key remnants) are not filtered

// here and may still hit the wide-column fast path instead of taking

// the slow path that produces keyTypeAt-consistent WRONGTYPE / nil

// behavior.

//

// The priority guard costs a single ExistsAt on redisStrKey, covering

// the most common dual-encoding corruption (SET on a key that

// previously held a hash, where cleanup did not run). In normal

// operation at most one encoding exists per user key and the guard is

// a guaranteed miss.

Copilot · 2026-04-20T18:44:53Z

+// hasHigherPriorityStringEncoding returns true iff a string-encoded
+// entry exists for key. Matches the "string wins" tiebreaker in
+// rawKeyTypeAt (see adapter/redis_compat_helpers.go), so a corrupt
+// dual-encoded key where both a string and a collection row are
+// present will take the slow path and return WRONGTYPE / nil from
+// keyTypeAt rather than the collection-specific fast-path answer.
+func (r *RedisServer) hasHigherPriorityStringEncoding(ctx context.Context, key []byte, readTS uint64) (bool, error) {
+	exists, err := r.store.ExistsAt(ctx, redisStrKey(key), readTS)
+	if err != nil {
+		return false, cockerrors.WithStack(err)
+	}
+	return exists, nil


hasHigherPriorityStringEncoding currently only checks existence of !redis|str|<key>, but rawKeyTypeAt’s string priority includes !redis|hll|<key> and the legacy bare <key> as well (and list types are probed before wide-column collections). With dual-encoded keys, HGET/HEXISTS/SISMEMBER can return a collection-specific fast-path result where rawKeyTypeAt/keyTypeAt would classify the key as string/list and return WRONGTYPE/nil. To keep semantics aligned, the guard should mirror rawKeyTypeAt’s priority probes (or be renamed/docs adjusted if the behavior change is intentional).

Suggested change

// hasHigherPriorityStringEncoding returns true iff a string-encoded

// entry exists for key. Matches the "string wins" tiebreaker in

// rawKeyTypeAt (see adapter/redis_compat_helpers.go), so a corrupt

// dual-encoded key where both a string and a collection row are

// present will take the slow path and return WRONGTYPE / nil from

// keyTypeAt rather than the collection-specific fast-path answer.

func (r *RedisServer) hasHigherPriorityStringEncoding(ctx context.Context, key []byte, readTS uint64) (bool, error) {

exists, err := r.store.ExistsAt(ctx, redisStrKey(key), readTS)

if err != nil {

return false, cockerrors.WithStack(err)

}

return exists, nil

// hasHigherPriorityStringEncoding returns true iff keyTypeAt would

// classify the key as a higher-priority string-like encoding before a

// wide-column collection fast-path should answer. This mirrors the

// effective type-priority behavior of rawKeyTypeAt/keyTypeAt for

// explicit string keys, HLL-backed strings, legacy bare-key strings,

// and list encodings that outrank wide-column collection rows.

func (r *RedisServer) hasHigherPriorityStringEncoding(ctx context.Context, key []byte, readTS uint64) (bool, error) {

typ, err := r.keyTypeAt(ctx, key, readTS)

if err != nil {

return false, cockerrors.WithStack(err)

}

return typ == redisTypeString || typ == redisTypeList, nil

Addresses copilot review on PR #565. The previous hashFieldFastLookup doc claimed the guard fires on "string / list" higher-priority encodings, but the implementation only probes redisStrKey. Expanding the guard to fully mirror rawKeyTypeAt would add ~3-5 extra seeks per fast-path hit (string variants + list meta + list delta scan), negating most of the gain versus the ~17-seek keyTypeAt slow path. Narrow the doc to describe the actual scope: only the common SET-over-previous-hash corruption case is caught; HLL / bare / list dual-encoding remnants still take the fast path. Call out the trade-off explicitly so future readers don't assume stricter alignment than we provide, and document that in normal operation at most one encoding exists per user key so the gap is invisible.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-20T19:04:52Z

 func (r *RedisServer) hexists(conn redcon.Conn, cmd redcon.Command) {
 	if r.proxyToLeader(conn, cmd, cmd.Args[1]) {
 		return
 	}
+	key := cmd.Args[1]
+	field := cmd.Args[2]
 	readTS := r.readTS()
-	typ, err := r.keyTypeAt(context.Background(), cmd.Args[1], readTS)
+	ctx := context.Background()
+
+	// Fast path: direct wide-column field existence check. ExistsAt
+	// is cheaper than GetAt since we don't need the value payload.
+	hit, alive, err := r.hashFieldFastExists(ctx, key, field, readTS)
+	if err != nil {
+		conn.WriteError(err.Error())
+		return
+	}


HEXISTS now has a wide-column fast path (hashFieldFastExists) with TTL filtering and a slow-path fallback. I couldn’t find any Go tests that directly exercise HEXISTS (including fast-path hit/miss, TTL-expired -> 0, nil key -> 0, and WRONGTYPE). Please add coverage for these cases to ensure the new branching preserves legacy behavior.

Copilot · 2026-04-20T19:04:53Z

+	key := cmd.Args[1]
+	member := cmd.Args[2]
 	readTS := r.readTS()
-	typ, err := r.keyTypeAt(context.Background(), cmd.Args[1], readTS)
+	ctx := context.Background()
+
+	hit, alive, err := r.setMemberFastExists(ctx, key, member, readTS)
+	if err != nil {
+		conn.WriteError(err.Error())
+		return


SISMEMBER now has a fast path (setMemberFastExists) with new TTL/priority-guard behavior and a slow-path fallback. I couldn’t find targeted tests exercising the fast-path hit/miss and TTL-expired cases for SISMEMBER (current usage in tests appears limited to a basic happy-path). Please add/extend tests to lock in the intended semantics for wide-column hits, misses (fallback), WRONGTYPE, and TTL-expired keys.

Copilot · 2026-04-20T19:04:53Z

+	// Fast path: look the wide-column field up directly. Live
+	// wide-column hashes resolve here in 1 seek + TTL probe versus
+	// the ~17 seeks rawKeyTypeAt issues through keyTypeAt. Legacy-
+	// blob hashes miss the wide-column key and fall through.
+	raw, hit, alive, err := r.hashFieldFastLookup(ctx, key, field, readTS)
+	if err != nil {
+		conn.WriteError(err.Error())
+		return
+	}
+	if hit {
+		if !alive {
+			conn.WriteNull()
+			return
+		}
+		// WriteBulk sends the payload directly from the []byte backing
+		// store; WriteBulkString(string(raw)) would force a []byte →
+		// string copy on every fast-path hit.


HGET now has a new wide-column fast path (hashFieldFastLookup) with TTL filtering and a corruption-priority guard before falling back to hgetSlow. I couldn’t find tests that specifically cover the fast-path behavior (wide-column hit, TTL-expired -> nil, miss -> slow path, and WRONGTYPE handling). Please add/extend tests to exercise these scenarios so the optimization can’t accidentally change Redis semantics.

Addresses copilot review on PR #565. Pin the semantics of the new wide-column fast paths at the Redis wire-protocol level. Each of HGET / HEXISTS / SISMEMBER gets five cases driven by a real redis.Client against createNode(t, 3): - FastPathHit: HSET/SAdd then read -> value / 1 - Miss: read on a nonexistent user key -> nil / 0 - UnknownField (or UnknownMember): collection exists but the requested field / member does not -> nil / 0 (exercises the miss branch of the fast-path lookup) - WRONGTYPE: user key exists as a string -> WRONGTYPE error from the slow-path fallback - TTLExpired: HSET + PEXPIRE with a short TTL, sleep, then read -> nil / 0 (exercises the !alive branch of the fast-path lookup) Additionally, HGET_FastPathGuardDualEncodingSmoke drives the string-priority guard (hasHigherPriorityStringEncoding) at the wire level: HSET then SET on the same key must never surface the hash field value from the fast path -- either WRONGTYPE or nil are acceptable depending on cleanup timing, but the hash value leaking through the string-priority tiebreaker is not.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-20T19:44:44Z

+const collectionFastPathTTL = 80 * time.Millisecond
+
+// waitForTTLExpiry sleeps past a short TTL so the next read sees the
+// key as expired. A small headroom accounts for wall-clock jitter on
+// CI; tests that are sensitive to this use short TTLs (~80 ms) so the
+// delay stays well under a second.
+func waitForTTLExpiry() {
+	time.Sleep(collectionFastPathTTL + 50*time.Millisecond)
+}


waitForTTLExpiry uses a fixed time.Sleep to wait out expiry. This tends to be flaky under CI scheduling jitter and clock granularity; other TTL tests in this repo use require.Eventually polling for redis.Nil/false instead. Consider replacing the fixed sleep with an Eventually loop (with a short poll interval and ~1s timeout) so the test only waits as long as needed and remains stable on slow machines.

Copilot · 2026-04-20T19:44:44Z

+// readTS for this HGET. The test tolerates either "WRONGTYPE" or the
+// old hash value being returned, since the exact race window is tiny,
+// but it must NOT panic or silently leak wide-column data as a string
+// hit: the guard is what keeps that outcome consistent with
+// keyTypeAt semantics.


The comment above this test says it "tolerates either "WRONGTYPE" or the old hash value being returned", but the assertion below explicitly requires that "hashVal" is not returned. Please update the comment to match the actual expected outcomes (e.g., WRONGTYPE or nil/cleanup), otherwise it’s misleading when debugging failures.

Suggested change

// readTS for this HGET. The test tolerates either "WRONGTYPE" or the

// old hash value being returned, since the exact race window is tiny,

// but it must NOT panic or silently leak wide-column data as a string

// hit: the guard is what keeps that outcome consistent with

// keyTypeAt semantics.

// readTS for this HGET. The test tolerates either "WRONGTYPE" or a

// nil/cleanup result if the hash keys are already gone, but it must

// NOT return the old hash value, panic, or silently leak wide-column

// data as a string hit: the guard is what keeps that outcome

// consistent with keyTypeAt semantics.

Copilot · 2026-04-20T19:44:44Z

+// hasHigherPriorityStringEncoding returns true iff a string-encoded
+// entry exists for key. Matches the "string wins" tiebreaker in
+// rawKeyTypeAt (see adapter/redis_compat_helpers.go), so a corrupt
+// dual-encoded key where both a string and a collection row are
+// present will take the slow path and return WRONGTYPE / nil from
+// keyTypeAt rather than the collection-specific fast-path answer.
+func (r *RedisServer) hasHigherPriorityStringEncoding(ctx context.Context, key []byte, readTS uint64) (bool, error) {
+	exists, err := r.store.ExistsAt(ctx, redisStrKey(key), readTS)
+	if err != nil {
+		return false, cockerrors.WithStack(err)
+	}
+	return exists, nil


hasHigherPriorityStringEncoding’s docstring says it returns true iff "a string-encoded entry exists for key" and that it matches the rawKeyTypeAt "string wins" tiebreaker, but the implementation only checks redisStrKey(key) (it does not check HLL or legacy bare string keys, which probeStringTypes treats as string too). Please tighten the comment (and/or rename the helper) so it accurately reflects the narrower guard it actually provides.

Addresses the Medium-severity miss-path / legacy-blob regression flagged in the independent review of PR #565. The priority guard (ExistsAt on redisStrKey) was run BEFORE the wide-column probe, so every miss (nonexistent user key, legacy-blob hash / set encoding, wrong-type key) paid the guard seek for no benefit. For miss-heavy workloads and legacy-blob data the PR could be net-slower than the pre-PR keyTypeAt path. Reorder: probe the wide-column field / member key first. On miss, return and let the caller take the slow path (1 seek total added, down from 2). Only run the string-priority guard when we have a real hit to defer to the slow path -- preserving the dual-encoding-corruption consistency the original PR aimed for while eliminating the negative-case tax. Per-command effect on non-hit traffic: HGET / HEXISTS / SISMEMBER miss now: 1 seek (was: 2 seeks) HGET / HEXISTS / SISMEMBER hit now: 3 seeks (was: 3 seeks) All existing adapter tests pass unchanged.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

adapter/redis_collection_fastpath_test.go (1)

41-270: Consider consolidating the fast-path tests into a table-driven suite.

The fourteen HGET/HEXISTS/SISMEMBER cases share ~6 lines of identical setup (t.Parallel(), createNode, shutdown, NewClient, defer Close) and then differ only in the pre-state and the assertion. A single table-driven test per command would remove the ~80 lines of duplication and make it easier to add new cases (e.g. wrong-type variants for legacy-blob hashes).

As per coding guidelines: "Unit tests should be co-located with packages in *_test.go files; prefer table-driven test cases."

♻️ Sketch of a table-driven shape for the HGET group

func TestRedis_HGET_FastPath(t *testing.T) {
    t.Parallel()
    nodes, _, _ := createNode(t, 3)
    defer shutdown(nodes)
    rdb := redis.NewClient(&redis.Options{Addr: nodes[0].redisAddress})
    defer func() { _ = rdb.Close() }()
    ctx := context.Background()

    cases := []struct {
        name      string
        setup     func(t *testing.T)
        key       string
        field     string
        wantErr   string // "", "WRONGTYPE", "redis.Nil"
        wantValue string
    }{
        {name: "hit", setup: func(t *testing.T) {
            require.NoError(t, rdb.HSet(ctx, "h:fast", "field", "value").Err())
        }, key: "h:fast", field: "field", wantValue: "value"},
        {name: "miss_key", key: "h:missing", field: "field", wantErr: "redis.Nil"},
        // ...
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            if tc.setup != nil { tc.setup(t) }
            got, err := rdb.HGet(ctx, tc.key, tc.field).Result()
            switch tc.wantErr {
            case "":
                require.NoError(t, err)
                require.Equal(t, tc.wantValue, got)
            case "redis.Nil":
                require.ErrorIs(t, err, redis.Nil)
            default:
                require.Error(t, err)
                require.Contains(t, err.Error(), tc.wantErr)
            }
        })
    }
}

Subtests that need TTL setup can be kept as separate cases using a helper that PExpires + waits.

As per coding guidelines: **/*_test.go: "prefer table-driven test cases".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@adapter/redis_collection_fastpath_test.go` around lines 41 - 270, Consolidate
the duplicated tests (TestRedis_HGET_FastPathHit, TestRedis_HGET_Miss,
TestRedis_HGET_UnknownField, TestRedis_HGET_WRONGTYPE, TestRedis_HGET_TTLExpired
and the analogous HEXISTS and SISMEMBER tests) into three table-driven tests
(e.g. TestRedis_HGET_FastPath, TestRedis_HEXISTS_FastPath,
TestRedis_SISMEMBER_FastPath): move the common setup (t.Parallel(), createNode,
shutdown, redis.NewClient, defer Close, ctx) to the top of each table test,
define a cases slice with fields like name, setup func, key, field/member,
wantErr/wantValue, and loop with t.Run for subtests, invoking per-case setup to
create pre-state (including PExpire + waitForTTLExpiry for TTL cases) and then
assert according to wantErr/wantValue (handle redis.Nil via require.ErrorIs).
Ensure you reference the existing helpers createNode, shutdown, waitForTTLExpiry
and the redis client usage so the refactor replaces the duplicated blocks with
the table-driven loops.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@adapter/redis_collection_fastpath_test.go`:
- Around line 286-307: The smoke test
TestRedis_HGET_FastPathGuardDualEncodingSmoke doesn't reliably exercise the
guard because replaceWithStringTxn (the SET path) may atomically clean up hash
rows, so HGET observes nil/WRONGTYPE for unrelated reasons; either
rename/document the test as a "no-panic/no-leak regression" instead of a
guard-fires assertion, or add a deterministic unit test that bypasses SET's
cleanup by writing both HashFieldKey(...) and redisStrKey(...) directly through
the store interface (or lower-level helper) so hasHigherPriorityStringEncoding
and the guard path are reached predictably; update test name/comments or add the
new test harness exercising the guard explicitly.

---

Nitpick comments:
In `@adapter/redis_collection_fastpath_test.go`:
- Around line 41-270: Consolidate the duplicated tests
(TestRedis_HGET_FastPathHit, TestRedis_HGET_Miss, TestRedis_HGET_UnknownField,
TestRedis_HGET_WRONGTYPE, TestRedis_HGET_TTLExpired and the analogous HEXISTS
and SISMEMBER tests) into three table-driven tests (e.g.
TestRedis_HGET_FastPath, TestRedis_HEXISTS_FastPath,
TestRedis_SISMEMBER_FastPath): move the common setup (t.Parallel(), createNode,
shutdown, redis.NewClient, defer Close, ctx) to the top of each table test,
define a cases slice with fields like name, setup func, key, field/member,
wantErr/wantValue, and loop with t.Run for subtests, invoking per-case setup to
create pre-state (including PExpire + waitForTTLExpiry for TTL cases) and then
assert according to wantErr/wantValue (handle redis.Nil via require.ErrorIs).
Ensure you reference the existing helpers createNode, shutdown, waitForTTLExpiry
and the redis client usage so the refactor replaces the duplicated blocks with
the table-driven loops.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1fe881c4-e583-432d-b32c-bad81862611e

📥 Commits

Reviewing files that changed from the base of the PR and between 21425cf and 80df00c.

📒 Files selected for processing (2)

adapter/redis_collection_fastpath_test.go
adapter/redis_compat_commands.go

coderabbitai · 2026-04-20T20:37:32Z

+func TestRedis_HGET_FastPathGuardDualEncodingSmoke(t *testing.T) {
+	t.Parallel()
+	nodes, _, _ := createNode(t, 3)
+	defer shutdown(nodes)
+
+	ctx := context.Background()
+	rdb := redis.NewClient(&redis.Options{Addr: nodes[0].redisAddress})
+	defer func() { _ = rdb.Close() }()
+
+	require.NoError(t, rdb.HSet(ctx, "dual:key", "field", "hashVal").Err())
+	require.NoError(t, rdb.Set(ctx, "dual:key", "stringVal", 0).Err())
+
+	// After the SET, HGET must either see WRONGTYPE (string-priority
+	// guard fires, slow path reports the key as string) or nil (the
+	// hash keys were cleaned up). It must NOT return "hashVal".
+	got, err := rdb.HGet(ctx, "dual:key", "field").Result()
+	if err != nil {
+		require.Contains(t, err.Error(), "WRONGTYPE")
+		return
+	}
+	require.NotEqual(t, "hashVal", got, "fast path must not leak collection data when a string has taken over the key")
+}


⚠️ Potential issue | 🟡 Minor

Smoke test likely does not exercise the guard in practice.

If replaceWithStringTxn (the SET path's cleanup for pre-existing non-string encodings) removes the hash wide-column rows atomically with the SET commit, then any HGET at readTS >= SET.commitTS will never observe the dual-encoded state — the guard's ExistsAt on redisStrKey simply fires against an already-cleaned hash, and the test falls into the "nil or WRONGTYPE" branch for reasons unrelated to hasHigherPriorityStringEncoding.

As the comment at lines 281-285 acknowledges, the race window is tiny and this is a regression-safety smoke test rather than a direct guard assertion. Consider either:

Renaming/documenting it as a "no-panic, no-leak" regression guard rather than a guard-fires test, or

Adding a direct unit test that writes both HashFieldKey(...) and redisStrKey(...) through the store interface (bypassing SET's cleanup) so the guard path is deterministically reached.

Either way, hasHigherPriorityStringEncoding remains lightly covered — worth a follow-up if the dual-encoding corruption scenario is considered important to regression-test.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@adapter/redis_collection_fastpath_test.go` around lines 286 - 307, The smoke test TestRedis_HGET_FastPathGuardDualEncodingSmoke doesn't reliably exercise the guard because replaceWithStringTxn (the SET path) may atomically clean up hash rows, so HGET observes nil/WRONGTYPE for unrelated reasons; either rename/document the test as a "no-panic/no-leak regression" instead of a guard-fires assertion, or add a deterministic unit test that bypasses SET's cleanup by writing both HashFieldKey(...) and redisStrKey(...) directly through the store interface (or lower-level helper) so hasHigherPriorityStringEncoding and the guard path are reached predictably; update test name/comments or add the new test harness exercising the guard explicitly.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Copilot · 2026-04-21T05:50:55Z

+// hasHigherPriorityStringEncoding returns true iff a string-encoded
+// entry exists for key. Matches the "string wins" tiebreaker in
+// rawKeyTypeAt (see adapter/redis_compat_helpers.go), so a corrupt
+// dual-encoded key where both a string and a collection row are
+// present will take the slow path and return WRONGTYPE / nil from
+// keyTypeAt rather than the collection-specific fast-path answer.
+func (r *RedisServer) hasHigherPriorityStringEncoding(ctx context.Context, key []byte, readTS uint64) (bool, error) {
+	exists, err := r.store.ExistsAt(ctx, redisStrKey(key), readTS)
+	if err != nil {
+		return false, cockerrors.WithStack(err)
+	}
+	return exists, nil


The docstring for hasHigherPriorityStringEncoding says it “matches the 'string wins' tiebreaker in rawKeyTypeAt”, but the implementation only checks redisStrKey(key) and does not include the other string encodings rawKeyTypeAt considers (redisHLLKey(key) and the legacy bare key). This is misleading for future maintainers and for callers relying on the comment. Either expand the check to match probeStringTypes’ candidate set, or update the comment/name to explicitly say it only guards against the new-format string encoding.

Addresses copilot follow-up on PR #565. The previous docstring claimed the guard "matches the 'string wins' tiebreaker in rawKeyTypeAt", but rawKeyTypeAt's full tiebreaker also covers redisHLLKey and the legacy bare key, while this guard only probes redisStrKey. The wording was misleading for future maintainers and for callers relying on the comment. Rewrite to be explicit: - name the narrow scope (new-format string encoding only) - call out which dual-encoding cases still slip through (HLL, legacy bare, list encodings that outrank wide-column collections) - justify the narrow scope: a complete mirror would add ~4-5 extra seeks per fast-path hit, undoing the miss-path reordering win from the previous follow-up commit - point callers that need strict alignment at keyTypeAt

coderabbitai

🧹 Nitpick comments (1)

adapter/redis_compat_commands.go (1)
1234-1234: Optional: wide-column misses still do a full-collection load.

When the fast-path ExistsAt/GetAt on the specific field/member misses but the key is a live wide-column hash/set (e.g. HGET k nonexistent_field on a 10k-field hash, or SISMEMBER k nonmember on a large set), the slow path runs keyTypeAt → loadHashAt/loadSetAt, which loads the entire collection just to return nil/0. That is pre-existing behavior — not a regression from this PR — but it means the fast-path win is concentrated on hits only.

A cheap follow-up would be: after a fast-path miss, probe store.HashMetaKey(key) / store.SetMetaKey(key) — if present, the key is a live wide-column collection and you can short-circuit to WriteNull/WriteInt(0) without loadHashAt/loadSetAt. This keeps negative lookups at O(1) seeks for wide-column encodings and only falls through to the full slow path for legacy-blob / WRONGTYPE disambiguation. Defer if out of scope for this PR.

Also applies to: 1674-1674, 1991-1991
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapter/redis_compat_commands.go` at line 1234, After a fast-path miss in the
ExistsAt/GetAt flow (and in callers like sismemberSlow which currently fall
through to keyTypeAt → loadHashAt/loadSetAt), probe the wide-column meta keys
first: check store.HashMetaKey(key) for hash ops or store.SetMetaKey(key) for
set ops; if the meta key exists treat the key as a live wide-column collection
and short-circuit the slow path by emitting WriteNull (for missing hash field)
or WriteInt(0) (for SISMEMBER negative) instead of calling loadHashAt/loadSetAt;
keep falling through to keyTypeAt/load* only when the meta key is absent so
legacy blob/WRONGTYPE handling still occurs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@adapter/redis_compat_commands.go`:
- Line 1234: After a fast-path miss in the ExistsAt/GetAt flow (and in callers
like sismemberSlow which currently fall through to keyTypeAt →
loadHashAt/loadSetAt), probe the wide-column meta keys first: check
store.HashMetaKey(key) for hash ops or store.SetMetaKey(key) for set ops; if the
meta key exists treat the key as a live wide-column collection and short-circuit
the slow path by emitting WriteNull (for missing hash field) or WriteInt(0) (for
SISMEMBER negative) instead of calling loadHashAt/loadSetAt; keep falling
through to keyTypeAt/load* only when the meta key is absent so legacy
blob/WRONGTYPE handling still occurs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f853974c-df25-4ac6-861e-2fb8e65ccfcf

📥 Commits

Reviewing files that changed from the base of the PR and between 80df00c and d3323e0.

📒 Files selected for processing (1)

adapter/redis_compat_commands.go

…pts (#567) ## Summary Follow-up to PR #565 (top-level HGET / HEXISTS / SISMEMBER fast-paths). BullMQ and similar Lua-heavy workloads never saw the benefit because `luaScriptContext.cmdHGet` etc. still went through `hashState` / `setState`, which paid the full `keyTypeAt` probe (~8 seeks post-#563) before loading the value. ### Effect on BullMQ-style scripts A typical BullMQ script (`moveToActive`, `moveToCompleted`, etc.) touches 5–10 keys. Before this PR, each key cost ~8 seeks just for type classification, i.e. **40–80 wasted seeks per script invocation**. After: 1–2 seeks per fast-path hit. ### Changes Port #565's pattern verbatim into the Lua command handlers: | Lua cmd | Fast-path helper reused from #565 | |---------------|-----------------------------------| | `cmdHGet` | `hashFieldFastLookup` | | `cmdHExists` | `hashFieldFastExists` | | `cmdSIsMember`| `setMemberFastExists` | Each handler still consults `c.hashes` / `c.sets` **first**, so in-script mutations (`HSET` followed by `HGET` in the same `Eval`) stay visible — the fast path only fires when the key has not yet been loaded into the script-local cache. Misses fall through to the legacy `hashState` / `setState` loader so legacy-blob encodings, `WRONGTYPE`, and TTL-expired keys behave identically to before. Small helpers factored out (`hgetFromHashState`, `hexistsFromHashState`, `sismemberFromSetState`) to keep each command body under the cyclomatic-complexity cap. ## Test plan - [x] `go test -race -short ./adapter/...` passes - [x] `adapter/redis_lua_collection_fastpath_test.go` drives each command through a real `redis.Eval`: - fast-path hit returns the value / 1 - in-script write (HSET / SADD) is visible to a subsequent HGET / SISMEMBER - miss / unknown field / member → nil / 0 via slow-path fallback - TTL-expired key → nil / 0 - WRONGTYPE propagates as script error ## Depends on - PR #565 (not yet merged) — provides `hashFieldFastLookup`, `hashFieldFastExists`, `setMemberFastExists`. This branch is based on #565; once #565 merges, rebase to main will be clean.

BullMQ and similar Lua-heavy workloads call ZSCORE for delay-queue position checks and job-existence checks; the legacy path went through zsetState (~8-seek keyTypeAt + full zset state init) before a direct GetAt on ZSetMemberKey. Reuse PR #565 / #567's pattern for the single-member ZSET case: - adapter/redis_compat_commands.go: new zsetMemberFastScore helper. Probes store.ZSetMemberKey directly. Probe-first-then-guard ordering matches the post-PR #565 helpers so missing / legacy- blob / wrong-type workloads do not pay the priority-guard seek. Narrow-scope priority guard covers the redisStrKey dual-encoding corruption case only, same scope as hashFieldFastLookup. - adapter/redis_lua_context.go: cmdZScore rewrite. * Enforce exact arity (zscoreArgCount = 2) -- extras rejected, not silently ignored. * luaZSetAlreadyLoaded helper mirrors luaHashAlreadyLoaded / luaSetAlreadyLoaded (from PR #567 follow-up): a loaded zset state (even exists=false) short-circuits to the slow path so repeated ZSCORE on the same missing key does not re-probe. * cachedType defer: prior ZADD / ZREM in this Eval, explicit DEL tombstone, or type change via SET all go through the slow path so zsetState -> keyType -> cachedType produces the correct WRONGTYPE / nil / cached-score answer. * Fast path via zsetMemberFastScore when both guards pass. * zscoreFromSlowPath helper factored out to keep the function body under the cyclomatic-complexity cap. Per-call effect on BullMQ position checks: Before: ~8 seeks (keyTypeAt) + scan / GetAt inside zsetState + GetAt inside memberScore After (fast-path hit): 3 seeks -- ZSetMemberKey GetAt + redisStrKey ExistsAt guard + TTL probe Falls back to the legacy flow on miss so legacy-blob zsets and WRONGTYPE paths retain their prior behaviour. Tests: 10 new ZSCORE cases in redis_lua_collection_fastpath_test.go: fast-path hit / in-script ZADD visible / miss / unknown member / WRONGTYPE / TTL-expired / SET-then-ZSCORE (WRONGTYPE via cachedType guard) / DEL-then-ZSCORE (nil via cachedType guard) / arity too few (1 arg rejected) / arity too many (3 args rejected).

Production pprof after #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding 40 active goroutines (top Lua-path signal), driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via zsetState -> ensureZSetLoaded before filtering by score bounds -- O(N) GetAts per ZRANGEBYSCORE, so an N=1000 delay queue cost several seconds per call and was the dominant residual after the previous keyTypeAt fast-paths landed. Replace the load-then-filter flow with a bounded scan over the pre-sorted score index: - adapter/redis_compat_commands.go adds zsetRangeByScoreFast which translates [startKey, endKey) byte bounds + offset / limit / scoreFilter into a ScanAt / ReverseScanAt on ZSetScoreRangeScanPrefix. Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-is-missing cases bail out with hit=false so the slow path's WRONGTYPE / disambiguation semantics are preserved. zsetRangeEmptyFastResult checks ZSetMetaKey to distinguish "empty range on a live zset" from "no zset here". Split into zsetFastPathEligible, zsetFastScanLimit, zsetScoreScan, decodeZSetScoreRange, finalizeZSetFastRange helpers so the main entrypoint stays under the cyclomatic- complexity cap. - adapter/redis_lua_context.go cmdZRangeByScore consults luaZSetAlreadyLoaded + cachedType (mirrors PR #567 / #568 safety guards) before entering the fast path. The legacy full-load path lives in cmdZRangeByScoreSlow and still runs for in-script mutations / legacy-blob / type-change cases. zsetScoreScanBounds maps zScoreBound (-Inf / value-inc / value-exc / +Inf) to the byte range; exclusive edges are handled post-scan by scoreInRange. Per-call effect on BullMQ-style delay-queue polls with LIMIT: Before: O(N) member GetAts + in-process filter After (fast-path hit): 1 legacy-blob ExistsAt + a single bounded ScanAt (N_range <= offset + limit, clamped by maxWideScanLimit) + 1 priority-guard ExistsAt + 1 TTL probe. Tests: 10 new cases in redis_lua_collection_fastpath_test.go -- fast-path hit / WITHSCORES / LIMIT offset / exclusive bounds / empty range / missing key / WRONGTYPE / in-script ZADD visibility / ZREVRANGEBYSCORE / TTL-expired. All pass under -race along with the existing adapter suite.

## Summary Phase B of the ZSet fast-path plan. Production pprof after PRs #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding **40 active goroutines** — the top residual Lua-path signal — driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via `zsetState` → `ensureZSetLoaded` before filtering by score bounds: O(N) `GetAt`s per `ZRANGEBYSCORE`, so an N=1000 delay queue cost several seconds per call. Replace load-then-filter with a **bounded scan over the pre-sorted score index** (`!zs|scr|<user>|<sortable-score>|<member>`). ### Changes **`adapter/redis_compat_commands.go`** — new `zsetRangeByScoreFast` helper plus small utilities: - Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-missing cases return `hit=false` so the slow path's WRONGTYPE / disambiguation semantics remain intact. - `zsetRangeEmptyFastResult` checks `ZSetMetaKey` to separate "empty range on a live zset" from "no zset here". - Split into `zsetFastPathEligible` / `zsetFastScanLimit` / `zsetScoreScan` / `decodeZSetScoreRange` / `finalizeZSetFastRange` to stay under the cyclomatic-complexity cap. **`adapter/redis_lua_context.go`** — `cmdZRangeByScore` gated on `luaZSetAlreadyLoaded` + `cachedType` (mirrors PR #567 / #568 safety guards): - In-script ZADD / ZREM / DEL / SET / type-change still go through the legacy full-load path (`cmdZRangeByScoreSlow`). - `zsetScoreScanBounds` maps `zScoreBound` (-Inf / value-inc / value-exc / +Inf) to a byte range; exclusive edges are handled post-scan by `scoreInRange`. ### Per-call effect on BullMQ-style delay-queue polls | Scenario | Before | After (fast-path hit) | |---|---|---| | `ZRANGEBYSCORE key 0 now LIMIT 0 10` over N=1000 zset | ~1000 member `GetAt`s | 1 legacy-blob `ExistsAt` + 1 bounded `ScanAt` (≤10 rows) + 1 priority-guard `ExistsAt` + 1 TTL probe | | Empty range | ~1000 `GetAt`s | 1 legacy-blob + 1 `ZSetMetaKey` + 1 TTL | | In-script `ZADD` then `ZRANGEBYSCORE` | unchanged | unchanged (slow path via `luaZSetAlreadyLoaded`) | ## Test plan - [x] `go test -race -short ./adapter/...` passes - [x] 10 new cases in `adapter/redis_lua_collection_fastpath_test.go`: - fast-path hit (ascending) - WITHSCORES - LIMIT + offset - exclusive bounds `(min (max` - empty range on live zset - missing key - WRONGTYPE (key holds a string) - in-script `ZADD` visible to subsequent `ZRANGEBYSCORE` - `ZREVRANGEBYSCORE` (reverse iteration) - TTL-expired zset - [ ] Production: deploy and verify `cmdZRangeByScore` active goroutines drop from ~20+20 back toward the keyTypeAt-style residual. Expect p50 redis.call latency inside Lua to fall from the current 3s toward ms-scale as delay-queue polls stop pulling the whole zset. ## Follow-ups (out of scope) - `ZRANGE` (by rank) streaming — needs the same pattern with positional iteration. - `ZPOPMIN` — writes; needs the streaming scan plus the delete-member transaction plumbing. - `HGETALL` / `HMGET` — separate wide-column prefix scans.  ## Summary by CodeRabbit * **New Features** * Optimized sorted-set range-by-score fast path for faster, paginated queries with correct inclusive/exclusive bound handling. * **Bug Fixes** * Safer fallback to full-path when results may be incomplete; fixed empty-range, truncation, type-mismatch and TTL/expiration behaviors; negative LIMIT offsets now rejected and negative limits treated as “no limit.” * **Tests** * Added extensive Lua-scripted tests covering range reads, WITHSCORES, LIMIT semantics, exclusive bounds, in-script mutations, TTL, and regressions.

bootjp mentioned this pull request Apr 20, 2026

perf(raft): raise stepCh / heartbeat queue capacities to absorb CPU bursts #566

Merged

3 tasks

bootjp requested a review from Copilot April 20, 2026 17:45

Copilot started reviewing on behalf of bootjp April 20, 2026 17:46 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

bootjp requested a review from Copilot April 20, 2026 18:39

Copilot started reviewing on behalf of bootjp April 20, 2026 18:40 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

bootjp requested a review from Copilot April 20, 2026 18:58

Copilot started reviewing on behalf of bootjp April 20, 2026 18:59 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

bootjp requested a review from Copilot April 20, 2026 19:40

Copilot started reviewing on behalf of bootjp April 20, 2026 19:41 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

This was referenced Apr 20, 2026

perf(redis-lua): fast-path HGET / HEXISTS / SISMEMBER inside Lua scripts #567

Merged

perf(redis-lua): fast-path ZSCORE wide-column lookup #568

Closed

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

bootjp requested a review from Copilot April 21, 2026 05:46

Copilot started reviewing on behalf of bootjp April 21, 2026 05:46 View session

Copilot AI reviewed Apr 21, 2026

View reviewed changes

bootjp enabled auto-merge April 21, 2026 06:31

Merge branch 'main' into perf/redis-type-specific-fastpath

99bb599

coderabbitai Bot reviewed Apr 21, 2026

View reviewed changes

bootjp merged commit 363544c into main Apr 21, 2026
8 of 9 checks passed

bootjp deleted the perf/redis-type-specific-fastpath branch April 21, 2026 06:37

bootjp mentioned this pull request Apr 21, 2026

perf(redis-lua): fast-path ZSCORE wide-column lookup #569

Merged

bootjp mentioned this pull request Apr 21, 2026

perf(redis-lua): stream ZRANGEBYSCORE from the score index #570

Merged

3 tasks

Conversation

bootjp commented Apr 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

New flow per command

Per-command effect on a live wide-column key

Test plan

Summary by CodeRabbit

Uh oh!

gemini-code-assist Bot commented Apr 20, 2026

Uh oh!

coderabbitai Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

bootjp commented Apr 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading