perf(redis-lua): fast-path ZSCORE wide-column lookup by bootjp · Pull Request #568 · bootjp/elastickv

bootjp · 2026-04-20T19:54:07Z

Summary

Phase A of the ZSET fast-path plan discussed alongside PR #567. BullMQ and other Lua-heavy workloads call ZSCORE for delay-queue position checks and job-existence checks; the legacy path went through zsetState (~8-seek keyTypeAt + full zset state init) before a direct GetAt on ZSetMemberKey.

Change

Reuse PR #565's pattern for the single-member ZSET case:

adapter/redis_compat_commands.go — new zsetMemberFastScore(ctx, key, member, readTS) helper. Single GetAt on ZSetMemberKey, returns (score, hit, alive). Priority-alignment scope identical to hashFieldFastLookup: only the redisStrKey dual-encoding case is guarded; legacy / HLL / list corner cases fall through to the slow path.
adapter/redis_lua_context.go — cmdZScore consults c.zsets[key] first (so ZADD → ZSCORE inside the same Eval honors unflushed mutations), otherwise calls zsetMemberFastScore, then falls back to zsetState on miss. zscoreFromZSetState helper factored out to keep the function body under the cyclomatic-complexity cap.

What this does not do

Phase B / C (ZRANGEBYSCORE streaming, ZADD / ZREM fast-path) need production profiling before we commit to the deeper refactor. See #567's discussion for the staging rationale.

Per-call effect on BullMQ position checks

Before: ~8 seeks (keyTypeAt) + 1 scan / GetAt inside zsetState + 1 GetAt on ZSetMemberKey inside memberScore.
After (fast-path hit): 2 seeks — 1 string-priority guard + 1 ZSetMemberKey GetAt + 1 TTL probe.

Falls back to the legacy flow on miss so legacy-blob zsets and WRONGTYPE paths retain their prior behaviour.

Test plan

go test -race -short ./adapter/... passes
New test cases in redis_lua_collection_fastpath_test.go:
- fast-path hit
- in-script ZADD visible to subsequent ZSCORE
- miss on nonexistent zset
- unknown member on existing zset
- WRONGTYPE when key is a string
- TTL-expired zset

Depends on

PR perf(redis-lua): fast-path HGET / HEXISTS / SISMEMBER inside Lua scripts #567 (not yet merged) — adds the in-script-cache-first wiring pattern this PR follows. Once perf(redis-lua): fast-path HGET / HEXISTS / SISMEMBER inside Lua scripts #567 merges, rebase to main will be clean.

Phase A of the ZSET fast-path work discussed alongside PR #567. BullMQ and other Lua-heavy workloads use ZSCORE on delay-queue positions and job-existence checks; the legacy path went through zsetState (~8-seek keyTypeAt + full zset state init) before a direct GetAt on ZSetMemberKey. Reuse PR #565's pattern: - adapter/redis_compat_commands.go adds zsetMemberFastScore(ctx, key, member, readTS), a lock-free GetAt on ZSetMemberKey that returns (score, hit, alive). Same priority-alignment scope as hashFieldFastLookup: only the redisStrKey dual-encoding is guarded; legacy / HLL / list corner cases still take the slow fallback. - adapter/redis_lua_context.go cmdZScore consults c.zsets[key] first (so ZADD -> ZSCORE within the same Eval honors the unflushed mutation), otherwise calls zsetMemberFastScore, and only falls back to zsetState on miss. Factor the state-reply path into zscoreFromZSetState to keep the cyclomatic-complexity cap. Phase B / C (ZRANGEBYSCORE streaming, ZADD / ZREM fast-path) need production profiling before we commit to the deeper refactor; this PR is the cheap ZSCORE win that applies immediately to BullMQ job position checks. Tests: redis_lua_collection_fastpath_test.go gets 6 new ZSCORE cases (hit / in-script-ZADD visibility / miss / unknown member / WRONGTYPE / TTL-expired).

gemini-code-assist · 2026-04-20T19:54:10Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-04-20T19:54:13Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 11772e74-68a0-4fa3-a23b-dee33c9765e1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch perf/redis-lua-zscore-fastpath

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…nswer Addresses High-severity finding from the independent review of PR #567. The previous fast path only consulted c.hashes[key] / c.sets[key], which misses the three script-local scenarios where another command earlier in the Eval has published an authoritative type answer via c.cachedType(): 1. A prior DEL sets c.deleted[key]=true but does NOT populate c.hashes[key] unless the hash was already loaded. The fast path would then read pebble at c.startTS and surface pre-script values instead of nil. 2. A prior SET / other-type write populates c.strings[key] etc., changing the logical type to something other than Hash. The slow path's hashState -> keyType -> cachedType chain would return WRONGTYPE; the fast path silently returned the pre-script hash value. 3. A prior RENAME-away of the key, same as (1) for the source. Fix: consult c.cachedType(key) before the fast path. Any authoritative answer (loaded state, explicit deletion, different type) defers to the legacy hashState / setState slow path, which already handles all three cases correctly. Only "unknown" cachedType proceeds to the pebble fast path. Factor hgetFromSlowPath / hexistsFromSlowPath / sismemberFromSlowPath out of each handler to keep the cyclomatic complexity under the linter cap and share the slow-path implementation across the cachedType-cached and fast-path-miss branches. Tests: six new regressions in redis_lua_collection_fastpath_test.go pin each cachedType scenario end-to-end via redis.Eval: - SET-then-HGET / HEXISTS / SISMEMBER -> WRONGTYPE - DEL-then-HGET / HEXISTS / SISMEMBER -> nil / 0 - HSET-then-HGET-on-another-field in the same Eval still honors the cached state for both the seeded field and the unflushed one.

Addresses Medium-severity findings from the independent review of PR #568. - Arity: legacy cmdZScore returned nil when invoked with only a key because the missing-key check fired before the args[1] read. The fast-path intro indexed args[1] unconditionally and would panic on a 1-arg call (no arity validation in the Lua dispatcher). Add explicit `len(args) < 2` guard returning the standard "wrong number of arguments" error. - cachedType defer: align cmdZScore with the PR #567 follow-up fix. Prior SET on the same key (type change to String), DEL tombstone, or a prior ZADD / ZREM in this Eval must all go through the slow path so zsetState -> keyType -> cachedType produces the correct WRONGTYPE / nil / cached-score answer. The previous c.zsets[key]-only check missed DEL and type-change cases. - Factor zscoreFromSlowPath out to share the slow-path body between the cached and miss branches. Test: TestLua_ZSCORE_ArityTooFew covers the 1-arg panic regression; TestLua_ZSCORE_SetThenZScoreReturnsWrongType and TestLua_ZSCORE_DelThenZScoreReturnsNil pin the two cachedType scenarios end-to-end via redis.Eval. All existing ZSCORE tests still pass.

Copilot

Pull request overview

This PR adds a Lua-context fast path for ZSCORE on wide-column ZSETs, reducing storage seeks on common Lua-heavy workloads (e.g., BullMQ) while preserving legacy behavior via slow-path fallback.

Changes:

Add RedisServer.zsetMemberFastScore helper to probe ZSetMemberKey directly with TTL + string-priority guarding.
Update Lua command handlers (HGET, HEXISTS, SISMEMBER, ZSCORE) to defer to the slow path when the script-local cache has an authoritative answer, otherwise use wide-column fast paths.
Expand Lua fast-path regression tests to cover ZSCORE hits/misses, TTL expiry, WRONGTYPE, and cache/arity edge cases.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
`adapter/redis_lua_context.go`	Adds ZSCORE fast-path and factors slow-path helpers; changes cache-guarding logic for HGET/HEXISTS/SISMEMBER/ZSCORE.
`adapter/redis_lua_collection_fastpath_test.go`	Adds end-to-end Lua/EVAL tests for ZSCORE fast-path behavior and cache-guard regressions.
`adapter/redis_compat_commands.go`	Introduces `zsetMemberFastScore` wide-column lookup helper with TTL and priority guard.

Copilot · 2026-04-21T05:53:13Z

+	// See cmdHGet: defer to the slow path whenever cachedType has an
+	// authoritative answer, so prior DEL / RENAME / SET / HSET within
+	// this Eval is honored before the pebble probe fires.
+	if _, cached := c.cachedType(key); cached {
+		return hexistsFromSlowPath(c, key, field)


With the new cachedType guard, a prior hashState() load that determined the hash is missing (loaded=true/exists=false) won’t be considered “cached”, so repeated HEXISTS calls on the same missing key will keep probing Pebble instead of using the already-loaded c.hashes entry. Consider also short-circuiting on an existing loaded hash state (even when exists is false) to avoid repeated storage work within the same script.

Copilot · 2026-04-21T05:53:14Z

+	// See cmdHGet: defer to the slow path whenever cachedType has an
+	// authoritative answer.
+	if _, cached := c.cachedType(key); cached {
+		return sismemberFromSlowPath(c, key, member)


Similar to HGET/HEXISTS: if a set has already been loaded into c.sets and found missing (loaded=true/exists=false), cachedType will not treat it as cached, so repeated SISMEMBER calls on that key will keep hitting the Pebble fast-path + slow-path. Consider short-circuiting when a loaded set state exists (even when exists is false) to avoid repeated storage probes within the same Eval.

Copilot · 2026-04-21T05:53:14Z

 func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {
+	if len(args) < 2 { //nolint:mnd


cmdZScore now only checks len(args) < 2 and will silently ignore any extra arguments (e.g., redis.call("ZSCORE", key, member, extra)), which diverges from Redis' arity enforcement and makes it easier for scripts to accidentally pass bad inputs. Consider requiring exactly 2 arguments and removing the //nolint:mnd by introducing a small named constant for the expected count (similar to xAddMinArgs).

Suggested change

func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {

if len(args) < 2 { //nolint:mnd

const zScoreArgCount = 2

func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {

if len(args) != zScoreArgCount {

Copilot · 2026-04-21T05:53:14Z

+	// pebble state and leak stale data.
+	if _, cached := c.cachedType(key); cached {
+		return hgetFromSlowPath(c, key, field)
 	}


Using cachedType as the only “cache-hit” signal means a key that has already been loaded into c.hashes but found to be non-existent (st.loaded=true/st.exists=false, e.g. a repeated HGET miss) will not be treated as cached and will re-run the Pebble fast-path + slow-path on every call. Consider short-circuiting when a hash state is already present and loaded (even if exists is false), while still using cachedType to protect type-change/DEL cases.

Suggested change

}

}

// Also treat an already-loaded hash state as authoritative, even

// when it represents a known non-existent key. cachedType only

// reports existing typed values, so repeated HGET misses would

// otherwise keep re-running the storage fast path and then fall

// back again. The slow path will reuse the loaded hash state.

if st, ok := c.hashes[string(key)]; ok && st != nil && st.loaded {

return hgetFromSlowPath(c, key, field)

}

…probes Addresses copilot follow-up on PR #567. cachedType() only returns "cached" when a loaded typed value exists. A key that was loaded and found MISSING (loaded=true, exists=false, e.g. a prior HGET miss) is not considered cached, so repeated HGET / HEXISTS / SISMEMBER on that key kept re-running the wide-column probe + slow-path fallback on every redis.call. Add luaHashAlreadyLoaded / luaSetAlreadyLoaded helpers that report whether a state is populated AND loaded, regardless of exists. Each of cmdHGet / cmdHExists / cmdSIsMember checks the loaded helper first (short-circuit to slow path, which returns immediately when c.hashes / c.sets has a loaded entry) before falling back to cachedType. Preserves the High-severity correctness guards for DEL / RENAME / SET-type-change (those still flow through cachedType); just trims the redundant storage probe on the repeated-miss scenario BullMQ- style scripts tend to produce when polling queue state. Test: TestLua_HGET_RepeatedMissReusesLoadedState pins correctness for 4 successive HGET misses on the same key (returns nil|nil|nil|nil). The seek-count saving itself is pprof-verifiable but not unit-testable cheaply; the guard is what matters here.

Addresses two copilot follow-up findings on PR #568. 1. Arity: cmdZScore's `len(args) < 2` accepted extra arguments silently (e.g. `redis.call("ZSCORE", key, member, extra)`), diverging from Redis' arity enforcement. Tighten to `!= 2` and introduce a named zscoreArgCount constant so the //nolint:mnd suppression can come off. 2. Fast-path optimisation gap: cachedType only reports "cached" when a loaded typed value exists. A zset loaded and found MISSING (loaded=true, exists=false -- e.g. a prior ZRANGE miss) was not considered cached, so subsequent ZSCOREs on the same key re-ran the wide-column probe on every call. Add luaZSetAlreadyLoaded helper mirroring luaHashAlreadyLoaded / luaSetAlreadyLoaded from the PR #567 follow-up; cmdZScore now short-circuits to the slow path whenever a loaded state exists. Test: TestLua_ZSCORE_ArityTooMany pins the new arity rejection.

Production pprof after #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding 40 active goroutines (top Lua-path signal), driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via zsetState -> ensureZSetLoaded before filtering by score bounds -- O(N) GetAts per ZRANGEBYSCORE, so an N=1000 delay queue cost several seconds per call and was the dominant residual after the previous keyTypeAt fast-paths landed. Replace the load-then-filter flow with a bounded scan over the pre-sorted score index: - adapter/redis_compat_commands.go adds zsetRangeByScoreFast which translates [startKey, endKey) byte bounds + offset / limit / scoreFilter into a ScanAt / ReverseScanAt on ZSetScoreRangeScanPrefix. Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-is-missing cases bail out with hit=false so the slow path's WRONGTYPE / disambiguation semantics are preserved. zsetRangeEmptyFastResult checks ZSetMetaKey to distinguish "empty range on a live zset" from "no zset here". Split into zsetFastPathEligible, zsetFastScanLimit, zsetScoreScan, decodeZSetScoreRange, finalizeZSetFastRange helpers so the main entrypoint stays under the cyclomatic- complexity cap. - adapter/redis_lua_context.go cmdZRangeByScore consults luaZSetAlreadyLoaded + cachedType (mirrors PR #567 / #568 safety guards) before entering the fast path. The legacy full-load path lives in cmdZRangeByScoreSlow and still runs for in-script mutations / legacy-blob / type-change cases. zsetScoreScanBounds maps zScoreBound (-Inf / value-inc / value-exc / +Inf) to the byte range; exclusive edges are handled post-scan by scoreInRange. Per-call effect on BullMQ-style delay-queue polls with LIMIT: Before: O(N) member GetAts + in-process filter After (fast-path hit): 1 legacy-blob ExistsAt + a single bounded ScanAt (N_range <= offset + limit, clamped by maxWideScanLimit) + 1 priority-guard ExistsAt + 1 TTL probe. Tests: 10 new cases in redis_lua_collection_fastpath_test.go -- fast-path hit / WITHSCORES / LIMIT offset / exclusive bounds / empty range / missing key / WRONGTYPE / in-script ZADD visibility / ZREVRANGEBYSCORE / TTL-expired. All pass under -race along with the existing adapter suite.

## Summary Phase B of the ZSet fast-path plan. Production pprof after PRs #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding **40 active goroutines** — the top residual Lua-path signal — driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via `zsetState` → `ensureZSetLoaded` before filtering by score bounds: O(N) `GetAt`s per `ZRANGEBYSCORE`, so an N=1000 delay queue cost several seconds per call. Replace load-then-filter with a **bounded scan over the pre-sorted score index** (`!zs|scr|<user>|<sortable-score>|<member>`). ### Changes **`adapter/redis_compat_commands.go`** — new `zsetRangeByScoreFast` helper plus small utilities: - Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-missing cases return `hit=false` so the slow path's WRONGTYPE / disambiguation semantics remain intact. - `zsetRangeEmptyFastResult` checks `ZSetMetaKey` to separate "empty range on a live zset" from "no zset here". - Split into `zsetFastPathEligible` / `zsetFastScanLimit` / `zsetScoreScan` / `decodeZSetScoreRange` / `finalizeZSetFastRange` to stay under the cyclomatic-complexity cap. **`adapter/redis_lua_context.go`** — `cmdZRangeByScore` gated on `luaZSetAlreadyLoaded` + `cachedType` (mirrors PR #567 / #568 safety guards): - In-script ZADD / ZREM / DEL / SET / type-change still go through the legacy full-load path (`cmdZRangeByScoreSlow`). - `zsetScoreScanBounds` maps `zScoreBound` (-Inf / value-inc / value-exc / +Inf) to a byte range; exclusive edges are handled post-scan by `scoreInRange`. ### Per-call effect on BullMQ-style delay-queue polls | Scenario | Before | After (fast-path hit) | |---|---|---| | `ZRANGEBYSCORE key 0 now LIMIT 0 10` over N=1000 zset | ~1000 member `GetAt`s | 1 legacy-blob `ExistsAt` + 1 bounded `ScanAt` (≤10 rows) + 1 priority-guard `ExistsAt` + 1 TTL probe | | Empty range | ~1000 `GetAt`s | 1 legacy-blob + 1 `ZSetMetaKey` + 1 TTL | | In-script `ZADD` then `ZRANGEBYSCORE` | unchanged | unchanged (slow path via `luaZSetAlreadyLoaded`) | ## Test plan - [x] `go test -race -short ./adapter/...` passes - [x] 10 new cases in `adapter/redis_lua_collection_fastpath_test.go`: - fast-path hit (ascending) - WITHSCORES - LIMIT + offset - exclusive bounds `(min (max` - empty range on live zset - missing key - WRONGTYPE (key holds a string) - in-script `ZADD` visible to subsequent `ZRANGEBYSCORE` - `ZREVRANGEBYSCORE` (reverse iteration) - TTL-expired zset - [ ] Production: deploy and verify `cmdZRangeByScore` active goroutines drop from ~20+20 back toward the keyTypeAt-style residual. Expect p50 redis.call latency inside Lua to fall from the current 3s toward ms-scale as delay-queue polls stop pulling the whole zset. ## Follow-ups (out of scope) - `ZRANGE` (by rank) streaming — needs the same pattern with positional iteration. - `ZPOPMIN` — writes; needs the streaming scan plus the delete-member transaction plumbing. - `HGETALL` / `HMGET` — separate wide-column prefix scans.  ## Summary by CodeRabbit * **New Features** * Optimized sorted-set range-by-score fast path for faster, paginated queries with correct inclusive/exclusive bound handling. * **Bug Fixes** * Safer fallback to full-path when results may be incomplete; fixed empty-range, truncation, type-mismatch and TTL/expiration behaviors; negative LIMIT offsets now rejected and negative limits treated as “no limit.” * **Tests** * Added extensive Lua-scripted tests covering range reads, WITHSCORES, LIMIT semantics, exclusive bounds, in-script mutations, TTL, and regressions.

bootjp added 2 commits April 21, 2026 05:29

bootjp requested a review from Copilot April 21, 2026 05:46

Copilot started reviewing on behalf of bootjp April 21, 2026 05:46 View session

Copilot AI reviewed Apr 21, 2026

View reviewed changes

bootjp added 2 commits April 21, 2026 15:28

Base automatically changed from perf/redis-lua-fastpath to main April 21, 2026 06:40

bootjp enabled auto-merge April 21, 2026 06:41

bootjp closed this Apr 21, 2026

auto-merge was automatically disabled April 21, 2026 06:50
Pull request was closed

bootjp mentioned this pull request Apr 21, 2026

perf(redis-lua): stream ZRANGEBYSCORE from the score index #570

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(redis-lua): fast-path ZSCORE wide-column lookup#568

perf(redis-lua): fast-path ZSCORE wide-column lookup#568
bootjp wants to merge 5 commits intomainfrom
perf/redis-lua-zscore-fastpath

bootjp commented Apr 20, 2026

Uh oh!

gemini-code-assist Bot commented Apr 20, 2026

Uh oh!

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading

Review skipped

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {
		if len(args) < 2 { //nolint:mnd

-	}
+	}
+	// Also treat an already-loaded hash state as authoritative, even
+	// when it represents a known non-existent key. cachedType only
+	// reports existing typed values, so repeated HGET misses would
+	// otherwise keep re-running the storage fast path and then fall
+	// back again. The slow path will reuse the loaded hash state.
+	if st, ok := c.hashes[string(key)]; ok && st != nil && st.loaded {
+		return hgetFromSlowPath(c, key, field)
+	}

Conversation

bootjp commented Apr 20, 2026

Summary

Change

What this does not do

Per-call effect on BullMQ position checks

Test plan

Depends on

Uh oh!

gemini-code-assist Bot commented Apr 20, 2026

Uh oh!

coderabbitai Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading