Skip to content

perf(redis-lua): fast-path ZSCORE wide-column lookup#568

Closed
bootjp wants to merge 5 commits intomainfrom
perf/redis-lua-zscore-fastpath
Closed

perf(redis-lua): fast-path ZSCORE wide-column lookup#568
bootjp wants to merge 5 commits intomainfrom
perf/redis-lua-zscore-fastpath

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 20, 2026

Summary

Phase A of the ZSET fast-path plan discussed alongside PR #567. BullMQ and other Lua-heavy workloads call ZSCORE for delay-queue position checks and job-existence checks; the legacy path went through zsetState (~8-seek keyTypeAt + full zset state init) before a direct GetAt on ZSetMemberKey.

Change

Reuse PR #565's pattern for the single-member ZSET case:

  • adapter/redis_compat_commands.go — new zsetMemberFastScore(ctx, key, member, readTS) helper. Single GetAt on ZSetMemberKey, returns (score, hit, alive). Priority-alignment scope identical to hashFieldFastLookup: only the redisStrKey dual-encoding case is guarded; legacy / HLL / list corner cases fall through to the slow path.
  • adapter/redis_lua_context.gocmdZScore consults c.zsets[key] first (so ZADDZSCORE inside the same Eval honors unflushed mutations), otherwise calls zsetMemberFastScore, then falls back to zsetState on miss. zscoreFromZSetState helper factored out to keep the function body under the cyclomatic-complexity cap.

What this does not do

Phase B / C (ZRANGEBYSCORE streaming, ZADD / ZREM fast-path) need production profiling before we commit to the deeper refactor. See #567's discussion for the staging rationale.

Per-call effect on BullMQ position checks

  • Before: ~8 seeks (keyTypeAt) + 1 scan / GetAt inside zsetState + 1 GetAt on ZSetMemberKey inside memberScore.
  • After (fast-path hit): 2 seeks — 1 string-priority guard + 1 ZSetMemberKey GetAt + 1 TTL probe.

Falls back to the legacy flow on miss so legacy-blob zsets and WRONGTYPE paths retain their prior behaviour.

Test plan

  • go test -race -short ./adapter/... passes
  • New test cases in redis_lua_collection_fastpath_test.go:
    • fast-path hit
    • in-script ZADD visible to subsequent ZSCORE
    • miss on nonexistent zset
    • unknown member on existing zset
    • WRONGTYPE when key is a string
    • TTL-expired zset

Depends on

Phase A of the ZSET fast-path work discussed alongside PR #567.
BullMQ and other Lua-heavy workloads use ZSCORE on delay-queue
positions and job-existence checks; the legacy path went through
zsetState (~8-seek keyTypeAt + full zset state init) before a
direct GetAt on ZSetMemberKey.

Reuse PR #565's pattern:

- adapter/redis_compat_commands.go adds zsetMemberFastScore(ctx,
  key, member, readTS), a lock-free GetAt on ZSetMemberKey that
  returns (score, hit, alive). Same priority-alignment scope as
  hashFieldFastLookup: only the redisStrKey dual-encoding is
  guarded; legacy / HLL / list corner cases still take the slow
  fallback.
- adapter/redis_lua_context.go cmdZScore consults c.zsets[key]
  first (so ZADD -> ZSCORE within the same Eval honors the
  unflushed mutation), otherwise calls zsetMemberFastScore, and
  only falls back to zsetState on miss. Factor the state-reply
  path into zscoreFromZSetState to keep the cyclomatic-complexity
  cap.

Phase B / C (ZRANGEBYSCORE streaming, ZADD / ZREM fast-path) need
production profiling before we commit to the deeper refactor; this
PR is the cheap ZSCORE win that applies immediately to BullMQ job
position checks.

Tests: redis_lua_collection_fastpath_test.go gets 6 new ZSCORE
cases (hit / in-script-ZADD visibility / miss / unknown member /
WRONGTYPE / TTL-expired).
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 11772e74-68a0-4fa3-a23b-dee33c9765e1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/redis-lua-zscore-fastpath

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

bootjp added 2 commits April 21, 2026 05:29
…nswer

Addresses High-severity finding from the independent review of PR #567.

The previous fast path only consulted c.hashes[key] / c.sets[key],
which misses the three script-local scenarios where another command
earlier in the Eval has published an authoritative type answer via
c.cachedType():

1. A prior DEL sets c.deleted[key]=true but does NOT populate
   c.hashes[key] unless the hash was already loaded. The fast path
   would then read pebble at c.startTS and surface pre-script
   values instead of nil.
2. A prior SET / other-type write populates c.strings[key] etc.,
   changing the logical type to something other than Hash. The slow
   path's hashState -> keyType -> cachedType chain would return
   WRONGTYPE; the fast path silently returned the pre-script hash
   value.
3. A prior RENAME-away of the key, same as (1) for the source.

Fix: consult c.cachedType(key) before the fast path. Any
authoritative answer (loaded state, explicit deletion, different
type) defers to the legacy hashState / setState slow path, which
already handles all three cases correctly. Only "unknown" cachedType
proceeds to the pebble fast path.

Factor hgetFromSlowPath / hexistsFromSlowPath / sismemberFromSlowPath
out of each handler to keep the cyclomatic complexity under the
linter cap and share the slow-path implementation across the
cachedType-cached and fast-path-miss branches.

Tests: six new regressions in redis_lua_collection_fastpath_test.go
pin each cachedType scenario end-to-end via redis.Eval:

- SET-then-HGET / HEXISTS / SISMEMBER -> WRONGTYPE
- DEL-then-HGET / HEXISTS / SISMEMBER -> nil / 0
- HSET-then-HGET-on-another-field in the same Eval still honors the
  cached state for both the seeded field and the unflushed one.
Addresses Medium-severity findings from the independent review of
PR #568.

- Arity: legacy cmdZScore returned nil when invoked with only a key
  because the missing-key check fired before the args[1] read.
  The fast-path intro indexed args[1] unconditionally and would
  panic on a 1-arg call (no arity validation in the Lua
  dispatcher). Add explicit `len(args) < 2` guard returning the
  standard "wrong number of arguments" error.
- cachedType defer: align cmdZScore with the PR #567 follow-up
  fix. Prior SET on the same key (type change to String), DEL
  tombstone, or a prior ZADD / ZREM in this Eval must all go
  through the slow path so zsetState -> keyType -> cachedType
  produces the correct WRONGTYPE / nil / cached-score answer. The
  previous c.zsets[key]-only check missed DEL and type-change
  cases.
- Factor zscoreFromSlowPath out to share the slow-path body
  between the cached and miss branches.

Test: TestLua_ZSCORE_ArityTooFew covers the 1-arg panic regression;
TestLua_ZSCORE_SetThenZScoreReturnsWrongType and
TestLua_ZSCORE_DelThenZScoreReturnsNil pin the two cachedType
scenarios end-to-end via redis.Eval. All existing ZSCORE tests
still pass.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a Lua-context fast path for ZSCORE on wide-column ZSETs, reducing storage seeks on common Lua-heavy workloads (e.g., BullMQ) while preserving legacy behavior via slow-path fallback.

Changes:

  • Add RedisServer.zsetMemberFastScore helper to probe ZSetMemberKey directly with TTL + string-priority guarding.
  • Update Lua command handlers (HGET, HEXISTS, SISMEMBER, ZSCORE) to defer to the slow path when the script-local cache has an authoritative answer, otherwise use wide-column fast paths.
  • Expand Lua fast-path regression tests to cover ZSCORE hits/misses, TTL expiry, WRONGTYPE, and cache/arity edge cases.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
adapter/redis_lua_context.go Adds ZSCORE fast-path and factors slow-path helpers; changes cache-guarding logic for HGET/HEXISTS/SISMEMBER/ZSCORE.
adapter/redis_lua_collection_fastpath_test.go Adds end-to-end Lua/EVAL tests for ZSCORE fast-path behavior and cache-guard regressions.
adapter/redis_compat_commands.go Introduces zsetMemberFastScore wide-column lookup helper with TTL and priority guard.

Comment thread adapter/redis_lua_context.go Outdated
Comment on lines +1559 to +1563
// See cmdHGet: defer to the slow path whenever cachedType has an
// authoritative answer, so prior DEL / RENAME / SET / HSET within
// this Eval is honored before the pebble probe fires.
if _, cached := c.cachedType(key); cached {
return hexistsFromSlowPath(c, key, field)
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new cachedType guard, a prior hashState() load that determined the hash is missing (loaded=true/exists=false) won’t be considered “cached”, so repeated HEXISTS calls on the same missing key will keep probing Pebble instead of using the already-loaded c.hashes entry. Consider also short-circuiting on an existing loaded hash state (even when exists is false) to avoid repeated storage work within the same script.

Copilot uses AI. Check for mistakes.
Comment thread adapter/redis_lua_context.go Outdated
Comment on lines +2128 to +2131
// See cmdHGet: defer to the slow path whenever cachedType has an
// authoritative answer.
if _, cached := c.cachedType(key); cached {
return sismemberFromSlowPath(c, key, member)
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to HGET/HEXISTS: if a set has already been loaded into c.sets and found missing (loaded=true/exists=false), cachedType will not treat it as cached, so repeated SISMEMBER calls on that key will keep hitting the Pebble fast-path + slow-path. Consider short-circuiting when a loaded set state exists (even when exists is false) to avoid repeated storage probes within the same Eval.

Copilot uses AI. Check for mistakes.
Comment thread adapter/redis_lua_context.go Outdated
Comment on lines +2482 to +2483
func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {
if len(args) < 2 { //nolint:mnd
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmdZScore now only checks len(args) < 2 and will silently ignore any extra arguments (e.g., redis.call("ZSCORE", key, member, extra)), which diverges from Redis' arity enforcement and makes it easier for scripts to accidentally pass bad inputs. Consider requiring exactly 2 arguments and removing the //nolint:mnd by introducing a small named constant for the expected count (similar to xAddMinArgs).

Suggested change
func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {
if len(args) < 2 { //nolint:mnd
const zScoreArgCount = 2
func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) {
if len(args) != zScoreArgCount {

Copilot uses AI. Check for mistakes.
// pebble state and leak stale data.
if _, cached := c.cachedType(key); cached {
return hgetFromSlowPath(c, key, field)
}
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using cachedType as the only “cache-hit” signal means a key that has already been loaded into c.hashes but found to be non-existent (st.loaded=true/st.exists=false, e.g. a repeated HGET miss) will not be treated as cached and will re-run the Pebble fast-path + slow-path on every call. Consider short-circuiting when a hash state is already present and loaded (even if exists is false), while still using cachedType to protect type-change/DEL cases.

Suggested change
}
}
// Also treat an already-loaded hash state as authoritative, even
// when it represents a known non-existent key. cachedType only
// reports existing typed values, so repeated HGET misses would
// otherwise keep re-running the storage fast path and then fall
// back again. The slow path will reuse the loaded hash state.
if st, ok := c.hashes[string(key)]; ok && st != nil && st.loaded {
return hgetFromSlowPath(c, key, field)
}

Copilot uses AI. Check for mistakes.
bootjp added 2 commits April 21, 2026 15:28
…probes

Addresses copilot follow-up on PR #567.

cachedType() only returns "cached" when a loaded typed value exists.
A key that was loaded and found MISSING (loaded=true, exists=false,
e.g. a prior HGET miss) is not considered cached, so repeated HGET /
HEXISTS / SISMEMBER on that key kept re-running the wide-column
probe + slow-path fallback on every redis.call.

Add luaHashAlreadyLoaded / luaSetAlreadyLoaded helpers that report
whether a state is populated AND loaded, regardless of exists. Each
of cmdHGet / cmdHExists / cmdSIsMember checks the loaded helper
first (short-circuit to slow path, which returns immediately when
c.hashes / c.sets has a loaded entry) before falling back to
cachedType.

Preserves the High-severity correctness guards for DEL / RENAME /
SET-type-change (those still flow through cachedType); just trims
the redundant storage probe on the repeated-miss scenario BullMQ-
style scripts tend to produce when polling queue state.

Test: TestLua_HGET_RepeatedMissReusesLoadedState pins correctness
for 4 successive HGET misses on the same key (returns nil|nil|nil|nil).
The seek-count saving itself is pprof-verifiable but not
unit-testable cheaply; the guard is what matters here.
Addresses two copilot follow-up findings on PR #568.

1. Arity: cmdZScore's `len(args) < 2` accepted extra arguments
   silently (e.g. `redis.call("ZSCORE", key, member, extra)`),
   diverging from Redis' arity enforcement. Tighten to `!= 2` and
   introduce a named zscoreArgCount constant so the //nolint:mnd
   suppression can come off.
2. Fast-path optimisation gap: cachedType only reports "cached"
   when a loaded typed value exists. A zset loaded and found
   MISSING (loaded=true, exists=false -- e.g. a prior ZRANGE miss)
   was not considered cached, so subsequent ZSCOREs on the same
   key re-ran the wide-column probe on every call. Add
   luaZSetAlreadyLoaded helper mirroring luaHashAlreadyLoaded /
   luaSetAlreadyLoaded from the PR #567 follow-up; cmdZScore now
   short-circuits to the slow path whenever a loaded state exists.

Test: TestLua_ZSCORE_ArityTooMany pins the new arity rejection.
Base automatically changed from perf/redis-lua-fastpath to main April 21, 2026 06:40
@bootjp bootjp enabled auto-merge April 21, 2026 06:41
@bootjp bootjp closed this Apr 21, 2026
auto-merge was automatically disabled April 21, 2026 06:50

Pull request was closed

bootjp added a commit that referenced this pull request Apr 21, 2026
Production pprof after #565 / #567 / #568 showed
`cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding 40 active
goroutines (top Lua-path signal), driven by BullMQ delay-queue
polls. The legacy path loaded ALL zset members via
zsetState -> ensureZSetLoaded before filtering by score bounds --
O(N) GetAts per ZRANGEBYSCORE, so an N=1000 delay queue cost
several seconds per call and was the dominant residual after the
previous keyTypeAt fast-paths landed.

Replace the load-then-filter flow with a bounded scan over the
pre-sorted score index:

- adapter/redis_compat_commands.go adds zsetRangeByScoreFast which
  translates [startKey, endKey) byte bounds + offset / limit /
  scoreFilter into a ScanAt / ReverseScanAt on
  ZSetScoreRangeScanPrefix. Legacy-blob zsets, string-priority
  corruption, and empty-result-but-zset-is-missing cases bail out
  with hit=false so the slow path's WRONGTYPE / disambiguation
  semantics are preserved. zsetRangeEmptyFastResult checks
  ZSetMetaKey to distinguish "empty range on a live zset" from "no
  zset here". Split into zsetFastPathEligible, zsetFastScanLimit,
  zsetScoreScan, decodeZSetScoreRange, finalizeZSetFastRange
  helpers so the main entrypoint stays under the cyclomatic-
  complexity cap.
- adapter/redis_lua_context.go cmdZRangeByScore consults
  luaZSetAlreadyLoaded + cachedType (mirrors PR #567 / #568 safety
  guards) before entering the fast path. The legacy full-load path
  lives in cmdZRangeByScoreSlow and still runs for in-script
  mutations / legacy-blob / type-change cases. zsetScoreScanBounds
  maps zScoreBound (-Inf / value-inc / value-exc / +Inf) to the
  byte range; exclusive edges are handled post-scan by
  scoreInRange.

Per-call effect on BullMQ-style delay-queue polls with LIMIT:
  Before: O(N) member GetAts + in-process filter
  After  (fast-path hit): 1 legacy-blob ExistsAt + a single bounded
           ScanAt (N_range <= offset + limit, clamped by
           maxWideScanLimit) + 1 priority-guard ExistsAt + 1 TTL
           probe.

Tests: 10 new cases in redis_lua_collection_fastpath_test.go --
fast-path hit / WITHSCORES / LIMIT offset / exclusive bounds /
empty range / missing key / WRONGTYPE / in-script ZADD visibility
/ ZREVRANGEBYSCORE / TTL-expired. All pass under -race along with
the existing adapter suite.
bootjp added a commit that referenced this pull request Apr 21, 2026
## Summary
Phase B of the ZSet fast-path plan. Production pprof after PRs #565 /
#567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding
**40 active goroutines** — the top residual Lua-path signal — driven by
BullMQ delay-queue polls. The legacy path loaded ALL zset members via
`zsetState` → `ensureZSetLoaded` before filtering by score bounds: O(N)
`GetAt`s per `ZRANGEBYSCORE`, so an N=1000 delay queue cost several
seconds per call.

Replace load-then-filter with a **bounded scan over the pre-sorted score
index** (`!zs|scr|<user>|<sortable-score>|<member>`).

### Changes

**`adapter/redis_compat_commands.go`** — new `zsetRangeByScoreFast`
helper plus small utilities:
- Legacy-blob zsets, string-priority corruption, and
empty-result-but-zset-missing cases return `hit=false` so the slow
path's WRONGTYPE / disambiguation semantics remain intact.
- `zsetRangeEmptyFastResult` checks `ZSetMetaKey` to separate "empty
range on a live zset" from "no zset here".
- Split into `zsetFastPathEligible` / `zsetFastScanLimit` /
`zsetScoreScan` / `decodeZSetScoreRange` / `finalizeZSetFastRange` to
stay under the cyclomatic-complexity cap.

**`adapter/redis_lua_context.go`** — `cmdZRangeByScore` gated on
`luaZSetAlreadyLoaded` + `cachedType` (mirrors PR #567 / #568 safety
guards):
- In-script ZADD / ZREM / DEL / SET / type-change still go through the
legacy full-load path (`cmdZRangeByScoreSlow`).
- `zsetScoreScanBounds` maps `zScoreBound` (-Inf / value-inc / value-exc
/ +Inf) to a byte range; exclusive edges are handled post-scan by
`scoreInRange`.

### Per-call effect on BullMQ-style delay-queue polls

| Scenario | Before | After (fast-path hit) |
|---|---|---|
| `ZRANGEBYSCORE key 0 now LIMIT 0 10` over N=1000 zset | ~1000 member
`GetAt`s | 1 legacy-blob `ExistsAt` + 1 bounded `ScanAt` (≤10 rows) + 1
priority-guard `ExistsAt` + 1 TTL probe |
| Empty range | ~1000 `GetAt`s | 1 legacy-blob + 1 `ZSetMetaKey` + 1 TTL
|
| In-script `ZADD` then `ZRANGEBYSCORE` | unchanged | unchanged (slow
path via `luaZSetAlreadyLoaded`) |

## Test plan
- [x] `go test -race -short ./adapter/...` passes
- [x] 10 new cases in `adapter/redis_lua_collection_fastpath_test.go`:
  - fast-path hit (ascending)
  - WITHSCORES
  - LIMIT + offset
  - exclusive bounds `(min (max`
  - empty range on live zset
  - missing key
  - WRONGTYPE (key holds a string)
  - in-script `ZADD` visible to subsequent `ZRANGEBYSCORE`
  - `ZREVRANGEBYSCORE` (reverse iteration)
  - TTL-expired zset
- [ ] Production: deploy and verify `cmdZRangeByScore` active goroutines
drop from ~20+20 back toward the keyTypeAt-style residual. Expect p50
redis.call latency inside Lua to fall from the current 3s toward
ms-scale as delay-queue polls stop pulling the whole zset.

## Follow-ups (out of scope)
- `ZRANGE` (by rank) streaming — needs the same pattern with positional
iteration.
- `ZPOPMIN` — writes; needs the streaming scan plus the delete-member
transaction plumbing.
- `HGETALL` / `HMGET` — separate wide-column prefix scans.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Optimized sorted-set range-by-score fast path for faster, paginated
queries with correct inclusive/exclusive bound handling.

* **Bug Fixes**
* Safer fallback to full-path when results may be incomplete; fixed
empty-range, truncation, type-mismatch and TTL/expiration behaviors;
negative LIMIT offsets now rejected and negative limits treated as “no
limit.”

* **Tests**
* Added extensive Lua-scripted tests covering range reads, WITHSCORES,
LIMIT semantics, exclusive bounds, in-script mutations, TTL, and
regressions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants