perf(redis-lua): fast-path ZSCORE wide-column lookup#568
perf(redis-lua): fast-path ZSCORE wide-column lookup#568
Conversation
Phase A of the ZSET fast-path work discussed alongside PR #567. BullMQ and other Lua-heavy workloads use ZSCORE on delay-queue positions and job-existence checks; the legacy path went through zsetState (~8-seek keyTypeAt + full zset state init) before a direct GetAt on ZSetMemberKey. Reuse PR #565's pattern: - adapter/redis_compat_commands.go adds zsetMemberFastScore(ctx, key, member, readTS), a lock-free GetAt on ZSetMemberKey that returns (score, hit, alive). Same priority-alignment scope as hashFieldFastLookup: only the redisStrKey dual-encoding is guarded; legacy / HLL / list corner cases still take the slow fallback. - adapter/redis_lua_context.go cmdZScore consults c.zsets[key] first (so ZADD -> ZSCORE within the same Eval honors the unflushed mutation), otherwise calls zsetMemberFastScore, and only falls back to zsetState on miss. Factor the state-reply path into zscoreFromZSetState to keep the cyclomatic-complexity cap. Phase B / C (ZRANGEBYSCORE streaming, ZADD / ZREM fast-path) need production profiling before we commit to the deeper refactor; this PR is the cheap ZSCORE win that applies immediately to BullMQ job position checks. Tests: redis_lua_collection_fastpath_test.go gets 6 new ZSCORE cases (hit / in-script-ZADD visibility / miss / unknown member / WRONGTYPE / TTL-expired).
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…nswer Addresses High-severity finding from the independent review of PR #567. The previous fast path only consulted c.hashes[key] / c.sets[key], which misses the three script-local scenarios where another command earlier in the Eval has published an authoritative type answer via c.cachedType(): 1. A prior DEL sets c.deleted[key]=true but does NOT populate c.hashes[key] unless the hash was already loaded. The fast path would then read pebble at c.startTS and surface pre-script values instead of nil. 2. A prior SET / other-type write populates c.strings[key] etc., changing the logical type to something other than Hash. The slow path's hashState -> keyType -> cachedType chain would return WRONGTYPE; the fast path silently returned the pre-script hash value. 3. A prior RENAME-away of the key, same as (1) for the source. Fix: consult c.cachedType(key) before the fast path. Any authoritative answer (loaded state, explicit deletion, different type) defers to the legacy hashState / setState slow path, which already handles all three cases correctly. Only "unknown" cachedType proceeds to the pebble fast path. Factor hgetFromSlowPath / hexistsFromSlowPath / sismemberFromSlowPath out of each handler to keep the cyclomatic complexity under the linter cap and share the slow-path implementation across the cachedType-cached and fast-path-miss branches. Tests: six new regressions in redis_lua_collection_fastpath_test.go pin each cachedType scenario end-to-end via redis.Eval: - SET-then-HGET / HEXISTS / SISMEMBER -> WRONGTYPE - DEL-then-HGET / HEXISTS / SISMEMBER -> nil / 0 - HSET-then-HGET-on-another-field in the same Eval still honors the cached state for both the seeded field and the unflushed one.
Addresses Medium-severity findings from the independent review of PR #568. - Arity: legacy cmdZScore returned nil when invoked with only a key because the missing-key check fired before the args[1] read. The fast-path intro indexed args[1] unconditionally and would panic on a 1-arg call (no arity validation in the Lua dispatcher). Add explicit `len(args) < 2` guard returning the standard "wrong number of arguments" error. - cachedType defer: align cmdZScore with the PR #567 follow-up fix. Prior SET on the same key (type change to String), DEL tombstone, or a prior ZADD / ZREM in this Eval must all go through the slow path so zsetState -> keyType -> cachedType produces the correct WRONGTYPE / nil / cached-score answer. The previous c.zsets[key]-only check missed DEL and type-change cases. - Factor zscoreFromSlowPath out to share the slow-path body between the cached and miss branches. Test: TestLua_ZSCORE_ArityTooFew covers the 1-arg panic regression; TestLua_ZSCORE_SetThenZScoreReturnsWrongType and TestLua_ZSCORE_DelThenZScoreReturnsNil pin the two cachedType scenarios end-to-end via redis.Eval. All existing ZSCORE tests still pass.
There was a problem hiding this comment.
Pull request overview
This PR adds a Lua-context fast path for ZSCORE on wide-column ZSETs, reducing storage seeks on common Lua-heavy workloads (e.g., BullMQ) while preserving legacy behavior via slow-path fallback.
Changes:
- Add
RedisServer.zsetMemberFastScorehelper to probeZSetMemberKeydirectly with TTL + string-priority guarding. - Update Lua command handlers (
HGET,HEXISTS,SISMEMBER,ZSCORE) to defer to the slow path when the script-local cache has an authoritative answer, otherwise use wide-column fast paths. - Expand Lua fast-path regression tests to cover ZSCORE hits/misses, TTL expiry, WRONGTYPE, and cache/arity edge cases.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
adapter/redis_lua_context.go |
Adds ZSCORE fast-path and factors slow-path helpers; changes cache-guarding logic for HGET/HEXISTS/SISMEMBER/ZSCORE. |
adapter/redis_lua_collection_fastpath_test.go |
Adds end-to-end Lua/EVAL tests for ZSCORE fast-path behavior and cache-guard regressions. |
adapter/redis_compat_commands.go |
Introduces zsetMemberFastScore wide-column lookup helper with TTL and priority guard. |
| // See cmdHGet: defer to the slow path whenever cachedType has an | ||
| // authoritative answer, so prior DEL / RENAME / SET / HSET within | ||
| // this Eval is honored before the pebble probe fires. | ||
| if _, cached := c.cachedType(key); cached { | ||
| return hexistsFromSlowPath(c, key, field) |
There was a problem hiding this comment.
With the new cachedType guard, a prior hashState() load that determined the hash is missing (loaded=true/exists=false) won’t be considered “cached”, so repeated HEXISTS calls on the same missing key will keep probing Pebble instead of using the already-loaded c.hashes entry. Consider also short-circuiting on an existing loaded hash state (even when exists is false) to avoid repeated storage work within the same script.
| // See cmdHGet: defer to the slow path whenever cachedType has an | ||
| // authoritative answer. | ||
| if _, cached := c.cachedType(key); cached { | ||
| return sismemberFromSlowPath(c, key, member) |
There was a problem hiding this comment.
Similar to HGET/HEXISTS: if a set has already been loaded into c.sets and found missing (loaded=true/exists=false), cachedType will not treat it as cached, so repeated SISMEMBER calls on that key will keep hitting the Pebble fast-path + slow-path. Consider short-circuiting when a loaded set state exists (even when exists is false) to avoid repeated storage probes within the same Eval.
| func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) { | ||
| if len(args) < 2 { //nolint:mnd |
There was a problem hiding this comment.
cmdZScore now only checks len(args) < 2 and will silently ignore any extra arguments (e.g., redis.call("ZSCORE", key, member, extra)), which diverges from Redis' arity enforcement and makes it easier for scripts to accidentally pass bad inputs. Consider requiring exactly 2 arguments and removing the //nolint:mnd by introducing a small named constant for the expected count (similar to xAddMinArgs).
| func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) { | |
| if len(args) < 2 { //nolint:mnd | |
| const zScoreArgCount = 2 | |
| func (c *luaScriptContext) cmdZScore(args []string) (luaReply, error) { | |
| if len(args) != zScoreArgCount { |
| // pebble state and leak stale data. | ||
| if _, cached := c.cachedType(key); cached { | ||
| return hgetFromSlowPath(c, key, field) | ||
| } |
There was a problem hiding this comment.
Using cachedType as the only “cache-hit” signal means a key that has already been loaded into c.hashes but found to be non-existent (st.loaded=true/st.exists=false, e.g. a repeated HGET miss) will not be treated as cached and will re-run the Pebble fast-path + slow-path on every call. Consider short-circuiting when a hash state is already present and loaded (even if exists is false), while still using cachedType to protect type-change/DEL cases.
| } | |
| } | |
| // Also treat an already-loaded hash state as authoritative, even | |
| // when it represents a known non-existent key. cachedType only | |
| // reports existing typed values, so repeated HGET misses would | |
| // otherwise keep re-running the storage fast path and then fall | |
| // back again. The slow path will reuse the loaded hash state. | |
| if st, ok := c.hashes[string(key)]; ok && st != nil && st.loaded { | |
| return hgetFromSlowPath(c, key, field) | |
| } |
…probes Addresses copilot follow-up on PR #567. cachedType() only returns "cached" when a loaded typed value exists. A key that was loaded and found MISSING (loaded=true, exists=false, e.g. a prior HGET miss) is not considered cached, so repeated HGET / HEXISTS / SISMEMBER on that key kept re-running the wide-column probe + slow-path fallback on every redis.call. Add luaHashAlreadyLoaded / luaSetAlreadyLoaded helpers that report whether a state is populated AND loaded, regardless of exists. Each of cmdHGet / cmdHExists / cmdSIsMember checks the loaded helper first (short-circuit to slow path, which returns immediately when c.hashes / c.sets has a loaded entry) before falling back to cachedType. Preserves the High-severity correctness guards for DEL / RENAME / SET-type-change (those still flow through cachedType); just trims the redundant storage probe on the repeated-miss scenario BullMQ- style scripts tend to produce when polling queue state. Test: TestLua_HGET_RepeatedMissReusesLoadedState pins correctness for 4 successive HGET misses on the same key (returns nil|nil|nil|nil). The seek-count saving itself is pprof-verifiable but not unit-testable cheaply; the guard is what matters here.
Addresses two copilot follow-up findings on PR #568. 1. Arity: cmdZScore's `len(args) < 2` accepted extra arguments silently (e.g. `redis.call("ZSCORE", key, member, extra)`), diverging from Redis' arity enforcement. Tighten to `!= 2` and introduce a named zscoreArgCount constant so the //nolint:mnd suppression can come off. 2. Fast-path optimisation gap: cachedType only reports "cached" when a loaded typed value exists. A zset loaded and found MISSING (loaded=true, exists=false -- e.g. a prior ZRANGE miss) was not considered cached, so subsequent ZSCOREs on the same key re-ran the wide-column probe on every call. Add luaZSetAlreadyLoaded helper mirroring luaHashAlreadyLoaded / luaSetAlreadyLoaded from the PR #567 follow-up; cmdZScore now short-circuits to the slow path whenever a loaded state exists. Test: TestLua_ZSCORE_ArityTooMany pins the new arity rejection.
Pull request was closed
Production pprof after #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding 40 active goroutines (top Lua-path signal), driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via zsetState -> ensureZSetLoaded before filtering by score bounds -- O(N) GetAts per ZRANGEBYSCORE, so an N=1000 delay queue cost several seconds per call and was the dominant residual after the previous keyTypeAt fast-paths landed. Replace the load-then-filter flow with a bounded scan over the pre-sorted score index: - adapter/redis_compat_commands.go adds zsetRangeByScoreFast which translates [startKey, endKey) byte bounds + offset / limit / scoreFilter into a ScanAt / ReverseScanAt on ZSetScoreRangeScanPrefix. Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-is-missing cases bail out with hit=false so the slow path's WRONGTYPE / disambiguation semantics are preserved. zsetRangeEmptyFastResult checks ZSetMetaKey to distinguish "empty range on a live zset" from "no zset here". Split into zsetFastPathEligible, zsetFastScanLimit, zsetScoreScan, decodeZSetScoreRange, finalizeZSetFastRange helpers so the main entrypoint stays under the cyclomatic- complexity cap. - adapter/redis_lua_context.go cmdZRangeByScore consults luaZSetAlreadyLoaded + cachedType (mirrors PR #567 / #568 safety guards) before entering the fast path. The legacy full-load path lives in cmdZRangeByScoreSlow and still runs for in-script mutations / legacy-blob / type-change cases. zsetScoreScanBounds maps zScoreBound (-Inf / value-inc / value-exc / +Inf) to the byte range; exclusive edges are handled post-scan by scoreInRange. Per-call effect on BullMQ-style delay-queue polls with LIMIT: Before: O(N) member GetAts + in-process filter After (fast-path hit): 1 legacy-blob ExistsAt + a single bounded ScanAt (N_range <= offset + limit, clamped by maxWideScanLimit) + 1 priority-guard ExistsAt + 1 TTL probe. Tests: 10 new cases in redis_lua_collection_fastpath_test.go -- fast-path hit / WITHSCORES / LIMIT offset / exclusive bounds / empty range / missing key / WRONGTYPE / in-script ZADD visibility / ZREVRANGEBYSCORE / TTL-expired. All pass under -race along with the existing adapter suite.
## Summary Phase B of the ZSet fast-path plan. Production pprof after PRs #565 / #567 / #568 showed `cmdZRangeByScore` + `cmdZRangeByScoreAsc` holding **40 active goroutines** — the top residual Lua-path signal — driven by BullMQ delay-queue polls. The legacy path loaded ALL zset members via `zsetState` → `ensureZSetLoaded` before filtering by score bounds: O(N) `GetAt`s per `ZRANGEBYSCORE`, so an N=1000 delay queue cost several seconds per call. Replace load-then-filter with a **bounded scan over the pre-sorted score index** (`!zs|scr|<user>|<sortable-score>|<member>`). ### Changes **`adapter/redis_compat_commands.go`** — new `zsetRangeByScoreFast` helper plus small utilities: - Legacy-blob zsets, string-priority corruption, and empty-result-but-zset-missing cases return `hit=false` so the slow path's WRONGTYPE / disambiguation semantics remain intact. - `zsetRangeEmptyFastResult` checks `ZSetMetaKey` to separate "empty range on a live zset" from "no zset here". - Split into `zsetFastPathEligible` / `zsetFastScanLimit` / `zsetScoreScan` / `decodeZSetScoreRange` / `finalizeZSetFastRange` to stay under the cyclomatic-complexity cap. **`adapter/redis_lua_context.go`** — `cmdZRangeByScore` gated on `luaZSetAlreadyLoaded` + `cachedType` (mirrors PR #567 / #568 safety guards): - In-script ZADD / ZREM / DEL / SET / type-change still go through the legacy full-load path (`cmdZRangeByScoreSlow`). - `zsetScoreScanBounds` maps `zScoreBound` (-Inf / value-inc / value-exc / +Inf) to a byte range; exclusive edges are handled post-scan by `scoreInRange`. ### Per-call effect on BullMQ-style delay-queue polls | Scenario | Before | After (fast-path hit) | |---|---|---| | `ZRANGEBYSCORE key 0 now LIMIT 0 10` over N=1000 zset | ~1000 member `GetAt`s | 1 legacy-blob `ExistsAt` + 1 bounded `ScanAt` (≤10 rows) + 1 priority-guard `ExistsAt` + 1 TTL probe | | Empty range | ~1000 `GetAt`s | 1 legacy-blob + 1 `ZSetMetaKey` + 1 TTL | | In-script `ZADD` then `ZRANGEBYSCORE` | unchanged | unchanged (slow path via `luaZSetAlreadyLoaded`) | ## Test plan - [x] `go test -race -short ./adapter/...` passes - [x] 10 new cases in `adapter/redis_lua_collection_fastpath_test.go`: - fast-path hit (ascending) - WITHSCORES - LIMIT + offset - exclusive bounds `(min (max` - empty range on live zset - missing key - WRONGTYPE (key holds a string) - in-script `ZADD` visible to subsequent `ZRANGEBYSCORE` - `ZREVRANGEBYSCORE` (reverse iteration) - TTL-expired zset - [ ] Production: deploy and verify `cmdZRangeByScore` active goroutines drop from ~20+20 back toward the keyTypeAt-style residual. Expect p50 redis.call latency inside Lua to fall from the current 3s toward ms-scale as delay-queue polls stop pulling the whole zset. ## Follow-ups (out of scope) - `ZRANGE` (by rank) streaming — needs the same pattern with positional iteration. - `ZPOPMIN` — writes; needs the streaming scan plus the delete-member transaction plumbing. - `HGETALL` / `HMGET` — separate wide-column prefix scans. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Optimized sorted-set range-by-score fast path for faster, paginated queries with correct inclusive/exclusive bound handling. * **Bug Fixes** * Safer fallback to full-path when results may be incomplete; fixed empty-range, truncation, type-mismatch and TTL/expiration behaviors; negative LIMIT offsets now rejected and negative limits treated as “no limit.” * **Tests** * Added extensive Lua-scripted tests covering range reads, WITHSCORES, LIMIT semantics, exclusive bounds, in-script mutations, TTL, and regressions. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Phase A of the ZSET fast-path plan discussed alongside PR #567. BullMQ and other Lua-heavy workloads call ZSCORE for delay-queue position checks and job-existence checks; the legacy path went through
zsetState(~8-seekkeyTypeAt+ full zset state init) before a directGetAtonZSetMemberKey.Change
Reuse PR #565's pattern for the single-member ZSET case:
adapter/redis_compat_commands.go— newzsetMemberFastScore(ctx, key, member, readTS)helper. SingleGetAtonZSetMemberKey, returns(score, hit, alive). Priority-alignment scope identical tohashFieldFastLookup: only theredisStrKeydual-encoding case is guarded; legacy / HLL / list corner cases fall through to the slow path.adapter/redis_lua_context.go—cmdZScoreconsultsc.zsets[key]first (soZADD→ZSCOREinside the sameEvalhonors unflushed mutations), otherwise callszsetMemberFastScore, then falls back tozsetStateon miss.zscoreFromZSetStatehelper factored out to keep the function body under the cyclomatic-complexity cap.What this does not do
Phase B / C (
ZRANGEBYSCOREstreaming,ZADD/ZREMfast-path) need production profiling before we commit to the deeper refactor. See #567's discussion for the staging rationale.Per-call effect on BullMQ position checks
zsetState+ 1 GetAt onZSetMemberKeyinsidememberScore.ZSetMemberKeyGetAt + 1 TTL probe.Falls back to the legacy flow on miss so legacy-blob zsets and WRONGTYPE paths retain their prior behaviour.
Test plan
go test -race -short ./adapter/...passesredis_lua_collection_fastpath_test.go:ZADDvisible to subsequentZSCOREDepends on