Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes pooled-buffer leaks when internal ZADD operations are invoked to populate destination sorted sets, by ensuring heap-backed outputs are properly disposed.
Changes:
- Wrap internal
RMWObjectStoreOperationWithOutput(... ZADD ...)calls intry/finallyblocks. - Dispose
GarnetObjectStoreOutput.SpanByteAndMemory.Memory(when heap-backed) to return rentedMemoryPool<byte>buffers.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| libs/server/Storage/Session/ObjectStore/SortedSetOps.cs | Disposes internal ZADD output for ZUNIONSTORE/ZINTERSTORE-style range-store flow to prevent MemoryPool buffer leaks. |
| libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs | Disposes internal ZADD output for GEO*STORE flow to prevent MemoryPool buffer leaks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
a6561e6 to
ef0d985
Compare
## Summary Two correctness bugs in the sorted-set object store and the BITOP read loop. ## 1. Sorted-set `IMemoryOwner` leaks in `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE` Three pooled-buffer leaks where backends (`GeoSearch`, `SortedSetRange`, internal `ZADD`'s `SortedSetAdd`) wrote replies via `RespMemoryWriter`, which (with a default `SpanByte`) rents a `MemoryPool<byte>` buffer (≥512 bytes) and assigns it to `.SpanByteAndMemory.Memory`: - **`SortedSetGeoOps.cs` (`GEO*STORE`)** — the `searchOutMem.Memory` from `GeoSearch` was leaked (only `searchOutHandler` — the `MemoryHandle` from `Pin()` — was disposed). The internal `ZADD` invocation (`zAddOutput`) was discarded entirely. - **`SortedSetOps.cs` (`ZUNIONSTORE` / `ZINTERSTORE`)** — same pattern: `rangeOutputMem.Memory` from `SortedSetRange` was leaked, and the internal `ZADD`'s `zAddOutput.SpanByteAndMemory.Memory` was leaked. Under heavy `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE` traffic this was real `MemoryPool` churn and GC pressure. ### Fix - For each `*STORE`-style internal `ZADD`, wrap the `RMWObjectStoreOperationWithOutput` call in a `try` / `finally` that disposes `zAddOutput.SpanByteAndMemory.Memory` if `!IsSpanByte`. - Extend the existing `finally` blocks that dispose `*Handler` (the `MemoryHandle`) to also dispose the underlying `*Mem.Memory` (the `IMemoryOwner<byte>`). ## 2. BITOP: pending-completion epoch tracking was broken `StorageSession.HeadAddress` was a `readonly long` field captured at session-construction time and never updated. `MainStoreOps.ReadWithUnsafeContext` compared it against itself (`HeadAddress == localHeadAddress`) to decide whether to set `epochChanged = true` after pending completion. Two bugs: 1. The field is frozen, so the check was meaningless — the live store `HeadAddress` was never consulted. 2. The condition was also **inverted**: it set `epochChanged = true` when the addresses were equal (i.e., head did NOT move), the opposite of what the comment said. In addition, `Read` can return synchronously with a pointer into the **read cache** (a separate log with its own `HeadAddress` that can be evicted independently of the main log). The original check would not detect read-cache eviction. ### Fix - Removed the stale `StorageSession.HeadAddress` field. - Added `ClientSession.HeadAddress` and `ClientSession.ReadCacheHeadAddress` accessors that read the live values from `store.Log.HeadAddress` / `store.ReadCache?.HeadAddress`. - `ReadWithUnsafeContext` now captures both addresses at the start of the BITOP loop and, after pending completion, sets `epochChanged = true` if **either** has advanced — correctly invalidating any pointers captured into either log. ## Files changed - `libs/storage/Tsavorite/cs/src/core/ClientSession/ClientSession.cs` — new `HeadAddress` and `ReadCacheHeadAddress` live accessors - `libs/server/Storage/Session/StorageSession.cs` — removed stale `HeadAddress` field - `libs/server/Storage/Session/MainStore/MainStoreOps.cs` — `ReadWithUnsafeContext` uses live addresses with the corrected comparison; signature now also takes `localReadCacheHeadAddress` - `libs/server/Storage/Session/MainStore/BitmapOps.cs` — captures both live head addresses; passes them to `ReadWithUnsafeContext` - `libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs` — leak fixes (`searchOutMem` + `zAddOutput`) - `libs/server/Storage/Session/ObjectStore/SortedSetOps.cs` — leak fixes (`rangeOutputMem` + `zAddOutput`) ## Validation - All 660 sorted-set + geo + bitmap tests pass on `main` - `dotnet format --verify-no-changes` clean ## Note on related fixes on `dev` This is a subset of the fixes on the companion `dev` branch (`badrishc/memory-fixes`). Other fixes from that branch were intentionally **not** ported here because they do not apply to `main`: - The BITOP overflow-pointer fix relies on the `ISourceLogRecord` / `LogRecord` / `OverflowByteArray` model that exists only on `dev`. On `main` the BITOP backend operates on `SpanByte` values that are always pinned in log memory, so the use-after-fixed bug fixed on `dev` does not exist on `main`. - The PFCOUNT/PFMERGE bounds-check tightening from `dev` is unnecessary on `main` because the `main` backend already validates `value.Length <= dst.Length` *before* the `Buffer.MemoryCopy`, so the wrong-capacity argument is gated. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ef0d985 to
b5c8b57
Compare
TedHartMS
approved these changes
Apr 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two callsites that invoke
ZADDinternally to populate a destination sorted set were leaking theIMemoryOwner<byte>that theZADDbackend rents fromMemoryPool<byte>.Shared.Root cause
The
ZADDbackend (SortedSetAdd) writes its integer reply throughRespMemoryWriter. When the caller passes a defaultGarnetObjectStoreOutput(whoseSpanBytelength is 0), the very firstWriteInt32triggersReallocateOutput, which rents aMemoryPool<byte>buffer (minimum ~512 bytes) and assigns it tozAddOutput.SpanByteAndMemory.Memory.Both callsites had:
Neither disposed the resulting
Memory. Under heavyGEO*STORE/ZUNIONSTORE/ZINTERSTOREtraffic this caused realMemoryPoolchurn and GC pressure.Fix
Wrap each call in
try/finallythat disposes theMemoryif!IsSpanByte:Files changed
libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs(GEO*STORE)libs/server/Storage/Session/ObjectStore/SortedSetOps.cs(ZUNIONSTORE,ZINTERSTORE)Validation
RespSortedSet+RespSortedSetGeotests passdotnet format --verify-no-changescleanNote on related fixes on
devThis is a subset of the fixes on the companion
devbranch (badrishc/memory-fixes). The other fixes from that branch were intentionally not ported here because they do not apply tomain:ISourceLogRecord/LogRecord/OverflowByteArraymodel that exists only ondev. Onmainthe BITOP backend operates onSpanBytevalues that are always pinned in log memory, so the use-after-fixed bug fixed ondevdoes not exist onmain.devis unnecessary onmainbecause themainbackend already validatesvalue.Length <= dst.Lengthbefore theBuffer.MemoryCopy, so the wrong-capacity argument is gated.