Skip to content

Fix misc memory bugs [dev]#1751

Merged
badrishc merged 1 commit into
devfrom
badrishc/memory-fixes
Apr 30, 2026
Merged

Fix misc memory bugs [dev]#1751
badrishc merged 1 commit into
devfrom
badrishc/memory-fixes

Conversation

@badrishc
Copy link
Copy Markdown
Collaborator

Three related correctness fixes around SpanByteAndMemory handling.

BITOP: use-after-fixed for overflow byte[] values (flaky CI failure)

Root cause of the flaky 'BitOp_Binary_DifferentTails(Diff,4131,[0, 7])' test failure (and similar AccessViolationException crashes seen on Windows Release CI):

The BITOP read callback in PrivateMethods.cs captured the value pointer inside a 'fixed (byte* valuePtr = value)' block, stored it as an integer in the output buffer, then exited the fixed block before the BITOP execution dereferenced it. For values larger than the default ObjectAllocator MaxInlineValueSize (4 KB), the value lives in a GC-managed overflow byte[] that compaction can relocate between the fixed-block exit and the SIMD execution in BitmapManager.InvokeBitOperationUnsafe -- yielding either wrong bytes or a hard crash in Vector512.Load.

Fix: expose value memory through the standard SpanByteAndMemory abstraction via a new ISourceLogRecord.ValueSpanByteAndMemory getter.

  • For inline values, returns SpanByte pointing at log memory (stable under the unsafe context, perf-equivalent to the original code).
  • For overflow values, returns Memory via a new no-copy BorrowedMemoryOwner wrapper around the existing OverflowByteArray. BitmapOps pins it via Memory.Pin() for the duration of the BITOP and releases the MemoryHandle in 'finally' (and on goto-readFromScratch).

The getter is a regular interface property (not a default interface method) with a single GetValueFieldInfo call, so the JIT devirtualizes through the existing 'where TSourceLogRecord : ISourceLogRecord' generic constraint and the inline-path cost is unchanged.

Includes new BitOp_OverflowValues_StableUnderGCCompaction regression test that reliably crashes the unfixed code with AccessViolationException (verified) and passes after the fix.

PFCOUNT/PFMERGE backend: latent buffer-overflow in Buffer.MemoryCopy

PrivateMethods.cs passed value.Length as BOTH the destination-capacity argument AND the source-bytes-to-copy argument, so the safety check trivially passed even when value.Length exceeded the 12 KB sector-aligned destination buffer. The post-copy size check at line 274 was effectively dead code. If a corrupted/oversized HLL value ever passed IsValidHYLL (currently bounded by design but defense-in-depth says guard it), this would silently overflow the buffer.

Fix: capture the buffer's actual capacity before any modification, validate size BEFORE the copy, treat oversized values the same as 'invalid' (-1 sentinel that callers already handle), and pass the true capacity to Buffer.MemoryCopy. Added Debug.Assert in HyperLogLogOps that the backend populated SpanByte (not Memory), documenting the contract so any future regression that converts to heap is caught immediately.

Sorted-set Memory leaks in GEO*STORE / ZUNIONSTORE / ZINTERSTORE

SortedSetGeoOps.cs and SortedSetOps.cs both had:

var zAddOutput = new ObjectOutput();
RMWObjectStoreOperation(destinationKey, ref zAddInput, ..., ref zAddOutput);
// zAddOutput never disposed

The internal ZADD's backend (SortedSetAdd) writes its reply via RespMemoryWriter. With a default ObjectOutput (SpanByte length 0), the first WriteInt32 triggers ReallocateOutput which rents a MemoryPool buffer (>=512 bytes) and assigns it to zAddOutput.SpanByteAndMemory.Memory. Neither call site disposed it -- under heavy GEO*STORE / ZUNIONSTORE / ZINTERSTORE traffic this was real MemoryPool churn and GC pressure.

Audit

While in here, audited the rest of the storage layer for the same patterns. No other use-after-fixed pointer escapes (BITOP was the only backend that stored a raw pointer in its output for later use; BITCOUNT/ BITPOS/BITFIELD/HLL all consume the pointer inside the fixed block) and no other Memory leaks (other RMW helpers either consume the output via ProcessResp* helpers / SendAndReset which dispose, or invoke backends that only write result1 and never call RespMemoryWriter).

Files changed

  • libs/server/Storage/Functions/MainStore/PrivateMethods.cs (BITOP, HLL)
  • libs/server/Storage/Session/MainStore/BitmapOps.cs (BITOP front-end)
  • libs/server/Storage/Session/MainStore/HyperLogLogOps.cs (defensive assert)
  • libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs (leak)
  • libs/server/Storage/Session/ObjectStore/SortedSetOps.cs (leak)
  • libs/storage/Tsavorite/cs/src/core/Allocator/ISourceLogRecord.cs (new getter)
  • libs/storage/Tsavorite/cs/src/core/Allocator/LogRecord.cs (impl)
  • libs/storage/Tsavorite/cs/src/core/Allocator/DiskLogRecord.cs (delegating impl)
  • libs/storage/Tsavorite/cs/src/core/Allocator/ObjectScanIterator.cs (delegating impl)
  • libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteScanIterator.cs (delegating impl)
  • libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/TsavoriteIterator.cs (delegating impl)
  • libs/storage/Tsavorite/cs/src/core/Allocator/OverflowByteArray.cs (public AsMemory)
  • libs/storage/Tsavorite/cs/src/core/VarLen/BorrowedMemoryOwner.cs (new)
  • test/Garnet.test/GarnetBitmapTests.cs (regression test)

Validation

  • All 351 GarnetBitmapTests pass (was 347 + 4 new regression variants); verified the new regression test crashes with AccessViolationException on the unfixed code.
  • All 697 sorted-set / geo / bitmap / HLL tests pass.
  • Tsavorite test project builds clean.
  • 'dotnet format --verify-no-changes' is clean.

Copilot AI review requested due to automatic review settings April 29, 2026 20:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Correctness-focused fixes around SpanByteAndMemory lifetime and output-buffer handling in the storage layer, addressing a BITOP use-after-fixed, a PFCOUNT/PFMERGE copy overflow hazard, and MemoryPool buffer leaks from internal ZADD calls.

Changes:

  • BITOP: replace raw pointer marshalling with ISourceLogRecord.ValueSpanByteAndMemory and pin overflow Memory<byte> during execution; add GC-compaction stress regression test.
  • PFCOUNT/PFMERGE: fix Buffer.MemoryCopy capacity/length misuse by validating size pre-copy and using the true destination capacity.
  • Sorted-set store ops: dispose internal ZADD outputs to avoid leaking pooled buffers.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/Garnet.test/GarnetBitmapTests.cs Adds regression test stressing GC compaction during BITOP on overflow-sized values.
libs/server/Storage/Functions/MainStore/PrivateMethods.cs BITOP callback now returns SpanByteAndMemory; PFCOUNT/PFMERGE copy logic hardened.
libs/server/Storage/Session/MainStore/BitmapOps.cs Pins overflow Memory<byte> buffers for BITOP and releases handles reliably.
libs/server/Storage/Session/MainStore/HyperLogLogOps.cs Adds debug assertions documenting PFCOUNT/PFMERGE output-buffer contract.
libs/server/Storage/Session/ObjectStore/SortedSetOps.cs Disposes internal ZADD output buffer to prevent MemoryPool leaks.
libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs Disposes internal ZADD output buffer to prevent MemoryPool leaks.
libs/storage/Tsavorite/cs/src/core/Allocator/ISourceLogRecord.cs Adds ValueSpanByteAndMemory API and documents expected usage.
libs/storage/Tsavorite/cs/src/core/Allocator/LogRecord.cs Implements ValueSpanByteAndMemory for inline vs overflow values.
libs/storage/Tsavorite/cs/src/core/Allocator/DiskLogRecord.cs Delegates ValueSpanByteAndMemory to wrapped LogRecord.
libs/storage/Tsavorite/cs/src/core/Allocator/ObjectScanIterator.cs Delegates ValueSpanByteAndMemory to underlying disk record.
libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteScanIterator.cs Delegates ValueSpanByteAndMemory to underlying disk record.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/TsavoriteIterator.cs Exposes ValueSpanByteAndMemory via iterator wrapper.
libs/storage/Tsavorite/cs/src/core/Allocator/OverflowByteArray.cs Adds AsMemory() to expose value bytes as Memory<byte> without copying.
libs/storage/Tsavorite/cs/src/core/VarLen/BorrowedMemoryOwner.cs Adds no-op IMemoryOwner<byte> wrapper for borrowed memory.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread libs/server/Storage/Functions/MainStore/PrivateMethods.cs
Comment thread libs/storage/Tsavorite/cs/src/core/Allocator/ISourceLogRecord.cs Outdated
@badrishc badrishc force-pushed the badrishc/memory-fixes branch 4 times, most recently from 1ce98f0 to e9b4ad8 Compare April 29, 2026 22:03
badrishc added a commit that referenced this pull request Apr 30, 2026
…-set Memory leaks

Four related correctness fixes in the storage layer.

## 1. BITOP: use-after-fixed for overflow `byte[]` values (flaky CI failure)

Root cause of the flaky `BitOp_Binary_DifferentTails(Diff,4131,[0, 7])`
test failure (and the related `AccessViolationException` crashes seen on
Windows Release CI in `Vector512.Load`).

The BITOP read callback in
`libs/server/Storage/Functions/MainStore/PrivateMethods.cs` captured the
value pointer inside a `fixed (byte* valuePtr = value)` block, stored it
as an integer in the output buffer, then exited the `fixed` block before
the BITOP execution dereferenced it. For values larger than the default
`ObjectAllocator.MaxInlineValueSize` (4 KB), the value lives in a
GC-managed overflow `byte[]` that compaction can relocate between the
`fixed`-block exit and the SIMD execution in
`BitmapManager.InvokeBitOperationUnsafe` — yielding either wrong bytes or
a hard crash in `Vector512.Load`.

### Fix

Expose value memory through the standard `SpanByteAndMemory` abstraction
via a new `ISourceLogRecord.ValueSpanByteAndMemory` getter:

- For inline in-memory `LogRecord` values, returns `SpanByte` pointing at
  log memory (zero copy, no allocation, stable while the unsafe context
  is held — perf-equivalent to the original code).
- For inline `DiskLogRecord` values, copies the bytes into a pooled
  `IMemoryOwner<byte>` (the underlying `SectorAlignedMemory` recordBuffer
  is returned to the pool when the `DiskLogRecord` is disposed).
- For overflow values, returns `Memory<byte>` via a no-copy
  `BorrowedMemoryOwner` wrapping the existing `OverflowByteArray` byte[];
  the array is rooted via the `Memory<byte>` reference, so it survives
  source-record disposal.

`BitmapOps.StringBitOperation` pins the returned `Memory` via
`Memory.Pin()` for the duration of BITOP execution and disposes the
`MemoryHandle` and `IMemoryOwner` in `finally` (and on
`goto readFromScratch`).

The getter is a regular interface property (not a default interface
method) with a single `GetValueFieldInfo` call, so the JIT devirtualizes
through the existing `where TSourceLogRecord : ISourceLogRecord` generic
constraint and the inline-path cost is unchanged.

Includes a new `BitOp_OverflowValues_StableUnderGCCompaction` regression
test that reliably crashes the unfixed code with
`AccessViolationException` (verified) and passes after the fix.

## 2. BITOP: pending-completion epoch tracking was broken

`StorageSession.HeadAddress` was a `readonly long` field captured at
session-construction time and never updated. `MainStoreOps.ReadWithUnsafeContext`
compared it against itself (`HeadAddress == localHeadAddress`) to decide
whether to set `epochChanged = true` after pending completion. Two bugs:

1. The field is frozen, so the check was meaningless — the live store
   `HeadAddress` was never consulted.
2. The condition was also **inverted**: it set `epochChanged = true` when
   the addresses were equal (i.e., head did NOT move), the opposite of
   what the comment said.

In addition, `Read` can return synchronously with a pointer into the
**read cache** (a separate log with its own `HeadAddress` that can be
evicted independently of the main log). The original check would not
detect read-cache eviction.

### Fix

- Removed the stale `StorageSession.HeadAddress` field.
- Added `ClientSession.HeadAddress` and `ClientSession.ReadCacheHeadAddress`
  accessors that read the live values from `store.Log.HeadAddress` /
  `store.ReadCache?.HeadAddress`.
- `ReadWithUnsafeContext` now captures both addresses at the start of
  the BITOP loop and, after pending completion, sets `epochChanged = true`
  if **either** has advanced — correctly invalidating any pointers
  captured into either log.

## 3. PFCOUNT/PFMERGE backend: latent buffer-overflow in `Buffer.MemoryCopy`

`PrivateMethods.cs` passed `value.Length` as **both** the
destination-capacity argument **and** the source-bytes-to-copy argument
to `Buffer.MemoryCopy`, so the safety check trivially passed even when
`value.Length` exceeded the 12 KB sector-aligned destination buffer.
The post-copy size check was effectively dead code. If a corrupted /
oversized HLL value ever passed `IsValidHYLL` — currently bounded by
design but defense-in-depth says guard it — this would silently overflow
the destination buffer.

### Fix

- Capture the destination buffer's actual capacity before any
  modification.
- Validate **both** signature **and** size *before* the copy; treat
  oversized values the same as `invalid` (`-1` sentinel that callers
  already handle).
- Pass the true buffer capacity to `Buffer.MemoryCopy` so the bounds
  check is meaningful.

Also added a `Debug.Assert(srcMergeBuffer.SpanByteAndMemory.IsSpanByte)`
in `HyperLogLogOps.cs` documenting the backend's contract — any future
regression that converts to heap will be caught immediately.

## 4. Sorted-set `IMemoryOwner` leaks in `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE`

Three pooled-buffer leaks where backends (`GeoSearch`, `SortedSetRange`,
internal `ZADD`'s `SortedSetAdd`) wrote replies via `RespMemoryWriter`,
which (with a default `SpanByte`) rents a `MemoryPool<byte>` buffer
(≥512 bytes) and assigns it to `.SpanByteAndMemory.Memory`:

- **`SortedSetGeoOps.cs` (`GEO*STORE`)** — the `searchOutMem.Memory`
  from `GeoSearch` was leaked (only `searchOutHandler` — the
  `MemoryHandle` from `Pin()` — was disposed). The internal `ZADD`
  invocation (`zAddOutput`) was discarded entirely.
- **`SortedSetOps.cs` (`ZUNIONSTORE` / `ZINTERSTORE`)** — same pattern:
  `rangeOutputMem.Memory` from `SortedSetRange` was leaked, and the
  internal `ZADD`'s `zAddOutput.SpanByteAndMemory.Memory` was leaked.

Under heavy `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE` traffic this was
real `MemoryPool` churn and GC pressure.

### Fix

- For each `*STORE`-style internal `ZADD`, wrap the
  `RMWObjectStoreOperation` call in a `try` / `finally` that disposes
  `zAddOutput.SpanByteAndMemory.Memory` if `!IsSpanByte`.
- Extend the existing `finally` blocks that dispose `*Handler` (the
  `MemoryHandle`) to also dispose the underlying `*Mem.Memory`
  (the `IMemoryOwner<byte>`).

## Audit

While in here, audited the rest of the storage layer for the same
patterns:

- **No other use-after-fixed pointer escapes.** BITOP was the only
  backend that stored a raw pointer in its output for later use;
  `BITCOUNT` / `BITPOS` / `BITFIELD` / HLL all consume the pointer
  inside the `fixed` block.
- **No other `Memory` leaks** in the storage layer beyond the three
  fixed here. Other RMW helpers either consume the output via
  `ProcessResp*` helpers / `SendAndReset` (which dispose), or invoke
  backends that only write `result1` and never call `RespMemoryWriter`.

## Files changed

### Tsavorite

- `libs/storage/Tsavorite/cs/src/core/ClientSession/ClientSession.cs` —
  new `HeadAddress` and `ReadCacheHeadAddress` live accessors
- `libs/storage/Tsavorite/cs/src/core/Allocator/ISourceLogRecord.cs` —
  new `ValueSpanByteAndMemory` getter + lifetime-contract docs
- `libs/storage/Tsavorite/cs/src/core/Allocator/LogRecord.cs` — concrete
  impl (single `GetValueFieldInfo` call, `[AggressiveInlining]`)
- `libs/storage/Tsavorite/cs/src/core/Allocator/DiskLogRecord.cs` — impl
  that copies inline values into a pooled `IMemoryOwner` so the result
  survives `DiskLogRecord` disposal
- `libs/storage/Tsavorite/cs/src/core/Allocator/ObjectScanIterator.cs` —
  delegating impl
- `libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteScanIterator.cs`
  — delegating impl
- `libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/TsavoriteIterator.cs`
  — delegating impl
- `libs/storage/Tsavorite/cs/src/core/Allocator/OverflowByteArray.cs` —
  public `AsMemory()`
- `libs/storage/Tsavorite/cs/src/core/VarLen/BorrowedMemoryOwner.cs` —
  new public class

### Garnet backend / frontend

- `libs/server/Storage/Functions/MainStore/PrivateMethods.cs` — BITOP +
  HLL backend fixes
- `libs/server/Storage/Session/StorageSession.cs` — removed stale
  `HeadAddress` field
- `libs/server/Storage/Session/MainStore/MainStoreOps.cs` —
  `ReadWithUnsafeContext` now uses live `HeadAddress` +
  `ReadCacheHeadAddress` from `ClientSession`, with the corrected
  comparison
- `libs/server/Storage/Session/MainStore/BitmapOps.cs` — captures both
  live head addresses; pins overflow `Memory` and disposes in
  `finally` / on `goto readFromScratch`
- `libs/server/Storage/Session/MainStore/HyperLogLogOps.cs` — defensive
  `Debug.Assert`
- `libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs` — leak
  fixes (`searchOutMem` + `zAddOutput`)
- `libs/server/Storage/Session/ObjectStore/SortedSetOps.cs` — leak
  fixes (`rangeOutputMem` + `zAddOutput`)

### Tests

- `test/Garnet.test/GarnetBitmapTests.cs` — regression test

## Validation

- All 351 `GarnetBitmapTests` pass (was 347, plus 4 new regression
  variants); verified the new regression test crashes with
  `AccessViolationException` on the unfixed code.
- All 697 sorted-set / geo / bitmap / HLL tests pass.
- Tsavorite test project builds clean.
- `dotnet format --verify-no-changes` is clean.
- Reviewed by GPT 5.5 and Claude Opus 4.6 (1M context); review comments
  from the GitHub Copilot reviewer on PR #1751 (BITOP `DiskLogRecord`
  lifetime, `ValueSpanByteAndMemory` lifetime docs) addressed in this
  commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/memory-fixes branch from e9b4ad8 to 7a2f7d9 Compare April 30, 2026 01:50
…-set Memory leaks

Four related correctness fixes in the storage layer.

## 1. BITOP: use-after-fixed for overflow `byte[]` values (flaky CI failure)

Root cause of the flaky `BitOp_Binary_DifferentTails(Diff,4131,[0, 7])`
test failure (and the related `AccessViolationException` crashes seen on
Windows Release CI in `Vector512.Load`).

The BITOP read callback in
`libs/server/Storage/Functions/MainStore/PrivateMethods.cs` captured the
value pointer inside a `fixed (byte* valuePtr = value)` block, stored it
as an integer in the output buffer, then exited the `fixed` block before
the BITOP execution dereferenced it. For values larger than the default
`ObjectAllocator.MaxInlineValueSize` (4 KB), the value lives in a
GC-managed overflow `byte[]` that compaction can relocate between the
`fixed`-block exit and the SIMD execution in
`BitmapManager.InvokeBitOperationUnsafe` — yielding either wrong bytes or
a hard crash in `Vector512.Load`.

### Fix

Expose value memory through the standard `SpanByteAndMemory` abstraction
via a new `ISourceLogRecord.ValueSpanByteAndMemory` getter:

- For inline in-memory `LogRecord` values, returns `SpanByte` pointing at
  log memory (zero copy, no allocation, stable while the unsafe context
  is held — perf-equivalent to the original code).
- For inline `DiskLogRecord` values, copies the bytes into a pooled
  `IMemoryOwner<byte>` (the underlying `SectorAlignedMemory` recordBuffer
  is returned to the pool when the `DiskLogRecord` is disposed).
- For overflow values, returns `Memory<byte>` via a no-copy
  `BorrowedMemoryOwner` wrapping the existing `OverflowByteArray` byte[];
  the array is rooted via the `Memory<byte>` reference, so it survives
  source-record disposal.

`BitmapOps.StringBitOperation` pins the returned `Memory` via
`Memory.Pin()` for the duration of BITOP execution and disposes the
`MemoryHandle` and `IMemoryOwner` in `finally` (and on
`goto readFromScratch`).

The getter is a regular interface property (not a default interface
method) with a single `GetValueFieldInfo` call, so the JIT devirtualizes
through the existing `where TSourceLogRecord : ISourceLogRecord` generic
constraint and the inline-path cost is unchanged.

Includes a new `BitOp_OverflowValues_StableUnderGCCompaction` regression
test that reliably crashes the unfixed code with
`AccessViolationException` (verified) and passes after the fix.

## 2. BITOP: pending-completion epoch tracking was broken

`StorageSession.HeadAddress` was a `readonly long` field captured at
session-construction time and never updated. `MainStoreOps.ReadWithUnsafeContext`
compared it against itself (`HeadAddress == localHeadAddress`) to decide
whether to set `epochChanged = true` after pending completion. Two bugs:

1. The field is frozen, so the check was meaningless — the live store
   `HeadAddress` was never consulted.
2. The condition was also **inverted**: it set `epochChanged = true` when
   the addresses were equal (i.e., head did NOT move), the opposite of
   what the comment said.

In addition, `Read` can return synchronously with a pointer into the
**read cache** (a separate log with its own `HeadAddress` that can be
evicted independently of the main log). The original check would not
detect read-cache eviction.

### Fix

- Removed the stale `StorageSession.HeadAddress` field.
- Added `ClientSession.HeadAddress` and `ClientSession.ReadCacheHeadAddress`
  accessors that read the live values from `store.Log.HeadAddress` /
  `store.ReadCache?.HeadAddress`.
- `ReadWithUnsafeContext` now captures both addresses at the start of
  the BITOP loop and, after pending completion, sets `epochChanged = true`
  if **either** has advanced — correctly invalidating any pointers
  captured into either log.

## 3. PFCOUNT/PFMERGE backend: latent buffer-overflow in `Buffer.MemoryCopy`

`PrivateMethods.cs` passed `value.Length` as **both** the
destination-capacity argument **and** the source-bytes-to-copy argument
to `Buffer.MemoryCopy`, so the safety check trivially passed even when
`value.Length` exceeded the 12 KB sector-aligned destination buffer.
The post-copy size check was effectively dead code. If a corrupted /
oversized HLL value ever passed `IsValidHYLL` — currently bounded by
design but defense-in-depth says guard it — this would silently overflow
the destination buffer.

### Fix

- Capture the destination buffer's actual capacity before any
  modification.
- Validate **both** signature **and** size *before* the copy; treat
  oversized values the same as `invalid` (`-1` sentinel that callers
  already handle).
- Pass the true buffer capacity to `Buffer.MemoryCopy` so the bounds
  check is meaningful.

Also added a `Debug.Assert(srcMergeBuffer.SpanByteAndMemory.IsSpanByte)`
in `HyperLogLogOps.cs` documenting the backend's contract — any future
regression that converts to heap will be caught immediately.

## 4. Sorted-set `IMemoryOwner` leaks in `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE`

Three pooled-buffer leaks where backends (`GeoSearch`, `SortedSetRange`,
internal `ZADD`'s `SortedSetAdd`) wrote replies via `RespMemoryWriter`,
which (with a default `SpanByte`) rents a `MemoryPool<byte>` buffer
(≥512 bytes) and assigns it to `.SpanByteAndMemory.Memory`:

- **`SortedSetGeoOps.cs` (`GEO*STORE`)** — the `searchOutMem.Memory`
  from `GeoSearch` was leaked (only `searchOutHandler` — the
  `MemoryHandle` from `Pin()` — was disposed). The internal `ZADD`
  invocation (`zAddOutput`) was discarded entirely.
- **`SortedSetOps.cs` (`ZUNIONSTORE` / `ZINTERSTORE`)** — same pattern:
  `rangeOutputMem.Memory` from `SortedSetRange` was leaked, and the
  internal `ZADD`'s `zAddOutput.SpanByteAndMemory.Memory` was leaked.

Under heavy `GEO*STORE` / `ZUNIONSTORE` / `ZINTERSTORE` traffic this was
real `MemoryPool` churn and GC pressure.

### Fix

- For each `*STORE`-style internal `ZADD`, wrap the
  `RMWObjectStoreOperation` call in a `try` / `finally` that disposes
  `zAddOutput.SpanByteAndMemory.Memory` if `!IsSpanByte`.
- Extend the existing `finally` blocks that dispose `*Handler` (the
  `MemoryHandle`) to also dispose the underlying `*Mem.Memory`
  (the `IMemoryOwner<byte>`).

## Audit

While in here, audited the rest of the storage layer for the same
patterns:

- **No other use-after-fixed pointer escapes.** BITOP was the only
  backend that stored a raw pointer in its output for later use;
  `BITCOUNT` / `BITPOS` / `BITFIELD` / HLL all consume the pointer
  inside the `fixed` block.
- **No other `Memory` leaks** in the storage layer beyond the three
  fixed here. Other RMW helpers either consume the output via
  `ProcessResp*` helpers / `SendAndReset` (which dispose), or invoke
  backends that only write `result1` and never call `RespMemoryWriter`.

## Files changed

### Tsavorite

- `libs/storage/Tsavorite/cs/src/core/ClientSession/ClientSession.cs` —
  new `HeadAddress` and `ReadCacheHeadAddress` live accessors
- `libs/storage/Tsavorite/cs/src/core/Allocator/ISourceLogRecord.cs` —
  new `ValueSpanByteAndMemory` getter + lifetime-contract docs
- `libs/storage/Tsavorite/cs/src/core/Allocator/LogRecord.cs` — concrete
  impl (single `GetValueFieldInfo` call, `[AggressiveInlining]`)
- `libs/storage/Tsavorite/cs/src/core/Allocator/DiskLogRecord.cs` — impl
  that copies inline values into a pooled `IMemoryOwner` so the result
  survives `DiskLogRecord` disposal
- `libs/storage/Tsavorite/cs/src/core/Allocator/ObjectScanIterator.cs` —
  delegating impl
- `libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteScanIterator.cs`
  — delegating impl
- `libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/TsavoriteIterator.cs`
  — delegating impl
- `libs/storage/Tsavorite/cs/src/core/Allocator/OverflowByteArray.cs` —
  public `AsMemory()`
- `libs/storage/Tsavorite/cs/src/core/VarLen/BorrowedMemoryOwner.cs` —
  new public class

### Garnet backend / frontend

- `libs/server/Storage/Functions/MainStore/PrivateMethods.cs` — BITOP +
  HLL backend fixes
- `libs/server/Storage/Session/StorageSession.cs` — removed stale
  `HeadAddress` field
- `libs/server/Storage/Session/MainStore/MainStoreOps.cs` —
  `ReadWithUnsafeContext` now uses live `HeadAddress` +
  `ReadCacheHeadAddress` from `ClientSession`, with the corrected
  comparison
- `libs/server/Storage/Session/MainStore/BitmapOps.cs` — captures both
  live head addresses; pins overflow `Memory` and disposes in
  `finally` / on `goto readFromScratch`
- `libs/server/Storage/Session/MainStore/HyperLogLogOps.cs` — defensive
  `Debug.Assert`
- `libs/server/Storage/Session/ObjectStore/SortedSetGeoOps.cs` — leak
  fixes (`searchOutMem` + `zAddOutput`)
- `libs/server/Storage/Session/ObjectStore/SortedSetOps.cs` — leak
  fixes (`rangeOutputMem` + `zAddOutput`)

### Tests

- `test/Garnet.test/GarnetBitmapTests.cs` — regression test

## Validation

- All 351 `GarnetBitmapTests` pass (was 347, plus 4 new regression
  variants); verified the new regression test crashes with
  `AccessViolationException` on the unfixed code.
- All 697 sorted-set / geo / bitmap / HLL tests pass.
- Tsavorite test project builds clean.
- `dotnet format --verify-no-changes` is clean.
- Reviewed by GPT 5.5 and Claude Opus 4.6 (1M context); review comments
  from the GitHub Copilot reviewer on PR #1751 (BITOP `DiskLogRecord`
  lifetime, `ValueSpanByteAndMemory` lifetime docs) addressed in this
  commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/memory-fixes branch from 7a2f7d9 to 0711bdc Compare April 30, 2026 03:23
@badrishc badrishc merged commit 509f216 into dev Apr 30, 2026
30 checks passed
@badrishc badrishc deleted the badrishc/memory-fixes branch April 30, 2026 04:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants