Add Roaring Bitmap support (issue #1270) #1741

Open
crprashant wants to merge 2 commits into microsoft:main from crprashant:feature/issue-1270-roaring-bitmap

Conversation

@crprashant

Add Roaring Bitmap support (issue #1270)

Resolves #1270.

Summary

Adds a Roaring Bitmaps extension to Garnet that introduces a new compressed
bitmap object type plus four R.* RESP commands. Implemented entirely as a
host extension in main/GarnetServer/Extensions/RoaringBitmap/, with zero
changes to libs/server/, so this is a clean, reviewable foundation that
can be deepened in follow-up PRs.

Why Roaring?

A naive bitmap covering the full uint32 range needs 2³² bits = 512 MiB.
Roaring partitions the universe into 65 536 chunks of 65 536 bits and
represents each chunk as either:

  • Array container: sorted ushort[], used while a chunk holds ≤ 4 096
    set bits (~2·count bytes).
  • Bitmap container: ulong[1024] (8 KiB exactly), used once a chunk
    exceeds the threshold.

Empty chunks consume zero memory. Chunks promote (array → bitmap) and demote
(bitmap → array) automatically as cardinality changes.
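
The chunk arithmetic above can be sketched as follows. This is a minimal illustration, not the PR's code: the names RoaringLayout, Split, and PayloadBytes are hypothetical, and the byte counts cover only the container payload (no object or array headers).

```csharp
using System;

// Hypothetical helper illustrating the layout described above; not the PR's types.
public static class RoaringLayout
{
    // Threshold from the description: array container while a chunk holds <= 4096 set bits.
    public const int ArrayMaxCardinality = 4096;

    // ulong[1024] = 1024 * 8 bytes = 8 KiB.
    public const int BitmapContainerBytes = 1024 * sizeof(ulong);

    // High 16 bits select the chunk; low 16 bits are the position inside it.
    public static (ushort Chunk, ushort Low) Split(uint offset)
        => ((ushort)(offset >> 16), (ushort)(offset & 0xFFFF));

    // Approximate payload bytes for a chunk holding the given number of set bits.
    public static int PayloadBytes(int cardinality)
    {
        if (cardinality < 0 || cardinality > 65536)
            throw new ArgumentOutOfRangeException(nameof(cardinality));
        if (cardinality == 0) return 0; // empty chunks are not stored at all
        return cardinality <= ArrayMaxCardinality
            ? cardinality * sizeof(ushort)
            : BitmapContainerBytes;
    }
}

public static class RoaringLayoutDemo
{
    public static void Main()
    {
        var (chunk, low) = RoaringLayout.Split(65536);
        Console.WriteLine($"{chunk} {low}");                 // 1 0
        Console.WriteLine(RoaringLayout.PayloadBytes(4096)); // 8192
        Console.WriteLine(RoaringLayout.PayloadBytes(4097)); // 8192
    }
}
```

At exactly 4 096 entries the array payload (4 096 × 2 bytes) equals the 8 KiB bitmap payload, which is why that cardinality is a natural promotion point.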

Commands

Command                      Description
R.SETBIT key offset value    Set the bit at offset in [0, 2³²-1] to 0/1. Returns the previous bit.
R.GETBIT key offset          Returns the bit at offset. 0 for missing keys.
R.BITCOUNT key               Population count. 0 for missing keys.
R.BITPOS key bit [from]      First bit (0/1) at or after from. -1 if none.
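
To make the return values in the table concrete, here is a small in-memory reference model. It is a sketch only: RBitmapModel is a hypothetical class backed by SortedSet<uint>, not the PR's RoaringBitmap, and the behaviour of a bit = 0 search on an existing key is an assumption (the first unset offset at or after from).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference model of the R.* return values; not the PR's implementation.
public sealed class RBitmapModel
{
    private readonly Dictionary<string, SortedSet<uint>> keys = new();

    // R.SETBIT: sets or clears the bit and returns the previous bit (0/1).
    public int SetBit(string key, uint offset, bool value)
    {
        if (!keys.TryGetValue(key, out var set)) keys[key] = set = new SortedSet<uint>();
        var previous = value ? !set.Add(offset) : set.Remove(offset);
        return previous ? 1 : 0;
    }

    // R.GETBIT: 0 for missing keys.
    public int GetBit(string key, uint offset)
        => keys.TryGetValue(key, out var set) && set.Contains(offset) ? 1 : 0;

    // R.BITCOUNT: 0 for missing keys.
    public int BitCount(string key)
        => keys.TryGetValue(key, out var set) ? set.Count : 0;

    // R.BITPOS: first matching bit at or after 'from', or -1 if none / key missing.
    public long BitPos(string key, bool bit, uint from = 0)
    {
        if (!keys.TryGetValue(key, out var set)) return -1;
        if (bit)
        {
            foreach (var v in set) if (v >= from) return v;
            return -1;
        }
        long candidate = from;
        foreach (var v in set) // SortedSet enumerates in ascending order
        {
            if (v < candidate) continue;
            if (v > candidate) break;
            candidate++; // v == candidate: that bit is set, so try the next offset
        }
        return candidate <= uint.MaxValue ? candidate : -1;
    }
}
```

A model like this is also the kind of oracle the RESP tests can be checked against, since each method mirrors one command's documented reply.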

Docs: website/docs/commands/roaring-bitmap.md.

Design notes

  • The data structure (RoaringBitmap.cs, Containers/*) is a pure C# library
    with no Garnet dependencies → independently unit-testable.
  • RoaringBitmapObject (CustomObjectBase) wraps the structure and tracks
    size deltas via bitmap.ByteSize for the per-object size accounting.
  • All four commands are registered as CommandType.ReadModifyWrite. The reads
    (R.GETBIT / R.BITCOUNT / R.BITPOS) do not mutate state, but the RMW
    path is required so that NeedInitialUpdate is invoked on missing keys —
    the framework's Read path simply returns nil otherwise. Missing-key
    responses (0 / -1) are written from NeedInitialUpdate which then
    returns false to decline key creation.
  • NeedInitialUpdate error paths use writer.WriteError(...) + return false
    rather than AbortWithErrorMessage (which returns true and would cause the
    framework to proceed into InitialUpdater/Updater, double-writing the
    response and corrupting the protocol stream).
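
The double-write hazard in the last bullet can be illustrated with a toy dispatcher. This is a simplified stand-in under stated assumptions: ToyResponse and ToyRmw are hypothetical and only mimic the "NeedInitialUpdate returns true, so the initial updater also runs" flow described above; none of it is Garnet's API.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-ins for the flow described above; none of these types are Garnet's.
public sealed class ToyResponse
{
    public List<string> Frames { get; } = new();
    public void Write(string frame) => Frames.Add(frame);
}

public static class ToyRmw
{
    // needInitialUpdate runs for a missing key; returning true means
    // "go on and create the key", so the initial updater runs and writes too.
    public static ToyResponse RunOnMissingKey(
        Func<ToyResponse, bool> needInitialUpdate,
        Action<ToyResponse> initialUpdater)
    {
        var response = new ToyResponse();
        if (needInitialUpdate(response))
            initialUpdater(response);
        return response;
    }
}

public static class ToyRmwDemo
{
    public static void Main()
    {
        // Error written, key creation declined: exactly one reply frame.
        var declined = ToyRmw.RunOnMissingKey(
            r => { r.Write("-ERR bad offset"); return false; },
            r => r.Write(":0"));
        Console.WriteLine(declined.Frames.Count); // 1

        // Error written but true returned: the updater adds a second frame.
        var proceeded = ToyRmw.RunOnMissingKey(
            r => { r.Write("-ERR bad offset"); return true; },
            r => r.Write(":0"));
        Console.WriteLine(proceeded.Frames.Count); // 2
    }
}
```

A client that sent one command and receives two reply frames will attribute the second frame to its next command, which is the protocol-stream corruption the note refers to.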

Tests

Suite                                                     Count   Status
RoaringBitmapDataTests (pure data structure)              27      ✅ Pass
RespRoaringBitmapTests (RESP integration via SE.Redis)    14      ✅ Pass

Coverage highlights:

  • Empty bitmap, single bit, idempotent set/clear.
  • Promotion threshold (4096 → 4097) and demotion across both directions.
  • 100 K random ops vs HashSet<uint> oracle.
  • Boundary offsets 0, 65535, 65536, 2³¹, 2³²-1.
  • Serialize → deserialize round-trip equality (empty / sparse / dense / mixed).
  • RESP-level: R.SETBIT/R.GETBIT parity with oracle, R.BITCOUNT,
    R.BITPOS (set / unset / from offset), large offsets, persistence across
    restart, concurrent setbits from multiple clients, error paths
    (bad offset, bad bit, bad value, wrong arity), and the two-store key
    separation property.
$ dotnet test test\Garnet.test\Garnet.test.csproj -c Debug -f net8.0 \
    --filter "FullyQualifiedName~RoaringBitmap" --nologo
Passed!  - Failed: 0, Passed: 43, Skipped: 0, Total: 43
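
The "random ops vs HashSet<uint> oracle" item can be sketched as a differential test. This is an assumption-laden illustration: TinyChunkedBitmap is a hypothetical, bitmap-container-only stand-in for the structure under test, and the offset range is kept to four chunks so that sets and clears collide often.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical system under test for this sketch: chunked, bitmap containers only.
public sealed class TinyChunkedBitmap
{
    private readonly Dictionary<ushort, ulong[]> chunks = new();

    // Sets or clears a bit and returns the previous value.
    public bool Set(uint offset, bool value)
    {
        var high = (ushort)(offset >> 16);
        var low = (int)(offset & 0xFFFF);
        if (!chunks.TryGetValue(high, out var words))
        {
            if (!value) return false; // clearing in an absent chunk changes nothing
            chunks[high] = words = new ulong[1024];
        }
        var mask = 1UL << (low & 63);
        var previous = (words[low >> 6] & mask) != 0;
        if (value) words[low >> 6] |= mask; else words[low >> 6] &= ~mask;
        return previous;
    }

    public bool Get(uint offset)
        => chunks.TryGetValue((ushort)(offset >> 16), out var words)
           && (words[(offset & 0xFFFF) >> 6] & (1UL << (int)(offset & 63))) != 0;

    public int Count()
    {
        var total = 0;
        foreach (var words in chunks.Values)
            foreach (var word in words) total += BitOperations.PopCount(word);
        return total;
    }
}

public static class OracleCheck
{
    // Applies the same random operations to the bitmap and to a HashSet<uint>
    // oracle and reports whether every observable result agreed.
    public static bool Run(int seed, int operations)
    {
        var rng = new Random(seed);
        var bitmap = new TinyChunkedBitmap();
        var oracle = new HashSet<uint>();
        for (var i = 0; i < operations; i++)
        {
            var offset = (uint)rng.Next(0, 4 * 65536);
            var value = rng.Next(2) == 1;
            var expectedPrevious = value ? !oracle.Add(offset) : oracle.Remove(offset);
            if (bitmap.Set(offset, value) != expectedPrevious) return false;
            if (bitmap.Get(offset) != value) return false;
        }
        return bitmap.Count() == oracle.Count;
    }
}
```

Calling OracleCheck.Run(seed: 1, operations: 100_000) returns true when the two structures never disagree; a fixed seed keeps any failure reproducible.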

Known limitations (intentional v1 scope)

  • Run container is not implemented. It would add ~30% extra compression on
    contiguous ranges, but the array/bitmap pair captures the bulk of
    real-world savings.
  • R.BITOP AND/OR/XOR/NOT is not exposed. The data structure supports
    these natively; only the command surface is missing.
  • Empty-key removal: clearing the last bit leaves an empty bitmap object
    rather than removing the key. This is a property of the custom-object
    framework's tombstone path (output.HasRemoveKey is honoured only on the
    built-in path) and is best fixed in libs/server/Storage/Functions/ObjectStore
    in a separate PR.
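
For context on the first limitation, a run container stores contiguous ranges as (start, length) pairs instead of individual values. The sketch below shows that encoding only; RunEncoding is a hypothetical helper and is not part of this PR, which deliberately ships without a run container.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of run-length encoding for one chunk's low 16-bit values.
public static class RunEncoding
{
    // Collapses a sorted, duplicate-free span into (start, length) runs.
    public static List<(ushort Start, int Length)> Encode(ReadOnlySpan<ushort> sorted)
    {
        var runs = new List<(ushort Start, int Length)>();
        if (sorted.IsEmpty) return runs;

        var start = sorted[0];
        var length = 1;
        for (var i = 1; i < sorted.Length; i++)
        {
            if (sorted[i] == sorted[i - 1] + 1)
            {
                length++; // still inside the current contiguous range
            }
            else
            {
                runs.Add((start, length));
                start = sorted[i];
                length = 1;
            }
        }
        runs.Add((start, length));
        return runs;
    }
}

public static class RunEncodingDemo
{
    public static void Main()
    {
        foreach (var (start, length) in RunEncoding.Encode(new ushort[] { 1, 2, 3, 10, 11, 20 }))
            Console.WriteLine($"start={start} length={length}");
    }
}
```

The six input values collapse into three runs, which is where the extra compression on contiguous ranges comes from.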

Files

main/GarnetServer/Extensions/RoaringBitmap/
  Containers/IContainer.cs
  Containers/ArrayContainer.cs
  Containers/BitmapContainer.cs
  RoaringBitmap.cs
  RoaringBitmapObject.cs
  RoaringBitmapCommands.cs
main/GarnetServer/Program.cs                    (registration only)
test/Garnet.test/RoaringBitmapDataTests.cs      (27 tests)
test/Garnet.test/RespRoaringBitmapTests.cs      (14 tests)
website/docs/commands/roaring-bitmap.md         (docs)

Copilot AI review requested due to automatic review settings April 27, 2026 18:56
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new host-level Roaring Bitmap custom object extension to Garnet, including a compressed bitmap data structure, a RoaringBitmapObject wrapper, and four new R.* RESP commands, along with docs and tests.

Changes:

  • Introduces a pure C# Roaring Bitmap implementation with array/bitmap containers and versioned serialization.
  • Adds a Garnet custom object + RESP command implementations for R.SETBIT, R.GETBIT, R.BITCOUNT, and R.BITPOS, and registers them in the default server host.
  • Adds end-to-end RESP tests and data-structure unit tests, plus documentation for the new commands.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Summary per file (file: description):

  • website/docs/commands/roaring-bitmap.md: Adds user-facing documentation for the new Roaring Bitmap object and R.* commands.
  • test/Garnet.test/RoaringBitmapDataTests.cs: Adds unit tests for the standalone RoaringBitmap data structure (promotion/demotion, bitpos, serialization).
  • test/Garnet.test/RespRoaringBitmapTests.cs: Adds RESP-level integration tests for the new commands via StackExchange.Redis.
  • main/GarnetServer/Program.cs: Registers the Roaring Bitmap custom type and R.* commands in the default host.
  • main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapObject.cs: Implements the Garnet custom-object wrapper (clone/serialize/size tracking).
  • main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs: Implements argument parsing and the four RESP commands.
  • main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmap.cs: Implements the roaring bitmap core, bit operations, enumeration, and serialization format.
  • main/GarnetServer/Extensions/RoaringBitmap/Containers/IContainer.cs: Defines the internal container abstraction and serialization kind enum.
  • main/GarnetServer/Extensions/RoaringBitmap/Containers/BitmapContainer.cs: Implements dense bitmap container behavior, popcount, and serialization.
  • main/GarnetServer/Extensions/RoaringBitmap/Containers/ArrayContainer.cs: Implements sparse sorted-array container behavior, promotion logic, and serialization.

Comment thread test/Garnet.test/RoaringBitmapDataTests.cs Outdated
Comment thread main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmap.cs Outdated
Comment thread main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs Outdated
Comment thread main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs
Comment thread modules/RoaringBitmap/RoaringBitmapObject.cs
@crprashant
Author

Thanks for the thorough review! Pushed 65ac9f1d4 addressing each comment. Summary:

  1. RoaringBitmapDataTests.cs:175 (bits[actual0] index): Applied. (int)actual0 cast plus an explicit actual0 >= 0 guard. (Note: array indexers in C# do accept long per the language spec, which is why the original compiled, but the cast makes intent obvious and removes the runtime OverflowException risk.)
  2. RoaringBitmap.cs:189 (ByteSize doc/impl mismatch): Applied. Updated the XML doc to state that the per-entry SortedDictionary node overhead is included (the implementation was already correct; only the comment was misleading).
  3. RoaringBitmapCommands.cs:51 (RSetBit.NeedInitialUpdate always returns true): Applied. NeedInitialUpdate now validates offset and value against a copy of the ObjectInput (it's a struct, so var validation = input; snapshots it without disturbing what Updater sees). On bad input it writes the error and returns false, so a malformed R.SETBIT no longer creates an empty tombstone-style key.
  4. RoaringBitmapCommands.cs:6 (using Garnet.common; flagged as unused): Disagreed (with evidence). RespMemoryWriter lives in Garnet.common, and removing the using produces 8 × CS0246: The type or namespace name 'RespMemoryWriter' could not be found errors across all NeedInitialUpdate / Updater / Reader signatures. Restored.
  5. RoaringBitmapObject.cs:29 (inconsistent default-ctor Size): Applied. Default ctor now sets this.Size = ObjectOverhead + bitmap.ByteSize, matching the deserialized constructor so freshly-created and round-tripped objects report identical memory baselines and ByteSize-based mutation deltas don't double-count the empty-bitmap baseline.
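
The struct-snapshot trick mentioned for RSetBit.NeedInitialUpdate relies on C# value-type copy semantics. Here is a minimal sketch, assuming a hypothetical ToyInput struct with a parse cursor; it is not Garnet's ObjectInput, whose fields are not shown in this thread.

```csharp
using System;

// Hypothetical stand-in for a mutable input struct with a parse cursor; not Garnet's ObjectInput.
public struct ToyInput
{
    public string[] Args;
    public int Cursor;

    // Returns the next argument and advances the cursor.
    public string Next() => Args[Cursor++];
}

public static class SnapshotValidation
{
    // Validates "offset value" on a copy so the caller's cursor is left untouched.
    // Assumes Args holds at least two entries.
    public static bool Validate(ref ToyInput input)
    {
        var validation = input; // struct assignment copies Cursor by value
        return uint.TryParse(validation.Next(), out _)
            && validation.Next() is "0" or "1";
    }
}

public static class SnapshotValidationDemo
{
    public static void Main()
    {
        var input = new ToyInput { Args = new[] { "7", "1" } };
        Console.WriteLine(SnapshotValidation.Validate(ref input)); // True
        Console.WriteLine(input.Cursor);                           // 0
    }
}
```

Only the cursor is duplicated here; the Args array is still shared by reference, so the pattern is safe as long as validation reads arguments without modifying them.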

All 27 data-structure tests + 14 RESP integration tests still pass:

Passed!  - Failed: 0, Passed: 43, Skipped: 0, Total: 43

@crprashant
Author

@microsoft-github-policy-service agree company="Microsoft"

@badrishc
Collaborator

Thanks for your contribution! This extension is interesting, but instead of putting it in main, we should place it in https://github.com/microsoft/garnet/tree/main/modules (where e.g., GarnetJSON is kept) so that it is not bundled by default in the server.

@badrishc
Collaborator

badrishc commented Apr 28, 2026

Also, main is closed to new features, so we would request that you retarget your PR to the dev (v2) branch.

@crprashant crprashant force-pushed the feature/issue-1270-roaring-bitmap branch from 02743e5 to 66af335 Compare April 29, 2026 02:45
@crprashant crprashant changed the base branch from main to dev April 29, 2026 02:46
@crprashant
Author

Thanks for the review! Both points addressed in the latest force-push:

  1. Moved extension into modules/RoaringBitmap/, mirroring GarnetJSON: new GarnetRoaringBitmap.csproj, new RoaringBitmapModule : ModuleBase entry point that registers the factory + R.SETBIT / R.GETBIT / R.BITCOUNT / R.BITPOS, namespace renamed Garnet.Extensions.RoaringBitmap → GarnetRoaringBitmap to avoid the namespace/class collision, wired into Garnet.slnx and test/Garnet.test/Garnet.test.csproj. Nothing in main/GarnetServer references it anymore, so it is no longer bundled by default.
  2. Retargeted PR base to dev. Branch was rebased onto current upstream/dev and the CustomObjectFunctions / CustomObjectBase overrides updated to the new scoped ReadOnlySpan<byte> and HeapMemorySize APIs.

Local validation: all 43 RoaringBitmap tests (29 data + 14 RESP) pass on net8.0, dotnet format is clean.

@crprashant crprashant force-pushed the feature/issue-1270-roaring-bitmap branch from 66af335 to dd3e2da Compare April 30, 2026 02:27
Comment thread modules/RoaringBitmap/GarnetRoaringBitmap.csproj
Comment thread modules/RoaringBitmap/RoaringBitmap.cs Outdated
Comment thread modules/RoaringBitmap/RoaringBitmap.cs Outdated
Comment thread modules/RoaringBitmap/RoaringBitmap.cs Outdated
Comment thread modules/RoaringBitmap/RoaringBitmap.cs Outdated
Comment thread test/Garnet.test/RespRoaringBitmapTests.cs
Comment thread test/Garnet.test/RoaringBitmapDataTests.cs Outdated
{
private static ReadOnlySpan<byte> ErrOffset => "ERR bit offset is not an unsigned 32-bit integer"u8;

public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer)
Contributor

This is more a question for @badrishc as I haven't played around with custom objects too much: is validation expected to go in NeedInitialUpdate like this?

It's unfortunate as it forces this read command to act like a write which will hurt throughput.

Author

Left as-is for now and added an explicit architectural note in code (above each affected command) flagging the question for @badrishc. The Reader path is also implemented as an existing-key fast-path, but NeedInitialUpdate is currently the only hook that can emit Redis-compatible missing-key responses (0 / -1) without nil semantics. Happy to switch if @badrishc confirms a cleaner Reader+miss override is planned.

Author

@badrishc — flagging this for your input.

We have R.GETBIT / R.BITCOUNT / R.BITPOS, all conceptually read-only. To return the Redis-compatible missing-key response (0 for GETBIT/BITCOUNT, -1 for BITPOS) without materializing an empty object, we currently register them as CommandType.ReadModifyWrite and emit the response from NeedInitialUpdate. The Reader path is only invoked when the key exists.

Is there an idiomatic way to express "read-only with typed missing-key response" in CustomObjectFunctions today, or is the NeedInitialUpdate hack the recommended pattern? If a cleaner Reader+miss override is on the roadmap, we'd happily refactor these three commands. As of 47d62b29b the Reader override is already in place as a fast path for the existing-key case.

/// </summary>
public sealed class RBitCount : CustomObjectFunctions
{
public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer)
Contributor

Similar Q for @badrishc (and again in RBitPos) - using NeedInitialUpdate for "missing" is messy; can this be phrased as a Reader op instead?

Author

Same note as above — Reader is now implemented on RGetBit, RBitCount, RBitPos as the existing-key fast-path; NeedInitialUpdate handles the typed missing-key response. Architectural comment added in-file pointing at @badrishc.

Author

@badrishc — same question for R.BITPOS specifically (missing key must return -1 per Redis semantics, not nil). Tagging you so this isn't lost; full context in the sibling thread above. Happy to refactor if CustomObjectFunctions gains a cleaner Reader-with-miss hook.

Comment thread website/docs/commands/roaring-bitmap.md Outdated
@crprashant
Author

Thank you for your reviews. Will address all of these in the next PR.

Addresses PR review feedback from @badrishc:
- Move the extension from main/GarnetServer/Extensions/RoaringBitmap to
  modules/RoaringBitmap so it isn't bundled by default (mirrors GarnetJSON).
- Retarget the PR to dev (companion change).

Implementation changes for the move:
- New modules/RoaringBitmap/GarnetRoaringBitmap.csproj (mirrors GarnetJSON.csproj,
  signs assembly, exposes InternalsVisibleTo Garnet.test).
- New RoaringBitmapModule : ModuleBase entry point that registers the
  factory and the four R.SETBIT/R.GETBIT/R.BITCOUNT/R.BITPOS commands.
- Renamed namespace Garnet.Extensions.RoaringBitmap -> GarnetRoaringBitmap
  to avoid the namespace/class collision with class RoaringBitmap.
- Updated CustomObjectFunctions overrides to dev-branch
  scoped ReadOnlySpan<byte> signatures for NeedInitialUpdate / Updater.
- Updated RoaringBitmapObject to dev-branch CustomObjectBase ctor and
  HeapMemorySize accounting.
- Wired the module into Garnet.slnx and Garnet.test.csproj.
- Tests still register via server.Register.NewCommand in [SetUp] (in-process),
  matching the existing custom-object test pattern.
- Updated StringKeyAndCustomObjectKey_AreSeparate to expect WRONGTYPE on the
  unified store on dev.
@crprashant crprashant force-pushed the feature/issue-1270-roaring-bitmap branch from dd3e2da to 47d62b2 Compare May 14, 2026 12:19
@crprashant
Author

Updated — all review comments addressed

Force-pushed 47d62b29b (rebased onto current dev from dd3e2dabb).

Highlights:

  • RoaringBitmap chunk store refactored from SortedDictionary<ushort, IContainer> to two parallel arrays (ushort[] highKeys + IContainer[] containers) with MemoryExtensions.BinarySearch. BitPos now skips to the first relevant chunk in O(log N).
  • Add / Remove / SetBit / GetBit / BitPos.bit migrated from int to bool (RESP int↔bool bridging stays in RoaringBitmapObject).
  • RoaringBitmap now implements IEnumerable<uint>.
  • ArrayContainer and BitmapContainer rewritten using Span<T>.IndexOfAnyExcept(0) / LastIndexOfAnyExcept(0), MemoryExtensions.BinarySearch, MemoryMarshal.AsBytes for bulk serialize, and Stream.ReadExactly for bulk deserialize. EnsureCapacity adopts the geometric 1,2,4,…,4096 growth pattern via BitOperations.RoundUpToPowerOf2.
  • All unreachable / illegal-state guards removed; impossible BitmapContainer.cardinality == 0 / cardinality == 1 branches gone.
  • Reader overrides added on RGetBit / RBitCount / RBitPos as the existing-key fast path; NeedInitialUpdate retained for typed missing-key responses (with an inline architectural note flagging the question for @badrishc).
  • var (IDE0007) sweep across all changed files; ArgumentNullException.ThrowIfNull swapped in everywhere; style nits resolved.
  • ContainerKind.Run removed (unused).
  • Microsoft.VisualStudio.Threading.Analyzers added to the module csproj.
  • Docs corrected — module is loaded manually via MODULE LOADCS, not wired into the default host.
  • New ConcurrentReadsWhileWrites_StayConsistent integration test (4 reader tasks running 5000 R.GETBIT/R.BITCOUNT ops concurrently with a bit-toggling writer).
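
The parallel-array lookup and the word scan from the highlights can be sketched as below. This is an illustration under assumptions, not the PR's code: ChunkSearch and its method names are hypothetical, and Array.BinarySearch is used here as a stand-in for the MemoryExtensions.BinarySearch call the PR mentions.

```csharp
using System;
using System.Numerics;

// Hypothetical helpers mirroring the techniques named above; not the PR's code.
public static class ChunkSearch
{
    // Index of the first chunk whose high key is >= target, or count if none.
    // highKeys[0..count) must be sorted ascending.
    public static int FirstChunkAtOrAfter(ushort[] highKeys, int count, ushort target)
    {
        var index = Array.BinarySearch(highKeys, 0, count, target);
        // A negative result is the bitwise complement of the insertion point.
        return index >= 0 ? index : ~index;
    }

    // Position of the lowest set bit across a bitmap container's words, or -1 if all are zero.
    public static int FirstSetBit(ReadOnlySpan<ulong> words)
    {
        var wordIndex = words.IndexOfAnyExcept(0UL); // first non-zero word
        if (wordIndex < 0) return -1;
        return wordIndex * 64 + BitOperations.TrailingZeroCount(words[wordIndex]);
    }
}

public static class ChunkSearchDemo
{
    public static void Main()
    {
        var highKeys = new ushort[] { 1, 5, 9 };
        Console.WriteLine(ChunkSearch.FirstChunkAtOrAfter(highKeys, 3, 6)); // 2

        var words = new ulong[1024];
        words[2] = 1UL << 3;
        Console.WriteLine(ChunkSearch.FirstSetBit(words)); // 131
    }
}
```

A BitPos starting at a given offset can use the first helper to jump past every chunk below the offset's high 16 bits, then use a scan like the second inside the chunk it lands on.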

Build: 0 warnings, 0 errors.
Tests: 44/44 Roaring tests passing on both net8.0 and net10.0.

Two threads contain open questions for @badrishc about whether the read commands can be expressed as pure Reader ops (avoiding the ReadModifyWrite registration that NeedInitialUpdate requires for missing-key responses). I've replied to each thread inline and the code carries a matching architectural comment — happy to refactor if you confirm a cleaner pattern.

@crprashant
Author

@badrishc — two open architectural questions on this PR I'd love your read on. Both relate to how a read-only custom object command should emit a Redis-compatible missing-key response (0 / -1) without materializing an empty object:

TL;DR: the three read commands are currently registered as ReadModifyWrite because NeedInitialUpdate is the only hook that lets us return a typed missing-key response. Reader is wired in as the existing-key fast path. Threads are left unresolved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@kevin-montrose kevin-montrose changed the base branch from dev to main May 21, 2026 15:56
@kevin-montrose
Contributor

On a second review of this, we have concerns about the read commands being registered as RMW. That implies reads will proceed sequentially rather than in parallel.

Unfortunately, there's no existing way to customize the RESP response of a custom read command today; you always get a null.

I'm going to take a look at fixing that in general (and moving existing extensions to use it if necessary). Once that's merged you can update this PR to use the new infra and catch up to main (now that dev has been merged in we no longer need to target dev). I'll try and get that done tomorrow, or shortly after the holiday weekend.

@crprashant
Author

Thank you. Sounds good, @kevin-montrose. I will look for updates on this from you the following week.

@kevin-montrose
Contributor

Infra changes for this are pending review here: #1820

Development

Successfully merging this pull request may close these issues.

Bitmap compression

5 participants