Skip to content

Conversation

@keegancsmith
Copy link
Member

No description provided.

Change-Id: Ib557078f05714a14b8ee36682db81c8fad4f929a
Tests still don't pass

Change-Id: Ic68811b64aa43324fda9ed9186a2154639f64d8b
Change-Id: I445385bc75a02a3b5d8aabde608d133d95a4fc01
@keegancsmith
Copy link
Member Author

@sourcegraph/core-services so this might not be better. I believe loading a roaring bitset requires scanning over the whole []buf representing it. This may actually be way more IO than the current delta encoding approach, which lazily delta decodes. So in the case of something with a lot of results, you will do way less IO. But in the case of something with few results, you may end up doing the same amount of IO.

So roaring bitsets may be inappropriate for posting lists to byte offsets. There is likely something better though. From a quick look at the roaring bitsets implementation, we can probably get away with something based on roaring bitsets which is optimized for streaming them, or we can look into some research.

@tsenart
Copy link
Contributor

tsenart commented Jun 28, 2019

I believe loading a roaring bitset requires scanning over the whole []buf representing it.

From: https://github.com/RoaringBitmap/RoaringFormatSpec#general-layout

The layout is designed so that random access to the data is possible without materializing the bitmap in memory while being storage efficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants