Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement roaring64_bitmap_internal_validate #588

Merged
merged 3 commits into from
Feb 8, 2024
Merged

Conversation

Dr-Emann
Copy link
Member

@Dr-Emann Dr-Emann commented Feb 5, 2024

Add calls to validate the bitmap around the unit tests

@Dr-Emann Dr-Emann requested a review from SLieve February 5, 2024 06:12
@lemire
Copy link
Member

lemire commented Feb 5, 2024

Either here or as part of another PR, we should add this to our fuzzer...

https://github.com/RoaringBitmap/CRoaring/blob/master/fuzz/croaring_fuzzer.c

tests/roaring64_unit.cpp Show resolved Hide resolved
Comment on lines +298 to +301
* Returns true if the bitmap is consistent. It may be useful to call this
* after deserializing bitmaps from untrusted sources. If
* roaring64_bitmap_internal_validate returns true, then the bitmap is
* consistent and can be trusted not to cause crashes or memory corruption.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given what the docs describe, I feel that there is some slight mismatch with what we do in roaring64_bitmap_portable_deserialize_safe, because there we construct 32-bit roaring bitmaps and, importantly, don't call internal_validate on them. Not sure what the best solution is here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I think the way we interact with those bitmaps is safe (at least for now):

  • we don't do anything with the containers themselves
  • roaring_bitmap_portable_deserialize_safe will always return a consistent count of containers (if the bitmap says it has N containers, there should be N keys, typecodes, containers)
  • we don't (currently) rely on the container keys being ordered
    • This does mean we could create a valid roaring64 bitmap from an invalid serialization (if a sub-bitmap serialization had unsorted keys, we'll end up with them in the right place because we insert them individually into the ART)
  • freeing the bitmap must always be safe

It does mean that, like roaring_bitmap_portable_deserialize_safe, despite the name containing "safe", it's absolutely possible to get a segfault interacting with the returned roaring64 bitmap if you feed it junk unless you call validate on it (or have external proof that the input is validly the output of serializing a roaring64 bitmap)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair assessment, we can leave the wording as-is then.

include/roaring/art/art.h Outdated Show resolved Hide resolved
src/art/art.c Outdated Show resolved Hide resolved
include/roaring/art/art.h Outdated Show resolved Hide resolved
Add calls to validate the bitmap around the unit tests
That's technically undefined behaviour, and shows up in ubsan
@Dr-Emann Dr-Emann merged commit e1e2359 into master Feb 8, 2024
36 checks passed
@Dr-Emann Dr-Emann deleted the r64_internal_validate branch February 8, 2024 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants