Add roaring_bitmap_portable_deserialize_frozen #421
I've addressed the above warnings in f2db5d3. The build failure is related to unaligned access, which I can replicate locally when building with |
@andreas Can you write a comment as part of the function definition that this function may execute unaligned memory accesses? |
It would make a lot of sense to add Do something like this...
#if defined(__GNUC__) || defined(__clang__)
#define ALLOW_UNDEFINED __attribute__((no_sanitize("undefined")))
#else
#define ALLOW_UNDEFINED
#endif
in portability.h and then right before your function just add Note that this is only sane if you have documented the fact that the function does unaligned accesses. Basically, we are pushing the responsibility onto the user. There are systems where an unaligned access would crash. It is also possible for the compiler to make assumptions about alignment and crash things, so we expect the user of the library to check that it does not crash. |
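As a minimal sketch of how such an annotation might be applied, the block below defines the macro from the comment above and places it before a helper that reads a 32-bit value through a cast pointer. The helper names (`read_u32_maybe_unaligned`, `read_u32_portable`, `unaligned_roundtrip_matches`) are hypothetical and not part of CRoaring, and placing the macro directly before the function definition is an assumption. The cast version is deliberately the pattern under discussion: it is undefined behavior on a misaligned pointer and is only expected to work on platforms that tolerate unaligned loads. The `memcpy` version is shown for comparison as the portable alternative.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#define ALLOW_UNDEFINED __attribute__((no_sanitize("undefined")))
#else
#define ALLOW_UNDEFINED
#endif

/* Hypothetical helper, not a CRoaring function.
 * WARNING: may execute an unaligned memory access. The annotation only
 * silences the sanitizer; it does not make the access safe on platforms
 * that fault on misaligned loads. */
ALLOW_UNDEFINED
uint32_t read_u32_maybe_unaligned(const char *p) {
    return *(const uint32_t *)p;
}

/* Portable alternative: memcpy has no alignment requirement. */
uint32_t read_u32_portable(const char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Stores `value` at an odd offset in a heap buffer and reads it back
 * with both helpers. Returns 1 if both reads match, 0 otherwise. */
int unaligned_roundtrip_matches(uint32_t value) {
    char *buf = malloc(sizeof(uint32_t) + 1);
    if (buf == NULL) return 0;
    memcpy(buf + 1, &value, sizeof value);
    int ok = read_u32_maybe_unaligned(buf + 1) == value &&
             read_u32_portable(buf + 1) == value;
    free(buf);
    return ok;
}
```

On x86 both helpers should return the same value; the point of the comment on the first helper is that callers on stricter platforms have been warned.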
Right, so we are getting these warnings which are expected...
|
Let us run the tests. |
Added in 2b2d5a9.
I've defined
Note that this is a new occurrence of the error. Basically it seems like I need to add |
I think so, yes. Sadly, this might require the ALLOW_UNALIGNED to spread further than we'd like. There is one more problem. Under Visual Studio, in debug mode, this may also fail... but they have an |
Sorry, we also need to change And we would need to do something like...
and then we will need to add |
This is damn annoying, I realize it. It is more work than you'd think. :-) |
Np, I'll merge It looks like |
With the 10 https://gist.github.com/andreas/e7ad3a03eeaa1a30e270785699dd7588 Additionally, there would need to be annotations for Visual Studio. I'm wondering if we're better off just excluding the test from the sanitize builds? |
Push your changes into this PR. If you have concerns, just open a second PR. I can examine Visual Studio manually if needed. We want to test with sanitizers because our users might do so. We cannot prevent folks from running our tests with sanitizers. We don’t want to rely on the build system because we don’t know how CRoaring is built. Not everyone uses CMake. |
Got it. Merged |
@andreas Let us see what CI does. |
Merging. This is great work. |
I sent a PR to add |
This PR adds the function `roaring_bitmap_t *roaring_bitmap_portable_deserialize_frozen(const char *buf)`, as discussed in #352. This allows deserializing a buffer in the portable format into a frozen bitmap, which is faster than regular deserialization with `roaring_bitmap_portable_deserialize`.

The implementation is a combination of `roaring_bitmap_frozen_view` and `ra_portable_deserialize`. The function makes a single call to `malloc` and uses the "arena allocator" approach from `roaring_bitmap_frozen_view`. Data for the containers points directly into the provided buffer (`buf`). I've included unit tests adapted from the tests for `roaring_bitmap_portable_deserialize`.

To gauge the performance impact, I've added deserialization to the `real_bitmaps_benchmark` benchmark. The chart below shows the number of cycles spent deserializing all bitmaps in each dataset using `roaring_bitmap_portable_deserialize` (portable), `roaring_bitmap_portable_deserialize_frozen` (portable frozen), and `roaring_bitmap_frozen_view` (frozen view). Seemingly, portable frozen is very competitive with frozen view, while sticking to the standardised format 🙂

In terms of stylistic choices, variable names are in snake case and attempt to use the terminology from the format spec (e.g. `run_flag_bitset` instead of `bitmapOfRunContainers`).

Lastly, I should caveat this by saying I very rarely program in C, so apologies in advance for stupid mistakes or oversights 😅 I've only tested this on a mac x86.
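The "single `malloc` plus arena" idea described above can be sketched with a toy format. Everything in the block below is hypothetical and only illustrates the technique: `toy_view_t`, `toy_frozen_t`, `toy_deserialize_frozen`, and the buffer layout (a `uint32_t` count, then one `uint32_t` length per entry, then the payloads back to back, all in native byte order) are assumptions made for the example. They are not CRoaring's structures, and the layout is not the portable serialization format.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types, not CRoaring's real structures. */
typedef struct {
    const char *data;   /* points into the caller's buffer, not owned */
    uint32_t    length; /* payload length in bytes */
} toy_view_t;

typedef struct {
    uint32_t    count;  /* number of views */
    toy_view_t *views;  /* lives in the same allocation as this struct */
} toy_frozen_t;

/* Builds a "frozen" object over buf with exactly one malloc call.
 * The result borrows buf: buf must outlive the returned object.
 * A single free() on the returned pointer releases everything. */
toy_frozen_t *toy_deserialize_frozen(const char *buf) {
    uint32_t count;
    memcpy(&count, buf, sizeof count); /* header may be unaligned */

    /* One arena: the header struct followed by the view descriptors. */
    size_t total = sizeof(toy_frozen_t) + (size_t)count * sizeof(toy_view_t);
    char *arena = malloc(total);
    if (arena == NULL) return NULL;

    /* Carve the arena by bumping a pointer past each piece. */
    toy_frozen_t *out = (toy_frozen_t *)arena;
    arena += sizeof(toy_frozen_t);
    out->count = count;
    out->views = (toy_view_t *)arena;

    const char *lengths = buf + sizeof(uint32_t);
    const char *payload = lengths + (size_t)count * sizeof(uint32_t);
    for (uint32_t i = 0; i < count; i++) {
        uint32_t len;
        memcpy(&len, lengths + (size_t)i * sizeof(uint32_t), sizeof len);
        out->views[i].data = payload;   /* no copy: view into buf */
        out->views[i].length = len;
        payload += len;
    }
    return out;
}
```

The trade-off is the one the description implies: no payload bytes are copied and cleanup is a single `free`, but the returned object is only valid while the original buffer stays alive and unmodified.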