-
Notifications
You must be signed in to change notification settings - Fork 113
feat: basic BoolBuffer / BoolBufferMut #2456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Can we call this BitBuffer? |
gatesn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we'll also want iterators and vectorized constructors (i.e. BooleanBuffer::collect_bool)
25fc4bd to
67729e2
Compare
|
Sadly |
|
Where are we at with this? Need someone to takeover while you poke at FFI bindings? |
|
Maybe? I'm no longer sure there's a strong path forward that gives us something better than Arrow's BooleanBufferBuilder. The big issue is that mapping a bitset on top of uninit memory causes UB when you try and set bits in-place. While it generally seems harmless it is technically a violation of Rust's memory model so not guaranteed to work across platforms. |
|
Why does the memory have to be uninitialized? Even the ability to into_mut and update bits is valuable from an API perspective. |
|
I will take over this pr |
454875c to
e26e316
Compare
e26e316 to
ceff11e
Compare
39b9c8a to
7f68ba7
Compare
|
This now compiles but there's still a lot of tests to fix |
aff4773 to
68d6261
Compare
|
I need to figure out all the slow downs - seems like arrows append_packed_range is faster than bitvec implementaion and forcing iterators to be owned adds a bunch of drops that could be avoided |
4d268af to
42eb48a
Compare
87ea4e1 to
7e1e2df
Compare
Codecov Report❌ Patch coverage is
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
5a48fba to
a16e4ca
Compare
Pulled from #2456 This separates out the logic of adding the BitBuffer from using it. There are some outstanding performance issues before we can pick up #2456 again. I added some benchmarks that highlight where we're slower than Arrow atm --------- Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Robert Kruszewski <github@robertk.io>
a16e4ca to
d93320a
Compare
|
Will make a new pr |
Replace usages of BooleanBuffer with our own BitBuffer. This lets us have our own backing Buffer for BitBuffer and implement additional optimisations for use in vortex. previously #2456 This pr requires resolving outstanding performance issues before merging 1. Make iterators cheaper to construct, likely can't be owned 2. Improve append_packed_range performance. Arrow has a function that is faster than bitvec crate --------- Signed-off-by: Robert Kruszewski <github@robertk.io> Signed-off-by: Nicholas Gates <nick@nickgates.com> Co-authored-by: Andrew Duffy <andrew@a10y.dev> Co-authored-by: Nicholas Gates <nick@nickgates.com>
It's very annoying to do random in-place setting of bits in Arrow BooleanBufferBuilder. It's very append focused. For something like SparseArray in particular, you want to create a BoolArray against uninitialized memory and only set the bits corresponding to positions of non-null values.