-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement compatible bitfield #466
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really cool!
I haven't looked too much into the Go implementation but I'm curious what the exact trade-off is for flushing the bitfield on operations that otherwise only need read-only access. Surely something like a single get
call on an RLE+-encoded bitvec can be done more efficiently than decoding it first. So I suppose the idea is that you usually do more than one get
(or other operations that are more efficient on decoded bitvecs than encoded ones), and at some point it pays off to decode it up front?
We could always expose the flush
operation and have the user of the type decide whether to do this or not (if they have a mutable reference to it). Then we can make get
/first
/count
etc work on shared references as well.
Lookups are O(1) after the first flush, instead of checking the set and unset hashsets as well as iterating over the RLE encoded bits.
Yes, I was planning on doing this in a seperate issue which would include benchmarking, but you can iterate over the RLE encoded bits to see if the index bit is set (that is the only other thing preventing) but waiting for benchmarking because checking the hash of the index against the set and unset cache as well as iterating over the RLE bits isn't trivial in extreme cases
Yeah, that's the idea. I don't particularly like it, but it solves our need and matches the go impl (and where it would error, which is important)
Yes, that was what I wanted to do, but this being drop in functionality for the specs-actors, invariants of different error handling and inconsistent functionality is not something worth this change right now, especially since this will most likely be refactored. The main goal is to match functionality and have a consistent API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be good enough for now
Applied f94217d @timvermeulen @AshantiMutinta Sorry to do after your review, but I liked Ashanti's suggestion and it was a very small change and not functional difference to make in another PR |
@austinabell Oh I missed that but lgtm 🙂 |
Summary of changes
Changes introduced in this pull request:
Was previously setup because it was optimistic that a basic bitvec would be sufficient and the API would have all the necessary functionality. Because of inconsistencies in how the bitvectors are decoded and when they error, this functionality was setup.
It doesn't match the go impl 1:1 because to match their iterators is quite tedious and going to PR the functionality in first and I'll open an issue to benchmark and optimize. There are some benefits to having it the way this is rather than those iterators, so will wait for benchmarks to draw conclusions. (This impl does less encoding/decoding but requires more memory in an actual environment probably)
cc @dutterbutter let me know if this interface fits your needs, happy to tweak things if needed. I sometimes require mutable reference and flush to avoid inefficiencies and unnecessary decoding but I can change if needed.
Reference issue to close (if applicable)
Closes
Other information and links