-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Access refactoring RFC #47
Comments
Okay, atomics are still required for this, to write data back once it's done with changing dangling bits, and the actual split should return two different AfterSplits: |
Okay so I've been thinking about this all day and you're right in spirit, though as your follow-up comments mention, less so in implementation. This is my fifth draft at a reply, so I'll try to be concise:
We can encode that a particular slice has an unaliased view to memory in its I've been working on doing exactly this. It's a lot of boilerplate, mostly in trait forwarding, but I think once done it should be able to make this work without severe external effects. Thank you for bringing this to my attention; indexed access is important to keep performant and this is a strong opportunity to do so. |
Wow it's only been a month? For some reason I thought this was filed earlier in January. Okay. So. This has been a nightmare of a feature to actually implement. Summary: This program extern crate bitvec;
use bitvec::prelude::*;
fn main() {
let mut raw = [0u8; 3];
let bits = raw.bits_mut::<Local>();
let (left, _right) = bits.split_at_mut(12);
match left.bit_domain_mut() {
BitDomainMut::Region { body, tail, .. } => {
body.set(4, true);
tail.set(2, true);
},
_ => unsafe{std::hint::unreachable_unchecked()},
}
} produces this output bitvec::main:
push rax
mov word, ptr, [rsp, +, 6], 0
mov byte, ptr, [rsp, +, 5], 16
lock or, byte, ptr, [rsp, +, 6], 4
pop rax
ret You can observe that this writes a zero into The default behavior has been moved back from atomic to ordinary accesses. The methods that cause aliasing of memory elements contaminate their entire slices with an aliasing marker, which causes all accesses through tainted slices to use atomic access, even when unneeded. This is because switching instructions once is faster than performing an alias detection check on every write. To untaint a slice, use the You now have full control over whether an access may be unaliased, or must be aliased. This feature is currently only available on branch I have added an example file, Thank you again for this issue. I am happy to have a community of users pushing I am probably going to stop working on it for a month or so once |
This behavior is now present in trunk. I will issue an |
This change reduces the computation done by `.next()` and `.next_back()` by representing the iteration region as a pair of boundary pointers. The behavior is analagous to the double-pointer iteration seen in `core::slice::Iter{,Mut}` and C++ iterators, so I will not explain it further here. The implementation uses `BitIdx` rather than `BitTail` for the ending marker, because the ending marker must be considered a live pointer one past the end of the region, rather than a dead pointer, in order to support `.next_back()` correctly. Instruction size of `.next()` is considerably reduced, and the runtime drops from 17ns to 8. Benchmarks: old: 2,634ns +/- 898 old_rev: 1,519ns +/- 256 new: 1,090ns +/- 246 new_rev: 1,095ns +/- 186
Summary
Rework current mutable slice split logic into non-atomic, where
bitslice.split_at_mut()
would return a tuple of(&mut self, AfterSplit)
. AfterSplit logic is explained lower.Motivation
I was recently thinking about simple and fast iteration logic based on std slice iterator with some bit tweaks, but such iterator should use default access. I simply don't like carrying atomic overhead everywhere, even if I can deal with it. Also, getting rid of atomics would simplify overall logic of this crate.
Guide-level explanation
On the high level AfterSplit (or whatever name it would get) is a sibling of BitSlice: it implements many traits and functions BitSlice implements (for example, it could be mutable split itself, and
after.split_at_mut()
produces(self, AfterSplit)
tuple). But it is a downgrade, and so it doesn't have any public constructor and some BitSlice functions aren't there.Reference-level explanation
On a lower level AfterSplit is a three words wide structure:
(owned_bits: Bitstore, slice: &mut Bitslice)
. The trick is simple: slice would always start with properly owned byte (if it's length is not zero, of course) because slices could be split only at machine word border, and the lone bitstore could only store$BITS - 1
bits at max. The latter is easy: split_mut() returns (self, ...), and "self" part there will be always greedy, so its last item will have at least one bit, and never zero (unless the whole slice has zero length, but in that case both slices have zero length). So, the BitStore part of AfterSplit always has one bit to show it's length, and that's enough (trailing_zeroes, leading_zeroes, wrapped up in an ordering trait function).Drawbacks
A lot of code and docs to write, three words wide instead of two, not a proper BitSlice, thus it could be limited in options.
Rationale and alternatives
I see three ways to deal with mutable splits: no split_mut() at all, atomics, or some kind of AfterSplit. The first way is very simple, but strongly hits functionality; the second way hits performance and now you should uphold the possible double access on most operations with slice, even if they use immutable access; the third way is a compromise: mutable splitting is allowed and doesn't interfere with other parts of library, but is somewhat unhandy for the user.
Prior art
The current atomic access logic crawls everywhere due to edge cases you should think about: for example, iterator over an immutable bitslice should use atomic access, because it could be called onto the product of split_mut(), which could be mutated right now (which is a very rare case, but you are forced to uphold it). Right now atomics are everywhere. This is good to emulate std slice logic, but BitSlice is not a common slice; nobody would treat utf-8 string as ascii, unless it is checked to be ascii. And so BitSlice doesn't need to do everything in the way std slices do. For example, "string".chars() iterator produce value itself, not a reference, and I haven't heard a single complaint about it. And so could work BitSlice immutable iterator: just give 'em the bool and forget about it.
Unresolved questions
Actually, there's just one question: should it be implemented at all, and if yes, how should it be implemented? Once it's resolved, there's a strait way to write code.
Future possibilities
After this, there would be an easy way to write fast value-producing iterator over an immutable bitslice, without thinking much about mutable access problems. Maybe, implementing some other things would become easier. I'm not very sure about it. But it's a move from copypasting common slices' logic, which could be good for performance, as bit slice is special and the fast way to do things is almost never the same as fast way with common slices.
And some things that are not related to this particular question, but related to fast iterations:
rotate_left
,rotate_right
,reverse_bits
(names are the same with numeric primitives' functions);rotate_forward
,rotate_backward
,from_ne_order
— this function takes BitStore with some bits and turns it into the required order (essentially a no-op for Lsb0 and single bit reverse for Msb0). Those functions could be easily implemented for two default orderings with functions from 1.I could have written PR with those things before that RFC, but there's no point in those functions before mutable split question is answered in some way.
The text was updated successfully, but these errors were encountered: