Elaborate on the invariants for references-to-slices #121965
Given that @rustbot label +I-lang-nominated
That sounds like we should go through t-opsem FCP, maybe joint with t-lang. @scottmcm could you spell out the motivation for why it should be a validity invariant? The authoritative location for validity invariants is the Reference ("behavior considered undefined"), so could you prepare an accompanying reference PR?
/// `isize::MAX / size_of::<E>()`. (Raw pointers may have longer lengths, but
/// references must not. For example, compare the documentation of
/// [`ptr::slice_from_raw_parts`](ptr/fn.slice_from_raw_parts.html) and
/// [`slice::from_raw_parts`](slice/fn.from_raw_parts.html).)
Is this meant to apply to all types with slice tails, or deliberately restricted to only slices and `str`?
In `&(i32, [i32])`, is the max length reduced by one to make sure that the entire type always fits into `isize`?
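To make the arithmetic in this question concrete, here is a minimal sketch. It assumes the rule is phrased as "the whole object must fit in `isize::MAX` bytes". `WithTail` and `max_tail_len` are hypothetical names used only for illustration, and a generic struct stands in for `(i32, [i32])` because it can be built by unsized coercion on stable Rust.

```rust
use std::mem::size_of_val;

/// Hypothetical stand-in for `(i32, [i32])`: an `i32` header followed by a slice tail.
#[allow(dead_code)]
#[repr(C)]
struct WithTail<T: ?Sized> {
    head: i32,
    tail: T,
}

/// Largest tail length for which `prefix_bytes + len * elem_bytes` stays within
/// `isize::MAX`. Assumes `elem_bytes > 0` and no trailing padding (true here,
/// since the header and the elements have the same size and alignment).
fn max_tail_len(prefix_bytes: usize, elem_bytes: usize) -> usize {
    (isize::MAX as usize - prefix_bytes) / elem_bytes
}

fn main() {
    // Unsized coercion from `WithTail<[i32; 3]>` to `WithTail<[i32]>`.
    let r: &WithTail<[i32]> = &WithTail { head: 0, tail: [1, 2, 3] };
    // 4 bytes of header plus 3 * 4 bytes of tail.
    assert_eq!(size_of_val(r), 16);

    let plain = max_tail_len(0, 4); // bound for a plain `&[i32]`
    let tailed = max_tail_len(4, 4); // bound once a 4-byte header is present
    assert_eq!(tailed, plain - 1);
    println!("plain slice: {plain}, with header: {tailed}");
}
```

Under that assumption the header does cost exactly one element of maximum length in this case; other header sizes or alignments would shift the bound differently.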
Does this affect the design of the unstable
See also: #117474
Add `assume`s to slice length calls Since `.len()` on slices is safe, let's see how this impacts things vs what we could do with rust-lang#121965 and LLVM 19
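For readers unfamiliar with the idea, the following is a rough sketch of what "adding an assume to a length call" could look like in user code. It is not the implementation from the linked change: `len_with_bound` and `byte_len` are hypothetical names, and `std::hint::unreachable_unchecked` is used here as a stand-in for the compiler-internal `assume`.

```rust
use std::hint::unreachable_unchecked;
use std::mem::size_of;

/// Hypothetical wrapper: returns `s.len()` after telling the optimizer
/// about the upper bound implied by the slice safety rules.
fn len_with_bound<T>(s: &[T]) -> usize {
    let len = s.len();
    let elem = size_of::<T>();
    // Zero-sized element types impose no bound on the length.
    if elem != 0 && len > isize::MAX as usize / elem {
        // SAFETY: a `&[T]` never spans more than `isize::MAX` bytes,
        // so this branch cannot be reached for a slice obtained safely.
        unsafe { unreachable_unchecked() }
    }
    len
}

/// A caller that could benefit: with the bound known, the optimizer can
/// see that `len * size_of::<T>()` does not overflow.
fn byte_len<T>(s: &[T]) -> usize {
    len_with_bound(s) * size_of::<T>()
}

fn main() {
    assert_eq!(len_with_bound(&[1u64, 2, 3]), 3);
    assert_eq!(byte_len(&[1u64, 2, 3]), 24);
    assert_eq!(len_with_bound(&[(); 5]), 5);
}
```

Whether such a hint actually pays off depends on the surrounding code and the LLVM version, which is the question the experiment above sets out to measure.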
I don't think so, as this is only an invariant for references, so
Can you elaborate on the complications it would cause? I don't know how to judge what is or isn't hard in there. I'd assumed that it wouldn't be substantially harder to deal with in opsem than how references have different validity invariants on the address+provenance part from pointers already.

(Talking about the pointee I think I understand how it's harder to enforce/check something, from the other conversations about that, but since the metadata isn't behind the pointer and thus it's right there to see when doing a typed copy, I didn't imagine it being substantially harder than checking things like the alignment of the pointer that's also reference-only, not applicable for pointers.)

I think the only reason I have to make it strictly a validity invariant would be for niches. That would be a better version of what `RawVec` is currently doing by hand, for example, with rust/library/alloc/src/raw_vec.rs Lines 36 to 40 in 9b8d12c
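As a rough illustration of the niche point (this is not the `RawVec` code referenced above), the sketch below shows the niche that references already provide through the non-null data pointer, and that raw slice pointers lack. `option_is_free` is a hypothetical helper, and the asserted sizes reflect current rustc behavior rather than a documented guarantee for every type shown.

```rust
use std::mem::size_of;

/// Hypothetical helper: does wrapping `T` in `Option` cost no extra space?
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // The data pointer of a reference is never null, so `None` can reuse
    // that bit pattern and `Option<&[u8]>` stays two words wide.
    assert!(option_is_free::<&[u8]>());

    // A raw slice pointer may be null and may carry any length, so there
    // is no spare bit pattern and `Option` has to grow.
    assert!(!option_is_free::<*const [u8]>());

    // Printed rather than asserted: layout beyond the single null niche
    // is an implementation detail.
    println!(
        "Option<Option<&[u8]>> takes {} bytes",
        size_of::<Option<Option<&[u8]>>>()
    );
}
```

A validity invariant on the length would, in principle, give the compiler many more invalid bit patterns to work with than the single null pointer; whether and how they would be used is not settled in this thread.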
Today the very-common
Otherwise it's mostly convenience. Telling LLVM about the range restrictions is actually helpful, even just putting it on length checks, but also very expensive today -- both demonstrated in #122926
I'd love to just make it a validity invariant we can put on all
Hmm, that's a very good question. I guess everything with a slice tail would be the most consistent, and thus really does need to be phrased in terms of the implied size of the object. (With the element count rule being the simple consequent case for plain slices.)
It means we can't view metadata as "just a type" and check the metadata field as if it were a field of some type. Instead we have to view it as an inherent part of the pointer that cannot be described separately. (Or there are two separate types, one for 'metadata of reference' and one for 'metadata of raw pointer'.) We also get more degrees of freedom as we have to separately define the invariant for references and raw pointers.

It's not a big deal, so if there are clear benefits I don't think this should stop us. But we should clearly motivate breaking this symmetry. In other words, it would have been nice to say that the validity of the metadata field of a pointer/reference to
Someone should diff the optimized IR for the compiler or some other large project before and after the PR that adds the assumes to see what optimizations are derived. I didn't notice this demonstrated in the PR, but I suspect with the assumes, LLVM optimizes code like
The length limit on slices is clearly a safety invariant, and I'd like it to also be a validity invariant. With function parameter metadata making progress in LLVM, I'd really like to be able to use it when `&[_]` is passed as a scalar pair, in particular.

The documentation for references is cagey about what exactly is a validity invariant, so for now just elaborate on the consequences of the existing safety rules on slices -- the length restriction follows from the `size_of_val` restriction -- as a way to help discourage people from trying to violate them.

I also made the existing warning stronger, since I'm fairly sure it's already UB to violate at least the "references must be non-null" rule, rather than it just being that it "might be UB in the future".
cc @joshlf @RalfJung