Validity of references: Memory-related properties #77
RalfJung added the "active discussion topic" and "topic-validity" labels on Jan 10, 2019.
Notice that there is an alternative that would make validity not depend on memory, while maintaining the With this approach, the properties of |
Concerning the second question, my personal thinking is that we should not require the pointed-to memory to be valid itself. One good argument for making things UB with a strict validity invariant is bug-checking tools, but in this case actually doing recursive checking of references all the time is really costly, and probably makes it near impossible to actually develop useful tools. On the other hand, it is very useful when writing unsafe code to be able to pass around a Some examples: |
This was referenced Jan 22, 2019
RalfJung referenced this issue on Jan 31, 2019: "RFC for a formalized notion on where to enforce reference propertes in MIR" #2631 (closed)
So @arielb1, for example, has traditionally maintained that having I am also intrigued by this comment from @RalfJung :
That seems like a very good property to have. I am inclined to pursue this approach, personally.
nagisa commented Feb 4, 2019
I (personally) think that not considering
```rust
fn generic<T>(foo: &T) {
    // body
}
```
it is way too easy to end up with something that will most likely indefinitely stay UB for
arielb1 commented Feb 4, 2019
Could you come up with such an example - that is UB for |
nagisa commented Feb 4, 2019
@arielb1 I do not think I’m able to come up with an example (is there one?) where it would not be I realized since I last wrote the comment that, in order to obtain |
We talked about this at the all-hands. @cramertj expressed interest in
```rust
fn foo<T>(x: &!) -> T { match x { } }
```
Even in unsafe code, the match can never cause issues on its own: the reference would already be invalid and hence you'd have UB earlier. I believe we should handle all types consistently, meaning that if

One issue with this is that this makes validity very hard to check for in a UB checker like Miri, or in a valgrind tool. You'd have to do a recursive walk following all the pointers. Also, it is unclear how much optimizations benefit from this (beyond removing dead code for
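The `foo` function above takes `&!`, which needs the unstable never type to compile. For experimenting on stable Rust, a rough analogue can be written with an empty enum. This is only a sketch: `Void`, `absurd`, and `describe` are hypothetical names that do not appear in this thread, and the match goes through `*x` because stable Rust, as far as I know, rejects an empty match directly on `&Void`.

```rust
/// Stand-in for `!` on stable Rust: an enum with no variants has no values.
/// (Hypothetical name, not from the thread.)
enum Void {}

/// Stable analogue of `foo`: the match has no arms because `Void` has no
/// variants, so the function can claim to return any `T`.
fn absurd<T>(x: &Void) -> T {
    match *x {}
}

/// Safe code can talk about "maybe a reference to Void",
/// but the only value it can ever construct is `None`.
fn describe(x: Option<&Void>) -> &'static str {
    match x {
        Some(v) => absurd(v),
        None => "no reference to Void exists",
    }
}

fn main() {
    println!("{}", describe(None));
}
```

Whether the `Some` arm may be removed as dead code is exactly the question of whether `&Void` (or `&!`) counts as uninhabited.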
I don't understand what you are saying here. Making |
Also one thing @Centril brought up at the all-hands: we need more data. In particular, we should figure out if there are interesting patterns of unsafe code that rely on having references to invalid data, and that would be too disruptive to convert to raw pointers or too widely used to break.
One issue with requiring references to be transitively valid: we have a whole bunch of existing reference-based APIs, such as for slices, that we could then not use. I expect this to cause a lot of trouble with existing code, but I am not sure. Another proposal for references that enables @cramertj's optimizations could be: if reference's validity depends on memory in complex ways, we will need a notion of "bitstring validity". (Avoiding that is one argument for shallow validity, IMO.) We could define validity of a reference to require that the pointee is bitstring valid. This makes checking validity feasible and enables some optimizations. However, it would mean that |
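To make "bitstring validity" a bit more concrete, here is a minimal sketch. It assumes the term means that the pointee's own bytes encode a valid value of the type, checked without following any pointer; that reading is an assumption, and all function names are hypothetical.

```rust
/// Hypothetical helper: is this byte a valid `bool`? Only 0 and 1 are.
fn bool_bits_valid(byte: u8) -> bool {
    byte <= 1
}

/// Hypothetical helper: are these four bytes a valid `char`?
/// `char::from_u32` rejects surrogates and values above 0x10FFFF.
fn char_bits_valid(bytes: [u8; 4]) -> bool {
    char::from_u32(u32::from_ne_bytes(bytes)).is_some()
}

/// Hypothetical shallow check for a pointee made of a `bool` and a `char`:
/// it inspects only the pointee's own bytes and never dereferences anything.
fn pair_bits_valid(flag: u8, ch: [u8; 4]) -> bool {
    bool_bits_valid(flag) && char_bits_valid(ch)
}

fn main() {
    let a = ('a' as u32).to_ne_bytes();
    println!("bool byte 2 valid: {}", bool_bits_valid(2));
    println!("(1, 'a') valid: {}", pair_bits_valid(1, a));
}
```

A check of this shape costs time proportional to the size of the pointee, in contrast to the recursive walk that transitive validity would need.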
CAD97 commented Mar 13, 2019
Another data point is rust-lang/rfcs#2645 (FCP-merge) which theoretically will allow transmuting between I'm in favor of the validity invariant of |
@CAD97 that RFC is not necessary for the transmute you mentioned -- it is only needed if we want to pass things by value. In memory, |
RalfJung commented Jan 10, 2019 (edited)
Discussing the memory-related properties of references: does `&T` have to point to allocated memory (with at least `size_of::<T>()` bytes being allocated)? If yes, does the memory have to contain data that satisfies the validity invariant of `T`?

If the answer to both of these questions is "yes", one consequence is that `&!` is uninhabited: there is no valid reference of type `&!`.

Currently, during LLVM lowering, we add a "dereferenceable" attribute to references, indicating that the answer to the first question should be "yes". This is unusual: it is the only case where validity depends on the contents of memory. This opens some new, interesting questions:

- I mentioned above that `size_of::<T>()` many bytes need to be dereferenceable. How do we handle unsized types? We could determine the size according to the metadata and the type of the unsized tail. For slices, that's really easy, but for trait objects this involves the vtable, so it would introduce yet another kind of dependency of validity on the memory. However, vtables must not be modified, and they are never deallocated (right?), so this is a fairly weak form of dependency where if a pointer was a valid vtable pointer once, then it always will be.

  With more exotic forms of unsized types, this becomes less easy. `extern type` we can mostly ignore: we cannot even dynamically know their size, so we can basically just assume it is 0 and check dereferenceability for that. But what about custom DST? I don't think we want to make validity depend on executing arbitrary user-defined code. We could just check validity for the sized prefix of this unsized type, but that would introduce an inconsistency between primitive DST and user-defined custom DST. Is that a problem?

  For unsized types, even the requirement that the pointer be well-aligned becomes subtle, because determining alignment has similar issues to determining the size.

- What about validity of `ManuallyDrop<&T>`? `ManuallyDrop<T>` certainly shares all the bit-level properties of `T`, because we perform layout optimization on it. But does `ManuallyDrop<&T>` have to be dereferenceable?
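Both questions can be poked at with the standard library. The sketch below shows how the pointee size and alignment of a slice or trait object are derived from the pointer's metadata, and that `ManuallyDrop<&T>` keeps the non-null niche of `&T` on current compilers. The helper names are hypothetical, and the niche observation describes what rustc does today; I am not claiming it is a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, size_of, size_of_val, ManuallyDrop};

/// Size of a slice pointee: element size times the length stored in the fat pointer.
fn slice_pointee_size(s: &[u32]) -> usize {
    size_of_val(s)
}

/// Size and alignment of a trait-object pointee: both are read from the vtable.
fn dyn_pointee_layout(d: &dyn Debug) -> (usize, usize) {
    (size_of_val(d), align_of_val(d))
}

fn main() {
    let xs = [1u32, 2, 3];
    println!("slice pointee: {} bytes", slice_pointee_size(&xs));

    let v: u64 = 7;
    let (size, align) = dyn_pointee_layout(&v);
    println!("dyn pointee: {} bytes, align {}", size, align);

    // If ManuallyDrop keeps the non-null niche of `&T`, Option adds no extra space.
    println!(
        "Option<ManuallyDrop<&u8>>: {} bytes, &u8: {} bytes",
        size_of::<Option<ManuallyDrop<&u8>>>(),
        size_of::<&u8>()
    );
}
```

Note that `size_of_val` on a trait object has to read the vtable, which is the memory dependency described in the first bullet.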