I'm interested in being able to write code that looks something like this:

    // Takes the reference for its entire 'static lifetime and never uses it.
    fn drop_static_mut<T>(_r: &'static mut T) {
    }

    fn drop_then_write(r: &'static mut i32, p: *mut i32) {
        // `r` is reborrowed for 'static here, so it can never be used again.
        drop_static_mut(r);
        // Write to the same memory through the raw pointer.
        unsafe { *p = 5; }
    }

    pub fn main() {
        let r = Box::leak(Box::new(1i32));
        let p = r as *mut i32;
        drop_then_write(r, p);
    }

You can't actually drop mutable references in current Rust (any attempt to pass them to a function reborrows rather than moves), but drop_static_mut effectively accomplishes something very similar: because it borrows the &'static mut for the rest of its 'static lifetime and then lets the reborrow go out of scope, it proves that there will be no further use of that reference or of any reference it was reborrowed from (because they were all reborrowed in a way that can never end). Something similar can be done with any concrete lifetime 'a: a lifetimed equivalent of drop_static_mut that takes &'a mut T rather than &'static mut T proves that there will be no further use of that reference, or of any reference it was reborrowed from, until the end of lifetime 'a (which in turn means that any such references with a lifetime of 'a or shorter will never be used again).
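The lifetime-generic variant described above can be sketched roughly as follows. The names drop_mut and bump_then_drop are hypothetical and only for illustration, and the explicit type annotation is just one way (an assumption on my part) to make the borrow last for all of 'a. It uses only safe code, so it shows the borrow-checker side of the argument and nothing about protectors.

```rust
// Hypothetical lifetime-generic counterpart of `drop_static_mut`.
fn drop_mut<'a, T>(_r: &'a mut T) {}

// Uses `r`, then gives it up for the remainder of 'a.
fn bump_then_drop<'a>(r: &'a mut i32) -> i32 {
    *r += 1;
    let seen = *r;
    // The annotation ties `whole` to the full lifetime 'a, so `r` cannot be
    // used again inside this function.
    let whole: &'a mut i32 = r;
    drop_mut(whole);
    // *r = 0; // rejected by the compiler if uncommented
    seen
}
```

Once 'a has ended at the call site, the original variable is usable again, which matches the "until the end of lifetime 'a" wording.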
The motivation behind this is that I've been trying to produce a competitor to/evolution of allocator_api that contains almost no unsafe code, and was wondering what (from the Rust point of view) the type of an allocation from the system allocator would be. I eventually decided that it is an existing Rust type, &'static mut MaybeUninit<T>: it has almost all the same properties that a mutable reference does (the memory can only be accessed via the allocation and will not be read or written otherwise, which is one of the main behaviours of a mutable reference, and the MaybeUninit turns off any validity requirements), and if you use a view in which "freeing the reference gives it back to the memory allocator, which can later give the same reference out again as an allocation", the reference itself persists until the end of the program (thus 'static) – it's transferred back and forth between the program and memory allocator but its lifetime never actually ends.
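As a minimal sketch of that view of allocations, the code below uses a leaked Box as a stand-in for the system allocator. The functions alloc, init, and demo are hypothetical names, not part of allocator_api or any existing crate, and nothing here ever frees the memory.

```rust
use std::mem::MaybeUninit;

// Hypothetical stand-in for the system allocator: leaks a Box so that the
// allocation is represented as a `&'static mut MaybeUninit<T>`.
fn alloc<T: 'static>() -> &'static mut MaybeUninit<T> {
    Box::leak(Box::new(MaybeUninit::<T>::uninit()))
}

// Initialising the slot uses it up for 'static and yields an ordinary
// `&'static mut T`, all without unsafe code.
fn init<T: 'static>(slot: &'static mut MaybeUninit<T>, value: T) -> &'static mut T {
    slot.write(value)
}

// Allocate, initialise, and mutate through the resulting reference.
fn demo() -> i32 {
    let slot = alloc::<i32>();
    let r = init(slot, 7);
    *r += 1;
    *r
}
```

The interesting part is what this sketch leaves out: handing the reference back to the allocator, which is where the questions about dropping references arise.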
However, one problem with this is that some allocators need to be able to rearrange the allocations they give out, e.g. by coalescing two allocations and returning them as a single allocation. I'm looking into the possibility that they might be able to do this by "dropping" the references that represent the allocations they want to merge, and creating a new one in the memory they occupied (that they now know that they have full control of because the dropped reference, and everything that it was recursively reborrowed from, will provably never be used again).
As far as I can tell, the parts of the Stacked Borrows and Tree Borrows algorithms that simulate the Rust aspect of borrows are OK with this sort of code, because they don't consider mutable references that are never read or written again to apply their "you can't read or write this memory except through this reference" condition. However, the Rust compiler wants to be able to make use of LLVM's noalias assertion, which promises that the addresses covered by a pointer are, during the body of a function, never written to without using that pointer or a pointer derived from it – so they each add an extra rule, that references passed into a function are "protected" and don't allow conflicting writes even if they are never used again, in order to be compatible with noalias – and that "protection" rule causes code like the code I wrote above to currently be considered UB.
I'm therefore wondering whether it might make sense to weaken the definition of protectors (and, at the same time, weaken the definition of noalias in LLVM so that it can continue to be used to compile Rust code, or add an alternative metadatum that is like noalias but weaker). The weaker noalias definition I've been considering is "if a weaker-noalias pointer is passed as a function argument, the data it references will not be written, except via that pointer or a pointer based on it, until the last use of that pointer or pointers based on it" (the clause beginning "until the last use" replaces "until the end of the function"). In practice, LLVM mostly wants to use its "noalias" metadata to prove that, if it writes through a pointer, the value it writes will still be there the next time that the pointer is used, so I'm hoping that a change like this wouldn't lead to too many missed optimisations. (I can't think of any optimisations that would be missed in safe Rust code as a result, although it is possible that it might lead to some missed optimisations in C, if weaker-noalias is used to model restrict and the function has both restrict and non-restrict arguments.)
The alternative is likely making the system allocator "magical" by allowing it to break the noalias rules, but I don't like that from an opsem point of view (because it would mean that either you have opsem special cases, or you use intentional UB in the knowledge that the optimiser probably can't make use of it to do something undesired). This is the solution used by C, where essentially the same problem can occur in code that looks like this:

    #include <stdlib.h>  /* malloc, free */

    long *free_restrict_then_write(short *restrict p) {
        *p = 2;
        free(p);                          /* ends the lifetime of *p */
        long *q = malloc(sizeof (long));  /* may reuse the memory of *p */
        *q = 3;
        return q;
    }

It is possible that malloc will return memory that overlaps the memory that p occupied, meaning that the write through q breaks the restrict promise if you interpret it as being based on memory locations. In C, this code is legal because *p and *q are considered to be different objects even if they occupy the same memory: the free call ends the object's lifetime and the malloc call creates a new object with a new lifetime (and the restrict promise is based on objects rather than addresses). However, that means that the code can suddenly become UB if you replace malloc and free with non-standard-library functions that do the same thing (because then the object's lifetime may not have been ended properly). So I don't like the C solution to the same problem, because it doesn't compose very well. (That said, if there is magic available to make C compilers work, might there be a way to make it available to Rust programs more generally? Some sort of "noalias ends here" annotation would solve the problem in a fairly clean way, although it would have to be able to affect the entire stack of reference reborrows rather than being function-local.) Another problem with this approach is that it only works in the 'static case, not with general lifetimes 'a (which makes it hard to "chain" allocators, i.e. to have an "inside" allocator use an "outside" allocator to obtain the memory that it allocates from).
Do people have insights into what the best (non-undefined-behaviour) way to write this sort of code in current Rust is, or whether changes to Rust and/or LLVM might be useful to make such code possible to write?