Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ref parameter incorrectly decorated with noalias attribute #63787

Closed
RalfJung opened this issue Aug 21, 2019 · 49 comments
Closed

Ref parameter incorrectly decorated with noalias attribute #63787

RalfJung opened this issue Aug 21, 2019 · 49 comments
Assignees
Labels
A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Aug 21, 2019

The following library:

use std::cell::*;

pub fn break_it(rc: &RefCell<i32>, r: Ref<'_, i32>) {
    // `r` has a shared reference, it is passed in as argument and hence
    // a barrier is added that marks this memory as read-only for the entire
    // duration of this function.
    drop(r);
    // *oops* here we can mutate that memory.
    *rc.borrow_mut() = 2;
}

generates IR for break_it that starts like

; playground::break_it
; Function Attrs: noinline nonlazybind uwtable
define void @_ZN10playground8break_it17hb8a9beeb2affda19E(
  { i64, i32 }* align 8 dereferenceable(16) %rc,
  i32* noalias nocapture readonly align 4 dereferenceable(4) %r.0,
  i64* nocapture align 8 dereferenceable(8) %r.1)
unnamed_addr #1 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {

Note the noalias. That is incorrect, this function violates the noalias assumptions by mutating the data r points to through a pointer (rc) not derived from c. Here's a caller causing bad aliasing:

pub fn main() {
    let rc = RefCell::new(0);
    break_it(&rc, rc.borrow())
}

RefMut has the same problem when -Zmutable-noalias is set.

Cc @rkruppe @comex

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

cc @eddyb

@bjorn3
Copy link
Member

bjorn3 commented Aug 21, 2019

Maybe pour an UnsafeCell into Ref instead of &T?

@RalfJung
Copy link
Member Author

It should probably use a raw pointer (something I have been arguing for for years now ;). For RefMut, UnsafeCell wouldn't even help.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

Changing Ref value field to value: &'b std::cell::UnsafeCell<T> fixes this.

@RalfJung
Copy link
Member Author

Also see #60076: vec_deque::Drain has a similar problem.

Changing Ref value field to value: &'b std::cell::UnsafeCell fixes this.

It does, but I think a raw pointer is the better fix.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

It does, but I think a raw pointer is the better fix.

Why?

The value field inside RefCell is of type UnsafeCell<T>, not T, so IMO Ref has a bug by using &T instead of &UnsafeCell<T>. Using a raw pointer (e.g. NonNull<T>) would mean that we lose the dereferenceable annotation.

@jonas-schievink jonas-schievink added A-codegen Area: Code generation C-bug Category: This is a bug. I-nominated I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 21, 2019
@comex
Copy link
Contributor

comex commented Aug 21, 2019

Proof of concept miscompilation: playground link

Edit: To be fair, it works based on the “a == b implies the compiler can replace b with a” optimization, which can be unsound all by itself in some cases involving pointer arithmetic or allocation/deallocation... but eh, this code does neither.

@eddyb
Copy link
Member

eddyb commented Aug 22, 2019

In rust-lang/unsafe-code-guidelines#125 (comment) I asked:

Is Ref having noalias readonly on the data reference a problem because, unlike a regular reference, it can be invalidated by Drop?

Could we then rely on the presence of a Drop impl to disable noalias readonly, or should Ref itself use a raw pointer or &UnsafeCell?

But it seems like the preferred solution is to change Ref to not contain a reference.

For RefMut, UnsafeCell wouldn't even help.

@RalfJung Why not? It would have to be &UnsafeCell, not &mut UnsafeCell, but I expect it to work.

@RalfJung
Copy link
Member Author

It would have to be &UnsafeCell

Hm okay that might work. That seems real ugly though.

Why?

IMO raw pointers are more honest, as laid out years ago. The lifetime in that reference is a lie.

@RalfJung RalfJung added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Aug 22, 2019
@RalfJung
Copy link
Member Author

Nominated for lang team as well.

@rust-lang/lang the interesting question from a lang team perspective here is to decide whether this is a library bug or a language bug. I.e., is the code wrong and the noalias right, or vice versa? Do we want non-repr(transparent)-structs containing references to be marked noalias?

@comex
Copy link
Contributor

comex commented Aug 22, 2019

Could we then rely on the presence of a Drop impl to disable noalias readonly, or should Ref itself use a raw pointer or &UnsafeCell?

Well, we have other types such as Box, Vec*, etc. that have Drop impls but still want noalias.

With such types, the pointer can be deallocated by Drop and later reused, but... for one thing, the deallocation function takes the pointer itself as an argument, which is treated as the pointer escaping. If the function makes a new allocation and get the same pointer value back, the new pointer can be said to be 'based on' the pointer it had deallocated – depending on the allocator implementation, it often actually is, and when it isn't, LLVM generally can't prove that.

More importantly, __rust_alloc has its return value declared noalias, so LLVM will always assume that newly allocated pointers don't alias other pointers, regardless of whether the other pointers are themselves noalias. This attribute can in fact lead to miscompilations: it's one of the issues that the 'twin allocation' paper tried to address. There might be a case for removing it. But as long as it's there, removing noalias from Box and co. wouldn't make a difference, regardless of the 'based on' question.

* Though Vec can't actually get noalias treatment, at least on x86, because the ABI passes it by a pointer. You know more about that than I do.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Aug 22, 2019

With such types, the pointer can be deallocated by Drop and later reused, but... for one thing, the deallocation function takes the pointer itself as an argument, which is treated as the pointer escaping. If the function makes a new allocation and get the same pointer value back, the new pointer can be said to be 'based on' the pointer it had deallocated – depending on the allocator implementation, it often actually is, and when it isn't, LLVM generally can't prove that.

I would argue for the opposite: if an allocation is freed and the underlying memory is recycled to satisfy another allocation later on, the later allocation should result in a pointer of different provenance from the one that was freed, regardless of the physical address reuse. Each new allocation should be considered a fresh allocation for the purposes of the abstract machine (to justify the different provenance). This is somewhat realized today by allocation functions returning noalias pointers, as you noted yourself.

This should not lead to any issues because all uses of the "old" allocation must happen before it is freed, and all uses of the new allocation must happen after it was allocated. If these two time spans don't overlap, everything is fine. It's a bit weird that pointers to the same memory address but with different provenance are floating around next to each other, but this can happen even in safe code with e.g. int-to-pointer casts or wrapping_offset. And if these two time spans do overlap (i.e., the new allocation is used before the old one is freed), then the new allocation is made before the old one is freed, so it can't reuse the same address.

Note that all of this is completely independent of allocator internals and data structures, which are neither relevant nor usually visible to optimizers.

Freeing an allocation to which one has a safe pointer or reference is problematic for other reasons, namely dereferenceable (see e.g. https://internals.rust-lang.org/t/is-it-possible-to-be-memory-safe-with-deallocated-self/8457/12), which we add not only to references but also to Box and other owned pointers. We do need to solve this somehow, but I don't see a similar issue for noalias.

More importantly, __rust_alloc has its return value declared noalias, so LLVM will always assume that newly allocated pointers don't alias other pointers, regardless of whether the other pointers are themselves noalias. This attribute can in fact lead to miscompilations: it's one of the issues that the 'twin allocation' paper tried to address. There might be a case for removing it. But as long as it's there, removing noalias from Box and co. wouldn't make a difference, regardless of the 'based on' question.

I wrote above why I think noalias on those functions is fine. I also don't recall seeing (and couldn't find with a quick scan just now) any mention of such an issue in the twin allocation paper. Leaving aside that noalias literally doesn't occur at all (IIRC it's out of scope / left for future work), what is there goes in quite the opposite direction: if my reading of figure 5 is correct, malloc always returns a fresh, logical pointer!

  • Though Vec can't actually get noalias treatment, at least on x86, because the ABI passes it by a pointer. You know more about that than I do.

FWIW this is more a consequence of deliberate choices in the Rust ABI(s) to pass structs of two scalars as those scalars but not do the same for structs with more fields. It's neither platform-specific nor otherwise externally-imposed. Also, if we had usable "local noalias" (e.g., via metadata on loads) then this wouldn't matter anyway.

@comex
Copy link
Contributor

comex commented Aug 22, 2019

Note that all of this is completely independent of allocator internals and data structures, which are neither relevant nor usually visible to optimizers.

They're visible if LTO is turned on, but unless the allocator is trivial, the optimizer is nowhere near smart enough to notice that newly allocated pointers might not be 'based on' the previously freed versions. However, assuming that the optimizer will never be smart enough to notice something is a risky strategy in the long term. :)

I wrote above why I think noalias on those functions is fine. I also don't recall seeing (and couldn't find with a quick scan just now) any mention of such an issue in the twin allocation paper. Leaving aside that noalias literally doesn't occur at all (IIRC it's out of scope / left for future work), what is there goes in quite the opposite direction: if my reading of figure 5 is correct, malloc always returns a fresh, logical pointer!

Well, it's a combination of two issues mentioned in the paper.

In 4.6 it discusses the problems with propagating pointer equalities in GVN – that is, in a code path where a == b must have previously evaluated to true, replacing a with b. This is the same optimization I mentioned in my proof-of-concept for the Ref issue earlier in this thread.

In 4.8, it discusses how newly allocated pointers can be equal to previously freed ones, albeit mostly in the context of wanting to optimize the comparison to false. However, one of the approaches it describes would be an alternative way to avoid this issue:

Another solution is to define comparisons with freed blocks as UB (as C and C++ standards do)

To be more concrete, here is an example 'miscompilation' (but not really, because it cheats) based on propagating pointer equalities across a deallocation and reallocation. Under the model proposed by the twin allocation paper, the miscompilation is the optimizer's fault rather than Rust's; I believe the authors' proof-of-concept LLVM modification would have prevented it from occurring, by disabling GVN for pointers outside of limited scenarios where it's known to be safe.
Playground link

The steps:

  • Start with a Box<i32> parameter, x. Store 2 to it.
  • Save x's pointer value as a raw pointer variable.
  • Deallocate x.
  • Obtain a new Box<i32>, y, from another function; it has value 4.
  • Assert that y's pointer value equals the previously saved one from x.
  • Load from y.

Since LLVM knows the new and old boxes' pointers are equal at the point of the load, it treats it as a load from the old box... and since the old box is noalias, it assumes that nobody else could have written to it after it stored a 2, so it optimizes the load to always return 2.

As I said, the example cheats. This optimization can only happen if the pointer doesn't escape from the function, and calling __rust_dealloc on it counts as it escaping. So instead of having the function free the Box directly, it uses a callback which reconstitutes the Box from a previously saved pointer and frees it – which is UB.

However, there are other circumstances where this could theoretically happen without UB. Suppose that instead of a Box, the function takes a reference-counted pointer to immutable data, and that pointer is hypothetically noalias. Rc won't work for this purpose: the pointer it contains cannot be noalias because it's used to access the reference count, so it's not fully immutable. But you could imagine a type where the reference count is stored separately from the data itself:

struct DetachedRc<T> {
    refcount: *mut i32,
    ptr: *const T,
}

In this case, ptr could hypothetically be legally marked as noalias (with some suitable wrapper type that doesn't currently exist), assuming that T does not contain UnsafeCell.

But, assuming DetachedRc's drop routine is inlined into the caller, it does not necessarily cause ptr to escape. As long as there's an additional reference, dropping it will only mutate the refcount, and then some other code can drop the last reference and actually free it. You could use this to construct an example similar to my cheating one but without the UB, yet still vulnerable to the unsound optimization.

(Some practical issues: (1) you would have to make LLVM somehow know statically that the victim function cannot drop the last reference, perhaps by asserting on the refcount, and (2) you'd be working with immutable data, whereas my PoC involves mutation; assert!(*x == 2); ought to work just as well as *x = 2; to let LLVM know that future dereferences should return 2, but it didn't seem to work when I tested it.)

@RalfJung
Copy link
Member Author

RalfJung commented Aug 23, 2019

Another argument for using raw pointers: not only noalias but also derefencable is wrong. Consider this example:

use std::cell::*;

pub fn break_it(rc: &RefCell<Option<Box<i32>>>, r: Ref<'_, i32>) {
    drop(r);
    *rc.borrow_mut() = None;
}

pub fn main() {
    let rc = RefCell::new(Some(Box::new(0)));
    break_it(&rc, Ref::map(rc.borrow(), |x| &**x.as_ref().unwrap()))
}

The memory r points to is deallocated while break_it still runs.

Of course, this could also be viewed as more evidence that UnsafeCell should inhibit dereferencable as well.


Could we then rely on the presence of a Drop impl to disable noalias readonly, or should Ref itself use a raw pointer or &UnsafeCell?

I don't think that would be sufficient in general. For example, consider a silly version of Ref that does not implement drop, but provides a manual "destroy" method. That would still be a sound data structure, but adding noalias / dereferencable would be just as wrong as with the real Ref.

One could say this is silly, but I am not sure interpreting "fuzzy" signals like "is there a Drop impl" for properties like this is a good idea. It also doesn't solve the dereferencable problem.

Btw, readonly is fine I think. That just means that no writes happen through this particular pointer, right?

@comex

More importantly, __rust_alloc has its return value declared noalias, so LLVM will always assume that newly allocated pointers don't alias other pointers, regardless of whether the other pointers are themselves noalias.

That's a very different noalias though -- its behavior in return position is not really the same as its behavior in argument position. Notably, the return position one has "infinite scope", whereas the argument position one only has an effect during the one function call.

Custom allocators are a very underexplored topic, and the LLVM "twin allocation" paper did not explore them either, that one assumed a built-in allocator. The paper does not even mention noalias. It is all about modelling allocators. LLVM uses noalias to indicate that a function behaves "allocator-like", but that doesn't mean that the "twin allocation" model is the right one for custom allocators, it might well only apply to the built-in one.

I don't think that pointer helps us determine how to model LLVM argument-position noalias, other than generally paving the way for a formal treatment of pointer provenance.

the deallocation function takes the pointer itself as an argument, which is treated as the pointer escaping

"escaping" is not an Abstract Machine operation, it is an approximation used by compiler. So it can never be used to argue for correctness.

@rkruppe

FWIW this is more a consequence of deliberate choices in the Rust ABI(s) to pass structs of two scalars as those scalars but not do the same for structs with more fields.

Does this mean what I wrote at #60076 (comment) is wrong? I saw a noalias for a vec_deque::Drain and concluded that it is also affected by this bug.


I have no idea why you are talking about custom allocators and return-position noalias, how is that relevant for the discussion?

@hanna-kruppe
Copy link
Contributor

Another argument for using raw pointers: not only noalias but also derefencable is wrong. Consider this example:

👍

Does this mean what I wrote at #60076 (comment) is wrong? I saw a noalias for a vec_deque::Drain and concluded that it is also affected by this bug.

Oh, yeah, the noalias that shows up in the IR when passing vec_deque::Drain by value is on a pointer to the Drain (which points to a stack slot allocated for passing the parameter by value, which indeed isn't aliased), not on an actual data pointer contained in the iterator.

I have no idea why you are talking about custom allocators and return-position noalias, how is that relevant for the discussion?

I don't know why @comex brought it up here, I don't think it's related (nor am I convinced there is any issue to begin with). In any case I've opened rust-lang/unsafe-code-guidelines#196 to discuss it separately.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 23, 2019

@rust-lang/lang the interesting question from a lang team perspective here is to decide whether this is a library bug or a language bug. I.e., is the code wrong and the noalias right, or vice versa?

Can we answer this question without an aliasing model ?

Independently of who is at fault, fixing this at the "library level", by using raw pointers, would be a small and localized change, that we could justify by trying to avoid relying on unspecified details of the aliasing model.

Fixing this at the language level, by removing noalias for all infringing types, would require changing the ABI of non-repr-transparent Adts containing fields of reference type. I can imagine that we could always pass these by memory, or that we could remove noalias annotations of reference fields when passing these by value. But these changes would make struct Foo<'a, T>(&'a mut T); subtly different from &'a mut T Adts containing references subtly different from their "destructured" representation, and could result in users destructuring Adts manually before function calls to regain noalias. Example:

struct Foo<'a, T>(&'a mut T, &'a mut T);
let foo = Foo { ... };

fn before<'a, T>(x: Foo<'a, T>) { ...x.0/x.1 aren't noalias... }
before(foo);

fn after<'a, T>(x0: &'a mut T, x1: &'a mut T) { ...x0,x1 are noalias... }
let Foo(x0, x1) = foo;
after(x0, x1);

@RalfJung
Copy link
Member Author

RalfJung commented Aug 23, 2019

Can we answer this question without an aliasing model ?

This would be putting a constraint on the aliasing model. We already have many those constraints (e.g. from code where everyone agrees noalias is correct). This would be adding one more.

would require changing the ABI of non-repr-transparent Adts containing fields of reference type.

Not really. At least, I don't view whether noalias is added as part of the ABI. This is a callee choice. There is no representation change.

E.g., we say that Option<&mut T> in Rust and *T in C have the same ABI -- but one gets noalias and the other does not.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 23, 2019

E.g., we say that Option<&mut T> in Rust and *T in C have the same ABI -- but one gets noalias and the other does not.

They have the same calling convention (the values are passed in the same way from caller to callee and vice-versa), but they are different types, just like u32 and NonZeroU32, and this means that one can't use them interchangeably. The C type that one can probably use interchangeably with Option<&mut T> is T* restrict, and this matters because T* and T* restrict and not compatible C types. See this libc soundness bug, and the tracking ctest issue.

This is a callee choice.

This choice must be made depending on Layout/ABI/PointerKind/etc.

@bstrie
Copy link
Contributor

bstrie commented Jul 25, 2021

Is this issue still applicable? When trying the original example on 1.56-nightly (which AFAIK has mutable-noalias on by default) I get the following:

; playground::break_it
; Function Attrs: nonlazybind uwtable
define internal void @_ZN10playground8break_it17h8910f8f2e3383d74E({ i64, i32 }* align 8 dereferenceable(16) %rc, i32* align 4 dereferenceable(4) %r.0, i64* align 8 dereferenceable(8) %r.1) unnamed_addr #1 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality !dbg !797 {

@eddyb
Copy link
Member

eddyb commented Jul 26, 2021

AFAIK noalias readonly (i.e. not mutable, not even by the rest of the world) is what the example in the issue description is about (also I don't think we ever omitted it for &T references), and that seems to be gone (at least from your LLVM IR snippet).
The issue description mentions RefMut too, but you'd have to make your own example by replacing Ref with RefMut (but it does indeed look like even in that case, and even with -Zmutable-noalias, the noalias is gone: https://godbolt.org/z/M1KK4a6Ks).

(I've also edited both the issue description and your comment to add syntax highlighting via ```llvm - sadly I really wish LLVM did any kind of multi-line formatting, it's really annoying to have to use horizontal scroll to read these things)

@RalfJung
Copy link
Member Author

Interesting! So when did this change, and what exactly are the rules for noalias emission these days? I'm not aware of any lang team decision here, so this might be mostly an accident.

@bjorn3
Copy link
Member

bjorn3 commented Jul 26, 2021

I believe the last changes to these rules was in #82834:

ty::Ref(_, ty, mt) if offset.bytes() == 0 => {
let address_space = addr_space_of_ty(ty);
let tcx = cx.tcx();
let kind = if tcx.sess.opts.optimize == OptLevel::No {
// Use conservative pointer kind if not optimizing. This saves us the
// Freeze/Unpin queries, and can save time in the codegen backend (noalias
// attributes in LLVM have compile-time cost even in unoptimized builds).
PointerKind::Shared
} else {
match mt {
hir::Mutability::Not => {
if ty.is_freeze(tcx.at(DUMMY_SP), cx.param_env()) {
PointerKind::Frozen
} else {
PointerKind::Shared
}
}
hir::Mutability::Mut => {
// References to self-referential structures should not be considered
// noalias, as another pointer to the structure can be obtained, that
// is not based-on the original reference. We consider all !Unpin
// types to be potentially self-referential here.
if ty.is_unpin(tcx.at(DUMMY_SP), cx.param_env()) {
PointerKind::UniqueBorrowed
} else {
PointerKind::Shared
}
}
}

@RalfJung
Copy link
Member Author

Ah... noalias is now only added in optimized builds. And with the -O flag, everything seems to behave as before.

@bstrie
Copy link
Contributor

bstrie commented Jul 26, 2021

By "behave as before", you mean you're still seeing an invalid noalias? If I try your original code example with 1.56-nightly in release mode, and add in a #[inline(never)] so that the function doesn't boil away, I get this, without any noalias:

; playground::break_it
; Function Attrs: noinline nonlazybind uwtable
define internal fastcc void @_ZN10playground8break_it17hc2b3a560f5f197e6E({ i64, i32 }* align 8 dereferenceable(16) %rc, i64* nocapture align 8 dereferenceable(8) %r.1) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {

What is the exact output that you're seeing, and with which version?

@eddyb
Copy link
Member

eddyb commented Jul 26, 2021

Make sure to add -C no-prepopulate-passes, otherwise you get the fully optimized inference, including e.g. readnone on unused fn parameters. I haven't actually checked it, tho, just a pitfall to be aware of.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 27, 2021

@bstrie I tried this and the 2nd argument of break_it_ref becomes

i32* noalias readonly align 4 dereferenceable(4) %0

Though strangely there is a third argument...? Ah I guess this is using ScalarPair layout, passing the two fields of Ref as separate arguments. The first one is the relevant one (value: &'b T), assuming no reordering happened. (The second one points to an UnsafeCell so it cannot have noalias here anyway.)

Same for break_it_ref_mut:

i32* noalias align 4 dereferenceable(4) %0

Make sure to add -C no-prepopulate-passes, otherwise you get the fully optimized inference, including e.g. readnone on unused fn parameters. I haven't actually checked it, tho, just a pitfall to be aware of.

I would be very surprised if LLVM ever added noalias automatically. I don't think this would be sound, certainly not in this case. But adding that flag anyway does not seem to make a difference.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 27, 2021

Indeed the original example for some reason looks very different, I think LLVM might be optimizing away an unused function argument? It's a binary crate, so very aggressive optimizations are possible because LLVM "knows" all the call sites.

Here's a playground reproducer though, using a library crate. I will update the original text.

@eddyb
Copy link
Member

eddyb commented Jul 27, 2021

I believe the last changes to these rules was in #82834:

ty::Ref(_, ty, mt) if offset.bytes() == 0 => {
let address_space = addr_space_of_ty(ty);
let tcx = cx.tcx();
let kind = if tcx.sess.opts.optimize == OptLevel::No {
// Use conservative pointer kind if not optimizing. This saves us the
// Freeze/Unpin queries, and can save time in the codegen backend (noalias
// attributes in LLVM have compile-time cost even in unoptimized builds).
PointerKind::Shared
} else {
match mt {
hir::Mutability::Not => {
if ty.is_freeze(tcx.at(DUMMY_SP), cx.param_env()) {
PointerKind::Frozen
} else {
PointerKind::Shared
}
}
hir::Mutability::Mut => {
// References to self-referential structures should not be considered
// noalias, as another pointer to the structure can be obtained, that
// is not based-on the original reference. We consider all !Unpin
// types to be potentially self-referential here.
if ty.is_unpin(tcx.at(DUMMY_SP), cx.param_env()) {
PointerKind::UniqueBorrowed
} else {
PointerKind::Shared
}
}
}

@bjorn3 Ah just opened this on desktop and I can see the code - I'm not really happy with non-codegen parts of the compiler relying on opt-level, but I can kinda see how you couldn't avoid the is_freeze query.
Maybe we should speed up the query instead of trying to bypass it? I wonder what the common case is. (cc @nikic)

@eddyb
Copy link
Member

eddyb commented Jul 27, 2021

@RalfJung Ah I remember now, I removed the body because for some reason I was getting things I didn't want to see in the LLVM IR, codegen'd, and ended up with something like this: https://godbolt.org/z/aooY9jbMa
You can see nocapture readnone in there, since it's an unused &T.

If you add -C no-prepopulate-passes however, you do get the expected attributes: https://godbolt.org/z/sznos8aGz.
And -C no-prepopulate-passes should also help with the binary crate version, since you'll be able to see what we gave to LLVM, before it got to touch it at all.

I would be very surprised if LLVM ever added noalias automatically.

Ah, fair, I was mostly noticing the readnone.

@bstrie
Copy link
Contributor

bstrie commented Jul 29, 2021

So to be clear, is the fix for this as straightforward as changing Ref's value field from &T to *const T, and RefMut's from &mut T to *mut T?

@RalfJung
Copy link
Member Author

Yes I think that would be the best fix (and I've argued for that change even before we knew about this problem^^)

@BurntSushi
Copy link
Member

cc @rust-lang/libs What's stopping us from fixing this? It looks like a low hanging unsound bug. :)

@RalfJung
Copy link
Member Author

One problem is deciding whether this should be fixed on the lang level or libs level...

@BurntSushi
Copy link
Member

It looks like we could at least fix this at the libs level without much difficulty, right? Then the lang team can decide whether they want to fix this on their end or not. If they do, I assume we could then revert the library fix.

Although I will say that I don't quite grok the lang side fix for this.

@RalfJung
Copy link
Member Author

Yeah, I think there are several good reasons to use raw pointers (or rather, NonNull) rather than references here.

On the lang side, we could not add noalias to newtyped references... but that might also be a bad idea since it makes newtypes less powerful.

@cuviper
Copy link
Member

cuviper commented May 13, 2022

I can work on the lib side.
@rfcbot claim

@cuviper
Copy link
Member

cuviper commented May 13, 2022

Please see #97027.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue May 14, 2022
Use pointers in `cell::{Ref,RefMut}` to avoid `noalias`

When `Ref` and `RefMut` were based on references, they would get LLVM `noalias` attributes that were incorrect, because that alias guarantee is only true until the guard drops. A `&RefCell` on the same value can get a new borrow that aliases the previous guard, possibly leading to miscompilation. Using `NonNull` pointers in `Ref` and `RefCell` avoids `noalias`.

Fixes the library side of rust-lang#63787, but we still might want to explore language solutions there.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue May 15, 2022
Use pointers in `cell::{Ref,RefMut}` to avoid `noalias`

When `Ref` and `RefMut` were based on references, they would get LLVM `noalias` attributes that were incorrect, because that alias guarantee is only true until the guard drops. A `&RefCell` on the same value can get a new borrow that aliases the previous guard, possibly leading to miscompilation. Using `NonNull` pointers in `Ref` and `RefCell` avoids `noalias`.

Fixes the library side of rust-lang#63787, but we still might want to explore language solutions there.
bors added a commit to rust-lang-ci/rust that referenced this issue May 20, 2022
Use pointers in `cell::{Ref,RefMut}` to avoid `noalias`

When `Ref` and `RefMut` were based on references, they would get LLVM `noalias` attributes that were incorrect, because that alias guarantee is only true until the guard drops. A `&RefCell` on the same value can get a new borrow that aliases the previous guard, possibly leading to miscompilation. Using `NonNull` pointers in `Ref` and `RefCell` avoids `noalias`.

Fixes the library side of rust-lang#63787, but we still might want to explore language solutions there.
@cuviper
Copy link
Member

cuviper commented Jul 30, 2022

Now that the library fix has landed, are we still considering lang changes, or can we close this?

@RalfJung
Copy link
Member Author

I think the lang side of this is not final yet, but this is not the right issue to track that. We have rust-lang/unsafe-code-guidelines#125 and this will have to be part of the discussion of the aliasing model, once we get there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests