Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mutating through addr_of produces LLVM IR with UB #111502

Closed
cbeuw opened this issue May 12, 2023 · 12 comments · Fixed by #111517
Closed

Mutating through addr_of produces LLVM IR with UB #111502

cbeuw opened this issue May 12, 2023 · 12 comments · Fixed by #111517
Labels
A-mir-opt Area: MIR optimizations T-opsem Relevant to the opsem team

Comments

@cbeuw
Copy link
Contributor

cbeuw commented May 12, 2023

As there is a write through a *const derived pointer, this code has UB under Stacked Borrows, but is fine under Tree Borrows,

pub fn first() -> bool {
    let mut a = (0., true);
    let b = (0., false);
    let a_ptr_and_b = (core::ptr::addr_of_mut!(a), b);
    let got = unsafe { second(a_ptr_and_b.0, a_ptr_and_b, true, true) };
    return got;
}

unsafe fn second(
    a_ptr: *mut (f32, bool),
    a_ptr_and_b: (*mut (f32, bool), (f64, bool)),
    t: bool,
    t_copy: bool,
) -> bool {
    let b_bool_ptr = core::ptr::addr_of!(a_ptr_and_b.1 .1) as *mut bool;
    let t_or_t = t | t_copy;
    let t_xor_t = t ^ t_copy;
    (*b_bool_ptr) = t_or_t ^ t_xor_t;
    let unused = a_ptr_and_b;
    return a_ptr_and_b.1.1 == (*a_ptr).1;
}

pub fn main() {
    println!("{}", first());
}

-C opt-level=0 results in true (which Miri agrees and is the right result), but -C opt-level=3 miscompiles it to print false

% rustc -Zmir-opt-level=0 -C opt-level=0 repro.rs && ./repro
true
% rustc -Zmir-opt-level=0 -C opt-level=3 repro.rs && ./repro
false

Interestingly, if you remove unused in function second, the miscompilation goes away. -Zmir-opt-level=0 is required to prevent mir-opt from doing this.

Directly getting a *mut through addr_of_mut! in the first line of second also makes the bug go away.

rustc --version -v
rustc 1.71.0-nightly (2a8221dbd 2023-05-11)
binary: rustc
commit-hash: 2a8221dbdfd180a2d56d4b0089f4f3952d8c2bcd
commit-date: 2023-05-11
host: aarch64-apple-darwin
release: 1.71.0-nightly
LLVM version: 16.0.2

cc @nikic @RalfJung

@RalfJung
Copy link
Member

If you mark a_ptr_and_b as mut, does that fix the behavior? (This should be required if you use addr_of_mut instead of addr_of.)

@cbeuw
Copy link
Contributor Author

cbeuw commented May 12, 2023

If you mark a_ptr_and_b as mut, does that fix the behavior?

No, it doesn't change the behaviour (and there's an unused_mut warning)

@RalfJung RalfJung changed the title Unused variable triggers LLVM miscompilation Mutating through addr_of triggers LLVM miscompilation May 12, 2023
@RalfJung
Copy link
Member

That is very strange, I was under the impression that we generate the same code for addr_of and addr_of_mut.

Making *const and *mut equivalent for UB everwhere is an oft-requested feature, which is why TB accepts this code.

@tmiasko
Copy link
Contributor

tmiasko commented May 12, 2023

a_ptr_and_b parameter, which is passed indirectly, is marked as readonly by DeduceReadOnly pass (the pass assumes that NonMutatingUseContext::AddressOf is non-mutating).

@RalfJung
Copy link
Member

RalfJung commented May 12, 2023

(the pass assumes that NonMutatingUseContext::AddressOf is non-mutating)

AFAIK it was never intended to be considered non-mutating. In fact we still track this Miri behavior as a bug: #56604.

@tmiasko
Copy link
Contributor

tmiasko commented May 12, 2023

Sure, I am just describing why generated LLVM IR has undefined behavior.

@RalfJung
Copy link
Member

RalfJung commented May 12, 2023

@rust-lang/wg-mir-opt @rust-lang/opsem do you know any other places where we assume one cannot mutate through an addr_of!-created pointer? It was always (since #56604) my understanding that this should work and Stacked Borrows was unnecessarily pedantic here.

@cbeuw cbeuw changed the title Mutating through addr_of triggers LLVM miscompilation Mutating through addr_of produces LLVM IR with UB May 12, 2023
@Nilstrieb Nilstrieb added the T-opsem Relevant to the opsem team label May 12, 2023
@saethlin saethlin added the A-mir-opt Area: MIR optimizations label May 12, 2023
@digama0
Copy link
Contributor

digama0 commented May 12, 2023

While I want to be able to say "yes this is DB", doesn't this hit issues depending on how the pointer is obtained? If you do let x = 0; and then addr_of!(x), I don't think you should be allowed to mutate that pointer. For certain kinds of places (ones derived from a path for which &mut would be valid) it might be okay to derive a mutable pointer to the place, but even then you hit mutable borrow invalidation issues if you use addr_of!(x) twice.

Why not just use the mutability the user asked for? This is literally what addr_of_mut is for.

Note that in #56604 the issue was with two ways to obtain a pointer from a &mut, so you definitely have mutable access in that case. When it's just a path you may or may not have mutable access to begin with, and the compiler knows which.

@RalfJung
Copy link
Member

IMO it would be strange if addr_of!(*mutable_ref) allowed mutation but addr_of!(local) did not.

I don't consider let vs let mut on local variables to be meaningful, and in fact we don't even preserve that information to MIR I think.

@digama0
Copy link
Contributor

digama0 commented May 12, 2023

IMO it would be strange if addr_of!(*mutable_ref) allowed mutation but addr_of!(local) did not.

I agree with this, but perhaps draw the opposite conclusion from it. :)

addr_of{_mut}! is the boundary between things the compiler can track and things it can't. When the compiler thinks something is immutable, we should really try to have the opsem respect that. Once it's a pointer I think it is okay to do mutability crimes because it's not helping anyone to be pedantic about it, but if addr_of_mut!(x) would not compile, then it seems really wrong for addr_of!(x) as *mut u8 to work (for mutation without UB).

Put another way, if you want to make it legal to mutate let x = FOO; bindings using raw pointers, that needs a larger quorum beyond just t-opsem, and I think t-lang will have some things to say about it.

@JakobDegen
Copy link
Contributor

I'm quite sympathetic to Mario's point here, but this is still the wrong order to consider this in. We should turn off the sketchy optimization while we figure out if it's ok

@RalfJung
Copy link
Member

I find it funny that first there was a wave of complaints when Miri started making a difference between *const and *mut (and this confusion still regularly comes up every few months), and now there is a wave of complaints when Miri does not make a difference between *const and *mut. 😂 Some of this honestly feels like post-hoc rationalization to me: we got used to how Miri works, now we don't want it to change any more.

But anyway, that's a topic for rust-lang/unsafe-code-guidelines#257.

@bors bors closed this as completed in 3603a84 May 14, 2023
RalfJung pushed a commit to RalfJung/miri that referenced this issue May 16, 2023
allow mutating function args through `&raw const`

Fixes rust-lang/rust#111502 by "turning off the sketchy optimization while we figure out if this is ok", like `@JakobDegen` said.

The first commit in this PR removes some suspicious looking logic from the same method, but should have no functional changes, since it doesn't modify the `context` outside of the method. Best reviewed commit by commit.

r? opsem
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations T-opsem Relevant to the opsem team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants