const-eval: loading or overwriting parts of a pointer is not supported #87184

RalfJung · 2021-07-16T08:58:31Z

The Miri core engine can only represent "full" pointers in memory, not parts of a pointer. This leads to strange behavior on code like

    let mut p = &42;
    unsafe {
        let ptr: *mut _ = &mut p;
        *(ptr as *mut u8) = 123; // overwrite the first byte of the pointer
    }
    let x = *p; //~ ERROR this operation requires initialized memory

If overwriting a part of a pointer happens during CTFE, we halt execution (since #87248). In Miri, instead we de-initialize the entire pointer, so a write will affect the bytes "next to it". (Halting execution is not an option here.)

If loading a part of a pointer (including as the source of a mem-to-mem copy) happens during CTFE or Miri, we halt execution.
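
To make the two behaviors above concrete, here is a minimal toy model (not the actual interpreter code). `ToyAlloc`, `Mode`, `Unsupported`, and the 8-byte pointer size are all hypothetical names and assumptions made up for this sketch; the real engine's data structures differ.

```rust
use std::collections::BTreeMap;

/// Pointer size assumed for this toy model (a 64-bit target).
const PTR_SIZE: usize = 8;

/// Hypothetical stand-in for "which interpreter is running".
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Ctfe,
    Miri,
}

#[derive(Debug, PartialEq)]
enum Unsupported {
    PartialPointerOverwrite,
    PartialPointerRead,
}

/// Toy allocation: raw bytes, an init mask, and whole-pointer relocations
/// keyed by the offset at which the pointer starts.
struct ToyAlloc {
    bytes: Vec<u8>,
    init: Vec<bool>,
    relocations: BTreeMap<usize, u32>, // start offset -> provenance tag
}

impl ToyAlloc {
    fn new(size: usize) -> Self {
        ToyAlloc { bytes: vec![0; size], init: vec![true; size], relocations: BTreeMap::new() }
    }

    /// Store a whole pointer at `offset` (assumes it overlaps no other pointer).
    fn write_ptr(&mut self, offset: usize, tag: u32) {
        self.relocations.insert(offset, tag);
        self.init[offset..offset + PTR_SIZE].fill(true);
    }

    /// Start offset of the pointer covering `offset`, if any.
    fn covering(&self, offset: usize) -> Option<usize> {
        let lo = offset.saturating_sub(PTR_SIZE - 1);
        self.relocations.range(lo..=offset).next_back().map(|(&start, _)| start)
    }

    /// Overwrite one byte. CTFE halts; Miri de-initializes the whole pointer.
    fn write_byte(&mut self, offset: usize, value: u8, mode: Mode) -> Result<(), Unsupported> {
        if let Some(start) = self.covering(offset) {
            match mode {
                Mode::Ctfe => return Err(Unsupported::PartialPointerOverwrite),
                Mode::Miri => {
                    self.relocations.remove(&start);
                    self.init[start..start + PTR_SIZE].fill(false);
                }
            }
        }
        self.bytes[offset] = value;
        self.init[offset] = true;
        Ok(())
    }

    /// Load one byte. Reading a part of a pointer halts in both modes.
    fn read_byte(&self, offset: usize) -> Result<Option<u8>, Unsupported> {
        if self.covering(offset).is_some() {
            return Err(Unsupported::PartialPointerRead);
        }
        Ok(if self.init[offset] { Some(self.bytes[offset]) } else { None })
    }
}
```

In this model, a Miri-mode write to byte 0 of a pointer leaves bytes 1 to 7 uninitialized, which is the "affects the bytes next to it" effect described above.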

Long-term, it would be great to implement support for having just a few bytes of a pointer in a Miri core engine Allocation. However, this might be hard to do without a perf regression.

Cc @rust-lang/wg-const-eval


oli-obk · 2021-07-16T11:41:22Z

> * We throw a "not supported" error, halting execution. This at least avoids silently doing the wrong thing. However, Miri might not get away with this if there's code out there relying on this to work (which seems not unlikely).

I prefer the status quo over this, even if it would make debugging easier. We could consider emitting a lint though.

> * Throw such an error in CTFE, but in Miri just remove the provenance of the other bytes. With CTFE/Miri engine Pointer type overhaul #87123 this is actually not terribly wrong, but it is not really right either -- losing provenance like this does not match what LLVM does, where individual bytes of a pointer can maintain their provenance separately.

That seems like we're treating this as UB or at least unspecified behaviour and doing something random that we can change in the future. I don't like it much. It will also cause differences between the runtime and compile-time behaviour of const fns

> * Implement support for having just a few bytes of a pointer in a Miri core engine Allocation. However, this might be hard to do without a perf regression.

Preferable, but yeah, a perf problem. If we want per-byte provenance, we can no longer do the ByRef->ByVal conversion based on types alone; we also need to consider the value's representability. That could be ok, but makes things more complex in various places. In addition, as you noted, we'd need to adjust the Allocation so that it can represent such things. We could probably get away with adding a u8 field to each relocation entry in the Allocation that specifies the length of the relocation area. This would be unlikely to affect performance except when such partial-pointer operations are actually performed.

That said... I think we should at least wait until someone has a use case for such things. It is quite some work to implement and the benefits are unclear without having a use case

RalfJung · 2021-07-16T11:56:02Z

> That seems like we're treating this as UB or at least unspecified behaviour and doing something random that we can change in the future. I don't like it much. It will also cause differences between the runtime and compile-time behaviour of const fns

Well, we already treat it as UB to overwrite some bytes of a pointer and then inspect the remaining bytes. So we already have a difference between runtime and compile-time behavior of const fn. The goal of this proposal is to make it more obvious when the issue arises: throwing a "const-unsupported" error seems like a reasonable signal to indicate things that work differently between compile-time and runtime.

> preferable, but yea, a perf problem. If we want per-byte provenance, we either can't do ByRef->ByVal conversion based on types anymore,

One step after the other. :) For now I am just talking about what can be represented inside Allocation. This is relevant for writes. You seem to think of what happens when one loads a value that has partial provenance, and wants to represent this in a ScalarMaybeUninit. This is relevant for reads. We currently throw an "unsupported" error when this happens. (This can already happen now when one does a read that contains a pointer but does not exactly match the pointer.) This partial provenance read case is very similar to supporting reads of partially initialized data, so I think we should discuss that in #69488.

So, let's focus just on what happens on a write now in this issue, I'd say.

> We could probably get away with adding a u8 field to each relocation entry in the Allocation that specifies the length of the relocation area.

This is not enough. If we overwrite the first byte of a pointer (byte 0), we need to be able to represent "bytes 1-7 of this pointer are stored here".

I was thinking of implementing this by adding a second "relocation" table for single-byte relocations: so a relocation at position x with value (tag, offset): (Tag, u8) would mean that at position x we have the byte #offset of a pointer tagged tag. This matches the PtrFragment type in my memory interface notes.

8 adjacent bytes with single-byte relocations [(tag, 0), ..., (tag, 7)] would then be equivalent to having relocation tag in the current relocation table. Basically that table would be a more efficient representation of the common case, and the new table would hopefully usually be empty so that we don't have a big perf impact.
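
A minimal sketch of this two-table idea, under stated assumptions: `Relocations`, `Tag`, and the method names are hypothetical and chosen for illustration, pointers are assumed to be 8 bytes, and the real Allocation type looks different. It only tracks provenance, not the data bytes.

```rust
use std::collections::BTreeMap;

/// Pointer size assumed for this sketch (a 64-bit target).
const PTR_SIZE: usize = 8;
/// Hypothetical provenance tag type.
type Tag = u32;

#[derive(Default)]
struct Relocations {
    /// Common case: a whole pointer tagged `Tag` starts at this offset.
    full: BTreeMap<usize, Tag>,
    /// Rare case: the byte at this offset is byte number `.1` of a pointer tagged `.0`.
    fragments: BTreeMap<usize, (Tag, u8)>,
}

impl Relocations {
    /// The byte at `offset` is overwritten with plain data: drop its provenance
    /// and, if it was inside a full pointer, keep the other bytes as fragments.
    fn overwrite_byte(&mut self, offset: usize) {
        self.fragments.remove(&offset);
        let lo = offset.saturating_sub(PTR_SIZE - 1);
        let hit = self.full.range(lo..=offset).next_back().map(|(&s, &t)| (s, t));
        if let Some((start, tag)) = hit {
            self.full.remove(&start);
            for i in 0..PTR_SIZE {
                if start + i != offset {
                    self.fragments.insert(start + i, (tag, i as u8));
                }
            }
        }
    }

    /// Record that the byte at `offset` is byte `index` of a pointer tagged `tag`.
    fn write_fragment(&mut self, offset: usize, tag: Tag, index: u8) {
        debug_assert!((index as usize) < PTR_SIZE);
        self.overwrite_byte(offset);
        self.fragments.insert(offset, (tag, index));
        self.normalize(offset);
    }

    /// If the fragments around `offset` form a complete pointer
    /// [(tag, 0), ..., (tag, 7)], fold them back into the `full` table.
    fn normalize(&mut self, offset: usize) {
        let (tag, index) = match self.fragments.get(&offset) {
            Some(&entry) => entry,
            None => return,
        };
        let start = match offset.checked_sub(index as usize) {
            Some(start) => start,
            None => return,
        };
        let complete = (0..PTR_SIZE)
            .all(|i| self.fragments.get(&(start + i)) == Some(&(tag, i as u8)));
        if complete {
            for i in 0..PTR_SIZE {
                self.fragments.remove(&(start + i));
            }
            self.full.insert(start, tag);
        }
    }
}
```

As long as no partial overwrite happens, `fragments` stays empty and only the `full` table is consulted, which is where the hope for a small perf impact comes from.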

> That said... I think we should at least wait until someone has a use case for such things. It is quite some work to implement and the benefits are unclear without having a use case

My usecase is: I want Miri to be a faithful implementation of the Rust semantics.
If nobody has a usecase for this I think we can just make it an error to do these partial overwrites, then at least we will know for sure if/when a usecase materializes in the future. Currently we'd probably not even know...

oli-obk · 2021-07-16T12:40:22Z

> My usecase is: I want Miri to be a faithful implementation of the Rust semantics.

well... we can keep CTFE doing simple things and give Miri the ability to go crazy via more Machine shenanigans.

RalfJung · 2021-07-16T16:50:46Z

Propagating the Machine so far down into Allocation is non-trivial. It certainly will mean a lot more work to implement than just doing it for both CTFE and Miri.^^

oli-obk · 2021-07-16T17:01:13Z

don't we already do that with AllocExtra?

RalfJung · 2021-07-16T17:09:55Z

I removed that with my previous PR. ;) (And I am glad I did.)

RalfJung · 2021-07-16T17:11:19Z

We could just always have that new fine-grained relocation table, and have a boolean flag in Provenance to determine if it is used. So CTFE would have zero perf cost (just a tiny bit of space cost).

But honestly I think there's no good reason CTFE shouldn't implement this the right way (if we have the code anyway for Miri) -- as long as that doesn't cause slowdown for code not needing this feature.
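
For illustration, the "boolean flag in Provenance" idea could look roughly like the following. This is a sketch only: `ToyProvenance`, `FRAGMENTS`, `CtfeProv`, `MiriProv`, and `Outcome` are hypothetical names, not the interpreter's actual trait or types.

```rust
/// Hypothetical stand-in for the interpreter's provenance trait.
trait ToyProvenance {
    /// Whether the fine-grained (per-byte) relocation table may be used.
    const FRAGMENTS: bool;
}

/// Hypothetical provenance type for compile-time evaluation.
struct CtfeProv;
/// Hypothetical provenance type for Miri.
struct MiriProv;

impl ToyProvenance for CtfeProv {
    const FRAGMENTS: bool = false;
}
impl ToyProvenance for MiriProv {
    const FRAGMENTS: bool = true;
}

#[derive(Debug, PartialEq)]
enum Outcome {
    SplitIntoFragments,
    Unsupported,
}

/// What a partial pointer overwrite would do for a given provenance type.
fn partial_overwrite<P: ToyProvenance>() -> Outcome {
    // The flag is an associated constant, so each monomorphized copy of this
    // function only ever takes one of the two branches.
    if P::FRAGMENTS {
        Outcome::SplitIntoFragments
    } else {
        Outcome::Unsupported
    }
}
```

Because the flag is known at compile time for each instantiation, the check itself should not cost anything at runtime in the CTFE case; the extra table would still take a little space.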

oli-obk · 2021-07-16T17:13:23Z

> But honestly I think there's no good reason CTFE shouldn't implement this the right way (if we have the code anyway for Miri) -- as long as that doesn't cause slowdown for code not needing this feature.

yea that's true. I'm not going to stop you from pursuing this, it just seems like quite a bit of work and lots of performance tuning

…i-obk CTFE: throw unsupported error when partially overwriting a pointer Currently, during CTFE, when a write to memory would overwrite parts of a pointer, we make the remaining parts of that pointer "uninitialized". This is probably not what users expect, so if this ever happens they will be quite confused about why some of the data just vanishes for seemingly no good reason. So I propose we change this to abort CTFE when that happens, to at least avoid silently doing the wrong thing. Cc rust-lang#87184 Our CTFE test suite still seems to pass. However, we should probably crater this, and I want to do some tests with Miri as well.

RalfJung · 2022-04-05T22:41:13Z

FWIW, even with #94527 landed this issue means #94371 is not fully resolved, but the remaining problem causes an error rather than silent data corruption:

    #![feature(const_swap)]
    #![feature(const_mut_refs)]

    #[repr(C, packed)]
    struct Demo(u32, &'static i32, u32, i64, i64);

    const C: (Demo, Demo) = {
        let mut x = Demo(0, &1, 2, -1, -1);
        let mut y = Demo(3, &4, 5, -1, -1);
        std::mem::swap(&mut x, &mut y);
        (x, y)
    };

    fn main() {
        let (d1, d2) = C;
    }

says

    error: any use of this value will cause an error
        --> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:1096:9
         |
    1096 |           copy_nonoverlapping(src, tmp.as_mut_ptr(), 1);
         |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         |           |
         |           unable to turn pointer into raw bytes
         |           inside `std::ptr::read::<MaybeUninit<u8>>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:1096:9
         |           inside `mem::swap_simple::<MaybeUninit<u8>>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/mem/mod.rs:762:17
         |           inside `ptr::swap_nonoverlapping_simple::<MaybeUninit<u8>>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:913:9
         |           inside `swap_nonoverlapping::<Demo>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:861:33
         |           inside `std::mem::swap::<Demo>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/mem/mod.rs:726:29
         |           inside `C` at swap.rs:10:5
         |
        ::: swap.rs:7:1
         |
    7    | / const C: (Demo, Demo) = {
    8    | |     let mut x = Demo(0, &1, 2, -1, -1);
    9    | |     let mut y = Demo(3, &4, 5, -1, -1);
    10   | |     std::mem::swap(&mut x, &mut y);
    11   | |     (x, y)
    12   | | };
         | |__-
         |
         = note: `#[deny(const_err)]` on by default
         = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
         = note: for more information, see issue #71800 <https://github.com/rust-lang/rust/issues/71800>

RalfJung · 2022-05-19T07:07:09Z

This affects swap_nonoverlapping, which no longer supports copying a pointer by using type u8 and a large enough size to cover the entire pointer. That used to work basically by chance since the old implementation did some optimizations to swap data in larger chunks; #94212 removed that optimization and thus broke this case in Miri.
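
The pattern in question looks roughly like this. The helper name `swap_refs_bytewise` is made up for illustration. Compiled natively this should behave as expected; the point of the comment above is that an interpreter without partial-pointer support cannot execute the byte-wise copies.

```rust
use std::mem::size_of;
use std::ptr;

/// Swap two references by swapping their underlying bytes as `u8`,
/// using a count large enough to cover the entire pointer.
fn swap_refs_bytewise<'a>(a: &mut &'a i32, b: &mut &'a i32) {
    let len = size_of::<&i32>();
    // SAFETY: `a` and `b` are distinct `&mut` borrows, so both regions are
    // valid for reads and writes of `len` bytes and do not overlap; `u8`
    // has no alignment requirement beyond 1.
    unsafe {
        ptr::swap_nonoverlapping(
            (a as *mut &'a i32).cast::<u8>(),
            (b as *mut &'a i32).cast::<u8>(),
            len,
        );
    }
}
```

Each individual `u8` element here is only a part of a pointer, so whether this works in the interpreter depends on the swap being done in chunks that happen to cover whole pointers.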

RalfJung · 2022-06-03T11:55:00Z

My plans for how I want to implement this in Miri changed, but that new plan I think cannot work in CTFE.

For Miri, I mentioned using the PtrFragment representation of my memory model. But that memory model changed, and now my notion of an AM Byte looks like this: an initialized byte is a u8 with some optional provenance. This greatly simplifies the abstract machine, and also its implementation in Miri. In a pointer, that u8 will be the actual byte on the real machine.

However, for CTFE, we don't know the actual byte on the real machine, so the u8 will be some byte of the relative offset. So, we cannot take a pointer apart into its raw bytes and its provenance the same way we can for the runtime handling of pointers.
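
A small sketch of the byte representation described above, for the runtime case where the address is known. `AbstractByte`, `Provenance`, `ptr_to_bytes`, and `bytes_to_ptr` are hypothetical names, little-endian 8-byte pointers are assumed, and the rule that provenance survives only when all bytes agree is an assumption of this sketch, not something stated in this thread.

```rust
/// Hypothetical provenance identifier.
type Provenance = u32;

/// One byte of abstract-machine memory: either uninitialized, or a concrete
/// `u8` with optional provenance.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AbstractByte {
    Uninit,
    Init(u8, Option<Provenance>),
}

/// Take a pointer apart into bytes. This needs the real address `addr`,
/// which is exactly what compile-time evaluation does not have.
fn ptr_to_bytes(addr: u64, prov: Provenance) -> [AbstractByte; 8] {
    addr.to_le_bytes().map(|b| AbstractByte::Init(b, Some(prov)))
}

/// Reassemble a pointer from bytes. Returns `None` if any byte is
/// uninitialized. Assumption of this sketch: the provenance is kept only if
/// every byte carries the same one.
fn bytes_to_ptr(bytes: [AbstractByte; 8]) -> Option<(u64, Option<Provenance>)> {
    let mut raw = [0u8; 8];
    let mut prov = match bytes[0] {
        AbstractByte::Init(_, p) => p,
        AbstractByte::Uninit => return None,
    };
    for (i, b) in bytes.iter().enumerate() {
        match *b {
            AbstractByte::Uninit => return None,
            AbstractByte::Init(value, p) => {
                raw[i] = value;
                if p != prov {
                    prov = None;
                }
            }
        }
    }
    Some((u64::from_le_bytes(raw), prov))
}
```

In CTFE there is no `addr` to feed into `ptr_to_bytes`; the only number available is the offset relative to the allocation, so the `u8` stored per byte would not be the byte seen on the real machine.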

We also have the problem that I don't think LLVM supports 'partial relocations', so if the final value of a constant contains some partial pointers we have to throw an error.

All put together, I no longer have plans to fix this for CTFE -- no support for partial pointers is just an inherent limitation of compile-time evaluation. We could mitigate it with a lot of work but it'll always remain half-baked. I will hence close this issue, and open new issues for partial pointer support in Miri and for the swap_nonoverlapping problem.

RalfJung added the A-const-eval (constant evaluation), A-miri (the Miri tool), and T-compiler (compiler team) labels Jul 16, 2021

RalfJung mentioned this issue Jul 18, 2021

CTFE: throw unsupported error when partially overwriting a pointer #87248

Merged

RalfJung changed the title ~~const-eval: overwriting parts of a pointer makes the rest uninitialized~~ const-eval: overwriting parts of a pointer is not supported Jul 31, 2021

RalfJung changed the title ~~const-eval: overwriting parts of a pointer is not supported~~ const-eval: loading or overwriting parts of a pointer is not supported Feb 25, 2022

RalfJung mentioned this issue Feb 25, 2022

mem::swap behaves incorrectly in CTFE (and Miri) #94371

Closed

RalfJung mentioned this issue Apr 5, 2022

const-eval: load of partially initialized scalar produces entirely uninitialized result #69488

Closed

RalfJung mentioned this issue Apr 17, 2022

Miri reports "unsupported operation: unable to turn pointer into raw bytes" in futures-lite-1.12.0 rust-lang/miri#2068

Closed

RalfJung mentioned this issue May 13, 2022

Do copy[_nonoverlapping]/swap[_nonoverlapping] do typed copies? #63159

Closed

RalfJung mentioned this issue May 28, 2022

Differences to MiniRust (aka the idealized interpreter) rust-lang/miri#2159

Open

8 tasks

RalfJung closed this as not planned Jun 3, 2022
