Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to allow exploiting intprcast alignment #1074

Closed
Aaron1011 opened this issue Nov 24, 2019 · 9 comments · Fixed by #1513
Closed

Add option to allow exploiting intprcast alignment #1074

Aaron1011 opened this issue Nov 24, 2019 · 9 comments · Fixed by #1513
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement I-false-UB Impact: makes Miri falsely report UB, i.e., a false positive (with default settings)

Comments

@Aaron1011
Copy link
Member

Currently, we have a test that asserts that code is not allowed to rely on any 'extra' information provided by the intprcast (e.g. a [u16; 2] array that happens to have alignment '4').

However, this is a perfectly legitimate thing for code to do. I propose that we add a intprcast-alignment option which enables the following behavior:

When we check the alignment for a memory access, we see if we have a recorded base address for the allocation (i.e. if a pointer within the allocation was ever cast to an integer).

If we do not have a base address, we use the current alignment checking behavior (i.e. check the static alignment of the type). This will catch cases where the code is definitely wrong - if the pointer was never cast to an integer, the code cannot possibly know that it happened to have 'extra' alignment.

If we do have a base address, then we do the alignment check based on the actual base address. This will allow some incorrect code, like:

let mut my_arr: [u8; 100]`
my_arr.as_ptr() as usuze; // Dummy cast
unsafe { *(my_arr.as_mut_ptr() as *mut u16) = 25 }

Depending on what alignment we pick for the base address of my_addr, this may or may not work. This means that whether or not this program passes now depends on the random seed.

While this isn't ideal, I don't see a way of allowing the intptrcast_alignment_check to pass without also causing 'spurious passes' (code which really should fail, but doesn't).

@RalfJung RalfJung added A-intptrcast Area: affects int2ptr and ptr2int casts C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out labels Nov 25, 2019
@RalfJung
Copy link
Member

RalfJung commented Dec 1, 2019

If we do have a base address, then we do the alignment check based on the actual base address.

As that test shows, we currently do not do this by design. I understand that this seems strange. See rust-lang/rust#62420 for why I think this is a good idea.

However, I am open to also offer the other mode you are asking for in Miri. What I am not sure about is the default (I lean towards being conservative and keeping the current behavior the default). If we want to change the default, IMO it is a hard requirement that we, by default, at least warn when code exploits alignment and thus could "spuriously pass". (Cc #797)

@tmiasko
Copy link
Contributor

tmiasko commented Feb 16, 2020

A correctly aligned memory access which is under aligned with respect to the
whole allocation constitutes a very weak evidence that there is a bug. Such
memory accesses are relatively common. For example when dealing with binary
data formats, or when writing a custom memory allocators.

I have repeatedly seen prompts for a code review, where person asking had
lingering doubts about code correctness after Miri reported an alignment error.
Thus far, the errors invariably turned out to be a false positives and code correct.

If the code in question does in fact miss an alignment check, the bug can be
reliably caught through randomization without producing false positives.

Furthermore, the suggested workaround using align_to, if applicable at all,
results in no test coverage. After all Miri intentionally exercises possibility
of spurious failure in align_offset, resulting in executions without those
memory accesses, quite unlike executions outside Miri.

@RalfJung
Copy link
Member

RalfJung commented Feb 16, 2020

Such
memory accesses are relatively common. For example when dealing with binary
data formats, or when writing a custom memory allocators.

Could you go into a bit more detail about how this is common when dealing with binary data? I would think then you rarely control the actual alignment of the buffer nor the offset of the data, so you have to use unaligned accesses?

(I also think you are biased here when you say "relatively common", because that seems to be the kind of code you are dealing with, but the vast majority of code is not like that.)

That said...

I have repeatedly seen prompts for a code review, where person asking had
lingering doubts about code correctness after Miri reported an alignment error.
Thus far, the errors invariably turned out to be a false positives and code correct.

... this is a good argument. Miri also has found a few actual alignment errors in the past; I am not sure how many of them we would have missed if full alignment was taken into account.

Anyway, the first step is to implement full alignment checks as an option, keeping the current behavior as a default. I was never opposed to that, it has just not been implemented. :)

If the code in question does in fact miss an alignment check, the bug can be
reliably caught through randomization without producing false positives.

Does this mean Miri should somehow help with that?

Furthermore, the suggested workaround using align_to, if applicable at all,
results in no test coverage.

OTOH, I am worried that that currently untested code will often use SIMD as that's a common reason to want higher alignment in the first place; making align_to always work in Miri will probably reduce the amount of code that can be run in practice.

@tmiasko
Copy link
Contributor

tmiasko commented Feb 16, 2020

Consider for example D-Bus decoding. It is a binary format where the data
stored inside is naturally aligned with respect to the beginning of the
message. If the message is stored in user controlled allocation there is no
guarantee it will be suitably aligned with respect to the address space,
additionally data might have possibly a different endianness. Yet, almost
always it will be possible to cast buffer data directly into decoded type, and
we can take advantage of that by offering an API along the lines of:

impl dbus::Decoder<'d> {
  fn from_bytes(buffer: &'d [u8]) -> Self {
    ...
  }

  fn decode_array<T: FixedSize>(&mut self) -> dbus::Result<Cow<'d, [T]>> {
    ...
  }
}

The implementation would check if data is sufficiently aligned for T and
return &d [T] whenever possible, otherwise it would make a copy and return
Vec<T>. Yet under Miri, if everything happens to align, the application is
unlucky since Miri considers this to be an error.

The bytemuck crate offers a safe API for such slice conversions (and there
are many others). Looking at dependent crates might provide further examples.

The current behaviour of checking alignment against the allocation layout
significantly reduces the amount of code that can run under Miri.

@RalfJung
Copy link
Member

The implementation would check if data is sufficiently aligned for T and
return &d [T] whenever possible, otherwise it would make a copy and return
Vec.

Ah, so there is a fallback path. That is the part I was missing. Now it makes sense. :)

@RalfJung
Copy link
Member

So I propose the next step is to add an option that both exploits full alignment, and makes align_to behave like it does "normally". Then we can experiment with that to see how bad the SIMD situation really is.

I won't have time to do that any time soon, but I can provide some guidance if someone else is up for the task.

bors added a commit that referenced this issue Apr 13, 2020
for alignment errors, note that there might be false positives

Cc @shepmaster

```
error: Undefined Behavior: accessing memory with alignment 1, but alignment 8 is required
 --> tests/compile-fail/unaligned_pointers/alignment.rs:8:9
  |
8 |         *y_ptr = 42;
  |         ^^^^^^^^^^^ accessing memory with alignment 1, but alignment 8 is required
  |
  = help: this usually indicates that your program performed an invalid operation and caused Undefined Behavior
  = help: but alignment errors can also be false positives, see #1074
```
@RalfJung RalfJung added C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement and removed C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out labels Apr 13, 2020
@thomcc
Copy link
Member

thomcc commented Apr 14, 2020

Regarding align_to, it's somewhat common to just do an unaligned load (ptr::read_unaligned, _mm_loadu_blah, etc) for the first and last read (and to handle too small sequences separately). This avoids needing to have different code for all the cases, for example.

I guess it would be unfortunate for miri not to work under that circumstance...

I have repeatedly seen prompts for a code review, where person asking had
lingering doubts about code correctness after Miri reported an alignment error.
Thus far, the errors invariably turned out to be a false positives and code correct

Seconding this -- miri's alignment complaints are basically noise since they're so often false positives -- any code that manually aligns pointers, whether correctly or not, sets miri off...

@RalfJung
Copy link
Member

RalfJung commented Apr 14, 2020

Using align_to you need a fallback path for the prefix and suffix, right? If you use that, things should work fine in Miri because it just puts everything into the prefix.

Seconding this -- miri's alignment complaints are basically noise since they're so often false positives -- any code that manually aligns pointers, whether correctly or not, sets miri off...

Enough people spoke up by now that I am indeed convinced we should at the very least offer the option to be less strict about alignment. :) Repeating more arguments along those lines is not going to convince me even more.

What remains is finding someone to do the actual work. :D

@RalfJung RalfJung added the I-false-UB Impact: makes Miri falsely report UB, i.e., a false positive (with default settings) label Apr 16, 2020
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 17, 2020
miri engine: add option to use force_int for alignment check

This is needed for rust-lang/miri#1074. The Miri-side patch is at rust-lang/miri#1513.

r? @oli-obk
bors added a commit that referenced this issue Aug 17, 2020
add option to use force_int for alignment check

Fixes #1074. Depends on rust-lang/rust#75592.
bors added a commit that referenced this issue Aug 17, 2020
add option to use force_int for alignment check

Fixes #1074. Depends on rust-lang/rust#75592.
bors added a commit that referenced this issue Aug 17, 2020
add option to use force_int for alignment check

Fixes #1074. Depends on rust-lang/rust#75592.
bors added a commit that referenced this issue Aug 17, 2020
add option to use force_int for alignment check

Fixes #1074. Depends on rust-lang/rust#75592.
bors added a commit that referenced this issue Aug 17, 2020
add option to use force_int for alignment check

Fixes #1074. Depends on rust-lang/rust#75592.
@bors bors closed this as completed in 066fa62 Aug 17, 2020
@RalfJung
Copy link
Member

To actually get this fix into the Miri shipped with rustup, we'll need to do a submodule bump in the rustc repo.

Also the test changes in PR #1513 demonstrate quite well the problem with alignment checks that this causes -- I had to make most compile-fail alignment test repeat 10 times to make sure they actually fail. But overall that is probably still better than false positives by default.

bors added a commit to rust-lang-ci/rust that referenced this issue Aug 21, 2020
enable align_to tests in Miri

With rust-lang/miri#1074 resolved, we can enable these tests in Miri.

I also tweaked the test sized to get reasonable execution times with decent test coverage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement I-false-UB Impact: makes Miri falsely report UB, i.e., a false positive (with default settings)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants