Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on invalid usages of MaybeUninit::uninit().assume_init() #100423

Closed

Conversation

Nilstrieb
Copy link
Member

@Nilstrieb Nilstrieb commented Aug 11, 2022

This MIR transform inserts the same validity checks from mem::{uninitialized,zeroed} to MaybeUninit::{uninit,zeroed}().assume_init().

We have been panicking in mem::uninit on invalid values for quite some time now, and it has helped to get people off the unsound API and towards using MaybeUninit<T>.

While correct usage of MaybeUninit<T> is clearly documented, some people still use it incorrectly and simply replaced their wrong mem::uninit usage with MaybeUninit::uninit().assume_init(). This is not any more correct than the old version, and we should still emit panics in these cases. As this can't be done in the library only, we need this MIR pass to insert the calls.

For now, it only detects direct usages of MaybeUninit::uninit().assume_init() but it could be extended in the future to do more advanced dataflow analysis.

@rustbot rustbot added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 11, 2022
@rust-highfive
Copy link
Collaborator

r? @michaelwoerister

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot
Copy link
Collaborator

rustbot commented Aug 11, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 11, 2022
@rust-log-analyzer

This comment has been minimized.

@5225225
Copy link
Contributor

5225225 commented Aug 11, 2022

Also gonna leave a linkback to #66151 which this PR is relevant to... Kinda!

@5225225
Copy link
Contributor

5225225 commented Aug 11, 2022

Heads up, with zstd-seekable 0.1.7, RUSTFLAGS="-Zstrict-init-checks" cargo test just hangs for me (in test, not in build). Not sure if you can reproduce. No idea what it's doing.

@michaelwoerister
Copy link
Member

r? rust-lang/mir

@5225225
Copy link
Contributor

5225225 commented Aug 12, 2022

Heads up, with zstd-seekable 0.1.7, RUSTFLAGS="-Zstrict-init-checks" cargo test just hangs for me (in test, not in build). Not sure if you can reproduce. No idea what it's doing.

Did some investigation with @Nilstrieb and it seems to be their problem. We're inserting panics in a place they're not panic safe. They deadlock. Happens on stable if you insert a panic manually.

continue;
};

let Operand::Move(assume_init_place) = assume_init_operand else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let Operand::Move(assume_init_place) = assume_init_operand else {
let Some(assume_init_place) = assume_init_operand.place() else {

This passes in your tests because MIR only ever sees the generic T type due to the wrapper functions. However, if you add a T: Copy bound, the tests that still compile should fail.

Can you add an mir-opt test that demonstrates this transform, including for both moves and copies?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't figure out how to make it generate a Copy.

compiler/rustc_mir_transform/src/check_maybe_uninit.rs Outdated Show resolved Hide resolved
compiler/rustc_mir_transform/src/lib.rs Show resolved Hide resolved
@Nilstrieb
Copy link
Member Author

This will take a little time as I don't have that much time right now, so @rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 22, 2022
@RalfJung
Copy link
Member

Honestly I am not sure this is the right move. With a panic like this, the chances are pretty high that they will hit people that can do absolutely nothing about them -- because the panic is caused by a dependency. And having to implement this via a MIR pass sounds like the kind of hack that will be quite annoying to maintain.

I think I'd be more in favor of using a lint for this. And note that the invalid_values lint already detects MaybeUninit::uninit().assume_init(). Also IIRC @5225225 had some great ideas for how to make assume_init panic on invalid bit patterns, which can be enabled by a -Z flag.

@5225225
Copy link
Contributor

5225225 commented Aug 27, 2022

One advantage the panics have over a lint is the lint needs to work pre-monomorphisation, so we can't lint for making an uninit T, because that might be fine.

And the lint warns about dead code, which is arguably a benefit (don't need to actually test the code, you get immediate feedback).

The panics run where we know the type layout concretely, so we can accurately panic even in cases of zeroed enums where the variant with discriminant zero doesn't allow being left zeroed. That would be very hard, if not impossible, to use a lint for.

That's not to say our lints couldn't be improved, but I think the panics serve a useful purpose by saying you did do a UB. Both are useful in different circumstances, which is why I think we should have both.

@Nilstrieb
Copy link
Member Author

Nilstrieb commented Sep 7, 2022

I've addressed all the comments, but #101061 will force me to change some of the intrinsics used here if it's merged.

This brings me to a question: Should we use strict-init-checks by default here? We guarantee nothing about assume_init, so panicking on everything but unions/zsts should be allowed. Especially when we might enable noundef on integers, this could stop LLVM UB.

@rust-log-analyzer

This comment has been minimized.

@cuviper
Copy link
Member

cuviper commented Sep 12, 2022

With a panic like this, the chances are pretty high that they will hit people that can do absolutely nothing about them -- because the panic is caused by a dependency.

If you replace "panic" in your statement with "UB", is the meaning much different? The dependency is behaving badly, which is hard for its users, but the panic at least makes it deterministic.

@5225225
Copy link
Contributor

5225225 commented Sep 12, 2022

Also, it's worth thinking about if we're trying to prevent just LLVM UB here, or rust UB? I agree that being minimal in our checks for mem::uninitialized is useful because that function is what Old Code does, which is why we added the 0x01 filling, but can/should we be stricter and go "we will panic on rust UB" for MaybeUninit, since we sorted out our messaging (mostly) for what uninit means?

(We did say "uhhhh uninit ints may be fine, they may not, be careful regardless", which has since been changed to "they're UB.")

If it's rust UB, then strict-init-checks is the correct thing to do here.

If it's LLVM UB, then we need to still panic on uninit bool, but not behind arrays. We wouldn't panic on uninit u8 until we start adding noundef to ints (which it would be a shame if we couldn't do).

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 21, 2022
…, r=RalfJung

Rename `assert_uninit_valid` intrinsic

It's not about "uninit" anymore but about "filling with 0x01 bytes" so the name should at least try to reflect that.

This is actually not fully correct though, as it does still panic for all uninit with `-Zstrict-init-checks`. I'm not sure what the best way is to deal with that not causing confusion. I guess we could just remove the flag? I don't think having it makes a lot of sense anymore with the direction that we have chose to go. It could be relevant again if rust-lang#100423 lands so removing it may be a bit over eager.

r? `@RalfJung`
@tmandry
Copy link
Member

tmandry commented Dec 23, 2022

Discussed in this week's @rust-lang/lang meeting (notes). There are some points that we would like to see addressed:

Documentation. It should be as clear as possible how to take action on an abort caused by one of these checks. Documentation should be easy to find. When code that was perceived as "working" breaks, maintainers are less willing to put in effort to understand what is going on, and will often reach for the quickest fix available in the moment.

Debuggability. As previously discussed, we should make sure it is possible to understand how this panic/abort is happening. If looking at the library source for assume_init, for example, there should be some kind of breadcrumb that indicates the compiler can introduce one. This could take the form of an intrinsic with a comment, or even just a comment. The docs for the function should also be updated to talk about the possibility.

Impact. We want to understand the level of impact on users, both library authors and consumers, and map out what options they will have if their program aborts (especially if it's in a library they don't control). We may want to add mitigation options, like a compiler flag that disables these checks.

Best effort. We should communicate broadly and clearly that these checks are best-effort. We don't want users to rely on them as guarantees to prevent classes of UB, unless we can promise that and stick to that promise.

It's possible some of these require further discussion. Feel free to do that on Zulip #t-lang for smaller questions or propose a design meeting for longer discussion. If you feel comfortable addressing them in this thread, go ahead and do so and re-nominate the PR.

@rustbot label: -I-lang-nominated

Finally, it was noted that the general concept of runtime UB detection is one that would benefit from broader discussion and communication. In particular, we would be receptive to an RFC that discusses an overall approach to runtime UB detection and touches on these questions with future expansions in mind, like options for users to add more expensive runtime checks to their programs. We don't necessarily consider that a blocker for this PR, but think that it would be a useful direction for Rust's tooling to evolve in and that it would be useful to do some initial thinking about how this PR fits into a broader picture.

@rustbot rustbot removed the I-lang-nominated The issue / PR has been nominated for discussion during a lang team meeting. label Dec 23, 2022
Aaron1011 pushed a commit to Aaron1011/rust that referenced this pull request Jan 6, 2023
…, r=RalfJung

Rename `assert_uninit_valid` intrinsic

It's not about "uninit" anymore but about "filling with 0x01 bytes" so the name should at least try to reflect that.

This is actually not fully correct though, as it does still panic for all uninit with `-Zstrict-init-checks`. I'm not sure what the best way is to deal with that not causing confusion. I guess we could just remove the flag? I don't think having it makes a lot of sense anymore with the direction that we have chose to go. It could be relevant again if rust-lang#100423 lands so removing it may be a bit over eager.

r? `@RalfJung`
@rust-log-analyzer

This comment has been minimized.

bjorn3 pushed a commit to bjorn3/rust that referenced this pull request Jan 24, 2023
…, r=RalfJung

Rename `assert_uninit_valid` intrinsic

It's not about "uninit" anymore but about "filling with 0x01 bytes" so the name should at least try to reflect that.

This is actually not fully correct though, as it does still panic for all uninit with `-Zstrict-init-checks`. I'm not sure what the best way is to deal with that not causing confusion. I guess we could just remove the flag? I don't think having it makes a lot of sense anymore with the direction that we have chose to go. It could be relevant again if rust-lang#100423 lands so removing it may be a bit over eager.

r? `@RalfJung`
This MIR transform inserts the same validity checks from
`mem::{uninitialized,zeroed}` to `MaybeUninit::{uninit,zeroed}().assume_init()`.

We have been panicking in `mem::uninit` on invalid values for quite some
time now, and it has helped to get people off the unsound API and
towards using `MaybeUninit<T>`.

While correct usage of `MaybeUninit<T>` is clearly documented, some
people still use it incorrectly and simply replaced their wrong
`mem::uninit` usage with `MaybeUninit::uninit().assume_init()`. This
is not any more correct than the old version, and we should still emit
panics in these cases. As this can't be done in the library only, we
need this MIR pass to insert the calls.

For now, it only detects direct usages of
`MaybeUninit::uninit().assume_init()` but it could be extended in the
future to do more advanced dataflow analysis.
@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v3' (SHA:ac593985615ec2ede58e132d2e21d2b1cbd6127c)
Download action repository 'rust-lang/simpleinfra@master' (SHA:697bea7ddceb6696743da8f159f268aef8bfb3c6)
Complete job name: PR (x86_64-gnu-tools, false, ubuntu-20.04-xl)
git config --global core.autocrlf false
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
  CI_JOB_NAME: x86_64-gnu-tools
---
   Compiling memchr v2.5.0
   Compiling std v0.0.0 (/checkout/library/std)
   Compiling compiler_builtins v0.1.87
   Compiling unwind v0.0.0 (/checkout/library/unwind)
error: intrinsic safety mismatch between list of intrinsics within the compiler and core library intrinsics for intrinsic `assert_uninit_valid`
    |
    |
973 |     pub fn assert_uninit_valid<T>();

error[E0093]: unrecognized intrinsic function: `assert_uninit_valid`
   --> library/core/src/intrinsics.rs:973:5
    |
    |
973 |     pub fn assert_uninit_valid<T>();
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unrecognized intrinsic
For more information about this error, try `rustc --explain E0093`.
error: could not compile `core` due to 2 previous errors
Build completed unsuccessfully in 0:00:08
cat: /tmp/toolstate/toolstates.json: No such file or directory

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 26, 2023
…, r=compiler-errors

Unify validity checks into a single query

Previously, there were two queries to check whether a type allows the 0x01 or zeroed bitpattern.

I am planning on adding a further initness to check in rust-lang#100423, truly uninit for MaybeUninit, which would make this three queries. This seems overkill for such a small feature, so this PR unifies them into one.

I am not entirely happy with the naming and key type and open for improvements.

r? oli-obk
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Feb 26, 2023
…, r=compiler-errors

Unify validity checks into a single query

Previously, there were two queries to check whether a type allows the 0x01 or zeroed bitpattern.

I am planning on adding a further initness to check in rust-lang#100423, truly uninit for MaybeUninit, which would make this three queries. This seems overkill for such a small feature, so this PR unifies them into one.

I am not entirely happy with the naming and key type and open for improvements.

r? oli-obk
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 27, 2023
…, r=compiler-errors

Unify validity checks into a single query

Previously, there were two queries to check whether a type allows the 0x01 or zeroed bitpattern.

I am planning on adding a further initness to check in rust-lang#100423, truly uninit for MaybeUninit, which would make this three queries. This seems overkill for such a small feature, so this PR unifies them into one.

I am not entirely happy with the naming and key type and open for improvements.

r? oli-obk
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Mar 4, 2023
…ompiler-errors

Allow checking whether a type allows being uninitialized

This is useful for clippy ([rust-lang/clippy#10407](rust-lang/rust-clippy#10407)) and for the future `MaybeUninit::assume_init` panics (rust-lang#100423).
bjorn3 pushed a commit to bjorn3/rust that referenced this pull request Mar 15, 2023
…, r=compiler-errors

Unify validity checks into a single query

Previously, there were two queries to check whether a type allows the 0x01 or zeroed bitpattern.

I am planning on adding a further initness to check in rust-lang#100423, truly uninit for MaybeUninit, which would make this three queries. This seems overkill for such a small feature, so this PR unifies them into one.

I am not entirely happy with the naming and key type and open for improvements.

r? oli-obk
@Nilstrieb
Copy link
Member Author

I would prefer to block this on #100342 so that we fire out an FCW for obviously offending code first.
@rustbot label +S-blocked -S-waiting-on-author

@rustbot rustbot added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 6, 2023
@bors
Copy link
Contributor

bors commented Apr 14, 2023

☔ The latest upstream changes (presumably #110324) made this pull request unmergeable. Please resolve the merge conflicts.

@Dylan-DPC Dylan-DPC added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. labels Oct 7, 2023
@Dylan-DPC
Copy link
Member

Closing this as it has hit a deadend and there's no motivation of completing it

@Dylan-DPC Dylan-DPC closed this Feb 4, 2024
@Dylan-DPC Dylan-DPC added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet