Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of possibly uninitialized data [E0381] a hard error #63230

Merged
merged 1 commit into from
Aug 6, 2019

Conversation

tmandry
Copy link
Member

@tmandry tmandry commented Aug 3, 2019

This is one of the behaviors we no longer allow in NLL. Since it can
lead to undefined behavior, I think it's definitely worth making it a
hard error without waiting to turn off migration mode (#58781).

Closes #60450.

My ulterior motive here is making it impossible to leave variables
partially initialized across a yield (see #60889, discussion at #63035), so
tests are included for that.

cc #54987


I'm not sure if bypassing the buffer is a good way of doing this. We could also make a force_errors_buffer or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me.

r? @Centril
cc @cramertj @nikomatsakis @pnkfelix @RalfJung

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 3, 2019
@tmandry tmandry changed the title Make use of possibly uninitialized data a hard error Make E0381 use of possibly uninitialized data a hard error Aug 3, 2019
@tmandry tmandry changed the title Make E0381 use of possibly uninitialized data a hard error Make use of possibly uninitialized data [E0381] a hard error Aug 3, 2019
@RalfJung
Copy link
Member

RalfJung commented Aug 3, 2019

If this is about permanently closing #54987 and basically rejecting Centril's RFC draft, that seems at least lang-team FCP-worthy?

err.buffer(&mut self.errors_buffer);
// Bypass the buffer for this error, so we don't downgrade it to a
// warning ever.
err.emit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a clean approach! 👍 -- but cc @matthewjasper in case they say otherwise... :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The buffer isn't just to downgrade errors, but also to allow sorting them. I would prefer fixing this in the AST borrow checker, although that doesn't appear to be so easy. I would also prefer adding a flag to MirBorrowckCtxt to prevent it from downgrading errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Centril didn't you recently write a PR that also make NLL errors ignore mogration in some cases (for match guards)? Can't this use the same approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung I defer to @matthewjasper since I basically used the mechanism they proposed in that PR. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a flag as @matthewjasper suggested, PTAL.

src/librustc_mir/borrow_check/conflict_errors.rs Outdated Show resolved Hide resolved
src/test/ui/borrowck/disallow-possibly-uninitialized.rs Outdated Show resolved Hide resolved
@Centril
Copy link
Contributor

Centril commented Aug 3, 2019

This also fixes #46791.

@Centril
Copy link
Contributor

Centril commented Aug 3, 2019

If this is about permanently closing #54987 and basically rejecting Centril's RFC draft, that seems at least lang-team FCP-worthy?

From what @tmandry wrote in #54987 (comment) my understanding is that we are not rejecting my draft RFC at this point. That is, we could for example reject gradual initialization with a yield point in between but allow it otherwise. That would complicate the type checking but it would be possible. My understanding is also that whatever optimizations we do can also be removed, so merging this PR would not be permanent in that sense.

Can @tmandry confirm ^-- ? If so I believe FCP is not necessary. I would even say that having the optimizations in place is a good thing because it allows us to be clear-eyed about what we might be giving up (the optimizations). (And at any rate, the optimizations themselves have not been merged.)

r=me with review comments addressed + confirmation & elaboration on that.

@Centril
Copy link
Contributor

Centril commented Aug 3, 2019

r? @matthewjasper for review of #63230 (comment).

@Aaron1011
Copy link
Member

Does this unblock #58374, since it will no longer be possible to partially initialize uninhabited types?

@RalfJung
Copy link
Member

RalfJung commented Aug 3, 2019

@Aaron1011 nice catch! If this helps for generators, the same argument should also hold for normal functions.

So when/if this lands, it seems the language is at a branching point: we can either do @Centril's RFC and allow partial initialization "for real", or we can make structs with an uninhabited field be ZST.

@Centril
Copy link
Contributor

Centril commented Aug 3, 2019

Meanwhile I do think we can land the optimizations in LLVM and consider my draft RFC some other time. I'm rather conflicted about whether my RFC is still a good idea. It seems to be accumulating notable drawbacks over time and the gains may be marginal in comparison to the added complexity + what we have to give up. =P

@Aaron1011
Copy link
Member

Aaron1011 commented Aug 3, 2019

So when/if this lands, it seems the language is at a branching point: we can either do @Centril's RFC and allow partial initialization "for real", or we can make structs with an uninhabited field be ZST.

I actually think it's possible to have both.

The key thing to note is that, even with partial initialization, it's impossible to ever observe a fully initialized uninhabited aggregate.

Suppose that we have code like this:

#![feature(never_type)]

struct MyStruct<T> {
	field_1: u8,
	field_2: T
}

fn make_struct<T, F: FnOnce() -> T>(f: F) -> MyStruct<T> {
	let foo: MyStruct<T>;
	foo.field_1 = 25;
	foo.field_2 = f();
	foo
}

fn boom() {
    let foo: MyStruct<!> = make_struct(|| panic!());
}

At momorphization time, we can determine that we're attempting to partially initialize an uninhabited type - MyStruct<!>. We could then transform the monomorphized make_struct into this:

let _tmp1 = 25l
let _tmp2 = f()

That is, we remove the partial initialization of the struct, and store each partially-initialized field into a temporary. (Of course, this would be done as a MIR transformation, but the idea is the same).

This preserves the user-visible behavior of the code - we will still drop any partially-initialized fields in the correct order. Since we don't ever attempt to construct an uninhabited type, we don't need to reserve space in its layout for uninhabited fields. And since it's impossible to actually observe a MyStruct<!>, this transformation will have no observable effect.

@tmandry
Copy link
Member Author

tmandry commented Aug 4, 2019

That is, we could for example reject gradual initialization with a yield point in between but allow it otherwise. That would complicate the type checking but it would be possible.

That's right.

My understanding is also that whatever optimizations we do can also be removed, so merging this PR would not be permanent in that sense.

Right, but optimizations are only one reason to do this. Another is simplicity of the generator implementation. It means we can assume aggregates are initialized across yields, and we don't have to do anything special in miri or elsewhere to handle this.

Essentially, this is the fastest way to get to a place where we

  1. aren't "lying" in our type layouts
  2. aren't wrapping a bunch of things with MaybeUninit which don't need to be (this is technically correct, but loses information)

These have some nice benefits both for optimizations and for miri/UB checking. Later, if we want to allow partial initialization across yields, we can have a more sophisticated analysis that wraps some saved locals (or their subfields) with MaybeUninit.

Finally, there's also a bug right now that this change prevents. Partially-initialized vars can contain uninhabited fields (as in #58374), and this causes the generator itself to be marked uninhabited, incorrectly. After this change, the behavior will be correct.

(And at any rate, the optimizations themselves have not been merged.)

The only optimization I can think of off the top of my head is niche finding in cases where there's a single variant (yield point) which holds data. I'm not sure how common such a case would be. In any case, the layout code doesn't attempt such an optimization with generators today.

@Centril
Copy link
Contributor

Centril commented Aug 4, 2019

Alright; this sounds all good to me from a lang perspective then. I'll leave the rest of the review to @matthewjasper aside from the comments I've already left. :)

@RalfJung
Copy link
Member

RalfJung commented Aug 4, 2019

@Aaron1011 this might work, but it becomes "interesting" with code like

let x: (u32, !);
x.0 = 42;
x.1 = return x.0;
foo(&x);

Syntactically, it can seem like we are actually initializing every field of a struct even if some fields are uninhabited.

@Aaron1011
Copy link
Member

@RalfJung: That's true. However, using an empty loop {} is already pretty weird, so I don't think your example would be made significantly more confusing.

@RalfJung
Copy link
Member

RalfJung commented Aug 4, 2019

I am not worried about it being confusing, I am worried about rustc compiling it the right way. ;) If that x there is optimized to a ZST because it looks like it gets fully initialized, we have a miscompilation bug.

(I also tweaked the example a bit so that the program can go on running.)

And note that we cannot rely on a syntactic analysis here. The "divergence" might be caused by a harmless-looking function and then later unsafe code does some weird stuff, but it's unreachable and hence cannot cause UB.

Reliably determining if a variable will actually get fully initialized requires solving the halting problem, is all I am saying.

@Aaron1011
Copy link
Member

@RalfJung: With my proposed check, rustc would note that the type of x is uninhabited, and create temporaries for the inhabited fields. Your snippet would be transformed to:

let _tmp0 = 42;
loop {}

@RalfJung
Copy link
Member

RalfJung commented Aug 4, 2019

@Aaron1011 what would you do in generic code?

fn foo<T>(mk_T: impl FnOnce () -> T) -> (u32, T) {
  let x: (u32, T);
  x.0 = 42;
  x.1 = mk_T();
  x
}

foo::<!>(|| panic!())

MIR generation happens pre-monomorphization.

@Aaron1011
Copy link
Member

@RalfJung: This would be a transformation applied after monomorphization.

@RalfJung
Copy link
Member

RalfJung commented Aug 5, 2019

@Aaron1011 so what would the semantics of the MIR code be then? Note that the ideas of "type" and "layout" are shared across all passes in Rust. So if (x32, !) has size 0, it will have that size everywhere. I don't see a reasonable way to describe MIR semantics (and e.g. implement them in Miri, which executes generic, non-monomorphized MIR together with a Subst that it uses on all types and the like) under your proposal.

@Centril
Copy link
Contributor

Centril commented Aug 5, 2019

@Aaron1011 @RalfJung I feel this is a good discussion to have but it is perhaps best had over at #54987 because it is somewhat tangential to this specific PR.

This is one of the behaviors we no longer allow in NLL. Since it can
lead to undefined behavior, I think it's definitely worth making it a
hard error without waiting to turn off migration mode (rust-lang#58781).

Closes rust-lang#60450.

My ulterior motive here is making it impossible to leave variables
partially initialized across a yield (see discussion at rust-lang#63035), so
tests are included for that.
@tmandry tmandry force-pushed the disallow-possibly-uninitialized branch from f0c4319 to 9058bf2 Compare August 5, 2019 21:57
@Centril
Copy link
Contributor

Centril commented Aug 5, 2019

The changes look right to me wrt. @matthewjasper's notes + mine.

r=me when green

@tmandry
Copy link
Member Author

tmandry commented Aug 6, 2019

@bors r=Centril

@bors
Copy link
Contributor

bors commented Aug 6, 2019

📌 Commit 9058bf2 has been approved by Centril

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 6, 2019
Centril added a commit to Centril/rust that referenced this pull request Aug 6, 2019
…ized, r=Centril

Make use of possibly uninitialized data [E0381] a hard error

This is one of the behaviors we no longer allow in NLL. Since it can
lead to undefined behavior, I think it's definitely worth making it a
hard error without waiting to turn off migration mode (rust-lang#58781).

Closes rust-lang#60450.

My ulterior motive here is making it impossible to leave variables
partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so
tests are included for that.

cc rust-lang#54987

---

I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me.

r? @Centril
cc @cramertj @nikomatsakis @pnkfelix @RalfJung
Centril added a commit to Centril/rust that referenced this pull request Aug 6, 2019
…ized, r=Centril

Make use of possibly uninitialized data [E0381] a hard error

This is one of the behaviors we no longer allow in NLL. Since it can
lead to undefined behavior, I think it's definitely worth making it a
hard error without waiting to turn off migration mode (rust-lang#58781).

Closes rust-lang#60450.

My ulterior motive here is making it impossible to leave variables
partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so
tests are included for that.

cc rust-lang#54987

---

I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me.

r? @Centril
cc @cramertj @nikomatsakis @pnkfelix @RalfJung
Centril added a commit to Centril/rust that referenced this pull request Aug 6, 2019
…ized, r=Centril

Make use of possibly uninitialized data [E0381] a hard error

This is one of the behaviors we no longer allow in NLL. Since it can
lead to undefined behavior, I think it's definitely worth making it a
hard error without waiting to turn off migration mode (rust-lang#58781).

Closes rust-lang#60450.

My ulterior motive here is making it impossible to leave variables
partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so
tests are included for that.

cc rust-lang#54987

---

I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me.

r? @Centril
cc @cramertj @nikomatsakis @pnkfelix @RalfJung
bors added a commit that referenced this pull request Aug 6, 2019
Rollup of 14 pull requests

Successful merges:

 - #61457 (Implement DoubleEndedIterator for iter::{StepBy, Peekable, Take})
 - #63017 (Remove special code-path for handing unknown tokens)
 - #63184 (Explaining the reason why validation is performed in to_str of path.rs)
 - #63230 (Make use of possibly uninitialized data [E0381] a hard error)
 - #63260 (fix UB in a test)
 - #63264 (Revert "Rollup merge of #62696 - chocol4te:fix_#62194, r=estebank")
 - #63272 (Some more libsyntax::attr cleanup)
 - #63285 (Remove leftover AwaitOrigin)
 - #63287 (Don't store &Span)
 - #63293 (Clarify align_to's requirements and obligations)
 - #63295 (improve align_offset docs)
 - #63299 (Make qualify consts in_projection use PlaceRef)
 - #63312 (doc: fix broken sentence)
 - #63315 (Fix #63313)

Failed merges:

r? @ghost
@bors bors merged commit 9058bf2 into rust-lang:master Aug 6, 2019
@RalfJung
Copy link
Member

RalfJung commented Aug 6, 2019

This broke the Reference doctests:

2019-08-06T10:01:20.2097589Z error[E0381]: assign to part of possibly uninitialized variable: `partially_initialized`
2019-08-06T10:01:20.2098466Z   --> /checkout/src/doc/reference/src/destructors.md:69:5
2019-08-06T10:01:20.2099254Z    |
2019-08-06T10:01:20.2100069Z 40 |     partially_initialized.0 = ShowOnDrop("Partial tuple first");
2019-08-06T10:01:20.2100423Z    |     ^^^^^^^^^^^^^^^^^^^^^^^ use of possibly uninitialized `partially_initialized`
2019-08-06T10:01:20.2100603Z 
2019-08-06T10:01:20.2100778Z error: aborting due to previous error

They should be updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

assign to part of possibly uninitialized variable is flagged as UB (but only in 2018 edition)
7 participants