Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate uninitialized in favor of a new MaybeUninit type #1892

Merged
merged 20 commits into from Aug 19, 2018

Conversation

@canndrew
Copy link
Contributor

commented Feb 9, 2017

My thoughts on what to do with uninitialized and !.

Rendered

Tracking issue

Edit: This has been updated to instead recommend deprecating uninitialized entirely. The old Inhabited trait proposal is now listed as an alternative.

Edit: FCP Proposal is #1892 (comment) (in the collapsed-by-default part).

@canndrew

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2017

cc @nikomatsakis, @arielb1, @eddyb. I know y'all have some thoughts on what to do with all this.

@ubsan

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2017

Calling uninitialized is already extremely dangerous. This just seems like unnecessary complication. If you want to create a !, for some reason, whatever you're doing is probably UB anyways. At most, we might want to have a warning in the docs.

@canndrew

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2017

@ubsan, what about the catch_unwind example in the RFC? That's based on a compiler bug that got created when feature(never_type) got switched on in the standard library.

Without this, any (otherwise correct) code that uses uninitialized::<T> is landmine waiting to go off when someone tries to set T = !.

@nagisa

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2017

The compiler bug happened mostly because previously the T would become () due to ! not existing. I personally think this PR is going in the right direction (a-la Sized).


This trait is automatically implemented for all inhabited types.

Change the type of `uninitialized` to:

This comment has been minimized.

Copy link
@nagisa

nagisa Feb 9, 2017

Contributor

This conflicts somewhat with Inhabited being an auto-trait (a-la Sized), or rather T is already Inhabited unless specified otherwise (i.e. T: ?Inhabited)

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

Question: Does ! impl Sized? Does that even make sense?

This comment has been minimized.

Copy link
@canndrew

canndrew Feb 10, 2017

Author Contributor

@mark-i-m Yes, size_of::<!>() == 0, More generally though, for any given size every element of ! has that size. Similarly, any two elements of ! have the same size. These are both trivially true since ! has no elements.

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

So we are defining size_of::<T>() == n iff for all values v of T that v takes n bits to express, and since there are no values of T = !, this vacuously true?

This comment has been minimized.

Copy link
@canndrew

canndrew Feb 10, 2017

Author Contributor

Basically, yeah. There's two subtly different definitions you could give of Sized but ! satisfies both of them vacuously.

This could be a rather large breaking change depending on how many people are
currently calling `uninitialized::<T>` with a generic `T`. However all such
code is already somewhat future-incompatible as it will malfunction (or panic)
if used with `!`.

This comment has been minimized.

Copy link
@nagisa

nagisa Feb 9, 2017

Contributor

If Inhabited is auto-trait, like Sized, I do not see how that’s problematic in any way.

The author of the crate may expect this change to be private and its effects
contained to within the crate. But in making this change they've also stopped
exporting the `Inhabited` impl, causing potential breakages for downstream
users.

This comment has been minimized.

Copy link
@nagisa

nagisa Feb 9, 2017

Contributor

Again, that’s pretty much the same story with Sized. I.e. you cannot change pub struct Sized { a: [u8; 42] } to pub struct Sized { a: [u8] }.

@arielb1

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2017

We pretty much decided in the design sprint to deprecate mem::uninitialized in favor of using a MaybeUninitialized<T> union in the standard library.

@canndrew

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2017

@arielb1 Ah okay. I figured that might too radical for now so I intended this RFC as a (possibly temporary) middle ground.

@canndrew

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2017

@nagisa Sorry if my wording is confusing. Inhabited isn't intended to be an auto-trait like Sized in the sense that you have to explicitly opt-out with ?Inhabited. That would create unnecessary restrictions on using uninhabited types. ! and Void should be usable pretty much anywhere, it's only really uninitialized which is problematic.

@nagisa

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2017

it's only really uninitialized which is problematic.

That’s not true. ptr::read (and probably a number of other functions) has the same problem as uninitialized and we ain’t deprecating that, so an auto-trait is still worthwhile IMO:

fn main() {
let x: *const ! = &0 as *const _ as *const _; // imagine this comes as an generic argument from somewhere
let z: ! = unsafe {
    ::std::mem::read(x) // boom bam blamma
};
}
@canndrew

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2017

Ah true, I hadn't thought of ptr::read. transmute has the same problem but I've made a seperate RFC for that. I just went through the list of intrinsics and I couldn't find any others which return a generic T without also taking one as an argument.

I still think ?Inhabited would make uninhabited types almost unusable unless people added the ?Inhabited bound everywhere. It seems simpler and a lot less restrictive to just put T: Inhabited on ptr::read as well.

match std::panic::catch_unwind(|| {
let val = f();
unsafe {
(*foo_ref).value = val;

This comment has been minimized.

Copy link
@ubsan

ubsan Feb 9, 2017

Contributor

This line is broken for types with Drop impls. It should be ptr::write(&mut (*foo_ref).value, val)

This comment has been minimized.

Copy link
@canndrew

canndrew Feb 10, 2017

Author Contributor

Hmm, I thought we settled on different rules for overwriting union fields for some reason. Thanks for the catch!

```
Yet calling this function does not diverge! It just breaks everything then eats
your laundry instead.

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

Somehow, this seems preferable to folding my laundry at the moment...

if used with `!`.

Another drawback is that the `Inhabited` trait leaks private information about
types. Consider a type with the following definition:

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

I am not convinced this is more serious than already-existing leaks... for example, you can already find out the size of a type. Is there any fundamental difference with this?

This comment has been minimized.

Copy link
@canndrew

canndrew Feb 10, 2017

Author Contributor

Not really, it's the same as Sized in this regard.

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

Ok, that makes sense 😄

Ideally, Rust's type system should have a way of talking about initializedness
statically. In the past there have been proposals for new pointer types which
could safely handle uninitialized data. We should seriously consider pursuing
one of these proposals.

This comment has been minimized.

Copy link
@mark-i-m

mark-i-m Feb 10, 2017

Contributor

👍 I totally agree! This would make static muts much more powerful while still being safe.

@ranma42

This comment has been minimized.

Copy link
Contributor

commented Feb 10, 2017

@canndrew could you add the alternative from the design sprint? (add MaybeUninit<T> and deprecate mem::uninitialized)

I find it compelling, as it removes some magic from the compiler (no special checks or automatic traits for size/inhabitedness) and lets the type system just do "its thing".

@ranma42

This comment has been minimized.

Copy link
Contributor

commented Feb 10, 2017

Sorry, I just realised the option of completely deprecating mem::uninitialized is mentioned as a future direction. While it is certainly possible to make this change in smaller steps, I think it might be very interesting to assess the potential disruption of the "direct" change (would a crater run be sufficient for this?). If deemed possible, I think that would be the best course of action.

@ranma42

This comment has been minimized.

Copy link
Contributor

commented Feb 10, 2017

Are there any plans to do the same to mem::zeroed, either through this RFC or through another one? I found no relevant results searching through the RFCs repo.

@whataloadofwhat

This comment has been minimized.

Copy link

commented Feb 10, 2017

I still think ?Inhabited would make uninhabited types almost unusable unless people added the ?Inhabited bound everywhere. It seems simpler and a lot less restrictive to just put T: Inhabited on ptr::read as well.

I agree with what you're saying about ?Inhabited. I don't want that either. But then I think we'd need some kind of read_or_unreachable function without a bound anyway, otherwise Vec<T> will need an Inhabited bound (at least for remove and into_iter), which will infect a LOT of code. It's genuinely unreachable code to get to ptr::read from Vec<!> anyway because the only way to do it would be to push an uninhabited value onto it in the first place. Vec<!> is harmless, so having an Inhabited bound on it would be pointless.

I'd be (more) okay with Inhabited as a regular (compiler implemented) trait but it's completely unworkable to add it onto these functions as it would just break too much code, uninitialized, zeroed and read are all stable and there's probably a lot of code out there that calls them generically. I think the best we could do to make Inhabited as a trait work (though I don't think it will be used much, this is if you want the Inhabited trait in general rather than just for these cases) is to deprecate everything which should have the bound and then provide better alternatives: mem::MaybeInitialized<T> for mem::uninitialized and mem::zeroed. ptr::read_or_unreachable<T> and ptr::read_inhabited<T: Inhabited> for ptr::read/ptr::read_volatile/ptr::read_unaligned (those last two will become pretty verbose :(). Adding the bound (unless it's ?Inhabited) onto the functions is, unfortunately, a no-go because it will cause a lot of breakage.

Myself, I think the best route is:

  • Don't have an Inhabited trait, nor ?Inhabited. I don't think T: Inhabited is the best default to enforce because I think that cases where T = ! will cause issues are rare. It will lead to a lot of unnecessary restrictions on ! or will lead to ?Inhabited everywhere. Neither of those are ideal. Similarly, I don't think there are many cases where knowing that a type is inhabited is useful, but I'm not sure.
  • Deprecate uninitialized and zeroed and leave the signatures as they are. We could, perhaps, throw an error or lint at trans time if T = !, because it's highly likely to be a bug. I don't know if that's possible, but I think it's what is (was?) done for mem::transmute.
  • Add MaybeInitialized<T> to core::mem. Give it a zeroed constructor which calls memset on it. (maybe one day a const constructor by using [0u8; size_of::<T>()], but that's later). In the deprecation message for uninitialized and zeroed, direct them here.
  • Add a note onto ptr::read's docs that if T in uninhabited then the call must be unreachable.
@djzin

This comment has been minimized.

Copy link

commented Feb 12, 2017

Could this issue be solved with ! having no size? If the size of a type is defined to be ceil(log(n)) where n is the number of possible representations, then ! should have undefined size, or, if you like, "negative infinity" size. The main issue with this is that empty enums are already defined as being sized... and also "negative infinity" is not representable in a usize... but just a thought I had ;)

@mark-i-m

This comment has been minimized.

Copy link
Contributor

commented Feb 12, 2017

@ubsan

This comment has been minimized.

Copy link
Contributor

commented Aug 11, 2018

@gnzlbg It seems to me that writing an uninit value to an object is where the UB should happen, not just because you happen to have a reference that points to that value (especially since drop(z: &T) is a no-op (and frankly drop(z: &mut i32) should also probably be a no-op))

@gnzlbg

This comment has been minimized.

Copy link
Contributor

commented Aug 11, 2018

@ubsan That makes sense. I guess one would need to drop the object instead, the storage after the drop should be uninit, so there is no need to write uninit again to it, although I guess one could do so.

@RalfJung

This comment has been minimized.

Copy link
Member

commented Aug 17, 2018

honestly I still do not understand why creating &mut to uninitialized memory is automatically undefined behaviour and not reading from it

Well, there are other people saying the exact opposite. ;) We have to make a choice either way, but for both cases we have people arguing that this is clearly the more intuitive option.

I don't see how we could allow &T to point to uninitialized memory without making a pretty big backwards incompatible language change: all code that expects a &T and doesn't guard against T potentially being uninitialized would be at risk at best. Could it become broken or incorrect? No idea.

No, there would be no incompatible change. As @ubsan mentioned, we have two invariants at play here: Which assumptions the compiler can make for its optimizations, and which assumptions safe code can make for values it sees. There is no a priori reason to think these are the same, and in fact I think it is rather impossible to make them the same. I have a blogpost upcoming for this that should hopefully be done no later than Monday...

But just one example: A safe higher-order function which takes an argument f: fn(&i32) -> &i32 can assume that f can be called with any shared reference. So following your "violating safe code assumptions is insta-UB", it would be UB to have a function which does not have this property. On the other hand, many libraries have private functions not marked unsafe that actually are de-facto unsafe because they make extra assumptions that are guaranteed by the surrounding module. So your proposal makes all those libraries UB.

There are other problems as well. For example, the invariant that may be assumed by safe code is impossible to check for because it is frequently not computable (as in, would require solving the halting problem). I think we should have a definition of UB that can, at least in principle, be checked.

japaric added a commit to japaric/heapless that referenced this pull request Aug 19, 2018

internally use MaybeUninit
which has been proposed in rust-lang/rfcs#1892
@japaric japaric referenced this pull request Aug 19, 2018

bors bot added a commit to japaric/heapless that referenced this pull request Aug 19, 2018

Merge #55
55: internally use MaybeUninit r=japaric a=japaric

which has been proposed in rust-lang/rfcs#1892

Co-authored-by: Jorge Aparicio <jorge@japaric.io>
@rfcbot

This comment has been minimized.

Copy link

commented Aug 19, 2018

The final comment period, with a disposition to merge, as per the review above, is now complete.

@Centril Centril merged commit 21f887f into rust-lang:master Aug 19, 2018

@Centril

This comment has been minimized.

Copy link
Member

commented Aug 19, 2018

Huzzah! This RFC has been merged!

Tracking issue: rust-lang/rust#53491

XOSplicer added a commit to XOSplicer/heapless that referenced this pull request Aug 22, 2018

bors added a commit to rust-lang/rust that referenced this pull request May 20, 2019

Auto merge of #60445 - RalfJung:maybe-uninit, r=Centril
stabilize core parts of MaybeUninit

and deprecate mem::uninitialized in the future (1.40.0). This is part of implementing rust-lang/rfcs#1892.

Also expand the documentation a bit.

This type is currently primarily useful when dealing with partially initialized arrays. In libstd, it is used e.g. in `BTreeMap` (with some unstable APIs that however can all be replaced, less ergonomically, by stable ones). What we stabilize should also be enough for `SmallVec` (Cc @bluss).

Making this useful for structs requires rust-lang/rfcs#2582 or a commitment that references to uninitialized data are not insta-UB.

bors added a commit to rust-lang/rust that referenced this pull request May 20, 2019

Auto merge of #60445 - RalfJung:maybe-uninit, r=Centril
stabilize core parts of MaybeUninit

and deprecate mem::uninitialized in the future (1.40.0). This is part of implementing rust-lang/rfcs#1892.

Also expand the documentation a bit.

This type is currently primarily useful when dealing with partially initialized arrays. In libstd, it is used e.g. in `BTreeMap` (with some unstable APIs that however can all be replaced, less ergonomically, by stable ones). What we stabilize should also be enough for `SmallVec` (Cc @bluss).

Making this useful for structs requires rust-lang/rfcs#2582 or a commitment that references to uninitialized data are not insta-UB.

bors added a commit to rust-lang/rust that referenced this pull request May 20, 2019

Auto merge of #60445 - RalfJung:maybe-uninit, r=Centril
stabilize core parts of MaybeUninit

and deprecate mem::uninitialized in the future (1.40.0). This is part of implementing rust-lang/rfcs#1892.

Also expand the documentation a bit.

This type is currently primarily useful when dealing with partially initialized arrays. In libstd, it is used e.g. in `BTreeMap` (with some unstable APIs that however can all be replaced, less ergonomically, by stable ones). What we stabilize should also be enough for `SmallVec` (Cc @bluss).

Making this useful for structs requires rust-lang/rfcs#2582 or a commitment that references to uninitialized data are not insta-UB.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.