Only mark unions as uninhabited if all of their fields are uninhabited #46859

Merged
merged 3 commits into from Dec 24, 2017

Contributor

gereeter commented Dec 19, 2017

Fixes #46845.

rust-highfive Dec 19, 2017

Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @petrochenkov (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

eddyb Dec 19, 2017

Member

r? @eddyb

cc @arielb1 Should we do this or should we just never consider unions uninhabited?

scottmcm Dec 20, 2017

Member

What should ManuallyDrop<!> do? This reminds me of the "does it need a () field" discussion from back in #40559 (comment)...

(MaybeUninit<T> has the () field already in rust-lang/rfcs#1892)

gereeter Dec 20, 2017

Contributor

I would expect ManuallyDrop<!> to be uninhabited both just intuitively and because if I'm using ManuallyDrop, I'm probably doing it for optimization purposes, so taking an optimization away seems problematic. If unions with no inhabited fields are inhabited, then there are certain (admittedly fairly minor, probably) optimizations that just can't be done. I don't think a () field in ManuallyDrop is necessary because drop_in_place already leaves its target invalid and I would expect that to be true of the method on ManuallyDrop as well.

That's not to say that there isn't a use for a union

union MaybeInitialized<T> {
    init: T,
    uninit: ()
}

That should be used whenever cleaned-up data sticks around, that is, when "uninitialized" is a valid state to be in.

All that said, I think arrayvec is a good example of why this might not be desirable. Currently, an ArrayVec<[T; 20]> is represented as (ManuallyDrop<[T; 20]>, usize). This is totally invalid if ManuallyDrop<!> is uninhabited. I personally believe it is invalid anyway, since the type does not at all encode the fact that the [T; 20] might be partially initialized. Even if ManuallyDrop had a () variant, I would expect any union sanitizer to object to this code - the union is in the initialized state, but the inner value is invalid. However, at the very least, it shows that people today are writing code that assumes that values stored in unions can be invalid.

The safe way to write ArrayVec, under the stricter reasoning I'm arguing for, would be to use an associated type parameter so that <[T; 20]>::PartiallyInitialized == [MaybeInitialized<T>; 20] and store the PartiallyInitialized version of an array.
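
For illustration, the "store the partially initialized version" idea can be sketched with std::mem::MaybeUninit, which was stabilized well after this thread. This is a minimal sketch under that assumption, not the arrayvec crate's actual implementation; TinyVec and its methods are hypothetical names.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity vector. Each slot is individually
/// "maybe initialized", so the type itself admits partial initialization.
pub struct TinyVec<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize, // invariant: slots[..len] are initialized
}

impl<T, const N: usize> TinyVec<T, N> {
    pub fn new() -> Self {
        TinyVec {
            // Builds an array of uninitialized slots without unsafe code.
            slots: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Appends a value, or hands it back if the vector is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    pub fn get(&self, idx: usize) -> Option<&T> {
        if idx < self.len {
            // SAFETY: idx < len, so this slot was written by `push`.
            Some(unsafe { self.slots[idx].assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> Drop for TinyVec<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.slots[..self.len] {
            // SAFETY: only the first `len` slots are initialized.
            unsafe { slot.assume_init_drop() };
        }
    }
}
```

With this shape an empty TinyVec of an uninhabited element type is still a perfectly valid value, because no slot ever has to hold a T.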

eddyb Dec 20, 2017

Member

@gereeter Part of the problem is that people have conflated MaybeInitialized and ManuallyDrop.

gereeter Dec 20, 2017

Contributor

@eddyb Oh, absolutely. I guess my point is that people have gone further and conflated a kind of PartiallyInitialized with ManuallyDrop as well, so I'm not sure that declaring MaybeInitialized to be the same thing as ManuallyDrop will actually help. Therefore, I'd rather just declare that code incorrect.

eddyb Dec 20, 2017

Member

@gereeter I still prefer not having unions able to be uninhabited but maybe there's a point to it?

scottmcm Dec 20, 2017

Member

Hmm, what's an empty union? In the "a union is an enum without an explicit discriminant" model it's uninhabited; in the "a union is a struct where all the fields overlap" model it's a ZST.

Though in both of those models, a union with one uninhabited field is uninhabited...

Edit: To answer my own question, 0-field unions are an error.

gereeter Dec 20, 2017

Contributor

I admit that the biggest motivation for me is that it feels more consistent. We call enums tagged unions, but an enum with all uninhabited variants is uninhabited. Therefore, a tag with a union with all uninhabited variants should be uninhabited. We have safe functions like ManuallyDrop::into_inner. Therefore, if ManuallyDrop<T> is inhabited, we can get a T, so T is inhabited. Also, as I said above, I shouldn't lose an optimization just because of an implementation detail, and uninhabitedness is a useful optimization.
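
The into_inner argument can be written down directly. The sketch below uses std::convert::Infallible as a stand-in for ! on stable Rust; the function names unwrap_never and absurd are hypothetical.

```rust
use std::convert::Infallible;
use std::mem::ManuallyDrop;

/// Safe code can always unwrap a `ManuallyDrop<T>` into a `T`.
/// Instantiated at an uninhabited `T`, any value of the wrapper would
/// yield a value of a type that has no values.
pub fn unwrap_never(wrapped: ManuallyDrop<Infallible>) -> Infallible {
    ManuallyDrop::into_inner(wrapped)
}

/// From a (nonexistent) value of an empty enum, any type can be produced.
pub fn absurd<R>(never: Infallible) -> R {
    match never {}
}
```

Both functions compile without unsafe, so if a ManuallyDrop<Infallible> value could exist, safe code could produce a value of any type from it.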

gereeter Dec 20, 2017

Contributor

@scottmcm Is the following a valid implementation of transmute?

union Transmuter<T, U> {
    start: T,
    end: U
}

unsafe fn transmute<T, U>(value: T) -> U {
    Transmuter { start: value }.end
}

In the "overlapping struct" model, I'd say definitely yes. In the "tagless enum" model, I'm not sure.
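
For readers on current stable Rust, where generic union fields have to be wrapped in ManuallyDrop, the same idea might be sketched as follows. The run-time size check and the name union_transmute are assumptions added for illustration; whether reading the other field is allowed is exactly the model question raised here, so treat this as a sketch rather than a recommendation over std::mem::transmute.

```rust
use std::mem::{size_of, ManuallyDrop};

/// Both fields start at offset 0 and share the same bytes.
#[repr(C)]
union Transmuter<T, U> {
    start: ManuallyDrop<T>,
    end: ManuallyDrop<U>,
}

/// # Safety
/// The bytes of `value` must form a valid `U`.
pub unsafe fn union_transmute<T, U>(value: T) -> U {
    // Unlike `std::mem::transmute`, nothing checks sizes at compile time,
    // so this sketch checks them at run time.
    assert_eq!(size_of::<T>(), size_of::<U>());
    let t = Transmuter::<T, U> {
        start: ManuallyDrop::new(value),
    };
    // SAFETY: the caller guarantees the bytes are a valid `U`.
    ManuallyDrop::into_inner(unsafe { t.end })
}
```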

eddyb Dec 20, 2017

Member

> just because of an implementation detail

This isn't really about implementation details, which don't really matter here either way.
The real question is the model that we actually want to expose to the user.

gereeter Dec 20, 2017

Contributor

@eddyb Sorry, I meant an implementation detail of a data structure library. If I have a data structure like NonEmptyList<T> and I want it to be uninhabited if T is uninhabited, then I am forbidden from using T only inside of a union (if unions are always inhabited). Unless PhantomData induces uninhabitedness, but that seems very wrong to me. Therefore, the optimization I want to give to the users of my library forces my hand for implementation.

eddyb Dec 20, 2017

Member

@gereeter What does a sound use case look like? If you can optimize it out, how do you use the union?

scottmcm Dec 20, 2017

Member

Uninhabitedness is the first thing I've seen where the "a union is an enum" model is clearly better than the "a union is a struct" model, since a struct is uninhabited if any field is uninhabited (aka the issue at hand) whereas an enum is uninhabited only if all variants are uninhabited.

@gereeter That's a question for the semantics-of-unsafe-code folks, I think -- it seems open as of https://github.com/petrochenkov/rfcs/blob/8f1a960844b6ae37ec19fd89d39355fa98b1bb2b/text/1444-union.md#delayed-and-unresolved-questions
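
One way to observe the struct and enum rules, and the union behaviour this PR ended up with, is to look at the size of Option<_> around each kind of type. This is a hedged sketch: the type names are hypothetical, Infallible stands in for !, and the sizes reflect current rustc layout behaviour rather than a language guarantee.

```rust
use std::convert::Infallible;
use std::mem::size_of;

/// Enum whose variants all carry an uninhabited payload: no value exists.
#[allow(dead_code)]
pub enum AllEmpty {
    A(Infallible),
    B(Infallible),
}

/// Struct with one uninhabited field: no value exists either.
#[allow(dead_code)]
pub struct OneEmpty {
    never: Infallible,
    unit: (),
}

/// Union whose only field is uninhabited. rustc's layout code still
/// treats the union itself as inhabited.
#[allow(dead_code)]
pub union EmptyUnion {
    never: Infallible,
}

/// Sizes of `Option<_>` around each type. An uninhabited payload lets
/// the compiler drop the `Some` variant and with it the discriminant.
pub fn option_sizes() -> (usize, usize, usize) {
    (
        size_of::<Option<AllEmpty>>(),
        size_of::<Option<OneEmpty>>(),
        size_of::<Option<EmptyUnion>>(),
    )
}
```

On current compilers the first two sizes come out as zero, while the union case still needs room for a discriminant.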

arielb1 Dec 20, 2017

Contributor

I think that we want unions (or, at least, ManuallyDrop) to always be inhabited - unions are used in unsafe code, and introducing basically unneeded and randomly-firing footguns into unsafe code is a stupid thing.

Never mark unions as uninhabited. Although I think this is wrong, it is certainly sound, and the general consensus seems to value not having footguns over some sort of aesthetic consistency.

gereeter Dec 21, 2017

Contributor

@eddyb I'm thinking of some sort of wrapper like

struct PermutedDrop<A: Array> {
    data: [ManuallyDrop<A::Item>; A::LENGTH],
    order: [usize; A::LENGTH]
}

impl<A: Array> Drop for PermutedDrop<A> {
    fn drop(&mut self) {
        for idx in self.order {
            unsafe {
                ManuallyDrop::drop(&mut self.data[idx]);
            }
        }
    }
}

that rearranges the drop order of the things it contains. If T is uninhabited, PermutedDrop<[T; 20]> should also be uninhabited. If unions are always inhabited, I believe that PermutedDrop cannot be written to work and have that property. I admit, however, that this is a very niche use case where the optimization is unlikely to matter - why would you put uninhabited things in PermutedDrop anyway?

@arielb1 I'd posit that almost all code that assumes that unions cannot be uninhabited is broken in some other way. In that case, this being a footgun is not a problem - it just makes it more likely that the unsoundness will be noticed. That said, I may be wrong about this, and I think the situation is a bit different with ManuallyDrop, so it seems reasonable to at least make ManuallyDrop always be inhabited.
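
The wrapper above is pseudocode (A::LENGTH cannot be used as an array length like that). A compilable approximation on current stable Rust, assuming const generics in place of the Array trait and a hypothetical checked constructor, might look like this:

```rust
use std::mem::ManuallyDrop;

/// Drops its elements in the order given by `order` rather than by index.
pub struct PermutedDrop<T, const N: usize> {
    data: [ManuallyDrop<T>; N],
    order: [usize; N],
}

impl<T, const N: usize> PermutedDrop<T, N> {
    /// Returns `None` unless `order` is a permutation of `0..N`.
    /// The check is what keeps `drop` from dropping a slot twice.
    pub fn new(items: [T; N], order: [usize; N]) -> Option<Self> {
        let mut seen = [false; N];
        for &idx in &order {
            if idx >= N || seen[idx] {
                return None;
            }
            seen[idx] = true;
        }
        Some(PermutedDrop {
            data: items.map(ManuallyDrop::new),
            order,
        })
    }
}

impl<T, const N: usize> Drop for PermutedDrop<T, N> {
    fn drop(&mut self) {
        for &idx in &self.order {
            // SAFETY: `new` verified that each index occurs exactly once,
            // so every slot is dropped exactly once.
            unsafe { ManuallyDrop::drop(&mut self.data[idx]) };
        }
    }
}
```

The sketch only shows the drop-order mechanics; it makes no claim about whether the compiler treats PermutedDrop<T, N> as uninhabited when T is.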

eddyb Dec 21, 2017

Member

I don't even think we treat [!; N] as uninhabited, just ZST.

gereeter Dec 21, 2017

Contributor

@eddyb Oh. Well, that also seems wrong to me (for N greater than 0). However, it does largely invalidate my argument.

Since it seems to be the most accepted opinion, doesn't break code, and isn't setting a precedent that hasn't already been set, I switched this PR to always making unions inhabited.

Member

eddyb commented Dec 24, 2017

@bors r+

bors Dec 24, 2017

Contributor

📌 Commit da97917 has been approved by eddyb

bors Dec 24, 2017

Contributor

⌛️ Testing commit da97917 with merge 34c65e2...

bors added a commit that referenced this pull request Dec 24, 2017

Auto merge of #46859 - gereeter:uninhabited-unions, r=eddyb
Only mark unions as uninhabited if all of their fields are uninhabited

Fixes #46845.

bors Dec 24, 2017

Contributor

💔 Test failed - status-travis

kennytm Dec 24, 2017

Member

@bors retry

[01:49:45] test net::tcp::tests::clone_accept_smoke has been running for over 60 seconds

No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received

The build has been terminated

(check x86_64-apple-darwin)

bors Dec 24, 2017

Contributor

⌛️ Testing commit da97917 with merge 4ce6b9a...

bors added a commit that referenced this pull request Dec 24, 2017

Auto merge of #46859 - gereeter:uninhabited-unions, r=eddyb
Only mark unions as uninhabited if all of their fields are uninhabited

Fixes #46845.

bors Dec 24, 2017

Contributor

☀️ Test successful - status-appveyor, status-travis
Approved by: eddyb
Pushing 4ce6b9a to master...

bors merged commit da97917 into rust-lang:master Dec 24, 2017

2 checks passed
continuous-integration/travis-ci/pr: The Travis CI build passed
homu: Test successful

alexcrichton Jan 10, 2018

Member

Looks like we forgot to backport this to 1.23.0 (sorry about that!) so removing the beta tags

eddyb Jan 10, 2018

Member

@alexcrichton #45225 was backed out from that beta (IIUC) so there wasn't anything to backport.

alexcrichton Jan 10, 2018

Member

Oh yay!
