Untagged unions (tracking issue for RFC 1444) #32836

Open
nikomatsakis opened this Issue Apr 8, 2016 · 177 comments

Comments

Projects
None yet
@nikomatsakis
Contributor

nikomatsakis commented Apr 8, 2016

Tracking issue for rust-lang/rfcs#1444.

Unresolved questions:

  • Does assigning directly to a union field trigger a drop of the previous contents?
  • When moving out of one field of a union, are the others considered invalidated? (1, 2, 3, 4)
  • Under what conditions can you implement Copy for a union? For example, what if some variants are of non-Copy type? All variants?
  • What interaction is there between unions and enum layout optimizations? (#36394)

Open issues of high import:

  • #47412 -- MIR-based unsafety checker sometimes accepts unsafe accesses to union fields in presence of uninhabited fields

@nikomatsakis nikomatsakis referenced this issue in rust-lang/rfcs Apr 8, 2016

Merged

unions #1444

@sfackler

This comment has been minimized.

Show comment
Hide comment
@sfackler

sfackler Apr 8, 2016

Member

I may have missed it in the discussion on the RFC, but am I correct in thinking that destructors of union variants are never run? Would the destructor for the Box::new(1) run in this example?

union Foo {
    f: i32,
    g: Box<i32>,
}

let mut f = Foo { g: Box::new(1) };
f.g = Box::new(2);
Member

sfackler commented Apr 8, 2016

I may have missed it in the discussion on the RFC, but am I correct in thinking that destructors of union variants are never run? Would the destructor for the Box::new(1) run in this example?

union Foo {
    f: i32,
    g: Box<i32>,
}

let mut f = Foo { g: Box::new(1) };
f.g = Box::new(2);
@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 8, 2016

Member

@sfackler My current understanding is that f.g = Box::new(2) will run the destructor but f = Foo { g: Box::new(2) } would not. That is, assigning to a Box<i32> lvalue will cause a drop like always, but assigning to a Foo lvalue will not.

Member

solson commented Apr 8, 2016

@sfackler My current understanding is that f.g = Box::new(2) will run the destructor but f = Foo { g: Box::new(2) } would not. That is, assigning to a Box<i32> lvalue will cause a drop like always, but assigning to a Foo lvalue will not.

@sfackler

This comment has been minimized.

Show comment
Hide comment
@sfackler

sfackler Apr 8, 2016

Member

So an assignment to a variant is like an assertion that the field was previously "valid"?

Member

sfackler commented Apr 8, 2016

So an assignment to a variant is like an assertion that the field was previously "valid"?

@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 8, 2016

Member

@sfackler For Drop types, yeah, that's my understanding. If they weren't previously valid you need to use the Foo constructor form or ptr::write. From a quick grep, it doesn't seem like the RFC is explicit about this detail, though. I see it as an instantiation of the general rule that writing to a Drop lvalue causes a destructor call.

Member

solson commented Apr 8, 2016

@sfackler For Drop types, yeah, that's my understanding. If they weren't previously valid you need to use the Foo constructor form or ptr::write. From a quick grep, it doesn't seem like the RFC is explicit about this detail, though. I see it as an instantiation of the general rule that writing to a Drop lvalue causes a destructor call.

@ohAitch

This comment has been minimized.

Show comment
Hide comment
@ohAitch

ohAitch Apr 8, 2016

Should a &mut union with Drop variants be a lint?

On Friday, 8 April 2016, Scott Olson notifications@github.com wrote:

@sfackler https://github.com/sfackler For Drop types, yeah, that's my
understanding. If they weren't previously valid you need to use the Foo
constructor form or ptr::write. From a quick grep, it doesn't seem like
the RFC is explicit about this detail, though.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#32836 (comment)

ohAitch commented Apr 8, 2016

Should a &mut union with Drop variants be a lint?

On Friday, 8 April 2016, Scott Olson notifications@github.com wrote:

@sfackler https://github.com/sfackler For Drop types, yeah, that's my
understanding. If they weren't previously valid you need to use the Foo
constructor form or ptr::write. From a quick grep, it doesn't seem like
the RFC is explicit about this detail, though.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#32836 (comment)

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Apr 8, 2016

Member

On April 8, 2016 3:36:22 PM PDT, Scott Olson notifications@github.com wrote:

@sfackler For Drop types, yeah, that's my understanding. If they
weren't previously valid you need to use the Foo constructor form or
ptr::write. From a quick grep, it doesn't seem like the RFC is
explicit about this detail, though.

I should have covered that case explicitly. I think both behaviors are defensible, but I think it'd be far less surprising to never implicitly drop a field. The RFC already recommends a lint for union fields with types that implement Drop. I don't think assigning to a field implies that field was previously valid.

Member

joshtriplett commented Apr 8, 2016

On April 8, 2016 3:36:22 PM PDT, Scott Olson notifications@github.com wrote:

@sfackler For Drop types, yeah, that's my understanding. If they
weren't previously valid you need to use the Foo constructor form or
ptr::write. From a quick grep, it doesn't seem like the RFC is
explicit about this detail, though.

I should have covered that case explicitly. I think both behaviors are defensible, but I think it'd be far less surprising to never implicitly drop a field. The RFC already recommends a lint for union fields with types that implement Drop. I don't think assigning to a field implies that field was previously valid.

@sfackler

This comment has been minimized.

Show comment
Hide comment
@sfackler

sfackler Apr 8, 2016

Member

Yeah, that approach seems a bit less dangerous to me as well.

Member

sfackler commented Apr 8, 2016

Yeah, that approach seems a bit less dangerous to me as well.

@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 8, 2016

Member

Not dropping when assigning to a union field would make f.g = Box::new(2) act differently from let p = &mut f.g; *p = Box::new(2), because you can't make the latter case not drop. I think my approach is less surprising.

It's not a new problem, either; unsafe programmers already have to deal with other situations where foo = bar is UB if foo is uninitialized and Drop.

Member

solson commented Apr 8, 2016

Not dropping when assigning to a union field would make f.g = Box::new(2) act differently from let p = &mut f.g; *p = Box::new(2), because you can't make the latter case not drop. I think my approach is less surprising.

It's not a new problem, either; unsafe programmers already have to deal with other situations where foo = bar is UB if foo is uninitialized and Drop.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Apr 8, 2016

Member

I personally don't plan to use Drop types with unions at all. So I'll defer entirely to people who have worked with analogous unsafe code on the semantics of doing so.

Member

joshtriplett commented Apr 8, 2016

I personally don't plan to use Drop types with unions at all. So I'll defer entirely to people who have worked with analogous unsafe code on the semantics of doing so.

@retep998

This comment has been minimized.

Show comment
Hide comment
@retep998

retep998 Apr 9, 2016

Member

I also don't intend to use Drop types in unions so either way doesn't matter to me as long as it is consistent.

Member

retep998 commented Apr 9, 2016

I also don't intend to use Drop types in unions so either way doesn't matter to me as long as it is consistent.

@ohAitch

This comment has been minimized.

Show comment
Hide comment
@ohAitch

ohAitch Apr 9, 2016

I don't intend to use mutable references to unions, and probably
just "weirdly-tagged" ones with Into

On Friday, 8 April 2016, Peter Atashian notifications@github.com wrote:

I also don't intend to use Drop types in unions so either way doesn't
matter to me as long as it is consistent.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#32836 (comment)

ohAitch commented Apr 9, 2016

I don't intend to use mutable references to unions, and probably
just "weirdly-tagged" ones with Into

On Friday, 8 April 2016, Peter Atashian notifications@github.com wrote:

I also don't intend to use Drop types in unions so either way doesn't
matter to me as long as it is consistent.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#32836 (comment)

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
@nikomatsakis

nikomatsakis Apr 12, 2016

Contributor

Seems like this is a good issue to raise up as an unresolved question. I'm not sure yet which approach I prefer.

Contributor

nikomatsakis commented Apr 12, 2016

Seems like this is a good issue to raise up as an unresolved question. I'm not sure yet which approach I prefer.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Apr 12, 2016

Member

@nikomatsakis As much as I find it awkward for assigning to a union field of a type with Drop to require previous validity of that field, the reference case @tsion mentioned seems almost unavoidable. I think this might just be a gotcha associated with code that intentionally disables the lint for putting a type with Drop in a union. (And a short explanation of it should be in the explanatory text for that lint.)

Member

joshtriplett commented Apr 12, 2016

@nikomatsakis As much as I find it awkward for assigning to a union field of a type with Drop to require previous validity of that field, the reference case @tsion mentioned seems almost unavoidable. I think this might just be a gotcha associated with code that intentionally disables the lint for putting a type with Drop in a union. (And a short explanation of it should be in the explanatory text for that lint.)

@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 12, 2016

Member

And I'd like to reiterate that unsafe programmers must already generally know that a = b means drop_in_place(&mut a); ptr::write(&mut a, b) to write safe code. Not dropping union fields would be one more exception to learn, not one less.

(NB: the drop doesn't happen when a is statically known to already be uninitialized, like let a; a = b;.)

But I support having a default warning against Drop variants in unions that people have to #[allow(..)] since this is a fairly non-obvious detail.

Member

solson commented Apr 12, 2016

And I'd like to reiterate that unsafe programmers must already generally know that a = b means drop_in_place(&mut a); ptr::write(&mut a, b) to write safe code. Not dropping union fields would be one more exception to learn, not one less.

(NB: the drop doesn't happen when a is statically known to already be uninitialized, like let a; a = b;.)

But I support having a default warning against Drop variants in unions that people have to #[allow(..)] since this is a fairly non-obvious detail.

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
@nikomatsakis

nikomatsakis Apr 12, 2016

Contributor

@tsion this is not true for a = b and maybe only sometimes true for a.x = b but it is certainly true for *a = b. This uncertainty is what made me hesitant about it. For example, this compiles:

fn main() {
  let mut x: (i32, i32);
  x.0 = 2;
  x.1 = 3;
}

(though trying to print x later fails, but I consider that a bug)

Contributor

nikomatsakis commented Apr 12, 2016

@tsion this is not true for a = b and maybe only sometimes true for a.x = b but it is certainly true for *a = b. This uncertainty is what made me hesitant about it. For example, this compiles:

fn main() {
  let mut x: (i32, i32);
  x.0 = 2;
  x.1 = 3;
}

(though trying to print x later fails, but I consider that a bug)

@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 12, 2016

Member

@nikomatsakis That example is new to me. I guess I would have considered it a bug that that example compiles, given my previous experience.

But I'm not sure I see the relevance of that example. Why is what I said not true for a = b and only sometimes for a.x = b?

Say, if x.0 had a type with a destructor, surely that destructor is called:

fn main() {
    let mut x: (Box<i32>, i32);
    x.0 = Box::new(2); // x.0 statically know to be uninit, destructor not called
    x.0 = Box::new(3); // x.0 destructor is called before writing new value
}
Member

solson commented Apr 12, 2016

@nikomatsakis That example is new to me. I guess I would have considered it a bug that that example compiles, given my previous experience.

But I'm not sure I see the relevance of that example. Why is what I said not true for a = b and only sometimes for a.x = b?

Say, if x.0 had a type with a destructor, surely that destructor is called:

fn main() {
    let mut x: (Box<i32>, i32);
    x.0 = Box::new(2); // x.0 statically know to be uninit, destructor not called
    x.0 = Box::new(3); // x.0 destructor is called before writing new value
}
@arielb1

This comment has been minimized.

Show comment
Hide comment
@arielb1

arielb1 Apr 14, 2016

Contributor

Maybe just lint against that kind of write?

Contributor

arielb1 commented Apr 14, 2016

Maybe just lint against that kind of write?

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
@nikomatsakis

nikomatsakis Apr 16, 2016

Contributor

My point is only that = does not always run the destructor; it
uses some knowledge about whether the target is known to be
initialized.

On Tue, Apr 12, 2016 at 04:10:39PM -0700, Scott Olson wrote:

@nikomatsakis That example new to me. I guess I would have considered it a bug that that example compiles, given my previous experience.

But I'm not sure I see the relevance of that example. Why is what I said not true for a = b and only sometimes for 'a.x = b'?

Say, if x.0 had a type with a destructor, surely that destructor is called:

fn main() {
    let mut x: (Box<i32>, i32);
    x.0 = Box::new(2); // x.0 statically know to be uninit, destructor not called
    x.0 = Box::new(3); // x.0 destructor is called
}
Contributor

nikomatsakis commented Apr 16, 2016

My point is only that = does not always run the destructor; it
uses some knowledge about whether the target is known to be
initialized.

On Tue, Apr 12, 2016 at 04:10:39PM -0700, Scott Olson wrote:

@nikomatsakis That example new to me. I guess I would have considered it a bug that that example compiles, given my previous experience.

But I'm not sure I see the relevance of that example. Why is what I said not true for a = b and only sometimes for 'a.x = b'?

Say, if x.0 had a type with a destructor, surely that destructor is called:

fn main() {
    let mut x: (Box<i32>, i32);
    x.0 = Box::new(2); // x.0 statically know to be uninit, destructor not called
    x.0 = Box::new(3); // x.0 destructor is called
}
@arielb1

This comment has been minimized.

Show comment
Hide comment
@arielb1

arielb1 Apr 16, 2016

Contributor

@nikomatsakis

It runs the destructor if the drop flag is set.

But I think that kind of write is confusing anyway, so why not just forbid it? You can always do *(&mut u.var) = val.

Contributor

arielb1 commented Apr 16, 2016

@nikomatsakis

It runs the destructor if the drop flag is set.

But I think that kind of write is confusing anyway, so why not just forbid it? You can always do *(&mut u.var) = val.

@solson

This comment has been minimized.

Show comment
Hide comment
@solson

solson Apr 16, 2016

Member

My point is only that = does not always run the destructor; it uses some knowledge about whether the target is known to be initialized.

@nikomatsakis I already mentioned that:

(NB: the drop doesn't happen when a is statically known to already be uninitialized, like let a; a = b;.)

But I didn't account for dynamic checking of drop flags, so this is definitely more complicated than I considered.

Member

solson commented Apr 16, 2016

My point is only that = does not always run the destructor; it uses some knowledge about whether the target is known to be initialized.

@nikomatsakis I already mentioned that:

(NB: the drop doesn't happen when a is statically known to already be uninitialized, like let a; a = b;.)

But I didn't account for dynamic checking of drop flags, so this is definitely more complicated than I considered.

@arielb1

This comment has been minimized.

Show comment
Hide comment
@arielb1

arielb1 Apr 17, 2016

Contributor

@tsion

Drop flags are only semi-dynamic - after zeroing drop is gone, they are a part of codegen. I say we forbid that kind of write because it does more confusion than good.

Contributor

arielb1 commented Apr 17, 2016

@tsion

Drop flags are only semi-dynamic - after zeroing drop is gone, they are a part of codegen. I say we forbid that kind of write because it does more confusion than good.

@Daggerbot

This comment has been minimized.

Show comment
Hide comment
@Daggerbot

Daggerbot Apr 27, 2016

Should Drop types even be allowed in unions? If I'm understanding things correctly, the main reason to have unions in Rust is to interface with C code that has unions, and C doesn't even have destructors. For all other purposes, it seems that it's better to just use an enum in Rust code.

Daggerbot commented Apr 27, 2016

Should Drop types even be allowed in unions? If I'm understanding things correctly, the main reason to have unions in Rust is to interface with C code that has unions, and C doesn't even have destructors. For all other purposes, it seems that it's better to just use an enum in Rust code.

@Amanieu

This comment has been minimized.

Show comment
Hide comment
@Amanieu

Amanieu Apr 27, 2016

Contributor

There is a valid use case for using a union to implement a NoDrop type which inhibits drop.

Contributor

Amanieu commented Apr 27, 2016

There is a valid use case for using a union to implement a NoDrop type which inhibits drop.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Apr 27, 2016

Member

As well as invoking such code manually via drop_in_place or similar.

Member

joshtriplett commented Apr 27, 2016

As well as invoking such code manually via drop_in_place or similar.

@RumataEstor

This comment has been minimized.

Show comment
Hide comment
@RumataEstor

RumataEstor Jun 21, 2016

To me dropping a field value while writing to it is definitely wrong because the previous option type is undefined.

Would it be possible to prohibit field setters but require full union replacement? In this case if the union implements Drop full union drop would be called for the value replaced as expected.

To me dropping a field value while writing to it is definitely wrong because the previous option type is undefined.

Would it be possible to prohibit field setters but require full union replacement? In this case if the union implements Drop full union drop would be called for the value replaced as expected.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Jun 22, 2016

Member

I don't think it makes sense to prohibit field setters; most uses of unions should have no problem using those, and fields without a Drop implementation will likely remain the common case. Unions with fields that implement Drop will produce a warning by default, making it even less likely to hit this case accidentally.

Member

joshtriplett commented Jun 22, 2016

I don't think it makes sense to prohibit field setters; most uses of unions should have no problem using those, and fields without a Drop implementation will likely remain the common case. Unions with fields that implement Drop will produce a warning by default, making it even less likely to hit this case accidentally.

@Stebalien

This comment has been minimized.

Show comment
Hide comment
@Stebalien

Stebalien Jun 29, 2016

Contributor

For the sake of discussion, I intend to expose mutable references to fields in unions and put arbitrary (possibly Drop) types into them. Basically, I would like to use unions to write custom space-efficient enums. For example,

union SlotInner<V> {
    next_empty: usize, /* index of next empty slot */
    value: V,
}

struct Slot<V> {
    inner: SlotInner<V>,
    version: u64 /* even version -> is_empty */
}
Contributor

Stebalien commented Jun 29, 2016

For the sake of discussion, I intend to expose mutable references to fields in unions and put arbitrary (possibly Drop) types into them. Basically, I would like to use unions to write custom space-efficient enums. For example,

union SlotInner<V> {
    next_empty: usize, /* index of next empty slot */
    value: V,
}

struct Slot<V> {
    inner: SlotInner<V>,
    version: u64 /* even version -> is_empty */
}
@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Jul 2, 2016

Member

@nikomatsakis I'd like to propose a concrete answer to the question currently listed as unresolved here.

To avoid unnecessarily complex semantics, assigning to a union field should act like assigning to a struct field, which means dropping the old contents. It's easy enough to avoid this if you know about it, by assigning to the whole union instead. This is still slightly surprising behavior, but having a union field that implements Drop at all will produce a warning, and the text of that warning can explicitly mention this as a caveat.

Would it make sense to provide an RFC pull request amending RFC1444 to document this behavior?

Member

joshtriplett commented Jul 2, 2016

@nikomatsakis I'd like to propose a concrete answer to the question currently listed as unresolved here.

To avoid unnecessarily complex semantics, assigning to a union field should act like assigning to a struct field, which means dropping the old contents. It's easy enough to avoid this if you know about it, by assigning to the whole union instead. This is still slightly surprising behavior, but having a union field that implements Drop at all will produce a warning, and the text of that warning can explicitly mention this as a caveat.

Would it make sense to provide an RFC pull request amending RFC1444 to document this behavior?

@aturon

This comment has been minimized.

Show comment
Hide comment
@aturon

aturon Jul 5, 2016

Member

@joshtriplett Since @nikomatsakis is away on vacation, I'll answer: I think it's great form to file an amendment RFC for resolving questions like this. We'd often fast-track such RFC PRs when appropriate.

Member

aturon commented Jul 5, 2016

@joshtriplett Since @nikomatsakis is away on vacation, I'll answer: I think it's great form to file an amendment RFC for resolving questions like this. We'd often fast-track such RFC PRs when appropriate.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Jul 6, 2016

Member

@aturon Thanks. I've filed the new RFC PR rust-lang/rfcs#1663 with these clarifications.to RFC1444, to resolve this issue.

Member

joshtriplett commented Jul 6, 2016

@aturon Thanks. I've filed the new RFC PR rust-lang/rfcs#1663 with these clarifications.to RFC1444, to resolve this issue.

@anp anp referenced this issue in jameysharp/corrode Jul 8, 2016

Open

implement union #22

@Stebalien

This comment has been minimized.

Show comment
Hide comment
@Stebalien

Stebalien Aug 3, 2016

Contributor

(@aturon you can check-off that unresolved question now.)

Contributor

Stebalien commented Aug 3, 2016

(@aturon you can check-off that unresolved question now.)

@Daggerbot Daggerbot referenced this issue in erlepereira/x11-rs Aug 7, 2016

Closed

Version 3 #40

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Aug 12, 2016

Contributor

I have some preliminary implementation in https://github.com/petrochenkov/rust/tree/union.

Status: Implemented (modulo bugs), PR submitted (#36016).

Contributor

petrochenkov commented Aug 12, 2016

I have some preliminary implementation in https://github.com/petrochenkov/rust/tree/union.

Status: Implemented (modulo bugs), PR submitted (#36016).

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Aug 13, 2016

Member

@petrochenkov Awesome! Looks great so far.

Member

joshtriplett commented Aug 13, 2016

@petrochenkov Awesome! Looks great so far.

@asajeffrey

This comment has been minimized.

Show comment
Hide comment
@asajeffrey

asajeffrey Feb 1, 2018

@joshtriplett yes, Rust doesn't guarantee that destructors will run, but it happened (prior to 1.19) to maintain the invariant that perma-borrowed values would only have their memory reclaimed if the destructor ran. This is even true in the presence of mem::forget, since you can't call it on a perma-borrowed value.

This is what Joephine rather naughtily was relying on, but isn't true any more because of how the drop checker treats unions_with_drop_fields.

It would be fine if allow(unions_with_drop_fields) were considered to be an unsafe annotation, this wouldn't be a drastic change, AFAICT, it would just require deny(unsafe_code) to check for allow(unions_with_drop_fields).

@joshtriplett yes, Rust doesn't guarantee that destructors will run, but it happened (prior to 1.19) to maintain the invariant that perma-borrowed values would only have their memory reclaimed if the destructor ran. This is even true in the presence of mem::forget, since you can't call it on a perma-borrowed value.

This is what Joephine rather naughtily was relying on, but isn't true any more because of how the drop checker treats unions_with_drop_fields.

It would be fine if allow(unions_with_drop_fields) were considered to be an unsafe annotation, this wouldn't be a drastic change, AFAICT, it would just require deny(unsafe_code) to check for allow(unions_with_drop_fields).

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

@asajeffrey I'm still trying to understand the Pin thing... so, if I follow the example correctly, the reason this "works" is that fn pin(&'a Pin<'a, T>) -> &'a T forces the borrow to last for as long as the lifetime 'a annotated in the type, and that lifetime is moreover invariant.

That's an interesting observation! I was not aware of this trick. My gut feeling is that this works "by accident", i.e. safe Rust doesn't happen to provide a way to prevent the destructor from running but that doesn't make this part of the "contract". Notably, https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html does not list leaks.

Member

RalfJung commented Feb 2, 2018

@asajeffrey I'm still trying to understand the Pin thing... so, if I follow the example correctly, the reason this "works" is that fn pin(&'a Pin<'a, T>) -> &'a T forces the borrow to last for as long as the lifetime 'a annotated in the type, and that lifetime is moreover invariant.

That's an interesting observation! I was not aware of this trick. My gut feeling is that this works "by accident", i.e. safe Rust doesn't happen to provide a way to prevent the destructor from running but that doesn't make this part of the "contract". Notably, https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html does not list leaks.

@nox

This comment has been minimized.

Show comment
Hide comment
@nox

nox Feb 2, 2018

Contributor

IMO it doesn't matter if it works by accident or on purpose. There was no way to avoid Drop running with this trick prior to ManuallyDrop existing (which requires unsafe code to be implemented), and now we can't rely on that anymore.

The addition of ManuallyDrop basically killed that neat behaviour of Rust and saying that it shouldn't have been relied on in the first place sounds like circular reasoning to me. If ManuallyDrop didn't allow calling Pin::pin, would there be any other way to make calling Pin::pin unsound? I don't think so.

Contributor

nox commented Feb 2, 2018

IMO it doesn't matter if it works by accident or on purpose. There was no way to avoid Drop running with this trick prior to ManuallyDrop existing (which requires unsafe code to be implemented), and now we can't rely on that anymore.

The addition of ManuallyDrop basically killed that neat behaviour of Rust and saying that it shouldn't have been relied on in the first place sounds like circular reasoning to me. If ManuallyDrop didn't allow calling Pin::pin, would there be any other way to make calling Pin::pin unsound? I don't think so.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

I don't think we can commit to preserving every guarantee that rustc accidentally happens to provide right now. We have no idea what these guarantees may be, so we'd be stabilizing a pig in a poke (okay I hope this idiom makes sense... it's what the dictionary tells me matches my native language idiom, which would literally translate to "the cat in the bag" ;) -- what I want to say is we'd have no clue what we would be stabilizing).

Also, this is a two-edged sword -- every additional guarantee we decide to provide is something unsafe code has to take care of. So discovering a new guarantee may just as well break existing unsafe code (sitting silently somewhere on crates.io without being aware) as enable new unsafe code (the latter was the case here).

For example, it is very conceivable that lexical lifetimes enable unsafe code that gets broken by non-lexical lifetimes. Currently all lifetimes are well-nested, maybe there is a way for unsafe code to exploit this? Only with non-lexical lifetimes there can be lifetimes that overlap, but neither is included in the other. Does this make NLL a breaking change? I hope not!

If ManuallyDrop didn't allow calling Pin::pin, would there be any other way to make calling Pin::pin unsound? I don't think so.

With unsafe code, there would be. So by declaring this Pin trick sound, you are declaring some unsafe code unsound that would be sound if we decide ManuallyDrop is okay.

Member

RalfJung commented Feb 2, 2018

I don't think we can commit to preserving every guarantee that rustc accidentally happens to provide right now. We have no idea what these guarantees may be, so we'd be stabilizing a pig in a poke (okay I hope this idiom makes sense... it's what the dictionary tells me matches my native language idiom, which would literally translate to "the cat in the bag" ;) -- what I want to say is we'd have no clue what we would be stabilizing).

Also, this is a two-edged sword -- every additional guarantee we decide to provide is something unsafe code has to take care of. So discovering a new guarantee may just as well break existing unsafe code (sitting silently somewhere on crates.io without being aware) as enable new unsafe code (the latter was the case here).

For example, it is very conceivable that lexical lifetimes enable unsafe code that gets broken by non-lexical lifetimes. Currently all lifetimes are well-nested, maybe there is a way for unsafe code to exploit this? Only with non-lexical lifetimes there can be lifetimes that overlap, but neither is included in the other. Does this make NLL a breaking change? I hope not!

If ManuallyDrop didn't allow calling Pin::pin, would there be any other way to make calling Pin::pin unsound? I don't think so.

With unsafe code, there would be. So by declaring this Pin trick sound, you are declaring some unsafe code unsound that would be sound if we decide ManuallyDrop is okay.

@nox

This comment has been minimized.

Show comment
Hide comment
@nox

nox Feb 2, 2018

Contributor

What we described is a very ergonomic way to integrate Rust with GCs. What I'm trying to say is that it sounds wrong to me to tell us that this was just an accident that it worked and that we should forget about it, when I can't find of any use case for not constraining unions with Drop fields as described by @asajeffrey here, and when this really is the only wart breaking Josephine.

I'll be happy to forget about it if someone can show it was unsound even without ManuallyDrop.

Contributor

nox commented Feb 2, 2018

What we described is a very ergonomic way to integrate Rust with GCs. What I'm trying to say is that it sounds wrong to me to tell us that this was just an accident that it worked and that we should forget about it, when I can't find of any use case for not constraining unions with Drop fields as described by @asajeffrey here, and when this really is the only wart breaking Josephine.

I'll be happy to forget about it if someone can show it was unsound even without ManuallyDrop.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

What I'm trying to say is that it sounds wrong to me to tell us that this was just an accident that it worked

I see no indication of this trick ever being "designed", so I think it is quite fair to call it an accident.

and that we should forget about it

I should have made clearer that this part is just my personal gut feeling. I think it could also be a reasonable point of action to declare this a "happy accident" and actually make it a guarantee -- if we are reasonably confident that indeed all the other unsafe code respects this guarantee, and that providing this guarantee is more important than the ManuallyDrop use case. This is a trade-off, similar to leakpocalypse, where we can't eat our cake and have it, too (we can't have both Rc with its current API and the drop-based scoped threads; we can't have both ManuallyDrop and Pin) so we have to make a decision either way.

That said, I find it hard to express the actual guarantee provided here in a precise way, which makes me personally lean more towards the "ManuallyDrop is fine" side of things.

Member

RalfJung commented Feb 2, 2018

What I'm trying to say is that it sounds wrong to me to tell us that this was just an accident that it worked

I see no indication of this trick ever being "designed", so I think it is quite fair to call it an accident.

and that we should forget about it

I should have made clearer that this part is just my personal gut feeling. I think it could also be a reasonable point of action to declare this a "happy accident" and actually make it a guarantee -- if we are reasonably confident that indeed all the other unsafe code respects this guarantee, and that providing this guarantee is more important than the ManuallyDrop use case. This is a trade-off, similar to leakpocalypse, where we can't eat our cake and have it, too (we can't have both Rc with its current API and the drop-based scoped threads; we can't have both ManuallyDrop and Pin) so we have to make a decision either way.

That said, I find it hard to express the actual guarantee provided here in a precise way, which makes me personally lean more towards the "ManuallyDrop is fine" side of things.

@nox

This comment has been minimized.

Show comment
Hide comment
@nox

nox Feb 2, 2018

Contributor

if we are reasonably confident that indeed all the other unsafe code respects this guarantee, and that providing this guarantee is more important than the ManuallyDrop use case. This is a trade-off, similar to leakpocalypse, where we can't eat our cake and have it, too (we can't have both Rc with its current API and the drop-based scoped threads; we can't have both ManuallyDrop and Pin) so we have to make a decision either way.

Fair enough, I heartily agree with that. Note that if we consider what @asajeffrey described as undefined behaviour in the end, that can bring back a drop-based scoped thread API.

Contributor

nox commented Feb 2, 2018

if we are reasonably confident that indeed all the other unsafe code respects this guarantee, and that providing this guarantee is more important than the ManuallyDrop use case. This is a trade-off, similar to leakpocalypse, where we can't eat our cake and have it, too (we can't have both Rc with its current API and the drop-based scoped threads; we can't have both ManuallyDrop and Pin) so we have to make a decision either way.

Fair enough, I heartily agree with that. Note that if we consider what @asajeffrey described as undefined behaviour in the end, that can bring back a drop-based scoped thread API.

@SimonSapin

This comment has been minimized.

Show comment
Hide comment
@SimonSapin

SimonSapin Feb 2, 2018

Contributor

As far as I understand Alan’s proposal is not to remove ManuallyDrop, only to make dropck assume that it (and other unions with Drop fields) has a destructor. (That destructors happens to do nothing, but its mere existence affects what programs dropck accepts or rejects.)

Contributor

SimonSapin commented Feb 2, 2018

As far as I understand Alan’s proposal is not to remove ManuallyDrop, only to make dropck assume that it (and other unions with Drop fields) has a destructor. (That destructors happens to do nothing, but its mere existence affects what programs dropck accepts or rejects.)

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

I'll be happy to forget about it if someone can show it was unsound even without ManuallyDrop.

Not sure if this qualifies, but here's my first attempt: A silly implementation of something like ManuallyDrop that works in pre-union Rust.

pub mod manually_drop {
    use std::mem;
    use std::ptr;
    use std::marker::PhantomData;
    
    pub struct ManuallyDrop<T> {
        data: [u8; 32],
        phantom: PhantomData<T>,
    }
    
    impl<T> ManuallyDrop<T> {
        pub fn new(x: T) -> ManuallyDrop<T> {
            assert!(mem::size_of::<T>() <= 32);
            let mut data = [0u8; 32];
            unsafe {
                ptr::copy(&x as *const _ as *const u8, &mut data[0] as *mut _, mem::size_of::<T>());
            }
            mem::forget(x);
            ManuallyDrop { data, phantom: PhantomData }
        }
        
        pub fn deref(&self) -> &T {
            unsafe {
                &*(&self.data as *const _ as *const T)
            }
        }
    }
}

(Yeah I probably have to do some more work to get the alignment right, but that could be done as well by sacrificing some bytes.)
Playground showing this breaks Pin: https://play.rust-lang.org/?gist=fe1d841cedb13d45add032b4aae6321e&version=nightly

This is what I meant by two-edged sword above -- as far as I can see, my ManuallyDrop respects all the rules we have put out. So, we have two pieces of incompatible unsafe code -- ManuallyDrop and Pin. Who's "right"? I'd say Pin relies on guarantees we have never made and hence it's "wrong" here, but this is a judgment call, not a proof.

Member

RalfJung commented Feb 2, 2018

I'll be happy to forget about it if someone can show it was unsound even without ManuallyDrop.

Not sure if this qualifies, but here's my first attempt: A silly implementation of something like ManuallyDrop that works in pre-union Rust.

pub mod manually_drop {
    use std::mem;
    use std::ptr;
    use std::marker::PhantomData;
    
    pub struct ManuallyDrop<T> {
        data: [u8; 32],
        phantom: PhantomData<T>,
    }
    
    impl<T> ManuallyDrop<T> {
        pub fn new(x: T) -> ManuallyDrop<T> {
            assert!(mem::size_of::<T>() <= 32);
            let mut data = [0u8; 32];
            unsafe {
                ptr::copy(&x as *const _ as *const u8, &mut data[0] as *mut _, mem::size_of::<T>());
            }
            mem::forget(x);
            ManuallyDrop { data, phantom: PhantomData }
        }
        
        pub fn deref(&self) -> &T {
            unsafe {
                &*(&self.data as *const _ as *const T)
            }
        }
    }
}

(Yeah I probably have to do some more work to get the alignment right, but that could be done as well by sacrificing some bytes.)
Playground showing this breaks Pin: https://play.rust-lang.org/?gist=fe1d841cedb13d45add032b4aae6321e&version=nightly

This is what I meant by two-edged sword above -- as far as I can see, my ManuallyDrop respects all the rules we have put out. So, we have two pieces of incompatible unsafe code -- ManuallyDrop and Pin. Who's "right"? I'd say Pin relies on guarantees we have never made and hence it's "wrong" here, but this is a judgment call, not a proof.

@nox

This comment has been minimized.

Show comment
Hide comment
@nox

nox Feb 2, 2018

Contributor

Now that's interesting. In some versions of our pinning stuff, Pin::pin takes a &'this mut Pin<'this, T>, but it wouldn't be unreasonable for your ManuallyDrop to have a DerefMut impl, right?

Contributor

nox commented Feb 2, 2018

Now that's interesting. In some versions of our pinning stuff, Pin::pin takes a &'this mut Pin<'this, T>, but it wouldn't be unreasonable for your ManuallyDrop to have a DerefMut impl, right?

@nox

This comment has been minimized.

Show comment
Hide comment
@nox

nox Feb 2, 2018

Contributor

Here is a playground that shows that @RalfJung's (unsurprisingly) still breaks Pin with a &mut-taking pin method.

https://play.rust-lang.org/?gist=5057570b54952e245fa463f8d7719663&version=nightly

Contributor

nox commented Feb 2, 2018

Here is a playground that shows that @RalfJung's (unsurprisingly) still breaks Pin with a &mut-taking pin method.

https://play.rust-lang.org/?gist=5057570b54952e245fa463f8d7719663&version=nightly

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

it wouldn't be unreasonable for your ManuallyDrop to have a DerefMut impl, right?

Yeah, I just added the API I needed for this example. The obvious deref_mut should work just fine.

As far as I understand Alan’s proposal is not to remove ManuallyDrop, only to make dropck assume that it (and other unions with Drop fields) has a destructor. (That destructors happens to do nothing, but its mere existence affects what programs dropck accepts or rejects.)

Ah, I had missed that; sorry about that. Adding the following to my example keeps it working though:

    unsafe impl<#[may_dangle] T> Drop for ManuallyDrop<T> {
        fn drop(&mut self) {}
    }

Only if I remove the #[may_dangle] Rust rejects it. So, at the very least, we'd have to come up with some rule that the above code violates -- just saying "there exists some code we want to be sound that this is incompatible with" is a bad call because it makes it pretty much impossible to look at some code and check if it is sound.


I think what makes me most uneasy about this "accidental guarantee" is that I don't see a single good reason that this works. The way things are wired up in Rust makes this hold together, but dropck has been added not to prevent leaks but to avoid unsound references to dead data (a common problem in destructors). The reasoning for Pin to work is not based on "here is some mechanism in the Rust compiler, or some type system guarantee, that pretty clearly says perma-borrowed data cannot be leaked" -- it is rather based on "we've tried hard and we have not been able to leak perma-borrowed data, so we think that's okay". Relying on this for soundness makes me pretty nervous. EDIT: The fact that dropck is involved makes me even more nervous because this part of the compiler has a history of nasty soundness bugs. The reason this works seems to be that perma-borrows are at odds with safe drop. This really seems to be "reasoning based on exhaustive case analysis of what one can do with perma-borrowed data".

Now, to be fair, one could say similar things about interior mutability -- it happens to be the case that permitting modifications through shared references actually works safely in some cases, if we pick the right API. However, making this work actually required explicit support in the compiler (UnsafeCell) because it clashes with optimizations, and there is unsafe code that would be sound without interior mutability but is not sound with interior mutability. Another difference is that interior mutability was a design goal from the start (or from very early on -- this is way before my time in the Rust community), which is not the case for "perma-borrowed doesn't get leaked". And finally, for interior mutability, I think there's a pretty good story about "sharing makes mutation dangerous, but not impossible, and shared references' API just says you don't get mutability in general but doesn't exclude permitting more operations for specific types", resulting in a coherent overall picture. Of course, I've spent lots of time thinking about shared references, so maybe there is an equally coherent picture for the issue at hand that I'm just not aware of.

Member

RalfJung commented Feb 2, 2018

it wouldn't be unreasonable for your ManuallyDrop to have a DerefMut impl, right?

Yeah, I just added the API I needed for this example. The obvious deref_mut should work just fine.

As far as I understand Alan’s proposal is not to remove ManuallyDrop, only to make dropck assume that it (and other unions with Drop fields) has a destructor. (That destructors happens to do nothing, but its mere existence affects what programs dropck accepts or rejects.)

Ah, I had missed that; sorry about that. Adding the following to my example keeps it working though:

    unsafe impl<#[may_dangle] T> Drop for ManuallyDrop<T> {
        fn drop(&mut self) {}
    }

Only if I remove the #[may_dangle] Rust rejects it. So, at the very least, we'd have to come up with some rule that the above code violates -- just saying "there exists some code we want to be sound that this is incompatible with" is a bad call because it makes it pretty much impossible to look at some code and check if it is sound.


I think what makes me most uneasy about this "accidental guarantee" is that I don't see a single good reason that this works. The way things are wired up in Rust makes this hold together, but dropck has been added not to prevent leaks but to avoid unsound references to dead data (a common problem in destructors). The reasoning for Pin to work is not based on "here is some mechanism in the Rust compiler, or some type system guarantee, that pretty clearly says perma-borrowed data cannot be leaked" -- it is rather based on "we've tried hard and we have not been able to leak perma-borrowed data, so we think that's okay". Relying on this for soundness makes me pretty nervous. EDIT: The fact that dropck is involved makes me even more nervous because this part of the compiler has a history of nasty soundness bugs. The reason this works seems to be that perma-borrows are at odds with safe drop. This really seems to be "reasoning based on exhaustive case analysis of what one can do with perma-borrowed data".

Now, to be fair, one could say similar things about interior mutability -- it happens to be the case that permitting modifications through shared references actually works safely in some cases, if we pick the right API. However, making this work actually required explicit support in the compiler (UnsafeCell) because it clashes with optimizations, and there is unsafe code that would be sound without interior mutability but is not sound with interior mutability. Another difference is that interior mutability was a design goal from the start (or from very early on -- this is way before my time in the Rust community), which is not the case for "perma-borrowed doesn't get leaked". And finally, for interior mutability, I think there's a pretty good story about "sharing makes mutation dangerous, but not impossible, and shared references' API just says you don't get mutability in general but doesn't exclude permitting more operations for specific types", resulting in a coherent overall picture. Of course, I've spent lots of time thinking about shared references, so maybe there is an equally coherent picture for the issue at hand that I'm just not aware of.

@asajeffrey

This comment has been minimized.

Show comment
Hide comment
@asajeffrey

asajeffrey Feb 2, 2018

Time zones are fun, I only just got up! There seem to be two issues here (invariants in general, and dropck in particular), so I'll put them in separate comments...

Time zones are fun, I only just got up! There seem to be two issues here (invariants in general, and dropck in particular), so I'll put them in separate comments...

@asajeffrey

This comment has been minimized.

Show comment
Hide comment
@asajeffrey

asajeffrey Feb 2, 2018

@RalfJung: yes, this is an issue about the invariants being maintained by unsafe Rust. For any version of Rust+std, there's more than one choice of invariant I which is maintained using rely-guarantee reasoning. And indeed there may be two libraries L1 and L2, which chose incompatible I1 and I2, such that Rust+L1 is safe and Rust+L2 is safe, but Rust+L1+L2 is unsafe.

In this case, L1 is ManuallyDrop and L2 is Josephine, and it's pretty clear that ManuallyDrop is going to win since it's now in std, which has much stronger backward compatibility constraints than Josephine.

Interestingly, the guidelines at https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html are written as "It is the programmer's responsibility when writing unsafe code that it is not possible to let safe code exhibit these behaviors: ..." that is, it's a contextual property (for all safe contexts C, C[P] can't go wrong) and so is dependent on version (since v1.20 of Rust+std has more safe contexts than v1.18). In particular, I'd claim that pinning actually did satisfy this constraint for Rust before 1.20, since there was no safe context C s.t. C[Pinning] goes wrong.

However, this is just barracks-room lawyering, I think everyone agrees that there's a problem with this contextual defn, hence all the discussions about unsafe code guidelines.

If nothing else, I think that pinning has shown an interesting example of accidental invariants going wrong.

@RalfJung: yes, this is an issue about the invariants being maintained by unsafe Rust. For any version of Rust+std, there's more than one choice of invariant I which is maintained using rely-guarantee reasoning. And indeed there may be two libraries L1 and L2, which chose incompatible I1 and I2, such that Rust+L1 is safe and Rust+L2 is safe, but Rust+L1+L2 is unsafe.

In this case, L1 is ManuallyDrop and L2 is Josephine, and it's pretty clear that ManuallyDrop is going to win since it's now in std, which has much stronger backward compatibility constraints than Josephine.

Interestingly, the guidelines at https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html are written as "It is the programmer's responsibility when writing unsafe code that it is not possible to let safe code exhibit these behaviors: ..." that is, it's a contextual property (for all safe contexts C, C[P] can't go wrong) and so is dependent on version (since v1.20 of Rust+std has more safe contexts than v1.18). In particular, I'd claim that pinning actually did satisfy this constraint for Rust before 1.20, since there was no safe context C s.t. C[Pinning] goes wrong.

However, this is just barracks-room lawyering, I think everyone agrees that there's a problem with this contextual defn, hence all the discussions about unsafe code guidelines.

If nothing else, I think that pinning has shown an interesting example of accidental invariants going wrong.

@asajeffrey

This comment has been minimized.

Show comment
Hide comment
@asajeffrey

asajeffrey Feb 2, 2018

The particular thing that untagged unions (and hence ManuallyDrop) did was in the interaction with the drop checker, in particular ManualDrop acts like its defn is:

unsafe impl<#[may_dangle] T> Drop for ManuallyDrop<T> { ... }

and then you can have a conversation about whether this is allowed or not :) Indeed, this conversation is happening over in the may_dangle thread starting at #34761 (comment)

@RalfJung your code shows an interesting corner case, where the run-time type for data is T, but it's compile-time type is [u8; N]. Which type counts as far as may_dangle is concerned?

The particular thing that untagged unions (and hence ManuallyDrop) did was in the interaction with the drop checker, in particular ManualDrop acts like its defn is:

unsafe impl<#[may_dangle] T> Drop for ManuallyDrop<T> { ... }

and then you can have a conversation about whether this is allowed or not :) Indeed, this conversation is happening over in the may_dangle thread starting at #34761 (comment)

@RalfJung your code shows an interesting corner case, where the run-time type for data is T, but it's compile-time type is [u8; N]. Which type counts as far as may_dangle is concerned?

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

Interestingly, the guidelines at https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html are written as "It is the programmer's responsibility when writing unsafe code that it is not possible to let safe code exhibit these behaviors: ..." that is, it's a contextual property

Ah, interesting. I agree this is clearly not sufficient -- this would make the original scoped threads sound. To be meaningful, this has to (at least) specify the set of unsafe code that the safe code is permitted to call.

Personally, I feel a better way of specifying this is to give the invariants that are to be maintained. But I'm clearly biased here, because the methodology I use for proving things about Rust requires such an invariant. ;)

I'm a little surprised that page doesn't contain some sort of disclaimer of being preliminary; we're not really certain yet what exactly the limit will be -- as this discussion shows. We require unsafe code to at least do what that document says, but we probably have to require more.

For example, the limits of undefined behavior and what unsafe code can do are not the same. See nikomatsakis/rust-memory-model#44 for a recent discussion on that topic: Duplicating a &mut T for mem::size_of::<T>() == 0 does not lead to any undefined behavior directly, and yet is clearly considered illegal for unsafe code to do. The reason is that other unsafe code may rely on its ownership discipline being respected, and duplicating things violates that discipline.

Member

RalfJung commented Feb 2, 2018

Interestingly, the guidelines at https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html are written as "It is the programmer's responsibility when writing unsafe code that it is not possible to let safe code exhibit these behaviors: ..." that is, it's a contextual property

Ah, interesting. I agree this is clearly not sufficient -- this would make the original scoped threads sound. To be meaningful, this has to (at least) specify the set of unsafe code that the safe code is permitted to call.

Personally, I feel a better way of specifying this is to give the invariants that are to be maintained. But I'm clearly biased here, because the methodology I use for proving things about Rust requires such an invariant. ;)

I'm a little surprised that page doesn't contain some sort of disclaimer of being preliminary; we're not really certain yet what exactly the limit will be -- as this discussion shows. We require unsafe code to at least do what that document says, but we probably have to require more.

For example, the limits of undefined behavior and what unsafe code can do are not the same. See nikomatsakis/rust-memory-model#44 for a recent discussion on that topic: Duplicating a &mut T for mem::size_of::<T>() == 0 does not lead to any undefined behavior directly, and yet is clearly considered illegal for unsafe code to do. The reason is that other unsafe code may rely on its ownership discipline being respected, and duplicating things violates that discipline.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Feb 2, 2018

Member

If nothing else, I think that pinning has shown an interesting example of accidental invariants going wrong.

Oh, that certainly. And I wonder what we can do to avoid this in the future? Maybe put some big warning onto https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html saying "just because an invariant happens to hold in rustc+libstd, doesn't mean unsafe code can rely on it; instead here's some invariants that you can rely on"?

Member

RalfJung commented Feb 2, 2018

If nothing else, I think that pinning has shown an interesting example of accidental invariants going wrong.

Oh, that certainly. And I wonder what we can do to avoid this in the future? Maybe put some big warning onto https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html saying "just because an invariant happens to hold in rustc+libstd, doesn't mean unsafe code can rely on it; instead here's some invariants that you can rely on"?

@asajeffrey

This comment has been minimized.

Show comment
Hide comment
@asajeffrey

asajeffrey Feb 2, 2018

@RalfJung yes, I don't think anyone's in love with the contextual defn of "correctness", primarily because it's brittle wrt the observational power of contexts. I'd be a lot happier with a semantic defn in terms of invariants.

The only thing I'd ask for is please can we give ourselves some wiggle room and define two invariants for rely-guarantee reasoning (code can rely on R and should guarantee G, where G implies R). That way there's some room for strengthening R and weakening G. If we just have one invariant (i.e. R=G) we're stuck never being able to change them!

@RalfJung yes, I don't think anyone's in love with the contextual defn of "correctness", primarily because it's brittle wrt the observational power of contexts. I'd be a lot happier with a semantic defn in terms of invariants.

The only thing I'd ask for is please can we give ourselves some wiggle room and define two invariants for rely-guarantee reasoning (code can rely on R and should guarantee G, where G implies R). That way there's some room for strengthening R and weakening G. If we just have one invariant (i.e. R=G) we're stuck never being able to change them!

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Mar 26, 2018

Member

Constant checking currently doesn't special-case union fields: (cc @solson @oli-obk)

union Transmute<T, U> { from: T, to: U }

const SILLY: () = unsafe {
    (Transmute::<usize, Box<String>> { from: 1 }.to, ()).1
};

fn main() {
    SILLY
}

The above code produces the miri evaluation error "calling non-const fn std::ptr::drop_in_place::<(std::boxed::Box<std::string::String>, ())> - shim(Some((std::boxed::Box<std::string::String>, ())))".

Changing it to force the type of .to to be observed by the const-checker:

const fn id<T>(x: T) -> T { x }

const SILLY: () = unsafe {
    (id(Transmute::<usize, Box<String>> { from: 1 }.to), ()).1
};

results in "destructors cannot be evaluated at compile-time".

Relevant implementation code is here (specifically the restrict call):

ProjectionElem::Field(..) |

Member

eddyb commented Mar 26, 2018

Constant checking currently doesn't special-case union fields: (cc @solson @oli-obk)

union Transmute<T, U> { from: T, to: U }

const SILLY: () = unsafe {
    (Transmute::<usize, Box<String>> { from: 1 }.to, ()).1
};

fn main() {
    SILLY
}

The above code produces the miri evaluation error "calling non-const fn std::ptr::drop_in_place::<(std::boxed::Box<std::string::String>, ())> - shim(Some((std::boxed::Box<std::string::String>, ())))".

Changing it to force the type of .to to be observed by the const-checker:

const fn id<T>(x: T) -> T { x }

const SILLY: () = unsafe {
    (id(Transmute::<usize, Box<String>> { from: 1 }.to), ()).1
};

results in "destructors cannot be evaluated at compile-time".

Relevant implementation code is here (specifically the restrict call):

ProjectionElem::Field(..) |

@arielb1

This comment has been minimized.

Show comment
Hide comment
@arielb1

arielb1 Apr 10, 2018

Contributor

Better analysis of #41073 had revealed that the semantics for when destructors run when assigning to subfields of unions are insufficiently ready for stabilization. See that issue for details.

Contributor

arielb1 commented Apr 10, 2018

Better analysis of #41073 had revealed that the semantics for when destructors run when assigning to subfields of unions are insufficiently ready for stabilization. See that issue for details.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Apr 17, 2018

Member

Is it realistic to just entirely rule out Drop types in unions, and implement ManuallyDrop separately (as a lang-item)? From what I can tell, ManuallyDrop seems to be the biggest motivation for Drop in unions, but that's a very special case.

Absent of a positive "no drop" trait, we could then say a union is well-formed if every field is either Copy or of the form ManuallyDrop<T>. That would entirely side-step all the complications around dropping when assigning union fields (where it seems every possible solution will be full of surprising footguns), and the ManuallyDrop is a clear marker to programmers that they have to handle Drop themselves here. (The check could be smarter, e.g. it could traverse through product types and through nominal types that are declared in the same crate. Of course, having a positive way to say "this type will never implement Drop" would be nicer.)


The checklist in the first post does not mention unsized unions, nor does the RFC---but still we have an implementation, restricted to single-variant unions. This is closely related to the interaction with layout optimizations because it presupposes (once thin-pointer DSTs enter the picture) that a single-variant union has to be "valid" in some sense (it may be dropped, but it can't be any odd bit pattern either).

This conflicts with the way unions are sometimes used in C, which is an an "extension point" (IIRC @joshtriplett was the one to mention this at the all-hands): A header file may declare 3 variants for a union, but this is considered forward-compatible with adding more variants later (as long as that does not increase the size of the union). The user of the library promises to not touch the union data if the tag (sitting somewhere else) indicates that they do not know the current variant. Crucially, if you only know a single variant, that doesn't mean that there only is a single variant!

Member

RalfJung commented Apr 17, 2018

Is it realistic to just entirely rule out Drop types in unions, and implement ManuallyDrop separately (as a lang-item)? From what I can tell, ManuallyDrop seems to be the biggest motivation for Drop in unions, but that's a very special case.

Absent of a positive "no drop" trait, we could then say a union is well-formed if every field is either Copy or of the form ManuallyDrop<T>. That would entirely side-step all the complications around dropping when assigning union fields (where it seems every possible solution will be full of surprising footguns), and the ManuallyDrop is a clear marker to programmers that they have to handle Drop themselves here. (The check could be smarter, e.g. it could traverse through product types and through nominal types that are declared in the same crate. Of course, having a positive way to say "this type will never implement Drop" would be nicer.)


The checklist in the first post does not mention unsized unions, nor does the RFC---but still we have an implementation, restricted to single-variant unions. This is closely related to the interaction with layout optimizations because it presupposes (once thin-pointer DSTs enter the picture) that a single-variant union has to be "valid" in some sense (it may be dropped, but it can't be any odd bit pattern either).

This conflicts with the way unions are sometimes used in C, which is an an "extension point" (IIRC @joshtriplett was the one to mention this at the all-hands): A header file may declare 3 variants for a union, but this is considered forward-compatible with adding more variants later (as long as that does not increase the size of the union). The user of the library promises to not touch the union data if the tag (sitting somewhere else) indicates that they do not know the current variant. Crucially, if you only know a single variant, that doesn't mean that there only is a single variant!

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 17, 2018

Member

The check could be smarter, e.g. it could traverse through product types and through nominal types that are declared in the same crate.

This predicate already exists, but is conservative in generics due to there being no trait to bound on.
You can access it via std::mem::needs_drop (which uses an intrinsic which rustc implements).

Member

eddyb commented Apr 17, 2018

The check could be smarter, e.g. it could traverse through product types and through nominal types that are declared in the same crate.

This predicate already exists, but is conservative in generics due to there being no trait to bound on.
You can access it via std::mem::needs_drop (which uses an intrinsic which rustc implements).

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Apr 17, 2018

Member

@eddyb will needs_drop take forwards compatibility into account, or will it happily look into other creates to determine if their types implement Drop? The goal here is to have a check that will never break from semver-compatible changes, where e.g. adding an impl Drop to a struct with no type or lifetime parameters and only private fields is semver compatible.

Member

RalfJung commented Apr 17, 2018

@eddyb will needs_drop take forwards compatibility into account, or will it happily look into other creates to determine if their types implement Drop? The goal here is to have a check that will never break from semver-compatible changes, where e.g. adding an impl Drop to a struct with no type or lifetime parameters and only private fields is semver compatible.

@petrochenkov

This comment has been minimized.

Show comment
Hide comment
@petrochenkov

petrochenkov Apr 17, 2018

Contributor

@RalfJung

This conflicts with the way unions are sometimes used in C, which is an an "extension point" (IIRC @joshtriplett was the one to mention this at the all-hands): A header file may declare 3 variants for a union, but this is considered forward-compatible with adding more variants later (as long as that does not increase the size of the union). The user of the library promises to not touch the union data if the tag (sitting somewhere else) indicates that they do not know the current variant. Crucially, if you only know a single variant, that doesn't mean that there only is a single variant!

That's a very specific case.
It only affects C-style unions (so there are no destructors and everything is Copy, exactly the subset available on stable) generated from C headers.
We can easily add a _dummy: () or _future: () field to such unions and keep benefitting from safer "enum" model by default. A FFI union being an "extension point" is something that needs to be well documented anyway.

Contributor

petrochenkov commented Apr 17, 2018

@RalfJung

This conflicts with the way unions are sometimes used in C, which is an an "extension point" (IIRC @joshtriplett was the one to mention this at the all-hands): A header file may declare 3 variants for a union, but this is considered forward-compatible with adding more variants later (as long as that does not increase the size of the union). The user of the library promises to not touch the union data if the tag (sitting somewhere else) indicates that they do not know the current variant. Crucially, if you only know a single variant, that doesn't mean that there only is a single variant!

That's a very specific case.
It only affects C-style unions (so there are no destructors and everything is Copy, exactly the subset available on stable) generated from C headers.
We can easily add a _dummy: () or _future: () field to such unions and keep benefitting from safer "enum" model by default. A FFI union being an "extension point" is something that needs to be well documented anyway.

@joshtriplett

This comment has been minimized.

Show comment
Hide comment
@joshtriplett

joshtriplett Apr 17, 2018

Member
Member

joshtriplett commented Apr 17, 2018

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Apr 18, 2018

Member

@RalfJung No, it behaves like auto traits do, exposing all internal details.

Member

eddyb commented Apr 18, 2018

@RalfJung No, it behaves like auto traits do, exposing all internal details.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Apr 18, 2018

Member

There's currently some discussion around "active fields" and drop in unions over at #41073 (comment)

Member

RalfJung commented Apr 18, 2018

There's currently some discussion around "active fields" and drop in unions over at #41073 (comment)

@derekdreery

This comment has been minimized.

Show comment
Hide comment
@derekdreery

derekdreery Jul 11, 2018

Contributor

Unions should continue being a bag of bits, with Rust having no idea what they might contain at any given time until unsafe code accesses it.

This is exactly how I would expect unions to work. They are an advanced feature to squeeze out extra performance and interact with C code, where there are no such things as destructors.

For me, if you want to drop the contents of a union, you should have to cast/transmute (maybe it can't be transmute because it might be bigger with some unused bits at the end for another variant) it to the type you want to drop take pointers to the fields that need dropping and use std::ptr::drop_in_place, or use the field syntax to extract the value.

If I knew nothing about unions this is how I would expect them to work:

Example - Representing mem::uninitialized as a union

pub union MaybeValid<T> {
    valid: T,
    invalid: ()
}

impl<T> MaybeValid<T> {
    #[inline] // this should optimize to a no-op
    pub fn from_valid(valid: T) -> MaybeValid<T> {
        MaybeValid { valid }
    }

    pub fn invalid() -> MaybeValid<T> {
        MaybeValid { invalid: () }
    }

   pub fn zeroed() -> MaybeValid<T> {
        // do whatever is necessary here...
        unimplemented!()
    }
}

fn example() {
    let valid_data = MaybeValid::from_valid(1_u8);
    // Destructor of a union always does nothing, but that's OK since our 
    // data type owns nothing.
    drop(valid_data);
    let invalid_data = MaybeValid::invalid();
    // Destructor of a union again does nothing, which means it needs to know 
    // nothing about its surroundings, and can't accidentally try to free unused memory.
    drop(invalid_data);
    let valid_data = MaybeValid::from_valid(String::from("test string"));
    // Now if we dropped `valid_data` we would leak memory, since the string 
    // would never get freed. This is already possible in safe rust using e.g. `Rc`. 
    // `union` is a similarly advanced feature to `Rc` and so new users are 
    // protected by the order in which concepts are introduced to them. This is 
    // still "safe" even though it leaks because it cannot trigger UB.
    //drop(valid_data)
    // Since we know that our union is of a particular form, we can safely 
    // move the value out, in order to run the destructor. I would expect this 
    // to fail if the drop method had run, even though the drop method does 
    // nothing, because that's the way stuff works in rust - once it's dropped
    // you can't use it.
    let _string_to_drop = unsafe { valid_data.valid };
    // No memory leak and all unsafety is encapsulated.
}

I'm going to post this then edit it so I don't loose my working.
EDIT @SimonSapin way to drop fields.

Contributor

derekdreery commented Jul 11, 2018

Unions should continue being a bag of bits, with Rust having no idea what they might contain at any given time until unsafe code accesses it.

This is exactly how I would expect unions to work. They are an advanced feature to squeeze out extra performance and interact with C code, where there are no such things as destructors.

For me, if you want to drop the contents of a union, you should have to cast/transmute (maybe it can't be transmute because it might be bigger with some unused bits at the end for another variant) it to the type you want to drop take pointers to the fields that need dropping and use std::ptr::drop_in_place, or use the field syntax to extract the value.

If I knew nothing about unions this is how I would expect them to work:

Example - Representing mem::uninitialized as a union

pub union MaybeValid<T> {
    valid: T,
    invalid: ()
}

impl<T> MaybeValid<T> {
    #[inline] // this should optimize to a no-op
    pub fn from_valid(valid: T) -> MaybeValid<T> {
        MaybeValid { valid }
    }

    pub fn invalid() -> MaybeValid<T> {
        MaybeValid { invalid: () }
    }

   pub fn zeroed() -> MaybeValid<T> {
        // do whatever is necessary here...
        unimplemented!()
    }
}

fn example() {
    let valid_data = MaybeValid::from_valid(1_u8);
    // Destructor of a union always does nothing, but that's OK since our 
    // data type owns nothing.
    drop(valid_data);
    let invalid_data = MaybeValid::invalid();
    // Destructor of a union again does nothing, which means it needs to know 
    // nothing about its surroundings, and can't accidentally try to free unused memory.
    drop(invalid_data);
    let valid_data = MaybeValid::from_valid(String::from("test string"));
    // Now if we dropped `valid_data` we would leak memory, since the string 
    // would never get freed. This is already possible in safe rust using e.g. `Rc`. 
    // `union` is a similarly advanced feature to `Rc` and so new users are 
    // protected by the order in which concepts are introduced to them. This is 
    // still "safe" even though it leaks because it cannot trigger UB.
    //drop(valid_data)
    // Since we know that our union is of a particular form, we can safely 
    // move the value out, in order to run the destructor. I would expect this 
    // to fail if the drop method had run, even though the drop method does 
    // nothing, because that's the way stuff works in rust - once it's dropped
    // you can't use it.
    let _string_to_drop = unsafe { valid_data.valid };
    // No memory leak and all unsafety is encapsulated.
}

I'm going to post this then edit it so I don't loose my working.
EDIT @SimonSapin way to drop fields.

@SimonSapin

This comment has been minimized.

Show comment
Hide comment
@SimonSapin

SimonSapin Jul 11, 2018

Contributor

if you want to drop the contents of a union, you should have to cast/transmute (maybe it can't be transmute because it might be bigger with some unused bits at the end for another variant) it to the type you want to drop, or use the field syntax to extract the value

(If it’s only to let it drop there is no need to extract the value in the sense of moving it, you can take a pointer to one of the fields and use std::ptr::drop_in_place.)

Contributor

SimonSapin commented Jul 11, 2018

if you want to drop the contents of a union, you should have to cast/transmute (maybe it can't be transmute because it might be bigger with some unused bits at the end for another variant) it to the type you want to drop, or use the field syntax to extract the value

(If it’s only to let it drop there is no need to extract the value in the sense of moving it, you can take a pointer to one of the fields and use std::ptr::drop_in_place.)

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Jul 11, 2018

Contributor

Related: For constants I'm currently arguing that at least one field of a union inside a constant needs to be correct: #51361 (if you have a ZST field that's always true)

Contributor

oli-obk commented Jul 11, 2018

Related: For constants I'm currently arguing that at least one field of a union inside a constant needs to be correct: #51361 (if you have a ZST field that's always true)

@SimonSapin

This comment has been minimized.

Show comment
Hide comment
@SimonSapin

SimonSapin Jul 11, 2018

Contributor

I'm going to post this then edit it so I don't loose my working.

Please note that edits are not reflected in email notifications. If you’re going to make significant changes to your comment, consider making a new comment instead or in addition.

Contributor

SimonSapin commented Jul 11, 2018

I'm going to post this then edit it so I don't loose my working.

Please note that edits are not reflected in email notifications. If you’re going to make significant changes to your comment, consider making a new comment instead or in addition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment