Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unions have no active field #478

Merged
merged 10 commits into from
Jan 27, 2019
Merged

unions have no active field #478

merged 10 commits into from
Jan 27, 2019

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Dec 9, 2018

be placed in `unsafe` blocks.
Unions have no notion of an "active field". Instead, every union access just
interprets the storage at the type of the field used for the access. The effect
of reading a union with a different field than it was written to is that of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"field ... it was written to" here looks suspiciously similar to the "active field" that's intended to be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change compared to the previous semantics, I think, is that the "active field"/"field it was written to" doesn't necessarily exist with the new rules.
(While using the "active field" or some other wording for the field in question is a matter of terminology.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to improve the wording to make it more clear that there is nothing remembering which field was written to, just re-interpreting storage at different types. Does that help?

@briansmith
Copy link

The bigger problem with the current text is that it says "More detailed specification for unions, including unstable bits, can be found in RFC 1897," i.e. it defers to the RFC(s) that say that type punning is UB. I think the RFC reference should be removed too to clarify that the reference is now more correct than the RFC.

Copy link
Contributor

@alercah alercah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the "bad type" sentence. The problem is not the type, it is the value being interpreted as that type.

Also, my understanding was that conclusions from the unsafe WG would not be exported to the language spec (which among other things means documentation in the reference) until they went through RFC. So I'm uncomfortable with this until that is done.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 10, 2018

This is not a conclusion from a UCG discussion. It is, IMO, a better description of the current behavior of Rust: Rust never disallowed type punning, and describing the behavior of Rust never required the concept of an "active field". "Active field" is a notion coming from C where it is required to enable type-based alias analysis, which was a definite non-goal for Rust for many years. I know of no other purpose of this notion.

This does not change the semantics. The RFC this text refers to (and I will remove that reference) did not get accepted, and the old text in the reference does not say that type punning is disallowed. It just distinguishes between the "active field" and all the other fields in a way that I think is more confusing than saying "every single access transmutes (just transmuting T to T is trivially correct)".

@RalfJung
Copy link
Member Author

I dislike the "bad type" sentence. The problem is not the type, it is the value being interpreted as that type.

Agreed, I tried to improve the wording. What do you think?

Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits to make it slightly better :)

src/items/unions.md Outdated Show resolved Hide resolved
src/items/unions.md Outdated Show resolved Hide resolved
@Centril
Copy link
Contributor

Centril commented Dec 10, 2018

(While the reference isn't normative and may contain rustc specifics, we should strive to avoid that and be as spec-y as we can...)

That said; the wording "storage" carries connotations of unions being bags of bits that hasn't been agreed to yet. Is there a more neutral phrasing perhaps?

While in general, unions may not have an active field, that may still be part of an API invariant for a specific union data-type.

@petrochenkov
Copy link
Contributor

petrochenkov commented Dec 10, 2018

"More detailed specification for unions, including unstable bits, can be found in RFC 1897," i.e. it defers to the RFC(s) that say that type punning is UB.

Where did you find that?
RFC 1897 explicitly said that supporting type punning is the most important requirement for unions.
(Also it's still the more detailed specification for unions as they are implemented now, despite the "active field doesn't necessarily exist" change.)

@RalfJung
Copy link
Member Author

That said; the wording "storage" carries connotations of unions being bags of bits that hasn't been agreed to yet. Is there a more neutral phrasing perhaps?

Lazily doing transmutes at every access (and not carrying any extra state) already has that connotation.

The main open question is whether there are some restrictions on which bit patterns are allowed.

@Centril
Copy link
Contributor

Centril commented Dec 10, 2018

@RalfJung I suppose "bag of bits with restriction on the bits" still makes sense if you say "storage" heh ;)

@briansmith
Copy link

Where did you find that?
RFC 1897 explicitly said that supporting type punning is the most important requirement for unions.

I followed the link to RFC 1897. Then I noticed that it was rejected (postponed), so I went and found the latest accepted RFC for unions, which was RFC 1444, IIUC. That says "In addition, since a union declared without #[repr(C)] uses an unspecified binary layout, code reading fields of such a union or pattern-matching such a union must not read from a field other than the one written to. This includes pattern-matching a specific value in a union field."

(Also it's still the more detailed specification for unions as they are implemented now, despite the "active field doesn't necessarily exist" change.)

I think the best way to fix that is to copy whatever is good from RFC 1897 into the reference. End-users like me have no way to interpret the authority of a postponed RFC, so using it as the reference is not good.

@strega-nil
Copy link

@RalfJung they're not transmutes though - they're reading bits at a type. transmute implies that they were initially a value of one type, that is then read as another type - in this case, you're simply doing the latter half. transmute is an extra step.

@RalfJung
Copy link
Member Author

@ubsan The first half happened when writing to a union field (including initializing the union).

Any suggestions for how to word this?

@matthewjasper
Copy link
Contributor

Isn't transmute_copy closer. The union may be larger than some of its fields.

@RalfJung
Copy link
Member Author

transmute_copy is about bypassing !Copy, what does it have to do with being larger?

@matthewjasper
Copy link
Contributor

It also doesn't require that its two type parameters are the same size, which transmute does.

@strega-nil
Copy link

@RalfJung I would rather word it similarly to the transmute documentation - "reads the bits of the union as a value of the field type"

@RalfJung
Copy link
Member Author

@ubsan better like this?

@strega-nil
Copy link

@RalfJung definitely! I like the new wording a lot more :)

Havvy
Havvy previously requested changes Dec 23, 2018
Copy link
Contributor

@Havvy Havvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One formatting nit.

One theoretical problem though: Transmute only works for types of the same size. So, what happens when you have union { eight: u8, thirty_two: u32 } and write to thirty_two and read from eight?

src/items/unions.md Outdated Show resolved Hide resolved
@strega-nil
Copy link

@Havvy that would be reading uninitialized memory, and at best would result in an unspecified value.

@RalfJung
Copy link
Member Author

@Havvy it'll just read whatever you put into those bytes. If you never put anything into them, as @ubsan said you get uninitialized data.

@Havvy
Copy link
Contributor

Havvy commented Dec 27, 2018

The PR doesn't state that that is what happens.

@RalfJung
Copy link
Member Author

It says that the bytes get read at the given type. I cannot see any ambiguity in there. The comparison with transmute is just to connect to a mental model the reader might already have formed.

Do you have suggestions for wording that you think you improve clarity here?

@Havvy
Copy link
Contributor

Havvy commented Dec 28, 2018

I think I'm quibbling on the word "equivalent". "analogous" might be a better choice here?

@RalfJung
Copy link
Member Author

RalfJung commented Jan 9, 2019

I don't see any semantic difference between the two in this context, so sure. I adjusted the text.

@matthewjasper matthewjasper merged commit d201abc into rust-lang:master Jan 27, 2019
@matthewjasper
Copy link
Contributor

Thanks

bors added a commit to rust-lang/rust that referenced this pull request Mar 5, 2019
Update books

Update nomicon, reference, book, edition-guide, embedded-book

rust-by-example cannot be updated because it is currently broken (rust-lang/rust-by-example#1161)

## nomicon

8 commits in b7eb4a087207af2405c0669fa577f8545b894c66..f1ff93b66844493a7b03101c7df66ac958c62418
2018-11-30 08:52:24 -0500 to 2019-02-26 13:37:28 -0500
- Fix typo in other-reprs.md (rust-lang/nomicon#112)
- Remove duplicate 'the' in aliasing.md (rust-lang/nomicon#116)
- Fix typo in subtyping.md (rust-lang/nomicon#117)
- Trivial updates to the coercions chapter (rust-lang/nomicon#118)
- Fix double "the" in aliasing.md (rust-lang/nomicon#119)
- Fixes outdated bindgen link (rust-lang/nomicon#110)
- Fix link to type layout reference (rust-lang/nomicon#121)
- Fix capitalization of Rust in races.md (rust-lang/nomicon#107)

## reference

18 commits in 1c775a1..41493ff
2019-01-13 21:12:05 +0100 to 2019-03-05 12:32:22 +0100
- Fix broken link. (rust-lang/reference#527)
- Update attribute documentation. (rust-lang/reference#528)
- Remove target_has_atomic. (rust-lang/reference#529)
- Remove "mode" from derive macro terminology. (rust-lang/reference#530)
- Clarifications (rust-lang/reference#493)
- Fix an incorrect note (rust-lang/reference#522)
- Document extern_crate_self (rust-lang/reference#517)
- Fix spelling (rust-lang/reference#520)
- Update conditional-compilation.md (rust-lang/reference#503)
- Document if_while_or_patterns (rust-lang/reference#516)
- Fix some broken links. (rust-lang/reference#519)
- Clarify what access struct updates do (rust-lang/reference#518)
- Document irrefutable_let_patterns (rust-lang/reference#515)
- Document Rc/Arc method receivers. (rust-lang/reference#494)
- unions have no active field (rust-lang/reference#478)
- attributes.md Outer -> Inner (rust-lang/reference#510)
- Let bindings are now available in const contexts (rust-lang/reference#512)
- Add missed literal at MacroFragSpec (rust-lang/reference#513)

## book

2 commits in fab9419503f0e34c1a06f9350a6705d0ce741dc6..9cffbeabec3bcec42d09432bfe7705125c848889
2019-02-25 07:53:23 -0500 to 2019-03-02 08:22:41 -0500
- More edits (rust-lang/book#1838)
- Remove the 2018 edition nostarch directory

## edition-guide

1 commits in 5f3cc2a5618700efcde3bc00799744f21fa9ad2e..aa0022c875907886cae8f3ef8e9ebf6e2a5e728d
2019-02-27 20:11:50 -0800 to 2019-02-27 22:10:39 -0800
- Fix one last link in readme. (rust-lang/edition-guide#153)

## embedded-book

12 commits in bd2778f304989ee52be8201504d6ec621dd60ca9..9e656ead82bfe869493dec82653a52e27fa6a05c
2019-02-10 12:37:14 +0000 to 2019-03-03 16:03:26 +0000
- Merge rust-embedded/book#174
- Merge rust-embedded/book#172
- Merge rust-embedded/book#169
- Merge rust-embedded/book#171
- Merge rust-embedded/book#168
- Merge rust-embedded/book#153
- Merge rust-embedded/book#142 rust-embedded/book#151
- Merge rust-embedded/book#133 rust-embedded/book#135
- Merge rust-embedded/book#150
- Merge rust-embedded/book#145
- Merge rust-embedded/book#129
- Merge rust-embedded/book#128
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants