Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upadd a stucts-and-tuples chapter #31
Conversation
This was referenced Sep 25, 2018
eddyb
reviewed
Sep 26, 2018
eddyb
reviewed
Sep 26, 2018
eddyb
reviewed
Sep 26, 2018
eddyb
reviewed
Sep 26, 2018
eddyb
reviewed
Sep 26, 2018
eddyb
reviewed
Sep 26, 2018
This comment has been minimized.
This comment has been minimized.
|
Pushed an update containing the various suggestions from @eddyb. |
gnzlbg
reviewed
Sep 26, 2018
| [#42877]: https://github.com/rust-lang/rust/issues/42877 | ||
| [pg-unsized-tuple]: https://play.rust-lang.org/?gist=46399bb68ac685f23beffefc014203ce&version=nightly&mode=debug&edition=2015 | ||
|
|
||
| There are also benefits also to having fewer guarantees. For example: |
This comment has been minimized.
This comment has been minimized.
alercah
approved these changes
Sep 26, 2018
| ### Default layout ("repr rust") | ||
|
|
||
| The default layout of structs is undefined and subject to change | ||
| between compiler revisions. We further do not guarantee that two |
This comment has been minimized.
This comment has been minimized.
alercah
Sep 26, 2018
Between individual compilations, no? I think that is what we had determined was the line.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Oct 11, 2018
Author
Collaborator
Actually, I would like to push back on this slightly — there is a general desire to ensure that th compiler output is deterministic. This is not 100% true but it is very nearly true, and we would like to to be true. This seems to imply that, so long as the input does not change, the layout cannot change. I am not sure why we would need to lose that guarantee.
rkruppe
reviewed
Sep 27, 2018
This comment has been minimized.
This comment has been minimized.
|
Long ago I proposed that we might want to guarantee (some subset of) newtype unpacking for repr(Rust) structs. @nikomatsakis carried this over into #11 as discussion point but it received no further discussion. I like to think that means it's uncontroversial To make a specific proposal, let's restrict it to structs [1] that contain a single field having the same memory layout as the type of the sole field. So [1] The same guarantee for |
This comment has been minimized.
This comment has been minimized.
the8472
commented
Oct 3, 2018
|
@rkruppe for So shouldn't that guarantee be restricted to structs with a single public field? Or at least there should be a lint if such transmutations are used by a crate that does not own |
This comment has been minimized.
This comment has been minimized.
|
Visibility should not affect layout, only who can rely on the layout. |
This comment has been minimized.
This comment has been minimized.
the8472
commented
Oct 3, 2018
|
That might be worth documenting. Still, it might be better to document the intent of such structs with |
This comment has been minimized.
This comment has been minimized.
|
I agree that such subtleties should be documented, but "punishing" people who do not do that in a certain prescribed way (a repr attribute, instead of e.g. a comment) by making their code de jure undefined behavior (especially if it will de facto never misbehave because the layout cannot sensibly be any different) is simply user-hostile with no benefit. |
scottmcm
referenced this pull request
Oct 5, 2018
Closed
RevSlice requires repr(C) or repr(transparent) #1
RalfJung
reviewed
Oct 5, 2018
RalfJung
reviewed
Oct 5, 2018
nikomatsakis
added some commits
Oct 11, 2018
nikomatsakis
referenced this pull request
Oct 11, 2018
Closed
Effect of `packed` and `align` on representation #17
rkruppe
reviewed
Oct 11, 2018
| The default layout of structs is not specified. Effectively, the | ||
| compiler provdes a deterministic function per struct definition that | ||
| defines its layout. This function may as principle take as input the | ||
| entire input program. Therefore: |
This comment has been minimized.
This comment has been minimized.
rkruppe
Oct 11, 2018
Member
I am not sure this wording reserves the freedoms it want to reserve, and even if someone argued it does I would like it to be clarified. Since we did not get consensus to rule out profile-guided layout, the program source code is not all that informs layout. Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...). So I don't think we can guarantee anything involving the words "deterministic function" and have to stick to something like "every time you invoke the compiler you may get a completely different layout".
This comment has been minimized.
This comment has been minimized.
gnzlbg
Oct 11, 2018
Collaborator
Since we did not get consensus to rule out profile-guided layout
I might have missed this, but did you managed to write your thoughts about this down?
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Oct 23, 2018
Author
Collaborator
Hmm. I agree with the gist of your comment @rkruppe but I think I don't quite agree with this part:
Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...).
In particular, it seems like we would basically want to say that layout is determined by the program source code + other auxiliary inputs (e.g., compiler settings, output from PGO, etc).
I don't know that we need to give the freedom for rustc to choose arbitrary orderings on two consecutive runs where nothing at all changed (in particular, we've been shooting for deterministic compilation, and this would sort of contravene that). Of course it's ok to start there for now, since it doesn't say that we have to change layout...just seems looser than is needed.
Am I missing something?
This comment has been minimized.
This comment has been minimized.
rkruppe
Oct 23, 2018
Member
To be clear: it is certainly possible to define PGO traces as part of the compiler input (beyond just sources and compiler flags), though it's a bit of a stretch IMO. That should be called out explicit in the text, though. What's currently written here could reasonably be read as saying the layout depends just on the source code and flags such as -C opt-level.
This comment has been minimized.
This comment has been minimized.
gnzlbg
reviewed
Oct 11, 2018
| a detailed write-up): | ||
| layout scheme. See section 6.7.2.1 of the [C17 specification][C17] for | ||
| a detailed write-up of what such rules entail. For most platforms, | ||
| however, this means the following: |
This comment has been minimized.
This comment has been minimized.
gnzlbg
Oct 11, 2018
•
Collaborator
Maybe we should just go one step further here and just state that the "most recent" C specification applies. It would be a pain to have to update this every N years, history has shown that the newer C specs are backwards compatible with the old ones (e.g. the C99, C11, and C17 specs have never introduced breaking changes here - they just have defined behavior that was undefined before), and we probably want to be as compatible as possible with C anyways which means we have to follow the latest spec.
This kinds of ties Rust with the C spec, but this is already pretty much the case, not only for platform support, but some newer C proposals like N2289 - Zero overhead failure should definetely allow using Rust's Option and Result properly as error handling mechanisms in C FFI. So whether we like it or not, Rust is already a stakeholder in the C standardization process, and we should be sending someone to their meetings to represent Rust's interests in the ISO C standard evolution, and that would include layout guarantees for repr(C) types. We don't want any changes to the C language to make it impossible for Rust to target C via FFI.
gnzlbg
approved these changes
Oct 11, 2018
|
I honestly think this is a great start. As other issues progress, and unresolved issues get resolved, this will probably be amended many times, but it will be easier to discuss the details of those changes in PRs that build on top of this one. |
RalfJung
reviewed
Oct 13, 2018
RalfJung
reviewed
Oct 13, 2018
RalfJung
reviewed
Oct 13, 2018
| ahead of | ||
| time](https://github.com/rust-rfcs/unsafe-code-guidelines/issues/11#issuecomment-420659840), | ||
| so the user cannot do it manually. | ||
| - If layout is defined, then it becomes part of your API, such taht |
This comment has been minimized.
This comment has been minimized.
RalfJung
reviewed
Oct 13, 2018
nikomatsakis
added some commits
Oct 23, 2018
This comment has been minimized.
This comment has been minimized.
|
I pushed some updates. The most interesting thing has to do with zero-sized structs. I added the following text to the
|
RalfJung
reviewed
Oct 24, 2018
| a struct like `#[repr(C)] struct Foo { }`. Further, when a | ||
| `#[repr(C)]` struct has a field whose type has zero-size, that field | ||
| may induce padding due to its alignment, but will not otherwise affect | ||
| the offsets of subsequent fields (as it takes up zero space). |
This comment has been minimized.
This comment has been minimized.
RalfJung
Oct 24, 2018
Member
As @rkruppe noted on Zulip, this seems like a problem. It means that "copy-pasting struct definitions and adding repr(C) everywhere" does not give you C compatibility, because your Foo would actually take space when put in a larger struct in C.
This seems like a bug, TBH. I am not sure if it is a bug that we can still fix. Might be worth having at least a warning.
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Oct 24, 2018
Author
Collaborator
Yes — I was thinking the same in that conversation. That is, "bug and not clearly a bug we can fix", which does suggest that at least a lint is warranted.
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Oct 24, 2018
Author
Collaborator
For the benefit of others, the Zulip conversation was here.
This comment has been minimized.
This comment has been minimized.
ubsan
Oct 24, 2018
•
Collaborator
That's not actually how C works - C does not allow empty structs. The godbolt link was from C++, which does allow empty structs, and in which empty structs have size one. Basically, this would only be an issue for somebody like Mozilla, who talk to C++ through a non-C ABI.
Notably, also, gcc and clang's C extension for empty structs has sizeof(struct { }) = 0: https://godbolt.org/z/K3_5fJ
This comment has been minimized.
This comment has been minimized.
rkruppe
Oct 24, 2018
Member
Transcribing another thing @ubsan noted on Zulip: empty structs are accepted as an extension by some C compilers, but (at least) GCC and Clang make them have size zero, unlike C++. Example: https://godbolt.org/z/AS2gdC
This comment has been minimized.
This comment has been minimized.
ubsan
Oct 25, 2018
Collaborator
Allow-by-default at best - 0 size structures are weird in C++, and usually you'd use either EBO or [[no_unique_address]] with them.
This comment has been minimized.
This comment has been minimized.
RalfJung
Oct 25, 2018
Member
Isn't them being weird another argument for making this a warn-by-default lint? I expect many people will not know this. I am not a C++ expert, but I have programmed in C++ for many years and never heard about this; I don't think we can expect everybody doing C++ FFI to know about these issues.
This comment has been minimized.
This comment has been minimized.
gnzlbg
Oct 25, 2018
•
Collaborator
So to summarize: C does not allow empty structs, some C language extensions allow empty structs with sizeof == 0, C++ does allow empty structs but these have a sizeof == 1 unless they are inherited from or they are fields that have the [[no_unique_address]] attribute (in both cases, they don't increase the size of the struct - i'm unsure what role the alignment of the type plays though).
I think #[repr(C)] should warn-by-default on this when it makes a difference, that is, when the ZST would change the layout. We could have an opt-in warning that always warns on ZST being used in #[repr(C)] but I fear that would be extremely noisy for little win.
About the situation with the C-language extension and C++ it appears that #[repr(C)] != #[repr(Cxx)], so maybe we just need to add new reprs to deal with those. In the mean time it might be worth it to just ignore C++ while specifying #[repr(C)] here (maybe add a note so that we don't forget).
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
gnzlbg
Oct 25, 2018
Collaborator
@RalfJung note that EBO is guaranteed by the C++>=11 standard for types with standard layout, like empty structs: struct T {};, so it is a required layout optimization.
This comment has been minimized.
This comment has been minimized.
|
Is this almost ready to merge? |
This comment has been minimized.
This comment has been minimized.
|
@avadacatavra my hope was that we will merge it today |
This comment has been minimized.
This comment has been minimized.
|
That said, I think that some part of this conversation about zero-sized structs deserves to be added. I think we ought to also add "lint for |
pnkfelix
referenced this pull request
Oct 25, 2018
Open
run-pass/extern-pass-empty is probably a bogus thing to test #53859
nikomatsakis
force-pushed the
nikomatsakis:structs-and-tuples
branch
from
5318410
to
a9223e2
Oct 25, 2018
This comment has been minimized.
This comment has been minimized.
|
OK, I attempted to summarize the conversation from here in the latest commit and added some appropriate warning language. |
gnzlbg
reviewed
Oct 25, 2018
This comment has been minimized.
This comment has been minimized.
|
It looks like gcc (the only compiler which has implemented
|
This comment has been minimized.
This comment has been minimized.
|
I'm going to merge this, and any additions can be filed as followup issues/PRs (cc @nikomatsakis) |
nikomatsakis commentedSep 25, 2018
•
edited
Writes up what I believe to be the consensus around layout of structs and tuples. Suggestions very welcome!
Here are some of the highlights:
(T1...Tn)is laid out the same asTupleN<T1..Tn>wherestruct TupleN<P0..Pn>(P0..Pn).[T; N].#[repr(Rust)]structs (the default) have no particular layout guarantees. In particular, even if two#[repr(Rust)]structs have the fields of the same types, they are not guaranteed to be laid out in a compatible way. This is a conservative position that could be strengthened, though I personally now think it'd be better to "opt-in" to any such guarantees. I cover the pros and cons in the doc.#[repr(C)],#[repr(align)],#[repr(packed)], and#[repr(transparent)]are all discussed.Fixes #11
Fixes #12
Fixes #17