Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Layout of homogeneous structs #36

Open
nikomatsakis opened this Issue Oct 11, 2018 · 15 comments

Comments

Projects
None yet
7 participants
@nikomatsakis
Copy link
Collaborator

nikomatsakis commented Oct 11, 2018

From #31: If you have homogeneous structs, where all the N fields are of a single type T, can we guarantee a mapping to the memory layout of [T; N]? How do we map between the field names and the indices? What about zero-sized types?

A specific proposal:

  • If all fields have the same type T (ignoring zero-sized types), then the layout will be the same as [T; N] where N is the number of fields
  • If the struct is defined with named fields, the mapping from fields to their indices is undefined (so foo.bar maps an undefined index)
  • If the struct is defined as a tuple struct (or a tuple), then the indices are derived from the definition (so foo.0 maps to [0])

This is basically because it's convenient but also because it's about the only sensible thing to do (unless you imagine we might want to start inserting padding between random fields or something, which connects to #35).

Note that for tuple structs (and tuples), the ordering of (public) fields is also significant for semver and other purposes -- changing the ordering will affect your clients. The same is not true for named fields and so this proposal attempts to avoid making it true.

@the8472

This comment has been minimized.

Copy link

the8472 commented Oct 11, 2018

(unless you imagine we might want to start inserting padding between random fields or something

Would there ever be a reason to automatically insert tail padding for homogenous structs?

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 11, 2018

The most plausible reason for not guarantee this would be to raise the alignment of the aggregate to allow copying the type with fewer or more efficient instructions (e.g., align a struct of four u8s to 32 bit and use aligned 32 bit load and store instructions for it).

@Gankro

This comment has been minimized.

Copy link

Gankro commented Oct 11, 2018

I can also imagine two hypothetical but very dubious scenarios to add to the pile:

  • We explicitly want to break code that's relying on the order without an annotation because someone might reorder the fields syntactically without knowing it's important (in some sense, using repr(C) as a "I promise this type will have some kind of stable ABI" annotation)

  • With profile-guided optimization (PGO), it is determined that (say) Foo.7 being first in the type is better due to cache effects.

@gnzlbg

This comment has been minimized.

Copy link
Collaborator

gnzlbg commented Oct 12, 2018

The most plausible reason for not guarantee this would be to raise the alignment of the aggregate to allow copying the type with fewer or more efficient instructions

I think we should be able to do this for arrays too, and also for homogenous parts of aggregates like struct A { a: u8, x: f32, y: f32, z: f32, w: f32, b: u8 } where it might be worth it to reorder {xyzw} to the front not only to reduce A's size, but also to be able to perform SIMD operations on these fields, which might require raising the alignment of A from 4 to, e.g., 16. I don't think any implementation will do these optimizations in the near future (maybe never), but if leaving the door open for these does not causes many headaches it might still be worth doing nevertheless. We can always provide more guarantees later.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 12, 2018

I think we should be able to do this for arrays too

That would break ABI compatibility with C arrays.

Also, one reason why one might not want to raise the alignment of any aggregate is that it tends to cause extra padding, either internally (in your example struct A, increasing alignment to 16 bytes also increases the size of the struct from 20 bytes to 32 bytes) or externally, i.e., requiring extra padding in an aggregate that contains a field whose alignment requirement was raised.

@gnzlbg

This comment has been minimized.

Copy link
Collaborator

gnzlbg commented Oct 12, 2018

That would break ABI compatibility with C arrays.

Yes, repr(Rust) arrays would then differ from repr(C) arrays - I don't know how big of a deal this would be.

Also, one reason why one might not want to raise the alignment of any aggregate is that it tends to cause extra padding, either internally (in your example struct A, increasing alignment to 16 bytes also increases the size of the struct from 20 bytes to 32 bytes) or externally, i.e., requiring extra padding in an aggregate that contains a field whose alignment requirement was raised.

Yes, that's a very good reason not to do that. FWIW I wasn't suggesting to do this unconditionally, but rather that we don't have to forbid an implementation from doing this if it deems it worthwhile right now.

@Gankro

This comment has been minimized.

Copy link

Gankro commented Oct 12, 2018

You can't put a repr on a builtin type, so there's no such thing as a "repr(c) array". Also we've long guaranteed their layout, and that ship has sailed.

@gnzlbg

This comment has been minimized.

Copy link
Collaborator

gnzlbg commented Oct 12, 2018

An alternative would be to asymmetrically require that tuples are layout compatible with arrays, but to not require the layout of tuples and arrays to be identical. That is, arrays would not be layout compatible with tuples.

This would allow the compiler to increase the alignment of homogeneous structs and tuples, and also allow &(T, T, T, T) as &[T; 4]. It would not allow &[T; 4] as &(T, T, T, T) because the alignment of [T; 4] might be smaller than the alignment of (T, T, T, T). An union like U { a: [T; 4], t: (T, T, T, T) } could, however, be used to re-interpret the bytes of an array as a tuple because the union will properly align the array.

AFAICT it would be backwards compatible to, in the future, make the stronger guarantee that the layout of homogeneous tuples and arrays is identical, which would allow &[T; 4] as &(T, T, T, T).

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Oct 13, 2018

These ordering issues make me think that for homogeneous non-tuple-structs, while we could guarantee that the layout matches that of an array, we should not guarantee how the fields are mapping to array indices. Getting that from the definition order seems strange to me.

I do not have a strong opinion on this, though.

@nikomatsakis

This comment has been minimized.

Copy link
Collaborator Author

nikomatsakis commented Oct 18, 2018

@RalfJung

These ordering issues make me think that for homogeneous non-tuple-structs, while we could guarantee that the layout matches that of an array, we should not guarantee how the fields are mapping to array indices. Getting that from the definition order seems strange to me.

This was, I believe, what I meant when I wrote "If the struct is defined with named fields, the mapping from fields to their indices is undefined (so foo.bar maps an undefined index)".

This also seems to address @Gankro's first concern:

We explicitly want to break code that's relying on the order without an annotation because someone might reorder the fields syntactically without knowing it's important (in some sense, using repr(C) as a "I promise this type will have some kind of stable ABI" annotation)


With respect to the PGO or SIMD scenarios, if we adopted this proposal, the idea would probably be that one out to explicitly opt in to "non-standard" layouts of this kind (e.g., #[repr(opt)] or something). I think this is probably reasonable -- basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 18, 2018

I think this is probably reasonable -- basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

This has been my position for the longest time, but recently I'm less sure since the scenario outlined in #36 (comment) doesn't really need SIMD or any hypothetical whole-program/profile-guided optimizations to be profitable. It would be nice to be able to raise e.g. the alignment of struct Rgba { r: u8, g: u8, b: u8, a: u8 } to four bytes because doing so would rarely increase data structure sizes (if the struct is private we can even sometimes prove this locally) and on many targets it could make copies of Rgba use 4x fewer instructions.

This is relatively niche compared to reordering to remove internal packing, and considering the nice guarantees it would conflict with, I am not arguing we should reserve that freedom, but I do regret that more than I regret PGO-driven or SIMD-targeted layout changes.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Oct 18, 2018

basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

I would expect #[repr(Rust)] to be able to optimize however it wants (not just for compactness), based on whatever information it has, including all whole-program analysis. In particular, I would expect it to eventually be able to notice which types might be best optimized for SIMD or avoiding false sharing.

If all fields have the same type T

What if all fields have different types, A, B, C, D, all of which are #[repr(transparent)] wrappers around the same T? There should be no performance penalty to using #[repr(transparent)] wrappers.

What if all fields have similar types, e.g. i8 and u8, or i32 and u32, or T& and T* and some transparent wrapper around T*? I think it would be too surprising to change T* to a T& or change the signness of a field and have that change affect the layout in the way this issue suggests.

I think there's a lot of advantages to treating homogeneous and heterogeneous structs identically. For one, there's no need to define "homogeneous."

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 18, 2018

@briansmith Fair points, but if we decide to treat homogenuous aggregates the same as other aggregates then I'd suggest teasing out the salient properties (e.g. size+align) and give guarantees about the layout of all repr(Rust) aggregates that in particular imply that things like (u8, i8), (&T, *const T), etc. are laid out like arrays. Special casing homogenuous aggregates was, after all, originally conceived as a compromise solution between that and giving no guarantees at all.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Oct 18, 2018

Another thing to consider: Let's say you have a heterogeneous type with fields of types A, B, and A. Then you change the type of the second field to A, making the struct homogeneous. Should this really reduce the kinds of optimizations the compiler can perform, especially if A and B are the same size and alignment?

@the8472

This comment has been minimized.

Copy link

the8472 commented Nov 24, 2018

@briansmith for the heterogeneous type the compiler is free to reorder and add padding. For the homogeneous type it would at most be allowed to raise alignment up to the size itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.