Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: treat blank struct fields as having the same names for type identity #21967

Closed
bcmills opened this issue Sep 21, 2017 · 12 comments

Comments

Projects
None yet
6 participants
@bcmills
Copy link
Member

commented Sep 21, 2017

go version devel +c26aedbdde Fri Sep 15 11:28:52 2017 -0400 linux/amd64


Motivation

In #13467, I am attempting to use aliases of struct types to define mutually-assignable struct types in separate packages.

Because those struct types must match the layout of C structs, they use blank-named struct fields for padding and alignment. The spec mostly supports usage of blank-fields-as-padding (emphasis mine):

  • “Two struct values are equal if their corresponding non-blank fields are equal.”
  • “Within a struct, non-blank field names must be unique.”

The spec even gives an example of this padding use-case:

// A struct with 6 fields.
struct {
	x, y int
	u float32
	_ float32  // padding
	A *[]int
	F func()
}

Unfortunately, the rule for type identity does not respect the padding use-case (emphasis mine):

“Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different.”

That implies that otherwise-identical structs with blank fields for alignment are treated as non-identical. For example, if two different packages define this type alias:

type S = struct {
	X byte
	_ [7]byte  // padding
	Y int32
}

it will result in types that are not mutually-assignable, despite sharing the same layout and accessible fields and producing otherwise-equal values.


Proposal

To remedy this problem, I propose a minor change to the rule for type identity:

“Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported, non-blank field names from different packages are always different.”

@gopherbot gopherbot added this to the Proposal milestone Sep 21, 2017

@gopherbot gopherbot added the Proposal label Sep 21, 2017

@mdempsky

This comment has been minimized.

Copy link
Member

commented Sep 21, 2017

In practice, I expect this would be fine, but technically this is a backwards incompatible change.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2017

Yes, it is technically backwards-incompatible. (I consider it a “Specification error” for the purpose of Go 1 compatibility, but you may reasonably disagree.)

@mdempsky

This comment has been minimized.

Copy link
Member

commented Sep 21, 2017

@bcmills I don't feel too strongly about it. Just @griesemer and I spent a while fixing #15514, so all the corner cases are still on my mind.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Sep 25, 2017

If you have

type Point { X, Y int }

then sometimes we write instead

type Point { X, Y int; _ struct{} }

to make sure that literals must be keyed and also that other packages can't convert between Point and a struct type literal written in another package (like struct { X, Y int }).

If we allow _ to all be identical then it does start allowing clients to depend on internal details in such structs again, although it's unclear why code would do that. (Literals would still have to be keyed.)

There's also a question of whether we want to encourage _ for padding any further. (We might want to guarantee layout only on an opt-in basis, for some kind of opt-in.)

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Sep 26, 2017

to make sure that literals must be keyed

As you note, under this proposal those structs would still need to be keyed.

Allowing them to be unkeyed would also be more consistent with blank-fields-as-padding, but I don't currently have a compelling use-case for it.

and also that other packages can't convert between Point and a struct type literal written in another package (like struct { X, Y int }).

Note that under this proposal the type struct { X, Y int } would still be distinct from struct { X, Y int; _ struct{} }, as the former lacks the trailing field.

User code can already copy those structs explicitly (or by enumerating fields using reflection), so I'm not sure the lack of convertibility between redefinitions buys much.

There's also a question of whether we want to encourage _ for padding any further.

A blessed alternative for alignment and padding would also satisfy my use-case, as long as it does not interfere with type identity.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Sep 26, 2017

A blessed alternative for alignment and padding would also satisfy my use-case, as long as it does not interfere with type identity.

Can you clarify this? I would think it's desirable that

struct {
    X byte // offset 0
    Y byte // offset 1
}

and (something like)

struct {
     X byte // offset 0
//go:align 4
     Y byte // offset 4
}

be recognized as different types.

Otherwise, I think we'll need to complicate conversion rules: currently it's legal to convert *S to *T if S and T have identical underlying types, but we wouldn't want to allow that conversion if S and T's underlying types were the two struct types above.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Sep 26, 2017

@mdempsky, I agree that two types with different layouts or alignment should be treated as different types, even if they otherwise have the same exported fields. (IMO that's a compelling argument for not using //go: comments for alignment and padding: adding a comment to a type declaration should not affect its identity.)

What I mean by “does not interfere with type identity” is just that, whatever the mechanism, adding the same padding to a type used in two packages should not cause it to be treated as two distinct types.

@mbenkmann

This comment has been minimized.

Copy link

commented May 7, 2018

Since padding and alignment are issues that only arise in the context of cgo, instead of changing the language with respect to all types, one could introduce types in package "C" and treat them specially in the compiler. E.g.

 struct {
      X byte // offset 0
      _ [4]C.Älign
      Y byte // offset 4
 }

For purposes of the language C.Älign would be a byte but the compiler "optimizes" the _ field down to 3 bytes. In the same vein you could have the compiler relax certain language rules for these C.... types without actually changing the language. I think this approach gives one the freedom to address the practical issues of interfacing with C code within Go1.

The only drawback I see is that a 3rd party compiler could be compliant with the Go1 language definition but be unable to compile a certain project using cgo properly. But with all the required external parts (C headers in particular) and platform dependencies any project using cgo is already fragile with respect to its platform and tools.

The above is only 1 possible way. The core concept I'm proposing is to leave alignment and padding as issues for the compiler, not the language, and simply define some ways within the existing language to signal to the compiler what you want.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented May 7, 2018

@mbenkmann Note that alignment doesn't arise solely in the context of cgo. See https://golang.org/src/sync/waitgroup.go#L23 .

@bcmills

This comment has been minimized.

Copy link
Member Author

commented May 8, 2018

you could have the compiler relax certain language rules for these C.... types without actually changing the language.

A deviation from the language spec that is required in order for some common tool (such as cgo) to work is, de facto, a language change: it has nearly all the same disadvantages, plus the additional disadvantage of obscure documentation.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jun 26, 2018

I think that this proposal is only for cases that arise when using cgo. If we change cgo to name padding fields, rather than name them all _, then this proposal seems unnecessary. That seems like a simpler change than modifying the language.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Feb 5, 2019

This adds a special case to the language that doesn't seem necessary. In general we want to avoid special cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.