Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: allow conversion between recursively equivalent types #18030

Closed
adrianludwin opened this issue Nov 23, 2016 · 10 comments
Closed

Proposal: allow conversion between recursively equivalent types #18030

adrianludwin opened this issue Nov 23, 2016 · 10 comments

Comments

@adrianludwin
Copy link

adrianludwin commented Nov 23, 2016

Summary: I propose we consider relaxing struct conversion rules so that when converting composite data types (eg structs and slices), we recursively attempt to convert element types as well as the fundamental types. This would simplify certain important use cases, such as the handling of (otherwise) identical protobuf messages.

1. Background: This proposal is inspired by the (accepted) proposal described in #16085, and heavily references that proposal, but has a significantly wider scope. For example, under the current type conversion rules, we can convert between the following two types:

type type1 struct {
	id int
}

type type2 struct {
	id int
}

But we cannot convert between the following two types, even though they are clearly identical, from a data perspective:

type id1 int
type type1 struct {
	id id1
}

type id2 int
type type2 struct {
	id id2
}

Unfortunately, this is very similar to the kinds of structures that are created by the Protocol Buffer compiler. This means that many structurally identical messages cannot easily be converted from one proto package to another.

Issue #16085 (Section 5) did specify that tags should be ignored recursively, for the purposes of type conversion, but this is only relevant for anonymous structs and therefore is not widely applicable. However, it does foreshadow the issues raised by the this proposal.

2. Proposal: The current spec defines that “A non-constant value x can be converted to type T [if, among other cases,] x's type and T have identical underlying types.” Instead, I suggest that this be changed to “x’s type and T have structurally equal (SE) types.” As one might expect, two types are SE if they meet any of the current criteria (including being of the same underlying type), or if they:

  • Have the same Kind (as defined in the “reflect” package), and
  • Have component types that are themselves SE.

To put this in concrete terms for each of Go's composite types:

  • Array, channel or slice: two array, channel or slice types are SE if their element types are SE. Of course, an array type is not SE with a channel or slice type or vice versa.
  • Functions and methods: two function types are SE if all the following conditions are met:
    • The corresponding parameters (by order of declaration) of the two functions are identically named and have SE types;
    • The corresponding return values (by order of declaration) of the two functions are either both unnamed or identically named, and have SE types.
  • Interfaces: two interface types are SE if they have the identical set of methods, and all corresponding methods of the same name are SE.
  • Map: two map types are SE if the keys and values are SE.
  • Pointer: two pointer types are SE if their base types are SE (this is already allowed by the spec, but as a special case).
  • Struct: two structs are SE if the corresponding fields (by order of declaration) of the two structs are identically named and have SE types.

In addition to changing the built-in language behaviour, we would also update the reflect package to match this change (Type.ConvertibleTo and Value.Convert).

3. Impact: as with #16085, this is a backwards-compatible change since it only loosens restrictions. The only exceptions to this are the behaviour of several methods from the “reflect” package, as described in #16085. Furthermore, the benefits should be greatly enhanced when dealing with protobufs, as described in the background.

4. Discussion: While the primary driver for this change was to simplify the handling of structures, I have included functions, methods and interfaces for completeness. However, if the inclusion of these functional elements causes problems, it may be acceptable to remove them from this proposal as well.

A further note should be made about method sets. Under the current rules, it is already possible to cast one type to another and completely change its method sets. As a result of the current proposal, the method sets of its element types may also change. For example, if x.a.Foo() was a valid call, and x is casted to y (a variable of a different but SE type to x), this does not necessarily mean that y.a.Foo() will be a valid call - or if it is a valid call, that it will refer to a method with the same signature, let alone semantics.

5. Alternatives: One alternative to allowing automatic conversion is for developers to continue to manually specify every field that needs to be converted. Another is to provide a library function, which would fail at runtime (instead of at compile time) if the two types were not SE. There would also be performance and memory considerations, as in #16085.

6. Implementation: TBD. In the compiler, we would need to generalize the type identity functions to recurse in the case of conversions. The spec would be updated with a new section on structural equality, and the reflect package would need to be updated to match the behaviour of the compiler.

Thanks to @griesemer for some preliminary comments on this proposal.

@ianlancetaylor ianlancetaylor added this to the Proposal milestone Nov 23, 2016
@ianlancetaylor
Copy link
Contributor

I'm not sure how I feel about this. The recursive nature of this proposal means that we can explicitly convert between interface types that are entirely different, as in

type Seconds interface {
    Add() int64
}
type NanoSeconds interface {
    Add() time.Duration
}
var S Seconds
var S2 = NanoSeconds(S)

Of course the conversion does have to be explicitly, so presumably people should avoid doing that. But the identical underlying type can be hidden beneath an arbitrary number of layers of function parameters.

I can appreciate the desire for MOVE CORRESPONDING but I'm concerned that this proposal goes too far. And I don't know how to usefully limit it while still permitting what is desired.

@adrianludwin
Copy link
Author

I had similar concerns about interfaces - the proposal's really targeted towards data (structs, etc) and not functions. Would you feel better if interfaces were excluded?

@ianlancetaylor
Copy link
Contributor

Interfaces are just one aspect of the issue, because interfaces are just the dual of method sets.

// T1 represents a number of seconds.
type T1 int64
// Time returns a number of seconds.
func (t T1) Time() int64 { return t }
// T2 represent a number of nanoseconds
type T2 time.Duration
// Time returns a number of nanoseconds.
func (t T2) Time() time.Duration { return t }

type S1 struct {
    t T1
}
type S2 struct {
    t T2
}
var V1 S1
var V2 = S2(V1)

This conversion is accepted by your proposal even though the values have different meanings and the value methods have different meanings, they just happen to have the same underlying type.

@russ-
Copy link

russ- commented Nov 23, 2016

Another case worth considering is oneof. For proto messages that use oneof, the proto-generated structs will contain an interface. For message Foo below, the message and field names are used in the interface's function name isFoo_Bar. So if you have two protos with the same oneof field structure but different e.g. top-level message names, they wouldn't be "SE" even if they have all of the same field names and effective types.

.proto

message Foo {
  oneof bar {
    string a = 1;
    int64 b = 2;
  }
}

.pb.go

type isFoo_Bar interface {
        isFoo_Bar()
}

type Foo_A struct {
  A string
}

type Foo_B struct {
  B int64
}

func (*Foo_A) isFoo_Bar() {}
func (*Foo_B) isFoo_Bar() {}

...

type Foo struct {
  Bar isFoo_Bar
}

@adrianludwin
Copy link
Author

@ianlancetaylor - yes, I agree with what you're saying. I feel that there might be a semantic difference between interfaces and method sets - even though they behave the same way - but if most of the community doesn't see it that way, then there's little point introducing a distinction. Plus, as @russ points out, generated protobuf code does use interfaces, so we would have to support them.

@russ, thanks for pointing this use case out, I had forgotten about it. Given the current protobuf generator, I think the options would either be to ignore the method names (and only verify compare the other aspects of their signatures), or else insist that the top-level message names had to be the same.

The first option seems too lax to me - if an interface contained more than one method with the same signature (ignoring the name), it would be either impossible or very brittle to determine which method in the first type should match to the second.

The second option is feasible but unfortunate, since nothing else in this proposal cares about the names of the types themselves. So this would reduce the usefulness of this proposal, and could cause working code to suddenly break if a developer added a oneof to their code.

If we can change the protobuf golang generator, we have a few additional options:

  1. We could save the oneof field contents as an interface{} instead of the special-purpose oneof interface type, but then you loose all type safety for the possible types of the field.
  2. The oneof field could be stored in the equivalent of a variant, although this would require runtime checks that only one value was ever populated at a time, as well as wasting memory. I don't know if this would break other kinds of existing code, though - it would depend if people were manually checking for the isFoo_Bar() method.

I don't find any of these options terribly appealing. My best guess would be to sacrifice oneof support - any messages that contained oneof fields would simply not be SE with any other message that might otherwise be SE but does not have the same name. Does anyone else have any suggestions?

@minux
Copy link
Member

minux commented Nov 27, 2016 via email

@adrianludwin
Copy link
Author

@minux - the code generator won't know the different types you want to convert to/from in the general case. However, I agree that (in theory) we could make some changes to the generator to simplify these use cases.

@rsc
Copy link
Contributor

rsc commented Nov 28, 2016

As Ian pointed out in #18030 (comment), there is significant complexity lurking here. Part of the reason things are as locked down as they are is to make it OK for people to change their own types in specific ways (like adding struct fields) without fear of breaking other people's programs.

The type system does not seem like something we should modify incrementally one step at a time without very good reason. We had a good, specific, compelling reason for #16085. What is the specific, compelling reason here?

@adrianludwin
Copy link
Author

Hi @rsc , the compelling reason is being able to copy information between two identically written but distinct proto messages. However, allowing that kind of casting would break the use case you're mentioning - if one of the protos changed, the code would break.

Given the complexities, perhaps this isn't worthwhile - maybe an extension to protoc that could provide translators between two specified protos would be better if there's sufficient demand.

@rsc
Copy link
Contributor

rsc commented Dec 5, 2016

OK, let's decline this until there is a more compelling reason.

@rsc rsc closed this as completed Dec 5, 2016
@golang golang locked and limited conversation to collaborators Dec 5, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants