proposal: spec: sum types based on general interfaces #57644
Comments
Could you comment on why this restriction occurs? Is this simply to err on the side of caution initially and potentially remove this restriction in the future? Or is there a technical reason not to do this? |
The reason to not permit |
Is there a technical reason that the language could not also evolve to support |
In a vacuum, I'd prefer pretty much any other option, but since it's what generics use, it's what we should go with here and we should embrace it fully. Specifically,
@dsnet I think |
With the direct storage mechanism detailed in the post as an alternative to boxing, would it be possible for the zero-value not to be type Example interface {
int16 | string
} the zero value in memory would look like
I don't understand this comment, which may indicate that I'm missing something fundamental about the explanation. Why would pointers make any difference? If the above |
The example in the proposal is rather contrived, so I tried to imagine some real situations I've encountered where this new capability could be useful to express something that was harder to express before. Is the following also an example of something that this proposal would permit? type Success[T] struct {
Value T
}
type Failure struct {
Err error
}
type Result[T] interface {
Success[T] | Failure
}
func Example() Result[string] {
return Success[string]{"hello"}
} (NOTE WELL: I'm not meaning to imply that the above would be a good idea, but it's the example that came most readily to mind because I just happened to write something similar -- though somewhat more verbose -- to smuggle (result, error) tuples through a single generic type parameter yesterday. Outside of that limited situation I expect it would still be better to return Another example I thought of is Although I expect it would not be appropriate to change this retroactively for compatibility reasons, presumably a hypothetical green field version of that type could be defined like this instead: type Token interface {
Delim | bool | float64 | Number | string
// (json.Token also allows nil, but since that isn't a type I assume
// it wouldn't be named here and instead it would just be
// a nil value of type Token.)
} Given that the exact set of types here is finite, would we consider it to be a breaking change to add new types to this interface later? If not, that could presumably allow the following to compile by the compiler noticing that the // TokenString is a rather useless function that's just here to illustrate an
// exhaustive type switch...
func TokenString(t Token) string {
switch t := t.(type) {
case Delim:
return string(t)
case bool:
return strconv.FormatBool(t)
case float64:
return strconv.FormatFloat(t, 'g', -1, 64)
case Number:
return string(t)
case string:
return t
}
} I don't feel strongly either way about whether such sealed interfaces should have this special power, but it does seem like it needs to be decided either way before implementation because it would be hard to change that decision later without breaking some existing code. Even if this doesn't include a special rule for exhaustiveness, this still feels better in that it describes the range of EDIT: After posting this I realized that my type switch doesn't account for Finally, it seems like this would shrink the boilerplate required today to define what I might call a "sealed interface", by which I mean one which only accepts a fixed set of types defined in the same package as the interface. One way I've used this in the past is to define struct types that act as unique identifiers for particular kinds of objects but then have some functions that can accept a variety of different identifier types for a particular situation: type ResourceID struct {
Type string
Name string
}
type ModuleID struct {
Name string
}
type Targetable interface {
// Unexported method means that only types
// in this package can implement this interface.
targetable()
}
func (ResourceID) targetable() {}
func (ModuleID) targetable() {}
func Target(addr Targetable) {
// ...
} I think this proposal could reduce that to the following, if I've understood it correctly: type ResourceID struct {
Type string
Name string
}
type ModuleID struct {
Name string
}
type Targetable interface {
ResourceID | ModuleID
}
func Target(addr Targetable) {
// ...
} If any of the examples I listed above don't actually fit what this proposal is proposing (aside from the question about exhaustive matching, which is just a question), please let me know! If they do, then I must admit I'm not 100% convinced that the small reduction in boilerplate is worth this complexity, but I am leaning towards 👍 because I think the updated examples above would be easier to read for a future maintainer who is less experienced with Go and so would benefit from a direct statement of my intent rather than having to infer the intent based on familiarity with idiom or with less common language features. |
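For comparison, here is a runnable sketch of how the sealed-interface idiom above is consumed in Go today. The describeTarget function and the sample values are hypothetical, added only for illustration. Note the nil and default cases that a type switch still needs, since nothing tells the compiler that the set of implementations is closed.

```go
package main

import "fmt"

type ResourceID struct {
	Type string
	Name string
}

type ModuleID struct {
	Name string
}

// Targetable is sealed by convention: the unexported method keeps other
// packages from declaring new implementations directly.
type Targetable interface {
	targetable()
}

func (ResourceID) targetable() {}
func (ModuleID) targetable()   {}

// describeTarget is a hypothetical consumer of Targetable.
func describeTarget(addr Targetable) string {
	switch a := addr.(type) {
	case ResourceID:
		return "resource " + a.Type + "." + a.Name
	case ModuleID:
		return "module " + a.Name
	case nil:
		// Interface values can always be nil.
		return "no target"
	default:
		// Reached for any implementing type not listed above.
		return fmt.Sprintf("unexpected %T", a)
	}
}

func main() {
	fmt.Println(describeTarget(ResourceID{Type: "server", Name: "web"}))
	fmt.Println(describeTarget(ModuleID{Name: "network"}))
	fmt.Println(describeTarget(nil))
}
```

Under the proposal the two targetable methods would go away, but as far as I can tell the nil case would remain, because the zero value of the interface would still be nil.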
@dsnet Sure, we could permit |
@DeedleFake The alternative implementation is only an implementation issue, not a language issue. We shouldn't use that to change something about the language, like whether the value can be The reason pointer values matter is that given a value of the interface type, the current garbage collector implementation has to be able to very very quickly know which fields in that value are pointers. The current implementation does this by associating a bitmask of pointers with each type, such that a 1 in the bitmask means that the pointer-sized slot at that offset in the value always holds a pointer. |
@apparentlymart I think that everything you wrote is correct according to this proposal. Thanks. |
It would be, but I think it would be worth it. And I don't think it would be so strange as to completely preclude eliminating the extra oddness that would come from union types always being nilable. In fact, I'd go so far as to say that if this way of implementing unions has to have them be nilable, then a different way of implementing them should be found.
I was worried it was going to be the garbage collector... Ah well. |
A major problem is that type constraints work on static types while interfaces work on the dynamic types of objects. This immediately rules this approach out as a way to do union types.
This works because the static type of |
@merykitty per my understanding of the proposal, I think for the dynamic form of what you wrote you'd be expected to write something like this: type Addable interface {
int | float32
}
func Add(x, y Addable) Addable {
switch x := x.(type) {
case int:
return x + y.(int)
case float32:
return x + y.(float32)
default:
panic(fmt.Sprintf("unsupported Addable types %T + %T", x, y))
}
} Of course this would panic if used incorrectly, but I think that's a typical assumption for interface values since they inherently move the final type checking to runtime. I would agree that the above seems pretty unfortunate, but I would also say that this feels like a better use-case for type parameters than for interface values and so the generic form you wrote is the better technique for this (admittedly contrived) goal. |
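For contrast, a minimal sketch of what the generic form mentioned above might look like, assuming the same Addable constraint. This is ordinary Go as it exists today and does not depend on the proposal.

```go
package main

import "fmt"

type Addable interface {
	int | float32
}

// Add requires both arguments to have the same type T, so + is well
// defined and no runtime type check is needed.
func Add[T Addable](x, y T) T {
	return x + y
}

func main() {
	fmt.Println(Add(1, 2))
	fmt.Println(Add(float32(1.5), 2))

	// Mixing types is rejected at compile time rather than panicking:
	//   var i int = 1
	//   var f float32 = 2
	//   Add(i, f) // does not compile: int and float32 cannot both be T
}
```

The mismatch that the interface version can only detect at run time is caught by the compiler here, which is the trade-off described above.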
@merykitty No, in your example, |
also, note that the type set never includes interfaces. So |
Is something like that going to be allowed? type IntOrStr interface {
int | string
}
func DoSth[T IntOrStr](x T) {
var a IntOrStr = x
_ = a
} |
Let's say I have these definitions. type I1 interface {
int | any
}
type I2 interface {
string | any
}
type I interface {
I1 | I2
} Would it be legal to have a variable of type |
@mateusz834 Can't see why not.
I think the answer to all of these is "yes". For the cases where you assign an interface value, the dynamic type/value of the |
FWIW my main issue with this proposal is that IMO union types should allow representing something like |
This comment was marked as resolved.
@ianlancetaylor Does the proposal as-is allow both type sets and functions in an interface? It would have a remarkable property not typically present in sum types where you could have a closed set of types along with the ability to have those types implement some common functions and be used as an interface. |
For reference, this has been suggested a few times in #19412 and #41716, starting with #19412 (comment). Requiring nil variants versus allowing source code order to affect semantics is the classic tension of sum types proposals.
The spelling of a type with no information beyond existence is usually
Yes, since the proposal is just to allow values of general interfaces less |
Thanks @zephyrtronium. Taking your feedback into account, and also realizing that it is easy to redefine types, then I think points (2) and (3) I raised are not issues. Type definitions can be used to give the same type different semantics for each case. For example: type ClaimPredicateUnconditional struct{}
type ClaimPredicateAnd []ClaimPredicate
type ClaimPredicateOr []ClaimPredicate
type ClaimPredicateNot ClaimPredicate
type ClaimPredicateBeforeAbsoluteTime Int64
type ClaimPredicateBeforeRelativeTime Int64
type ClaimPredicate interface {
ClaimPredicateUnconditional |
ClaimPredicateAnd |
ClaimPredicateOr |
ClaimPredicateNot |
ClaimPredicateBeforeAbsoluteTime |
ClaimPredicateBeforeRelativeTime
} In the main Go code base I work in we have 106 unions implemented as multi-field structs, which require a decent amount of care to use. I think this proposal would make using those unions easier to understand, probably on par in terms of effort to write. If tools like gopls went on to support features like pre-filling out the case statements of a switch based on the type sets, since it can know the full set, that would make writing code using them easier too. The costs of this proposal feel minimal. Any code using the sum type would experience the type as an interface and have nothing new to learn over that of interfaces. This is I think the biggest benefit of this proposal. |
To me, On the one hand, On the other hand, union Exhaustiveness in type |
In the proposal, Ian mentions that in some cases values could be stored directly, rather than boxed. This is impossible now because the exact type and size of an interface value's dynamic value can't be known at compile time. There may be other reasons, but I would guess
Oh yes you are absolutely right, I hadn't thought that through. |
The proposal above illustrates how to add sum types (a.k.a. unions), but tagged unions are more powerful and useful: they allow multiple occurrences of a type and use tags to distinguish the variants. Real world designs: |
@leaxoy sigma types issue is probably what you are looking for. |
Unlike most proposals, there are downsides to not accepting this one:
I think the second situation is unlikely. The bar would be much higher than any other language change, which is already pretty high. I dislike the first situation. There are many uses for union types even if they have downsides compared to other more theoretically pure alternatives. |
How about introducing a new keyword |
@leaxoy Introducing a new keyword is not backward-compatible, because any code today using |
FWIW this issue is specifically about using the existing syntax, because it seems dubious to have two different syntactical constructs to mean very similar things. Being able to reuse that syntax was, in fact, one of the (minor) arguments for introducing it to begin with. |
This proposal can support tagged unions via specialized types: type All struct{}
type None struct{}
type Some []string
type Filter interface {
All | None | Some
}
func Select(f Filter) ([]string, error) {
// ...
} Empty struct values could be optimized away:
Edit: "multiple values of a type:" type Width uint32
type Height uint32
type Weight uint32
type Dimension interface {
Width | Height | Weight
}
func f(d Dimension) ...
_ = f(Width(10))
_ = f(Height(20)) |
@ianlancetaylor would you consider a form that would disallow Some hypothetical syntax: type I1 interface! {
int | float64
}
|
@ydnar The question of "what would the zero value be" is exactly the one that needs to be answered for that kind of proposal. Syntax aside, the concept of non-nillable interfaces or sum types has been suggested many times between here, #19412, and other proposals. None of them have answered the zero value question in a way that satisfies even a majority of people (including those that have tried the answer "that of the first type in the union"). |
That's true, but I think most languages also don't have a concept of zero values in the way that Go does. Rust, for example, requires all values of any type to always be explicitly set to something, even if it's a default value. Random thought that might be terrible: What if Go did allow |
You could still set values to nil with: type Bits64 interface { int64 | uint64 | float64 }
type Dummy struct { Field64 Bits64 }
func (d *Dummy) SetField64(value Bits64) {
d.Field64 = value
}
func main() {
var d1, d2 Dummy
var value uint64
d1.SetField64(value) // d1.Field64 stops being nil
d1.SetField64(d2.Field64) // d1.Field64 is nil again...
} To prevent this you would have to allow only the concrete types listed in the sum type to be assignable to sum type variables, which is far more restrictive. But yeah, it may be the only real alternative to fully nilable sum types if we want to make them be based on interfaces. |
@DeedleFake I don't believe that would be practically feasible. And without that runtime check, there is no real benefit, because you still have to code against the possibility of it being @tinne26 Under @DeedleFake's suggestion, that code would panic, because passing |
I don't remember, what was the problem with it? It's a straightforward rule that can be easily explained. It also appears to be intuitive, as it's the default response for most proposers when posed with the question. |
@gophun Currently, |
@Merovius
Then these
could be short form for:
Here the order wouldn't matter. |
@Merovius Thank you for the pointer; I'll take it over there. The keyword option was criticized solely on the grounds of being "not backwards compatible," a stance that has been clearly contradicted by the Go project lead. |
That is not true. But again, that discussion doesn't belong here. |
This comment was marked as off-topic.
It would not, for a couple reasons:
I think what you want is #45380. |
Seeing this example, it occurs to me that #57644 (comment) actually seems to be wrong. Consider these definitions: type bytestring interface {
string | []byte
}
func f[T bytestring]() {} Type bytestring itself can instantiate f if bytestring satisfies bytestring, which it does if bytestring implements bytestring. Since bytestring is an interface, it implements any interface of which its type set is a subset, which trivially includes itself. Therefore f[bytestring] is a legal instantiation. So, it seems that we need additional adjustments to the spec to make interfaces with union elements legal. Otherwise every type constraint which includes a union element and no ~T terms gains a non-empty set of members, all of interface type, which will be illegal in almost every case. |
This is a speculative issue based on the way that type parameter constraints are implemented. This is a discussion of a possible future language change, not one that will be adopted in the near future. This is a version of #41716 updated for the final implementation of generics in Go.
We currently permit type parameter constraints to embed a union of types (see https://go.dev/ref/spec#Interface_types). We propose that we permit an ordinary interface type to embed a union of terms, where each term is itself a type. (This proposal does not permit the underlying type syntax ~T to be used in an ordinary interface type, though of course that syntax is still valid for a type parameter constraint.)
That's really the entire proposal.
Embedding a union in an interface affects the interface's type set. As always, a variable of interface type may store a value of any type that is in its type set, or, equivalently, a value of any type in its type set implements the interface type. Inversely, a variable of interface type may not store a value of any type that is not in its type set. Embedding a union means that the interface is something akin to a sum type that permits values of any type listed in the union.
For example:
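A sketch of definitions that fit the description of I1 and I2 in this example, assuming MyInt and MyOtherInt are defined on int and MyFloat on float64 (the underlying types are an assumption). Since ordinary variables of these interface types are not legal in current Go, the sketch checks type-set membership by using I1 and I2 as constraints. The helpers inI1 and inI2 are hypothetical.

```go
package main

import "fmt"

// Assumed definitions; the underlying types are a guess that fits the text.
type MyInt int
type MyOtherInt int
type MyFloat float64

// I1's type set is exactly {MyInt, MyFloat}.
type I1 interface {
	MyInt | MyFloat
}

// I2's type set is exactly {int, float64}. Without ~, defined types such
// as MyInt are not included.
type I2 interface {
	int | float64
}

// inI1 and inI2 are hypothetical helpers: a call compiles only when the
// argument's type is in the constraint's type set.
func inI1[T I1](v T) string { return fmt.Sprintf("%v is in I1's type set", v) }
func inI2[T I2](v T) string { return fmt.Sprintf("%v is in I2's type set", v) }

func main() {
	fmt.Println(inI1(MyInt(1)))
	fmt.Println(inI1(MyFloat(1.5)))
	fmt.Println(inI2(2))
	fmt.Println(inI2(2.5))

	// These calls do not compile:
	//   inI1(MyOtherInt(1)) // MyOtherInt is not in I1's type set
	//   inI2(MyInt(1))      // MyInt is not in I2's type set
}
```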
The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. None of MyInt, MyFloat, or MyOtherInt implement I2.
In all other ways an interface type with an embedded union would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted for type parameters when using such a type as a type parameter constraint. This is because in a generic function we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type set, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)
In particular, the zero value of an interface type with an embedded union would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.
As an implementation note, we could in some cases use a different implementation for interfaces with an embedded union type. We could use a small code, typically a single byte, to indicate the type stored in the interface, with a zero indicating nil. We could store the values directly, rather than boxed. For example, I1 above could be stored as the equivalent of struct { code byte; value [8]byte } with the value field holding either an int or a float64 depending on the value of code. The advantage of this would be reducing memory allocations. It would only be possible when all the values stored do not include any pointers, or at least when all the pointers are in the same location relative to the start of the value. None of this would affect anything at the language level, though it might have some consequences for the reflect package.
As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics implementation. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412 or newer variants such as #54685. Thanks.
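For the implementation note above, a rough sketch of the direct-storage idea written in today's Go. It only models the layout. The unboxed type, the code constants, and the fromInt/fromFloat helpers are hypothetical names, int is modelled as int64 so that it fits the 8-byte payload, and the little-endian encoding is an arbitrary choice. A real implementation would live in the compiler and runtime, not in user code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Type codes; zero is reserved for nil, as in the implementation note.
const (
	codeNil   byte = iota
	codeInt        // value holds an int64
	codeFloat      // value holds a float64
)

// unboxed mirrors struct { code byte; value [8]byte } from the note.
type unboxed struct {
	code  byte
	value [8]byte
}

func fromInt(i int64) unboxed {
	u := unboxed{code: codeInt}
	binary.LittleEndian.PutUint64(u.value[:], uint64(i))
	return u
}

func fromFloat(f float64) unboxed {
	u := unboxed{code: codeFloat}
	binary.LittleEndian.PutUint64(u.value[:], math.Float64bits(f))
	return u
}

// String decodes the payload according to the code, much as a type
// switch on the interface value would.
func (u unboxed) String() string {
	bits := binary.LittleEndian.Uint64(u.value[:])
	switch u.code {
	case codeInt:
		return fmt.Sprintf("int %d", int64(bits))
	case codeFloat:
		return fmt.Sprintf("float64 %g", math.Float64frombits(bits))
	default:
		return "nil"
	}
}

func main() {
	var zero unboxed // zero value: code 0, which stands for nil
	fmt.Println(zero, fromInt(-7), fromFloat(2.5))
}
```

Neither payload contains a pointer, which is the condition the note gives for this layout to be usable at all.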