Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: sum types using interface type lists #41716

Open
ianlancetaylor opened this issue Sep 30, 2020 · 106 comments
Open

proposal: Go 2: sum types using interface type lists #41716

ianlancetaylor opened this issue Sep 30, 2020 · 106 comments

Comments

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Sep 30, 2020

This is a speculative issue for discussion about an aspect of the current generics design draft. This is not part of the design draft, but is instead a further language change we could make if the design draft winds up being adopted into the language.

The design draft describes adding type lists to interface types. In the design draft, an interface type with a type list may only be used as a type constraint. This proposal is to discuss removing that restriction.

We would permit interface types with type lists to be used just as any other interface type may be used. A value of type T implements an interface type I with a type list if

  1. the method set of T includes all of the methods in I (if any); and
  2. either T or the underlying type of T is identical to one of the types in the type list of I.

(The latter requirement is intentionally identical to the requirement in the design draft when a type list is used in a type constraint.)

For example, consider:

type MyInt int
type MyOtherInt int
type MyFloat float64
type I1 interface {
    type MyInt, MyFloat
}
type I2 interface {
    type int, float64
}

The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. All three types, MyInt, MyOtherInt, and MyFloat implement I2.

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

Edit: This paragraph is withdrawn. We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

In all other ways an interface type with a type list would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics design draft. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412. Thanks.

@Merovius
Copy link

@Merovius Merovius commented Sep 30, 2020

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

I don't understand this. If all types in the type list appear as cases, the default case would never, trigger, correct? Why require both?

Personally, I'm opposed to requiring to mention all types as cases. It makes it impossible to change the list. ISTM at least adding new types to a type-list should be possible. For example, if go/ast used these proposed sum types, we could never add new node-types, because doing so would break any third-party package using ast.Node. That seems counterproductive.

I think requiring a default case is a good idea, but I don't like requiring to mention all types as cases.

There is another related question. It is possible for such a sum-value to satisfy two or more cases simultaneously. Consider

type A int

type X interface {
    type A, int
}

func main() {
    var x X = A(0)
    switch x.(type) {
    case int: // matches, underlying type is int
    case A: // matches, type is A
    }
}

I assume that the rules are the same as for type-switches today, which is that the syntactically first case is matched? I do see some potential for confusion here, though.

@griesemer
Copy link
Contributor

@griesemer griesemer commented Sep 30, 2020

[edited]

@Merovius It does say "...and if there are any types in the type list that do not appear as cases in the type switch." Specifically, there is no comma between "default case" and "and". Perhaps that is the cause for the confusion?

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

@tooolbox
Copy link

@tooolbox tooolbox commented Sep 30, 2020

It does say "...and if there are any types in the type list that do not appear as cases in the type switch."

Makes sense. I can see how the language is a little ambiguous, the point is it's a compile error if both of those conditions exist.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

It occurred to me that tooling could spot when a type switch branch was invalid, i.e. the interface type list only contains A and B and your switch checks for C, but it seems best to not make that a compiler error. A linter could warn about it, but being overly restrictive here might harm backwards-compatibility.

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

I think it makes the most sense for the type switch to behave consistently. It's not clear to me how the type switch would be any different except that the interface being switched on has a type list. You can know at compile-time what branches should be in the switch, but that's it.

Overall I'm in favor, I think the proposal is right on the money. They function like any other interface, (no operators) and zero value is nil. Simple, consistent, unifies semantics with the Generics proposal. 👍

@Merovius
Copy link

@Merovius Merovius commented Sep 30, 2020

@griesemer Ah, I think I understand now. I actually misparsed the sentence. So AIUI now, the proposal is to require either a default case or to mention all types, correct?

In that case, the proposal makes more sense to me and I'm no longer confused :) I still would prefer to require a default case, though, to get open sums. If it is even allowed to not have a default case, it's impossible to add new types to the type-list (I can't know if any of my reverse dependencies does that for one of my exported types, so if I don't want to break their compilation, I can't add new types). I understand that open sums seem less useful to people who want sum types, though (and I guess that's at the core of why I consider sum types to be less useful than many people think). But IMO open sums are more adherent to Go's general philosophy of large-scale engineering and the whole gradual repair mechanism - and also more useful for almost all use-cases I see sum types suggested for. But that's just my 2¢.

@griesemer
Copy link
Contributor

@griesemer griesemer commented Sep 30, 2020

@Merovius Yes, your new reading is correct.

@mvdan
Copy link
Member

@mvdan mvdan commented Sep 30, 2020

In all other ways an interface type with a type list would act exactly like an interface type.
[...] So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

Could you clarify why nil should always be an option in such sum types? I understand this makes them more like a regular interface, but I'm not sure if that consistency benefit outweighs how it makes them less useful.

For example, they could be left out by default, or included by writing nil or untyped nil as one of the elements in the type list.

I understand that the zero value gets trickier if we remove the possibility of nil, which might be the reason behind always including nil. What do other languages do here? Do they simply not allow creating a "zero value" of a sum type?

@mvdan
Copy link
Member

@mvdan mvdan commented Sep 30, 2020

To add to my comment above - @rogpeppe's older proposal in #19412 (comment) does indeed make nil opt-in, and the zero value of the sum type becomes the zero value of the first listed type. I quite like that idea.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Sep 30, 2020

@mvdan as far as I'm aware other languages with sum types do not have the notion of a zero value and either require a constructor or leave it undefined. It's not ideal to have a nil value but getting something that works both as a type and a metatype for generics is worth the tradeoff, imo.

@Merovius
Copy link

@Merovius Merovius commented Sep 30, 2020

I guess (as a nit) nil should also be a required case if no default case is given, if we make nil a valid value.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Sep 30, 2020

So this https://go2goplay.golang.org/p/5L7T8G9rfLD would print "something else" under the current proposal, correct? The only way to get that value is reflect?

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Sep 30, 2020

@jimmyfrasche Correct. This proposal doesn't change the way that type switches operate, except for the suggested error if there are omitted cases.

@Merovius
Copy link

@Merovius Merovius commented Sep 30, 2020

So that means this code would panic? https://go2goplay.golang.org/p/vPC-qtKb7VO
That seems strange and as if it makes these sum types significantly less useful.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Sep 30, 2020

I'd like to reiterate my earlier suggestion: https://groups.google.com/g/golang-nuts/c/y-EzmJhW0q8/m/XICtS-Z8BwAJ

@tooolbox
Copy link

@tooolbox tooolbox commented Sep 30, 2020

I'd like to reiterate my earlier suggestion: https://groups.google.com/g/golang-nuts/c/y-EzmJhW0q8/m/XICtS-Z8BwAJ

Not terribly excited about the new syntax.

That seems strange and as if it makes these sum types significantly less useful.

Well, it panics without the type list. But I get your point, the interface then allows a value for which the compiler won't enforce a branch in a type switch.

Could we perform implicit conversion to one of the listed types when you assign into the interface? The only case I can think of where that's weird is when the interface has methods that those underlying types don't have, i.e. you have type Foo interface{ type int; String() string }, so implicit conversion to int itself violates the interface.

While I really like the idea of unifying interface semantics by allowing type lists in interfaces used as values, rather than just as constraints, perhaps the two use cases are different enough that the interfaces you'd use for each vary significantly. Maybe this problem we're discussing isn't one that we'd encounter in real code? It might be time to break out some concrete examples.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Sep 30, 2020

Any explicit syntax would work. I just had to choose something semi-reasonable to write the idea down. At any rate, it wouldn't need to be used very often but having the choice let's everything work reasonably without either use being hindered by the existence of the other.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Sep 30, 2020

@Merovius Correct: that code would panic.

I think it's worth discussing whether that would in fact make these sum types significantly less useful. It's not obvious to me, because it's not obvious to me that sum types are often used to store values of types like int. I agree that if that is a common case, then this form of sum types is not all that useful, but when does that come up in practice, and why?

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Sep 30, 2020

You could always work around it by using type LabeledInt int in the sum but that means having to create additional types. fwif json.Token is a sum type in the standard library that takes bool and float64 and string

@Merovius
Copy link

@Merovius Merovius commented Oct 1, 2020

@ianlancetaylor Point taken. I can't personally really provide any evidence or make any strong arguments, because I'm not convinced sum types in and off itself are actually very useful :) I was trying to extrapolate. Either way, I also find it objectionable on an aesthetic level, to single out predeclared types in this way - but that's just subjective, of course.

@neild
Copy link
Contributor

@neild neild commented Oct 1, 2020

Regarding changing a type list being a breaking change: If the type list contains an unexported type, then the rule in @ianlancetaylor's proposal effectively requires that all type switches outside the package containing the sum type contain a default case.

For example,

package p
type mustIncludeDefaultCase struct{}
type MySum interface {
  type int, float64, mustIncludeDefaultCase
}

Regarding nil-ness of sum types: I find it strange that the proposed rules require type switches to exhaustively cover the possible types in the sum or include a default case, but don't require covering the nil case.

type T interface { type int16, int32 }
func main() {
  var x T

  // None of these cases will execute, because x is nil.
  switch x.(type) {
  case int16:
  case int32:
  } 
}

I personally would prefer a design in which the zero value of a sum is the zero value of the first type in the sum. It is easy to add an additional "nothing here" case when desired, but impossible to remove a mandatory nil case when not.

@rogpeppe
Copy link
Contributor

@rogpeppe rogpeppe commented Oct 1, 2020

In general I'm in favour of this proposal, but I think there are some issues that need to be solved first.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

If type switches aren't changed at all, then I don't see how this rule is useful. It feels like it's attempting to define that switches on type-list interfaces are complete, but it's clear that they can never be complete when the type list contains builtin types, because there are any number of other non-builtin types there could be.

In general, even putting the above aside, I don't think the requirement for a type switch statement to fully enumerate the cases or the requirement to have a default fits well with the rest of the language. It's common to just "fall off the bottom" of a switch statement if something doesn't match, and that seems just as apt to me for a type-list type switch as with any other switch or type switch. In general, the rule doesn't feel very "Go-like" to me.

What about type assertions involving type-list interfaces. Can I do this?

type I1 interface {
    type string, []byte
}
var x I1 = "hello"
y := x.(string)

If not, why not? If so, why is this so different from a type switch with a single case and no default branch?

What about this (type asserting to a type list interface) ?

x := y.(I1)

If that works, presumably this could be used to test the underlying type of the dynamic type of an interface, which is something that's not easy to do even with reflect currently.

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

I understand why this rule is proposed - using the same rule in both cases is important for consistency and lack of surprises in the language. However, ISTM that this rule gives rise to almost all the discomfort I have with this proposal:

  • we can switch on all the types in the type list without having a guarantee of getting a match
  • there's a many-to-one correspondence between the types named in the interface type and the dynamic types that the interface can take on

If we don't allow an interface type with a type list to match underlying types too, then you end up with surprises with assignments in generic functions. For example, this wouldn't be allowed, because F might be instantiated with a type that isn't int64 or int:

type I interface {
    type int64, int
}

func F[T I](x T) I {
    return x
}

How about working around this by adding the following restriction:

Only generic type parameter constraints can use type-list interfaces that contain builtin types.

So the above example would give a compile error because I, which contains a builtin type, is being used as a normal interface type.

The above restriction would make almost all the issues go away, I think - albeit at the cost of some generality.

There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)

Allowing operators is only a problem for binary operators AFAICS. One thing that might be interesting to allow is operators that do not involve more than one instance of the type. For example, given:

type StringOrBytes interface {
     type string, []byte
}

I don't think that there would be any technical problem with allowing:

var s StringOrBytes = "hello"
s1 := s[2:4]
n := len(s)

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

I think that always having nil as a possibility is a bit of a shame and I don't think it's absolutely necessary as @mvdan pointed out, but I'd still support this proposal even with nil, for the record.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 1, 2020

As far as I can tell, this proposal nearly parallels the defined-sum interface types in my previous writeup.

There is one key difference that I would like to explore: assignability. To me, assignability is what makes interface types coherent at all: it is what causes interface types to have regular subtyping relationships, which other Go types lack.

This proposal does not mention assignability at all. That seems like an oversight. In particular:

  • If all of the types in the sum type S are defined (not underlying) types, and all of those types implement the methods of an ordinary interface type I, should a variable of type S be assignable to I?

  • If all of the types in the sum type S1 are also in the sum type S2, should a variable of type S1 be assignable to S2? (This is especially important when I is itself a type-list interface: it seems clear to me that a variable of type interface { int8, int16 } should be assignable to a variable of type interface { int8, int16, int32 }.)

  • Should a variable of any sum type be assignable to interface{}?

@mvdan: for me, the assignability properties are what lead to the conclusion that all sum types should admit nil values. It would be strange for the zero-value of type interface { int8, int16 } to be different for the zero-value of type interface{} if the former is assignable to the latter, and it would be even stranger for the zero-value of type interface { *A, *B } to be assignable to an interface implemented by both *A and *B but to have a non-nil zero-value.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 1, 2020

I believe that this proposal is compatible with (in the sense that it would not preclude later addition of) the sum interface types from my previous writeup, which I think are a closer fit to what most users who request “sum types” have in mind.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 1, 2020

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

This constraint seems like a mistake to me. Due to the underlying-type expansion, even a switch that enumerates all of the types in the type switch could be non-exhaustive.

I think that either the default constraint should be dropped, or a default case should also be required for any type-list interface that includes any type that could be an underlying type (that is, any literal composite type and any predeclared primitive type).

@deanveloper
Copy link

@deanveloper deanveloper commented Oct 1, 2020

Would unary operators be defined on the types if possible? ie:

type Int interface {
    type int, int32, int64
}
func Neg(i Int) Int {
    return -i
}

I would assume not since unary operators are defined to expand to binary operators, however the unary operators are valid to use since the other operand is untyped. Although this could result in unexpected behavior for sum types with uint types included.

@bcmills
Copy link
Member

@bcmills bcmills commented Oct 1, 2020

Finally, I would like to note that this proposal on its own would expose an (existing) inconsistency in the generics design draft: the semantics of a type-list interface used as a constraint would be markedly different from the semantics of any other interface type used as a type constraint.

In particular, unlike other interface types, a type-list interface would not satisfy itself as a type constraint. If it did, then the type constraint interface { int8, int16 } would not be sufficient to allow the use of mathematical operators on the constrained type, because the type interface { int8, int16 } itself does not have defined mathematical operators.

(See https://github.com/bcmills/go2go/blob/master/typelist.md#type-lists-are-not-coherent-with-interface-types for more detail.)

@fpsiorz
Copy link

@fpsiorz fpsiorz commented Oct 21, 2020

I really dislike the fact that builtin types are treated differently from user defined types. That's counter intuitive and a possible source of errors. Instead of treating them special in type lists, I would suggest using some kind of special syntax that means "any type that has the same underlying type as X". The reason is that nowhere else in the language, some type names mean "the concrete type X" whereas others mean "anything with X as its underlying type".

My suggestion would be to call it interface X, for the lack of a better fitting keyword name.

So this interface would be satisfied only by the concrete types int and uint:

type ConcreteInt interface {
    type int, uint
}

On the other hand, this interface would be satisfied by all types that have int or uint as their underlying type:

type AnyInt interface {
    type interface int, interface uint
}

This syntax should not only be allowed in type lists, but also in type switches:

func WhatIsThis(x AnyInt) {
    switch x := x.(type) {
    case int:
        fmt.Println("concrete int")
    case interface int:
        fmt.Println("underlying type is int, value is", int(x))
    case uint:
        fmt.Println("concrete uint")
    case interface uint:
        fmt.Println("underlying type is uint, value is", uint(x))
    case nil:
        fmt.Println("nil")
    }
}

The type of x in the second case would be interface int, which supports being converted to any type with underlying type int, but not any other operations. When you have a type MyInt defined as type MyInt int, the types interface int and interface MyInt would be completely equivalent.

Ideally, the syntax should be allowed anywhere where types are allowed, so you should be able to write this code:

func TypeAndValue(x interface int) (string, int) {
    return fmt.Sprintf("%T", x), int(x)
}

This would still panic for TypeAndValue(nil), but one could adapt it:

func TypeAndValue(x interface int) (string, int) {
    if x, ok := x.(interface int); ok {
        return fmt.Sprintf("%T", x), int(x)
    }
    return "nil", 0
}

With generics, there could also be situations in which such interfaces are suitable constraints, e.g.

func SortAnyFloat64s[T interface float64](x []T) {
    sort.Float64s([]float64(x))
}
@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Oct 21, 2020

I think that when type lists are used in a type constraint, people will usually want the underlying type matching for builtin types. So I think it would be unfortunate and error prone to require them to write interface int in that scenario.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Oct 21, 2020

I'm not sure what's unfortunate and error prone about having to be explicit. Even if an annotation is made or forgotten in error, it's easy to spot by inspection.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Oct 21, 2020

I think it's error prone because both type int and type interface int are accepted, and the difference between them is quite subtle.

It would be less error prone if something were required either way. Then people would have to think about what they mean.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Oct 22, 2020

Not annotating an entry makes it look like how types look everywhere else and match like how types are match everywhere else. Annotating an entry makes it look and match differently.

Different annotations for each case seem redundant since one case is the "like everywhere else" case—but I'd be fine with that, I suppose.

Changing matching based on accident of provenance seems more subtle and error prone than either of the above to me and the most likely to trip up newcomers who haven't learned the rules and may not realize there is a rule until they trip over it.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Oct 22, 2020

There is no case today where a type list involves matching types. So while I understand the intuition, I don't agree that this is how types are matched everywhere else. They aren't matched anywhere else. We have operations like assignment and conversion, and those operations do not work exactly like any of matching mechanisms suggested for type lists.

Also, note that if we use type lists at all, we are going to use them in type constraints before we use them as ordinary types. It's quite possible that we will use them in type constraints and never use them in ordinary types (that is, that we will accept a generics proposal but not accept this proposal). That would leave us saying type interface int for all time. That can't be right.

@bokwoon95
Copy link

@bokwoon95 bokwoon95 commented Oct 22, 2020

What if we change the 'mode' (concrete or underlying) of the type switch depending on the keyword passed to it?

switch s := s.(type) {
case int: // <-- matches only concrete ints
}

switch s := s.(interface) {
case int: // <-- matches types whose underlying type is also int
}
@bcmills
Copy link
Member

@bcmills bcmills commented Oct 22, 2020

@bokwoon95, the keyword interface seems to subtle to me for that use-case. (interface to me does not say “underlying type”.)

switch s.(underlying) seems reasonable, though, or switch underlying(s).

@deanveloper
Copy link

@deanveloper deanveloper commented Oct 22, 2020

i forgot who suggested this, but someone had suggested much earlier that int... could mean “int and anything with underlying type int”

@fpsiorz
Copy link

@fpsiorz fpsiorz commented Oct 22, 2020

[Edit: Sorry, I messed up the formatting, should be right now]

I doubt that the main usage of type lists would be for constraints. Most users will probably just use the predefined constraints from the library, or combine them using interface embedding, such as here:

type OrderedNumber interface {
    constraints.Ordered
    constraints.Number
}

Like with any other interfaces, this should allow to intersect the sets of types satisfied by those two interfaces. On the other hand, if users wanted the union of those types, so all types that satisfy constraints.Ordered or constraints.Number, they could still use type lists without any extra keyword, like this:

type OrderedOrNumber interface {
    type constraints.Ordered, constraints.Number
}

This should be the general behavior of all interfaces in type lists: they add all types to the type list that the interface satisfies. That's the logical meaning of an interface in a type list. I don't know if this usage of interfaces in type lists is already in the proposal, but it should at least be considered as a future addition.

This leads to a very simple rule of how type lists work:

  • Concrete type in type list: this exact concrete type is satisfied by the new interface
  • Interface in type list: all concrete types satisfied by this interface are satisfied by the new interface

Currently, builtin types break this rule. They are concrete types, yet including them in a type list doesn't just add a single concrete type, it adds the whole set of types that have the same underlying type.

The solution would be to have an interface type that is satisfied not based on the methods, but on the underlying type. That's where interface int comes in. It's explicitly a set of concrete types (like all interface types), not something that usually refers to a concrete type but changes its meaning in the context of a type list.

On the other hand, the current semantics are very error prone. Imagine that in a package, the author wants a function SetTitle to be able to accept either a normal string or any fmt.Stringer to produce the title. They would write it like this

type Stringer interface {
    type string, fmt.Stringer
}

var title string

func SetTitle(str Stringer) {
    switch str := str.(type) {
    case string:
        title = str
    case fmt.Stringer:
        title = str.String()
    }
}

Now, consider somebody using this package. They can call the function SetTitle not only with an argument of a type the package author intended to, but also with any other type that has string as its underlying type. However, in that case, the function will panic. Even worse, there is not even a simple way for the package author to fix it. In the definition of the interface, there is no way to specify that only the builtin string type should satisfy the interface, not any other user defined type thay happens to have string as its underlying type. So the package author may decide to just allow all types that are backed by a string, but then they realize that this isn't really straight forward either, because there's no easy way to match for such a type in a type switch. They would have to introduce a new type like this:

type anyString interface {
    type string
}

Then, they would have to match for it, and the implementation would have to allow a type conversion in this case, using title = string(str). This is not very intuitive, especially because type conversions on interfaces are not common. Using a type assertion wouldn't work in this use case because they also only match the exact concrete type, not the underlying one.

If the interface string syntax would be adapted, the package author would have the choice:

  • Either only the builtin string type is allowed, and no user defined type (except the ones that satisfy fmt.Stringer). Then the syntax would be exactly like in the above example.
  • Or explicitly allow all types that are backed by a string. This means having to use interface string in both the type list and the type switch, and having to use an explicit type conversion doing title = string(str). In this case the type conversion is less surprising because the type conversion operator is essentially the "interface method" of the the interface string.

This really simplifies the language, and I would argue it also makes the language a bit more orthogonal. Currently, interfaces define a set of types by the common methods they implement. By adding this syntax, it would add a new dimension to the way interfaces can build a set of types, namely by having a common backing type, and having type conversion as a sort of "method" for this interface. This could even be used independent of type lists and constraints and sum types, just as regular interfaces. For example one could write this:

type StringerString interface {
    fmt.Stringer
    interface string
}

The type StringerString would only be satisfied by user defined types that are backed by a string and also have a String method. It could be used like this:

func InsideAndOutside(s StringerString) {
    fmt.Println("internal representation:", string(s), "pretty printed:", s.String())
}

Now, this isn't terribly useful and necessary by itself, but having the concept of an interface that satisfies all concrete types that have the same underlying type helps bridge the gap between what we want for constraints in generics and what we want in sum types, and allow us to use the same interface notation for both.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Oct 22, 2020

@ianlancetaylor

There is no case today where a type list involves matching types. So while I understand the intuition, I don't agree that this is how types are matched everywhere else. They aren't matched anywhere else. We have operations like assignment and conversion, and those operations do not work exactly like any of matching mechanisms suggested for type lists.

That's certainly correct. What I mean is that given

type G interface {
  type A, B, C
}
func F[T G](x T) {}

where none of A, B, C are primitive, you can reason that T be each of A, B, and C in turn by substitution and that when T = A you can consider the signature of F to be F(x A) and so on for B and C. That's what I mean about is being "like everywhere else".

Just to drive it home, under the current draft and the rule in this proposal if, say, int is added to the type list of G when T = int you cannot consider the signature of F to be F(x int) since it also accepts the whole family of types whose underlying type is int, which is unlike anything else in the language today.

Also, note that if we use type lists at all, we are going to use them in type constraints before we use them as ordinary types. It's quite possible that we will use them in type constraints and never use them in ordinary types (that is, that we will accept a generics proposal but not accept this proposal). That would leave us saying type interface int for all time. That can't be right.

If we end up with generics but not this proposal, I'd still rather have an explicit annotation because (a) explicit 👍 and (b) I might want to write something that is int and only int. (b) is not very important but (a) is. There's surely more syntax to learn but it's more uniform and straightforward. The new idea has a new form. If you're someone learning Go generics you can spot the new case because it has it a new syntax.

As long as everyone is bikeshedding syntax, let's say it's ~T instead of interface T. With a new syntax it can be expanded to work in type assertions and switches. Even if there's just generics that would let you write

type G interface {
  type A, ~B
}
func F[T G](x T) {
  switch any(x).(type) {
  case A:
  case ~B:
  }
}

and handle the case when T has underlying type B. This is less clear if there's annotations for both cases since one of them would be unnecessary for switches and assertions.

I doubt it would be of any use but it would also be possible to expand an explicit syntax later and allow things like

type G interface {
  type []~T, ~[]S, ~[]~U
}
@beoran
Copy link

@beoran beoran commented Oct 22, 2020

I agree that it would be useful to be able to specify both in type switches and in type lists whether we want either the exact type or all types with the given underlying type. However, for ease of use with generics, I think the former case should be marked somehow and the latter not.

@rogpeppe
Copy link
Contributor

@rogpeppe rogpeppe commented Oct 22, 2020

@ianlancetaylor

I think that when type lists are used in a type constraint, people will usually want the underlying type matching for builtin types.

That may or may not be true, but it seems clear to me that it's this particular aspect that is causing the most fundamental problems with the whole generics proposal. It's this behaviour which makes type switches on generic types problematic, and it's also this behaviour that makes this type-lists-as-sum-types proposal hard too.

It might be hard to let go of, but if we do, I think the language will be a lot cleaner, even if slightly less convenient, and some things become possible that are not currently.

As an alternative to allowing generics to work automatically on underlying types, we could extend the type conversion operator to allow conversion between values with the same underlying type, even when those types aren't actually at the top level.

For example, this code could be allowed because b has the same underlying type as byte:

 type b byte
 x := []b{1,2,3}
 y := []byte(x)

If we allow that (it was actually allowed in a previous version of the spec in fact), then a generic function specified on a type list type parameter can still be invoked for a named type, albeit at the cost of explicit type conversions for argument and/or return values.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Oct 22, 2020

@rogpeppe Thanks. I remain absolutely convinced that it is not acceptable to require type conversions for Min(x, y) if x and y happen to have values of type Celsius. It must be possible to write that in some way. Maybe we have the wrong approach to making that work, but it is not a matter of being hard to let go of. It absolutely must work.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Oct 22, 2020

I still think explicit syntax is the way to go, but I'm warming up to the possibility of two rules

  1. in generic constraints, type lists always use or-underlying matching
  2. in sum types, type lists always use normal matching

It doesn't feel ideal to have two rules but as long as they're both simple maybe it's not too bad.

@tdakkota
Copy link

@tdakkota tdakkota commented Dec 15, 2020

Sum types in Go is really good idea.

Using sum type is a good way to define different states using types.
But when it can be nil value, sum type always have an additional and maybe unnecessary state. Every time when you define N variants, you can get N+1 variants.
So,

type Result[T any] interface {
    type T, error
}

can be error, T or nil.

I think it should not be nil-able, it should not have zero value instead.
We can add check that sum type is always initialized by some variant.

type SomeVariant sturct{}
type SomeVariant2 sturct{}

type SumType interface {
    type SomeVariant, SomeVariant2
}

func var() {
    var s SumType // compiler error: s is uninitialized
    var s2 SumType = SomeVariant{} // OK
}

func result() (r SumType) { // compiler error: r is uninitialized
    return
}

func result2() (r SumType) { // OK, r is always initialized 
    r = SomeVariant{} 
    return
}

func result3() SumType { // OK, r is always initialized
    if condition() {
        return SomeVariant{}
    }

    return SomeVariant2{}  
}

type SomeType struct {
    S SumType
    value int
}

func insideStruct() SomeType {
    var s SomeType // compiler error: field S is uninitialized
    var s2 *SomeType // OK, if pointer is nil, then dereferencing or s2.S will cause panic
    s1 := SomeType{value: 10} // compiler error: field S is uninitialized
    s2 := &SomeType{} // compiler error: field S is uninitialized
    s3 := SomeType{S: SomeVariant{}} // OK  
}

type Constr interface {
     type []byte, SomeType 
}

func insideTypeSum[T Constr]() T {
       var zero T // compiler error: T can be SomeType and is uninitialized
       return zero
}

Also, if we really need nil-able sum type, we can allow nil variant explicitly.
It that case it can't be used as constraint.

type SumType interface {
    type nil, SomeVariant
}

And there is intersting case to use this feature:

type box[T any] struct {
  value *T
}

func (b box[T]) Get() *T {
  return b.value
}

type NonNil[T any] interface {
  type box[T]
  Get() *T
}

func Take[T any](t T) NonNil[T] {
  return box[T]{value: &t}
}
  • NonNil can't have zero value, so it always be initialized using box[T]
  • box[T] can't be created from another package without Take[T]
  • Take[T] can't create box[T] with value = nil

So, there no way to create NonNil[T] type where function Get() *T can return a nil pointer.

@tv42
Copy link

@tv42 tv42 commented Dec 15, 2020

@tdakkota

it should not have zero value

That would be a huge change to Go, definitely much more so than the proposal here. You have left out so many edge cases, e.g. reflect.Zero, arrays/slices, maps, new and so on. Your idea would be better served by its own proposal (if one doesn't already exist), where its feasibility can be discussed without adding noise to this much narrower proposal.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Dec 15, 2020

@tdakkota There is a lot of discussion of zero values and sum types over at #19412.

@tdakkota
Copy link

@tdakkota tdakkota commented Dec 17, 2020

@tv42

That would be a huge change to Go, definitely much more so than the proposal here. You have left out so many edge cases, e.g. reflect.Zero, arrays/slices, maps, new and so on. Your idea would be better served by its own proposal (if one doesn't already exist), where its feasibility can be discussed without adding noise to this much narrower proposal.

I did not propose deny zero values for all types or types inside type list. I proposed to deny creation of type sum with nil type.
I don't think it's a so huge change.
In that case

type ByteString interface {
     type []byte, []rune, string
}

Variable of type ByteString can be []byte(nil), []rune(nil) or string("")
But it always has a concrete type.

var slice []SumType // OK, there are no elements - no zero value type sum
var map map[string]SumType // OK, there are no elements - no zero value type sum
var channel chan SumType // OK, there are no elements - no zero value type sum

var array [2]SumType // compiler error: elements are uninitialized

new(SumType) is literally same as &SumType{}, so it should cause a compiler error.

I am not sure about how reflection would be work in Go2, but I think reflect.Zero should cause panic. Also, we can add a reflect.CanZero function to make sure that type can have zero value.

@rogpeppe
Copy link
Contributor

@rogpeppe rogpeppe commented Dec 17, 2020

I don't think it's a so huge change.

I'm afraid it would be a huge change. Zero values crop up in Go in many places.

var slice []SumType // OK, there are no elements - no zero value type sum

What about: slice = append(slice, s) where s is of type SumType?
Surely that must be OK. But then it might have a capacity larger than its length, so what
would slice[:cap(slice)][cap(slice)-1] return?

var m map[string]SumType // OK, there are no elements - no zero value type sum

What's the result of m[key] when key doesn't exist in m.

var c chan SumType // OK, there are no elements - no zero value type sum

What's the value of v in v, ok := <-c ?

You would have to forbid use of reflect.Zero(t) for any type t that contains a SumType.

In short, your suggestion is not viable. All types in Go must have a zero value.

In #19412, I suggested that the zero value could
be the first type in the list. I still think that's a viable approach rather than always including nil
as a possible value.

@atdiar
Copy link

@atdiar atdiar commented Dec 19, 2020

Got me wondering if nil should not be made a special type that belongs to every sumtype.

The "bottom" I think it's called in type theory. Everything I read in the proposal summary seems fine to me so far. Especially if we consider types as merely named interfaces or in Go, interfaces with a method that returns a unique name.
Sum types would just be a union of disjoint interfaces.
int would just be an embedding of type MyInt. (just from an abstract pov, a type being an interface around raw bits)

Issue is that people would not like having to deal with nil everywhere but it might be needed for function composition with sumtype arguments, amongst other things.
It's already the type of the nil value somehow, especially useful for interface comparison (if err!=nil).
Also, it allows for optional value parameters in functions (variable arity functions even) , a kind of "Maybe(x)". (or in struct fields but then, encoding will have to change, thinking about json and sql etc).
Nilable types would be a restricted kind of intersection types then between a value type and the nil type. Not sure of what it brings to the table but interesting nonetheless.

Also means that if we wants sumtypes of interfaces, they have to have fully orthogonal method sets (only a union of disjoint sets of types should work, a traditional go interface being an intensional encoding of a set of types)

Also important to note that set union is commutative. So should be interfaces. The zero-value of a sumtype being the zero-value of the first or second type component would preclude that.

@deanveloper
Copy link

@deanveloper deanveloper commented Dec 23, 2020

If we are going to use interface { ... } to designate sum types, then it should behave like an interface, and nil would be its zero value.

I'm not saying that we need to use nil as the zero value, however if we don't want to use nil as the zero value, then we should pick a different syntax from interface.

@thwd
Copy link

@thwd thwd commented Dec 26, 2020

What we're calling "underlying" in this thread, really just refers to a type's kind. reflect.Kind speaks about memory representation of types. We should separate the two concepts; type and kind. Types define abstractions, kinds define memory representation (and operators).

Types in Go1 are invariant. MyInt and int are not assignable to each other and must be converted. If we introduce underlying-matching; we introduce a type hierarchy. Consider:

type MyInt int
type YourInt Myint
type OurInt YourInt

Having int match all three of these and YourInt match two of these and MyInt match one of these induces a type hierarchy (int :> MyInt :> YourInt :> OurInt) while the rest of the language is strictly invariant.

In my opinion, the interesting question is not how to string and int, but how to struct{ F func(map[string]chan []*MyInt) } in kind-constraints.

  • Do we even allow structs? Say we didn't.
  • Do we allow funcs? Say we didn't.
  • Do we allow maps/slices? Say we didn't.
  • Does *MyInt match *int? (covariance)

In my opinion: introducing underlying-matching fundamentally challenges Go1's design and opens a wholly new problem space much larger than what has been considered in this thread so far.

@Merovius
Copy link

@Merovius Merovius commented Dec 27, 2020

I'm having trouble understanding what you are trying to say.

What we're calling "underlying" in this thread, really just refers to a type's kind.

Not really. kind doesn't distinguish between different structs, but underlying does. Also, "underlying type" is an established concept from the go spec, kind isn't.

If we introduce underlying-matching; we introduce a type hierarchy.

Perhaps. Though I think it is useful to be more specific than just say "we introduce a type hierarchy". In particular, Go already has actual subtyping (the spec calls it "assignability", mainly affected by interface satisfaction), so having type hierarchies in general isn't necessarily a bad thing.

In this particular case, the matching only applies to generic type parameters. So while there are some contexts in which you can use a YourInt as a MyInt, you can't in general - so we don't have actual subtyping or an actual type-hierarchy. It could be argued that that's more confusing - but IMO "generic type parameters" are an easy enough to understand and distinguish context that it isn't much of a problem in general.

Types in Go1 are invariant. […] the rest of the language is strictly invariant.

I'm not sure what you mean here. "Invariance" is a property of type-constructors, not of types. And as far as I can tell, nothing really changes in this regard - Go won't get variant type constructors with the generic design, underlying type matching or not. You won't be able to use a func(X) as a func(Y), unless X and Y are the same type - that won't change.

You might intend to say that Go has no subtyping, but see above - it already does and has been since the Go1.0.

Do we even allow structs? Say we didn't.
Do we allow funcs? Say we didn't.
Do we allow maps/slices? Say we didn't.

Why would we say any of that? I don't see anything in the design or this proposal that would suggest structs, funcs or maps/slices aren't allowed, under exactly the same rules.

Does *MyInt match *int? (covariance)

This is an interesting question. A priori, I would assume "no" - the underlying type of *MyInt is *MyInt and that's not identical to *int. So you couldn't use a *MyInt to satisfy an interface{ type *int } constraint.

introducing underlying-matching fundamentally challenges Go1's design and opens a wholly new problem space much larger than what has been considered in this thread so far.

Perhaps. But I think it's also important to acknowledge that without it, type-parameters lists lose much of their utility - namely, allowing to write generic functions using operators. If you don't mach on the underlying type of a type-argument, you couldn't use a generic Max function with your MyInt, for example. (And the implications of using underlying type matching or abandoning it have been discussed in this thread already, I think :) )

@atdiar
Copy link

@atdiar atdiar commented Dec 28, 2020

Yes, I think the confusion stems from the understanding of what underlying means. I think it's suposed to be used in case of a sumtype defined using the built-in types. So when the underlying is a built-in type.

Because if we take the initial post from Ian, if I define
type I3 interface{ MyInt2, MyFloat2 }
Even if the underlying implements I1, I don't think that I1 == I3

[edit] unless as @thwd was mentioning a type hierarchy which indeeed exist, we have

type OurInt that would be accepted in type interface{MyInt, Myfloat} since MyInt is an underlying of OurInt
Yet, type MyInt would not accepted in type interface{Outint, MyFloat} because it is not an underlying.

Also, yes, pointer dereference operator is not an issue. Think of it as a function from int to a Typed value and remember that there is no overloading in Go. No covariance/ contravariance either. No automatic conversion. All this is subtly linked.

Then, it just works because MyInt and int are different: they define disjoint sets of values.
That's also why we can't add MyInt and OurInt by using "+" especially in a sumtype relationship. Otherwise, that would be a misuse of overloading when really, the operands are not the same.

Add(MyInt, int) can be transformed to Add(MyInt, MyInt) but Add(MyInt, OurInt) can not be specified. (or another name for this function/method/operator should be used, i.e. can't use the "+" symbol because no operator overloading and it would be left at the discretion of the user to define what that even means think adding celsius + meters)

For type switches, again, it's important that the types in a sumtype/union type be disjoint in terms of conxeptual method set (still taking a type as a conceptual go interface which, if not built-in, has a unique-name returning method) .
So while passing an int to type interface{MyInt, MyFloat} is ok, it's not for type Interface{MyInt, MyInt2} . Because by implementation int can be either one of them. So the underlying should be boxed into one of the types. We may find this issue again when type switching.

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 6, 2021

@atdiar

Yes, I think the confusion stems from the understanding of what underlying means.

The definition that applies here is the one in the language spec (https://golang.org/ref/spec#Types):

Each type T has an underlying type: If T is one of the predeclared boolean, numeric, or string types, or a type literal, the corresponding underlying type is T itself. Otherwise, T's underlying type is the underlying type of the type to which T refers in its type declaration.

@bcmills
Copy link
Member

@bcmills bcmills commented Jan 6, 2021

For type switches, again, it's important that the types in a sumtype/union type be disjoint in terms of conxeptual method set (still taking a type as a conceptual go interface which, if not built-in, has a unique-name returning method).

I don't agree that that is important. This proposal describes types that function more as unions than as algebraic data types, and that's ok: the concrete types themselves function as algebraic constructors, so the sum types don't need to also provide that functionality.

@atdiar
Copy link

@atdiar atdiar commented Jan 8, 2021

Yes, thank you. Indeed I had also gone back to the spec and figured out underlying is defined there. The proposal does define a subtyping relationship then as @thwd was alluding to in terms of type hierarchy. I do no think it's necessarily problematic after thinking about it.

As for your second point, the issue that could possibly occur for me is when the types in a union are interfaces. How to type switch on that if a concrete type implements several interfaces.
Also, if two different types implement the same interfaces that would have been in a union, are we sure these interfaces can really interop? (especially if they are not in a subtyping relationship)

@beoran
Copy link

@beoran beoran commented Jan 23, 2021

One solution for the type switch problem for the union proposed in this issue wiuld be to require the type switch to consist of all members of the union, each nominally, nothing more and nothing less. If any member is not exported then the type switch is only allowed in the defining package. This idea is simple and Go-like, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet