Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: add sum types / discriminated unions #19412

Open
DemiMarie opened this issue Mar 5, 2017 · 273 comments

Comments

@DemiMarie
Copy link

commented Mar 5, 2017

This is a proposal for sum types, also known as discriminated unions. Sum types in Go should essentially act like interfaces, except that:

  • they are value types, like structs
  • the types contained in them are fixed at compile-time

Sum types can be matched with a switch statement. The compiler checks that all variants are matched. Inside the arms of the switch statement, the value can be used as if it is of the variant that was matched.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Mar 6, 2017

This has been discussed several times in the past, starting from before the open source release. The past consensus has been that sum types do not add very much to interface types. Once you sort it all out, what you get in the end if an interface type where the compiler checks that you've filled in all the cases of a type switch. That's a fairly small benefit for a new language change.

If you want to push this proposal along further, you will need to write a more complete proposal doc, including: What is the syntax? Precisely how do they work? (You say they are "value types", but interface types are also value types). What are the trade-offs?

@bradfitz bradfitz added the Proposal label Mar 6, 2017

@bradfitz bradfitz added this to the Proposal milestone Mar 6, 2017

@rsc rsc changed the title Proposal: Discriminated unions proposal: spec: add sum types / discriminated unions Mar 6, 2017

@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 6, 2017

@griesemer

This comment has been minimized.

Copy link
Contributor

commented Mar 6, 2017

I think this is too significant a change of the type system for Go1 and there's no pressing need.
I suggest we revisit this in the larger context of Go 2.

@rsc rsc added the Go2 label Mar 13, 2017

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

Thanks for creating this proposal. I've been toying with this idea for a year or so now.
The following is as far as I've got with a concrete proposal. I think
"choice type" might actually be a better name than "sum type", but YMMV.

Sum types in Go

A sum type is represented by two or more types combined with the "|"
operator.

type: type1 | type2 ...

Values of the resulting type can only hold one of the specified types. The
type is treated as an interface type - its dynamic type is that of the
value that's assigned to it.

As a special case, "nil" can be used to indicate whether the value can
become nil.

For example:

type maybeInt nil | int

The method set of the sum type holds the intersection of the method set
of all its component types, excluding any methods that have the same
name but different signatures.

Like any other interface type, sum type may be the subject of a dynamic
type conversion. In type switches, the first arm of the switch that
matches the stored type will be chosen.

The zero value of a sum type is the zero value of the first type in
the sum.

When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen.

For example:

var x int|float64 = 13

would result in a value with dynamic type int, but

var x int|float64 = 3.13

would result in a value with dynamic type float64.

Implementation

A naive implementation could implement sum types exactly as interface
values. A more sophisticated approach could use a representation
appropriate to the set of possible values.

For example a sum type consisting only of concrete types without pointers
could be implemented with a non-pointer type, using an extra value to
remember the actual type.

For sum-of-struct-types, it might even be possible to use spare padding
bytes common to the structs for that purpose.

@bcmills

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@rogpeppe How would that interact with type assertions and type switches? Presumably it would be a compile-time error to have a case on a type (or assertion to a type) that is not a member of the sum. Would it also be an error to have a nonexhaustive switch on such a type?

@josharian

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
  case int:
    // ...

and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

Or can sum types contain only concrete types?

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

Yes, I think it would be reasonable to have a compile-time error
to have a case that can't be matched. Not sure about whether it's
a good idea to allow non-exhaustive switches on such a type - we
don't require exhaustiveness anywhere else. One thing that might
be good though: if the switch is exhaustive, we could not require a default
to make it a terminating statement.

That means that you can get the compiler to error if you have:

func addOne(x int|float64) int|float64 {
    switch x := x.(type) {
    case int:
        return x + 1
    case float64:
         return x + 1
    }
}

and you change the sum type to add an extra case.

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
case int:
// ...
and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

t can't contain an interface{} containing an int. t is an interface
type just like any other interface type, except that it can only
contain the enumerated set of types that it consists of.
Just like an interface{} can't contain an interface{} containing an int.

Sum types can match interface types, but they still just get a concrete
type for the dynamic value. For example, it would be fine to have:

type R io.Reader | io.ReadCloser

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

According to the proposal above, you get the first item
in the sum that the value can be assigned to, so
you'd get the nil interface.

In fact interface{} | nil is technically redundant, because any interface{}
can be nil.

For []int | nil, a nil []int is not the same as a nil interface, so the
concrete value of ([]int|nil)(nil) would be []int(nil) not untyped nil.

@bcmills

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

The []int | nil case is interesting. I would expect the nil in the type declaration to always mean "the nil interface value", in which case

type T []int | nil
var x T = nil

would imply that x is the nil interface, not the nil []int.

That value would be distinct from the nil []int encoded in the same type:

var y T = []int(nil)  // y != x
@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be? My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil.

It seems overly subtle.

Exhaustive type switches would be nice. You could always add an empty default: when it's not the desired behavior.

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

The proposal says "When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen."

So, with:

type T []int | nil
var x T = nil

x would have concrete type []int because nil is assignable to []int and []int is the first element of the type. It would be equal to any other []int (nil) value.

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be?

The proposal says "The zero value of a sum type is the zero value of the first type in
the sum.", so the answer is int64(0).

My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil

No, it would just be the usual interface nil value in that case. That type (interface{} | nil) is redundant. Perhaps it might be a good idea to make it a compiler to specify sum types where one element is a superset of another, as I can't currently see any point in defining such a type.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

The zero value of a sum type is the zero value of the first type in the sum.

That is an interesting suggestion, but since the sum type must record somewhere the type of the value that it currently holds, I believe it means that the zero value of the sum type is not all-bytes-zero, which would make it different from every other type in Go. Or perhaps we could add an exception saying that if the type information is not present, then the value is the zero value of the first type listed, but then I'm not sure how to represent nil if it is not the first type listed.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

So (stuff) | nil only makes sense when nothing in (stuff) can be nil and nil | (stuff) means something different depending on whether anything in stuff can be nil? What value does nil add?

@ianlancetaylor I believe many functional languages implement (closed) sum types essentially like how you would in C

struct {
    int which;
    union {
         A a;
         B b;
         C c;
    } summands;
}

if which indexes into the union's fields in order, 0 = a, 1 = b, 2 = c, the zero value definition works out to all bytes are zero. And you'd need to store the types elsewhere, unlike with interfaces. You'd also need special handling for the nil tag of some kind wherever you store the type info.

That would make union's value types instead of special interfaces, which is also interesting.

@shanemhansen

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

Is there a way to make the all zero value work if the field which records the type has a zero value representing the first type? I'm assuming that one possible way for this to be represented would be:

type A = B|C
struct A {
  choice byte // value 0 or 1
  value ?// (thing big enough to store B | C)
}

[edit]

Sorry @jimmyfrasche beat me to the punch.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

Is there anything added by nil that couldn't be done with

type S int | string | struct{}
var None struct{}

?

That seems like it avoids a lot of the confusion (that I have, at least)

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

Or better

type (
     None struct{}
     S int | string | None
)

that way you could type switch on None and assign with None{}

@bcmills

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@jimmyfrasche struct{} is not equal to nil. It's a minor detail, but it would make type-switches on sums needlessly(?) diverge from type-switches on other types.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@bcmills It wasn't my intent to claim otherwise—I meant that it could be used for the same purpose as differentiating a lack of value without overlapping with the meaning of nil in any of the types in the sum.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@rogpeppe what does this print?

// r is an io.Reader interface value holding a type that also implements io.Closer
var v io.ReadCloser | io.Reader = r
switch v.(type) {
case io.ReadCloser: fmt.Println("ReadCloser")
case io.Reader: fmt.Println("Reader")
}

I would assume "Reader"

@bcmills

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

(And I would also expect sums which include only interface types to use no more space than a regular interface, although I suppose that an explicit tag could save a bit of lookup overhead in the type-switch.)

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

@bcmills it's the assigment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

@ianlancetaylor That's an excellent point to raise, thanks. I don't think it's hard to get around though, although it does imply that my "naive implementation" suggestion is itself too naive. A sum type, although treated as an interface type, does not have to actually contain direct pointer to the type and its method set - instead it could, when appropriate, contain an integer tag that implies the type. That tag could be non-zero even when the type itself is nil.

Given:

 var x int | nil = nil

the runtime value of x need not be all zeros. When switching on the type of x or converting
it to another interface type, the tag could be indirected through a small table containing
the actual type pointers.

Another possibility would be to allow a nil type only if it's the first element, but
that precludes constructions like:

var t nil | int
var u float64 | t
@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2017

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

Yes.

@bcmills it's the assigment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

I don't get this. Why would "this [...] have to be valid for the type switch to print ReadCloser"
Like any interface type, a sum type would store no more than the concrete value of what's in it.

When there are several interface types in a sum, the runtime representation is just an interface value - it's just that we know that the underlying value must implement one or more of the declared possibilities.

That is, when you assign something to a type (I1 | I2) where both I1 and I2 are interface types, it's not possible to tell later whether the value you put into was known to implement I1 or I2 at the time.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

If you have a type that's io.ReadCloser | io.Reader you can't be sure when you type switch or assert on io.Reader that it's not an io.ReadCloser unless assignment to a sum type unboxes and reboxes the interface.

@jimmyfrasche

This comment has been minimized.

Copy link
Member

commented Mar 22, 2017

Going the other way, if you had io.Reader | io.ReadCloser it would either never accept an io.ReadCloser because it goes strictly right-to-left or the implementation would have to search for the "best matching" interface from all interfaces in the sum but that cannot be well defined.

@tema3210

This comment has been minimized.

Copy link

commented Jul 12, 2019

I would like to see the rust’s discriminated unions syntax rather than haskell’s sum types, it allow naming of variants and allow better pattern matching syntax proposal.
Implementation can be done like struct with tag field(uint type, depends on count of variants) and union field(holding the data).
This feature required for closed set of variants (state representation would be much easier and cleaner, with compile time checking). According to questions about interfaces and their representation, I think that their implementation in sum type must be not over than just another case of sum type, because the interface is about any type which fit some requirements, but the sum type use case is different.

Syntax:

type Type enum {
         Tuple (int,int),
         One int,
         None,
};

In the example above the size would be sizeof((int,int)).
Pattern matching can be done with new created match operator, or within existing switch operator, just like:

var a Type
switch (a) {
         case Tuple{(b,c)}:
                    //do something
         case One{b}:
                    //do something else
         case None:
                    //...
}

Creation syntax:
var a Type = Type{One=12}
Note, that in enum instance construction only one variant can be specified.

Zero value (problem):
We can sort names in alphabet order, zero value of enum will be zero value of type of first member in sorted member list.

P.S Solution of Zero Value problem is mostly defined by agreement.

@urandom

This comment has been minimized.

Copy link

commented Jul 12, 2019

I think keeping the zero value of the sum as the zero value of the first user defined sum field would be less confusing, perhaps

@tema3210

This comment has been minimized.

Copy link

commented Jul 12, 2019

I think keeping the zero value of the sum as the zero value of the first user defined sum field would be less confusing, perhaps

But making zero value depend on field declaration order, i think it's worse.

@tema3210

This comment has been minimized.

Copy link

commented Jul 21, 2019

Someone wrote design doc?

I have one:
19412-discriminated_unions_and_pattern_matching.md.zip

I changed this:

I think keeping the zero value of the sum as the zero value of the first user defined sum field would be less confusing, perhaps

Now in my proposal agreement on Zero Value (Problem) moved to urandoms position.

UPD: Design doc changed, minor fixes.

@sirkon

This comment has been minimized.

Copy link

commented Jul 31, 2019

I have two recent use cases, where I needed built in sum types:

  1. AST tree representation, as expected. Initially found a library which was a solution at first sight, but their approach was having a large struct with lots of nilable fields. Worst of both worlds IMO. No type safety of course. Wrote our own instead.
  2. Had a queue of predefined background tasks: we have a search service which is under development right now and our search operations could be overly long, etc. So we decided to execute them in the background via sending search index operation tasks into a channel. Then a dispatcher will decide what to do with them further. Could use visitor pattern, but it obviously is an overkill for a simple gRPC request. And it is not particularly clear to say at least, as it introduces a tie between a dispatcher and a visitor.

In both cases implemented something like this (on the example of 2nd task):

type Task interface {
    task()
}

type SearchAdd struct {
    Ctx   context.Context
    ID    string
    Attrs Attributes
}

func (SearchAdd) task() {}

type SearchUpdate struct {
    Ctx         context.Context
    ID          string
    UpdateAttrs UpdateAttributes
}

func (SearchUpdate) task() {}

type SearchDelete struct {
    Ctx context.Context
    ID  string
}

func (SearchDelete) task() {}

And then

task := <- taskChannel

switch v := task.(type) {
case tasks.SearchAdd:
    resp, err := search.Add(task.Ctx, &search2.RequestAdd{…}
    if err != nil {
        log.Error().Err(err).Msg("blah-blah-blah")
    } else {
        if resp.GetCode() != search2.StatusCodeSuccess  {
            …
        } 
    }
case tasks.SearchUpdate:
    …
case tasks.SearchDelete:
    …
}

This is almost good. The bad thing Go does not provide full type safety, i.e. there will be no compilation error after the new search index operation task would be added.

IMHO using sum types is the clearest solution for these kind of tasks usually solved with visitor and set of dispatchers, where visitor's functions are not numerous and small and visitor itself is a fixed type.

I truly believe having something like

type Task oneof {
    // SearchAdd holds a data for a new record in the search index
    SearchAdd {
        Ctx   context.Context
        ID    string
        Attrs Attributes   
    }

    // SearchUpdate update a record
    SearchUpdate struct {
        Ctx         context.Context
        ID          string
        UpdateAttrs UpdateAttributes
    }

    // SearchDelete delete a record
    SearchDelete struct {
        Ctx context.Context
        ID  string
    }
}
switch task {
case tasks.SearchAdd:
    // task is tasks.SearchAdd in this scope
case tasks.SearchUpdate:
case tasks.SearchDelete:
}

would be much more Goish in spirit than any other approach Go allows in its current state. No need for Haskellish pattern matching, juist diving up to the certain type is more than enough.

@sirkon

This comment has been minimized.

Copy link

commented Aug 1, 2019

Ouch, missed the point of the syntax proposal. Fix it.

Two version, one for generic sum type and sum type for enumerations:

Generic sum types

type Sum oneof {
    TTypeDeclTTypeDecl₂
    …
    TTypeDeclₙ
}

where T₁Tₙ are type definitions at the same level with Sum (oneof exposes them outside of its scope) and Sum declares some interface which only T₁Tₙ satisfies.

The processing is similar to what we have (type) switch except it is done implicitly over oneof objects and there have to be a compiler check if all variants were listed.

Real type safe enumerations

type Enum oneof {
    Value = iota
}

pretty similar to iota of consts, except only explicitly listed values are Enums and everything else is not.

@tema3210

This comment has been minimized.

Copy link

commented Aug 7, 2019

switch task {
case tasks.SearchAdd:
    // task is tasks.SearchAdd in this scope
case tasks.SearchUpdate:
case tasks.SearchDelete:
}

would be much more Goish in spirit than any other approach Go allows in its current state. No need for Haskellish pattern matching, juist diving up to the certain type is more than enough.

I don't think that manipulating meaning of task variable is good idea, although acceptable.

@sirkon

This comment has been minimized.

Copy link

commented Aug 7, 2019

```go
switch task {
case tasks.SearchAdd:
    // task is tasks.SearchAdd in this scope
case tasks.SearchUpdate:
case tasks.SearchDelete:
}

would be much more Goish in spirit than any other approach Go allows in its current state. No need for Haskellish pattern matching, juist diving up to the certain type is more than enough.

I don't think that manipulating meaning of task variable is good idea, although acceptable.

Good luck with your visitors then.

@Goodwine

This comment has been minimized.

Copy link

commented Aug 7, 2019

@sirkon What do you mean about visitors? I liked this syntax btw, however should the switch be written as this:

switch task {
case Task.SearchAdd:
    // task is Task.SearchAdd in this scope
case Task.SearchUpdate:
case Task.SearchDelete:
}

Also what would the no-value for Task be? For example:

var task Task

Would it be nil? If so, should the switch have an extra case nil?
Or would it be initialized to the first type? This would be awkward tho, because then order of type declaration matters in a way it didn't before, however it would probably be OK for numeric enums.

I'm assuming that this is equivalent to switch task.(type) but the switch would require all cases to be there, right? as in.. if you miss one case, compilation error. And no default allowed. Is that right?

@sirkon

This comment has been minimized.

Copy link

commented Aug 7, 2019

What do you mean about visitors?

I meant they are the only type safe option in Go for such kind of functionality. Much worse at that for a certain set of cases (limited number of predefined alternatives).

Also what would the no-value for Task be? For example:

var task Task

I am afraid it should be a nilable type in Go as this

Or would it be initialized to the first type?

would be way too weird especially for a purpose intended.

I'm assuming that this is equivalent to switch task.(type) but the switch would require all cases to be there, right? as in.. if you miss one case, compilation error.

Yes, right.

And no default allowed. Is that right?

No, defaults are allowed. Discouraged though.

PS I seem to got an idea Go @ianlancetaylor and other Go people have about sum types. It looks like nilness makes them quite an NPD-prone, since Go doesn't have any control over nil values.

@Goodwine

This comment has been minimized.

Copy link

commented Aug 7, 2019

If it's nil, then I guess it's fine. I would prefer that case nil was a requirement for the switch statement. Doing an if task != nil before is ok too, I just don't like it that much :|

@Goodwine

This comment has been minimized.

Copy link

commented Aug 7, 2019

Would this be allowed too?

type Foo oneof {
  A = 3
  B = "3"
  C = 3.0
  D = struct { E bool }{ true }
}
@sirkon

This comment has been minimized.

Copy link

commented Aug 7, 2019

Would this be allowed too?

type Foo oneof {
  A = 3
  B = "3"
  C = 3.0
  D = struct { E bool }{ true }
}

Well, no consts then, only

type Foo oneof {
    A <type reference>
}

or

type Foo oneof {
    A = iota
    B
    C
}

or

type Foo oneof {
    A = 1
    B = 2
    C = 3
}

No combination of iotas and values. Or combination with control on values, they should not be repeated.

@Merovius

This comment has been minimized.

Copy link

commented Aug 7, 2019

FWIW, one thing I found interesting about the newest generics design is that it showed another venue to address at least some of the use cases of sum types while avoiding the pitfall about zero-values. It defines disjunctive contracts, which are sum-ish in a way, but because they describe constraints and not types, don't need to have a zero value (as you can't declare variables of that type). That is, it's at least possible to write a function that takes a limited set of possible types, with compile-time type-checking of that set.

Now, of course, as-is the design doesn't really work for the use-cases intended here: Disjunctions list only underlying types or methods and are thus still widely open. And of course, even as a general idea, it's pretty limited as you can't instantiate a generic (or sum-ish-taking) func or value. But IMO it shows that the design space to address some of the use-cases of sums is much larger than the idea of sum types themselves. And that thinking about sums is thus more fixating on a specific solution, instead of specific problems.

Anyway. Just thought it's interesting.

@alanfo

This comment has been minimized.

Copy link

commented Aug 9, 2019

@Merovius makes an excellent point about the latest generic design being able to deal with some of the use cases of sum types. For example, this function which was used earlier in the thread:

func addOne(x int|float64) int|float64 {
    switch x := x.(type) {
    case int:
        return x + 1
    case float64:
         return x + 1
    }
}

would become:

contract intOrFloat64(T) {
    T int, float64
}

func addOne(type T intOrFloat64) (x T) T {
    return x + 1
}

As far as sum types themselves are concerned, if generics eventually land, I'd be even more dubious than I am now about whether the benefits of introducing them would outweigh the costs for a simple language such as Go.

However, if something were to be done, then the simplest and least disruptive solution IMO would be @ianlancetaylor's idea of 'restricted interfaces' which would be implemented in exactly the same way as 'unrestricted' interfaces are today but could only be satisfied by the specified types. In fact, if you took a leaf out of the generic design's book, and made the type constraint the first line of the interface block:

type intOrFloat64 interface{ type int, float64 }    

then this would be completely backwards compatible as you wouldn't need a new keyword (such as restrict) at all. You could still add methods to the interface and it would be a compile time error if the methods were not supported by all of the specified types.

I see no problem at all assigning values to a variable of the restricted interface type. If the type of the value on the RHS (or the default type of an untyped literal) was not an exact match for one of the specified types then it simply would not compile. So we'd have:

var v1 intOrFloat64 = 1        // compiles, dynamic type int
var v2 intOrFloat64 = 1.0      // compiles, dynamic type float64
var v3 intOrFloat64 = 1 + 2i   // doesn't compile, complex128 is not a specified type

It would be a compile time error for the cases of a type switch to not match a specified type and an exhaustiveness check could be implemented. However, a type assertion would still be needed to convert the restricted interface value to a value of its dynamic type as it is today.

Zero values are not a problem with this approach (or at any rate no more of a problem than they are today with interfaces generally). The zero value of a restricted interface would be nil (implying that it doesn't currently contain anything) and the specified types would of course have their own zero values, internally, which would be nil for nilable types.

All this seems perfectly workable to me though, as I said earlier, is the compile time safety gained really worth the additional complexity - I have my doubts as I've never really felt the need for sum types in my own programming.

@Goodwine

This comment has been minimized.

Copy link

commented Aug 9, 2019

IIUC the generics thing won't be of dynamic types, so this whole point doesn't stand. However if interfaces are allowed to work as contracts (which I doubt) it wouldn't solve exhaustive checks and enums which is what (I think, maybe not?) sumtypes is about.

@griesemer

This comment has been minimized.

Copy link
Contributor

commented Aug 9, 2019

@alanfo, @Merovius Thanks for the cue; it's interesting that this discussing is turning in this direction:

I like to turn the viewpoint around for just a split second: I'm trying to understand why contracts cannot be replaced entirely with parameterized interfaces that permit the type restriction mentioned above. At the moment I don't see any strong technical reason, except that such "sum" interface types, when used as "sum" types would want to restrict the possible dynamic values to exactly the types enumerated in the interface, while - if the same interface were used in contract position - the enumerated types in the interface would need to serve as underlying types to be a reasonably useful generic restriction.

@alanfo

This comment has been minimized.

Copy link

commented Aug 9, 2019

@Goodwine
I wasn't suggesting that the generics design would address everything that one might want to do with sum types - as @Merovius clearly explained in his last post they won't. In particular, the type constraints proposed for generics only cover the built-in types and any types derived from them. From a sum type viewpoint the former is too narrow and the latter too wide.

However, the generics design would enable one to write a function which operates on a limited set of types which the compiler would enforce and this is something that we can't do at all at present.

As far as restricted interfaces are concerned, the compiler would know the precise types that could be used and it would therefore become feasible for it to do an exhaustive check in a type switch statement.

@alanfo

This comment has been minimized.

Copy link

commented Aug 9, 2019

@griesemer

I'm puzzled by what you say as I thought the draft generics design document explained quite clearly (in the section "Why not use interfaces instead of contracts") why the latter were considered a better vehicle than the former for expressing generic constraints.

In particular, a contract can express a relationship between type parameters and therefore only a single contract is needed. Any of its type parameters can be used as the receiver type of a method listed in the contract.

The same cannot be said of an interface, parameterized or not. If they had any constraints at all, each type parameter would need a separate interface.

This makes it more awkward to express a relationship between type parameters using interfaces though not impossible as the graph example showed.

However, if you're thinking that we could "kill two birds with one stone" by adding type constraints to interfaces and then using them for both generic and sum type purposes, then (apart from the problem you mentioned) I think you're probably right that this would be technically feasible.

I guess it wouldn't really matter if interface type constraints could include 'non built-in' types as far as generics were concerned though some way would need to be found to restrict them to the exact types (and not derived types as well) so they would be suitable for sum types. Perhaps we could use const type for the latter (or even just const) if we are to stick with the current keywords.

@stevenblenkinsop

This comment has been minimized.

Copy link

commented Aug 10, 2019

@griesemer There are a few reasons why parameterized interface types aren't a direct replacement for contracts.

  1. The type parameters are the same as on other parameterized types.
    In a type like

    type C2(type T C1) interface { ... }

    the type parameter T exists outside the interface itself. Any type argument passed as T has to be already known to satisfy contract C1, and the body of the interface cannot further constrain T. This is different from contract parameters, which are constrained by the body of the contract as a result of being passed into it. This would mean that each type parameter to a function would have to be independently constrained before being passed as a parameter to the constraint on any other type parameter.

  2. There's no way to name the receiver type in the body of the interface.
    Interfaces would have to let you write something like:

    type C3(type U C1) interface(T) {
        Add(T) T
    }

    where T denotes the receiver type.

  3. Some interface types would not satisfy themselves as generic constraints.
    Any operations relying on multiple values of the receiver type are not compatible with dynamic dispatch. These operations would therefore not be usable on interface values. This would mean that the interface would not satisfy itself (for example as the type argument to a type parameter constrained by the same interface). This would be surprising. One solution is just not to allow the creation of interface values for such interfaces at all, but this would disallow the use case being envisioned here anyways.

As for distinguishing between underlying type constraints and type identity constraints, there is one method that might work. Imagine we could define custom constraints, like

contract (T) indenticalTo(U) {
    *T *U
}

(Here, I'm using an invented notation to specify a single type as the "receiver". I'll pronounce a contract with an explicit receiver type as "constraint", just as a func with a receiver is pronounced "method". The parameters after the contract name are normal type parameters and cannot appear on the left hand side of a constraint clause in the body of the constraint.)

Because the underlying type of a literal pointer type is itself, this constraint implies that T is identical to U. Because this is declared as a constraint, you could write (identicalTo(int)), (identicalTo(uint)), ... as a constraint disjunction.

@urandom

This comment has been minimized.

Copy link

commented Aug 10, 2019

While contracts might be useful to express some sort of sum types, I don't think you can express generic sum types with them. From what I've seen from the draft, one has to list concrete types, so you cannot write something like this:

contract Foo(T, U) {
    T U, int64
}

Which one would need to express a generic sum type of an unknown type and one/more known types. Even if the design did allow for such constructs, they would look weird when used, since both parameters would effectively be the same thing.

@alanfo

This comment has been minimized.

Copy link

commented Aug 10, 2019

I've been thinking some more about how the draft generics design might change if interfaces were extended to include type constraints and then used to replace contracts in the design.

It's perhaps easiest to analyze the situation if we consider different numbers of type parameters:

No parameters

No change :)

One parameter

No real problems here. A parameterized interface (as opposed to a non-generic one) would only be needed if either the type parameter referred to itself and/or some other independent fixed type(s) were needed to instantiate the interface.

Two or more parameters

As mentioned previously each type parameter would need to be constrained individually if it needed a constraint at all.

A parameterized interface would only be needed if:

  1. The type parameter referred to itself.

  2. The interface referred to another type parameter or parameters which had already been declared in the type parameter section (presumably we wouldn't want to backtrack here).

  3. Some other independent fixed type(s) were needed to instantiate the interface.

Of these (2) is really the only troublesome case as it would rule out type parameters referring to each other such as in the graph example. Whether one declared 'Node' or 'Edge' first, its constraining interface would still need the other one to be passed as a type parameter.

However, as indicated in the design document, you could work around this by declaring non-parameterized (as they don't refer to themselves) NodeInterface and EdgeInterface at top level as there would then be no problem in their referring to each other whatever the declaration order. You could then use these interfaces to constrain the Graph struct's type parameters and those of its associated 'New' method.

So it doesn't look like there are any insuperable problems here even if the contracts idea is nicer.

Presumably, comparable could now just become a built-in interface rather than a contract.

Interfaces could, of course, be embedded in each other as they can already.

I'm not sure how one would deal with the pointer method problem (in those cases where these would need to be specified in the contract) as you can't specify a receiver for an interface method. Perhaps some special syntax (such as preceding the method name with an asterisk) would be needed to indicate a pointer method.

Turning now to @stevenblenkinsop's observations, I wonder whether it would make life easier if parameterized interfaces didn't allow their own type parameters to be constrained in any way at all? I'm not sure this is really a useful feature anyway unless someone can think of a sensible use case.

Personally, I don't regard it as surprising that some interface types will not be able to satisfy themselves as generic constraints. An interface type is not a valid receiver type in any case and so can have no methods.

Although Steven's idea of an identicalTo() built in function would work, it seems to me to be potentially long-winded for specifying sum types. I'd prefer a syntax which allows one to specify a whole line of types as being exact.

@urandom is correct, of course, that as the generics draft currently stands one can only list concrete (built-in or aggregate built-in) types. However, this would clearly have to change if restricted interfaces are used instead for both generics and sum types. So I wouldn't rule out that something like this would be permitted in a unified environment:

interface Foo(T) {
    const type T, int64  // 'const' indicates types are exact i.e. no derived types
}
@tema3210

This comment has been minimized.

Copy link

commented Aug 10, 2019

why we can not just add Discriminated Unions to language instead of inventing another walk around of their absence?

@Merovius

This comment has been minimized.

Copy link

commented Aug 10, 2019

@griesemer You might or might not be aware, but I've been in favor of using interfaces to specify constraints from the beginning :) I no longer think the exact ideas I bring up in that post are the way to go (especially the things I suggest to address operators). And I do like the most recent iteration of the contracts design a lot more than the previous one. But in general, I completely agree that (possibly extended) interfaces as constraints are viable and worth considering.

@urandom

I don't think you can express generic sum types with them

I want to re-iterate that my point wasn't "you can build sum types with them", but "you can solve some problems that sum types solve with them". If your problem statement is "I want sum types", then it's not surprising that sum types are the only solution. I simply wanted to express that it might be possible to do without them, if we focus on the problems you want to solve with them.

@alanfo

This makes it more awkward to express a relationship between type parameters using interfaces though not impossible as the graph example showed.

I think "awkward" is subjective. Personally, I find using parameterized interfaces more natural and the graph example a very good illustration. To me, a Graph is an entity, not a relationship between a kind of Edge and a kind of Node.

But TBH, I don't think either of them is really more or less awkward - you write pretty much exactly the same code to express pretty much exactly the same things. And FWIW, there is prior art for this. Haskell type-classes behave a lot like interfaces and as that wiki-article points out, using multi-parameter type-classes to express relationships between types is a pretty normal thing to do.

@stevenblenkinsop

There's no way to name the receiver type in the body of the interface.

The way you would address that is with type-arguments at usage-site. i.e.

type Adder(type T) interface {
    Add(t T) T
}

func Sum(type T Adder(T)) (vs []T) T {
    var t T
    for _, v := range vs {
        t = t.Add(v)
    }
    return t
}

This requires some care as to how unification works, so that you can allow self-referencing type-parameters, but I think it can be made to work.

Your 1. and 3. I don't really understand, I have to admit. I would benefit from some concrete examples.


Anyway, it's a bit disingenuous to drop this at the end of continuing this discussion, but this is probably not the right issue to talk through the minutiae of the generics design. I only brought it up to widen the design space for this issue a bit :) Because it felt like it has been a while since new ideas where brought into the discussion around sum types.

@tema3210

This comment has been minimized.

Copy link

commented Aug 10, 2019

```go
switch task {
case tasks.SearchAdd:
    // task is tasks.SearchAdd in this scope
case tasks.SearchUpdate:
case tasks.SearchDelete:
}

would be much more Goish in spirit than any other approach Go allows in its current state. No need for Haskellish pattern matching, juist diving up to the certain type is more than enough.
I don't think that manipulating meaning of task variable is good idea, although acceptable.

Good luck with your visitors then.

Why you think that pattern matching can't be done in Go? If you lack of examples of patten matching, see, for example, Rust.

@skybrian

This comment has been minimized.

Copy link
Contributor

commented Aug 10, 2019

@Merovius re: "To me, a Graph is an entity"

Is it a compile-time entity or does it have a representation at runtime? One of the major differences between contracts and interfaces is that an interface is a runtime object. It participates in garbage collection, has pointers to other runtime objects, and so on. Converting from a contract to an interface would mean introducing a new, temporary runtime object that has pointers to the nodes/vertexes it contains (how many?), which seems awkward when you have a collection of graph functions, each of which might more naturally take parameters pointing at various parts of graphs in their own ways, depending on the function's needs.

Your intuition might be misled by using "Graph" for a contract, since "Graph" seems object-like and the contract isn't really specifying any particular subgraph; it's more like defining a set of terms to use later, as you would in mathematics or law. In some cases you might want both a graph contract and a graph interface, resulting in an annoying name clash. I can't think of a better name off the top of my head, though.

By contrast, a discriminated union is a runtime object. While not constraining the implementation, you need to think of what an array of them might be like. An N-item array needs N discriminators and N values, and there are a variety of ways that might be done. (Julia has interesting representations, sometimes putting the discriminators and values in separate arrays.)

@mathieudevos

This comment has been minimized.

Copy link

commented Aug 12, 2019

To suggest a reduction of errors which are currently occurring all over the place with the interface{}schemes, but to remove the continuous typing of the | operator, I would suggest the following:

type foobar union {
    int
    float64
}

Just the use-case alone of replacing many interface{} with this kind of type-safety would be a massive gain for the library. Just looking at half the things in the crypto library could use this.

Issues such as: ah you gave in ecdsa.PrivateKey instead of *ecdsa.PrivateKey - here's a generic error that only ecdsa.PrivateKey is supported. The simple fact that these should be clear union types would increase type safety quite a bit.

While this suggestion takes up more space compared to int|float64 it does force the user to think about this. Keeping the code-base a lot cleaner.

@tema3210

This comment has been minimized.

Copy link

commented Aug 14, 2019

To suggest a reduction of errors which are currently occurring all over the place with the interface{}schemes, but to remove the continuous typing of the | operator, I would suggest the following:

type foobar union {
    int
    float64
}

Just the use-case alone of replacing many interface{} with this kind of type-safety would be a massive gain for the library. Just looking at half the things in the crypto library could use this.

Issues such as: ah you gave in ecdsa.PrivateKey instead of *ecdsa.PrivateKey - here's a generic error that only ecdsa.PrivateKey is supported. The simple fact that these should be clear union types would increase type safety quite a bit.

While this suggestion takes up more space compared to int|float64 it does force the user to think about this. Keeping the code-base a lot cleaner.

See this(comment), it's my proposal.

Actually we can introduce both our ideas into language. This will lead to existance of two native ways to do ADT, but with different syntaxes.

My proposal for features, especially pattern matching, your for compability and ability to benefit from the feature for old code bases.

But looks like overkill, isn't it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.