proposal: spec: add sum types / discriminated unions #19412

Open
DemiMarie opened this Issue Mar 5, 2017 · 234 comments

@DemiMarie

DemiMarie commented Mar 5, 2017

This is a proposal for sum types, also known as discriminated unions. Sum types in Go should essentially act like interfaces, except that:

  • they are value types, like structs
  • the types contained in them are fixed at compile-time

Sum types can be matched with a switch statement. The compiler checks that all variants are matched. Inside the arms of the switch statement, the value can be used as if it is of the variant that was matched.
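
For comparison, here is a minimal sketch of how a closed set of variants is commonly approximated in today's Go: an interface with an unexported marker method plus a type switch. The names Shape, Square, Rect and Area are hypothetical and not part of the proposal. The compiler does not check exhaustiveness here, which is the gap the proposal wants to close.

```go
package main

import "fmt"

// Shape is closed only by convention: the unexported marker method keeps
// other packages from adding implementations. All names are hypothetical.
type Shape interface{ isShape() }

type Square struct{ Side float64 }
type Rect struct{ W, H float64 }

func (Square) isShape() {}
func (Rect) isShape()   {}

// Area switches on the variant. Nothing checks that every variant is
// handled, so the default arm exists to catch omissions at run time.
func Area(s Shape) float64 {
	switch v := s.(type) {
	case Square:
		return v.Side * v.Side
	case Rect:
		return v.W * v.H
	default:
		panic(fmt.Sprintf("unhandled shape %T", s))
	}
}

func main() {
	fmt.Println(Area(Square{Side: 2}))
	fmt.Println(Area(Rect{W: 2, H: 3}))
}
```

Adding a third variant compiles without touching Area; the omission only surfaces when the default arm panics.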

@ianlancetaylor

Contributor

ianlancetaylor commented Mar 6, 2017

This has been discussed several times in the past, starting from before the open source release. The past consensus has been that sum types do not add very much to interface types. Once you sort it all out, what you get in the end is an interface type where the compiler checks that you've filled in all the cases of a type switch. That's a fairly small benefit for a new language change.

If you want to push this proposal along further, you will need to write a more complete proposal doc, including: What is the syntax? Precisely how do they work? (You say they are "value types", but interface types are also value types). What are the trade-offs?

@bradfitz bradfitz added the Proposal label Mar 6, 2017

@bradfitz bradfitz added this to the Proposal milestone Mar 6, 2017

@rsc rsc changed the title from Proposal: Discriminated unions to proposal: spec: add sum types / discriminated unions Mar 6, 2017

@rsc

Contributor

rsc commented Mar 6, 2017

@griesemer

Contributor

griesemer commented Mar 6, 2017

I think this is too significant a change of the type system for Go1 and there's no pressing need.
I suggest we revisit this in the larger context of Go 2.

@rsc rsc added the Go2 label Mar 13, 2017

@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

Thanks for creating this proposal. I've been toying with this idea for a year or so now.
The following is as far as I've got with a concrete proposal. I think
"choice type" might actually be a better name than "sum type", but YMMV.

Sum types in Go

A sum type is represented by two or more types combined with the "|"
operator.

type: type1 | type2 ...

Values of the resulting type can only hold one of the specified types. The
type is treated as an interface type - its dynamic type is that of the
value that's assigned to it.

As a special case, "nil" can be used to indicate whether the value can
become nil.

For example:

type maybeInt nil | int

The method set of the sum type holds the intersection of the method set
of all its component types, excluding any methods that have the same
name but different signatures.

Like any other interface type, a sum type may be the subject of a dynamic
type conversion. In type switches, the first arm of the switch that
matches the stored type will be chosen.

The zero value of a sum type is the zero value of the first type in
the sum.

When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen.

For example:

var x int|float64 = 13

would result in a value with dynamic type int, but

var x int|float64 = 3.13

would result in a value with dynamic type float64.
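
The same outcome can be observed today with interface{}, where an untyped constant takes its default type on assignment. The sketch below uses interface{} as a stand-in for the proposed int|float64; dynamicType is a hypothetical helper.

```go
package main

import "fmt"

// dynamicType reports the dynamic type stored in an interface value.
func dynamicType(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var x interface{} = 13   // untyped integer constant defaults to int
	var y interface{} = 3.13 // untyped floating-point constant defaults to float64
	fmt.Println(dynamicType(x)) // int
	fmt.Println(dynamicType(y)) // float64
}
```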

Implementation

A naive implementation could implement sum types exactly as interface
values. A more sophisticated approach could use a representation
appropriate to the set of possible values.

For example a sum type consisting only of concrete types without pointers
could be implemented with a non-pointer type, using an extra value to
remember the actual type.

For sum-of-struct-types, it might even be possible to use spare padding
bytes common to the structs for that purpose.

@bcmills

Member

bcmills commented Mar 22, 2017

@rogpeppe How would that interact with type assertions and type switches? Presumably it would be a compile-time error to have a case on a type (or assertion to a type) that is not a member of the sum. Would it also be an error to have a nonexhaustive switch on such a type?

@josharian

Contributor

josharian commented Mar 22, 2017

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
  case int:
    // ...

and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

Or can sum types contain only concrete types?

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

Yes, I think it would be reasonable to make a case that can't be matched
a compile-time error. Not sure about whether it's
a good idea to allow non-exhaustive switches on such a type - we
don't require exhaustiveness anywhere else. One thing that might
be good though: if the switch is exhaustive, we could not require a default
to make it a terminating statement.

That means that you can get the compiler to error if you have:

func addOne(x int|float64) int|float64 {
    switch x := x.(type) {
    case int:
        return x + 1
    case float64:
         return x + 1
    }
}

and you change the sum type to add an extra case.
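
As a point of reference, this is roughly what the same function looks like in today's Go, with interface{} standing in for the proposed int|float64. Because the compiler cannot know the switch is exhaustive, a trailing panic (or a default arm) is needed for the function to end in a terminating statement.

```go
package main

import "fmt"

// addOne accepts interface{} as a stand-in for the proposed int|float64.
func addOne(x interface{}) interface{} {
	switch x := x.(type) {
	case int:
		return x + 1
	case float64:
		return x + 1
	}
	// Required today: without this line the compiler reports a missing
	// return, since it cannot prove the switch covers every case.
	panic(fmt.Sprintf("addOne: unexpected type %T", x))
}

func main() {
	fmt.Println(addOne(1))
	fmt.Println(addOne(1.5))
}
```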

@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
case int:
// ...

and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

t can't contain an interface{} containing an int. t is an interface
type just like any other interface type, except that it can only
contain the enumerated set of types that it consists of.
Just like an interface{} can't contain an interface{} containing an int.
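
The existing behaviour being referred to can be checked directly. In this small sketch (rewrap is a hypothetical helper), assigning one interface value to another copies the dynamic type and value rather than nesting the interfaces.

```go
package main

import "fmt"

// rewrap stores an interface value in a second interface value and
// reports the dynamic type the second one ends up with.
func rewrap(inner interface{}) string {
	var outer interface{} = inner // copies dynamic type and value, no nesting
	return fmt.Sprintf("%T", outer)
}

func main() {
	fmt.Println(rewrap(42)) // int, not interface{}
}
```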

Sum types can match interface types, but they still just get a concrete
type for the dynamic value. For example, it would be fine to have:

type R io.Reader | io.ReadCloser

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

According to the proposal above, you get the first item
in the sum that the value can be assigned to, so
you'd get the nil interface.

In fact interface{} | nil is technically redundant, because any interface{}
can be nil.

For []int | nil, a nil []int is not the same as a nil interface, so the
concrete value of ([]int|nil)(nil) would be []int(nil) not untyped nil.

@bcmills

Member

bcmills commented Mar 22, 2017

The []int | nil case is interesting. I would expect the nil in the type declaration to always mean "the nil interface value", in which case

type T []int | nil
var x T = nil

would imply that x is the nil interface, not the nil []int.

That value would be distinct from the nil []int encoded in the same type:

var y T = []int(nil)  // y != x
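
The distinction drawn here already exists for ordinary interfaces. A minimal sketch, using interface{} as a stand-in for the proposed type T (isNilInterface is a hypothetical helper):

```go
package main

import "fmt"

// isNilInterface reports whether v is the nil interface value, that is,
// an interface with no dynamic type at all.
func isNilInterface(v interface{}) bool { return v == nil }

func main() {
	var x interface{}              // nil interface: no dynamic type
	var y interface{} = []int(nil) // dynamic type []int holding a nil slice
	fmt.Println(isNilInterface(x)) // true
	fmt.Println(isNilInterface(y)) // false
}
```
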
@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be? My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil.

It seems overly subtle.

Exhaustive type switches would be nice. You could always add an empty default: when it's not the desired behavior.

@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

The proposal says "When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen."

So, with:

type T []int | nil
var x T = nil

x would have concrete type []int because nil is assignable to []int and []int is the first element of the type. It would be equal to any other []int (nil) value.

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be?

The proposal says "The zero value of a sum type is the zero value of the first type in
the sum.", so the answer is int64(0).

My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil

No, it would just be the usual interface nil value in that case. That type (interface{} | int) is redundant. Perhaps it might be a good idea to make it a compiler error to specify sum types where one element is a superset of another, as I can't currently see any point in defining such a type.

@ianlancetaylor

Contributor

ianlancetaylor commented Mar 22, 2017

The zero value of a sum type is the zero value of the first type in the sum.

That is an interesting suggestion, but since the sum type must record somewhere the type of the value that it currently holds, I believe it means that the zero value of the sum type is not all-bytes-zero, which would make it different from every other type in Go. Or perhaps we could add an exception saying that if the type information is not present, then the value is the zero value of the first type listed, but then I'm not sure how to represent nil if it is not the first type listed.

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

So (stuff) | nil only makes sense when nothing in (stuff) can be nil and nil | (stuff) means something different depending on whether anything in stuff can be nil? What value does nil add?

@ianlancetaylor I believe many functional languages implement (closed) sum types essentially like how you would in C

struct {
    int which;
    union {
         A a;
         B b;
         C c;
    } summands;
}

If which indexes into the union's fields in order, 0 = a, 1 = b, 2 = c, the zero value definition works out to all bytes being zero. And you'd need to store the types elsewhere, unlike with interfaces. You'd also need special handling for the nil tag of some kind wherever you store the type info.

That would make unions value types instead of special interfaces, which is also interesting.
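
A hand-written Go approximation of that layout might look like the sketch below. It is only an illustration: Go has no untagged unions, so the variants sit side by side instead of sharing storage, and the names intOrFloat, fromFloat and describe are hypothetical. The point it shows is that a tag whose zero value selects the first variant makes the all-zero struct read as the zero value of that first type.

```go
package main

import "fmt"

// intOrFloat is a hand-written stand-in for a hypothetical int|float64.
// Unlike a real union, both fields occupy their own storage.
type intOrFloat struct {
	which uint8 // 0 = int (first variant), 1 = float64
	i     int
	f     float64
}

// fromFloat builds a value holding the float64 variant.
func fromFloat(f float64) intOrFloat { return intOrFloat{which: 1, f: f} }

// describe renders the active variant.
func (v intOrFloat) describe() string {
	if v.which == 1 {
		return fmt.Sprintf("float64(%v)", v.f)
	}
	return fmt.Sprintf("int(%d)", v.i)
}

func main() {
	var zero intOrFloat // all bytes zero, so the tag selects int
	fmt.Println(zero.describe())           // int(0)
	fmt.Println(fromFloat(2.5).describe()) // float64(2.5)
}
```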

@shanemhansen

Contributor

shanemhansen commented Mar 22, 2017

Is there a way to make the all zero value work if the field which records the type has a zero value representing the first type? I'm assuming that one possible way for this to be represented would be:

type A = B|C
struct A {
  choice byte // value 0 or 1
  value ?// (thing big enough to store B | C)
}

[edit]

Sorry @jimmyfrasche beat me to the punch.

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

Is there anything added by nil that couldn't be done with

type S int | string | struct{}
var None struct{}

?

That seems like it avoids a lot of the confusion (that I have, at least)

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

Or better

type (
     None struct{}
     S int | string | None
)

that way you could type switch on None and assign with None{}

@bcmills

Member

bcmills commented Mar 22, 2017

@jimmyfrasche struct{} is not equal to nil. It's a minor detail, but it would make type-switches on sums needlessly(?) diverge from type-switches on other types.

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

@bcmills It wasn't my intent to claim otherwise—I meant that it could be used for the same purpose as differentiating a lack of value without overlapping with the meaning of nil in any of the types in the sum.

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

@rogpeppe what does this print?

// r is an io.Reader interface value holding a type that also implements io.Closer
var v io.ReadCloser | io.Reader = r
switch v.(type) {
case io.ReadCloser: fmt.Println("ReadCloser")
case io.Reader: fmt.Println("Reader")
}

I would assume "Reader"

@bcmills

Member

bcmills commented Mar 22, 2017

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

(And I would also expect sums which include only interface types to use no more space than a regular interface, although I suppose that an explicit tag could save a bit of lookup overhead in the type-switch.)

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

@bcmills it's the assignment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

@ianlancetaylor That's an excellent point to raise, thanks. I don't think it's hard to get around though, although it does imply that my "naive implementation" suggestion is itself too naive. A sum type, although treated as an interface type, does not have to actually contain a direct pointer to the type and its method set - instead it could, when appropriate, contain an integer tag that implies the type. That tag could be non-zero even when the type itself is nil.

Given:

 var x int | nil = nil

the runtime value of x need not be all zeros. When switching on the type of x or converting
it to another interface type, the tag could be indirected through a small table containing
the actual type pointers.

Another possibility would be to allow a nil type only if it's the first element, but
that precludes constructions like:

var t nil | int
var u float64 | t
@rogpeppe

Contributor

rogpeppe commented Mar 22, 2017

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

Yes.

@bcmills it's the assignment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

I don't get this. Why would "this [...] have to be valid for the type switch to print ReadCloser"?
Like any interface type, a sum type would store no more than the concrete value of what's in it.

When there are several interface types in a sum, the runtime representation is just an interface value - it's just that we know that the underlying value must implement one or more of the declared possibilities.

That is, when you assign something to a type (I1 | I2) where both I1 and I2 are interface types, it's not possible to tell later whether the value you put in was known to implement I1 or I2 at the time.
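
This matches how type switches on ordinary interfaces behave today. In the sketch below (classify, closable and plain are hypothetical helpers), the value is passed with static type io.Reader, yet the switch still selects io.ReadCloser, because only the dynamic value is consulted.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// classify mirrors the type switch from the thread on an ordinary interface.
func classify(r io.Reader) string {
	switch r.(type) {
	case io.ReadCloser:
		return "ReadCloser"
	case io.Reader:
		return "Reader"
	}
	return "nil"
}

// closable returns a value with static type io.Reader whose dynamic
// value also has a Close method.
func closable() io.Reader { return io.NopCloser(strings.NewReader("x")) }

// plain returns a reader without a Close method.
func plain() io.Reader { return strings.NewReader("x") }

func main() {
	fmt.Println(classify(closable())) // ReadCloser
	fmt.Println(classify(plain()))    // Reader
}
```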

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

If you have a type that's io.ReadCloser | io.Reader you can't be sure when you type switch or assert on io.Reader that it's not an io.ReadCloser unless assignment to a sum type unboxes and reboxes the interface.

@jimmyfrasche

Member

jimmyfrasche commented Mar 22, 2017

Going the other way, if you had io.Reader | io.ReadCloser it would either never accept an io.ReadCloser because it goes strictly left-to-right or the implementation would have to search for the "best matching" interface from all interfaces in the sum but that cannot be well defined.

@peter-mckenzie

peter-mckenzie commented Feb 19, 2018

What about var x, y int | float64? What are the rules here, when adding these? Which lossy conversion gets made (and why)? What will be the result type?

@Merovius no lossy conversion gets implicitly made, although I can see how my wording could give that impression, sorry. Here, a simple x + y would not compile because it implies a possible implicit conversion. But either of the following would compile:

z = int(x) + int(y)
z = float64(x) + float64(y)

Similarly your xyz example would not compile because it requires possible implicit conversions.

I think "supported the intersection of the operations supported" sounds nice but doesn't quite convey what I was intending. Adding something like "compiles for all component types" helps describe how I think it could work.

Another example is if all the component types are slices and maps. Would be nice to be able to call len on the sum type without needing a type switch.
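
The "compiles for all component types" rule resembles how type-set constraints work for type parameters in Go 1.18, which was released well after this discussion. The sketch below is only an analogy: a constraint such as the hypothetical sliceOrMap restricts type parameters and cannot be used as the type of a variable, so it is not a sum type.

```go
package main

import "fmt"

// sliceOrMap is a type-set constraint, not a sum type: it can constrain
// a type parameter but cannot be the type of a variable.
type sliceOrMap interface {
	[]int | map[string]int
}

// size compiles because len is valid for every type in the constraint.
func size[T sliceOrMap](v T) int { return len(v) }

func main() {
	fmt.Println(size([]int{1, 2, 3}))
	fmt.Println(size(map[string]int{"a": 1}))
}
```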

@peter-mckenzie

peter-mckenzie commented Feb 20, 2018

All of int, string and rune have a + operator; what is the above printing, why and most of all, how can the result not be completely confusing?

Just wanted to add that my "What if the sum type supported the intersection of the operations supported by its component types?" was inspired by the Go Spec's description of a type as "A type determines a set of values together with operations and methods specific to those values.".

The point I was trying to make is that a type is more than just values and methods, and thus a sum type could try to capture the commonality of that other stuff from its component types. This "other stuff" is more nuanced than just a set of operators.

Another example is comparison to nil:

var x []int | []string
fmt.Println(x == nil)  // Prints true
x = []string(nil)
fmt.Println(x == nil)  // Still prints true

At least one component type is comparable to nil, so we allow the sum type to be compared to nil without a type switch. Of course this is somewhat at odds with how interfaces currently behave, but that might not be a bad thing per #22729

Edit: equality testing is a bad example here as I think it should be more permissive, and only require a potential match from one or more component types. Mirrors assignment in that respect.

@Merovius

Merovius commented Feb 20, 2018

The problem is, that the result will either a) have the same problems that automatic conversions have or b) will be extremely (and IMO confusingly) limited in scope - namely, all the operators would only work with untyped literals, at best.

I also have another issue, which is that allowing that will even further limit their robustness against evolution of their constituent types - now the only types you could ever add while preserving backwards compatibility are ones which allow all the operations of their constituent types.

All of this just seems really messy to me, for a very small (if any) tangible benefit.

@Merovius

Merovius commented Feb 20, 2018

now the only types you could ever add while preserving backwards compatibility are ones which allow all the operations of their constituent types.

Oh and to be explicit about this one too: It implies that you can never decide that you'd like to extend a parameter or return type or variable or… from a singleton-type to a sum. Because adding any new type will make some operations (like assignments) fail to compile.

@peter-mckenzie

peter-mckenzie commented Feb 20, 2018

@Merovius note that a variant of the compatibility issue already exists with the original proposal because "The method set of the sum type holds the intersection of the method set
of all its component types". So if you add a new component type that doesn't implement that method set, then that'll be a non backwards compatible change.

Oh and to be explicit about this one too: It implies that you can never decide that you'd like to extend a parameter or return type or variable or… from a singleton-type to a sum. Because adding any new type will make some operations (like assignments) fail to compile.

Assignment behavior would remain as described by @rogpeppe but overall I'm not sure I understand this point.

If nothing else, I think the original rogpeppe proposal needs to be clarified regarding the behavior of the sum type outside of a type switch. Assignment and method set are covered, but that's all. What about equality? I think we can do better than what interface{} does:

var x int | float64
fmt.Println(x == "hello")  // compilation error?
x = 0.0
fmt.Println(x == 0) // true or false?  I vote true :-)
@Merovius

Merovius commented Feb 20, 2018

So if you add a new component type that doesn't implement that method set, then that'll be a non backwards compatible change.

You can always add methods, but you can't overload operators to work on new types. Which is precisely the difference - in their proposal, you can only call the common methods on a sum-value (or assign to it), unless you unwrap it with a type-assertion/-switch. Thus, as long as the type you add has the necessary methods, it would not be a breaking change. In your proposal, it still would be a breaking change, because users might use operators you can't overload.

(you might want to point out that adding types to the sum would still be a breaking change, because type-switches would not have the new type in them. Which is exactly why I'm not in favor of the original proposal either - I don't want closed sums for that very reason)

Assignment behavior would remain as described by @rogpeppe

Their proposal only talks about assignment to a sum-value, I talk about assignment from a sum-value (to one of its constituent parts). I agree that their proposal doesn't allow this either, but the difference is, that their proposal isn't about adding this possibility. i.e. my argument is exactly, that the semantics you suggest are not particularly beneficial, because in practice, the usage they get is severely limited.

fmt.Println(x == "hello") // compilation error?

This would probably be added to their proposal as well. We already have an equivalent special case for interfaces, namely

A value x of non-interface type X and a value t of interface type T are comparable when values of type X are comparable and X implements T. They are equal if t's dynamic type is identical to X and t's dynamic value is equal to x.

fmt.Println(x == 0) // true or false? I vote true :-)

Presumably false. Given, that the similar

var x int|float64 = 0.0
y := 0
fmt.Println(x == y)

should be a compile-error (as we concluded above), this question only really makes sense when comparing to untyped numerical constants. At that point it kind of depends how this is added to the spec. You could argue, that this is similar to assigning a constant to an interface type and thus it should have its default type (and then the comparison would be false). Which IMO is more than fine, we already accept that situation today without much fuss. You could, however, also add a case to the spec for untyped constants that would cover the case of assigning/comparing them to sums and solve the question that way.
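
The situation "we already accept today" can be reproduced with interface{}. In this sketch (equalsZero is a hypothetical helper), the untyped constant 0 takes its default type int when compared with an interface value, so an interface holding a float64 zero does not compare equal to it.

```go
package main

import "fmt"

// equalsZero compares an interface value against the untyped constant 0,
// which is converted to its default type int for the comparison.
func equalsZero(x interface{}) bool { return x == 0 }

func main() {
	fmt.Println(equalsZero(0.0)) // false: dynamic type is float64, constant becomes int
	fmt.Println(equalsZero(0))   // true: dynamic type int, value 0
}
```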

Answering this question either way, however, doesn't necessitate allowing all expressions using sum types that might make sense for the constituent parts.

But to reiterate: I'm not arguing in favor of a different proposal for sums. I'm arguing against this one.

@Merovius

Merovius commented Feb 20, 2018

fmt.Println(x == "hello") // compilation error?

This would probably be added to their proposal as well.

Correction: The spec already covers this compilation error, given that it contains the statement

In any comparison, the first operand must be assignable to the type of the second operand, or vice versa.

@peter-mckenzie

peter-mckenzie commented Feb 21, 2018

@Merovius you make some good points about my variant of the proposal. I'll refrain from debating them further, but I would like to drill into the comparison to 0 question a little further because it applies equally to the original proposal.

fmt.Println(x == 0) // true or false? I vote true :-)

Presumably false. Given, that the similar

var x int|float64 = 0.0
y := 0
fmt.Println(x == y)

should be a compile-error (as we concluded above),

I don't find this example very compelling because if you change the first line to var x float64 = 0.0 then you could use the same reasoning to argue that comparing a float64 to 0 should be false. (Minor points: (a) I assume you meant float64(0) on first line, since 0.0 is assignable to int. (b) x==y should not be a compile error in your example. It should print false though.)

I think your idea that "that this is similar to assigning a constant to an interface type and thus it should have its default type" is more compelling (assuming you meant sum type), so the example would be:

var x,y int|float64 = float64(0), 0
fmt.Println(x == y) // false

I'd still argue that x == 0 should be true though. My mental model is that a type is given to 0 as late as possible. I realize that this is contrary to the current behavior of interfaces which is precisely why I brought it up. I agree that this hasn't led to "much fuss", but the similar issue of comparing interfaces to nil has resulted in quite a lot of confusion. I believe we'd see a similar amount of confusion for comparison to 0 if sum types come into existence and the old equality semantics are kept.

@Merovius

Merovius commented Feb 21, 2018

I don't find this example very compelling because if you change the first line to var x float64 = 0.0 then you could use the same reasoning to argue that comparing a float64 to 0 should be false.

I didn't say it should, I said that presumably it would, given what I perceive as the most likely tradeoff between simplicity/usefulness for how their proposal would be implemented. I wasn't trying to make a value judgement. In fact, if with just as simple rules we could get it to print true, I'd probably tend to prefer it. I'm just not optimistic.

Note, that comparing float64(0) to int(0) (i.e. the example with the sum replaced by var x float64 = 0.0) isn't false, though, it's a compile-time error (as it should be). This is exactly my point; your proposal is only really useful when combined with untyped constants, because for anything else it wouldn't compile.

(a) I assume you meant float64(0) on first line, since 0.0 is assignable to int.

Sure (I was assuming semantics closer to the current "default type" for constant expressions, but I agree that the current wording doesn't imply that).

(b) x==y should not be a compile error in your example. It should print false though.)

No, it should be a compile time error. You have said, that the operation e1 == y, with e1 being a sum-type expression should be allowed if and only if the expression would compile with any choice of constituent type. Given that in my example, x has type int|float64 and y has type int and given that float64 and int are not comparable, this condition is clearly violated.

To make this compile you'd need to drop the condition that substituting any constituent-typed expression needs to compile too; at which point we are in the situation of having to set up rules how types are promoted or converted when used in these expressions (also known as "the C mess").

@sirkon

sirkon commented Jun 16, 2018

The past consensus has been that sum types do not add very much to interface types.

They don't, indeed, for most use cases of Go: trivial network services and utils. But once the system grows larger there's a good chance they are useful.
I am currently writing a heavily distributed service with data consistency guarantees implemented via lots of logic, and I ran into a situation where they would be handy. Nil pointer dereferences (NPDs) became too annoying as the service grew large, and we don't see a sane way to split it.
I mean Go's type system guarantees are a bit too weak for something more complex than typical primitive network services.

But the story with Rust shows it is a bad idea to use sum types for NPD and error handling just like they do in Haskell: there is a typical, natural imperative workflow, and the Haskellish approach doesn't fit well into it.

Example

Consider an ioutil.WriteFile-like function in pseudocode. The imperative flow would look like this

file = open(name, os.write)
if file is error
    return error("cannot open " + name + " writing: " + file.error)
if file.write(data) is error:
    return error("cannot write into " + name + " : " + file.error)
return ok

and how it looks in Rust

match open(name, os.write)
    file
        match file.write(data)
            err
                return error("cannot write into " + name + " : " + err)
            ok
                return ok
    err
        return error("cannot open " + name + " writing: " + err)

it is safe but ugly.

And my proposal:

type result[T, Err] oneof {
    default T
    Error Err
}

and what the program could look like (result[void, string] = !void)

file := os.Open(name, ...)
if !file {
    return result.Error("cannot open " + name + " writing: " + file.Error)
}
if res := file.Write(data); !res {
    return result.Error("cannot write into " + name + " : " + res.Error)
}
return ok

Here the default branch is anonymous and the error branch can be accessed with .Error (once it is known the result is Error). Once it is known the file was opened successfully, the user can access it via the variable itself. In the first if we make sure the file was successfully opened or exit otherwise (and thus further statements know the file is not an error).

As you see, this approach preserves imperative flow and provides type safety. NPD handling can be done in a similar way:

type Reference[T] oneof {
    default T
    nil
}
// Reference[T] = *T

the handling is similar to result

@skybrian

Contributor

skybrian commented Jun 17, 2018

@sirkon, your Rust example doesn't convince me that there's anything wrong with straightforward sum types like in Rust. Rather, it suggests that pattern-matching on sum types might be made more Go-like using if statements. Something like:

ferr := os.Open(name, ...)
if err(e) := ferr {           // conditional match and unpack, initializing e
  return fmt.Errorf("cannot open %v: %v", name, e)
}
ok(f) := ferr                  // unconditional match and unpack, initializing f
werr := f.Write(data)
...

(In the spirit of sum types, it would be compile error if the compiler can't prove that an unconditional match always succeeds because there is exactly one case remaining.)

For basic error-checking, this doesn't seem like an improvement over multiple return values, since it is one line longer and declares one more local variable. However, it would scale better to multiple cases (by adding more if statements), and the compiler could check that all cases are handled.
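
For reference, the multiple-return-value baseline being compared against looks roughly like this in today's Go. It is a sketch of the imperative flow from the thread, not of any proposed syntax; writeFile and roundTrip are hypothetical names, and the error wording is illustrative.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFile follows the imperative flow from the thread using ordinary
// (value, error) returns.
func writeFile(name string, data []byte) error {
	f, err := os.Create(name)
	if err != nil {
		return fmt.Errorf("cannot open %s for writing: %w", name, err)
	}
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return fmt.Errorf("cannot write into %s: %w", name, err)
	}
	return nil
}

// roundTrip writes data to a file in a temporary directory and reads it back.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "sumtypes")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	name := filepath.Join(dir, "out.txt")
	if err := writeFile(name, []byte(data)); err != nil {
		return "", err
	}
	got, err := os.ReadFile(name)
	return string(got), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```

Nothing here forces the caller to check err before using f, which is the guarantee the sum-type variants in this thread are after.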

@Merovius

This comment was marked as off-topic.

Merovius commented Jun 17, 2018

@sirkon

They don't, indeed, for most use cases of Go: trivial network services and utils. But once the system grows larger there's a good chance they are useful.
[…]
I mean Go's type system guarantees are a bit too weak for something more complex than typical primitive network services.

Statements like these are unnecessarily confrontational and derogatory. They are also kind of embarrassing, TBH, because there are extremely large, nontrivial services written in Go. And given that a significant chunk of its developers work at Google, you should just assume that they know better than you, if it is suitable to write large and non-trivial services. Go might not cover all usecases (neither should it, IMO), but it empirically does not only work for "primitive network services".

NPD handling can be done in a similar way

I think this really illustrates that your approach doesn't actually add any significant value. As you point out, it simply adds a different syntax for dereference. But AFAICT nothing is preventing a programmer from using that syntax on a nil-value (which would presumably still panic). i.e. every program that is valid using *p is also valid using p.T (or is it p.default? It's hard to tell what your idea is specifically) and vice versa.

The one advantage sum types can add to error handling and nil-dereferences is that the compiler can enforce that you have to prove that the operation is safe by pattern-matching on it. A proposal that omits that enforcement doesn't seem to bring significant new things to the table (arguably, it is worse than using open sums via interfaces), while a proposal that does include it is exactly what you describe as "ugly".

@sirkon

This comment was marked as off-topic.

sirkon commented Jun 17, 2018

@Merovius

And given that a significant chunk of its developers work at Google, you should just assume that they know better than you,

Blessed are the believers.

As you point out, it simply adds a different syntax for dereference.

again

var written int64
...
res := os.Stdout.Write(data) // Write([]byte) -> Result[int64, string] ≈ !int64
written += res // Will not compile as res is a packed result type
if !res {
    // we are living on non-default res branch thus the only choice left is the default
    return Result.Error(...)
}
written += res // is OK

@sirkon

This comment was marked as off-topic.

sirkon commented Jun 17, 2018

@skybrian

ferr := os.Open(...)

This intermediate variable is what made me abandon this idea. As you see, my approach is specifically for error and nil handling. These tiny tasks are too important and deserve special attention IMO.

@Merovius

This comment was marked as off-topic.

Merovius commented Jun 17, 2018

@sirkon You apparently have very little interest in talking to people eye-to-eye. I'll leave it at that.

@agnivade

This comment was marked as off-topic.

Member

agnivade commented Jun 17, 2018

Let us keep our conversations civil, and avoid nonconstructive comments. We can disagree on things, but still maintain a respectable discourse. https://golang.org/conduct.

@hasufell

This comment was marked as off-topic.

hasufell commented Jun 17, 2018

And given that a significant chunk of its developers work at Google, you should just assume that they know better than you

I doubt you could make that kind of argument at Google.

@sirkon

This comment was marked as off-topic.

sirkon commented Jun 17, 2018

@hasufell that guy is from Germany where they don't have large IT companies with crap interviews to pump interviewer's ego and behemoth management, that's why these words.

@hasufell

This comment was marked as off-topic.

hasufell commented Jun 17, 2018

@sirkon same goes for you. Ad-hominem and social arguments are not useful. This is more than a CoC problem. I've seen these kinds of "social arguments" pop up rather frequently when it's about the core language: compiler devs know better, language designers know better, Google people know better.

No, they don't. There is no intellectual authority. There is just decision authority. Get over it.

@bcmills

Member

bcmills commented Jun 18, 2018

Hiding a few comments to reset the conversation (and thanks @agnivade for trying to get it back on the rails).

Folks, please consider your role in these discussions in light of our Gopher values: everyone in the community has a perspective to bring, and we should strive to be respectful and charitable in how we interpret and respond to each other.

@dchenk

Contributor

dchenk commented Jun 21, 2018

Allow me, please, to add my two cents to this discussion:

We need a way to group different types together by features other than their method sets (as with interfaces). A new grouping feature should allow including primitive (or basic) types, which don’t have any methods, and interface types to be categorized as relevantly similar. We can keep primitive types (boolean, numeric, string, and even []byte, []int, etc.) as they are but enable abstracting away from differences between types where a type definition groups them in a family.

I suggest we add something like a type family construct to the language.

The Syntax

A type family may be defined much like any other type:

type theFamilyName family {
    someType
    anotherType
}

The formal syntax would be something like:
FamilyType = "family" "{" { TypeName ";" } "}" .

A type family may be defined inside of a function signature:

func Display(s family{string; fmt.Stringer}) { /* function body */ }

That is, the one-line definition requires semicolons between the type names.

The zero value of a family type is nil, like with a nil interface.

(Under the hood a value sitting behind the family abstraction is implemented much like an interface.)

The Reasoning

We need something more precise than the empty interface where we want to specify what types are valid as arguments to a function or as returns of a function.

The proposed solution would enable better type safety, fully checked at compile time and adding no additional overhead at runtime.

The point is that Go code should be more self-documenting. What a function can take as an argument should be built into the code itself.

Too much code incorrectly exploits the fact that “interface{} says nothing.” It’s a little embarrassing that such a widely used (and abused) construct in Go, without which we wouldn’t be able to do much, says nothing.

Some Examples

The documentation for the sql.Rows.Scan function includes a large block detailing what types may be passed in to the function:

Scan converts columns read from the database into the following common Go types and special types provided by the sql package:
 *string
 *[]byte
 *int, *int8, *int16, *int32, *int64
 *uint, *uint8, *uint16, *uint32, *uint64
 *bool
 *float32, *float64
 *interface{}
 *RawBytes
 any type implementing Scanner (see Scanner docs)

And for the sql.Row.Scan function the documentation includes the sentence “See the documentation on Rows.Scan for details.” See the documentation for some other function for details? This is not Go-like—and in this case that sentence is not correct because in fact Rows.Scan can take a *RawBytes value but Row.Scan cannot.

The problem is that we are often forced to rely on comments for guarantees and behavior contracts, which the compiler cannot enforce.

When the docs for a function say that the function works just like some other function—“so go see the documentation for that other function”—you can almost guarantee that the function will be misused sometimes. I’ll bet that most people, like myself, have found out that a *RawBytes is not permitted as an argument in Row.Scan only after getting an error from Row.Scan (saying "sql: RawBytes isn't allowed on Row.Scan"). It’s sad that the type system permits such mistakes.

We could instead have:

type Value family {
    *string
    *[]byte
    *int; *int8; *int16; *int32; *int64
    *uint; *uint8; *uint16; *uint32; *uint64
    *bool
    *float32; *float64
    *interface{}
    *RawBytes
    Scanner
}

This way, the value passed in must be one of the types in the given family, and the type switch inside the Rows.Scan function will not need to deal with any unexpected or default cases; there would be another family for the Row.Scan function.
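For comparison, current Go can approximate a closed set of types with an interface that has an unexported marker method, so that only the defining package can add members. The sketch below is an illustration only: `ScanTarget`, the wrapper types, and `describe` are hypothetical names, not part of `database/sql`. It also shows what the family proposal would remove: the wrappers (methods cannot be declared on `*string` and friends) and the fallback case (the compiler does not check that the type switch is exhaustive).

```go
package main

import "fmt"

// ScanTarget is a hypothetical closed set of destination types.
// The unexported method means only this package can add members.
type ScanTarget interface {
	isScanTarget()
}

// Wrapper types are needed because methods cannot be declared on
// built-in types such as *string or *int64.
type StringDest struct{ P *string }
type Int64Dest struct{ P *int64 }
type BoolDest struct{ P *bool }

func (StringDest) isScanTarget() {}
func (Int64Dest) isScanTarget()  {}
func (BoolDest) isScanTarget()   {}

// describe names the destination kind. The compiler does not verify
// that the switch covers every member, so a fallback is still needed.
func describe(t ScanTarget) string {
	switch t.(type) {
	case StringDest:
		return "string"
	case Int64Dest:
		return "int64"
	case BoolDest:
		return "bool"
	default:
		return "unknown"
	}
}

func main() {
	var s string
	fmt.Println(describe(StringDest{&s}))
}
```

A nil `ScanTarget` still reaches the fallback case, which matches the proposal's statement that the zero value of a family would be nil.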

Consider also how the cloud.google.com/go/datastore.Property struct has a “Value” field of type interface{} and requires all this documentation:

// Value is the property value. The valid types are:
// - int64
// - bool
// - string
// - float64
// - *Key
// - time.Time
// - GeoPoint
// - []byte (up to 1 megabyte in length)
// - *Entity (representing a nested struct)
// Value can also be:
// - []interface{} where each element is one of the above types
// This set is smaller than the set of valid struct field types that the
// datastore can load and save. A Value's type must be explicitly on
// the list above; it is not sufficient for the underlying type to be
// on that list. For example, a Value of "type myInt64 int64" is
// invalid. Smaller-width integers and floats are also invalid. Again,
// this is more restrictive than the set of valid struct field types.
//
// A Value will have an opaque type when loading entities from an index,
// such as via a projection query. Load entities into a struct instead
// of a PropertyLoadSaver when using a projection query.
//
// A Value may also be the nil interface value; this is equivalent to
// Python's None but not directly representable by a Go struct. Loading
// a nil-valued property into a struct will set that field to the zero
// value.

This could be:

type PropertyVal family {
  int64
  bool
  string
  float64
  *Key
  time.Time
  GeoPoint
  []byte
  *Entity
  nil
  []int64; []bool; []string; []float64; []*Key; []time.Time; []GeoPoint; [][]byte; []*Entity
}

(You can imagine how this could be split more cleanly into two families.)

The json.Token type was mentioned above. Its type definition would be:

type Token family {
    Delim
    bool
    float64
    Number
    string
    nil
}
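To make the contrast concrete, here is roughly how code consumes `json.Token` today, where the set of dynamic types is stated only in the documentation. This is a sketch: `tokenKinds` is a hypothetical helper, and the `default` branch exists only because the compiler has no way to know the documented list is complete.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// tokenKinds decodes input and reports the dynamic type of each token.
func tokenKinds(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	var kinds []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return kinds, nil
		}
		if err != nil {
			return nil, err
		}
		switch tok.(type) {
		case json.Delim:
			kinds = append(kinds, "delim")
		case bool:
			kinds = append(kinds, "bool")
		case float64:
			kinds = append(kinds, "number")
		case json.Number:
			// Only produced when the decoder is configured with UseNumber.
			kinds = append(kinds, "json.Number")
		case string:
			kinds = append(kinds, "string")
		case nil:
			kinds = append(kinds, "null")
		default:
			// Not expected per the documentation, but nothing enforces that.
			kinds = append(kinds, fmt.Sprintf("unexpected %T", tok))
		}
	}
}

func main() {
	kinds, err := tokenKinds(`{"a": [1, true, null]}`)
	fmt.Println(kinds, err)
}
```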

Another example that I got bit by recently:
When calling functions like sql.DB.Exec, or sql.DB.Query, or any function which takes a variadic list of interface{} where each element has to have a type in a particular set and not itself be a slice, it’s important to remember to use the “spread” operator when passing in the arguments from a []interface{} into such a function: it’s wrong to say DB.Exec("some query with placeholders", emptyInterfaceSlice); the correct way is: DB.Exec("the query...", emptyInterfaceSlice...) where emptyInterfaceSlice has type []interface{}. An elegant way to make such mistakes impossible would be to have this function take a variadic argument of Value, where Value is defined as a family as described above.
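The mistake is easy to reproduce without a database. In this sketch, `countArgs` is a hypothetical stand-in for a function such as `DB.Exec` that accepts a variadic list of `interface{}`; both calls compile, but only the second passes the elements individually.

```go
package main

import "fmt"

// countArgs stands in for a function like DB.Exec that accepts
// a variadic list of interface{} values.
func countArgs(args ...interface{}) int {
	return len(args)
}

func main() {
	params := []interface{}{"alice", 42, true}

	// The whole slice is boxed into a single interface{} argument.
	fmt.Println(countArgs(params))

	// The slice elements are passed as individual arguments.
	fmt.Println(countArgs(params...))
}
```

The first call type-checks because a `[]interface{}` is itself a valid `interface{}` value, which is exactly the imprecision being described.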

The point of these examples is that real mistakes are being made because of the imprecision of the interface{}.

@FMNSSun

FMNSSun commented Jul 3, 2018

var x int | float64 | string | rune
z = int(x) + int(y)
z = float64(x) + float64(y)

This should definitely be a compiler error because the type of x isn't really compatible with what can be passed to int().

I like the idea of having family. It'd essentially be an interface constrained (constricted?) to the listed types, and the compiler can ensure you're matching against all the types and change the type of the variable within the local context of the corresponding case.

The problem is that we are often forced to rely on comments for guarantees and behavior contracts, which the compiler cannot enforce.

That's actually the reason why I started to slightly dislike things like

func foo() (..., error) 

because you have no idea what kind of error it returns.

and a few other things that return an interface instead of a concrete type. Some functions return net.Addr and it's sometimes a bit difficult to dig through the source code to figure out what kind of net.Addr it actually returns and then use it appropriately. There's not really much downside in returning a concrete type (because it implements the interface and can thus be used anywhere where the interface can be used) except when you later plan to extend your method to return a different kind of net.Addr. But if your API mentions it returns OpError then why not make that part of the "compile time" spec?

For example:

 OpError is the error type usually returned by functions in the net package. It describes the operation, network type, and address of an error. 

Usually? Doesn't tell you exactly which functions return this error. And this is the documentation for the type, not the function. The documentation for Read nowhere mentions it returns OpError. Also, if you do

err := blabla.(*OpError)

it'll crash once it returns a different kind of error. That's why I'd really like to see this as part of the function declaration. At least *OpError | error would tell you that it returns such an error and the compiler makes sure you don't do an unchecked type assertion crashing your program in the future.
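Until something like that exists, the defensive version has to be written by hand. A minimal sketch, assuming a Go release that has `errors.As` (added after this comment was written); `opName` and `newReadError` are hypothetical helpers used only for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// newReadError builds a sample *net.OpError without touching the network.
func newReadError() error {
	return &net.OpError{Op: "read", Net: "tcp", Err: errors.New("connection reset")}
}

// opName extracts the operation from err if it is (or wraps) a *net.OpError.
func opName(err error) (string, bool) {
	var opErr *net.OpError
	if errors.As(err, &opErr) {
		return opErr.Op, true
	}
	return "", false
}

func main() {
	other := errors.New("some other failure")

	fmt.Println(opName(newReadError()))
	fmt.Println(opName(other))

	// The single-value form other.(*net.OpError) would panic here;
	// the two-value form reports the mismatch instead.
	if _, ok := other.(*net.OpError); !ok {
		fmt.Println("not a *net.OpError")
	}
}
```

Nothing in the function signature tells the caller that `*net.OpError` is the type worth checking for, which is the gap the `*OpError | error` notation would close.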

BTW: Was a system like Haskell's type polymorphism considered yet? Or a 'trait' based type system i.e.:

func calc(a < add(a, a) a >, b a) a {
   return add(a, b)
}

func drawWidgets(widgets []< widgets.draw() error >) error {
  for _, widgets := range widgets {
    err := widgets.draw()
    if err != nil {
      return err
    }
  }
  return nil
}

a < add(a, a) a > means "whatever a's type is, there must exist a function add(typeof a, typeof a) typeof a". < widgets.draw() error > means that "whatever widget's type, it must provide a method draw that returns an error". This would allow more generic functions to be created:

func Sum(a []< add(a,a) a >) a {
  sum := a[0]
  for i := 1; i < len(a); i++ {
    sum = add(sum,a[i])
  }
  return sum
}

(Note that this is not equal to traditional "generics").

@Merovius

Merovius commented Jul 3, 2018

There's not really much downside in returning a concrete type (because it implements the interface and can thus be used anywhere where the interface can be used) except when you later plan to extend your method to return a different kind of net.Addr.

Also, Go doesn't have variant subtyping, so you can't use a func() *FooError as a func() error where needed. Which is especially important for interface satisfaction. And lastly, this doesn't compile:

func Foo() (FooVal, FooError) {
    // ...
}

func Bar(f FooVal) (BarVal, BarError) {
    // ...
}

func main() {
    foo, err := Foo()
    if err != nil {
        log.Fatal(err)
    }
    bar, err := Bar(foo) // Type error: Can not assign BarError to err (type FooError)
    if err != nil {
        log.Fatal(err)
    }
}

i.e. to make this work (I'd like if we could somehow) we'd need far more sophisticated type inference - currently, Go only uses local type information from a single expression. In my experience, those kinds of type inference algorithms are not only significantly slower (slowing down compilation and commonly not even bounded runtime) but also produce far less understandable error messages.
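One way to get a snippet like the one above to compile today is to declare `err` with the interface type `error` up front. The sketch below assumes a pointer-typed `*FooError` (hypothetical, as in the example) and shows a known side effect of that workaround: a nil pointer stored in an interface variable no longer compares equal to nil.

```go
package main

import "fmt"

// FooError is a hypothetical concrete error type.
type FooError struct{ Msg string }

func (e *FooError) Error() string { return e.Msg }

// Foo returns a concrete error type instead of the error interface.
// Here it always succeeds and returns a nil *FooError.
func Foo() (int, *FooError) {
	return 1, nil
}

// viaInterface assigns Foo's result to a variable of type error and
// reports whether that variable compares equal to nil.
func viaInterface() bool {
	var err error
	_, err = Foo() // err now holds a (*FooError)(nil), which is a non-nil interface
	return err == nil
}

// viaConcrete keeps the concrete type, so the nil check behaves as expected.
func viaConcrete() bool {
	_, err := Foo()
	return err == nil
}

func main() {
	fmt.Println("interface variable is nil:", viaInterface())
	fmt.Println("concrete variable is nil: ", viaConcrete())
}
```

So the manual widening to `error` compiles, but it changes the meaning of the `err != nil` checks, which is one more practical reason the convention is to return `error` directly.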

@FMNSSun

FMNSSun commented Jul 4, 2018

Also, Go doesn't have variant subtyping, so you can't use a func() *FooError as a func() error where needed. Which is especially important for interface satisfaction. And lastly, this doesn't compile:

I'd have expected that this works fine in Go but I've never stumbled upon this because the current practice is to just use error. But yes, in these cases these restrictions practically force you to use error as the return type.

func main() {
    foo, err := Foo()
    if err != nil {
        log.Fatal(err)
    }
    bar, err := Bar(foo) // Type error: Can not assign BarError to err (type FooError)
    if err != nil {
        log.Fatal(err)
    }
}

I'm not aware of any language that allows this (well, except for esolangs) but all you would have to do is keep a "type world" (which is basically a map of variable -> type) and if you re-assign the variable you just update its type in the "type world".

I don't think you need complicated type inference to do this. You do need to keep track of the types of variables, but I'm assuming you need to do that anyway, because with

var i int = 0
i = "hi" // type error: cannot assign a string to an int variable

you surely somehow have to remember which variables/declarations have which types and for i = "hi" you need to make a "type lookup" on i to check whether you can assign a string to it.

Are there practical issues that complicate assigning a func () *ConcreteError to a func() error other than the type checker not supporting it (like runtime reasons/compiled code reasons)? I guess currently you'd have to wrap it in a function like this:

type MyFunc func() error

type A struct {
}

func (_ *A) Error() string { return "" }

func NewA() *A {
	return &A{}
}

func main() {
	var err error = &A{}
	fmt.Println(err.Error())
	var mf MyFunc = MyFunc(func() error { return NewA() }) // type checks fine
	// var mf MyFunc = MyFunc(NewA) // doesn't type check
	_ = mf
}

If you're faced with a func (a, b) c but get a func (x, y) z all that needs to be done is to check whether z is assignable to c (and a,b must be assignable to x,y) which at least on the type level doesn't involve complicated type inference (it just involves checking whether a type is assignable/compatible to/with another type). Of course, whether this causes issues with runtime/compilation... I don't know but at least strictly looking at the type level I don't see why this would involve complicated type inference. The type checker already knows if an x can be assigned to a thus it also easily knows whether func () x can be assigned to func () a. Of course, there might be practical reasons (thinking about runtime representations) why this won't be easily possible. (I'm suspecting that's the real crux here, not the actual type checking).

Theoretically you could work around the runtime issues (if there are any) by automatically wrapping functions (like in the above snippet), with the potentially huge downside that it screws up comparisons of funcs with funcs (as the wrapped func won't be equal to the func it wraps).

@Merovius

Merovius commented Jul 4, 2018

I'm not aware of any language that allows this (well, except for esolangs)

Not exactly, but I'd argue that's because languages with powerful type systems are usually functional languages which don't really use variables (and so don't really need the ability to reuse identifiers). FWIW, I'd argue that e.g. Haskell's type system would be able to deal with this just fine - at least as long as you are not using any other properties of FooError or BarError, it should be able to infer that err is of type error and deal with it. Of course, again, this is a hypothetical, because this exact situation doesn't easily transfer to a functional language.

but I'm assuming you need to do that anyway because

The difference is that in your example, i has a clear and well-understood type after the first line, which is int, and you then run into a type error when you assign a string to it. Meanwhile, for something like what I mentioned, every usage of an identifier essentially creates a set of constraints on the used type, and the type checker then tries to infer the most general type fulfilling all the constraints given (or complains that there is no type fulfilling them). That's what formal type theories are for.

Are there practical issues that complicates assigning a func () *ConcreteError to a func() error other than the type checker not supporting it (like runtime reasons/compiled code reasons)?

There are practical problems, but I believe for func they are probably solvable (by emitting un-/wrapping code, similarly to how interface-passing works). I wrote a bit about variance in Go and explained some of the practical problems I see at the bottom. I'm not totally convinced it's worth adding though. I.e. I'm unsure it solves important problems on its own.

with the potentially huge downside that it screws up comparisons of funcs with funcs (as the wrapped func won't be equal to the func it wraps).

funcs are not comparable.

Anyway, TBH, all of this seems a bit off-topic for this issue :)

@Merovius

Merovius commented Sep 5, 2018

FYI: I just did this. It's not nice, but it sure is type-safe. (Same thing can be done for #19814 FWIW)
