proposal: Go 2: universal zero value with type inference #35966

Closed
geraldss opened this issue Dec 4, 2019 · 113 comments
Labels
LanguageChange Suggested changes to the Go language Proposal Proposal-FinalCommentPeriod v2 An incompatible library change

@geraldss

geraldss commented Dec 4, 2019

I propose a universal zero value with type inference. Currently nil is a zero value with type inference for pointers and built-in reference types. I propose extending this to structs and atomic types, as follows:

{} would represent a zero value when the type can be inferred, e.g. in assignments and function call sites. If I have a function:

func Foo(param SomeLongStructName)

and I wish to invoke Foo with a zero value, I currently have to write:

Foo(SomeLongStructName{})

With this proposal, I could alternatively write:

Foo({})

For assignments currently (not initializations; post-initialization updates):

myvar = SomeLongStructName{}

With this proposal:

myvar = {}

This proposal is analogous to how nil is used for pointers and reference types.

The syntax allows type names and variable types to be modified without inducing extraneous code changes. The syntax also conveys the intent "zero-value" or "default" or "reset", as opposed to the actual contents of the zero value. Thus the intent is more readable.

@geraldss
Author

geraldss commented Dec 4, 2019

Or perhaps underscore as the zero designator. Either would be readable.

@jimmyfrasche
Member

Related issues: #19642 (now closed), which proposed a universal zero value, and #12854, which would allow type names to be elided in composite literals, allowing all the examples in the first post.

@ianlancetaylor ianlancetaylor changed the title Universal zero value with type inference proposal: Go 2: universal zero value with type inference Dec 4, 2019
@gopherbot gopherbot added this to the Proposal milestone Dec 4, 2019
@ianlancetaylor ianlancetaylor added v2 An incompatible library change LanguageChange Suggested changes to the Go language labels Dec 4, 2019
@beoran

beoran commented Dec 5, 2019

Interesting idea. Sorry to bikeshed, but perhaps reusing the default keyword would be more readable?

Foo(default)
myvar = default

@geraldss
Author

geraldss commented Dec 5, 2019

Foo(default, default, default)
Foo({}, {}, {})
Foo(_, _, _)

I find the latter two more readable, but default or another keyword is also fine with a syntax highlighter. As jimmyfrasche pointed out, the {} and _ syntaxes have been proposed previously.

Here's another argument for the proposal. These calls highlight the values that are being passed, which is good:

Foo(10, "xyz", nil)
Foo({})

This call highlights the type that is being passed, including its fully qualified name. This shifts the cognitive effort.

Foo(SomeLongStructName{})

@quenbyako

quenbyako commented Dec 8, 2019

I think Foo({}, {}, {}) is more readable than default, IMO, because 1) default has a few more letters (you know, fewer letters -> better code 🙃), and 2) creating a new reserved word is not a good idea: I know many projects with default variables, so it could break a lot of code. Also, the _ symbol is for other things, like /dev/null in the Go universe, so using _ as an empty struct definition is not a good idea, I think.

But as a concept for a language proposal, I like your idea.

@ianlancetaylor
Contributor

This seems to be a restatement of #19642 with a different spelling of the zero value. Given that the earlier proposal was not accepted, what has changed since then?

@geraldss
Author

It's not stated why the previous proposal was closed, and I had not seen it when I searched and filed this proposal.

I raised this proposal from direct and repeated experience. In addition to comments in this issue and the previous issues, I'll add another:

There are up to three items of information in a Go expression or assignment: name, type, and value.

  1. Names are inferred using uniform rules across all Go datatypes. That is, names are inferred in assignments, function calls, and return statements, and the inference behavior is uniform across all Go datatypes.

  2. Values are not inferred. This is also uniform.

  3. Type inference is not universal and uniform, and it's not clear to me why that is.

The function calls FooInt32(0), FooInt64(0), FooPtr(nil), FooChan(nil), FooMap(nil) will all infer the argument type correctly. Presumably Golang believes that type inference is beneficial or ergonomic. These could all have required explicit typing, e.g. int32(0).

@ianlancetaylor
Contributor

0 is an untyped constant, as are "", true, and false, and, for that matter "abc", 100, and 1+2i. Untyped constants may be used with any compatible type. If there is no compatible type, as in a := 0, they have a default type.

nil is the zero value of pointer, slice, channel, map, and function types. It is not an untyped constant: a := nil is an error. nil is in effect an overloaded term for the zero value of certain types. This overloading is problematic; see #22729. Note that for the types with which nil can be used, there is no other way to write the zero value.

This proposal, and #19642, is something else again. It proposes a way of writing a value that can be converted to the zero value in a type context. Writing a := {} would be an error. But we could write F({}), a = {} (for an already defined a), a == {}, return {}. For ordered types we could write a > {}. And while {} could be used with any type, it would always be an alias for the actual zero value of that type (0, false, nil, S{}, etc.).

You could presumably write 0 == {}, which would always be true: the 0 would have no type context so it would default to int, at which point {} would default to the value 0 in type int. Maybe you could write {} == 0. I'm not sure. I'm also not sure about 1 + {} and {} + 1. Or "a" + {} and {} + "a".

So I don't agree with your suggestion that there is some missing aspect to type inference. Untyped constants, nil, and {} are three different kinds of things.
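
For readers following along, here is a minimal sketch of the distinction drawn above between untyped constants and nil, using only what Go permits today. The helper name describe is made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// describe reports the types that the untyped constant 0 ends up with.
// The function name is hypothetical; it exists only for this illustration.
func describe() string {
	// The untyped constant 0 converts implicitly to any compatible type.
	var i32 int32 = 0
	var f64 float64 = 0
	var d time.Duration = 0

	// With no type context, an untyped constant takes its default type (int).
	a := 0

	return fmt.Sprintf("%T %T %T %T", i32, f64, d, a)
}

func main() {
	fmt.Println(describe())

	// nil takes its type from context, but only for certain kinds of types.
	var p *int = nil
	var s []string = nil
	var m map[string]int = nil
	fmt.Println(p == nil, s == nil, m == nil)

	// nil has no default type, so this line would not compile:
	//   b := nil
}
```

The proposed {} would behave like neither of these: it would need a type context like nil does, but would be accepted for every type.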

@geraldss
Author

Per your comment, untyped constants do support type inference, and overloaded nil does support type inference (issue with nil interfaces noted).

The net effect of this is that type inference is neither uniform nor universal across data types. This is the impetus for my proposal and the earlier proposals. I also like #12854, and would consider any of these a positive step.

@ianlancetaylor
Contributor

I think we must mean different things by "type inference". I tried to describe exactly how untyped constants and nil behave, to show that they are different from each other. I agree that if you describe both untyped constants and nil as "type inference", then "type inference" is neither uniform nor universal across data types. But I don't see how this proposal changes that fact.

@geraldss
Author

geraldss commented Dec 11, 2019

By type inference, I mean the omission of the type name in the text of the value.

Your example of a == {} is interesting. I write these all the time:

if ptr == nil
if v == 0

Would be useful to write

if v == {}

where type of v is SomeLongStruct.

This proposal says that {} is treated uniformly as the zero value in all contexts where type can be inferred / determined. That seems uniform and universal. The concept of "zero value" is already universal, i.e. defined for all types.

@ianlancetaylor
Contributor

OK, omitting the type in the text of the value is what I would call an implicit conversion. Untyped constants support an implicit conversion to a set of related types, and also support an implicit conversion to a default type. The value nil supports an implicit conversion to any pointer, slice, etc., type. This proposal is suggesting that the value {} support an implicit conversion to any type.

Another case where implicit conversion occurs in Go is that any type that implements an interface type may be implicitly converted to that interface type.

@deanveloper

deanveloper commented Dec 13, 2019

A better way to do this (in my opinion) would be to allow for constant struct expressions, which would hopefully include "untyped struct literals". #21130 gets close to this but isn't very specific, I might try to type up something a little more formal.

@geraldss
Author

Const-ness is orthogonal to type inference.

@deanveloper

Untyped constants are not, however. What I am proposing is that we should be able to do var x MyStruct = {} just as we can do var y time.Duration = 0

@jimmyfrasche
Member

I think #12854 and #21182 would fill most of the gaps where this hurts in most code. Comparing a struct to its zero would still be a little awkward with this proposal or #12854 since you'd need to write if v == ({}) {.

Generating code or, in the future, writing generic code that uses zero values is still going to be awkward, as you don't know which form the zero value takes, though #21182 would knock out the most painful case.

You can always do var zero T but that gets a little awkward if you need zeros for more than one type in the same scope. You can avoid naming the zeros and use the expression *new(T) but that's a bit weird looking, especially since new isn't used that much.
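
As a rough illustration of the two spellings mentioned above, written with the type parameter syntax that eventually shipped in Go 1.18 (the function names and errEmpty are invented for this sketch):

```go
package main

import (
	"errors"
	"fmt"
)

// errEmpty is a hypothetical sentinel error for this example.
var errEmpty = errors.New("empty slice")

// FirstVar returns the first element of s, or the zero value of T
// spelled with a named variable.
func FirstVar[T any](s []T) (T, error) {
	if len(s) == 0 {
		var zero T
		return zero, errEmpty
	}
	return s[0], nil
}

// FirstNew does the same, spelling the zero value as *new(T).
func FirstNew[T any](s []T) (T, error) {
	if len(s) == 0 {
		return *new(T), errEmpty
	}
	return s[0], nil
}

func main() {
	v, err := FirstVar([]string(nil))
	fmt.Printf("%q %v\n", v, err)

	n, err := FirstNew([]int{7, 8})
	fmt.Println(n, err)
}
```

Both forms work for any T; the difference is only whether the zero gets a name.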

In most cases, you could probably get away with generalizing and having the user pass in a value, zero or not: for example, writing Filter(type T)(s []T, v T) []T instead of RemoveZeros(type T)(s []T) []T.
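
A sketch of that generalization in the syntax that later shipped (square brackets rather than the draft's (type T)). It assumes Filter means "drop every element equal to v", which the comment does not spell out, and it needs the comparable constraint for the != to compile.

```go
package main

import "fmt"

// Filter returns the elements of s that are not equal to v.
// The "remove matches" semantics are an assumption for this sketch.
func Filter[T comparable](s []T, v T) []T {
	var out []T
	for _, e := range s {
		if e != v {
			out = append(out, e)
		}
	}
	return out
}

// RemoveZeros is the special case; it is the only place that has to
// spell the zero value of T.
func RemoveZeros[T comparable](s []T) []T {
	var zero T
	return Filter(s, zero)
}

func main() {
	fmt.Println(Filter([]int{0, 1, 0, 2}, 0))
	fmt.Println(RemoveZeros([]string{"a", "", "b"}))
}
```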

In generic code, comparing to zero also has a little wrinkle in that some incomparable types have a special case for comparing against zero that can't be matched in type constraints where you can only specify comparable or not. If there were some universal zero value, then #26842 could be accepted since there would always be a way to write a statically guaranteed to be all-bytes zero. But, if that's the only major case left and it would still be awkward to see if comparable structs are zero, maybe it would suffice to have a predeclared func zero(type T)(T) bool that worked on comparable and incomparable types alike?

@geraldss
Author

Yes, it's possible to do less. But I haven't seen any argument for why less is more in this case, or any downside to the universal zero.

@jimmyfrasche
Member

Let's consider what we can do with a specific, typed zero value, var zero T:

  1. Reset a variable to zero: v = zero
  2. Define a new variable: u := zero
  3. Send it to a channel: c <- zero
  4. If T is comparable, compare another variable to it: v == zero
  5. If T has operators, use it as an operand: v < zero or v + zero
  6. Use it in a composite literal: []T{u, zero, v}
  7. Call a method on it: zero.M()
  8. Return it from a function: return zero, err

If we had a universal zero value, then defining a new variable and calling a method are out, as a specific type is required for each. Using it as an operand isn't really a problem since any type with operators already has a concise zero value.
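
For concreteness, the eight uses from the first list can be exercised in one small program. T here is a hypothetical integer type with a method, chosen so that every item applies.

```go
package main

import "fmt"

// T is a hypothetical type that is comparable, has operators, and has a method.
type T int

// M is a placeholder method so that item 7 has something to call.
func (t T) M() string { return fmt.Sprintf("T(%d)", int(t)) }

// produce returns the typed zero value together with a nil error.
func produce() (T, error) {
	var zero T
	return zero, nil
}

func main() {
	var zero T
	v := T(3)

	v = zero                      // 1. reset a variable to zero
	u := zero                     // 2. define a new variable
	c := make(chan T, 1)
	c <- zero                     // 3. send it to a channel
	fmt.Println(v == zero)        // 4. compare another variable to it
	fmt.Println(v < zero, v+zero) // 5. use it as an operand
	s := []T{u, zero, v}          // 6. use it in a composite literal
	fmt.Println(zero.M())         // 7. call a method on it
	r, err := produce()           // 8. return it from a function
	fmt.Println(s, <-c, r, err)
}
```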

That leaves:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. Use it in a composite literal: []T{u, zero, v}
  5. Return it from a function: return zero, err

For the majority of these, there's only really a problem if T is composite, as they have verbose zero values. Use in a composite literal is only sometimes an issue as the types of composite literals in composite literals can be elided in a number of cases. #12854 could expand elision to all the remaining cases and allow you to write return {}, f({}) for example. This would also allow quite a bit more since you could also write c <- {k: v} or f({}, {X: 1}, {2, 3}).

For comparable T, comparison against zero would still have the issue that we could write

p := v == {}
if p { // ...

but we couldn't write

if v == {} { // ...

due to the ambiguity and we would instead have to write

if v == ({}) { // ...

All of this assumed that we knew upfront what T is. That goes away when generating code or (hopefully soon) writing generic code. Even if every type has a concise zero value, we will not necessarily know which one to use, unless the contract of the type parameter is sufficiently strict.

The most common case would be returning some zero values and an error. #21182 would allow that and also improve the readability and editability of non-generated/generic code as a bonus.

That leaves us with a different set of possible problems:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. If T has operators, use it as an operand: v < zero or v + zero
  5. Use it in a composite literal: []T{u, zero, v}

A universal zero value would be useful here, but I think the majority of these will be relatively uncommon, though I could be wrong. A good way to make a case for this proposal would be to write reasonable generic code using the latest generics draft that is very awkward without a universal zero. Finding code generators that have a lot of special cases or past/known bugs because of this would be another.

The one that seems most likely to cause problems is the split between incomparable types that are totally incomparable and those that can be compared against nil (#26842). If there were a universal zero value, all types, comparable or not, could be compared against it regardless of the specificity of the type constraints. It would also help to avoid the ambiguity when comparing struct values to their zero. But if it's just this one case that's left over, that predeclared zero predicate would suffice.

@geraldss
Author

geraldss commented Jan 4, 2020

As a detail, I don't see any ambiguity with

if v == {}

Every binary operator requires expressions on both sides, not statement blocks.

@jimmyfrasche
Member

That's true. I was thinking about how you have to write if v == (T{}) { but you have to do that because of the T not the {}.
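
A small sketch of the existing parsing rule being referred to, with a made-up Point type. In an if header the composite literal needs parentheses, because otherwise the opening brace after the type name is taken as the start of the block; outside such headers no parentheses are needed.

```go
package main

import "fmt"

// Point is a hypothetical comparable struct type.
type Point struct{ X, Y int }

// isOrigin reports whether p is the zero Point.
func isOrigin(p Point) bool {
	// Writing "if p == Point{} {" here would fail to parse.
	if p == (Point{}) {
		return true
	}
	return false
}

func main() {
	// In an ordinary expression the parentheses are unnecessary.
	same := Point{} == Point{}
	fmt.Println(same, isOrigin(Point{}), isOrigin(Point{X: 1}))
}
```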

@earthboundkid
Contributor

earthboundkid commented Mar 1, 2020

type T = func()

func Default(a, b T) T {
	var zero T
	if a != zero {
		return a
	}
	return b
}

This code doesn't compile because zero is a variable, not a constant, so it gives error "invalid operation: a != zero (func can only be compared to nil)". const zero T doesn't work because "const declaration cannot have type without expression". If default meant "zero value for type", you could write a != default and the code would work.

This doesn't matter much now, but in a world with generics, not being able to write (type T) IsZero(t T) bool would be a pain.
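
For comparison, a sketch of what does compile today. It adapts the example to func() string so the result is observable, and the IsZero shown is only the comparable-constrained version, so it still cannot accept func, map, or slice types.

```go
package main

import "fmt"

// Default returns a unless it is nil, in which case it returns b.
// A func value may be compared against the literal nil, just not
// against another func value.
func Default(a, b func() string) func() string {
	if a != nil {
		return a
	}
	return b
}

// IsZero reports whether t is the zero value of T. The comparable
// constraint is what makes t == zero legal.
func IsZero[T comparable](t T) bool {
	var zero T
	return t == zero
}

func main() {
	f := Default(nil, func() string { return "fallback" })
	fmt.Println(f())
	fmt.Println(IsZero(0), IsZero("x"))
}
```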

@jimmyfrasche
Member

@carlmjohnson there's also #26842. Consider type T = struct { f func() }. A universal zero value wouldn't help with that unless it was also allowed to be compared against universally. Another way to solve that problem would be to make a function like IsZero a builtin.

@ianlancetaylor
Contributor

Currently the language permits writing a simple expression, without specifying a type, for the zero value of most types: 0 for numeric types, "" for string types, nil for function, pointer, interface, channel, slice, and map types, false for boolean types. The exception is structs and arrays.

This raises the possibility of, rather than inventing a generic zero value, extending nil to be usable with struct and array types. Then nil would be the zero value for any composite type, which could arguably be a simplification of the spec.

The idea here is that we could assign nil to a variable of struct or array type, which would mean to zero all the elements. And we could compare a value of struct or array type to nil, which would report whether the value were the zero value.

@atdiar

atdiar commented Jul 27, 2022

I have the impression that to you, Go is similar to Java. In Go, variables whose types have value semantics are never in an uninitialized state. That includes strings.

Note that a reference type is a computer science notion (that used to exist in Go's documentation). It can't mean anything.
string types have value type semantics (the emphasis on semantics was deliberate since the implementation can be confusing: e.g. you seem to have overlooked the rawstring internal function which allocates for every new string. Essentially, the internal pointer for a string is never nil).

Anyway, I could go on but the point is a bit moot if you had the impression that currently, every variable can be uninitialized.

Hope it helps.

@Merovius
Contributor

Merovius commented Dec 23, 2022

FWIW I'm running into the issue @jimmyfrasche mentioned above, namely that I have a generic type and want to compare it to its zero value:

type Sparse[T any] struct {
    m map[Pos]T
}

func (g *Sparse[T]) Set(p Pos, v T) {
    // don't store zeros, that just wastes space
    if v == *new(T) { // compiler error: T is not comparable.
        delete(g.m, p)
    } else {
        g.m[p] = v
    }
}

Personally, I didn't think of this before (and didn't see it when it was brought up) and it causes me to re-evaluate my support for a proposal like this.

Though I also thought about the predeclared func iszero(T) bool solution, which would suit me fine.
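
A variant that compiles today, as a sketch only: it tightens the constraint from any to comparable, which is exactly the restriction a universal zero or an iszero builtin would lift. Pos, NewSparse, and Len are assumptions added to make the example self-contained.

```go
package main

import "fmt"

// Pos is a hypothetical coordinate type; the comment above does not define it.
type Pos struct{ X, Y int }

// Sparse stores only non-zero values. T must be comparable here.
type Sparse[T comparable] struct {
	m map[Pos]T
}

// NewSparse is a helper added for this sketch.
func NewSparse[T comparable]() *Sparse[T] {
	return &Sparse[T]{m: make(map[Pos]T)}
}

// Set stores v at p, or removes the entry when v is the zero value.
func (g *Sparse[T]) Set(p Pos, v T) {
	var zero T
	if v == zero {
		delete(g.m, p)
		return
	}
	g.m[p] = v
}

// Len reports the number of stored (non-zero) entries.
func (g *Sparse[T]) Len() int { return len(g.m) }

func main() {
	g := NewSparse[int]()
	g.Set(Pos{1, 2}, 5)
	g.Set(Pos{3, 4}, 0)
	fmt.Println(g.Len())
	g.Set(Pos{1, 2}, 0)
	fmt.Println(g.Len())
}
```

The cost is that Sparse can no longer hold slices, maps, funcs, or structs containing them.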

@earthboundkid
Contributor

I strongly support adding zero. My only question is whether it makes sense to use this as an opportunity to also take on the nil interface vs. nil pointer confusion by also adding a new name for nil interface.

@Merovius
Contributor

I have no strong opinions about the spelling of the zero value or whether we add a predeclared identifier for the zero value or add an iszero function.

Though I do oppose spelling the zero value nil specifically. As was said elsewhere, I don't think adding more nils will help that confusion.

And I think any spelling of a universal zero value will likely make it slightly worse. Because I think v == zero (or whatever) should work for interfaces as well and if people think an interface with dynamic value nil should compare equal to nil, they'll think an interface with dynamic value of zero should compare equal to zero. So, if anything, that confusion will get more common. The same issue exists if the check is spelled if iszero(someInterface).

IMO, the ship on that has sailed and if anything, we should try and figure out how to clarify that you never want to compare the dynamic value of an interface without knowing its type.
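
For anyone unfamiliar with the confusion being discussed, a minimal sketch with a made-up error type: an interface that holds a nil pointer is not itself nil.

```go
package main

import "fmt"

// MyErr is a hypothetical error type with a pointer receiver.
type MyErr struct{}

func (*MyErr) Error() string { return "my error" }

// mayFail returns a nil *MyErr wrapped in an error interface value.
// The interface has a dynamic type, so it is not equal to nil.
func mayFail() error {
	var p *MyErr // nil pointer
	return p
}

func main() {
	err := mayFail()
	fmt.Println(err == nil)

	// The dynamic value is a nil pointer, visible only after a type assertion.
	p, ok := err.(*MyErr)
	fmt.Println(ok, p == nil)
}
```

The first line prints false and the second prints true true. The concern above is that v == zero on an interface would invite the same misreading.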

@mpx
Contributor

mpx commented Jan 6, 2023

nil is the zero value for pointers and interfaces. Adding a separate universal zero would add duplication. Eg, the following would be equivalent:

  • Interfaces: err == nil and err == zero
  • Slices: buf == nil and buf == zero
  • Pointers: ptr == nil and ptr == zero
  • Functions, channels,...

Accepting this duplication would also be confusing (perhaps even more so). It would only be a convention to use one over the other (something for style guides and linters to opine about 😬 ). Historically, Go has avoided introducing multiple nearly identical concepts.

Properly understanding that nil is the zero value for an interface (and other types) would give people a better mental model.

I find extending nil a little disconcerting, but I suspect over time I would just get used to it and it would no longer be weird - just convenient and useful.

[I was originally in favour of something short like _, but now I'm more concerned with the permanent confusion caused by duplication]

@earthboundkid
Contributor

Properly understanding that nil is the zero value for an interface (and other types) would give people a better mental model.

To clarify, are you proposing making nil the zero value for other types or just leaving the problem as is?

It would only be a convention to use 1 over another (something for style guides and linters to opine about 😬 ).

I agree that this is my main reservation about adding zero. I think it won't be a problem in practice though because it's very clear you should use the more specific value in most cases. zero is really just for generics and return values. If you're worried about it, someone could write a linter to fix it and add that to gopls.

@earthboundkid
Contributor

var v struct{ _ [0]func() }
fmt.Println(reflect.ValueOf(&v).Elem().IsZero()) // true

fmt.Println(v == v) // does not compile
fmt.Println(v == nil) // does not compile
var zero struct{ _ [0]func() }
fmt.Println(v == zero) // does not compile

It is annoying to me that you can test if v is the zero value, but only if you use reflect, and reflect is quite slow (50x slower than an elementary operation in some quickie benchmarks I've done, and also it can force allocations).

@Merovius
Contributor

Merovius commented Jan 6, 2023

@carlmjohnson I believe that if we introduced zero, we should specify that comparison with the predeclared identifier zero is always allowed and evaluates to true, if the other operand is its zero value (similar to how we currently specify comparison to nil in a special way).

@jimmyfrasche
Member

@Merovius that's #26842

@mpx
Contributor

mpx commented Feb 4, 2023

To clarify, are you proposing making nil the zero value for other types or just leaving the problem as is?

I'd rank the options:

  1. Make nil the universal zero (solves the broad use case, simplifies the language by removing restrictions, and enables a better mental model). nil is relatively short, so a good improvement for use with multiple returns.
  2. Add another identifier for the universal zero (solves the broad use case, but complicates the language by adding another concept/near duplication).
  3. Status Quo (doesn't solve the broad use case, but also avoids complicating the language).

TBH, I'm a little less confident at the moment with the order for 2/3. ("Is the extra complexity worth the benefit?"). However, I would really like a good solution -- var zero T is showing up in a lot of generic code, and returning zero values for structs is unnecessarily verbose (low signal-to-noise rather than providing clarity).

@griesemer griesemer added the TypeInference Issue is related to generic type inference label Mar 7, 2023
@justenwalker

justenwalker commented Apr 17, 2023

var zero T is showing up in a lot of generic code, and returning zero values for structs is unnecessarily verbose (low signal-to-noise rather than providing clarity).

It requires very little understanding of the Go spec to know var zero T is always the zero value of the type T. I personally find it very clear, especially if you name your variable zero, when you write return zero or if v == zero what you are doing / your intent.

I understand there is a desire to condense 2 lines of code into 1, as one of the lines feels unnecessarily verbose. But besides decreasing the amount of typing, what other benefits does a universal zero value provide in any form (nil, zero, _ ... {})?

i.e. For the sake of argument, let's say we do get a universal zero value (substitute your favorite zero stand-in syntax below)

How has Go improved if it goes from:

var zero T
if v == zero { 
  // something if zero
}

// ...

var zero T
return zero

To

if v == zero {
  // something if zero
}

// ...

return zero

I'll go first.
One way I can think of is to take the base case to its logical extreme. As you add parameterized types, you need to work with the zero value for all of them; it becomes excessively noisy to do:

func ManyTypes[A, B, C, D, E comparable](a A, b B, c C, d D, e E) {
        var (
                zeroA A
                zeroB B
                zeroC C
                zeroD D
                zeroE E
        )
        if a == zeroA { /*...*/ }
        if b == zeroB { /*...*/ }
        if c == zeroC { /*...*/ }
        if d == zeroD { /*...*/ }
        if e == zeroE { /*...*/ }
}

I wonder how common a case like that would be. I've never needed more than 2 parameterized types.

Note: For return zeroT; you can probably eliminate some of this noise already by using named return values
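
A sketch of the named-results approach, with a hypothetical Config type and load function. The error path returns without ever writing Config{}.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical struct type used only for this illustration.
type Config struct {
	Name string
	Port int
}

// load uses named results so that the error path does not have to
// spell out the zero value of Config.
func load(name string) (cfg Config, err error) {
	if name == "" {
		err = errors.New("empty name")
		return // cfg is still its zero value
	}
	cfg = Config{Name: name, Port: 8080}
	return cfg, nil
}

func main() {
	c, err := load("")
	fmt.Printf("%+v %v\n", c, err)

	c, err = load("api")
	fmt.Printf("%+v %v\n", c, err)
}
```

This only helps at return sites; it does nothing for comparisons or assignments.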

In this case I'd probably be more inclined to support a new built-in iszero function rather than some new identifier, or changing the meaning of an existing one. On that note, a built-in zero(T) function that returns a zero value of T would likely be too niche, given we can do the same with *new(T): new already exists and supports a superset of the use cases a zero(T) function would. Although maybe a zero(T) that sets its argument's value to zero might be slightly more useful, given the alternative is var zero T; x = zero or x = *new(T).

@Merovius
Contributor

Merovius commented Apr 17, 2023

@justenwalker

but besides decreasing the amount of typing, what other benefits does a universal zero value provide in any form; (nil, zero, _ ... {})

We can allow any type to be comparable against that universal zero identifier, which means you don't need to constrain generic code on comparable if all you want to do is check if something is zero.

FWIW that's probably also a reason to make the universal zero nil, because it would mean we can simplify the spec by removing the "comparison with the predeclared identifier nil" special cases for individual types in favor of a universal rule, while maintaining backwards compatibility.

@earthboundkid
Contributor

In benchmarks for my truthy package, I've seen that IsZero[T comparable](val T) bool takes around 2x as long as a non-generic IsZeroString, etc. Not great but acceptable for the convenience. My guess is this is the price of losing inlining. IsZero[T any](val T) bool using reflection is about 50x slower than a native comparison, which is just not acceptable outside of toy code.
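
For reference, a sketch of the two shapes being compared. These are not the truthy package's actual implementations, just plausible equivalents with invented names; reflect.Value.IsZero has been in the standard library since Go 1.13.

```go
package main

import (
	"fmt"
	"reflect"
)

// IsZeroComparable compiles to a plain comparison but excludes
// incomparable types such as slices, maps, and funcs.
func IsZeroComparable[T comparable](v T) bool {
	var zero T
	return v == zero
}

// IsZeroAny accepts any type at the cost of reflection. Going through
// the address and Elem avoids a panic when T is an interface type
// holding nil.
func IsZeroAny[T any](v T) bool {
	return reflect.ValueOf(&v).Elem().IsZero()
}

func main() {
	fmt.Println(IsZeroComparable(0))
	fmt.Println(IsZeroAny([]int(nil)))
	fmt.Println(IsZeroAny(struct{ f func() }{}))
	fmt.Println(IsZeroAny([]int{})) // an empty, non-nil slice is not the zero value
}
```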

@DmitriyMV
Contributor

DmitriyMV commented Apr 18, 2023

@carlmjohnson that can't be right. Both generic and non-generic versions should inline. This benchmark shows no difference in the ns/op.

@earthboundkid
Contributor

earthboundkid commented Apr 18, 2023

@DmitriyMV on my machine, here's what I see:

go version go1.20 darwin/amd64

goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-1060NG7 CPU @ 1.20GHz

BenchmarkGeneric-8   	1000000000	         0.7093 ns/op	       0 B/op	       0 allocs/op
BenchmarkConcrete-8   	1000000000	         0.3318 ns/op	       0 B/op	       0 allocs/op

@randall77
Contributor

I was able to reproduce @carlmjohnson's performance difference. It has nothing to do with generics per se, just an optimization that sometimes failed to apply. See #59684. A CL for a fix is in review.

@griesemer griesemer removed the TypeInference Issue is related to generic type inference label May 16, 2023
@rsc
Contributor

rsc commented Jul 15, 2023

Thanks for the discussion here. I filed #61372 to try to move a concrete plan forward.

@adonovan
Member

adonovan commented Dec 6, 2023

The {} notation is likely ambiguous, and the alternative zero approach in #61372 was eventually retracted due to lack of consensus. This proposal seems like a likely decline. Leaving discussion open for a further four weeks of final comments.

@findleyr
Contributor

findleyr commented Jan 10, 2024

Closing as no further comments.
