Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: spec: make fewer types nillable #28133

Open
zenhack opened this issue Oct 10, 2018 · 46 comments
Open

proposal: Go 2: spec: make fewer types nillable #28133

zenhack opened this issue Oct 10, 2018 · 46 comments

Comments

@zenhack
Copy link

@zenhack zenhack commented Oct 10, 2018

Apologies if this has been suggested before; I've been unable to find an issue.

Currently go has many basic types where the zero value is nil. nil is a huge source of bugs, and a potential Go 2, where backwards-incompatible changes are on the table, offers an opportunity to make the language more robust by reducing its scope.

Specifically, in some cases there is an obvious zero-value that makes more sense than nil. Off the top of my head:

  • The zero value for slices could be a slice with length and capacity of zero.
  • The zero value for maps could be an empty map.
  • The zero value for channels could be an unbuffered channel.

This is more in keeping with the design principle of making the zero value useful.

There isn't an obvious good zero value that I can think of for pointers and interfaces, but even this partial reduction would make it easier to build reliable software.

@bcmills bcmills changed the title Go 2: make fewer types nillable proposal: Go 2: spec: make fewer types nillable Oct 10, 2018
@gopherbot gopherbot added this to the Proposal milestone Oct 10, 2018
@bcmills
Copy link
Member

@bcmills bcmills commented Oct 10, 2018

nil slices and channels are, in my experience, not usually a significant problem.

nil maps and nil pointers, on the other hand...

@zenhack
Copy link
Author

@zenhack zenhack commented Oct 10, 2018

I've definitely run into bugs from nil slices (less so channels, but it has happened, and it seems like a straightforward fix). Agree that maps and (especially) pointers are a much bigger problem. Killing nil pointers would be huge but I don't know how to do it while still having zero values.

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 10, 2018

Related: #22729

@magical
Copy link
Contributor

@magical magical commented Oct 10, 2018

It's worth noting that nil channels have useful properties in select statements: sending or receiving on a nil channel always blocks, so nil-ing out a channel before doing a select effectively removes cases involving that channel from consideration, allowing one to dynamically enable or disable cases at runtime.

That said, it's definitely annoying to have to explicitly initialize channels.

@deanveloper
Copy link

@deanveloper deanveloper commented Oct 10, 2018

The zero value for slices could be a slice with length and capacity of zero.

That's already what it does :) A nil slice has no behavior difference from a zero-length-zero-capacity slice, except that s == nil is true.

I love this though. Zero values should be useful, begone with nil for things that don't need it!

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 10, 2018

One of the advantages of the current approach is that the zero values for all types is the value with all bits zero. This makes it easy to initialize variables. Of course that is not an absolute requirement; we could change that with some loss of efficiency. But it's the main reason that nil channels are not useful.

@robpike
Copy link
Contributor

@robpike robpike commented Oct 10, 2018

But nil channels are useful: They signal that the communication cannot happen and thereby provide the ability to control individual cases in select statements.

@zenhack
Copy link
Author

@zenhack zenhack commented Oct 10, 2018

I don't feel as strongly about channels, and apparently I was just plain mistaken re: slices (though it would probably be more intuitive for them to be non-nillable; most of the code I've seen in the wild that cares about nil slices seems to be written under the assumption that the same problems crop up as with pointers).

Making maps non-nillable seems like it is a very big win for robustness though.

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 10, 2018

Currently map is a covert pointer, whereas slice is a covert struct.

A Go2 default-empty map probably shouldn't allocate a map object until necessary, but then every m[k]=v incurs a test. Today every append(s, v) incurs a test, so maybe that's fine?

@zenhack
Copy link
Author

@zenhack zenhack commented Oct 10, 2018

m[k]=v already has to do a nil check, but right now it panics instead of initializing the map.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 10, 2018

Note that m[k] = v can not initialize the map if it is nil. Because maps have reference semantics, if we are going to initialize them automatically, we must do so at the point of declaration. Consider

func Add(m map[int]int) { m[0] = 1 }
func F() {
    var m map[int]int
    Add(m)
    fmt.Println(m[0])
}
@zenhack
Copy link
Author

@zenhack zenhack commented Oct 10, 2018

@networkimprov, that doesn't fix the problem; if it's an unboxed struct and you assign it, then use one copy or the other, they end up being different maps, since initializing one pointer doesn't initialize the other.

Doing up-front allocation would at least have the upside of allowing the nil check to always be omitted. Probably still a loss overall wrt perf.

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 10, 2018

Sorry, just deleted the comment you refer to (re making map a covert struct like slice), as I realized that slice has the same issue @ianlancetaylor described :-)

However that code works if map is a covert **MapType. But that incurs an extra dereference :-(

@zenhack
Copy link
Author

@zenhack zenhack commented Oct 10, 2018

But then you still have to allocate the (outermost) pointer up front.

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 11, 2018

Given all that, and the need for reference semantics, maybe default-empty maps don't fly.

@bronze1man
Copy link
Contributor

@bronze1man bronze1man commented Oct 12, 2018

@neganovalexey
I think slice is not a reference ,but map is a reference is also a source of bug.
How about add another map like smap which is not a reference, and make it default-empty, and leave the origin map as it is.
This way may even save some allocs in some cases.

@wsc1
Copy link

@wsc1 wsc1 commented Oct 19, 2018

I think the most important thing to keep in mind for any changes to nil-able things is consistency.

One thing that is consistent about the current semantics is that all nil-able (atomic) things are in some sense a reference or pointer and vice versa: all reference or pointer (atomic) things are nil-able. If you start making exceptions to that, then there are more rules to keep in mind and more clutter of specifications.

@mateuszmmeteo
Copy link

@mateuszmmeteo mateuszmmeteo commented Jan 29, 2019

I'm in opposite of the author - NIL is very good idea. If value is not set then it should not be set. That's required action in most of cases.

Think about all the int's not touched and set up as 0 by default (example). This will not prevent from the bugs but make them more.

Now you can simply check if the variable had been set (that was the intention of developer). With new solution proposed variable will always have some value even if not touched.

Language should not be responsible for fixing developers bugs - developers are.

@deanveloper
Copy link

@deanveloper deanveloper commented Jan 29, 2019

@mateuszmmeteo I'd recommend giving this a read - https://blog.valbonne-consulting.com/2014/11/26/tony-hoare-invention-of-the-null-reference-a-billion-dollar-mistake/

There are of course several reasons to have and not to have nil. Sometimes nil really is needed, but at other times it is not, and I think that this proposal does a good job of illustrating a couple places where nil really isn't needed

@mateuszmmeteo
Copy link

@mateuszmmeteo mateuszmmeteo commented Jan 29, 2019

@mateuszmmeteo I'd recommend giving this a read - https://blog.valbonne-consulting.com/2014/11/26/tony-hoare-invention-of-the-null-reference-a-billion-dollar-mistake/

There are of course several reasons to have and not to have nil. Sometimes nil really is needed, but at other times it is not, and I think that this proposal does a good job of illustrating a couple places where nil really isn't needed

Ok, and regarding memory use? If you for example define very BIG map of LARGE objects when you want to allocate memory? Already on definition (because it will be zeroed with this example) and you need to have CAP to put objects there or later on first insert, what will generate unnecessary delay to reserve memory and then put new object there?

What about garbage collector? When we should consider that this LARGE object (slice, map) is not needed? What if developer will declare it but will not use it? Once again shout that this has not been used but defined? With existing scenario impact will be very low. On zeroed map the memory needs to be reserved for inserting.

Finally what is faster: checking that slice, map and channels have a some length or they are NIL?

@zenhack
Copy link
Author

@zenhack zenhack commented Jan 29, 2019

Others have pointed out the perf related issues with nil maps, and those are valid. The other cases are kindof moot; I'd misunderstood the semantics in the first place. Channels have useful behavior that I hadn't realized. I no longer feel strongly about this proposal either way.

Finally what is faster: checking that slice ... have a some length or they are NIL?

Per discussion above, slices already do what I suggested. The check should be exactly the same; either way you're pulling out an integer-sized field and comparing it to zero.

I think it would make sense to have nil not be something that can be compared/assigned to slices, just for clarity -- I know I'm not the only person who's misunderstood this.

@creker
Copy link

@creker creker commented Jan 30, 2019

@mateuszmmeteo making nil map valid by itself does not remove the ability to make maps with required capacity. Just like with slices, you either use the default or make the one you want.

Performance is a question of implementation, not semantics. Maps could be lazily allocated to avoid unnecessary garbage when it's not used. Again, like slices. The problem is that with slices we have slice = append(slice, ...) and that maps really well to lazy allocation. With maps we have m[key] = value and it's not obvious how would you allocate new map here.

Finally what is faster: checking that slice, map and channels have a some length or they are NIL?

For slices - the same. Slice is a struct with three fields - pointer, length and capacity. Nil slice check extracts pointer and checks it for nil. Exactly the same as extracting the length and checking it for zero. Again, that's an implementation question and necessary optimization can be done there.

@taralx
Copy link

@taralx taralx commented Feb 28, 2019

nil maps drive me nuts. They're the largest source of "constructor" functions in my code. If I really want reference semantics for a map, I could always use *map[K]V. 👍 to maps having useful zero values and, if necessary, losing their reference semantics.

@networkimprov
Copy link

@networkimprov networkimprov commented Feb 28, 2019

Another way to get ready-to-wear maps is via default initializers, which run in lieu of zeroing a new value.

See #28366 (comment)

@atombender
Copy link

@atombender atombender commented Jun 26, 2019

It's worth noting that nil channels have useful properties in select statements

Useful, but also dangerous, because it's too easy to inadvertently select on a nil channel. For example, in the beginning of my Go career I had a case that went something like this:

myChan := make(chan bool)
defer func() {
  if myChan != nil {
    close(myChan)
    myChan = nil
  }
}()
go func() {
  for {
    select {
      case <-myChan:
        // ...
    }
  }
}()
// ...
if something {
  doMoreStuffWithThatChannel(myChan)
  myChan = nil // Prevent closing in the defer
}

This would cause the goroutine to suddenly block on nil. The only way to avoid this is to alias the channel to a local variable inside the goroutine.

I've also made other kinds of goofs, like accidentally listening on a nil channel before it's initialized.

Sometimes it's necessary to know whether one is done using a channel; in other words, a flag to determine if a channel is closed, so that one can prevent double-closing a channel. Since nil channels have special semantics, one must never use nil for this. Always use a separate variable:

// BAD:
func (p *Producer) shutdown() {
  if p.someChan != nil {
    close(p.someChan)
    p.someChan = nil
  }
}

// GOOD
func (p *Producer) shutdown() {
  if !p.shutdown {
    p.shutdown = true
    close(p.someChan)
  }
}

There are just too many gotchas. This is a case of Go's design optimizing for the wrong thing.

As an aside, close() on a nil channel will, somewhat surprisingly, panic, violating the principle of idempotency. Closing something that is closed should not error catastrophically. If you look at POSIX/C, for example, close() merely returns an error if you try to close something twice.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jun 26, 2019

The handling of nil channels in select is essential for various kinds of channel handling.

The close function on a channel is probably misnamed. It is not a close operation. It is a send operation that tells that receiver that there will be no more data. The language doesn't support multiple close calls on a channel because it doesn't make sense to report "there will be no more data" more than once. Any such dual reports may be a race condition.

@atombender
Copy link

@atombender atombender commented Jun 26, 2019

I understand why it is this way; I am explaining why it's language design. It conflates three unrelated concepts — nil values, zero values and a particular kind of channel state. Nil is a value, not state.

@peterbourgon
Copy link
Member

@peterbourgon peterbourgon commented Mar 12, 2020

I make good use of nil channels in certain types of state machines. But I never make good use of nil maps. A lot of my code would be significantly simplified, and improved, if the following were possible:

var m map[int]int
m[1]++

That this panics is especially frustrating, given that this doesn't:

var m map[int]int
println(m[1]) // Output: 0

This might be my most-wanted language improvement.

@leighmcculloch
Copy link
Contributor

@leighmcculloch leighmcculloch commented May 22, 2021

This would significantly improve my use of maps, but not for the var m map[int]int case expressed above.

Whenever I declare a map in local scope I more routlinely use this form which initializes the map:

m := map[int]int{}

👍🏻 Where auto-initializing maps would help is when they are a field of a struct. When making structs that contain maps we either need to:

  • Make a New* function to handle map setup.
  • Have every function of the type check if the map is nil and initialize it if so.
type S struct {
	m map[int]int
}

func (s *S) DoAThing() {
	if s.m == nil {
		s.m = map[int]int{}
	}
	m[1]++
}

Auto-initializing maps would help with maps as struct fields.


😕 Auto-initializing maps would change behavior for JSON encoded types though. The following code encodes M as null, but it would start encoding as {}. That seems like a pretty subtle and potentially difficult breaking change to check for.

type S struct {
	M map[int]int `json:"m"`
}

func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(S{}))
	fmt.Println(mustJSON(S{map[int]int{}}))
}

💡 Instead of auto-initializing maps, if we could specify default values for struct fields, then that would alleviate one of the larger pain points of using maps inside structs.

type S struct {
	m map[int]int = map[int]int{}
}

func (s *S) DoAThing() {
	m[1]++
}
@alnr
Copy link

@alnr alnr commented May 22, 2021

It may be of note that even if channels were constructed as unbuffered channels instead of nil when writing

var c chan int

that doesn't preclude the possibility of setting the channel to nil explicitly.

c = nil

In fact, most or all uses of nil channels I've seen start out with a non-nil channel and explicitly set it to nil before a select to selectively disable that particular channel.

Personally, I feel all of

var s []int
var m map[int]int
var c channel int

being nil violate the principle of "making the default value useful". Whether this is fixable without huge breakage I don't know.

@atdiar
Copy link

@atdiar atdiar commented May 22, 2021

Nil is useful to encode state.
The only issue is that it seems to be a bit hard to use for typestate.
If it doesn't make things too complex, it could be useful to be able to pick which operations are available to the nil value/state of a given type.

I have used nil maps in real programs to distinguish type states.

@peterbourgon
Copy link
Member

@peterbourgon peterbourgon commented May 22, 2021

Nil is useful to encode state.

All properties of a language can be made useful somehow. The question is about the net balance of cost vs. benefit. And I don't think there's any way to argue that a usable zero-value map would be net worse than the current behavior.

@leighmcculloch
Copy link
Contributor

@leighmcculloch leighmcculloch commented May 22, 2021

Rather than change maps such that their zero value is initialized, which is a breaking change, we could auto-initialize a map when setting a key on a nil map in a similar way that appending to a slice does.

Accessing a value from a nil map already returns the zero value and ok false without initializing the map, so this makes setting more consistent with access.

This would preserve existing behavior where nil maps are serialized differently than initialized maps, and nil checks on maps would continue to work.

This would be a breaking change only for code that relies on a panic when setting on a nil map, which is not particularly sensible.

@creker
Copy link

@creker creker commented May 22, 2021

@alnr

being nil violate the principle of "making the default value useful"

How so? Apart from maps, other two are useful. Nil slices can used as is and semantically equivalent to empty slice. Nil channels has specific semantics and whether in most cases people allocate channels and only nil them after doesn't matter here. But all of them have one big advantage - they do not create garbage by default. If we start creating objects left and right just by mere definition we will add significant GC overhead.

Maps are the only exception here. Nil maps are not useful and dangerous. The only useful feature they have is encoding nil state itself. And as was discussed before, we can't do much here. We can't change reference semantics, so we would have to allocate them somehow before use to prevent panics. That means lazy allocation on first access. Whether it's worth it is debatable. I suspect lazy allocation would mean additional overhead on every map access even after it's already allocated. And it would complicate tracking GC problems as you will be unable to identify allocation points at compile time like you can today simply by searching for make calls.

@seebs
Copy link
Contributor

@seebs seebs commented May 22, 2021

you can't detect allocation of maps by searching for make calls, because map literals can also be used to create non-nil maps.

i don't think that auto-initializing a map when setting a key on a nil map works the same way as appending to a slice, for the crucial reason that, if you pass a map into a function, there's no way for that function to know this, or initialize the caller's copy. with slices, this "works" because we already know that if you plan to append, you must take the appended-to slice as a return value because it may be a new and different slice. inserting into a map doesn't change the map object, and lots of code passes map values into functions, expecting those functions to write into the map.

that means that whatever the zero-value map is, it has to be something which will somehow magically start referring to the initialized map if you copy it, and then initialize the copy. which would almost certainly imply, in effect, an extra layer of allocations and indirection on all those maps, and also a zero value which wasn't just all-bytes-zero.

so auto-initializing a map when setting a key would produce extremely surprising behavior, in that you'd pass a map into a function, it would write to the map, and then the parent wouldn't see the changes. or it would impose significant overhead on Basically Everything.

@deanveloper
Copy link

@deanveloper deanveloper commented May 22, 2021

Maps are the only exception here. Nil maps are not useful and dangerous. The only useful feature they have is encoding nil state itself.

This is not true. For instance, some functions may take a lookup table argument. Nil maps are useful for representing a read-only lookup table with no values.

I personally don’t think that nil maps are very good, but saying that they are not useful is just untrue.

@peterbourgon
Copy link
Member

@peterbourgon peterbourgon commented May 22, 2021

@seebs

so auto-initializing a map when setting a key ... would impose significant overhead on Basically Everything.

If a map were defined as e.g. type map struct{ p *realmap } would the extra pointer dereference actually constitute "significant overhead"?

@creker
Copy link

@creker creker commented May 23, 2021

@seebs

The point still stands. With maps today you can detect allocation points just by looking at the code. If maps were to become somehow auto allocating, it wouldn't be possible anymore.

Of course it should propagate to parents somehow. Otherwise it's just broken. Probably would require double indirection and force compiler to make heap allocations even for nil maps but I think it should be possible.

@seebs
Copy link
Contributor

@seebs seebs commented May 23, 2021

I think the extra pointer dereference would be significant overhead in a couple of ways.

First, it means that the zero-valued map actually has to contain a pointer to a realmap which actually has to exist somewhere, which means an allocation. Second, it means that the zero-valued map has to not consist of all-zero bytes. And thirdly, an extra pointer indirection on every access is not free, and maybe it'll mostly stay cached, but every time it falls out of cache, it's going to cost significant time (on a memory-access time scale) to re-cache it, and if you get unlucky, it'll push the realmap's data out of the cache, because it's probably not in the same chunk of storage as anything else, and so on.

Also, unless you do extra special magic in the compiler to tell the compiler that it can be absolutely sure that the realmap pointer itself is never nil, you're adding another layer of nil checks which the compiler is not great at omitting, and which in the current regime I believe count against your inlining complexity budget, etcetera.

Basically, everything that's currently got one nil access check will become two nil access checks plus an extra dereference, etc.

So, yes, I think that overhead would be at least potentially significant. Honestly, though, the "can't be all bytes zero" thing is probably the bigger concern, but that's the one you need to get maps to have the behavior we currently expect from them, while allowing automatic allocation.

And I do concede creker's point that it's harder to detect potential map allocations under that change. With slices, we do have that because you have to have an append-or-make. Map writes not creating new maps is pretty well established.

@creker
Copy link

@creker creker commented May 23, 2021

Instead of double indirection we could add special flag that indicates "nilness" into the internal map structure itself. It already has flags field. It would require every map declaration to be actual heap allocation of the whole structure which is significantly bigger than single pointer with double indirection.

@mateuszmmeteo
Copy link

@mateuszmmeteo mateuszmmeteo commented May 23, 2021

@seebs
Copy link
Contributor

@seebs seebs commented May 23, 2021

Even then, you're still requiring the allocation, and the "zero value" which isn't zeroed bytes, and I think that's probably Too Much. I sort of get the appeal of an implicit initialization, but it implies too many expensive changes.

@creker
Copy link

@creker creker commented May 23, 2021

Not actually zeroed bytes - I don't think it's such a big problem. It would be hidden from the programmer anyway. Only unsafe could probably reveal these details and in that case it shouldn't violate backwards compatibility. nil checks would still return true as they would check against nilness flag. But allocation is a big problem, this I agree with.

@alnr
Copy link

@alnr alnr commented May 25, 2021

@mateuszmmeteo

Don’t make golang another python or node.js. Default values are bad!

Rob Pike disagrees.

You cannot detect if user provided input / object had been deserialized correctly or it’s just default value. Nil is there for a reason. As already mentioned before - nil value is a state - it tells that pointer doesn’t point to any memory address so had not been alocated. Actually all collections should be nil including channel because this one also sould not have any value. Issue with nil values is always between keyboard and chair - sorry, that’s a true.

You can always use *map[int]int to signify an optional value, and nobody is arguing the zero value for pointers should be anything but nil.

@mateuszmmeteo
Copy link

@mateuszmmeteo mateuszmmeteo commented May 25, 2021

@creker
Copy link

@creker creker commented May 25, 2021

@mateuszmmeteo you seem to confuse nil pointer with zero pointer. Nil is an abstract concept that, generally, has no value associated with it. If we preserve all the reference semantics of maps, it doesn't matter what actual default value it has in memory. It can even be heap allocated object. You can still compare it to nil and get the answer you expect. If you didn't allocate a map and didn't try to write to it (which under the proposed behavior above would implicitly allocate it) it would still be equal to nil. It wouldn't be all zero bytes or zero pointer but it would be nil. Go is not C.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet