Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: non-reference channel types #28366

Open
bcmills opened this issue Oct 24, 2018 · 22 comments

Comments

@bcmills
Copy link
Member

commented Oct 24, 2018

Background

I've started using 1-buffered channels pretty heavily as “selectable mutexes” (see also #16620). Unbuffered channels are also quite common in Go APIs (for example, as cancel channels in a context.Context; see also #28342 (comment)).

I've noticed several issues that emerge with the existing chan types:

  1. A channel's buffering semantics are often semantically important: for example, a 1-buffered channel can be used as a mutex, but a 2-buffered or unbuffered channel cannot. The need to document and enforce these important properties leads to a proliferation of comments like // 1-buffered and never closed.

  2. Struct types that require non-nil channels cannot have meaningful zero-values unless they also include a sync.Once or similar to guard channel initialization (which comes with losses in efficiency and clarity). Writing New functions for such types is a source of tedium and boilerplate.

  3. Currently, channels are always references to some other underlying object, typically allocated and stored on the heap. That makes them somewhat less efficient that a sync.Mutex, which can be allocated inline within a struct whose fields it guards.

Proposal

I propose a new family of channel types, with a relationship to the existing channel types analogous to the relationship between arrays and slices: the existing chan types are conceptually references to underlying instances of the proposed types.

Syntax

StaticChannelType = "chan" "(" Expression [ "," "close" ] ) ")"  ElementType .
CloseOnlyChannelType = "chan" "_" .

Parsing this syntax has one caveat: if we read a selector expression in parentheses after a chan keyword, we don't know whether it is an Expression or the ElementType until we see the next token after the closing parenthesis. I'm open to alternatives.

Semantics

A StaticChannelType represents a non-reference channel with a buffer size indicated by the Expression, which must be an integer constant expression. A chan(N, close) T can be closed, while a chan(N) T (without the close token) cannot. The buffer size is equivalent to the size passed to make: a call to make(chan T, N) conceptually returns a reference to an underlying channel of type chan(N, close) T.

A CloseOnlyChannelType is a channel with element type struct{} that does not support send operations.

An addressable chan(N[, close]) T can be used with the send and receive operators, range loop, select statement, and close, len, and cap builtins just like any other channel type (just as [N]T supports the same index expressions as []T). A send expression on a chan _ is a compile-time error. (Compare #21069.) A close call on a chan(N) T (declared without the close token) is a compile-time error.

chan(N) T is not assignable to chan T, just as [N]T is not assignable to []T. However, *chan(N) T (a pointer to a static channel) can be converted to chan T or either of its directional variants, and chan _ can be converted to chan struct{} or either of its directional variants. (Making *chan(N) T assignable to chan T also seems useful and benign, but I haven't thought through the full implications.)

A close on a chan T that refers to a non-closeable chan(N) T panics, much as a close on an already-closed chan T panics today. (One already cannot, in general, expect to close an arbitrary channel.) A send on a chan _ panics, much as a send on a closed channel panics today.

Implementation

A chan(N) T is potentially much more compact that a chan T, since we can know ahead of time that some of its fields are not needed. (A chan _ could potentially be as small as a single atomic pointer, storing the head of a wait-queue or a “closed” sentinel.) The difficult aspect of a more optimized implementation is the fact that a *chan(N) T can be converted to chan T: we would potentially need to make chan T a bit larger or more expensive to accommodate the fact that it might refer to a stripped-down statically-sized instance, or else apply something akin to escape analysis to decide whether a given chan(N) T should use a compact representation or a full hchan.

However, we could already do something like that optimization today: we could, for example, detect that all of the functions that return a non-nil *S for some struct type S also populate it with channels of a particular size, and that those channels do not escape the package, and replace them with a more optimized implementation and/or fuse the allocations of the object and its channels. It probably wouldn't be quite as straightforward, but it could be done.

I want to emphasize that the point of this proposal is to address the usability issues of reference channels: the ad-hoc comments about buffering invariants, lack of meaningful zero-values, and make boilerplate. Even if non-reference channels do not prove to be fertile ground for optimization, I think they would significantly improve the clarity of the code.

Examples

package context

type cancelCtx struct {
	Context

	mu       sync.Mutex            // protects following fields
	done     chan _
	children map[canceler]struct{} // set to nil by the first cancel call
	err      error                 // set to non-nil by the first cancel call
}

func (c *cancelCtx) Done() <-chan struct{} {
	return (<-chan struct{})(&c.done)
}
package deque

type state {
	front, back []interface{}
}

// A Deque is a synchronized double-ended queue.
// The zero Deque is empty and ready to use.
type Deque struct {
	contents chan(1) state
	empty chan(1) bool
}

func (d *Deque) Push(item interface{}) {
	var st state
	select {
	case d.empty <- true:
	case st = <-d.contents:
	}
	st.back = append(st.back, item)
	d.contents <- st	
}

[…]

CC: @ianlancetaylor @griesemer @josharian

@bcmills bcmills added this to the Go2 milestone Oct 24, 2018

@gopherbot gopherbot added the Proposal label Oct 24, 2018

@deanveloper

This comment has been minimized.

Copy link

commented Oct 24, 2018

So if these are non-reference, would that mean that its buffer gets copied if I do x := y?

I like the idea of a close-only channel, though.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Oct 24, 2018

@deanveloper, it would presumably be an error to copy a non-reference channel after its first use, in the same way that it is already an error to copy the types defined in the sync package (and atomic.Value).

I'm not sure whether that's something we could easily enforce in the language spec, but we could certainly apply vet rules and/or the race detector.

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Oct 25, 2018

It feels a bit weird that the channel operators seem to operate on the channel value but really it seems like you're wanting semantics like sync.Mutex, which operates on the pointer value.

You say:

A chan(N[, close]) T can be used with the send and receive operators...

The above statement seems to imply that the value doesn't need to be addressable, so logically this should work:

(map[bool]chan(1) bool{})[true] <- true

but I'm not sure how you'd make it work.

I wonder whether it might be better to say that the operators work only on *chan(N), and do some syntax sugaring to take the address, similar to calling a method on an addressable value.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Oct 25, 2018

@rogpeppe, that's a good point.

Following on the array analogy: today, you can read from an unaddressable array but not assign to its elements. The use-case for arrays is to index into a literal, but there is no such thing as a channel-buffer literal — and I am not proposing to add one here — so I suspect that it's best to require that the channels be addressable.

@rogpeppe

This comment has been minimized.

Copy link
Contributor

commented Oct 25, 2018

Continuing the array analogy a little, the size argument is perhaps similar enough to the size of an array that square brackets would be justified instead of round.

That is, chan[1] int.

I think it probably makes sense to allow directional specifiers too, so *chan[1] int could be assigned to *<-chan[1] int.

I worry that having another way to pass channels around could split conventions and make APIs less interoperable.

@magical

This comment has been minimized.

Copy link
Contributor

commented Oct 25, 2018

Continuing the array analogy a little, the size argument is perhaps similar enough to the size of an array that square brackets would be justified instead of round.

That would be nice, but seems like it would conflict with existing syntax: chan[1] int already means "a channel of [1]int". https://play.golang.org/p/iucwp16UKgw

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Oct 25, 2018

Square brackets are more natural, but unfortunately ambiguous. Consider:

chan[1] int

vs.

chan [1]int
@bcmills

This comment has been minimized.

Copy link
Member Author

commented Oct 26, 2018

Re directions and having another way to pass channels around: I expect that public APIs would continue to use the existing reference channel types rather than pointers to statically-sized channels, for much the same reason that we tend to pass slices instead of pointers-to-arrays. If you only have a reference to a channel — or to just one direction of a channel — you have little reason to care about its allocation or buffering.

(Plus, as a practical matter, passing channel references instead of pointers discourages callers from playing games with unsafe.Pointer to access the opposing direction.)

@deanveloper

This comment has been minimized.

Copy link

commented Oct 26, 2018

@deanveloper, it would presumably be an error to copy a non-reference channel after its first use, in the same way that it is already an error to copy the types defined in the sync package (and atomic.Value).

I'm not sure whether that's something we could easily enforce in the language spec, but we could certainly apply vet rules and/or the race detector.

I feel like it's a bit clunky to not be able to copy a built-in type, and it'd be pretty easy to do without thinking, especially because they are so similar to regular channels (which can be copied, of course). Maybe if there were a defined behavior for when one of these gets copied? I really like the rest of this.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 4, 2018

it would presumably be an error to copy a non-reference channel after its first use

There are no types in the language that work like that. You do mention sync.Mutex and atomic.Value and friends, but those are in the standard library. In the language, all types are copyable.

But as far as I can see the only way to implement that would be to make non-reference channels be pointers, and in that case we no longer have an easy way to implement the zero value.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Dec 5, 2018

There are no types in the language that work like that. You do mention sync.Mutex and atomic.Value and friends, but those are in the standard library.

Yep. I think that's probably the biggest drawback of this proposal.

In the language, all types are copyable.

We could address that with a complementary change to the language (see previously #23764 and #8005 (comment)), but of course a general language feature would raise the cost of this proposal considerably.

as far as I can see the only way to implement that would be to make non-reference channels be pointers, and in that case we no longer have an easy way to implement the zero value

Since the non-reference channel types would be built in, we could define copying a value to be a compile-time error. (Essentially, we could treat these types as “not assignable” even to the same type.)

A single non-copyable type family would be a (minor) wart on the language, but arguably chan is already warty: channels and interfaces are the only types that do not have corresponding literals, and there is another proposal to address that for interfaces (#25860). The wart of being the only uncopyable type doesn't seem much worse than being the only type without literals.

(As a side benefit, making these channels uncopyable would provide [0]chan _ as a convenient marker for noncopyable types, addressing #8005 albeit in a roundabout way.)

@networkimprov

This comment has been minimized.

Copy link

commented Dec 5, 2018

Could copy create a new channel, with the same characteristics, but an empty buffer? I think that's what you'd expect if you copied a struct containing these...

@deanveloper

This comment has been minimized.

Copy link

commented Dec 6, 2018

Then it wouldn't be binary-equivalent which is something I'd expect from copying a built-in type

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Dec 6, 2018

@networkimprov, sends to the copy would not be received via the original (and vice-versa). That seems likely to cause very subtle bugs.

@networkimprov

This comment has been minimized.

Copy link

commented Dec 6, 2018

@bcmills that's what I expect in copying a container with this channel. Where the channel coordinates methods of its parent struct, a copy of the struct needs a channel with no link to the original.

A channel is a mailbox with unique address, and transient senders, receivers, and buffered items. If you duplicate a mailbox, you obtain a new address. (A proposal for tee-ing a channel to clone sent data is #26282.)

I think a type which blocks container copies sort-of defeats the goal of a make-less channel.

@deanveloper

This comment has been minimized.

Copy link

commented Dec 6, 2018

If you duplicate a mailbox, you obtain a new address

Right, but we're talking about copying, not duplicating. If you copy a mailbox, I'd expect it to have all of the same data as the old one.

var a chan(5) int
a <- 2
a <- 1

b := a
x := <-b // I'd personally expect 2 to come out of this
@bcmills

This comment has been minimized.

Copy link
Member Author

commented Dec 6, 2018

Channels are used for synchronization. Duplicating a channel's contents could cause a reference that is supposed to be unique to be duplicated. But failing to duplicate its contents would be even more surprising, like failing to copy the data in an array.

Both of those options are subtle and arguably confusing. That's why I believe the clearest behavior is to disallow copying.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Dec 6, 2018

@networkimprov

I think a type which blocks container copies sort-of defeats the goal of a make-less channel.

Could you elaborate? The examples at the top of the proposal do not require copying: the goal is to enable use-cases like those examples, in which a channel is one of potentially many members of a larger structure.

@networkimprov

This comment has been minimized.

Copy link

commented Dec 6, 2018

We instantiate structs via new & { } or assignment. For a struct with a channel, a chan(...) eliminates the need for a constructor to make the channel, but not the need for a copy-constructor if a chan(...) cannot be instantiated by assignment.

A channel isn't analogous to an array, whose elements are readable ad infinitum. A channel item is read once, by a caller who has the unique address for the channel.

EDIT:
And of course assignment is not a deep copy; the channel buffer is not directly accessible and should be considered "deep".

@networkimprov

This comment has been minimized.

Copy link

commented Dec 7, 2018

It might help to have literals...

type T struct {
                         // "value" channel
   a chanv int           // defaults {0, !close}
   b chanv int (1,close) // literalesque declaration
}

t1 := T{a: chanv int{1, close}, b: chanv int{}} // adjusted
t2 := T{}                                       // use defaults

s1 := []chanv int{{1, close}, ...}
s2 := []chanv int (1,close) {{}, ...}
@networkimprov

This comment has been minimized.

Copy link

commented Dec 11, 2018

And another alternative would be defined-type default initializers; see #28939 (comment)

EDIT:
Default initializers would allow ready-to-wear channels in defined types without a new chan variant.

type T struct {
   c chan int
   i int
}
init T{c: make(chan int, 1), i: 42} // permit constant or make(T, constant)
copy T{c: make(chan int, 1)}        // copy elements not listed

func f() {
   var t T   // use init
   a := t    // use copy
   t.c <- 1
   a.c <- 1
}
@petar-dambovaliev

This comment has been minimized.

Copy link

commented Mar 6, 2019

While you make some good points, the result of this proposal does not feel like Go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.