Description
In the following snippet, the compiler could recognize that the channel cannot be accessed concurrently by anything else, and optimize the whole loop into a straight copy of the slice contents into the channel's send queue[^1]:
```go
ch := make(chan T, len(sliceT))
for _, e := range sliceT {
	ch <- e
}
```
The main benefit is that this would avoid a lock/unlock for each element, reducing CPU utilization.
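To make the per-element cost concrete, here is a minimal benchmark sketch (not from the original report; the package name, element count, and element type are arbitrary) that times filling a freshly made buffered channel one send at a time:

```go
package chanfill

import "testing"

// BenchmarkFillChannel fills a buffered channel one element at a time.
// Each send currently acquires and releases the channel's lock, even
// though no other goroutine can possibly touch the channel yet.
func BenchmarkFillChannel(b *testing.B) {
	data := make([]int, 1024)
	for i := 0; i < b.N; i++ {
		ch := make(chan int, len(data))
		for _, e := range data {
			ch <- e
		}
	}
}
```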
A potentially better alternative (which I haven't fully thought through in terms of compliance with the spec) is to provide a way (e.g. via copy) to send/receive batches of multiple elements, or to otherwise allow the compiler to do so, regardless of whether the channel is already in use by multiple goroutines, as long as there is space in the channel and the sending/receiving code performs no other synchronization operation. This approach would still amortize the lock/unlock cost across multiple elements and would likely apply to more cases than just the specific one above.
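As a rough illustration of the intended semantics only (SendBatch is a made-up, user-level helper, not an existing or proposed API), a batched send could behave like "enqueue as many elements as fit without blocking and report how many were sent". Written in user code it still pays one lock/unlock per element, which is precisely what compiler/runtime support would eliminate:

```go
package chanbatch

// SendBatch enqueues elements from src into ch until either src is
// exhausted or the channel's buffer is full, without ever blocking.
// It returns the number of elements sent. This user-level version
// cannot amortize the channel lock; it only demonstrates the semantics.
func SendBatch[T any](ch chan<- T, src []T) int {
	for i, e := range src {
		select {
		case ch <- e:
		default:
			return i // buffer full: stop without blocking
		}
	}
	return len(src)
}
```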
I see this idiom pop up most frequently where work created upfront needs to be distributed to a bounded pool of workers and, more generally, to build iterator patterns (even though performance is often cited as a problem there). It is also used in some cases to build semaphores.
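For context, a minimal sketch of the worker-pool shape of that idiom (job values and worker count are arbitrary); the fill loop is exactly the pattern from the snippet above:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := []int{1, 2, 3, 4, 5, 6, 7, 8}

	// Work is known up front, so the channel is sized to hold all of it
	// and filled before any worker starts; each send takes the lock today.
	ch := make(chan int, len(jobs))
	for _, j := range jobs {
		ch <- j
	}
	close(ch)

	// A bounded pool of workers drains the channel.
	const workers = 3
	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func() {
			defer wg.Done()
			for j := range ch {
				fmt.Println("processed", j)
			}
		}()
	}
	wg.Wait()
}
```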
[^1]: Going one step further, if the slice itself can be proven to be dead after the loop, the copy of the slice contents and the allocation of the channel's send queue could both be skipped: the channel could adopt the slice itself as its send queue, turning an O(n) operation into an O(1) one.