Problem
With the proposals for the slices and maps packages, as well as other proposals for generic data structures (e.g. container/list and container/set), it has become clear that we need a standard pattern for iterating over data structures.
Common iteration patterns
From what I have seen, there are two common iterator patterns.
The first is a pretty standard pattern, aptly named the "Iterator Pattern". In this pattern, a function returns an iterator, and each call to the iterator returns the next value. This is the iteration pattern that #43557 has recommended for Go. One of the great benefits of this pattern is that it is standard among many languages. However, these iterators tend to be quite cumbersome to write, because the iterator has to manage its own state between calls. That can be extremely difficult, especially for complex data structures like maps or recursive data structures like trees.
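To make the state-management burden concrete, here is a minimal sketch of the iterator pattern for a binary tree. The names `Tree`, `TreeIter`, `NewTreeIter`, and `Next` are hypothetical and are not taken from #43557; the point is only that the iterator has to carry an explicit stack to remember where it is between calls.

```go
package main

import "fmt"

// Tree is a hypothetical binary tree node used only for illustration.
type Tree struct {
	Left, Right *Tree
	Value       int
}

// TreeIter walks a Tree in order. It keeps an explicit stack because it
// must remember its position between calls to Next.
type TreeIter struct {
	stack []*Tree
}

// NewTreeIter returns an iterator positioned before the first element.
func NewTreeIter(root *Tree) *TreeIter {
	it := &TreeIter{}
	it.pushLeft(root)
	return it
}

// pushLeft pushes n and all of its left descendants onto the stack.
func (it *TreeIter) pushLeft(n *Tree) {
	for ; n != nil; n = n.Left {
		it.stack = append(it.stack, n)
	}
}

// Next returns the next value in order, or false when the walk is done.
func (it *TreeIter) Next() (int, bool) {
	if len(it.stack) == 0 {
		return 0, false
	}
	n := it.stack[len(it.stack)-1]
	it.stack = it.stack[:len(it.stack)-1]
	it.pushLeft(n.Right)
	return n.Value, true
}

func main() {
	root := &Tree{
		Left:  &Tree{Value: 1},
		Value: 2,
		Right: &Tree{Value: 3},
	}
	it := NewTreeIter(root)
	for v, ok := it.Next(); ok; v, ok = it.Next() {
		fmt.Println(v)
	}
}
```

Even for this small tree, the traversal logic is split between `pushLeft` and `Next`, and none of it resembles the natural recursive walk.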
The second pattern is a bit more modern and is typically named the "Generator Pattern". In this pattern, the iterator is a single function which is passed a "yield" function (or, in many languages, uses a yield keyword which may only be used in generator functions). Each time yield is called, control goes to the caller of the iterator, and then control is given back to the iterator until yield is called again or the iterator ends. The amazing thing about the generator pattern is that the iterator is extremely easy to write: it ends up looking almost like a solution where one would simply append to a slice (and return the slice at the end). This is the pattern which #47707 recommends.
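For comparison, here is a minimal sketch of the generator pattern over a binary tree. It assumes a plain `yield func(int)` callback, which may differ from the exact signature #47707 uses, and the names `Tree` and `Walk` are hypothetical. All traversal state lives on the call stack.

```go
package main

import "fmt"

// Tree is a hypothetical binary tree node used only for illustration.
type Tree struct {
	Left, Right *Tree
	Value       int
}

// Walk calls yield once for every value in the tree, in order.
// A nil receiver is treated as an empty tree.
func (t *Tree) Walk(yield func(int)) {
	if t == nil {
		return
	}
	t.Left.Walk(yield)
	yield(t.Value)
	t.Right.Walk(yield)
}

func main() {
	root := &Tree{
		Left:  &Tree{Value: 1},
		Value: 2,
		Right: &Tree{Value: 3},
	}
	root.Walk(func(v int) {
		fmt.Println(v)
	})
}
```

The body of `Walk` is the ordinary recursive traversal with `yield` in place of an append, which is what makes this style easy to write.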
Using channels as iterators
In Go, we actually already have a form of iterating using the generator pattern: goroutines and channels. This can be done by writing a function which returns a receive-only channel. The function starts a goroutine, which sends values on the channel. It would look something like this, which is valid Go code today:
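A minimal sketch of such a function follows. The name `Count` and its bound `n` are assumptions made for illustration; only the `ch <- i` send is taken from the surrounding text.

```go
package main

import "fmt"

// Count is a hypothetical iterator that sends 0, 1, ..., n-1 on the
// returned channel and then closes it.
func Count(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

func main() {
	// Channels are already range-able, so no new syntax is needed.
	for v := range Count(3) {
		fmt.Println(v)
	}
}
```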
There are a few problems with this pattern in current Go. The main issue is that we must exhaust the iterator channel in order for the spawned goroutine to be destroyed. Otherwise, the goroutine in the iterator will simply hang on ch <- i forever.
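The leak can be observed with `runtime.NumGoroutine`. This is a rough sketch, with `Count` and `leaked` as hypothetical names: the consumer stops after the first value, and the sender stays blocked on `ch <- i` for the rest of the program.

```go
package main

import (
	"fmt"
	"runtime"
)

// Count is a hypothetical channel-based iterator over 0..n-1.
func Count(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i // blocks forever once the consumer stops receiving
		}
	}()
	return ch
}

// leaked abandons an iterator after one value and reports how many
// more goroutines exist afterwards than before.
func leaked() int {
	before := runtime.NumGoroutine()
	for range Count(10) {
		break // stop early without draining the channel
	}
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("goroutines left behind:", leaked())
}
```

Nothing in the consumer's code hints at the leak, which is what makes the pattern hazardous as a general-purpose iterator today.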
The second issue is performance. Channels in Go unfortunately have a lot of overhead. In my own testing, iterating over a binary tree with a recursive channel-based iterator took ~500x longer than calling a function on each element. Iterating over a slice instead of a tree, channels took ~100x longer.
There is clearly room for optimization, though: JavaScript's generators (which also use coroutines in some form) can iterate over the binary tree in only ~10x the time it takes to call a function on each element in Go.
Proposal
This proposal has three parts:
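1. Establish a standard of using channels for iterators in the standard library.
2. Reconsider #19702, so that a goroutine which can never make progress is garbage collected.
3. Add optimizations for goroutines/channels so that using them as iterators does not cost so much time.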
For part 1, this would mean we take actions such as adding Iter() <-chan T for container/list and container/set.
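As a rough sketch of what part 1 could look like, here is an `Iter` method on a generic list. `List[T]` and `Push` are hypothetical stand-ins, not the container/list API, and the method still has the goroutine-leak problem described earlier if the caller stops early.

```go
package main

import "fmt"

// List is a hypothetical generic singly linked list.
type List[T any] struct {
	head, tail *node[T]
}

type node[T any] struct {
	value T
	next  *node[T]
}

// Push appends v to the end of the list.
func (l *List[T]) Push(v T) {
	n := &node[T]{value: v}
	if l.tail == nil {
		l.head, l.tail = n, n
		return
	}
	l.tail.next = n
	l.tail = n
}

// Iter returns a receive-only channel that yields each element in
// order and is closed after the last one.
func (l *List[T]) Iter() <-chan T {
	ch := make(chan T)
	go func() {
		defer close(ch)
		for n := l.head; n != nil; n = n.next {
			ch <- n.value
		}
	}()
	return ch
}

func main() {
	var l List[string]
	l.Push("a")
	l.Push("b")
	for v := range l.Iter() {
		fmt.Println(v)
	}
}
```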
For part 2, we reconsider #19702 (such that cleanup is not done - the goroutine vanishes when it is GC'd). This allows us to spawn these goroutines, and the caller of the iterator does not need to communicate to the iterator that we are done iterating.
Part 3 is likely the least important of the three, but it still matters a great deal. Currently, this pattern is two orders of magnitude slower than calling a function on each element. Other languages (e.g. Kotlin and Rust) are already adopting this pattern with a coroutine implementation, but do not see the same immense performance hit that Go does.
The wonderful thing about this proposal is that nothing about the language itself needs to change. Tooling does not need to be updated, as channels are already range-able.
Other solutions
Many solutions were discussed in #43557, and I highly recommend checking them out. Here are a couple of other approaches that would also allow channels to be used as iterators:
Adding runtime.Deadlocked() to tell if the current goroutine is deadlocked.
This is my second favorite solution behind allowing goroutines to be GC'd.
Alternatively, this could be unexported, and imported (via //go:linkname) in something like chans.Generator. This would give the generator function a yield function, which would select between runtime.deadlocked() and sending on the iterator channel.
chans.Generator would look something like this:
// runtime.deadlocked is the proposed function; it does not exist today.
//go:linkname deadlocked runtime.deadlocked
func deadlocked() <-chan struct{}

// Generator creates an iterator based off of a generator function.
func Generator(generator func(yield func(int))) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		generator(func(incoming int) {
			select {
			case ch <- incoming:
			case <-deadlocked():
				// Nobody will ever receive again, so stop the generator.
				runtime.Goexit()
			}
		})
	}()
	return ch
}

// usage
func FibonacciIter() <-chan int {
	return chans.Generator(func(yield func(int)) {
		a, b := 0, 1
		for {
			yield(a)
			c := a + b
			a = b
			b = c
		}
	})
}
Using finalizers to allow us to close the channel.
Finalizers are a bit hacky, but they somewhat solve our problem. There is a cool example that @ianlancetaylor shows in the generics draft about Rangers. Unfortunately, these are not actually rangeable, but it does show that we could possibly use a finalizer to clean up a goroutine.
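The finalizer idea can be sketched roughly as follows. This is not the Ranger code from the generics draft; `Iter`, `Naturals`, and `Next` are hypothetical names. It relies on `runtime.SetFinalizer`, and finalizers are not guaranteed to run promptly or at all. The goroutine must not hold a reference to the handle, otherwise the handle would never become unreachable.

```go
package main

import (
	"fmt"
	"runtime"
)

// Iter is a hypothetical handle around a channel-based iterator.
// Values are read through Next rather than by ranging over the
// channel, so the handle stays reachable while it is in use.
type Iter struct {
	ch <-chan int
}

// Next returns the next value, or false once the iterator has ended.
func (it *Iter) Next() (int, bool) {
	v, ok := <-it.ch
	return v, ok
}

// Naturals yields 0, 1, 2, ... until its handle is garbage collected.
func Naturals() *Iter {
	ch := make(chan int)
	done := make(chan struct{})
	go func() {
		// This goroutine references only ch and done, never the handle.
		defer close(ch)
		for i := 0; ; i++ {
			select {
			case ch <- i:
			case <-done:
				return
			}
		}
	}()
	it := &Iter{ch: ch}
	// When the handle becomes unreachable, signal the goroutine to stop.
	runtime.SetFinalizer(it, func(*Iter) { close(done) })
	return it
}

func main() {
	it := Naturals()
	for i := 0; i < 3; i++ {
		v, _ := it.Next()
		fmt.Println(v)
	}
	// it is unreachable from here on; a later GC cycle may run the
	// finalizer and let the goroutine exit.
}
```

The need for a `Next` method instead of `range` is the cost of this approach: if the raw channel were exposed, the handle could be collected while the caller was still receiving.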
Establishing a standard for iteration
Related: #43557 and #47707