A "yield return" implementation for Java


If you've ever used more than one programming language, you've probably found that - as you switch between them - there's usually some feature that you've left behind that you really miss. It could be a particular class, syntactic sugar, or language construct that you just can't do without. When I went from doing development primarily in Python and C# to doing development primarily in Java, one of the little things I missed from those languages was generators.

Generators are a simple but powerful tool for creating iterators. They are written like regular functions but use a "yield" statement whenever they want to return data. Each time next() is called in your for-each loop, control passes to the generator and it resumes right where it left off - its local variables and execution state are automatically saved between calls. The generator returns control back to your loop when it "yields" the next value in the iteration.

Generators - like iterators - can confer some nice performance benefits. They produce values lazily (on demand), so the full sequence never has to be held in memory at once, which translates to lower memory usage. Furthermore, we do not need to wait until all the elements have been generated before we start to use them.

It's important to note that anything that can be done with generators can also be done with iterators. What makes generators so nice and compact is that the iterator(), hasNext() and next() methods are all created automatically for you. This makes generators easier to write and much clearer than an iterator-based approach. You don't have to write a ton of boilerplate and a mini state machine just to keep track of your progress.
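To make the boilerplate concrete, here is a minimal sketch of a hand-written iterator. The `Fibonacci` class and its fields are hypothetical names chosen for illustration; they are not part of any framework discussed here. Notice that the loop state a generator would keep in plain local variables has to be promoted to fields and advanced by hand.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example: the first `count` Fibonacci numbers as a hand-written Iterable.
class Fibonacci implements Iterable<Long> {
    private final int count;

    Fibonacci(int count) {
        this.count = count;
    }

    @Override
    public Iterator<Long> iterator() {
        return new Iterator<Long>() {
            // The "mini state machine": state a generator would hold in local variables.
            private long current = 0;
            private long following = 1;
            private int produced = 0;

            @Override
            public boolean hasNext() {
                return produced < count;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                long result = current;
                long sum = current + following;
                current = following;
                following = sum;
                produced++;
                return result;
            }
        };
    }

    public static void main(String[] args) {
        for (long value : new Fibonacci(6)) {
            System.out.println(value);
        }
    }
}
```

All of this scaffolding exists only to express what a generator would write as a three-line loop around a "yield".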

By now, you've probably picked up that Java doesn't have generators. Worse still, Java (the language, not the JVM) doesn't have built-in support for continuations - a useful building block for implementing generators (though there are some third-party add-ons like Apache Commons Javaflow that implement them via bytecode manipulation). Fortunately, not all is lost. A bit of Googling turned up an excellent blog post by Jim Blackler, who had implemented a Yield/Return framework in Java. After ironing out a few bugs in the framework (and passing those patches back to Jim), we at Zoom adopted it for use in our production environment.

Jim's framework is based on a traditional producer/consumer model. In it, two threads effectively do the work of one. Control passes between the "worker" thread (which computes your iteration's values) and the managing thread (which implements the java.util.Iterator logic). Each invocation of "Iterator.next()" causes the worker thread to wake up and compute the next result, which it puts into a Java SynchronousQueue. The worker thread goes back to sleep and the manager thread pops the queue, returning the newly-computed value to your "for" loop.
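The hand-off described above can be sketched roughly as follows. This is not Jim's actual code, just a minimal illustration of the idea under some assumptions: the class name `HandOffIterator`, the `DONE` end-of-sequence marker, and the choice of squares as the computed values are all made up for this example. It relies on the fact that `SynchronousQueue.put()` blocks until another thread calls `take()`, so the worker runs at most one value ahead of the consumer.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;

// Hypothetical sketch: a worker thread hands squares 1, 4, 9, ... to the consuming thread.
class HandOffIterator implements Iterator<Integer> {
    // Sentinel marking the end of the sequence (SynchronousQueue rejects null).
    private static final Object DONE = new Object();

    private final SynchronousQueue<Object> queue = new SynchronousQueue<>();
    private Object pending; // value already taken by hasNext() but not yet returned

    HandOffIterator(int limit) {
        Thread worker = new Thread(() -> {
            try {
                for (int i = 1; i <= limit; i++) {
                    queue.put(i * i); // blocks here until the consumer takes the value
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Daemon thread, so an abandoned iteration cannot keep the JVM alive.
        worker.setDaemon(true);
        worker.start();
    }

    @Override
    public boolean hasNext() {
        if (pending == null) {
            try {
                pending = queue.take(); // wakes the worker blocked in put()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                pending = DONE;
            }
        }
        return pending != DONE;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Object value = pending;
        pending = null;
        return (Integer) value;
    }

    public static void main(String[] args) {
        Iterator<Integer> squares = new HandOffIterator(4);
        while (squares.hasNext()) {
            System.out.println(squares.next());
        }
    }
}
```

A real framework also has to deal with exceptions thrown by the worker and with consumers that stop iterating early; this sketch only covers the happy path.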

By itself, Jim's framework got us lazy computation and a more natural programming model than straight Java Iterators, which was a huge improvement. But much like iterators, it required writing a lot of boilerplate code. Writing anything in Java requires writing a lot of boilerplate, but we strive to keep that to a minimum. To solve this, we wrote a really simple "Generator" base class that is easily extensible. It implements the Iterable interface, so you can use Generators wherever you would use a Java 1.5-style "for-each" loop. All it requires is that you implement your business logic inside of a "run()" method and return values via its "yield" method; it figures out the rest for you.
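To give a feel for the programming model, here is a simplified stand-in for such a base class together with a subclass that uses it. It is a sketch under assumptions, not the real implementation: the method signatures, the `DONE` sentinel, the single-use restriction, and the `Countdown` example are all invented for illustration. One caveat worth knowing: on Java 14 and later, `yield` is a restricted identifier, so a method with that name has to be called with a qualifier such as `this.yield(...)`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;

// Hypothetical, simplified base class; the real Generator may differ in every detail.
abstract class Generator<T> implements Iterable<T> {
    private static final Object DONE = new Object(); // end-of-sequence marker
    private final SynchronousQueue<Object> queue = new SynchronousQueue<>();
    private boolean started;

    // Subclasses put their business logic here and call yield(...) for each value.
    protected abstract void run() throws InterruptedException;

    // Hands one non-null value to the consumer, blocking until it is taken.
    protected void yield(T value) throws InterruptedException {
        queue.put(value);
    }

    @Override
    public synchronized Iterator<T> iterator() {
        if (started) {
            throw new IllegalStateException("this sketch supports a single iteration only");
        }
        started = true;
        Thread worker = new Thread(() -> {
            try {
                try {
                    run();
                } finally {
                    queue.put(DONE); // always signal the end so the consumer cannot hang
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();

        return new Iterator<T>() {
            private Object pending; // value taken by hasNext() but not yet returned

            @Override
            public boolean hasNext() {
                if (pending == null) {
                    try {
                        pending = queue.take();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        pending = DONE;
                    }
                }
                return pending != DONE;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T value = (T) pending;
                pending = null;
                return value;
            }
        };
    }
}

// Hypothetical subclass: counts down from a starting number to 1.
class Countdown extends Generator<Integer> {
    private final int from;

    Countdown(int from) {
        this.from = from;
    }

    @Override
    protected void run() throws InterruptedException {
        for (int i = from; i > 0; i--) {
            this.yield(i); // qualified call, required on Java 14 and later
        }
    }
}
```

With this in place, `for (int n : new Countdown(3)) { ... }` visits 3, 2, 1, and the subclass contains nothing but the loop that produces the values.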