Robust push/pull/graph generator primitives for Python
Python supports (lazy) generator expressions and functions:
nums = (i for i in range(100))
evens = (i for i in nums if i % 2 == 0)
squared_evens = (i * i for i in evens)
print(list(squared_evens)[0:5])
This is useful because we only iterate over each value a single time while maintaining the composability and expressiveness of the individual components. At the end of this code block all of the generators have been depleted.
There are, however, some limitations. First, consider the case where evens needs to be passed into two expressions rather than just one.
nums = (i for i in range(100))
evens = (i for i in nums if i % 2 == 0)
squared_evens = (i * i for i in evens)
cubed_evens = (i * i * i for i in evens)
print(list(squared_evens)[0:5])
print(list(cubed_evens))
Oops! The second print shows an empty list: list(squared_evens) already exhausted evens, so nothing is left for cubed_evens. The expressions are lazy but not smart enough to save each iteration for both squared_evens and cubed_evens. There are some workarounds, of course, but depending on what you do you start to lose the composability benefits.
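One standard-library workaround is itertools.tee, which splits a single iterator into independent branches. The sketch below is plain Python, not this library's API; the names evens_a, evens_b, first_squares and first_cubes are introduced here only for illustration.

```python
from itertools import tee

nums = (i for i in range(100))
evens = (i for i in nums if i % 2 == 0)

# tee() returns independent iterators over the same underlying stream,
# buffering items that one branch has consumed and the other has not.
evens_a, evens_b = tee(evens, 2)

squared_evens = (i * i for i in evens_a)
cubed_evens = (i * i * i for i in evens_b)

first_squares = list(squared_evens)[0:5]
first_cubes = list(cubed_evens)[0:5]
print(first_squares)  # [0, 4, 16, 36, 64]
print(first_cubes)    # [0, 8, 64, 216, 512]
```

This works, but it shows the cost mentioned above: the split has to be wired in by hand, the original evens must not be touched after the call to tee, and because one branch is fully drained before the other starts, tee ends up buffering every value in memory.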
Another limitation is that traditional generator expressions rely on a pull model, i.e. you are pulling results from some sort of iterable. They cannot easily address a push model, where you want to push a value or an iterable of values into a lazily linked computational chain.
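For contrast, plain Python can approximate a push model with a generator-based coroutine and its send() method. This is only a sketch of the idea, not this library's API; square_evens, stage and results are hypothetical names chosen for the example.

```python
def square_evens(sink):
    """Coroutine stage: receives pushed values and stores squares of even ones."""
    while True:
        value = yield              # pause until the caller pushes a value
        if value % 2 == 0:
            sink.append(value * value)


results = []
stage = square_evens(results)
next(stage)                        # prime the coroutine up to its first yield

for n in [1, 2, 3, 4]:
    stage.send(n)                  # push values in one at a time

stage.close()                      # shut the stage down
print(results)  # [4, 16]
```

Note how different this looks from a generator expression: the stage must be primed, fed explicitly, and closed, and chaining several stages requires passing each one its downstream target by hand.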
This library attempts to address all three use cases (pull, push, and graph-style fan-out to multiple consumers) with a syntax as intuitive as regular Python generators.
The implementation is heavily inspired by the concept of transducers (most notably from Clojure). This is a somewhat opinionated implementation of just some of the concepts of transducers, plus some directed-graph niceties. For heavy-duty applications you may want to consider a proper flow-based DAG such as Luigi.
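For readers unfamiliar with transducers, the general idea can be sketched in a few lines of plain Python: a transducer takes a reducing step and returns a new reducing step, so transformations compose independently of where the data comes from. This is an illustration of the concept only and makes no claim about how this library is implemented; mapping, filtering and append are hypothetical helper names.

```python
from functools import reduce


def mapping(fn):
    # Transducer: wraps a reducing step so each item is transformed first.
    def transducer(step):
        return lambda acc, item: step(acc, fn(item))
    return transducer


def filtering(pred):
    # Transducer: wraps a reducing step so items failing pred are skipped.
    def transducer(step):
        return lambda acc, item: step(acc, item) if pred(item) else acc
    return transducer


def append(acc, item):
    # Base reducing step: collect items into a list.
    acc.append(item)
    return acc


# Keep evens, then square them. The outermost transducer sees each item first.
step = filtering(lambda i: i % 2 == 0)(mapping(lambda i: i * i)(append))
squared_evens = reduce(step, range(10), [])
print(squared_evens)  # [0, 4, 16, 36, 64]
```

Because the composed step knows nothing about its input, the same pipeline could be driven by pulling from an iterable (as reduce does here) or by calling step once for each pushed value.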
Other projects doing something similar: