---

# Coroutines and Futures
### [Emil Sekerinski](http://www.cas.mcmaster.ca/~emil/), McMaster University, November 2023

---

<figure style="float:right;border-right:2em solid white" >
    <img style="width:90pt" src="./img/by-nc-nd.png"/>
    <figcaption style="width:90pt;font-size:x-small"><a href="https://creativecommons.org/licenses/by-nc-nd/4.0/" style="font-size:x-small">Licensed under Creative Commons CC BY-NC-ND</a>
    </figcaption>
</figure>

<img style="float:right;border-left:2em solid transparent" src="./img/Procedures.svg"/>

A call to a procedure, or _subroutine,_ allocates the local variables, starts the execution at the beginning, returns at some point, and deallocates the local variables. Procedures can call other procedures, leading to a nested calling structure. In the diagram to the right, vertical lines show the lifetime of a procedure invocation; bars indicate that the procedure is executing.

<img style="float:right;border-left:2em solid transparent" src="./img/SymmetricCoroutines.svg"/>

_Coroutines_ generalize procedures by allowing control to be _transferred_ out of a coroutine at some point and then resumed again at that point. Coroutines, like procedures, can have local variables. These are preserved when control is transferred out of a coroutine. When a coroutine is created, it starts execution at the beginning. Compared to procedures, coroutines decouple local variable allocation and transfer of control. 

The transfer from one coroutine to another can take different forms. In *symmetric coroutines*, the statement `transfer A` in coroutine `C` transfers control to coroutine `A`. All coroutines are equal; `A` can transfer back to `C` or transfer to `B`. The transfer structure is arbitrary.

<img style="float:right;border-left:2em solid transparent" src="./img/AsymmetricCoroutines.svg"/>

With *asymmetric coroutines*, if coroutine `A` creates coroutine `B`, the statement `suspend` in `B` returns control to `A`; in `A`, the statement `resume B` resumes execution in `B` at the point where it suspended. Coroutine `A` is like a caller and `B` like a callee, except that `B` resumes where it left off and its state is preserved when suspending. Like the call-return structure of procedures, the create-suspend-resume structure of asymmetric coroutines is nested.

Coroutines run concurrently but are scheduled _cooperatively,_ meaning that control is transferred only at explicit points. Only one coroutine is executed at a time. By contrast, threads (and processes) are scheduled _preemptively,_ meaning that control can be transferred at any point, typically when a time slice expires. Threads (and processes) can run in parallel. Between transfer points, a coroutine cannot be interfered with by other coroutines, unlike threads (and processes).
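Cooperative scheduling can be sketched with Python generators, which are introduced in a later section. The names `worker` and `run_round_robin` are hypothetical helpers chosen for this illustration; the scheduler is a minimal round-robin loop, not a definitive implementation. Each coroutine runs uninterrupted until it reaches a `yield`, so the interleaving is fully determined by the explicit transfer points:

```python
from collections import deque

def worker(name, steps, log):
    """A coroutine that performs `steps` units of work, yielding after each."""
    for i in range(steps):
        log.append(f"{name}{i}")  # runs without interruption until the next yield
        yield                     # explicit transfer point back to the scheduler

def run_round_robin(coroutines):
    """Resume each coroutine in turn until all have terminated."""
    ready = deque(coroutines)
    while ready:
        c = ready.popleft()
        try:
            next(c)          # resume c until its next yield
            ready.append(c)  # c suspended; schedule it again later
        except StopIteration:
            pass             # c terminated; drop it

log = []
run_round_robin([worker("a", 3, log), worker("b", 2, log)])
print(log)  # ['a0', 'b0', 'a1', 'b1', 'a2']
```

Running the program repeatedly always gives the same interleaving, which is not guaranteed for preemptively scheduled threads.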

### Rudimentary Coroutines in C

The C function `setjmp(buf)` stores the program counter and other environment data in `buf` of type `jmp_buf` and returns `0`. The function `longjmp(buf, val)` transfers control to the corresponding `setjmp(buf)` instruction, which then returns `val`. Here is an example:

In [None]:
%%writefile setjmplongjmp.c
#include <stdio.h>
#include <setjmp.h>

jmp_buf buf;

void pong() {
    printf("pong\n");
    longjmp(buf, 1);  // transfers to buf; setjmp to return 1
}

void ping() {
    pong();
    printf("ping\n"); // does not print
}

int main() {
    if (!setjmp(buf)) ping(); // when called, 0 returned; when longjmp jumps back, 1 returned
    printf("boom\n");
}

In [None]:
!cc setjmplongjmp.c -o setjmplongjmp; ./setjmplongjmp

Functions `setjmp` and `longjmp` can be used for exception handling. Below, we use them for rudimentary coroutines:
- For every coroutine, a global variable of type `COROUTINE` for the coroutine's environment is declared.
- `CREATE(env, func)` stores the environment of the current coroutine in `env` and initiates coroutine `func`.
- `TRANSFER(from, to)` stores the environment of the current coroutine in `from` and transfers control to coroutine `to`.

In [None]:
%%writefile symmetric.c
#include <stdio.h>
#include <setjmp.h>

#define COROUTINE jmp_buf
#define CREATE(env, func) if (!setjmp(env)) func;
#define TRANSFER(from, to) if (!setjmp(from)) longjmp(to, 1);

COROUTINE m, pi, po;

void pong() {
    printf("pong\n");
    TRANSFER(po, m)
    printf("pong\n");
    TRANSFER(po, pi)
}

void ping() {
    printf("ping\n");
    CREATE(pi, pong())
    printf("ping\n");
}

int main() {
    printf("boom\n");
    CREATE(m, ping())
    printf("boom\n");
    TRANSFER(m, po)
}

In [None]:
!cc symmetric.c -o symmetric; ./symmetric

If `pong` has local variables, `ping` must not call any other function: the local variables of `ping` and `pong` are allocated but not deallocated when control is transferred. If `ping` were to call another function, it would overwrite the local variables of `pong`. Above, `ping` can call `printf` only because `pong` does not have local variables. Still, the return addresses on the stack when calling `ping` and `pong` are overwritten; thus, neither would return as expected. In general, coroutines implemented in this way must only use global variables. As these limitations are impractical, coroutine libraries for C and C++ allocate a separate stack for each coroutine (the code for that depends on the internals of the underlying compiler); see also [this Wikipedia entry](https://en.wikipedia.org/wiki/Setjmp.h) on the limitations.

### Python Generators

In Python, asymmetric coroutines are used as _generators,_ which are functions that, when suspended, additionally yield a value. The `yield` statement is used for suspending, and `next` is used for resuming. A generator is created by “calling” it. Trying to resume a generator that has terminated raises a `StopIteration` exception:

In [None]:
def gen():
    print("A"); yield; print("B"); yield; print("C")

In [None]:
g = gen(); g

The generator `g` has its own state and runs concurrently with the main program:

In [None]:
next(g)

In [None]:
next(g)

In [None]:
next(g) # raises StopIteration exception

Typically, generators yield a value each time they suspend:

In [None]:
def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

In [None]:
f = fib(); next(f), next(f), next(f), next(f), next(f), next(f)

Generators are used in `for` statements for iterating over all generated elements. As `fib()` generates arbitrarily many elements, this allows, in principle, iteration over an infinite sequence:

In [None]:
for x in fib(): print(x) # caution
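To iterate over only a finite prefix of an infinite generator, the standard library function `itertools.islice` can be used. The following is a small sketch that repeats the definition of `fib` so that it is self-contained; the name `first_ten` is chosen for illustration only:

```python
from itertools import islice

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice resumes fib() only ten times; the generator stays suspended afterwards
first_ten = list(islice(fib(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```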

_Exercise:_ Modify `fib` to take an additional integer parameter: `fib(n)` yields only numbers that are less than `n` and then terminates. This allows `fib(n)` to be used like `range(n)` in loops.

In [None]:
# Your code here

In [None]:
def fib(n):
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a + b

Calling `next()` will eventually raise an exception:

In [None]:
f = fib(3); next(f), next(f), next(f), next(f)

In [None]:
next(f) # raises StopIteration exception

Catching the exception allows iteration until the generator terminates:

In [None]:
f = fib(10)
try:
    while True: print(next(f))
except StopIteration: pass

A Python `for` loop abbreviates this:

In [None]:
for x in fib(10): print(x)

Generators can be used to prevent intermediate data structures from being constructed. Consider functions
- `unique(iterable)`, which takes an `iterable` (list, tuple) and returns the elements in the same order but without duplicates,
- `filter(fn, iterable)`, which takes an argument `fn`, a predicate, and returns the elements of the second argument, `iterable`, that satisfy `fn`.

In [None]:
def unique(iterable):
    seen = set()
    for e in iterable:
        if e not in seen:
            yield e
            seen.add(e)

In [None]:
unique([1, 3, 4, 2, 1, 3])

In [None]:
list(unique([1, 3, 4, 2, 1, 3]))

The Python expression `x * x for x in range(10)` is a *generator expression*, an *iterable* that can be used as an argument to `unique` (the syntax requires it to be written in parentheses, except when it is the only argument of a call):

In [None]:
(x * x for x in range(10))

In [None]:
unique(x * x for x in range(10))

In [None]:
list(unique(x * x for x in range(-5, 5)))

More complex expressions can be constructed in which generators are used akin to pipes with message passing:

In [None]:
def filter(fn, iterable):
    for e in iterable:
        if fn(e):
            yield e

In [None]:
def even(x): return x % 2 == 0
list(filter(even, [1, 2, 3, 4, 5, 6]))

In [None]:
list(unique(filter(even, (x * x for x in range(-5, 5)))))

While we can think of `filter` as returning a list of even numbers, no such list is constructed in memory. The generators `filter` and `unique` are coroutines that run concurrently with the main program. (Note that Python has a built-in function `filter` with similar functionality.)
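That elements flow through the pipeline one at a time can be made visible by recording each element as it is pulled through a stage. The sketch below uses a hypothetical helper `traced`, which is not part of the examples above, together with Python's built-in lazy `filter`; `unique` is repeated so that the block is self-contained:

```python
def traced(name, iterable, log):
    """Pass elements through unchanged, recording each one as it is pulled."""
    for e in iterable:
        log.append((name, e))
        yield e

def unique(iterable):
    seen = set()
    for e in iterable:
        if e not in seen:
            seen.add(e)
            yield e

def even(x): return x % 2 == 0

log = []
squares = traced("square", (x * x for x in range(-2, 3)), log)
evens = traced("even", filter(even, squares), log)  # built-in filter, also lazy
result = list(unique(evens))
print(result)  # [4, 0]
print(log)
```

The log starts with `('square', 4), ('even', 4)`: the first square passes through all stages before the second square is even computed, so no intermediate list of squares or of even numbers exists at any time.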

_Exercise:_ Implement `unique` and `filter` in Go as goroutines and compose them to achieve the same functionality as above. Note that while all even numbers are transmitted over a channel, a sequence of all even numbers is not constructed in memory.

Here is an implementation of the producer-consumer problem with a single producer and consumer. The producer is oblivious to the consumer; the consumer needs a reference to the producer:

In [None]:
def producer(n):
    for i in range(n):
        print("producing", i); yield i
        
def consumer(p):
    s = 0
    for i in p: print("consuming", i); s += i
    print("sum:", s)

def producerConsumer(n):
    p = producer(n); consumer(p)

producerConsumer(5)

### Goroutines

Goroutines are similar to symmetric coroutines in that control is transferred explicitly when sending and receiving. However, goroutines can be scheduled preemptively; if the goroutines do not access global variables, this is not observable. The Go implementation avoids time slices inherent to preemptive scheduling and instead inserts `transfer` instructions at places where the code might take longer, like loops. Thus, even if this appears to be preemptive scheduling, the Go implementation uses cooperative scheduling for efficiency. (Since Go 1.14, the runtime can additionally preempt a goroutine asynchronously, so a loop without such transfer points no longer delays other goroutines indefinitely.)

### Python async / await

Threads (and processes) are suitable for _CPU-bound_ programs, as the computation can be spread among processors (cores). Symmetric coroutines are suitable for _I/O-bound_ programs: as the programs mainly wait for different I/O actions, they can quickly switch among those. In Python,

- a symmetric coroutine is declared with `async def`, and an instance of a coroutine is created by “calling” it,
- `await c` transfers control to the scheduler for `c` to be executed,
- `asyncio.run(c)` initiates the coroutine scheduler and starts coroutine instance `c`,
- `asyncio.gather(c0, c1, ...)` starts coroutines `c0, c1, ...` and waits for all instances to terminate,
- `asyncio.sleep(d)` is a coroutine that delays execution for `d` seconds.

In Python, there is no direct transfer between coroutines. Instead, the `asyncio` library provides a scheduler that makes all transfers: `await c` transfers control to the scheduler and resumes once `c` is completed. Coroutines have to be wrapped into tasks for finer control:
 
- `t = asyncio.create_task(c)` starts coroutine `c` and returns a `Task` object, `t`,
- `t.cancel()` stops task `t`.
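The following is a minimal sketch of creating and cancelling a task. The coroutine `ticker` and the 10 ms and 50 ms delays are assumptions made for illustration. Cancellation is delivered to the task as an `asyncio.CancelledError` exception at the `await` where it is suspended. The sketch is meant to be run as a script; within Jupyter, `asyncio.run(main())` would have to be replaced by `await main()`:

```python
import asyncio

async def ticker(log):
    """Append a tick every 10 ms until cancelled."""
    try:
        while True:
            log.append("tick")
            await asyncio.sleep(0.01)
    except asyncio.CancelledError:
        log.append("cancelled")  # clean up, then let the cancellation propagate
        raise

async def main():
    log = []
    t = asyncio.create_task(ticker(log))  # scheduled; runs once main suspends
    await asyncio.sleep(0.05)             # transfer to the scheduler; ticker runs meanwhile
    t.cancel()                            # request cancellation of t
    try:
        await t                           # wait until the cancellation has taken effect
    except asyncio.CancelledError:
        pass
    return log, t.cancelled()

log, was_cancelled = asyncio.run(main())
print(log[0], log[-1], was_cancelled)
```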

_Note:_ Some features require Python 3.7 or later; for example, `asyncio.run` was added in Python 3.7. Check with `python3 -V`.

The call `asyncio.run(c)` starts an *event loop* that schedules the coroutines. It cannot be called while another event loop is running in the same thread. Note that the program below is written to a file and started as its own process from the command line:

In [None]:
%%writefile asyncrequest.py
import asyncio, time

async def request(i):
    print(i * "  ", i, "requesting")
    await asyncio.sleep(1) # this could be an I/O operation
    print(i * "  ", i, "done")

async def main():
    await asyncio.gather(request(0), request(1), request(2))

start = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - start
print(f"executed in {elapsed:0.2f} seconds.")

In [None]:
!python3 asyncrequest.py

Since [Jupyter](https://blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7) already starts an event loop, `asyncio.run(c)` cannot be called when running Python within Jupyter; `await c` has to be used instead. The example also illustrates coroutines created by coroutines and shows that the transfer structure can be arbitrary:

In [None]:
import asyncio, random, time

async def io(i):
    print(i * "  ", i, "requesting")
    await asyncio.sleep(random.randint(0, 3)) # simulated I/O operation
    print(i * "  ", i, "done")

async def task(i):
    print(i * "  ", i, "starting")
    await io(i)
    print(i * "  ", i, "ended")

async def main():
    await asyncio.gather(task(0), task(1), task(2))

start = time.perf_counter()
await main()
elapsed = time.perf_counter() - start
print(f"executed in {elapsed:0.2f} seconds.")

The `asyncio` library provides a class `Queue` for communication between coroutines:

- `q = Queue()` creates an unbounded first-in-first-out queue,
- `q = Queue(C)` creates `q` as a first-in-first-out queue with capacity `C`,
- `q.put(x)` is a coroutine that adds `x` to `q` and blocks if `q` is full,
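- `x = q.get()` is a coroutine that removes and returns the earliest added item of `q` and blocks if `q` is empty,
- `q.task_done()` indicates that a previously removed item has been fully processed,
- `q.join()` is a coroutine that blocks until `task_done()` has been called for every item added to `q`.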

Thus `await q.put(x)` and `x = await q.get()` wait for control to be transferred back. Queues can be used as a buffer between producer and consumer coroutines:

In [None]:
import asyncio, random, time

async def makeitem(n: int) -> str:
    return str(random.randint(0, 10)) + ' by ' + str(n)

async def producer(n: int, q: asyncio.Queue):
    for _ in range(random.randint(0, 10)):
        print("Producer " + str(n) + " sleeping")
        await asyncio.sleep(random.randint(0, 3))
        i = await makeitem(n)
        await q.put(i)
        print(i + " added")

async def consumer(n: int, q: asyncio.Queue):
    while True:
        print("Consumer " + str(n) + " sleeping")
        await asyncio.sleep(random.randint(0, 3))
        i = await q.get()
        print(i + " removed by " + str(n))
        q.task_done()

async def main(np: int, nc: int): # number of producers, consumers
    q = asyncio.Queue()
    p = [asyncio.create_task(producer(n, q)) for n in range(np)]
    c = [asyncio.create_task(consumer(n, q)) for n in range(nc)]
    await asyncio.gather(*p)
    await q.join()  # blocks until every added item has been marked with task_done()
    for t in c: t.cancel()

await main(3, 5) # replace with asyncio.run(main(3, 5)) if running outside Jupyter
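The effect of a bounded queue can be seen in a smaller sketch with capacity 1. The coroutines below are simplified variants written for this illustration, and the 10 ms delay is an arbitrary assumption. Because the queue holds at most one item, the producer is suspended in `put` until the consumer has removed the previous item. The sketch is meant to be run as a script; within Jupyter, `asyncio.run(main())` would have to be replaced by `await main()`:

```python
import asyncio

async def producer(q, log):
    for i in range(3):
        await q.put(i)             # suspends while q is full
        log.append(f"put {i}")

async def consumer(q, log):
    for _ in range(3):
        await asyncio.sleep(0.01)  # let the producer run ahead
        i = await q.get()          # suspends while q is empty
        log.append(f"get {i}")

async def main():
    q, log = asyncio.Queue(1), []  # capacity 1
    await asyncio.gather(producer(q, log), consumer(q, log))
    return log

log = asyncio.run(main())
print(log)
```

In the printed log, `put 1` can only appear after `get 0`, and `put 2` only after `get 1`, even though the producer never sleeps.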

### Futures

With futures, the result of a call is not available when the call returns, but only later when it is needed, e.g.

```algorithm
C ← future matrixmultiply(A, B)
```

Thus, the execution of the call can be delayed or occur in parallel with the caller. At the place `C` is needed, the program may be suspended until `C` is available. If the call has no side effects, it is indistinguishable from a plain call; `future` is only a hint to parallelize execution. 

Futures are particularly useful if the procedure is I/O bound and the result is not immediately needed. Current programming languages explicitly require waiting for the result of a future. Given

```
procedure p(v) → (r)
    S
```

the sequence

```
x ← future p(a)
...
x.done
write(x.result)
```

corresponds to:

```
x.done := false
fork (x.result ← p(a); x.done := true)
...
await x.done
write(x.result)
```
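This correspondence can be sketched in Python with a thread and an event. The class `Future` below is a hypothetical minimal illustration, not the `concurrent.futures` class of the same name; it assumes that `p` does not raise an exception, as otherwise `wait` would block forever:

```python
import threading

class Future:
    """Hypothetical minimal future mirroring x.done and x.result."""
    def __init__(self, p, a):
        self._done = threading.Event()         # x.done := false
        self.result = None
        def body():
            self.result = p(a)                 # x.result ← p(a)
            self._done.set()                   # x.done := true
        threading.Thread(target=body).start()  # fork (...)

    def wait(self):
        self._done.wait()                      # await x.done
        return self.result

def p(v):
    return v * v

x = Future(p, 7)
# ... the caller may continue with other work here ...
print(x.wait())  # 49
```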

Futures can be expressed in Python with the `concurrent.futures` library. Python allows checking whether the function has terminated and the result is available:

In [None]:
from concurrent.futures import ThreadPoolExecutor
from time import sleep
 
def return_after_5_secs(message):
    sleep(5)
    return message
 
pool = ThreadPoolExecutor(3)
 
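future = pool.submit(return_after_5_secs, "hello")
print(future.done())    # False: the call is still running
print(future.result())  # blocks until the result is available
sleep(5)
print(future.done())    # True: the call has terminated
print(future.result())  # returns immediately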

A future is called a `Promise` in JavaScript: programs make requests to servers and continue to be responsive without waiting for immediate results. If the result is needed for a subsequent step, the steps have to be sequenced with `then`, e.g. for reading and processing a file:

```JavaScript
    fetch('""" + wasmfile + """') // asynchronously fetch file; returns a Promise that resolves to a Response object
      .then(response => response.arrayBuffer()) // read the response to completion and store it in an ArrayBuffer
      .then(code => WebAssembly.compile(code)) // compile (sharable) code.wasm
```

Note that on their own, futures do not provide means for communication, only for synchronization on termination. Futures can be scheduled cooperatively or preemptively.
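The two scheduling options can be contrasted in a short sketch. The functions `square_blocking` and `square_async` are hypothetical stand-ins for a longer computation or I/O operation. With `ThreadPoolExecutor`, the call runs in a preemptively scheduled worker thread; with `asyncio.create_task`, it runs as a cooperatively scheduled task on the event loop. The sketch is meant to be run as a script, not within Jupyter:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def square_blocking(v):
    return v * v

async def square_async(v):
    await asyncio.sleep(0)  # explicit transfer point
    return v * v

# Preemptively scheduled: the call runs in a worker thread.
with ThreadPoolExecutor(1) as pool:
    f = pool.submit(square_blocking, 3)
    threaded = f.result()   # blocks the calling thread until done

# Cooperatively scheduled: the call runs as a task on the event loop.
async def main():
    t = asyncio.create_task(square_async(4))
    return await t          # suspends main until t is done

cooperative = asyncio.run(main())
print(threaded, cooperative)  # 9 16
```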