The Coroutine Rewrite Issue #2377

andrewrk opened this issue Apr 29, 2019 · 5 comments



commented Apr 29, 2019

This issue is one of the main goals of the 0.5.0 release cycle. It is blocking stable event-based I/O, which is blocking networking in the standard library, which is blocking the Zig Package Manager (#943).

Note: Zig's coroutines are and will continue to be "stackless" coroutines. They are nothing more than a language-supported way to do Continuation Passing Style.


Status quo coroutines have some crippling problems:

  • Because Zig uses LLVM's coroutine support, it has to follow the paradigm they set, which is that the coroutine allocates its own memory and destroys itself. In Zig this means that calling a coroutine can fail, even though that doesn't have to be possible. It also causes issues such as #1194.
  • It also means that every coroutine has to be generic and accept a hidden allocator parameter. This makes it difficult to take advantage of coroutines to solve safe recursion (#1006).
  • One of the slowest things LLVM does in debug builds, discovered with -ftime-report, is the coroutine splitting pass. Moving the pass into the Zig frontend will speed it up. I examined the implementation of LLVM's coroutine support, and it does not appear to provide any advantages over doing the splitting in the frontend.
  • Optimizations are completely disabled due to LLVM bugs with coroutines (#802).
  • Other related issues: #1363 #1260 #1197 #1163 #1870

The Plan

Step 1. Result Location Mechanism

Implement the "result location" mechanism from the Copy Elision proposal (#287). This makes it so that every expression has a "result location" where the result will go. For example, given this code:

test "example result location" {
    var w = hi();
}

const Point = struct {
    x: f32,
    y: f32,
};

fn hi() Point {
    return Point{
        .x = 1,
        .y = 2,
    };
}

What actually ends up happening is that hi gets a secret pointer parameter which is the address of w and initializes it directly.
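To make the "secret pointer parameter" concrete, here is a minimal hand-written sketch of roughly what the lowered call looks like. `hiLowered` is a hypothetical name for illustration only; the compiler does this implicitly and its actual generated code may differ. The sketch assumes a reasonably recent Zig where `std.testing` functions return errors.

```zig
const std = @import("std");

const Point = struct {
    x: f32,
    y: f32,
};

// Hypothetical hand-lowered form of `fn hi() Point`: the caller passes the
// address of the result location and the callee initializes it in place,
// so no temporary Point is built and copied.
fn hiLowered(result: *Point) void {
    result.x = 1;
    result.y = 2;
}

test "hand-lowered result location" {
    // `w` plays the role of the result location from the example above.
    var w: Point = undefined;
    hiLowered(&w);
    try std.testing.expectEqual(@as(f32, 1), w.x);
    try std.testing.expectEqual(@as(f32, 2), w.y);
}
```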

Step 2. New Calling Syntax

Next, instead of coroutines being generic across an allocator parameter, they will use the result location as the coroutine frame. So the callsite can choose where the memory for the coroutine frame goes.

pub fn main() void {
    var x = myAsyncFunction();
}

In the above example, the coroutine frame goes into the variable x, which in this example is in the stack frame (or coroutine frame) of main.

The type of x is @Frame(myAsyncFunction). Every function will have a unique type associated with its coroutine frame. This means the memory can be manually managed. For example, it can be put on the heap like this:

const ptr = try allocator.create(@Frame(myAsyncFunction));
ptr.* = myAsyncFunction();

@Frame could also be used to put a coroutine frame into a struct, or in static memory. It also means that, for now, it won't be possible to call a function that isn't comptime known. E.g. function pointers don't work unless the function pointer parameter is comptime.
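As a rough illustration of a caller-owned frame, the following sketch hand-writes a tiny state machine in plain Zig, with no async keywords. `AddFrame`, `init`, and `step` are hypothetical names standing in for what `@Frame(...)` and resuming would provide; this is not the proposed compiler output. The point is that creating the frame is just initializing a value, so it cannot fail, and the call site decides whether that value lives on the stack or the heap.

```zig
const std = @import("std");

// Hypothetical stand-in for a compiler-generated coroutine frame.
const AddFrame = struct {
    state: enum { start, suspended, done } = .start,
    a: i32,
    b: i32,
    result: i32 = 0,

    // Creating the frame allocates nothing, so it cannot fail.
    fn init(a: i32, b: i32) AddFrame {
        return AddFrame{ .a = a, .b = b };
    }

    // Runs the body up to the next suspend point (or to completion).
    fn step(self: *AddFrame) void {
        switch (self.state) {
            .start => {
                self.result = self.a;
                self.state = .suspended; // models a suspend point
            },
            .suspended => {
                self.result += self.b;
                self.state = .done;
            },
            .done => {},
        }
    }
};

test "frame in the caller's stack frame" {
    var frame = AddFrame.init(2, 3);
    frame.step();
    frame.step();
    try std.testing.expectEqual(@as(i32, 5), frame.result);
}

test "frame on the heap" {
    const allocator = std.testing.allocator;
    // Only the allocation can fail, and the caller chose to make it.
    const ptr = try allocator.create(AddFrame);
    defer allocator.destroy(ptr);
    ptr.* = AddFrame.init(2, 3);
    ptr.step();
    ptr.step();
    try std.testing.expectEqual(@as(i32, 5), ptr.result);
}
```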

The @Frame type will also represent the "promise"/"future" (#858) of the return value. The await syntax on this type will suspend the current coroutine, putting a pointer to its own handle into the awaited coroutine, which will tail-call resume the awaiter when the value is ready.
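The await handshake described above can be sketched by hand as well. In this illustration, `Caller`, `Callee`, `step`, and `finish` are all hypothetical names, and an ordinary function call stands in for the tail-call resume; the real mechanism would be generated by the compiler. The callee frame is embedded directly in the caller frame, so no separate allocation is needed.

```zig
const std = @import("std");

// Hypothetical frame of the awaited function.
const Callee = struct {
    awaiter: ?*Caller = null,
    result: i32 = 0,
    done: bool = false,

    // Called when the value is ready (for example by an event loop).
    fn finish(self: *Callee, value: i32) void {
        self.result = value;
        self.done = true;
        // Resume whoever is awaiting us. A plain call models the tail call.
        if (self.awaiter) |a| a.step();
    }
};

// Hypothetical frame of the awaiting function.
const Caller = struct {
    callee: Callee = Callee{}, // callee frame lives inside the caller frame
    state: enum { start, awaiting, done } = .start,
    total: i32 = 0,

    fn step(self: *Caller) void {
        switch (self.state) {
            .start => {
                // "await": store a pointer to ourselves in the callee, then
                // suspend. The frame must not be moved after this point.
                self.callee.awaiter = self;
                self.state = .awaiting;
            },
            .awaiting => {
                self.total = self.callee.result + 1;
                self.state = .done;
            },
            .done => {},
        }
    }
};

test "awaiter is resumed when the callee finishes" {
    var caller = Caller{};
    caller.step(); // runs up to the await and suspends
    try std.testing.expect(caller.state == .awaiting);
    caller.callee.finish(41); // value becomes ready, awaiter is resumed
    try std.testing.expect(caller.state == .done);
    try std.testing.expectEqual(@as(i32, 42), caller.total);
}
```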

Next Steps

From this point the current plan is to start going down the path of #1778 and try it out. There are some problems to solve.

  • Does every suspend point in a coroutine represent a cancellation point? Does it cascade (if a coroutine which is awaiting another one gets canceled, does it cancel the target as well)?
  • How do defer and errdefer interact with suspension and cancellation? How does resource management work if the cleanup is after a suspend/cancellation point?
  • Is cancellation even a thing? Since the caller has ownership of the memory of the coroutine frame, it might not be necessary to support in the language. Coroutines could simply document whether they had to be awaited to avoid resource leaks or not.
  • Do we want to try to solve generators?

This proposal for coroutines in Zig gets us closer to a final iteration of how things will be, but you can see it may require some more design iterations as we learn more through experimentation.

andrewrk added this to the 0.5.0 milestone Apr 29, 2019


Member Author

commented Apr 29, 2019

Once this is complete, it will be time to revisit this work-in-progress code: #1848

andrewrk referenced this issue Apr 29, 2019


WIP Interface Reform (see: #1829) #1848

4 of 5 tasks complete


commented Apr 29, 2019

andrewrk added a commit that referenced this issue Apr 30, 2019

WIP make shrinkFn optional in Allocator interface
This is work-in-progress because it's blocked by coroutines depending on
the Allocator interface, which will be solved with the coroutine rewrite

closes #2292


commented May 1, 2019

Since this issue touches on the topic of generators, I'll share my implementation of generators in Zig 0.4.0.

The gist is that generators can be defined like:

const Gen = Generator(struct {
    slice: []i32,

    pub async fn generator(ctx: *@This(), item: *i32) !void {
        for (ctx.slice) |x| {
            item.* = x;
        }
    }
});

And used like:

var gen = Iterator.init(allocator, Gen.Args {
    .slice = []i32{ 1, 2, 3, 4, 5 },
});
defer gen.deinit();
while (try gen.next()) |item| {
    warn("{}\n", item);
}

I think that unless prettier generator support comes along for free with the coroutine rewrite, we should try a library approach like this first, and only move toward language support for generators if it ends up being insufficient.
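For comparison, the simplest library-only form of this idea needs no coroutines at all: a plain iterator struct with a `next()` method. The sketch below is not the `Generator` implementation mentioned in this comment; `SliceIter` is a hypothetical name, and the technique is an ordinary hand-written iterator, shown only to illustrate how the consuming `while` loop looks without language support.

```zig
const std = @import("std");

// Hypothetical hand-written iterator over a slice of i32.
const SliceIter = struct {
    slice: []const i32,
    index: usize = 0,

    // Returns the next item, or null once the slice is exhausted.
    fn next(self: *SliceIter) ?i32 {
        if (self.index >= self.slice.len) return null;
        const item = self.slice[self.index];
        self.index += 1;
        return item;
    }
};

test "iterate with a plain library iterator" {
    const data = [_]i32{ 1, 2, 3, 4, 5 };
    var it = SliceIter{ .slice = &data };
    var sum: i32 = 0;
    while (it.next()) |item| {
        sum += item;
    }
    try std.testing.expectEqual(@as(i32, 15), sum);
}
```

A coroutine-based generator earns its keep when the producing code has nontrivial control flow that would be awkward to unroll into `next()` by hand.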

(Currently I have an issue that forces .next() to return anyerror!?Item instead of properly inferring the error set. The compiler fails an assertion if I try to track the error set. I'll create a separate issue when I can figure out how to get the same failure with less code. #2398)



commented May 12, 2019

"Next, instead of coroutines being generic across an allocator parameter, they will use the result location as the coroutine frame."

Does that mean that the whole frame stays allocated as long as the returned value is alive?

I believe the two should have different lifetimes.


Member Author

commented May 12, 2019

The coroutine frame's lifetime is manually managed by the caller. If you keep reading you can see the example of how to give them different lifetimes by putting the frame on the heap.

For an event-based I/O function calling another one (and using the result), this will work nicely because the coroutine frame of the callee will exist inside the coroutine frame of the caller.
