introduce noasync keyword to annotate functions, function calls, and awaits #3157

Closed

andrewrk opened this issue Sep 2, 2019 · 12 comments

Labels: accepted (This proposal is planned.) · proposal (This issue suggests modifications. If it also has the "accepted" label then it is planned.)

Comments

@andrewrk
Member

andrewrk commented Sep 2, 2019

I got pretty far in the proof-of-concept branch for adding a global pub const io_mode = .evented;. Here's one issue that came up (text version follows after screenshot):

[screenshot: Screenshot_2019-09-02_14-44-40]

/home/andy/downloads/zig/build/lib/zig/std/special/start.zig:31:1: error: function with calling convention 'nakedcc' cannot be async
nakedcc fn _start() noreturn {
^
/home/andy/downloads/zig/build/lib/zig/std/special/start.zig:56:5: note: async function call here
    @noInlineCall(posixCallMainAndExit);
    ^
/home/andy/downloads/zig/build/lib/zig/std/special/start.zig:97:33: note: async function call here
    std.os.exit(callMainWithArgs(argc, argv, envp));
                                ^
/home/andy/downloads/zig/build/lib/zig/std/special/start.zig:108:20: note: async function call here
    return callMain();
                   ^
/home/andy/downloads/zig/build/lib/zig/std/special/start.zig:139:37: note: async function call here
            const result = root.main() catch |err| {
                                    ^
/home/andy/dev/zig-window/example/nox.zig:8:32: note: async function call here
    try loop.initSingleThreaded(std.heap.direct_allocator);
                               ^
/home/andy/downloads/zig/build/lib/zig/std/event/loop.zig:112:33: note: async function call here
        return self.initInternal(allocator, 1);
                                ^
/home/andy/downloads/zig/build/lib/zig/std/event/loop.zig:156:28: note: async function call here
        try self.initOsData(extra_thread_count);
                           ^
/home/andy/downloads/zig/build/lib/zig/std/event/loop.zig:198:54: note: async function call here
                            .eventfd = try os.eventfd(1, os.EFD_CLOEXEC | os.EFD_NONBLOCK),
                                                     ^
/home/andy/downloads/zig/build/lib/zig/std/os.zig:1645:45: note: async function call here
        else => |err| return unexpectedErrno(err),
                                            ^
/home/andy/downloads/zig/build/lib/zig/std/os.zig:2606:40: note: async function call here
        std.debug.dumpCurrentStackTrace(null);
                                       ^
/home/andy/downloads/zig/build/lib/zig/std/debug.zig:93:40: note: async function call here
    const debug_info = getSelfDebugInfo() catch |err| {
                                       ^
/home/andy/downloads/zig/build/lib/zig/std/debug.zig:74:48: note: async function call here
        self_debug_info = try openSelfDebugInfo(getDebugInfoAllocator());
                                               ^
/home/andy/downloads/zig/build/lib/zig/std/debug.zig:828:34: note: async function call here
    return openSelfDebugInfoPosix(allocator);
                                 ^
/home/andy/downloads/zig/build/lib/zig/std/debug.zig:1069:28: note: async function call here
    return openElfDebugInfo(
                           ^
/home/andy/downloads/zig/build/lib/zig/std/debug.zig:1027:39: note: async function call here
    var efile = try elf.Elf.openStream(allocator, elf_seekable_stream, elf_in_stream);
                                      ^
/home/andy/downloads/zig/build/lib/zig/std/elf.zig:392:25: note: async function call here
        try in.readNoEof(magic[0..]);
                        ^
/home/andy/downloads/zig/build/lib/zig/std/io/in_stream.zig:60:47: note: async function call here
            const amt_read = try self.readFull(buf);
                                              ^
/home/andy/downloads/zig/build/lib/zig/std/io/in_stream.zig:50:42: note: async function call here
                const amt = try self.read(buffer[index..]);
                                         ^
/home/andy/downloads/zig/build/lib/zig/std/io/in_stream.zig:38:68: note: suspends here
                return await @asyncCall(&stack_frame, &result, self.readFn, self, buffer);
                                                                   ^

This is really interesting!

Creating the event loop in main() didn't work, because it calls eventfd, and if eventfd returns an unexpected OS error (in debug builds), it tries to dump a stack trace to pinpoint where the unexpected error occurred. Dumping the stack trace wants to open the self exe file to read DWARF info, which calls std.os.read(), which gets generated as an event-based (async) function because the application has selected pub const io_mode = .evented;. But we can't suspend here, because this code is setting up the event loop itself.

What do we actually want to happen here? Answer: dumping the current stack trace should always be blocking and should not depend on an event loop. And we can accomplish this in a clean way:

Even if a function is async, one can make it behave as blocking if all the I/O it does is on file descriptors which are not O_NONBLOCK, because async func() runs func() up to the first suspend point. If all the I/O it does is blocking, it will finish completely without suspending.

So we will make std.debug.dumpCurrentStackTrace always be a non-async function. This is accomplished by opening the debug info with a normal blocking file descriptor, doing an async call for each async function it calls, and then asserting that they all finished without suspending. This makes dumpCurrentStackTrace a "seam": even though it calls async functions, it knows that, in this case, they will return without suspending, and so it ends up being a non-async function.

This is elegant because all those async functions are not generated twice. The async versions of the functions can be used for both the blocking and the non-blocking path.

Without introducing any new syntax or language semantics, here's what this would look like:

-    const debug_info = getSelfDebugInfo() catch |err| {
+    var res: @typeOf(getSelfDebugInfo).ReturnType.ErrorSet!DebugInfo = undefined;
+    var frame: @Frame(getSelfDebugInfo) = undefined;
+    _ = @asyncCall(&frame, &res, getSelfDebugInfo);
+    // XXX it's not actually possible to assert that the call finished
+    const debug_info = res catch |err| {

This is obviously less than ideal, and it generates worse code than what I am proposing:

-    const debug_info = getSelfDebugInfo() catch |err| {
+    const debug_info = noasync getSelfDebugInfo() catch |err| {

This keyword annotates a function call and guarantees that the function call will not be a suspension point, even if the callee is an async function. It asserts that the callee finished (got to the return statement).
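To make the intended semantics concrete, here is a small hypothetical sketch under the proposed rules (readAll and readAllBlocking are illustrative names, not std functions):

```
const std = @import("std");

// In evented I/O mode this function is inferred async, because std.os.read
// may suspend on EAGAIN. With a blocking file descriptor it completes
// without ever reaching a suspend point.
fn readAll(fd: std.os.fd_t, buf: []u8) !usize {
    return std.os.read(fd, buf);
}

// Under the proposal, the caller can assert that this particular call will
// not become a suspension point. If readAll does suspend anyway, that is
// checked undefined behavior (a panic in safe builds).
fn readAllBlocking(fd: std.os.fd_t, buf: []u8) !usize {
    return noasync readAll(fd, buf);
}
```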

Similarly, noasync could be used in front of await. This diff would be equivalent:

-    const debug_info = getSelfDebugInfo() catch |err| {
+    var f = async getSelfDebugInfo();
+    const debug_info = (noasync await f) catch |err| {

Finally, considering that the main use case for this feature is to make a function be a "seam" between non-async and async code, it would make sense for noasync to be able to annotate a function directly. This would turn all function calls and awaits within the function body into their "noasync" versions. It would also make other suspension points (such as suspend) a compile error.

With this implemented, the solution to the above compile error would be a single line:

-pub fn dumpCurrentStackTrace(start_addr: ?usize) void {
+pub noasync fn dumpCurrentStackTrace(start_addr: ?usize) void {

This has the added benefit of providing semantically meaningful documentation for the function. It's useful for the callers to know whether a function has this attribute.
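A hypothetical sketch of what the function-level annotation could look like for the std.debug case above; the helper names are illustrative, and the semantics are only what this proposal describes:

```
// Every call in the body behaves as a `noasync` call: each is asserted to run
// to completion without suspending, so the function itself stays non-async.
pub noasync fn dumpCurrentStackTrace(start_addr: ?usize) void {
    const debug_info = getSelfDebugInfo() catch return;
    printStackTrace(debug_info, start_addr) catch return;

    // An explicit suspension point inside the body would be rejected:
    // suspend; // compile error under the proposal
}
```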

@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. accepted This proposal is planned. labels Sep 2, 2019
@andrewrk andrewrk added this to the 0.5.0 milestone Sep 2, 2019
andrewrk added a commit that referenced this issue Sep 6, 2019
@andrewrk
Member Author

andrewrk commented Sep 6, 2019

Checklist:

  • add noasync to stage1 parser
  • add noasync to zig fmt
  • implement noasync for function calls
  • runtime safety for noasync function calls. Panic if the function suspended.
  • make even async functions guaranteed not to suspend for noasync function calls
  • implement noasync for await
  • runtime safety for noasync await. Panic if the frame is not finished.
  • implement noasync fn
  • docs

@daurnimator
Collaborator

I think that noasync is a complexity we can do without.

I instead propose that we solve this in userland with a linked list of event loops, so that event loops can be nested, or really, created from anywhere, including from inside e.g. the stack trace handler.

const LoopList = std.SinglyLinkedList(void);
pub const Loop = struct {
    parent: LoopList.Node,
    ....
};
/// A global containing the current event loop
var currentLoop = LoopList.init();

pub fn newLoop() Loop {
    const loop = Loop.new(); // could create different loop types; e.g. thread pool based vs uring based vs poll() based vs epoll() based.
    currentLoop.prepend(loop);
    return loop;
}

// example 'blocking' read function
fn read(fd: i32, dest: []u8) !usize {
    var popLoop: bool = undefined;
    var loop: Loop = undefined;
    if (currentLoop.first) |l| {
        loop = l;
        popLoop = false;
    } else {
        loop = newLoop();
        popLoop = true;
    }
    defer { 
        if (popLoop) {
            loop.close();
            currentLoop.popFirst();
        }
    }
    return loop.read(fd, dest);
}

By allowing event loops to nest, you can enable some really cool patterns.

andrewrk added a commit that referenced this issue Sep 6, 2019
@andrewrk
Member Author

andrewrk commented Sep 7, 2019

I think your nested event loop example is going to be possible, and it is an independent concept from this noasync language feature. Regardless of the existence of event loops, the ability to create a "seam" between async and non-async functions is a fundamental language feature that needs to exist with Zig's async/await semantics. This would be how you start calling something async from main(), for example.

I do have some questions about this example, such as how it interacts with threads. Let's open another issue for this use case.

@daurnimator
Collaborator

daurnimator commented Sep 8, 2019

Regardless of the existence of event loops, the ability of creating a "seam" between async and non-async functions is a fundamental language feature that needs to exist with Zig's async/await semantics

Why? Removing the seam was the point of #1778: you're introducing a "colour" back to functions.

@mschwaig

mschwaig commented Sep 8, 2019

The different "colors" for functions are still there in Zig right now. They are inferred, which noasync does not change.

It is inferred that a function is async if its implementation uses the suspend keyword or it invokes a function that was inferred to be async. Effectively, asyncness for functions propagates down through the call stack until you get to an async at a call site, where this inference stops, because that is where you have to explicitly put the whole call stack somewhere to allow the function to continue execution via resume.

For getSelfDebugInfo we know that at runtime it should never actually suspend, because of how it uses the async functions it calls. So in the example without any new syntax, it just runs to completion and we collect the result without ever taking advantage of async. Sadly, right now we cannot even assert that it is done; we could only block if it is not.

noasync is a shorthand way of stopping inference at that point and asserting that the function really did not suspend, triggering checked undefined behaviour if it does. This gives the opportunity to generate more efficient versions of those functions, as long as the UB bubbles up to the noasync as an error.

TLDR: noasync is not needed to create a "seam" below which the stack does not have to be async, but it makes it more convenient and assertable and does not actually introduce any more "coloring" to Zig.
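A minimal sketch of the inference being described, assuming the proposed noasync call syntax (all names are illustrative):

```
fn maybeSuspend(evented: bool) void {
    if (evented) {
        suspend; // having a suspend point makes this function async by inference
    }
}

fn middle(evented: bool) void {
    maybeSuspend(evented); // calling an inferred-async function makes middle async too
}

fn seam() void {
    // Without noasync, this call would make seam() inferred async as well.
    // With noasync, inference stops here: seam() stays non-async, and it is
    // asserted at runtime that middle(false) returns without suspending.
    noasync middle(false);
}
```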

@mschwaig

mschwaig commented Sep 8, 2019

noasync as a function keyword could, instead of inferring that all calls are noasync, assert that all calls are noasync. That way, introducing a suspend somewhere up the callstack from a noasync function would be a compile error instead of a runtime error.

@daurnimator
Collaborator

Sadly right now we cannot even assert that it is done, we could only block if it is not.

That to me is the design feature: if you cannot tell, then people can't intentionally create single-coloured functions.
For this example of calling getSelfDebugInfo from the panic handler, the panic handler could push its own (new) event loop before doing any operations: this would have the effect of decoupling it from the "main" event loop.

@andrewrk
Member Author

andrewrk commented Sep 9, 2019

@daurnimator what's your plan for making _start not async?

@daurnimator
Collaborator

daurnimator commented Sep 9, 2019

@daurnimator what's your plan for making _start not async?

An event loop would only suspend if it has a parent event loop.
When the read is hit, it would notice that there is no current event loop.
So the code would create a new event loop; run a single read operation in it; and then destroy that event loop object.

Note that "event loop" here is used a bit loosely: there is no event loop here; a better description might be "async scheduler"

@andrewrk
Member Author

andrewrk commented Sep 9, 2019

it would notice that there is no current event loop.

At compile time or runtime? If runtime, then the function will be generated async, which will bubble all the way up to _start.

andrewrk referenced this issue Sep 11, 2019
This introduces the concept of "IO mode" which is configurable by the
root source file (e.g. next to `pub fn main`). Applications can put this
in their root source file:

```
pub const io_mode = .evented;
```

This will populate `std.io.mode` to be `std.io.Mode.evented`. When I/O
mode is evented, `std.os.read` handles EAGAIN by suspending until the
file descriptor becomes available for reading. Although the std lib
event loop supports epoll, kqueue, and Windows I/O Completion Ports,
this integration with `std.os.read` currently only works on Linux.

This integration is currently only hooked up to `std.os.read`, and not,
for example, `std.os.write`, child processes, and timers. The fact that
we can do this and still have a working master branch is thanks to Zig's
lazy analysis, comptime, and inferred async. We can continue to make
incremental progress on async std lib features, enabling more and more
test cases and coverage.

In addition to `std.io.mode` there is `std.io.is_async` which is equal
to `std.io.mode == .evented`. In case I/O mode is async, `std.io.InStream`
notices this and the read function pointer becomes an async function
pointer rather than a blocking function pointer. Even in this case,
`std.io.InStream` can *still be used as a blocking input stream*.
Users of the API control whether it is blocking or async at runtime by whether
or not the read function suspends. In case of file descriptors, for
example, this might correspond to whether it was opened with `O_NONBLOCK`.
The `noasync` keyword makes a function call or `await` assert that no
suspension happens. This assertion has runtime safety enabled.

`std.io.InStream`, in the case of async I/O, uses by default a 4 MiB
frame size for calling the read function. If this is too large or too
small, the application can globally increase the frame size used by
declaring `pub const stack_size_std_io_InStream = 1234;` in their root
source file. This way, `std.io.InStream` will only be generated once,
avoiding bloat, and as long as this number is configured to be high
enough, everything works fine. Zig has runtime safety to detect when
`@asyncCall` is given too small of a buffer for the frame size.

This merge introduces -fstack-report which can help identify large async
function frame sizes and explain what is making them so big. Until #3069 is
solved, it's recommended to stick with blocking IO mode.

-fstack-report outputs JSON format, which can then be viewed in a GUI
that represents the tree structure. As an example, Firefox does a decent
job of this.

One feature that is currently missing is detecting that the call stack
upper bound is greater than the default for a given target, and passing
this upper bound to the linker. As an example, if Zig detects that 20
MiB stack upper bound is needed - which would be quite reasonable -
currently on Linux the application would only be given the default of 16
MiB.

Unrelated miscellaneous change: added std.c.readv
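Putting the declarations named in this commit message together, a root source file opting into evented I/O might look like the following sketch (the frame-size value is the example number from the message above):

```
// root source file, next to `pub fn main`

// Select evented I/O mode; this populates std.io.mode with .evented.
pub const io_mode = .evented;

// Optional: override the frame size std.io.InStream reserves for async reads.
pub const stack_size_std_io_InStream = 1234;

pub fn main() !void {
    // With io_mode == .evented, std.os.read suspends on EAGAIN
    // (Linux only at the time of this commit) instead of blocking.
}
```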
@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Sep 20, 2019
@frmdstryr
Contributor

I was messing with this some today and have a few thoughts.

Since the event loop is global, could I just grab the instance and run some async fn like this:

var result = loop.runUntilComplete(someframe);

This is somewhat like @daurnimator's embedded event loops, where it would just sit in a while loop that keeps ticking until the frame is complete.

Or would it be possible to switch between sync and async "versions" at runtime, using some thread-local variable? I started making sync-only alternative fns but that quickly becomes a mess.
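For clarity, the usage pattern being asked about might look something like this. runUntilComplete and someAsyncWork are hypothetical names from this comment, not existing std APIs, and reaching the global loop via std.event.Loop.instance is an assumption:

```
const std = @import("std");

pub fn main() !void {
    const loop = std.event.Loop.instance.?; // assumed global event loop
    var frame = async someAsyncWork(); // someAsyncWork: any inferred-async function (illustrative)
    // Hypothetical: tick the loop until the frame completes, then return its result.
    const result = loop.runUntilComplete(&frame);
    _ = result;
}
```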

andrewrk added a commit that referenced this issue Feb 16, 2020
Note that there is not yet runtime safety for this.

See #3157
@andrewrk
Member Author

andrewrk commented Mar 8, 2020

Superseded by #4696.

@andrewrk andrewrk closed this as completed Mar 8, 2020