
Proposal: Stack-Capturing Macro #6965

Open
SpexGuy opened this issue Nov 3, 2020 · 136 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@SpexGuy
Contributor

SpexGuy commented Nov 3, 2020

Summary

Stack-capturing lambdas are normally unsafe, since the lambda may have a lifetime longer than the calling function. This proposal introduces a new kind of lambda that is allowed to perform stack captures, but obeys special rules that allow the compiler to prove that these captures are safe. This proposal is based on Kotlin's concept of inline lambdas, though it deviates from that implementation in some ways.

Description

Unlike functions, Stack-Capturing Macros (or "macros" for short) are not values. They may only exist as parameters to functions which are marked inline. You cannot create a variable or struct that contains a macro.

Macro parameters are declared with a macro prototype, which appears in a parameter list. Macro prototypes are not type declarations; they are a new syntax, similar to a function type declaration. Unlike function types, macro prototypes may use the !T syntax to infer error types from the macro body. When passing a macro as an argument, the token => indicates that the following expression is a macro. The capture syntax (|x|) is optionally used to bind the macro parameters, and after that comes an expression (which may be a labelled or unlabelled block). "return" and "try" within this scope are relative to the function, not the macro. break :label may be used to exit to an outer scope. Doing this behaves as if the function and macro were inlined, and runs any defers in the function or macro.

/// iterates the entries in a bucketed hash map
inline fn foreach(self: *@This(), action: macro (key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(&node.key, &node.value);
        }
    }
}

fn countValues(self: *@This(), value: Value) usize {
    var count: usize = 0;
    self.foreach(=> |_, v| {
        // references to stack variables
        if (v.* == value) count += 1;
    });
    return count;
}

fn hasValue(self: @This(), value: Value) bool {
    return blk: {
        self.foreach(=>|_, v| {
            // nonlocal return
            if (v.* == value) return true;
        });
        break :blk false;
    };
}

fn findKeysWithValue(self: @This(), value: Value, allocator: *Allocator) ![]*Key {
    var list = std.ArrayList(*Key).init(allocator);
    errdefer list.deinit();
    try self.foreach(=> |k, v| {
        if (v.* == value) {
            // this try refers to the outer scope.
            try list.append(k);
        }
    });
    return list.toOwnedSlice();
}

// macro returning value
inline fn pollUntil(condition: macro (index: usize) bool) void {
    var index: usize = 0;
    while (!condition(index)) : (index += 1) {
        poll();
    }
}

fn pollFourTimes() void {
    pollUntil(=> |x| x >= 4);
}

// macro with no parameters
inline fn retry(comptime T: type, max_attempts: usize, action: macro () !T) !T {
    var attempt: usize = 1;
    while (true) : (attempt += 1) {
        if (action()) |result| {
            return result;
        } else |err| {
            if (attempt >= max_attempts) return err;
        }
    }
}

fn openNetworkFile(path: []const u8) !Handle {
    return retry(Handle, 3, => blk: {
        doSomeSetup();
        defer doTeardown();
        break :blk openNetworkFileUnstable(path);
    });
}

Because they cannot exist in variables, there is no TypeInfo entry for a macro. There are also no macro types. @TypeOf(macro_param) is a compile error. You would never need to do this though, because macro parameters are always known to be macros within the text of the function. When inspecting the TypeInfo of a function, macro parameters show as having a null type, just like anytype parameters. We could add more info to the FnArg structure if necessary to handle this case.

Macros may not be closed over in a function literal, but they may be referenced from within other macros, and they may be passed directly as a macro parameter to another inline function. Macros may accept macros as parameters, but only if the compiler can inline them all. Y-combinators which do not terminate at compile time will not work. This is governed by the eval branch quota. Inlining a macro into itself, directly or indirectly, counts as one backwards branch.

Unlabelled break and continue don't work inside a macro used in a loop. This prevents a potential class of errors:

for (some_list) |item| {
    item.inner_list.foreach(=> |inner| {
        if (inner.value == item.target) break;
    });
}

Here the programmer intended to break out of the inner loop, but the language would interpret this as a break from the outer loop, because the inner loop is inside the called function and cannot be seen at the call site. Instead, this should be a compile error, and a labelled break can be used to get out of the inner loop:

next_list: for (some_list) |item| {
    item.inner_list.foreach(=> |inner| {
        if (inner.value == item.target) continue :next_list;
    });
}

Macros provide a way to allow stack references in a way that is provably safe.


Edit: This proposal previously read "If a macro has no parameters, the capture list is omitted." This would be ambiguous with an expression block, so it was changed to require an empty capture list.

Edit 2: Macros were originally blocks, but upon reflection they make more sense as expressions. This also means that try and return in a macro refer to the outer function, rather than the macro expression. And a token (=>) is needed for declaring a macro. Also added a bunch more examples.

@alexnask alexnask added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Nov 3, 2020
@andrewrk andrewrk added this to the 0.8.0 milestone Nov 3, 2020
@MasterQ32
Contributor

MasterQ32 commented Nov 3, 2020

This is a surprisingly elegant solution to the "modern use case of lambda expressions". I'm not sure if Zig needs this, but the basic idea is quite nice and clean. But I have some questions:

  • Can I store a macro in a const? I might need a predicate or a sort criterion multiple times:
var threshold = 0.4; // runtime value
const is_included = |x| { return x > threshold; };

var count: usize = 0;
count += my_list_1.countAll(is_included);
count += my_list_2.countAll(is_included);
const err_or_void = ||{
    try foo();
    try bar();
}();
someCleanup();
if(err_or_void) |_| { } else {
    std.debug.print("oh no!\n", .{});
    return error.Unexpected;
}

I'm a fan; not sure though if Zig needs this feature…

@SpexGuy
Contributor Author

SpexGuy commented Nov 4, 2020

Can I store a macro in a const?

The proposal as written would say no, but this is an interesting use case. I think I'm still inclined to say no though. A property of macros as described above is that they execute only as part of the statement in which they are declared. You can't declare a macro on one line and then execute it on a different line. This means there is never any confusion about whether a macro is capturing the current value of a stack variable or a reference to the variable - it's both. This also ensures clearly that it's impossible for a macro to escape the bounds of the locals it references. A macro really isn't a function that executes later, it's a block that executes right now, but in a deeper scope.

Can I call macros after declaring them?

No, but you can do this:

const localTry = inline fn(blk: macro () !void) !void {
    try blk();
};

const err_or_void = localTry(||{
    try foo();
    try bar();
});
someCleanup();
if(err_or_void) |_| { } else {
    std.debug.print("oh no!\n", .{});
    return error.Unexpected;
}

I'm a fan; not sure though if Zig needs this feature…

That's kind of how I feel too :P

@zzyxyzz
Contributor

zzyxyzz commented Nov 4, 2020

I absolutely love the basic idea of this proposal. Most languages simply don't have a reasonable way to abstract over code blocks that only differ in a few lines of code and a few dynamic variables. Function pointers and non-capturing anonymous functions help somewhat, but usually have a substantial cost in both verbosity and performance. In my opinion, the proposed functionality could considerably increase the power of the language for implementing various types of iterators, filters, generic algorithms and context-wrapping functions (like with_open_file or with_drawing_context). With this, you get 80% of the power of Lisp macros without any AST-manipulations and custom evaluation semantics.

@rohlem
Contributor

rohlem commented Nov 4, 2020

I'm not as much of a fan.
If I understood this correctly (and maybe I overlooked something), we can already get 80% of the same effect by

  • creating a local struct { fn action(self: *@This(), ...) !void } to encapsulate variables and methods to operate on them,
  • passing it as actor: anytype, and
  • calling it as actor.action.

The overall design makes sense to me, but I don't appreciate the individual changes either:

  • If a macro has no parameters, the capture list is omitted.

    return and try can now (with the same syntax) only apply to an anonymous block, if that block is contextually passed as a macro argument.
    If we want block-scoped try, then instead how about try :block syntax? (We already have break :block for "local returns".)

  • Unlike functions, Stack-Capturing Macros (or "macros" for short) are not values. They may only [...]

    This makes sense in the proposed design, but status-quo local structs simply don't have these limitations. So if they get in your way, you need to "upgrade" from macro back to a full struct.


I personally appreciate the idea of one way to do things too much over having this "shorthand option" and having to decide between them (potentially for every single interface individually).
To me it seems more valuable to unify/simplify common use cases of the current language constructs.
Some ideas for the use cases presented here:

  • scoped try :block <expression> syntax

  • I think I would implement "nonlocal break" by returning an enum {continue, break} to the foreach function.
    If we change that to union {continue: void, break: label_name} the foreach could exit and return the label to break from to the caller.
    So now what's missing would be a way to break to the right label based on an enum value, e.g. breakto <expression> [optional value].
    (We could even limit the expressions to ConstantLiteral syntax .labeled_block so that it's all comptime-checkable.)

  • For capturing stack variables, maybe we could go the opposite direction and allow usingnamespace on a stack-local struct instance. That way we could do this:

var stackToCapture = struct{
  count: usize = 0,
  fn action(self: *@This(), key: anytype, value: anytype) !void {...}
} {};
usingnamespace stackToCapture;
//refer to `count` as if it were a local variable directly, pass &stackToCapture as callback-with-data

@MasterQ32
Contributor

The proposal as written would say no, but this is an interesting use case. I think I'm still inclined to say no though.

Yeah, that would be my opinion as well. It would quickly get too messy, and most macros are unique anyway...

No, but you can do this:

const err_or_void = localTry(||{
    try foo();
    try bar();
});

I really like this; we've created a new language feature out of macros already, killing two birds (#229, #5610) with one stone (#6965).

If I understood this correctly (and maybe I overlooked something), we can already get 80% of same effect by

Not really. Capturing local variables either requires storing the locals in a "local variables struct", which adds a lot of indirection and means every normal access is prefixed, or you have to capture them by hand into pointer struct fields.

try :block <expression> syntax

I think this is worth a separate proposal, as it solves a completely different use case, which I discovered could also be solved with this proposal.

For capturing stack variables, maybe we could go the opposite direction and allow usingnamespace on a stack-local struct instance. That way we could do this:

This is confusing and usingnamespace is not a tool that should be used often. It's also just a syntactical trick.

@rohlem
Contributor

rohlem commented Nov 4, 2020

Capturing local variables either requires a lot of indirections and storing the locals in a "local variables struct", which means every normal access is prefixed or you have to capture them by hand into pointer struct fields.

I don't think there is much of a semantic difference between declaring stack variables separately and grouping them into a struct and declaring an instance of that. The only difference is in access syntax.

usingnamespace is not a tool that should be used often.

Why? As I understand it, importing symbols from a different namespace means your code depends on the symbols declared somewhere else, which is an implicit remote dependency.
What I proposed is a new mechanism that doesn't currently exist. In the example I showed, it links the function code to the function-local field declarations. All local, good visibility. I don't see an argument against allowing it.
If you think this shouldn't be allowed for non-local structs, then we can declare it invalid for non-locally defined types.

It's also just a syntactical trick.

That's my point. Both solutions just present syntactical convenience over something we can already do with less limitations.
If field access feels too burdensome, we can enhance usingnamespace, or introduce some other syntax if you'd prefer. Otherwise we can remain as we are.
The new semantics that come with this proposal seem to me like a differently-flavoured tool for the same job. Unless they permit a new use case not solvable by structs and status-quo that I'm oblivious to.

@alexnask
Contributor

alexnask commented Nov 4, 2020

I quite like this proposal personally.
The one modification I would make would be to require macro arguments to be comptime

@SpexGuy
Contributor Author

SpexGuy commented Nov 4, 2020

If a macro has no parameters, the capture list is omitted.

This is ambiguous, I think; I'll change it to require an empty capture list. That's also ambiguous with the || operator, but unary || isn't a thing, so that should be enough to disambiguate.

return and try can now (with the same syntax) only apply to an anonymous block, if that block is contextually passed as a macro argument.

This is a fair point. This is actually where my proposal deviates from Kotlin's definition. Returns in a Kotlin inline lambda return from the outer function. But this only works because the last expression in a Kotlin lambda produces its value, which gives them an alternate syntax to return. I suppose we could allow a label on macros though, and use break :macro_label to yield a value. I could argue then that return from inside a macro marks invisible control flow, but you could make the same argument about nonlocal breaks.

If I understood this correctly [...], we can already get 80% of same effect [with anon functions + state objects]

It's true that you can accomplish the same computation with anonymous functions and state objects, but you cannot assume that that will be performant. Using functions here has several problems:

  • you are now taking the address of a local variable, so the optimizer needs to put it in memory and needs to consider aliasing possibilities
  • you are copying memory around and hoping that the optimizer will realize and remove it.

Essentially, using functions for this is taking something that is straightforward for the frontend (put that loop in my code and put this code in the loop), turning it into a really hard problem for the optimizer, and then hoping that the optimizer will figure it out.
This problem is one of the main reasons that many game developers avoid C++ lambdas.

The proposed macros sidestep this problem by forcing the frontend to emit something that is friendly to the backend (just a bunch of scopes, no pointers to locals, no extra functions). That's what makes this different. It's more constrained than lambdas, but it guarantees that it will not generate a function call, it will not generate extra copies, and it will not generate extraneous memory clobbers. IMO these guarantees are enough to make this a sufficiently different operation from passing a function (and therefore not another way of doing "the same thing").

require macro arguments to be comptime

This brings up a problem that I didn't think about before. I think putting comptime on a macro implies that it can be executed in a comptime block, meaning that it does not capture runtime locals.

@MasterQ32
Contributor

It's more constrained than lambdas, but it guarantees that it will not generate a function call, it will not generate extra copies, and it will not generate extraneous memory clobbers. IMO these guarantees are enough to make this a sufficiently different operation from passing a function (and therefore not another way of doing "the same thing").

Word. This proposal is a simple syntax transformation and should be easy to implement in both stage1 and stage2 by just doing some AST transformations, whereas function pointers yield inherently worse results due to increased complexity.

@rohlem
Contributor

rohlem commented Nov 4, 2020

It's true that you can accomplish the same computation with anonymous functions and state objects, but you cannot assume that that will be performant.

Indeed, I was hoping that this type of transformation would be something fully within the capabilities of an optimizing compiler.
In particular, here is how my implementation would work in my head:

  • The code is translated to an AST and then translated to IR, and the foreach function is inlined, at some point during this.
    (Inlining is forced by the user, or chosen by the compiler, advisable since the unique argument type makes the instantiation unique. Same goes for inlining actor.action.)
  • The compiler has to analyze all three functions as one, because they have been inlined. (Also Zig compiles all code as one translation unit anyway, so cross-analysis of known functions is not very restricted.)
    It notices the pointer to the struct instance doesn't escape the scope of the variable declaration, therefore chooses to statically resolve all references relative to the stack frames (i.e. as local variables) instead of producing runtime pointers.

Even if Zig already had macros, I would advocate to implement these optimizations for the struct case. I'd be interested to know which step of this is unrealistic.

This problem is one of the main reasons that many game developers avoid C++ lambdas.

Interesting, I was under the impression people liked to use templates because (just like comptime parameters) they lead to the compiler inlining more code, which enables more optimization opportunities. C++ lambdas are 100% just syntactic sugar for template classes with overloaded call operators; the only barrier to optimizations using them would come from type erasure, e.g. by std::function. But then again, I might not be in the loop.

Amendment: I agree that this is "just an AST transformation" that "should be easy to implement". My concern is whether we want to facilitate this one use case with special-case syntax, type, and semantics, instead of reducing friction in the components we already have in the language.

@SpexGuy
Contributor Author

SpexGuy commented Nov 4, 2020

I was under the impression people liked to use templates

The prevailing opinion in the game development community seems to be that templates and lambdas can be fine for runtime performance if all optimizations are turned on, but compilation time and performance in debug mode are also very important, and templates are terrible for both of these. Indeed, all of the optimizations you described are possible and will probably be done in release-fast mode. But the other build modes also matter, and so does compilation time.

That said, header parsing time is a massive sink for C++ compile time, and templates can only reasonably go in headers. Zig is better about this, so it may not be as much of a problem.


My concern is whether we want to facilitate this one use case with special-case syntax, type, and semantics, instead of reducing friction in the components we already have in the language.

This is a good and valid concern, but I don't see any way to reduce the friction enough here without introducing a new concept.
Here's the status quo "null hypothesis" way to do this with functions:

/// iterates the entries in a bucketed hash map
inline fn foreach(self: *@This(), action: anytype) !void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            try action.each(&node.key, &node.value);
        }
    }
}

fn countValues(self: *@This(), value: Value) usize {
    var countValuesEach = struct{
        count: usize = 0,
        target: Value,
        pub inline fn each(self: *@This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) self.count += 1;
        }
    } { .target = value };
    self.foreach(&countValuesEach) catch @compileError("no errors possible");
    return countValuesEach.count;
}

fn hasValue(self: *@This(), value: Value) bool {
    self.foreach(struct{
        target: Value,
        pub inline fn each(self: @This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) return error.FoundTheValue;
        }
    } { .target = value }) catch return true;
    return false;
}

fn findKeysWithValue(self: *@This(), value: Value, allocator: *Allocator) ![]*Key {
    var findKeysEach = struct{
        list: std.ArrayList(*Key),
        target: Value,
        pub fn init(target: Value, alloc: *Allocator) @This() {
            return .{ .target = target, .list = std.ArrayList(*Key).init(alloc) };
        }
        pub fn deinit(self: *@This()) void {
            self.list.deinit();
        }
        pub inline fn each(self: *@This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) {
                try self.list.append(k);
            }
        }
    }.init(value, allocator);
    errdefer findKeysEach.deinit();

    try self.foreach(&findKeysEach);

    return findKeysEach.list.toOwnedSlice();
}

The mental gymnastics that I have to do to try to understand this code are too much. It's not worth making the foreach function at all here, it's more readable to just write the loop in each function. In this case I would probably instead make an iterator struct and use that. But iterators can be even harder for the optimizer to process, and worse on debug performance, so that doesn't really solve the problem either. They are also much more difficult to write than the iteration loop, and require more effort from readers to understand.

I don't think the usingnamespace on instance idea helps any of these examples. The availability of the locals to the return statement is not the problem.

I think I would implement "nonlocal break" by returning an enum {continue, break} to the foreach function.
If we change that to union {continue: void, break: label_name} the foreach could exit and return the label to break from to the caller.
So now what's missing would be a way to break to the right label based on an enum value, e.g. breakto [optional value].
(We could even limit the expressions to ConstantLiteral syntax .labeled_block so that it's all comptime-checkable.)

Limiting this to comptime-known values doesn't actually work. There's no way to run a function at runtime that returns a comptime-known value. This would have to be some totally new system in order to be guaranteed to be statically resolved, and I can't imagine any version of that which produces readable code.


It is possible that generators may be able to help here, by making it easier to write iterators. But that also assumes a very powerful optimizer, and puts a burden on it to think more (and longer) about how to optimize.

@tauoverpi
Sponsor Contributor

Making this an expression (|x| x and |x| blk: { break :blk x }) would be a bit nicer and would avoid changing what try and return mean within the scope of a function. That makes more sense with this being an inline AST trick and not a separate function in its own right, since otherwise it becomes a special case of the two. Labelled or otherwise, try would fit better outside of macro definitions.

Other than that, sounds rather neat.

@rohlem
Contributor

rohlem commented Nov 4, 2020

Indeed, all of the optimizations you described are possible and will probably be done in release-fast mode. But the other build modes also matter, and so does compilation time.

Valid point for unoptimized Debug mode and compilation time.

Limiting this to comptime-known values doesn't actually work.

I was a bit unclear. I meant that the syntax of breakto <expression> [optional value] could limit expression to be a ComptimeLiteral, but the compiler would end up creating an enum to represent all labels in the function. (AFAIK automatic enumeration and a goto table are how suspend/resume point information is managed for async functions already, so basically this exposes an idea like that to userland.) EDIT: And I meant to switch on the result of self.foreach, which would be that enum's type; yes, that would require specifying the enum in the union return type.


Here's one more idea to help the syntax of the status-quo approach (though decidedly not the compilation complexity):

//Introduce a new builtin: @currentScopeMixin(comptime Extension: type): anytype.
//This magically makes all stack declarations visible at this point (including function parameters) be auto-wrapped into a struct instance, and adds all declarations of `Extension` to that struct type.
//Downside: Would require adding `const` fields to the language for the semantics of this builtin type. (Then again, we already have `comptime` fields just for tuples...)

fn findKeysWithValue(self: @This(), value: Value, allocator: *Allocator) ![]*Key {
    var list = std.ArrayList(*Key).init(allocator);
    errdefer list.deinit();
    try self.foreach(&@currentScopeMixin(struct{
        fn each(self: anytype, k: *Key, v: *Value) !void {
            if (v.* == self.value) {
                try self.list.append(k);
            }
        }
    }));
    return list.toOwnedSlice();
}

EDIT: fixed passing the address of the instance constructed by @currentScopeMixin, not a copy.

@data-man
Contributor

data-man commented Nov 4, 2020

When passing a macro as an argument, the capture syntax (|x|) is used to declare parameters and a scope is opened. If a macro has no parameters, an empty capture list (||) is used.

Zentropy += 1

@ghost

ghost commented Nov 5, 2020

My primary concern here is that whether or not a macro can be used depends on the enclosing function being inline, which is declared at the definition -- so, it's a syntactically nonlocal, semantically significant detail, which contradicts the very first point of "Why Zig?". For that reason, I'm against this. (Addressed by below comment.)

@SpexGuy
Contributor Author

SpexGuy commented Nov 5, 2020

breakto

This solves the problem but doesn't provide the same guarantees. Instead of doing jmp address you are now doing jmp table[runtime_index], which is a much slower operation. The optimizer might be able to make this better in some cases, but it's far from clear that it always will. And if it doesn't, there's no warning or error; you just have slower code than if you had written the loop inline yourself.

@currentScopeMixin

This is interesting, but it's by definition an implicit stack capture, which is exactly what this proposal is trying to avoid. The result of this builtin could easily be put into a struct or copied somewhere else and referenced later, causing stack corruption.

Making this an expression

I think this is a good idea. Blocks will still be necessary to encode a statement, but blocks are expressions so this is ok.

linkedList.foreach(|x| { sum += x; });
linkedList.sumIf(|x| x > 4);

Zentropy += 1

I'm not sure what this means, but if I had to guess based on the context you're saying that empty capture lists aren't a thing anywhere else in the language so this is inconsistent? That's a fair point. An alternative would be to have an operator that indicates that a macro is being declared. Maybe something like this?

linkedList.foreach(=> |x| { sum += x; });
sevenTimes(=> std.debug.print("this will print seven times\n", .{}));

whether or not a macro can be used depends on the enclosing function being inline

I may not have been clear in the proposal, but the function which passes the macro doesn't need to be inline. Only the function receiving the macro needs to be marked inline. The inline specifier is in the function header, right next to the macro prototype in the parameter list, which is validated by the compiler. So you're not going to get a surprise compile error in one place because some other function isn't inline. The compile error will appear on the function that is trying to declare a macro prototype parameter even though it isn't marked inline. Macros and functions are not at all interchangeable, so I don't see the problem here.

@ghost

ghost commented Nov 5, 2020

Upon understanding the proposal a bit more, I think it's a good idea, but should be managed with care. I think, in particular, that there should be no special-case control flow -- keywords should require labels to snap to macro scope, as in blocks (this fits well with making macro bodies expressions, as well as #5610). Also, I don't think a macro type declaration needs parameter names, since a macro instance will use its own anyway.

An alternative would be to have an operator that indicates that a macro is being declared. Maybe something like this?

If we do have an operator, I think it should be infix: |x, y| => /* body */. Not that one specifically though, since it might look weird in switch arms (a => => //...) -- maybe ->, or :?
In typing this, I realised something -- if a comptime switch returns a macro (which should be allowed, as it doesn't have the temporal problem of assigning it), then that creates a syntactic ambiguity with payload capture:

inline_func(switch (foo) {
    .bar => |x| x < 4, // Is this a payload capture, or a macro?
    else => unreachable,
});

To resolve that, why not reuse the trick we already use to declare literals without prepending them with types -- the humble .. Declare a macro type as macro (T, S) R, write a concrete macro as .|x, y| z or .|| z. I don't think the empty bars are a problem here, since they don't play exactly the same role as in branch capture -- here, they mean "capture zero elements", whereas missing bars in a branch expression means "don't capture this one element".

I have a feeling Andrew might reject this proposal out of hand, because it looks a bit too magical -- I hope that doesn't happen. It really is incredibly powerful, and there's no magic happening if you think about it.

@rohlem
Contributor

rohlem commented Nov 5, 2020

breakto

Instead of doing jmp address you are now doing jmp table[runtime_index]

Again, this holds true for unoptimized Debug mode - and actually, only if Debug mode is allowed to override user-forced inlining decisions. I honestly don't remember the performance of Debug mode ever being under so much scrutiny, but fair enough.

EDIT:

The optimizer might be able to make this better in some cases but it's far from clear that it always will.

Oh, I just now realized that the intermediate return unifying control flow is the source of this issue (as I understand it).
So do we need the label enum type to be comptime-only? I guess that doesn't work because we need to differentiate them by runtime path.
So we need direct control flow, a normal return doesn't quite cut it.

@currentScopeMixin

This is interesting, but it's by definition an implicit stack capture, which is exactly what this proposal is trying to avoid. The result of this macro could easily be [...]

This proposal introduces a new comptime-only type and states new rules that apply to no other type.
Assuming we need these rules for the optimizations we want out of them, we can put the same restrictions on the result of @currentScopeMixin, even if it reduces symmetry and flexibility:

  • Since we pass it as a pointer, the receiving function already shouldn't make assumptions about the pointee's lifetime.
  • To make sure that the pointee isn't copied, we can change the result of @currentScopeMixin from being a struct to being an opaque type. (And maybe @currentScopeMixinAddress is the better baseline then, idk.)
    EDIT3: This would mean that the member functions first have to @ptrCast(*@This(), self). If @ptrCast isn't valid for this for some reason, it might be valuable to provide builtins @opaqueify and @deopaqueify for this.
    (see also non-copy proposals Ability to mark a struct field as "pinned" in memory #3803 and make aggregate types non-copyable by default; provide copyable attribute #3804 which may provide an alternative without changing to opaque)

So now that we have given it the same usage restrictions, here are the remaining differences between the proposed inline macro syntax and the proposed @currentScopeMixin[Address]:

  • Constructing macros needs some kind of syntax, and it seems like it would benefit from unique syntax that isn't used anywhere else - although granted, stateless lambda syntax could end up looking similar.
    The syntax for a builtin function returning an instance of a builtin type already exists, e.g. @typeInfo.
  • The idea of associating data with comptime-known functions as struct and opaque already exists, and is afaik the prevalent way of currently implementing this.
    That means that all code currently written for them (that doesn't copy the pointer/pointee somewhere) will automatically work with a @currentScopeMixin, and code written for a @currentScopeMixin will automatically work with a manually-crafted struct or opaque type.
  • To me it seems that specifying the semantics in terms of struct and opaque and optimizing their usage helps a broader spectrum of use cases than would benefit from the new macro type.
  • To get the scoped-try behaviour of this, I'd prefer try :block syntax usable anywhere. EDIT2: Although actually, if a block's type is deduced as ErrorUnion, this coincides with <expression> catch |e| breakto :label2 e.

I think the one thing left unaccounted for would be break-ing from within the macro code into the caller code (EDIT: without requiring the extra step of a function result and a manual breakto; actually breakfrom might be a better name).
Having a general mechanism for direct control flow like this (EDIT: such as a comptime-only label/scope type) would make sense. It may come down to whether that makes the intermediate function's defer sequence feel like "hidden control flow" or not.

Also thank you for seriously addressing my comments.

@SpexGuy
Contributor Author

SpexGuy commented Nov 5, 2020

there should be no special-case control flow

Yeah, I've come around to this one. I always found nonlocal returns to be the weirdest part of kotlin's version, but that's because I was thinking of them as producing lambda functions. Thinking of them as expressions/blocks makes this much less weird. I'll modify the proposal to bring this in.

if a comptime switch returns a macro

I don't think this can be allowed. Macros cannot be values, and arbitrary expressions cannot produce them. If they could, then you would have this problem:

// argument is block expression producing macro value
withMacro(blk: {
    var local: u32 = 0;
    break :blk => { local += 1 };
});

This macro captures a variable that goes out of scope before use. So invoking this macro corrupts the stack. That sort of stack capture was what was ruled out in #229. In theory, there is an alternate form of this proposal where macros are values and the compiler verifies at each inlining site that they don't capture any variables that are out of scope. But I felt that that could produce all kinds of complexity that doesn't make sense. (what does it mean to store a macro in a comptime struct and then invoke it in a separate call to the same function at comptime? Does it capture the locals that were available in the first call or the second? Does it just reference variables in scope by name? Would it make sense to invoke it in a separate function that has locals of the same names?)

I think macros need to be all-the-way values (with variables and struct fields and everything) or all the way not (doesn't even support assignment, not compatible with expressions), so that it's really clear to the programmer what can or can't be done with them. I don't think this restriction significantly limits their usefulness.

To be really clear, I'm not proposing modifying the Expr node at all. I'm proposing modifying ArgumentList (or ExprList as it's currently called) and ParamDeclList to accept macro declarations and macro prototypes, respectively. Since a macro parameter isn't actually a variable, I wouldn't be opposed to requiring some special token on it (e.g. $myMacroParam) to turn macros in expressions into parse errors instead of compile errors.

If you need to comptime switch to determine the behavior of a macro, you can put the switch inside the macro without any loss of functionality. If you need to comptime switch based on the number of arguments in a macro, you must put that switch outside of the entire function call expression. But the same requirement holds for the function declaration. Macro prototypes are not compatible with type expressions either, so in order to have a function which takes a variable number of macro parameters you must have multiple function declarations and a comptime value which is set to one of them. I can't really think of a reason you would ever want to do this though.

Is this a payload capture, or a macro?

I actually chose the => token because of this ambiguity. It is both. You can consider a switch to be an expression which selects one of several macros to immediately run based on the switch value and match expressions.

we can put the same restrictions on the result of @currentScopeMixin

This would work, but I think if we're going to introduce those restrictions we might as well get a really convenient syntax out of it.

direct control flow like this would make sense

Yeah, I've been thinking a bit about how this would work. It's kind of like a "reverse inline". Like "I want the compiler to take this code and inline it at each return in the called inline function, so that I can have the parameter be comptime known".
Something like

// mark the return value as "inline", may be captured at call site with inline_return
// return expressions must be comptime known
inline fn compareFmt(a: usize, b: usize) inline []const u8 {
    return if (a < b) "{} < {}" else "{} >= {}";
}

fn compareToFour(a: usize) void {
    inline_return (compareFmt(a, 4)) |fmt| {
        // fmt is comptime-known but may hold different values.
        // this is kind of a generalization of inline for.
        std.debug.print(fmt, .{a, 4});
    };
}

(side note, I also kind of want this for switches)

const T = union(enum) {
    a: extern struct { ptr: *u8, len: usize },
    b: extern struct { len: usize, ptr: *u8 },
};

var value: T = ...;
const len = inline switch (value) {
    // this text compiles for each possible value but they produce different machine code
    else => |val| val.len,
};

But I think if your goal is something like "iterate a data structure and return immediately if you find a certain value", this is a pretty tortured way to express that. It's also worth noting that you can emulate this inline return feature with macros, in what I think is a much more clear way:

inline fn compareFmt(a: usize, b: usize, withFmt: macro (comptime fmt: []const u8) void) void {
    return if (a < b) withFmt("{} < {}") else withFmt("{} >= {}");
}

fn compareToFour(a: usize) void {
    compareFmt(a, 4, => |fmt| {
        // fmt is comptime-known but may hold different values.
        std.debug.print(fmt, .{a, 4});
    });
}

@tauoverpi
Sponsor Contributor

I find the => |x| x syntax more confusing than .|x| x would have been, since it doesn't fit the style of the rest of the language and messes with the visual parsing of the => token (which may be due to Haskell/Idris use). Maybe some other syntax would do, but => communicates this => (to) that the same way -> would.

Another note, this would be brilliant for testing as you could make testing contexts without much friction.

const text = "argument not supplied by fuzzer";
fuzz(rngFn, testedFn, .{ .@"5" = text }, .|args, result| {
  const expect = .{ 0, 1, 2, 3, 4, text, 6 };
  testing.expectEqualWithSlice(expect, try result);
});

Along with other "do-notation-like" contexts where values need to be fixed up afterwards.

@zzyxyzz
Contributor

zzyxyzz commented Nov 6, 2020

Concerning notation: Why is the => or the proposed alternative .| | necessary? From skimming the thread, the reasons seem to be:

  1. To distinguish macros and capture lists
  2. To avoid empty ||

However, AFAICT, capture list expressions as currently used in the language have exactly the semantics of macros, so there is no need to distinguish them. E.g., for (elts) |x| { foo(x); } could actually be implemented in userland with this proposal: for(elts, |x| { foo(x); }). (In terms of interface; the actual implementation would need a goto, and you couldn't use break or continue.)
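The receiving side of such a userland construct could look like this sketch under the proposal (hypothetical syntax; `forEach` is an illustrative name, since `for` itself is a keyword):

```zig
// Hypothetical: a userland "for" as an inline function taking a macro.
inline fn forEach(elts: []const i32, body: macro (x: i32) void) void {
    var i: usize = 0;
    while (i < elts.len) : (i += 1) {
        body(elts[i]);
    }
}

// Call site, in the capture-list style suggested above:
//     forEach(elts, |x| { foo(x); });
```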

The capture expressions in switch, catch, while, else, etc. also have the same semantics: they are kind of inline mini-functions that are fed the arguments specified between the | | by the parent construct, but otherwise interact directly with the parent scope like ordinary blocks. In short, a macro as proposed here.

An extra decorator is also not necessary to distinguish macros from similar objects: If it appears within the parameter list of a function, it's a macro; otherwise it is one of the special constructions using capture lists. If the receiving function expects a normal function argument rather than a macro (not meant for inlining), the compiler can yell at you.

Concerning empty capture lists: True, they don't occur in the language, but that's because all current uses of capture lists pass at least one argument. But in the generalization proposed here, there are legitimate cases where you don't need to pass anything and simply execute a code block as is, so this seems like a non-issue to me.

@ghost

ghost commented Nov 6, 2020

I like that much better, actually. Although it does mean an empty capture list has to be examined in context to distinguish it from the error set combination operator.

@tauoverpi
Sponsor Contributor

Changing the error set sum to && wouldn't be too much of a change and would somewhat keep the meaning of joining the sets, at the cost of looking slightly less visually appealing.

@rohlem
Contributor

rohlem commented Nov 6, 2020

@tauoverpi & means bit-intersection, && means logical conjunction (can be seen as event intersection) in other languages. For it to mean "join"/union seems extremely counter-intuitive to me, at least from a mathematical perspective.

@zzyxyzz
Contributor

zzyxyzz commented Nov 6, 2020

Although it does mean an empty capture list has to be examined in context to distinguish it from the error set combination operator.

Yeah, I forgot about that. The ambiguity is only in the tokenizer, though (I think). The parser itself will not be made much more complicated because of it. So maybe this particular ambiguity can be deemed acceptable.

@tauoverpi
Sponsor Contributor

@rohlem the issue there is that no other operator really fits either, apart from variations on ||, and any additions to it make it just as confusing. The other choice would be a keyword like either or similar, but that has the same issue of not reading as well.

@zzyxyzz
Contributor

zzyxyzz commented Nov 6, 2020

Edit: The ambiguity is more serious than I thought, since A || blk: { break :blk error{Bar}; } is in fact legitimate and currently supported syntax. Still, since || as error set join is an infix operator, while in a capture list context it can only occur in prefix position, that should be sufficient for disambiguation. The original comment is mostly left as it was, but I'm less sure of it now.

@tauoverpi, I don't think that changing the error set join is necessary at all. To expand a bit on my previous comment, the ambiguity would be pretty bad if it were possible to write something like
ErrorSet1 || blk: { break :blk ErrorSet2; }(). Fortunately, there is no reason whatsoever to allow that, and
the context-dependence of || therefore becomes quite narrow: If it is encountered as an infix operator in a type context, it always means error set join. If it is encountered as the first token in a function argument, it always means empty capture list. (Unless someone can think of further complications?) The dual meaning of || can only really throw off direct pattern matching on the source text (e.g. in simple syntax highlighters or search-and-replace). But that can't be helped, and besides, there are already other things that do that, for example ! in !void, !true, != and "Hello!". Personally, I don't consider this a serious issue, but if others disagree, I'd prefer using the .|x| syntax rather than change the error set combination operator.
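To spell out that disambiguation rule, a minimal sketch (the macro-call line is hypothetical syntax):

```zig
// Infix || in a type context: always an error set join.
const E = error{Foo} || error{Bar};

// || as the first token of a function argument: always an empty
// capture list introducing a macro body (hypothetical syntax):
//     inline_func(|| doWork());
```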

<offtopic> I was surprised to find out that the error set join is a fairly rare operation, with a whopping 133 legit matches in the entire Zig repo. So maybe changing its syntax wouldn't be that dramatic. But like @rohlem, I'm strictly against && on grounds of set theory, and I doubt that Andrew would be interested in @ErrorJoin(A, B, C) or something 😁) </offtopic>

@SpexGuy
Contributor Author

SpexGuy commented Nov 6, 2020

ambiguity of ||

Yeah, this is not meaningfully ambiguous. As @zzyxyzz points out, there is already precedent in the language for this sort of context dependence, in inferred error types in return expressions. Unary not is not normally supported on types, but if it begins a return type expression it means "infer the error set". So this would be a solution.

However, I think it's inconsistent to require an empty capture list for macros but not require it in variants of if/for/switch/while with no captures. And we can't have no token and no capture list, because that's ambiguous with an unlabeled block that executes with the arguments. The compiler could figure it out, but it would be really hard for programmers to read. I'm open to other tokens besides => though, or even a keyword. But . doesn't quite fit, because without the capture list that's .{ x += 4; } which is ambiguous with anonymous structs. Maybe macro, inline, expr, or hoist?

@yohannd1

yohannd1 commented Jun 30, 2021

I think this would be a nice addition to the language. And it would be even cleaner if the function could be placed outside the parentheses, like in Java or Kotlin:

const sorted = std.sort(i32, .{3, 2, 4}) |lhs, rhs| {
   return lhs >= rhs;
};

@akvadrako that does look cleaner, but I have some "issues":

I don't think it's very clear on what is going on

This is actually an issue that I also feel on ruby (although I have very little experience with it, so it might just be me crying about it) and I probably would on Kotlin and Java too.
I feel like a macro declaration right after an expression isn't very easy to read and it might become confusing in more complex expressions:

// current state of the proposal
_ = funcWithMacro1(10, |x| funcWithMacro2(20, |y| 0));

// with this suggestion
_ = funcWithMacro1(10) |x| funcWithMacro2(20) |y| 0;

Maybe something like a |...| placeholder could be better:

const sorted = std.sort(i32, .{3, 2, 4}, |...|) |lhs, rhs| {};

It would be used to indicate to the compiler that a macro declaration should follow right after the end of the expression.
Still, I don't know if this really solves it or makes it even more complicated...

confusion with builtins

// this is valid syntax:
const some_value: ?i32 = null;
if (some_value) |x| {}

// with your suggestion, this could also be valid syntax:
pub fn ifPositive(x: i32, m: macro (i32) void) void {}
ifPositive(10) |x| {
    std.debug.print("{} is positive!\n", .{x});
}; // other than this semicolon it's pretty similar

I don't know if this would really be a bad thing, since it looks nice, but it does make distinguishing builtin constructs from userland constructs harder.

what to do if there's more than one macro being passed?

// what I think this proposal aims to have
_ = callWithTwoMacros(|x| x + 10, |y| y + 5);

// what would happen here? (this is a placeholder)
_ = callWithTwoMacros() |x| x + 10, |y| y + 5;

Maybe this syntax could only be allowed for calls that pass a single macro.


Overall, I don't think this alone would serve all use cases better than the proposed syntax, and I don't think it's worth having both of them, as it would make things more complex and would deviate from the "only one obvious way to do things" rule.

@zzyxyzz
Contributor

zzyxyzz commented Jun 30, 2021

I just noticed that taking the macro body outside the parentheses is ambiguous, so it's not possible in any case.

const x: usize = 1;
const y = foo() | x | blk: { break :blk x; }; // bitwise OR, not a macro

@topolarity
Contributor

To throw in an interesting connection: Some folks were inspired by PASCAL/ALGOL's second-class functions, and they had an idea to allow values to be made local, an attribute which has these rules:

  1. References to a local variable may never be held/stored in any non-local variable
  2. Functions may not return local variables
  3. local values may not be stored in mutable variables (neither assigned to an object nor to its fields)

According to these rules, a macro basically acts like a local function closure, which also happens to be comptime and inlined. If we add a couple rules for local fn(...), we get very close to a version of the proposal with function-like control flow only:

  • A stack-capturing anonymous function can auto-convert to a local fn(...) only when passed in directly as a parameter (i.e. not when assigned to local fn variable)
  • We restrict the compile-time reflection on local fn to hide the fact that it's a closure (@TypeInfo, etc.)

(Note that the only thing that can be done with a stack-capturing anonymous function, i.e. an automatic closure, is for it to be converted automatically to a local fn by argument passing.)

This approach does not include non-local control-flow. The benefits are that:

  1. Less new syntax is required, both at the caller and callee sites (esp. no new macro syntax)
  2. local can be used to give a strict stack discipline to other variables

To see where (2) might be useful, consider a helper like Python's with open(...):

inline fn with_file(filename: []u8, flags: OpenFlags,
                    comptime action: local fn(f: local File) !void) !void {
    const file = try openFile(filename, flags);
    defer file.close();
    try @call(.{ .modifier = .always_inline }, action, .{file});
}

This can then be used in user code like:

fn main() !void {
    const second_half = "world!\n";
    try with_file("test.txt", fn(f: local File) !void {
        try f.write("Hello, ");
        try f.write(second_half);
    });
}

The advantage being that it's impossible to leak a reference to the File, so it's always perfectly safe to close.

In a sense, we're cheating here - a local fn is not just a normal fn which happens to be local. It's actually a snippet/closure/etc. However, from the user's perspective this is irrelevant. A local fn carries extra restrictions that make this underlying nature of a local fn(...) un-observable. This breaks Spex_Guy's recommendation of "all-the-way values or all-the-way not", but it also separates the macro semantics into two smaller parts: (1) stack-restricted object semantics in the form of local, and (2) restricting the reflection/built-ins for local fn(...)

This isn't intended to be an alternative proposal or anything, just a chance to connect this to some new ideas - I'm not convinced local is useful enough to be worth the added complexity for Zig (and I think the name local also needs some bike-shedding), but I wanted to share in case it gives anyone some crazy ideas about where else stack-restricted variables could be useful.

@kayomn

kayomn commented Dec 5, 2021

To give an example relevant to my own project where this would be powerful: I'm working on a virtual machine for a toy scripting language that tries to provide as memory-safe a native API as possible.

Currently, doing this:

const message = vm.getGlobal("message");

// This may trigger the vm to do a GC sweep and free "message" as it has no references inside the VM's reachable address space.
myVmFunc.call(.{message});

// Message may now potentially be invalid.

This is less than ideal. Currently, I have a memory binding API that works as follows:

// No matter what, this data will never be freed now.
const message = vm.bindGlobal("message");

defer vm.unbind(message);

myVmFunc.call(.{message});

However, this introduces a lot of boilerplate and potential for footguns, as the dynamic type system means that any value, including primitives like integers and numbers, needs to be "bound" and "unbound"; otherwise, even more boilerplate would be introduced to do concrete type checking.

With stack-capturing macros, the following could be done instead for simple bindings.

// The lifetime of the data is both clearly readable and easy to write.
vm.getGlobals(.{"message", "otherstuff"}, => |globals| {
  // If the value is a dynamic resource, it was bound.
  myVmFunc.call(.{globals[0]});
});

@iacore
Contributor

iacore commented Jan 7, 2022

So this is now a duplicate of #4170

@kayomn

kayomn commented Jan 7, 2022

So this is now a duplicate of #4170

Looking at both issues, I'd disagree. This is attempting to introduce some form of functional programming patterns in a systems-level manner whereas #4170 is about supporting first-class functions without any focus on stack-capturing beyond your own suggestions a few days ago.

@zzyxyzz
Contributor

zzyxyzz commented Feb 5, 2022

There is another small way in which macros can be made even more first-class. Generally, capture expressions in loops and other places in the language allow you to choose whether to capture by value (for (slice) |x| ...) or by pointer (for (slice) |*x| ...). Macros, in the currently proposed form, do not have this flexibility. However, we could adopt the convention that a macro argument declared as a pointer is converted to an optionally mutable capture:

inline fn foo(action: macro(item: i32) void) void {
    // ...
}

inline fn bar(action: macro(item: *i32) void) void {
    // ...
}

foo(|x| { ... });  // ok
foo(|*x| { ... }); // compile error
bar(|*x| { ... }); // ok
bar(|x| { ... });  // also ok; dereference happens automatically
                   // and is elided on the spot

With this, macro expressions would have the same semantics and flexibility as native captures.

@Jarred-Sumner
Contributor

This would be so useful

Writing iterators in Zig right now has some footguns

  • my_iterator.index would seemingly point to the current item, but updating iterator.index needs to happen at the end of the loop instead of the beginning of the loop or else it will point to the following item. How do you run code at the end of the loop when you don't have access to the underlying loop?
  • The code doing the looping needs to either return an iterator or accept a function to call per iteration. What if you want to stop iterating in the middle of the function? Now you've handrolled a limited version of coroutines when all you wanted was to transcode latin1 into UTF-16.
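Under this proposal, the "stop in the middle" case could fall out of labeled breaks from the macro body, roughly like this sketch (hypothetical syntax; `forEachEntry`, `Map`, `Key`, and `Value` are illustrative names):

```zig
// Hypothetical: early exit from iteration without coroutines.
// forEachEntry would be an inline fn taking a macro parameter.
fn findValue(map: *Map, needle: Key) ?Value {
    return blk: {
        map.forEachEntry(=> |key, value| {
            if (key.eql(needle)) break :blk value; // exits the whole search
        });
        break :blk null;
    };
}
```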

@Jarred-Sumner
Contributor

Jarred-Sumner commented Aug 22, 2022

Here's another usecase

    var file = std.fs.openFileAbsoluteZ(absolute_path, .{ .mode = .read_only }) catch {
        const WarnOnce = struct {
            pub var warned = false;
        };
        if (!WarnOnce.warned) {
            WarnOnce.warned = true;
            Output.prettyErrorln("Could not find file: " ++ absolute_path ++ " - using embedded version", .{});
        }
        return Holder.file;
    };

With stack capturing macros, this code becomes something like

    var file = std.fs.openFileAbsoluteZ(absolute_path, .{ .mode = .read_only }) catch {
        warnOnce({
             Output.prettyErrorln("Could not find file: " ++ absolute_path ++ " - using embedded version", .{});
        });
        return Holder.file;
    };

and warnOnce becomes reusable instead of a one-off

@12Boti

12Boti commented Aug 22, 2022

@Jarred-Sumner You can already do this by simply returning a bool:

const std = @import("std");

fn warnOnce() bool {
    const WarnOnce = struct {
        pub var warned = false;
    };
    const r = WarnOnce.warned;
    WarnOnce.warned = true;
    return !r;
}

pub fn main() !void {
    var i: u32 = 0;
    while (i < 10) : (i += 1) {
        if (warnOnce()) {
            std.debug.print("warning\n", .{});
        }
    }
}

To make this reusable, you could pass an opaque type to it:

const std = @import("std");

fn warnOnce(comptime T: type) bool {
    _ = T;
    const WarnOnce = struct {
        pub var warned = false;
    };
    const r = WarnOnce.warned;
    WarnOnce.warned = true;
    return !r;
}

pub fn main() !void {
    var i: u32 = 0;
    while (i < 10) : (i += 1) {
        if (warnOnce(opaque {})) {
            std.debug.print("warning1\n", .{});
        }
    }

    i = 0;
    while (i < 10) : (i += 1) {
        if (warnOnce(opaque {})) {
            std.debug.print("warning2\n", .{});
        }
    }
}

@judofyr

judofyr commented Aug 28, 2022

I'd just like to point out that a function which has a "macro" as a parameter doesn't have to be inline.

// This function:
fn foreach(self: *@This(), action: macro (key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(&node.key, &node.value);
        }
    }
}

// … can internally be represented as:
// (however, with the restriction that you're not allowed to store `action` in a variable, etc.)
fn foreach(self: *@This(), ctx: *anyopaque, action: *const fn (ctx: *anyopaque, key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(ctx, &node.key, &node.value);
        }
    }
}

This turns macro into being only about constructing closures which are guaranteed to always be valid (while referring to values on the stack). The function still behaves as a normal function and there's no change in control-flow. There's only one instance of this function and the inliner may or may not decide to inline it where it's being used.

It would of course also be neat to have a way of customizing the control flow in these macros, but that does seem like a much more complicated problem so maybe we can solve it later? I feel like just having these macros (with no change in control-flow) would enable a lot of nice use cases.
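To illustrate the lowering judofyr describes at a call site, the captured stack state could (hypothetically) be closure-converted into a context struct; for this sketch, assume the captured `total` is a `u64` and `Value` is `u64`:

```zig
// Source-level call under the proposal:
//     var total: u64 = 0;
//     map.foreach(=> |k, v| { total += v.*; });

// Hypothetical internal lowering: captured state becomes a context
// struct, passed as an opaque pointer next to a plain function pointer.
const Ctx = struct { total: *u64 };

fn sumBody(ctx_ptr: *anyopaque, key: *Key, value: *Value) void {
    _ = key;
    const ctx = @ptrCast(*Ctx, @alignCast(@alignOf(Ctx), ctx_ptr));
    ctx.total.* += value.*;
}

// var total: u64 = 0;
// var ctx = Ctx{ .total = &total };
// map.foreach(&ctx, sumBody);
```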

@kayomn

kayomn commented Sep 22, 2022

I'd just like to point out that a function which has a "macro" as a parameter doesn't have to be inline.

// This function:
fn foreach(self: *@This(), action: macro (key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(&node.key, &node.value);
        }
    }
}

// … can internally be represented as:
// (however, with the restriction that you're not allowed to store `action` in a variable, etc.)
fn foreach(self: *@This(), ctx: *anyopaque, action: *const fn (ctx: *anyopaque, key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(ctx, &node.key, &node.value);
        }
    }
}

This turns macro into being only about constructing closures which are guaranteed to always be valid (while referring to values on the stack). The function still behaves as a normal function and there's no change in control-flow. There's only one instance of this function and the inliner may or may not decide to inline it where it's being used.

It would of course also be neat to have a way of customizing the control flow in these macros, but that does seem like a much more complicated problem so maybe we can solve it later? I feel like just having these macros (with no change in control-flow) would enable a lot of nice use cases.

Does this take into account Zig's built-in async-await primitives? In the initial proposal, inline implicitly excludes the async calling convention.

Also, more generally on the thread topic, does anyone have thoughts on how this would interact with async/await? Would the two be mutually exclusive, in case there's some way of escaping stack references?

@Jarred-Sumner
Contributor

Jarred-Sumner commented Dec 12, 2022

I'd be very curious about the perf impact of this proposal on iterators in Zig.

For example, say you're iterating over a HashMap with 1 million entries:

zig/lib/std/hash_map.zig

Lines 817 to 842 in 5238f9c

pub const Iterator = struct {
    hm: *const Self,
    index: Size = 0,

    pub fn next(it: *Iterator) ?Entry {
        assert(it.index <= it.hm.capacity());
        if (it.hm.size == 0) return null;

        const cap = it.hm.capacity();
        const end = it.hm.metadata.? + cap;
        var metadata = it.hm.metadata.? + it.index;
        while (metadata != end) : ({
            metadata += 1;
            it.index += 1;
        }) {
            if (metadata[0].isUsed()) {
                const key = &it.hm.keys()[it.index];
                const value = &it.hm.values()[it.index];
                it.index += 1;
                return Entry{ .key_ptr = key, .value_ptr = value };
            }
        }
        return null;
    }

In the current implementation, the following lines would likely be evaluated 1 million times:

         if (it.hm.size == 0) return null; 
         const cap = it.hm.capacity(); 
         const end = it.hm.metadata.? + cap; 
         var metadata = it.hm.metadata.? + it.index; 

For cases when you don't need to iterate and modify (probably the default scenario), that's very wasteful.

With stack-capturing macros, the next function could be rewritten to only do that work once.
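A sketch of that rewrite (hypothetical `macro` syntax, loosely following the `Iterator` fields above): the size check and bounds computation happen once, and the body is inlined at the single yield point:

```zig
pub inline fn forEachEntry(self: *const Self, body: macro (Entry) void) void {
    if (self.size == 0) return;

    // Computed once, instead of once per call to next():
    const cap = self.capacity();
    var index: Size = 0;
    var metadata = self.metadata.?;
    const end = metadata + cap;

    while (metadata != end) : ({
        metadata += 1;
        index += 1;
    }) {
        if (metadata[0].isUsed()) {
            body(Entry{
                .key_ptr = &self.keys()[index],
                .value_ptr = &self.values()[index],
            });
        }
    }
}
```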

@rohlem
Contributor

rohlem commented Dec 16, 2022

EDIT: Turns out these are essentially the same points I posted two years ago, my bad.

I might be wrong about this, but aren't all code snippets posted here regarding optimization primarily solved by aggressive inlining?
From my understanding the point of interaction is that this proposed language feature, with the proposed semantics, requires the compiler to inline the passed function.
(Maybe also forcing their basic blocks to be fused, or whatever the correct jargon is.)

Under the status quo, we can pass a function via enum{fn f ...}.f and then decide whether the call should be inlined via @call(.always_inline).
With the proposal, there is a new way to do it, with a stack-capturing macro, but now the function call has to be inlined.
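For reference, the status-quo pattern being described looks roughly like this sketch (`countMatching` and `isEven` are made-up names for this example):

```zig
const std = @import("std");

// The callee takes a comptime-known function and decides how it is called.
fn countMatching(comptime pred: fn (u32) bool, items: []const u32) usize {
    var n: usize = 0;
    for (items) |x| {
        // Swap `.always_inline` for `.auto` to leave the decision to the optimizer.
        if (@call(.always_inline, pred, .{x})) n += 1;
    }
    return n;
}

test "pass a predicate via a container declaration" {
    const items = [_]u32{ 1, 2, 3, 4, 5 };
    const n = countMatching(struct {
        fn isEven(x: u32) bool {
            return x % 2 == 0;
        }
    }.isEven, &items);
    try std.testing.expectEqual(@as(usize, 2), n);
}
```

The point being made is that here the inlining decision lives at the call inside the callee, and can be changed without touching the caller's code.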

I don't like the idea of deciding whether callers of my function would rather always have their predicate inlined or not.
That seems like an optimization decision, which can be decided after analysis and/or benchmarking,
and I would personally like to initially defer it to the optimizer/compiler and not have to think about it.
And when I do have to change that decision, it would be nice to rewrite as little code as possible, which seems easier when not separating these concepts on a language level.

Moreover, any optimization like this that we could teach the compiler to apply to all functions would benefit not just the ones written as stack-capturing macros.
Therefore, even if this proposal is rejected, taking note of the cases collected here and improving the compiler's handling of normal functions in this regard would, in my view, be a larger benefit than introducing this new concept while the old way of implementing it remains demonstrably slower.


Completely disconnected from this performance talk: yes, I would also appreciate syntax for lambda functions,
and if they were actually structs that support captures, I would end up using those too.
If such syntax joins the language, I still think control over their inlining would be useful.

@jkensik

jkensik commented Mar 27, 2023

is there any update on this proposal, or functionality for supporting closures?

@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023
@moderatelyConfused

moderatelyConfused commented Mar 23, 2024

Just adding a few thoughts here as a new zig user who recently googled "can you pass a block into a function zig" and eventually ended up here. Given all of the discussion in this thread about making macros identical to blocks in both syntax and semantics, it seems strange to call them macros/snippets/whatever when block would perfectly describe how they behave.

This also simplifies the "rules" of macros, since if a "macro" is just a block, it's already true in zig today that you can't assign a block to a const value, you'll just get the evaluated result from the block if you do that:

// This will give you 4, not the block itself
const b = blk: {
    break :blk 4;
};

And again it's already true in zig today that blocks with parameters are only allowed in certain places:

// ok, the language allows this
if (x) |y| {
    ...
}
// not allowed
const z = |w| {
    ...
};

So instead of creating a new "macro" feature with a complicated set of rules around initialization and usage that make it indistinguishable from a block, this feature is just extending the list of places it's legal to put a block with parameters (and without) from if, while, for, switch cases, etc, to include (inline) function parameters.

This also makes the question of semantics pretty straightforward (and seem to match the proposal, if I'm reading it correctly):

// any control flow statements like continue, break, return in this block...
action({
    ...
});


// ...should have identical semantics to those in this block.
{
    ...
}

(It also took me quite a long time to find this issue, because it never occurred to me to call this feature a macro - in my head the term "macro" is about naive code splicing and immediately calls up a red flag that reads "beware copy-paste issues and namespace collisions", which aren't relevant to this proposal.)

So maybe consider making the keyword block instead of macro?

@rohlem
Contributor

rohlem commented Mar 23, 2024

@moderatelyConfused I like your simplified specification, however here's a nitpick:

So instead of creating a new "macro" feature [...] , this feature is just extending the list of places it's legal to put a block with parameters (and without) from [...], to include (inline) function parameters.
... example using syntax action({ ... }); ...

This isn't strictly true: Blocks are expressions, so the following is already valid under status-quo (results in passing the result of the evaluation, just as in your assignment example):

const debugLog = @import("std").log.debug;
fn action(x: anytype) void { //mark x inline once we have inline parameters #7772
  debugLog("x is {}", .{x});
}

pub fn main() void {
  action({
    debugLog("1", .{});
  }); //block is evaluated to {} (the void value)
  action(x: {
    debugLog("2", .{});
    break :x @as(u8, 44);
  }); //block is evaluated to @as(u8, 44)
}

To make the proposed behavior not depend on the parameter definition of the callee,
we need some way (e.g. the proposed => syntax) to distinguish an eagerly-evaluated block from a lazily-passed block.

Lazily-passed blocks are repeatedly evaluatable via call syntax, making them look more like status-quo functions,
with the difference that they may unwind their caller function(s) up to their source function.
So another term would be function-ized/-ified/-shaped blocks (which is very close to anonymous functions #1717, but with extra stack-unwinding capabilities; now I'm repeating myself).
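To make the contrast concrete, here is a sketch of the two call shapes side by side. The first is valid today; the second uses the proposed `=>` syntax and does not compile:

```zig
// Status quo: the block is an expression, evaluated exactly once,
// eagerly, before `action` is ever entered.
action(x: {
    debugLog("runs once, at the call site", .{});
    break :x @as(u8, 44);
});

// Proposed: `=>` marks a lazily-passed macro. `action` may invoke it
// zero or more times, each invocation inlined into this stack frame.
// (Hypothetical syntax -- does not compile today.)
action(=> |item| {
    debugLog("runs each time action invokes it: {}", .{item});
});
```

The `=>` marker is what lets the reader (and the compiler) tell the two cases apart without consulting `action`'s signature.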

@moderatelyConfused

moderatelyConfused commented Mar 23, 2024

@rohlem This is a fair nit, I was imagining that blocks-as-expressions would be disambiguated by the callee definition, but there is definitely a readability tradeoff being made there.

One thing I'll point out is that any disambiguating syntax like => is acting as a modifier/visual indicator/sanity check for the programmer since it doesn't actually modify the semantics of the block, which are determined by the block syntax itself and the definition of the callee. All the modifier does is produce a user-friendly compile error if it's not present for a block parameter to an inline function. So I think my point stands that macros are really just blocks, regardless of whether the syntax includes a modifier or not.

Lazily-passed blocks are repeatedly evaluatable by call-syntax, making them look more like status-quo functions,
with the difference being they may unwind their caller function(s) up to their source function.
So another term would be function-ized/-ified/-shaped blocks

I think talking about blocks-as-parameters as functions makes them more complicated to describe, not less. If you imagine a block parameter as a block that will get inlined some number of times within the function you define it in, you don't need to have a concept of stack-unwinding, because after inlining everything ends up in the same stack frame. The only thing you need to remember is that variables and control flow are resolved relative to the original lexical context of the block, and that the code for the block will end up somewhere in the function you're writing 0 or more times.

Considering zig already uses function call syntax for calls to functions declared inline, I think it's consistent with the existing style of the language that a call to a block/macro action() is not a function call but an inline expansion performed at compile time.

which is very close to anonymous functions #1717 but with extra stack-unwinding capabilities

Overall I think the fact that all of the intermediate calls inline away and collapse into a single stack frame makes this proposal significantly simpler than #1717 (and leaves the implementation with few to no tricky implementation choices), not more complicated.
(EDIT: What I really meant here is that #1717 with stack-unwinding would essentially be a full-blown closures proposal and would need to make tricky implementation choices, not that anonymous function syntax itself is all that complicated)
