Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement multi-object for loops #14671

Merged
merged 28 commits into from
Feb 19, 2023
Merged

implement multi-object for loops #14671

merged 28 commits into from
Feb 19, 2023

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Feb 18, 2023

This exposed a latent bug in the (experimental) x86 backend but I'm not letting it block the merge of this PR, since all the other backends are fine.

Closes #7257.

merge checklist:

  • implement multi-object for loops
  • make zig fmt automatically upgrade old code
  • add parser_test.zig test case for zig fmt upgrading the old code
  • improve compile error messages
  • omit OOB safety checks of for loop captures (master branch incorrectly has these)
  • update existing behavior tests to new syntax and get them passing
  • make unbounded looping illegal
  • update stage1.wasm
  • omit safety check for incrementing for loop counter
  • implement all the test cases
  • update the language reference

Release Notes

There are two patterns to update here. The first one can be auto-update by zig fmt, but if you forget to run zig fmt you will see the error:

/home/andy/dev/zig/lib/std/crypto/hmac.zig:49:36: error: extra capture in for loop
            for (ctx.o_key_pad) |*b, i| {
                                   ^
/home/andy/dev/zig/lib/std/crypto/hmac.zig:49:36: note: run 'zig fmt' to upgrade your code automatically

The fix is to add a counter corresponding to the index like this:

--- a/lib/std/crypto/hmac.zig
+++ b/lib/std/crypto/hmac.zig
@@ -46,11 +46,11 @@ pub fn Hmac(comptime Hash: type) type {
                 mem.copy(u8, scratch[0..], key);
             }
 
-            for (ctx.o_key_pad) |*b, i| {
+            for (ctx.o_key_pad, 0..) |*b, i| {
                 b.* = scratch[i] ^ 0x5c;
             }

The second one must be done manually. It occurs when you try to capture a by-value array element by-ref:

/home/andy/dev/zig/lib/std/crypto/hmac.zig:49:21: error: pointer capture of non pointer type '[64]u8'
            for (ctx.o_key_pad, 0..) |*b, i| {
                 ~~~^~~~~~~~~~

If you see this, the solution is simple, just add a & in front of the array input to the for loop:

--- a/lib/std/crypto/hmac.zig
+++ b/lib/std/crypto/hmac.zig
@@ -46,11 +46,11 @@ pub fn Hmac(comptime Hash: type) type {
                 mem.copy(u8, scratch[0..], key);
             }
 
-            for (ctx.o_key_pad) |*b, i| {
+            for (&ctx.o_key_pad, 0..) |*b, i| {
                 b.* = scratch[i] ^ 0x5c;
             }
 

@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. frontend Tokenization, parsing, AstGen, Sema, and Liveness. labels Feb 18, 2023
@candrewlee14
Copy link
Contributor

Is for (0.., items) |i, item| { } supported syntax? If not, will there be a helpful error message to explain unbounded ranges must follow length-bounded objects?

@rohlem
Copy link
Contributor

rohlem commented Feb 18, 2023

I assume the "count up forever" example for (0..) |i| { } is actually not forever, but bounded by the type of i (wrapping would seem counter-intuitive to me).
What is that type of i? usize seems sensible, but it would be nice to have a way to adapt this to smaller or larger ranges; maybe via typed capture |i: u128|? (though we don't have those - yet?)

@wooster0
Copy link
Contributor

wooster0 commented Feb 18, 2023

Does this support or is it going to support for (10..0) |i| {} where you count down instead of up? It would make certain code a lot easier to read. Here's a concrete example: https://github.com/r00ster91/wool/blob/751f64a584070bb4cf14b3c63318cd422ff36b28/src/grids/drawing/shapes.zig#L27-L41
Also: for (-5..5) |i| {} which counts -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5.
Or the other way around: for (5..-5) |i| {}: 5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5.
Also: for (-1..-2) |i| {} and for (-2..-1) |i| {} which count -1, -2 and -2, -1, respectively.
I think data analysts etc. would love that. I think it would be quite logical and make such code a lot easier to read. So those examples would use isize I suppose.
If negative ranges won't be supported, I think at least for (10..0) |i| {} would be great.

@f3fora
Copy link

f3fora commented Feb 18, 2023

In a similar way of the indexed part select of Verilog one could define a count up with (0+..) and a count down with (0-..)

@likern
Copy link

likern commented Feb 18, 2023

How would I make my structs working with for loop? Is there iteration protocol?

for (my_struct.iter()) |el|

And how to loop over with lazy / infinite iterators which don't have known size?

@rohlem
Copy link
Contributor

rohlem commented Feb 18, 2023

@likern use while-loops with an expression yielding a value of an optional type for this
(or return an array or slice, on which for loops already worked previously).
Some standard-library containers, like std.HashMap, already have Iterator types for this usage.

@bcrist
Copy link
Contributor

bcrist commented Feb 18, 2023

for (0..10) |x| {} should stop after 9, right? (for symmetry with .. when used in slicing syntax)

If so I don't think it makes sense to support counting down by flipping the numbers. Usually when counting down you want an inclusive endpoint. I'm not sure that reverse iteration is common enough to warrant special language support, but if it is, it would be nice if it were done in a way that could work on arrays and slices too. e.g. for (some_array descending, 0.. descending) |a, i| ... to iterate an array backwards, with index.

@tauoverpi
Copy link
Sponsor Contributor

tauoverpi commented Feb 18, 2023

Possibly missing a test case for [*]const u8 with a bounded counter?

const thing: [*]const u8 = @as([]const u8, "abcdef").ptr;

for (thing, 0..6) |v, i| { ... }

@rohlem
Copy link
Contributor

rohlem commented Feb 18, 2023

I think I personally prefer the readability of while(true) over the new for().
AFAIK no other control flow structure allows empty parentheses.
Only one obvious way to do things. leads me to believe we should only introduce for() if it's the clear favorite.

@andrewrk andrewrk marked this pull request as ready for review February 18, 2023 23:34
@ghost
Copy link

ghost commented Feb 19, 2023

You are obviously very good at sports.

Vexu and others added 12 commits February 18, 2023 19:17
* Allow unbounded looping.
* Lower by incrementing raw pointers for each iterable rather than
  incrementing a single index variable. This elides safety checks
  without any analysis required thanks to the length assertion and
  lowers to decent machine code even in debug builds.
  - An "end" value is selected, prioritizing a counter if possible,
    falling back to a runtime calculation of ptr+len on a slice input.
* Specialize on the pattern `0..`, avoiding an unnecessary subtraction
  instruction being emitted.
* Add the `for_check_lens` ZIR instruction.
This strategy uses pointer arithmetic to iterate through the loop. This
has a problem, however, which is tuples. AstGen does not know whether a
given indexable is a tuple or can be iterated based on contiguous
memory. Tuples unlike other indexables cannot be represented as a
many-item pointer that is incremented as the loop counter.

So, after this commit, I will modify AstGen back closer to how @Vexu had
it before, using a counter and array element access.
The intent here is to revert this commit after Zig 0.10.0 is released.
This also makes another breaking change to for loops: in order to
capture a pointer of an element, one must take the address of array
values. This simplifies a lot of things, and makes more sense than how
it was before semantically.

It is still legal to use a for loop on an array value if the
corresponding element capture is byval instead of byref.
One of the main points of for loops is that you can safety check the
length once, before entering the loop, and then safely assume that every
element inside the loop is in bounds.

In master branch, the safety checks are incorrectly intact even inside
for loops. This commit fixes it. It's especially nice with multi-object
loops because the number of elided checks is N * M where N is how many
iterations and M is how many objects.
Since for loops are statically analyzed to have an upper bound, and the
loop counter is a usize, it is impossible for it to overflow.
@kubkon
Copy link
Member

kubkon commented Feb 19, 2023

The latent bug in x86_64 is actually not limited to that backend - it is also present in aarch64 backend, however it has not yet surfaced there. The problem is with how we lower some instructions such as bit_cast or ptr_to_int where we directly duplicate instruction to MCValue tracking. Here's an illustrative example:

debug(register_manager): allocated registers { arch.aarch64.bits.Register.x19 } for insts { 28 }                                                                                                                                                                                                                        
debug(codegen): %28 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x19 }                                                                                                                                                                                                                       
debug(codegen): %31 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x19 }                                                                                                                                                                                                                       
debug(codegen): Branch {                                                                                                                                                                                                                                                                                                
 %3 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 8 }                                                                                                                                                                                                                                                           
 %4 => arch.aarch64.CodeGen.MCValue{ .immediate = 0 }                                                                                                                                                                                                                                                                  
 %9 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 16 }                                                                                                                                                                                                                                                          
 %10 => arch.aarch64.CodeGen.MCValue{ .immediate = 0 }                                                                                                                                                                                                                                                                 
 %14 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 24 }                                                                                                                                                                                                                                                         
 %22 => arch.aarch64.CodeGen.MCValue{ .memory = 67108976 }                                                                                                                                                                                                                                                             
 %24 => arch.aarch64.CodeGen.MCValue{ .stack_offset = 40 }                                                                                                                                                                                                                                                             
 %25 => arch.aarch64.CodeGen.MCValue{ .dead = void }                                                                                                                                                                                                                                                                   
 %28 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x19 }                                                                                                                                                                                                                                     
 %31 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x19 }                                                                                                                                                                                                                                     
}                                                                                                                                                                                                                                                                                                                       
debug(codegen): %33 => arch.aarch64.CodeGen.MCValue{ .stack_offset = 32 }                                                                                                                                                                                                                                               
debug(register_manager): locking arch.aarch64.bits.Register.x19                                                                                                                                                                                                                                                         
debug(register_manager): allocated registers { arch.aarch64.bits.Register.x20 } for insts { 33 }                                                                                                                                                                                                                        
debug(register_manager): allocated registers { arch.aarch64.bits.Register.x20 } for insts { 33 }                                                                                                                                                                                                                        
debug(register_manager): locking asserting free arch.aarch64.bits.Register.x20                                                                                                                                                                                                                                          
debug(register_manager): unlocking arch.aarch64.bits.Register.x19                                                                                                                                                                                                                                                       
debug(register_manager): unlocking arch.aarch64.bits.Register.x20                                                                                                                                                                                                                                                       
debug(codegen): Branch {                                                                                                                                                                                                                                                                                                
 %3 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 8 }                                                                                                                                                                                                                                                           
 %4 => arch.aarch64.CodeGen.MCValue{ .immediate = 0 }                                                                                                                                                                                                                                                                  
 %9 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 16 }                                                                                                                                                                                                                                                          
 %10 => arch.aarch64.CodeGen.MCValue{ .immediate = 0 }                                                                                                                                                                                                                                                                 
 %14 => arch.aarch64.CodeGen.MCValue{ .ptr_stack_offset = 24 }                                                                                                                                                                                                                                                         
 %22 => arch.aarch64.CodeGen.MCValue{ .memory = 67108976 }                                                                                                                                                                                                                                                             
 %24 => arch.aarch64.CodeGen.MCValue{ .stack_offset = 40 }                                                                                                                                                                                                                                                             
 %25 => arch.aarch64.CodeGen.MCValue{ .dead = void }                                                                                                                                                                                                                                                                   
 %28 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x19 }                                                                                                                                                                                                                                          
 %31 => arch.aarch64.CodeGen.MCValue{ .dead = void }                                                                                                                                                                                                                                                                   
 %33 => arch.aarch64.CodeGen.MCValue{ .register = arch.aarch64.bits.Register.x20 }                                                                                                                                                                                                                                     
}                                                                                                                                                                                                                                                                                                                       
debug(register_manager): freeing register arch.aarch64.bits.Register.x19   

Note that instructions %28 and %31 both have the same MCValue of .x19 assigned where %28 corresponds to airLoad while %31 to airBitCast. Note next that %31 dies first and causes the backend to incorrectly free register .x19 while it is still being tracked by instruction %28.

@ghost
Copy link

ghost commented Feb 19, 2023

🎉🎉🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. enhancement Solving this issue will likely involve adding new logic or components to the codebase. frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Multi-object for loops
10 participants