Proposal: for on ranges #358

Closed
AndreaOrru opened this issue May 3, 2017 · 65 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@AndreaOrru
Contributor

AndreaOrru commented May 3, 2017

for (a...b) |x, index| {
    ...
}

Where a and b can be chars, integers, anything that can define a range. This is also better syntax IMHO than:

{var i = 0; while (i < n; i += 1) {
    ...
}}
@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label May 3, 2017
@andrewrk andrewrk added this to the 0.2.0 milestone May 3, 2017
@thejoshwolfe
Sponsor Contributor

When @andrewrk and I considered this syntax very early in Zig's development, one problematic point was the inclusivity/exclusivity of the upper bound. With an expression like i < n, there's no question that n is an exclusive upper bound, but with a...b, does that include b?

Surely the question can have an answer, and it should probably be exclusive, but the point is that the syntax doesn't clearly say that it's exclusive. And just to confuse things, I've seen languages (coco, for example) use two different syntaxes for a..b inclusive and a...b exclusive.

I don't think users will have a very good success rate for guessing whether the upper bound is inclusive or exclusive, which makes me dislike this proposal. That being said, iterating over numbers 0 to n (exclusive) is a very common pattern, even outside the usecase of array indexes, and I think this deserves more discussion.

@raulgrell
Contributor

I wouldn't be against the two different tokens, but intuitively I'd say a..b is b exclusive and a...b is b inclusive. I think Rust and Wren do this too...

If we only keep ..., I'd say keep it exclusive and consistent with how slices are made, ie array[0...1] has array.len == 1. Python's range() function has this property

@andrewrk
Member

andrewrk commented May 3, 2017

We do have the [a...b] slicing syntax, which is exclusive. So it should be clear from this.

On the other hand, an unsolved problem is that in switch statements, ... is inclusive, for example:

switch (c) {
    'a'...'z' => {}, // inclusive
}

I have 2 ideas to solve this problem:

  • Change ... syntax for slicing to .. and keep it exclusive. Keep ... syntax for switch statements, and keep it inclusive.
  • Change ... syntax for slicing to : (like Python). Keep ... syntax for switch statements, and keep it inclusive.

@raulgrell
Contributor

Change ... syntax for slicing to .. and keep it exclusive. Keep ... syntax for switch statements, and keep it inclusive.

I'm not sure how I feel about using the : as a slice. It always has something to do with types. .. and ... are more intuitively associated with ranges

@raulgrell
Contributor

I posted this in #359, just adding the relevant part here, which suggests keeping the two range operators .. and ... and add a third : that reflects the number of elements in the range as opposed to its start and finish

for (0 .. 2 ) | x, i | { }  // Exclusive -> 0, 1
for (0 ... 2) | x, i | { }  // Inclusive  -> 0, 1, 2 
for  (2 : 2)  | x, i | { }  // Range     -> 2, 3

@igalic

igalic commented Aug 18, 2017

i haven't seen mention of it yet, so i'll just do so myself:

is there a way to count backwards? will for ( 2 ... 0) | x, i | { } count 2, 1, 0?

@raulgrell
Contributor

It was mentioned before: the expected behaviour for that statement is that it would loop 0 times, i.e. the block would not execute.

To count backwards, you'd do for(a...b) | x, i | { print(b - i) }
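
In the counter syntax that eventually landed (see the end of this thread), the same idea can be sketched like this. `reversed` is a hypothetical helper made up for illustration; with an exclusive upper bound the mirrored value is `a + (b - 1 - x)` rather than `b - i`.

```zig
const std = @import("std");

/// Hypothetical helper: maps x in [a, b) to its mirror image in [a, b).
/// Assumes a <= x and x < b.
fn reversed(a: usize, b: usize, x: usize) usize {
    return a + (b - 1 - x);
}

pub fn main() void {
    const a: usize = 2;
    const b: usize = 5;
    // The loop itself still counts upwards: x = 2, 3, 4.
    for (a..b) |x| {
        // Prints 4, 3, 2 on separate lines.
        std.debug.print("{d}\n", .{reversed(a, b, x)});
    }
}
```

The loop variable never goes negative, so this works with unsigned types. A plain while loop that decrements is the other obvious option.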

@tiehuis tiehuis added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 15, 2017
@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018
@bronze1man

I dislike this proposal; it is difficult to know its meaning without reading the documentation. A function call may be better.

@AndreaOrru
Contributor Author

Care to elaborate? Which document? What function call? Can we see an example?

@ghost

ghost commented Jul 15, 2018

http://termbin.com/ggyl

from @Hejsil
not the most elegant because you have to type |_,i| instead of |i| but still I'd say usable

question is if the same can be hacked together for ranges not starting at 0

@bronze1man

I dislike this proposal:

for (a...b) |x, index| {
    ...
}

What do a and b mean? Is a included? Is b included? What does x mean? What does index mean? If the language is designed like this, I have to read the documentation to understand what it means.

I am looking for a more understandable syntax, maybe something like this:

for (var i = range0ToNotInclude(100)) {
}

@thejoshwolfe
Sponsor Contributor

this gets you most of the way there:

test "" {
    var j: usize = 0;
    for (times(10)) |_, i| {
        @import("std").debug.assert(i == j);
        j += 1;
    }
}

fn times(n: usize) []const void {
    return ([*]void)(undefined)[0..n];
}

@raulgrell
Contributor

We already have the concept of ranges in slices and in switch statements - for this proposal to be reasonable, we just need to be consistent throughout the language. I know we're optimizing for readability, but we should be able to expect that a person reading Zig code has at least looked through the Zig docs.

That times function returning a []const void is a reasonable work around.

In @bronze1man's proposal, an assignment would have to be an expression that returns whatever was assigned. It might also be unclear that the function only runs once - a reader might expect each iteration of the loop to declare a new variable i set to the result of a new call.

Zig had a "null-unwrap-and-declare" operator ?= in ifs that was replaced precisely to keep identifier declaration consistent: control (predicate) | body_names | { body } prong | prong_names | { body }

// So we could do this:
if (maybe_foo()) | foo | { foo.bar() };

// Instead of 
if (var foo ?= maybe_foo()) { foo.bar() }

@skyfex

skyfex commented Jul 30, 2018

I think there are a number of things that need to be thought through here:

  1. Settle on a syntax for the inclusive and exclusive range operators
  • One suggestion is 1..3 and 1...2 (I gotta say I think the difference is too hard to read)
  • In Ruby, 1..2 is inclusive and 1...3 is exclusive.
  • Nim uses 1..<3 and 1..2 (I think the ..< is really clear here, but it leaves .. ambiguous)
  • Mathematics (interval notation) uses [1,3) and [1,2]. It would be great if Zig could somehow use this fact. Maybe [1..3) and [1..2] could be considered.
  • It can be useful to include a stride. Julia has a:b and a:s:b, where s is the stride (Python slices put the stride last, a:b:s). It may be too much for a simple language like Zig, but it is worth considering.
  2. Consider making the range operator return a value of a built-in range type.
  • I think it's a missed opportunity not to use the range syntax to construct a value of a "range" type, just as literals do for int, float and strings.
  • This would work well with for loops
  • It's not clear how this would work with the switch statement though; the range there would be a special case
  3. Consider generalizing the for loop

If for loops work on slices, arrays and ranges, it's starting to get confusing. What's the pattern? Could it be generalized in a meaningful way?

I gotta say I'm a big fan of languages where the for loop accepts some kind of "iterator", rather than a few special types. It makes it much more clear and explicit what's going on, and easier to read in my opinion. So if you have an array items, maybe it should be something like:

for (keys(items)) |key| { ... }
for (values(items)) |value| { ... }
for (pairs(items)) |key, value| { ... }

And then for ranges you could do:

for (range(1..3)) |num| {...}
for (rangePairs(1..3)) |index, num| { ... }

But how you make those iterators work is a huge proposal on its own.

@ghost

ghost commented Jul 30, 2018

I gotta say I'm a big fan of languages where the for loop accepts some kind of "iterator", rather than a few special types.

if only there were something called interfaces or the like 🥇

#1268

@thejoshwolfe
Sponsor Contributor

Iterators can be done with while instead of for.

zig/build.zig

Line 155 in 02713e8

while (it.next()) |lib_arg| {

@ghost

ghost commented Jul 30, 2018

compare

var i: usize = 0;
var it = get_some_it(thing); // this code may actually look different every time so you never know what you read
while (it.next()) |arg| {
  i += 1; // Zig has no ++ operator
}

to

for(thing) |arg, i| {
}

I do not think the first one enhances clarity.

And iterating is one of the most common things so there is that ...

with interfaces it's possible to have concise iterators


Iterators can be done with while

while can be done with recursion https://softwareengineering.stackexchange.com/a/279006

sure, but what's the point? (apart from the fact that recursion is currently not working and should be avoided)

@ghost

ghost commented Jul 30, 2018

@skyfex

[1,3) and [1,2]

I can't even remember those two (and they look very similar as well) while I find the ruby syntax intuitive so it really depends on the person...

In the end you just have to remember some syntax so I do not think this is actually such a big deal.

The issue is a very restricted for loop and not a . vs a (.

@binary132

👍 for : without a variant form since the intent is very clear and unambiguous among a number of languages. Might be nice to add optional stride too ;)

@raulgrell
Contributor

I'm not sure it's a good idea to couple syntax with the names that are defined in a struct - saying you can use for on types that have iterate() and next() is basically the same thing as operator overloading.

Iterators can be done with a while

Yep. The while iterator pattern is both explicit and concise - if you end up having to change how the iterator has to be initialized or continued, the function calls aren't hidden behind the syntax.

If the value of this proposal is good, and the only concerns are regarding syntax, we could accomplish this with built in functions and call it as for(@range(u8, 'a', 'z')) | c, i | { }

Example names and signatures. If it's hard to work out what they mean, it's a sign we need better functions.

@times(a: var) []const void;
@range(comptime T: type, a: T, b: T) []const T;
@sequence(comptime T: type, a: T, b: T, stride: T) []const T;
@linearSpace(comptime T: type, a: T, b: T, num: int) []const T;

Can built in functions be async/generators?

@generateRange(comptime T: type, a: T, b: T) yield T;

@skyfex

skyfex commented Jul 31, 2018

@raulgrell I think if those built-ins end up being needed, it's a failure of language design. Built-ins should be functionality that cannot in any way be implemented with library code. There are many ways to design the language such that these can be implemented as plain code rather than magic built-ins, and I'm sure one of them can keep Zig conceptually simple and explicit.

I agree that there shouldn't be special function calls generated by the for syntax though.

I don't really like having to use while-loops to use an iterator pattern, but when I think about it, it's probably the correct choice for Zig.

Is there a proposal for generators (or observable or whatever it should be called)? It would make a lot of sense to extend the async support to allow an async function to yield multiple values. Then it would also make sense to extend for-loops to support those.

@MasterQ32
Contributor

I know this proposal is closed already, but one thing to think about is this:

for(0..@as(u8, 255)) |index| {
  ...
}

is way more readable and less error prone than

var index: u8 = 0;
while(true) {
  ...
  if(@addWithOverflow(u8, index, 1, &index))
    break;
}

So my call would be: if done, the range needs to be inclusive to allow such loops to be implemented efficiently and without the requirement of @intCast or @truncate

@nodefish

nodefish commented Jan 19, 2021

FWIW you can always increment a variable with defer if you don't like : (i += 1)

while(...) {
    defer i = i + 1;
}

@ifreund
Member

ifreund commented Jan 19, 2021

FWIW you can always increment a variable with defer if you don't like : (i += 1)

while(...) {
    defer i = i + 1;
}

This does not have the same semantics as the : (i +=1) continue expression as the defer will be executed on breaking from the loop while the continue expression will not.

@nodefish

nodefish commented Jan 19, 2021

FWIW you can always increment a variable with defer if you don't like : (i += 1)

while(...) {
    defer i = i + 1;
}

This does not have the same semantics as the : (i +=1) continue expression as the defer will be executed on breaking from the loop while the continue expression will not.

Consider the following snippet:

const std = @import("std");
const warn = std.debug.warn;

pub fn main() void {
    var i: u8 = 0;
    while (i < 10) {
        defer i += 1;
        warn("{}\n", .{i});
    }
}

The output is:

0
1
2
3
4
5
6
7
8
9

It's easy to overlook but each iteration of a loop has its own frame/scope. In golang you'd be right but defer applies to the immediate scope in zig.

@ifreund
Member

ifreund commented Jan 19, 2021

@nodefish:

const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    print("loop 1:\n", .{});
    var i: u8 = 0;
    while (i < 10) {
        defer i += 1;
        print("{}\n", .{i});
        if (i == 5) break;
    }
    print("{}\n", .{i});

    print("loop 2:\n", .{});
    i = 0;
    while (i < 10) : (i += 1) {
        print("{}\n", .{i});
        if (i == 5) break;
    }
    print("{}\n", .{i});
}
loop 1:
0
1
2
3
4
5
6
loop 2:
0
1
2
3
4
5
5

@ghost

ghost commented Jan 19, 2021

@nodefish Consider the following snippet:

const std = @import("std");
const info = std.log.info;

pub fn main() void {
    var i: u8 = 0;
    while (i < 10) {
        defer i += 1;
        break;
    }
    info("{}", .{i}); // `info` is the name imported above; `warn` was never defined in this snippet
}

It prints 1. : (i += 1) would have resulted in printing 0. That's what Isaac was saying. Dude's been here for years, please don't be condescending.

@MasterQ32 See, I would have expected that to be an exclusive range, as that's how .. is currently used, and that's also the most sensible way to do it for for (so that we can choose to iterate 0 times). Also, is there a case in real code where you'd actually want to do this, and you're not iterating over an array/slice?

@cryptocode No good -- ; means sequence, and Zig's grammar is intentionally very simple, so if that were allowed at all both clauses would be executed every time.

The fundamental issue is: almost every single time you want to iterate over a range in real code, it's actually to index into a data structure. The sensible way to structure your code is to iterate over the structure directly, and the lack of ranged for is a subtle nudge in this direction. If structural for really doesn't work, while is still available, but the relative awkwardness means you don't use it unless it's the most sensible way. There are cases where structural for is awkward where it really shouldn't be, and #7257 exists to remedy that, but we shouldn't make it any more capable than we absolutely have to.

Really, the problem is that for is poorly named. It evokes ranged iteration. I'd be in favour of renaming it, but that's a separate issue.

@cryptocode
Sponsor Contributor

cryptocode commented Jan 19, 2021

@EleanorNB I agree with the range point of view, I'm just addressing the scoping issue. I don't see people adding additional {} scopes in complex/nested loops, leaving the door open to subtle bugs. I see how the ; sequence means a different syntax is needed, but that's orthogonal.

@nodefish

nodefish commented Jan 19, 2021

@EleanorNB

Dude's been here for years, please don't be condescending.

Please reconsider your uncharitable reading of what I wrote. I will leave it at that.

@ifreund

I see, thanks for the clarification. That is rather subtle. In the most common cases it'll be a similar distinction to using < as opposed to <= in the condition expression, so I think I will still use the defer approach when possible and keep the extra incrementation in mind, just as a matter of personal preference.

@thejoshwolfe
Sponsor Contributor

You can do range iterators in userland:

fn range(times: usize) RangeIterator {
    return RangeIterator{
        .cursor = 0,
        .stop = times,
    };
}

const RangeIterator = struct {
    cursor: usize,
    stop: usize,

    pub fn next(self: *RangeIterator) ?usize {
        if (self.cursor < self.stop) {
            defer self.cursor += 1;
            return self.cursor;
        }
        return null;
    }
};

However, I did some optimization science in godbolt, and I believe there is some optimization benefit to having some kind of builtin range loop.

Call a function N times

Baseline status quo: https://godbolt.org/z/rdYMq4

export fn callSomethingNTimes(num: usize) void {
    var i: usize = 0;
    while (i < num) : (i += 1) {
        something(i);
    }
}

More convenient syntax using a userland iterator: https://godbolt.org/z/98G5d3

export fn callSomethingNTimes(num: usize) void {
    var it = range(num);
    while (it.next()) |i| {
        something(i);
    }
}

The output is slightly different, but I'm not enough of an expert to know which one is better.

Do math with the iterator variable

using iterator integer: https://godbolt.org/z/Y5Pon5

export fn mathThing(num: usize) usize {
    var sum: usize = 0;
    var i: usize = 0;
    while (i < num) : (i += 1) {
        sum +%= i * i;
    }
    return sum;
}

using iterator object: https://godbolt.org/z/Gh7MPc

export fn mathThing(num: usize) usize {
    var sum: usize = 0;
    var it = range(num);
    while (it.next()) |i| {
        sum +%= i * i;
    }
    return sum;
}

The output looks very different for these two, which declares the iterator integer the clear winner in terms of optimizability.

So it's not as clear cut as "just use a userland iterator object", but the option is still there.

@andrewrk
Member

The output looks very different for these two, which declares the iterator integer the clear winner in terms of optimizability.

So it's not as clear cut as "just use a userland iterator object", but the option is still there.

see also the -OReleaseSmall optimization mode for these examples which is quite interesting

@Srekel

Srekel commented Jan 31, 2021

The fundamental issue is: almost every single time you want to iterate over a range in real code, it's actually to index into a data structure.

@EleanorNB I don't agree with this. In game development it is quite common to loop over ranges that aren't bound to an exact data structure. Often because the loop values aren't actually indices, but coordinates, for example. Useful when generating meshes.

Or you may have a data structure, but you may want to sample over it in weird ways, for example a 2d map in a circular pattern.

And it is often two-dimensional, sometimes three, and that makes using the while-loop style @andrewrk suggested cumbersome.

Not having a programmer-friendly way to write for loops would, I think, be quite alienating to game programmers. I'm not sure I see the upside. Zig's current for loops are very appealing as-is, and I will still use them when I can.

[Edit: My comment sounded a bit rude/snarky, made a couple edits as it wasn't my intention]

Here's some typical code I've written, taken from the map generation code for Hammerting:

  // Fix foreground in an oval area near the start
  f32 width_sq  = flatten_width * flatten_width;
  f32 height_sq = flatten_height * flatten_height;
  for ( f32 y = -flatten_height; y < flatten_height; y++ ) {
    for ( f32 x = -flatten_width; x < flatten_width; x++ ) {
      f32 ellipse_eq = x * x / width_sq + y * y / height_sq;
      if ( ellipse_eq >= 1 ) {
        continue;
      }

      double influence_x = 1 - wc::abs( x ) / flatten_width;
      double influence_y = 1 - wc::abs( y ) / flatten_height;
      double  influence = influence_x * influence_y;
      i32     map_index = room_pos._x + i32( x ) + ( room_pos._y + i32( y ) ) * i32( MAP_WIDTH );
      double& heightmap_value = heightmap[map_index];
      heightmap_value += ( y <= 0 ? 2 : -2 ) * influence * influence;
      if ( heightmap_value < HEIGTMAP_CUTOFF ) {
        out->_foreground._materials[map_index] = 0;
      }
      else {
        u8 mat                                 = out->_background._materials[map_index];
        out->_foreground._materials[map_index] = mat;
      }
    }
  }

@BinaryWarlock

BinaryWarlock commented Feb 10, 2021

The fundamental issue is: almost every single time you want to iterate over a range in real code, it's actually to index into a data structure

That's simply not true, I've read mountains of "real code" that does this all the time. Again, I do not see how experienced C programmers have not encountered this more often, because it's seemingly everywhere. Perhaps we are just looking at vastly different types of projects.

Perhaps take a stroll through some mathematics, game development, or even OS code to see some real world examples of this.

Writing a RangeIterator is more verbose (and slower) than just using the status-quo while loop, which is no good.

This problem is exacerbated by not being able to shadow locals, and making a new block scope just for iteration is awful, as mentioned. This is a super unergonomic case in Zig right now and really something should be done about it. Whether that be macros (#6965), or this, or another solution is up for debate, but it is unhelpful to close it with "the status quo is fine" when it is clearly not fine, or it would not still be debated 4 years later(!)

Remember that every developer has different use cases, so although you may not see this as a big deal, others definitely do. And this is a simple and elegant solution to the problem, so I don't see why there is so much pushback on it.

@ElectricCoffee

Getting some form of range functionality would be really nice, if not for readability, then for writeability.

Surely it's possible to have the compiler build an array at compile-time or return a slice at runtime in order to facilitate something like this:

for (@range(2, 8)) |i| {
    // do stuff
}

Maybe even leverage anonymous structs somehow to facilitate optional arguments:

for (@range(2, 8, .{ .by = 2, .inclusive = true })) |n| {
    // you get it
}

It's not as elegant as the equivalent in Python, but it's at least clear in its intent and much harder to mess up than a c-style for loop or a plain while with a defer

@Guigui220D

Also to specify the type @ElectricCoffee , like @range(u8, 0, 10)

@ElectricCoffee

Also to specify the type @ElectricCoffee , like @range(u8, 0, 10)

I didn't even think of that, but yeah, that's a good logical addition also

alunbestor added a commit to alunbestor/AZiggierWorld that referenced this issue Sep 8, 2021
…f while

Using `while` with a counter is a code smell for "maybe you ought to be iterating a data structure instead". Using a slice also avoids repeated bounds-checks on each execution of the loop body in safe compile modes.

Related discussion (and why Zig does not have a for range syntax): ziglang/zig#358 (comment)
@hazeycode
Sponsor Contributor

I made a comptime range fn for anyone interested:
https://gist.github.com/hazeycode/e7e2d81ea2b5b9137502cfe04541080e

@gonzus
Contributor

gonzus commented Feb 4, 2022

If we are still looking for a range notation, I find this quite readable and it covers all the cases:

a =..< b  // includes a, excludes b; most used case?
a =..= b  // includes a, includes b
a >..= b  // excludes a, includes b
a >..< b  // excludes a, excludes b

You could even have a .. b as shorthand, equivalent to the first case.

@dbechrd

dbechrd commented Oct 6, 2022

If we are still looking for a range notation, I find this quite readable and it covers all the cases:

a =..< b  // includes a, excludes b; most used case?
a =..= b  // includes a, includes b
a >..= b  // excludes a, includes b
a >..< b  // excludes a, excludes b

You could even have a .. b as shorthand, equivalent to the first case.

I assume you meant your > to be < on the left side? Your suggestion would make significantly more sense that way:

a =..< b  // equal to less than
a =..= b  // equal to equal
a <..= b  // greater than to equal
a <..< b  // greater than to less than

That said.. exclusive start seems pretty bizarre to me, so why not just:

a ..< b  // exclusive
a ..= b  // inclusive

Edit: I just looked up Odin's loop syntax and realized this is pretty much exactly how Odin works. Genuinely was not aware of that, but alas...

@gonzus
Contributor

gonzus commented Oct 6, 2022

Yes, I meant that, brain fart.

I now realise this issue is closed, and I am not sure if any decision was made about this; does anybody know?

@Vexu
Member

Vexu commented Oct 6, 2022

#7257 is accepted and will add for loops on ranges.

@andrewrk
Member

Multi-object for loops have landed with #14671, and now for loop syntax supports counters:

const std = @import("std");

pub fn main() !void {
    for (0..10) |i| {
        std.debug.print("{d}\n", .{i});
    }
}
$ zig run test.zig 
0
1
2
3
4
5
6
7
8
9
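
For readers landing here later, a minimal sketch of the multi-object form follows, assuming a Zig version that includes #14671. `weightedSum` and its input are hypothetical names made up for illustration; the open-ended `0..` counter takes its length from the slice it is paired with.

```zig
const std = @import("std");

/// Hypothetical example: sums item * index over a slice,
/// pairing each element with a counter in a single for loop.
fn weightedSum(items: []const usize) usize {
    var total: usize = 0;
    // `0..` has no upper bound of its own; it runs for items.len iterations.
    for (items, 0..) |item, i| {
        total += item * i;
    }
    return total;
}

pub fn main() void {
    const items = [_]usize{ 10, 20, 30 };
    // 10*0 + 20*1 + 30*2 = 80
    std.debug.print("{d}\n", .{weightedSum(&items)});
}
```

The counter is a `usize`, which is why the sketch keeps everything in `usize` instead of casting.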

@jkellogg01

Are there any plans for inclusive counters as a convenience? Obviously the same functionality is achievable by just increasing the end number by one, but I would argue there's a readability benefit to the counter being able to specify that it's inclusive.

I know there have been a bunch of suggestions in terms of formatting, but I haven't seen a mention of the fact that Rust already uses the x..y syntax for exclusive ranges and simply uses x..=y to specify inclusive. The benefit to borrowing this syntax would be that the existing counter syntax wouldn't need to change at all.
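
For reference, the "increase the end number by one" workaround can be sketched as follows. `sumInclusive` is a hypothetical helper, not part of the standard library, and it assumes `first <= last` and that `last` is below the maximum `usize` so that `last + 1` does not overflow.

```zig
const std = @import("std");

/// Hypothetical helper: sums first, first + 1, ..., last (inclusive).
/// Assumes first <= last and last < maxInt(usize).
fn sumInclusive(first: usize, last: usize) usize {
    var total: usize = 0;
    // The range syntax is exclusive at the top, so add one to include `last`.
    for (first..last + 1) |n| {
        total += n;
    }
    return total;
}

pub fn main() void {
    // 1 + 2 + 3 + 4 = 10
    std.debug.print("{d}\n", .{sumInclusive(1, 4)});
}
```

The overflow caveat is the same edge case raised earlier in the thread for a loop that must reach the largest value of its type.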
