proposal: Streamline loops, and enhance iteration #3110
Comments
I like this proposal a lot -- sometimes I feel like it's easier to allocate a slice and copy into it to This strikes me as very similar to how Ruby approaches iteration, with blocks. Ruby's blocks are slightly more general than what you propose here, since in addition to the iterating function being able to send values to the block, the block can send values back to the iterating function. That is perhaps another possible solution to the removal problem (though, since under this proposal each container type can only have a single iterator, you would most likely need to introduce some "default return" so that loops which don't ever remove elements, which is almost all of them, wouldn't have noisy return/yield statements saying so) This seems to depend on #229, since the |
The use of function pointers and compiler-inlining-voodoo is awkward and reduces the applicability of iterators. They have combinatorial properties, e.g. iterator zipping, and they may be passed as arguments to other functions. The underlying struct is there to provide the necessary value-semantics for the rest to work. The pattern of using iterators can be improved simply by allowing a
// iterator by convention (struct that supports fn next(self: *Self) ?Element)
for(some_struct.iterator()) |e| {}
var zip = ZipIterator(struct1.Iterator, struct2.Iterator){
struct1.iterator(),
struct2.iterator(),
};
for(zip) |dual| {
// parallel iteration
}
var it = some_struct.iterator();
find_first_of(&it, condition);
while(it.next()) |e| {
// do something with rest
}
|
This is an area where some sort of standard trait/interface would be sufficient. That is how Rust does this and it works quite well. If you had traits, and a standard Iterator trait then any type that implemented that trait could be iterated in a for loop. |
I'm not yet convinced by the line savings this proposal seems to imply. Also rewriting between for- and while-loops doesn't sound like that much of an inconvenience to me, although of course large codebases can always benefit from standardized interfaces.
// add a variation on the it's-only-5-lines-iteration
// as a container method (here for std.ArrayList)
pub fn iterate(self: *Self, e_ptr: *?*T) void {
//comparable to current Iterator.next, async frame used as iterator state
var i: usize = 0;
while(i < self.len) : (i += 1) {
e_ptr.* = &self.items[i];
suspend;
}
e_ptr.* = null;
return;
}
//... usage in code (omitting construction, filling with values etc.):
{
var e_ptr: ?*T = null; //"client-side" iteration state, because we have no multi-return functions aka generators (yet)
var iteration = async list.iterate(&e_ptr);
while(e_ptr) |ep| : (resume iteration) {
// process elements, f.e.: @import("std").debug.warn("{}\n", ep.*);
}
}
Of course this doesn't handle removal of elements, or other container-specific more sophisticated use cases. |
|
@rohlem ooh, nice! All you need to make any data structure iterable (is that a word?) is to write a generator for it. I do not see the lack of delete-while-iterating as a show stopper. That would probably be data structure-dependent anyway. |
what about this? |
I'm not sure where this was previously discussed, but here are some of the points that were raised: Although the proposal suggests a way to compile away the function calls, it still counts as hidden control flow. It might not have performance issues, but the reader of the code still has to jump around to figure out what the iterator does. One of the nice things about the status quo Status quo As well as the function call, the stack space needed which should probably be explicit. It also ties the language to the standard library - not necessarily bad, lots of languages do it. But something to keep in mind. My best attempt at reconciling these issues isn't really much better than using
@rohlem that's pretty cool! It prompted me to consider the following, similar to the above:
The reason I'd make the order different is that above the That said, I'm very happy with status quo. |
Async is not a facility for implementing generators. That is both convoluted and inefficient. There are proposals for supporting generators but they're not part of the language yet. A for loop is just syntactic sugar for a while loop over a slice. Establishing a good convention for iterators and ranges could allow extending the sugar to these concepts without altering syntax. The concept of while-loops over optionals is good for unbounded or unknown-bound iterators. When bounds are known, e.g. for slices, it is better to loop without optionals as the result is trivially known from the bound. That is basically what is achieved by the sugar without exploiting the property in a more general sense. Implementing removals during iteration requires guarantees on iterator stability which is not generally feasible to provide. |
@CurtisFenner |
I actually find the generator idea relatively interesting... provided it is as fast as not writing a generator.. One of the things I always hated about Python was how I wanted to use generators, but couldn't because they were so much slower than not using them. If you did it that way:
for (list.each()) |e| {}
for (list.eachReverse()) |e| {}
But again, how do you signal you want it removed? |
Generators and async functions aren't fundamentally different. A generator is basically a resumable function that can return multiple times. @rohlem did that with an output parameter. Though I agree with @andersfr that we shouldn't rely on async for this. @Tetralux what you propose is necessarily a closure unless we disallow the use of variables declared outside the body of the for. As soon as the for block becomes expressible as a function, the user is able to use it as a function, including saving it and calling it somewhere later. Which means it also has to capture part of its enclosing scope. If we have closures (and/or anonymous functions), another option is to just pass it as a parameter to an iteration function. This also means that you can also do something like a
I particularly dislike the concept of a remove keyword. @RUSshy Zig's syntax is pretty consistent throughout The Part of the fun of reinventing simple things is inventing things that are simple =) |
I don't think a What if you have collections where you can't remove elements from? Now you have a |
@DutchGhost I thought I remarked about that already: if the collection does not support removal, then @raulgrell The idea about it being a closure; no.
f: fn(e: ElType) void = undefined;
fn iterate(self: *Self, f: inline fn(e: ElType) void) {
self.f = f; // compile error: function 'f' must be inlinable, and therefore has no storage
} |
I realize this is controversial but I'm a big fan of status quo iteration. What I like about it is that there is no hidden control flow. You don't have to know the type of anything to understand how iteration works. If anything, I'm tempted to delete |
Today I learned:
* There's a nice list of projects in Zig at [Awesome Zig], which may include useful examples of code and useful libraries.
* `for` loops are only for slices/arrays, not custom iterators, but `while` loops are [quite featureful]! In fact it seems like the language creator has been tempted to [delete for loops] from the language and only have while loops.
* The language reference says that "Zig programmers must always be able to answer the question: [Where are the bytes]?" and this is why you have to choose an allocator, but even after reading some of the source code I really don't know where the 4KB buffer for bufferedReader lives.
* I can read a file after all! Thanks to chapter 2 of ziglearn.org.

[Awesome Zig]: <https://github.com/nrdmn/awesome-zig>
[quite featureful]: <https://ziglang.org/documentation/0.7.0/#while>
[delete for loops]: <ziglang/zig#3110 (comment)>
[Where are the bytes]: <https://ziglang.org/documentation/0.7.0/#Memory>

For the solution, I got frustrated with ArrayList and the helper functions in std.io.Reader and std.fmt, so I decided to just do a finite state machine which reads one character at a time. I think the result is not so bad, but I have to admit that I'm surprised it gave the correct answer on the first run, because it would have been extremely easy to forget updating the parsing_state or some other trivial error in the state machine.

$ zig build run
643

On to part 2, and let's hope it doesn't need much more state.
If #1717 is implemented, another way to implement a for-loop would be using function expressions:
list.each(.{
fn (self: anytype, i: i64) void {
std.log.info("arg={}, item={}", .{ self.@"1", i });
},
123,
}); |
There's probably more I could flesh out about this proposal but I'd like to get it out there, so here goes:
Synopsis
`for` loops are only really useful for slices, and even for them, it's better to use a `while` because you may not be iterating a slice in the future, as your program matures.
However, `while` loops have some shortcomings over `for` loops, being more verbose, and no easier to understand.
For example, a `std.ArrayList` called `list`:
Currently, in order to make a struct iterable, the convention is that you have an `.iterator()` method that returns an instance of an iterator struct for that type, and you then call `.next()` on that iterator until it returns `null`.
Again, for a `std.ArrayList` called `list`:
This approach is arguably a tad better than the `while` example above; more terse, pretty clear.
However, it is a little awkward because it's different from how you iterate slices with `for` --the only type `for` can iterate-- and the author of the custom struct (here `std.ArrayList`) ends up writing dozens of lines of code just to make the iterator struct work as it should... when all you wanted to do was write the code necessary for iterating this collection and reuse that code whenever you want to iterate one of them.
Here is that code for `std.ArrayList`:
For `std.ArrayList`, it's not that bad; a total of 23 lines.
However, this code looks very different from the code you actually need to write in order to iterate this list - it's from the first example - here it is again:
That's pretty simple, right? It's only 5 lines!
I'd argue that having to use a custom struct just to iterate your actually-important custom struct is something of a waste of your time - it shouldn't be that hard.
It should not require that many lines more code to write it, ideally.
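To make the contrast concrete, here is a minimal sketch of the two status-quo styles: a plain index `while` loop and the `.iterator()`/`.next()` convention. `List` is a hypothetical fixed-capacity stand-in for `std.ArrayList` (so the sketch needs no allocator), and `sumWithIndex`/`sumWithIterator` are made-up helper names. It assumes a recent Zig release and is not the standard library's actual iterator code.

```zig
const std = @import("std");

// Hypothetical fixed-capacity list, standing in for std.ArrayList.
const List = struct {
    items: [8]i32 = undefined,
    len: usize = 0,

    fn append(self: *List, value: i32) void {
        self.items[self.len] = value;
        self.len += 1;
    }

    // Iterator-struct convention: a separate type that remembers the position.
    const Iterator = struct {
        list: *const List,
        index: usize = 0,

        fn next(self: *Iterator) ?i32 {
            if (self.index >= self.list.len) return null;
            const value = self.list.items[self.index];
            self.index += 1;
            return value;
        }
    };

    fn iterator(self: *const List) Iterator {
        return Iterator{ .list = self };
    }
};

// Style 1: the short index-based while loop.
fn sumWithIndex(list: *const List) i32 {
    var sum: i32 = 0;
    var i: usize = 0;
    while (i < list.len) : (i += 1) {
        sum += list.items[i];
    }
    return sum;
}

// Style 2: call next() until it returns null.
fn sumWithIterator(list: *const List) i32 {
    var sum: i32 = 0;
    var it = list.iterator();
    while (it.next()) |value| {
        sum += value;
    }
    return sum;
}

pub fn main() void {
    var list = List{};
    list.append(1);
    list.append(2);
    list.append(3);
    std.debug.print("{d} {d}\n", .{ sumWithIndex(&list), sumWithIterator(&list) });
}
```

The traversal logic is the same in both helpers; the iterator version only moves the loop counter into a struct, which is the extra code the proposal wants to avoid.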
And again, this is just an `ArrayList` - the custom iterator code for a `BucketArray` is worse, and harder to follow than this is.
This proposal fixes both of these problems in one swoop.
Basics
It makes it so that you can just `for (iterable) |elem| { ... }` to iterate over any custom struct that declares how you iterate it.
Iterating would then look like this:
Behavior-wise, this setup pastes the code from the body of `iterate` into `main`, replacing the `for` loop, and so, local variables in `main` are available in the loop.
The `f` function parameter to `iterate` is how you represent the body of the user's `for` loop.
No function calls, either to `iterate` or `f`, actually happen.
This is actually the same amount of magic as currently happens with `for` loops; the only difference is that you are able to specify what to do for an arbitrary custom structure, rather than it only working for slices.
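The proposed syntax does not exist in Zig, but the idea can be roughly approximated today with a comptime function parameter. The sketch below is only that approximation, under stated assumptions: `List` is a hypothetical fixed-capacity container, and `addToTotal`, `total` and `sumViaIterate` are illustrative names. Unlike the proposal, the loop body here is a separate function that cannot see the caller's local variables, so it has to accumulate into a container-level variable.

```zig
const std = @import("std");

// Hypothetical fixed-capacity container used only for illustration.
const List = struct {
    items: [8]i32 = undefined,
    len: usize = 0,

    fn append(self: *List, value: i32) void {
        self.items[self.len] = value;
        self.len += 1;
    }

    // The traversal is written once. `body` is known at compile time,
    // so the compiler is free to inline the calls.
    fn iterate(self: *const List, comptime body: fn (i32) void) void {
        var i: usize = 0;
        while (i < self.len) : (i += 1) {
            body(self.items[i]);
        }
    }
};

// State lives outside the body because the body cannot capture locals.
var total: i32 = 0;

fn addToTotal(elem: i32) void {
    total += elem;
}

fn sumViaIterate(list: *const List) i32 {
    total = 0;
    list.iterate(addToTotal);
    return total;
}

pub fn main() void {
    var list = List{};
    list.append(4);
    list.append(5);
    std.debug.print("{d}\n", .{sumViaIterate(&list)});
}
```

The need for the `total` variable is the gap the proposal tries to close: with the body pasted into the caller, an ordinary local would do.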
Also, it's as if you just wrote this:
Notice how close the loop is to the body of `iterate`.
Failure
Further, sometimes, though rarely, the iteration can fail.
For example, command line arguments allocate each arg as you ask for them:
zig/std/process.zig
Lines 276 to 278 in ec7d7a5
zig/std/process.zig
Lines 223 to 234 in ec7d7a5
Currently, you can use `while` on an error union and have an `else` branch to handle the error.
This code will repeatedly evaluate the expression until it's an error.
It will then run the `else` branch:
The same applies here.
If the iteration results in an error, you'd be forced by the compiler to provide an `else` branch, and that would look like this:
Notice any similarities? 😝
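For reference, the status-quo `while`/`else` form over an error union can be sketched as follows. `Countdown`, its `Exhausted` error and `sumUntilError` are hypothetical names chosen for the example; they are not taken from `std.process` or from the proposal. The sketch assumes a recent Zig release.

```zig
const std = @import("std");

// Hypothetical source that yields values until it fails.
const Countdown = struct {
    remaining: u32,

    fn next(self: *Countdown) error{Exhausted}!u32 {
        if (self.remaining == 0) return error.Exhausted;
        self.remaining -= 1;
        return self.remaining;
    }
};

fn sumUntilError(start: u32) u32 {
    var source = Countdown{ .remaining = start };
    var sum: u32 = 0;
    // The condition is re-evaluated on every pass; the payload is
    // captured as `value` for as long as no error is returned.
    while (source.next()) |value| {
        sum += value;
    } else |err| {
        // Runs once, when next() finally returns an error.
        std.debug.print("stopped: {s}\n", .{@errorName(err)});
    }
    return sum;
}

pub fn main() void {
    std.debug.print("sum = {d}\n", .{sumUntilError(3)});
}
```

The compiler requires the `else |err|` branch whenever the `while` condition is an error union, which is the behaviour the proposal wants `for` to mirror when `iterate` returns `!void`.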
To allow for this case, the implementation of `iterate` for the custom struct could return either `void` if it always succeeds, or `!void` if it can fail.
Element removal
There is an open question with this approach.
How would you remove an item from the iterable while you are iterating over it?
If you don't think you'd need to, see `std.ArrayList.swapRemove`. Trust me, it's useful. 😁
A little while ago, I made a PR which added `.swapRemove` and `.orderedRemove` to the custom iterator type of `std.ArrayList`.
If you skip needing to have an iterator, how would you access that functionality in a simple way without having to concern yourself over how you actually do that for the data structure in question, or figuring out how to even do it at all, as applicable?
Jai solves this problem by having a `remove` keyword which you provide the value to.
The Zig translation of that is this:
I'm not sure of the best way to do this, though this is one way I thought of: