Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

for-loops that iterate over key-value pairs #154

Closed
jsiwek opened this issue Sep 17, 2018 · 7 comments

Comments

@jsiwek
Copy link
Member

commented Sep 17, 2018

Moved from https://bro-tracker.atlassian.net/browse/BIT-1879

Created by Jon Siwek at 2017-12-08T13:45:51.144-0600:

Should investigate what it would take to have for-loops that iterate over keys and values simultaneously rather than just over keys. Since the value is already available in the core for free when iterating over table entries, it just makes it more convenient for the programmer and it also avoids the extra overhead of having to do an extra lookup from script-land in the cases where one needs the value.

@ZekeMedley

This comment has been minimized.

Copy link
Member

commented Mar 7, 2019

I'm happy to take this on.

It seems like the implementation shouldn't be too difficult as much can be borrowed from the current for loop.

zeek/src/Stmt.cc

Line 1346 in ed1a50e

ForStmt::ForStmt(id_list* arg_loop_vars, Expr* loop_expr)

I could use some feedback on particulars of this though.

Right now there seem to be two (fairly pythonic) ways of using a for loop, for (p in myset) {...} and for ([i, j] in mytable]) {...}. I plan on keeping with the pythonic way of doing things and using syntax like for (key, value in mytable) {...} where key and value are arbitrary variables.

I plan on implementing this as a new class in Stmt.h/cc called KeyValueForStmt. But I'm happy to change that if it seems better to modify the existing ForStmt class. I havn't gotten much of a chance to look at the parser yet, so I'm open to other implementation ideas.

If that all sounds ok I can go ahead and get started.

@jsiwek

This comment has been minimized.

Copy link
Member Author

commented Mar 7, 2019

The syntax sounds good to me. My first instinct would be to just try extending ForStmt and see what comes out of it, but ultimately the goal would just be implementing something that works and ideally doesn't have an adverse performance impact on existing code that doesn't use key-value pair iteration. That last part about performance I guess could be a reason to favor creating an entirely new statement type, but best actually do measurements first. Have fun :)

@ckreibich

This comment has been minimized.

Copy link
Contributor

commented Mar 7, 2019

Sounds good to me too. I have one question, though: if the index type of a table is a tuple, like in this ...

local foo: table[count, count] of string;
foo[1,2] = "foobar";

... and we now iterate over this, hypothetically like this ...

for (key, val in foo) { ... }

... then what is the type of key, and how do you access its two numeric members?

One possible solution would be that you have to say:

for ([c1, c2], val in foo) { ... }
@jsiwek

This comment has been minimized.

Copy link
Member Author

commented Mar 7, 2019

One possible solution would be that you have to say:

for ([c1, c2], val in foo) { ... }

That's what I'd suggest, too -- we use that syntax already for iterating over keys-that-are-tuples (without also iterating values):

local t: table[count, count] of string = table();

t[1, 2] = "hi";

for ( [i, j] in t )
    print i, j, t[i, j];

So nothing changes when adding support for iterating over key-value pairs.

Internally, that's a ListVal IIRC, at least a comment says:

// ListVals are mainly used to index tables that have more than one
// element in their index.
@ckreibich

This comment has been minimized.

Copy link
Contributor

commented Mar 7, 2019

Great! Zeke, <picard>make it so</picard> 👍

@ZekeMedley

This comment has been minimized.

Copy link
Member

commented Mar 9, 2019

Looks great! Thank you :)

It seems like this is going to require adding some new code to parse that type of loop. I'm new to Yacc and Lex, but I'll follow along with older commits that dealt with parsing and found a nice guide.

I still have to do some thinking about how best to parse and tell ForStmt. Right now I think I'll have ForStmt check the number of loop variables its passed. If its passed more than one, and its iterating over a table, then it can go ahead and work through the items giving out keys and values.

@jsiwek

This comment has been minimized.

Copy link
Member Author

commented Mar 11, 2019

Yeah, generally speaking, you'll have to update the grammar in parse.y to additionally (optionally) parse a comma-followed-by-identifier, inform the ForStmt constructor about the identifier used for to store table values, and actually assign the value when the loop is executed. Probably you'll want some error check (maybe in the ctor) to make sure the new key-value for-loop syntax is actually being used on an expression that will evaluate to a table-type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.