spec: clarify comprehension scope issues #84
Comments
I dislike the asymmetry of Python's semantics here. But this probably doesn't cause problems for implementors, or much surprise for users. Unlike other deviations from Python, e.g. not supporting implicit string concatenation, there's little tangible benefit to deviating here: it would just be "improving" the language in an abstract way. So I'm fine with codifying the Python 3 behavior in Starlark if Alan is. |
If so, I'm okay with it. |
This is not 100% safe to existing code.
I think the change for consistency's sake may be worth it anyway. |
In addition, we need to specify the behavior of Recall that the first for clause ( Similarly, the operand of the second for clause may refer to vars bound by the third:
Again, all four implementations agree but the spec says nothing. |
Yes, in Python, the first for-clause's iterable expression is evaluated in the outer scope, while everything else occurs inside the closure created by the comprehension. This can be confirmed by looking at the generated bytecode.
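The asymmetry described above can be observed directly in CPython 3. The following is a minimal sketch, not taken from the thread; the function names are hypothetical and chosen only for illustration. It assumes a Python 3 interpreter and catches `NameError` because `UnboundLocalError` is a subclass of it.

```python
def first_operand_uses_outer_binding():
    x = [1, 2, 3]
    # The operand of the first `for` is resolved in the enclosing function,
    # so the right-hand `x` is the list above, not the loop variable.
    return [x for x in x]


def later_operand_uses_comprehension_binding():
    x = [1, 2, 3]
    try:
        # The operand of the second `for` is resolved inside the comprehension,
        # where `x` is a loop variable that has not been assigned yet.
        return [x for _ in [0] for x in x]
    except NameError:  # UnboundLocalError is a subclass of NameError
        return None


print(first_operand_uses_outer_binding())
print(later_operand_uses_comprehension_binding())
```

The first call succeeds because only the first operand sees the enclosing `x`; the second fails at run time because every later operand sees the comprehension's own, still unbound, `x`.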
It's actually an intentional design decision to benefit generator expressions, whose implementation leaks over to other kinds of comprehensions where there's little or no benefit. In generator expressions, this causes the RHS of the first

Where do we stand on this GitHub issue? My memory is that we decided (and implemented) that comprehensions define variables in their own local scopes. I furthermore thought that we did away with Python 3's special casing of the first RHS, so that |
- first loop operand is resolved in enclosing environment
- all loop vars are bound before any expression is resolved

See bazelbuild/starlark#84
Change-Id: I2c4283719cf78cea27dccba1ffb0932857254909
I see from the PR discussion there's also a point about whether each I'm not sure I have strong opinions about this. The two possible design philosophies are
|
Jon, we follow Python3: a comprehension creates a single lexical block. The first loop's operand is resolved in the enclosing environment. Then all vars are bound; their scope is the entire block. Then all expressions are resolved. (BTW, terms: a binding has a scope, which is the region of text in which its name refers to it. The environment is a tree of blocks, which map names to bindings.) In the interests of sanity I think I would prefer to stick with the Python behavior. If the inner loop body is ever executed, you'll discover fwd-ref mistakes quickly. BTW it is possible to exploit the Python behavior by having the second loop conditionally execute the reference to the var of the third loop only on its second iteration |
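One way the "exploit" mentioned above might look in Python 3 is sketched here. This is an illustrative guess at the kind of code meant, not the commenter's own example; the function name `exploit_single_block` and the values are hypothetical. Because all loop variables live in one block, the second clause's operand can read `z` once an earlier pass of the third loop has assigned it.

```python
def exploit_single_block():
    # On the first pass (x == 0) the second operand avoids touching z.
    # On the second pass (x == 1) it reads z, which the third loop
    # already assigned during the first pass.
    return [
        (x, y, z)
        for x in [0, 1]
        for y in ([0] if x == 0 else [z])
        for z in [10]
    ]


print(exploit_single_block())  # [(0, 0, 10), (1, 10, 10)]
```

If the conditional were removed so that `z` was read on the first pass, the reference would fail immediately, which is the "you'll discover fwd-ref mistakes quickly" case.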
If the resolution is to do exactly the same scoping behavior as Python 3, that's fine by me, and it's just a matter of spec wording and implementation. |
I greatly appreciate increasing the strictness of the spec. Replicating Python 3 behaviour is good. Though I think Python 3 developers made a mistake by defining only one scope for the whole comprehension, we could do better (create scope for each for clause: that is probably what users intuitively expect). (I guess Python did not bother with static error reporting due to its very dynamic nature, like
This can be said about any static error. But a lot of definitions are very infrequently executed code. As some evidence, when I enabled strict scope resolution in our repositories (each variable must be resolved to global or local), I have discovered a huge amount of broken code. Like this:
Can you spot an error here? But Starlark can, and that's great. But either way is fine with me, it is not a life and death issue. |
I asked @stepancheg for an example of how you can observe the scope difference. He replied with:

    def foo():
        z = []
        print([x for x in [1] for y in z for z in []])
    foo()

The |
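To make the observable difference concrete, here is a hedged variant of that snippet that reports the outcome instead of crashing. The wrapper name `observe_scope` is hypothetical. It assumes CPython 3, where the `z` in the second clause resolves to the comprehension's own `z` (bound by the third clause) rather than to the function's `z`.

```python
def observe_scope():
    z = []
    try:
        # Single-block scoping (Python 3): `z` in `for y in z` is the
        # comprehension's loop variable, which is unbound at this point.
        return [x for x in [1] for y in z for z in []]
    except NameError:  # UnboundLocalError is a subclass of NameError
        return "unbound z"


print(observe_scope())  # unbound z
```

Under a hypothetical scheme in which each `for` clause opened its own scope, the second clause would instead see the function's `z = []` and the comprehension would evaluate to an empty list.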
See my above comment on simplifying things versus matching Python behavior. Fixing this means yet another difference, and I'm dubious that this kind of error really occurs in practice. It's easier to deviate from Python when we're outright banning a feature that commonly leads to errors, rather than changing semantics of something for the sake of purity. That said, I don't feel super strongly. |
The remaining task is to copy the changes from google/starlark-go#299 into this repo. |
Done by #115. |
The Go implementation of Starlark follows the Python3 semantics---specifically, the outermost for loop's sequence expression is evaluated outside the new lexical block---and its spec describes this in some detail (https://github.com/bazelbuild/starlark/blob/master/spec.md#comprehensions). By contrast, the Java implementation evaluates the for-operand sequence in the outer lexical block, which resembles Python2.
So:
We should decide on the semantics and make the implementations agree. I propose we aim for the Python3-compatible behavior, for "least surprise". (The difference is entirely in the name resolver.)