Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify evaluation order of function arguments #13

Closed
laurentlb opened this issue Oct 22, 2018 · 18 comments
Closed

Specify evaluation order of function arguments #13

laurentlb opened this issue Oct 22, 2018 · 18 comments

Comments

@laurentlb
Copy link
Contributor

laurentlb commented Oct 22, 2018

Reported by @alandonovan

This program prints each argument in the call to f as it is evaluated.

def id(x):
        print(x)
        return x

def f(*args, **kwargs):
        print(args, kwargs)

f(id(1), id(2), x=id(3), *[id(4)], y=id(5), **dict(z=id(6)))

Its results vary across all implementations:

Python2: 1 2 3 5 4 6 (*args and **kwargs evaluated last)
Python3: 1 2 4 3 5 6 (positional args evaluated before named args)
Starlark-in-Java: 1 2 3 4 5 6 (lexical order)
Starlark-in-Go: crashes (github.com/google/skylark/issues/135)

The spec currently says nothing about argument evaluation order. The Starlark-in-Java semantics are the cleanest of the three but are the most complicated and least efficient to implement in a compiler, which is why the Python compilers do what they do. The problem with the Java semantics is that it requires the compiler to generate code for a sequence of positional arguments, named arguments, an optional *args, more named arguments, and an optional **kwargs, and to provide the callee with a map of the terrain. By contrast, both of the Python approaches are simple to compile: Python2 always passes *args and *kwargs at the end, and Python3 always pushes the *args list before the first named.

One way to finesse the problem without loss of generality would be to specify Python2 semantics but to statically forbid named arguments after *args, such as y=id(5) in the example. (This mirrors the static check on the def statement, which requires that *args and **kwargs come last.

@laurentlb
Copy link
Contributor Author

statically forbid named arguments after *args, such as y=id(5) in the example

I think that's a good solution.

Buildifier can automatically move the arguments (order should be: positional, then named, then *args, then **kwargs) - cc @vladmos

@brandjon
Copy link
Member

Stylistically speaking, I think user code should not rely on the order of argument evaluation (except of course that it's deterministic). The only circumstance in which I could imagine encouraging such a dependency is if that order is left-to-right, as in Starlark-in-Java. I also think it's rare that this will come up in practice.

I've always understood the calling semantics of Python functions (at least in Python 3) to be: 1) gather the positional args (including *args) and populate the params, then 2) gather the keyword args (including **kargs) and populate more params. This order is important to understanding how conflicts are handled -- you'd get different results if you populated the keywords first, then skipped over params that were already filled in when populating positionals. So it doesn't surprise me that this separation is reflected in Python 3's evaluation order.

I don't feel super strongly about prohibiting keyword args after *args in the caller. Though I guess that would prohibit calls like

my_join_func(string1, string2, ..., stringn, *[even_more_strings], sep=',')

@alandonovan
Copy link
Contributor

Stylistically speaking, user code should not rely on the order of argument evaluation

A style rule we can all agree on, in any language.

I don't feel super strongly about prohibiting keyword args after *args in the caller. Though I guess that would prohibit calls like
my_join_func(string1, string2, ..., stringn, *[even_more_strings], sep=',')

How would you declare such a function? I think you would have to use **kwargs for the optional sep parameter, which is quite unnatural.

@brandjon
Copy link
Member

Indeed, that would be an awkward situation. But I think if the aim was to provide that interface to the caller, then the awkwardness would not deter a user from writing that function definition.

@laurentlb
Copy link
Contributor Author

I think the simplest is to forbid it.

  • We force the order with Buildifier, which gives a more consistent formatting
  • We keep left-to-right evaluation while we align the evaluation with Python semantics
  • It gives more flexibility to the interpreter (AST representation, code generation, or evaluation).
  • It may simplify other tools as well

@alandonovan
Copy link
Contributor

I think you are proposing that we statically disallow f(*args, k=v) and evaluate f(a, *[b], **dict(k=c)) in the order a, b, c, following Python2. That sounds good to me.

We force the order with Buildifier, which gives a more consistent formatting

If you plan to make Buildifier replace f(*args, k=v) with f(k=v, *args), review the changes carefully as it is not behavior-preserving.

@laurentlb
Copy link
Contributor Author

laurentlb commented Oct 29, 2018

On a related note, this code: foo(1, *[2], 3) is rejected by Python2 and Starlark-Go, but accepted by Python3 and Bazel.

I think it would be simpler if we forced the argument order fully:

  1. positional arguments,
  2. named arguments,
  3. *arg,
  4. **kwarg.

This would allow an implementation to have 4 fields (one for each kind of argument), and not worry about relative order of items between the different kinds of arguments.

@alandonovan
Copy link
Contributor

That also sounds good to me.

@brandjon
Copy link
Member

When we removed support for keyword-only parameters, I had the notion that at some point in the future, when Python 2 syntactic compatibility is no longer a goal, we could add them back. I'm ok with the extra restrictions described in this thread, but we should ask whether we intend to relax these constraints later on (well after 1.0). If so, it may affect what assumptions interpreter authors rely on today.

I used to think that ideally Starlark would support the same parameter and argument features as Python 3. But it turns out that's a moving target. More recent versions of Python 3 support multiple unpacking expressions, and a draft pep introduces positional-only parameters [1]. I agree that some call syntaxes like f(a, *b, c) and f(*a, *b) do not seem to have legitimate use cases that outweigh their readability costs. So in light of the growing gap between Python 3 and Starlark, perhaps its time to take the hard line on simplicity.

Allowing *args/**kargs only at the end (for both calls and definitions) is similar to how C-style languages put varargs at the end. It's easier to represent and work with in the interpreter. It avoids catering to use cases we don't fully understand or endorse. I do wish we had an easy way to support both positional-only params and keyword-only params, but it's probably not worth the complication.

[1] I laughed out loud when I saw the token separating positional-only params from the rest.

@laurentlb
Copy link
Contributor Author

Keyword-only syntax is for the definition site. The proposal here is for the call site. I don't think it will interact. Whether an argument is optionally named or mandatory named makes no difference on the call site.

I still expect we'll want keyword-only arguments at some point (especially since we use that in rules and other builtins).

@vladmos
Copy link
Member

vladmos commented Nov 15, 2018

Added a fix to buildifier: buildifier --lint=fix --warnings=args-order file.

@alandonovan
Copy link
Contributor

[Jon Brandvin:] When we removed support for keyword-only parameters

Wait, this feature was removed from Bazel? I just spent two hours working on google/starlark-go#62! D'oh. Should I assume then that any uses I encounter with Blaze are neglected BUILD files?

@brandjon
Copy link
Member

Sadly yes, keyword-only params in function definitions were removed because they aren't syntactically supported in Python 2. IIRC the only thing that can come after *args in a definition is **kargs. The asterisk * by itself in the comma separated list (to delimit keyword-only's when there's no varargs param) is a syntax error.

@alandonovan
Copy link
Contributor

I see. Why does Python2 syntax matter?

@brandjon
Copy link
Member

It's a design decision to keep Starlark in the intersection of Python 2 and Python 3 syntax. This is so we can imperfectly benefit from existing Python tooling, until Starlark tooling is more mature. We could revisit this later on.

@laurentlb
Copy link
Contributor Author

For a long time, Python 2 syntactic compatibility was critical.

@laurentlb
Copy link
Contributor Author

Marking this issue as approved (#13 (comment)).

For discussion about Python 2 syntax, see #27

adonovan added a commit to google/starlark-go that referenced this issue Nov 13, 2020
f(*args, k=v) is a now an error. Use f(k=v, *args).

Arguments are evaluated left-to-right, as in Python2 (but not Python3).
See bazelbuild/starlark#13 for spec change.

Change-Id: I267823df2cbae55a1edac4a5627a04b3e9f82c94
@alandonovan
Copy link
Contributor

The spec says:

Function arguments are evaluated in the order they appear in the call.                               
                                                                                                     
Unlike Python, Starlark does not allow more than one `*args` argument in a                           
call, and if a `*args` argument is present it must appear after all                                  
positional and named arguments.                                                                      

Marking as done.

bazel-io pushed a commit to bazelbuild/bazel that referenced this issue Nov 13, 2020
alandonovan pushed a commit to google/starlark-go that referenced this issue Nov 13, 2020
f(*args, k=v) is a now an error. Use f(k=v, *args).

Arguments are evaluated left-to-right, as in Python2 (but not Python3).
See bazelbuild/starlark#13 for spec change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants