Multiple context expressions do not support parentheses for continuation across lines #56991
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = 'https://github.com/pablogsal'
closed_at = <Date 2020-06-25.19:11:20.497>
created_at = <Date 2011-08-19.02:10:17.157>
labels = ['interpreter-core', 'type-feature', '3.10']
title = 'Multiple context expressions do not support parentheses for continuation across lines'
updated_at = <Date 2022-01-16.22:14:02.746>
user = 'https://github.com/Julian'
activity = <Date 2022-01-16.22:14:02.746>
actor = 'eric.araujo'
assignee = 'pablogsal'
closed = True
closed_date = <Date 2020-06-25.19:11:20.497>
closer = 'pablogsal'
components = ['Interpreter Core']
creation = <Date 2011-08-19.02:10:17.157>
creator = 'Julian'
dependencies =
files =
hgrepos =
issue_num = 12782
keywords = ['patch']
message_count = 29.0
messages = ['142411', '142549', '142550', '142631', '142658', '142750', '180512', '180514', '180515', '236083', '236121', '293660', '326637', '326639', '326645', '326648', '326727', '327875', '346137', '346157', '347025', '363755', '372383', '372384', '372385', '372388', '410726', '410728', '410729']
nosy_count = 21.0
nosy_names = ['gvanrossum', 'barry', 'georg.brandl', 'ishimoto', 'ncoghlan', 'benjamin.peterson', 'ezio.melotti', 'eric.araujo', 'steven.daprano', 'r.david.murray', 'lukasz.langa', 'Julian', 'serhiy.storchaka', 'ulope', 'Anthony Sottile', 'pablogsal', 'thautwarm', 'BTaskaya', 'Terry Davis', 'Jeffrey.Kintscher', 'jack1142']
pr_nums =
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue12782'
versions = ['Python 3.10']
with (open("a_really_long_foo") as foo,
      open("a_really_long_bar") as bar):
    pass

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "demo.py", line 19
    with (open("a_really_long_foo") as foo,
                                     ^
SyntaxError: invalid syntax
Also, without complicating things: import does not support this either. It is the only other compound statement I can think of that forces you to either be redundant or grit your teeth and use \, despite the fact that PEP 328 gave us parentheses for from-imports.
(I did not find a discussion as to why import didn't grow it as well, so please correct me as I'm sure it must have been discussed before).
It's understandably a lot rarer to need multiple lines when importing, but it'd be nice if all compound statements uniformly allowed the same continuation syntax.
One similar example would be "raise" in Python 2.
This is not true: only "import-as" allows this syntax. All other uses of parentheses for continuation are continuations of *expressions*.
As Georg noted, only individual expressions get parentheses-based continuations automatically. For statement-level use of comma separation, it's decided on a case-by-case basis whether we think it is a legitimate usage based on our style guidelines.
That's why 'from location import (name1, name2)' is allowed, but 'import (name1, name2)' is not: we explicitly advise against importing too many modules in a single import statement, but importing multiple names from a single location is often a useful thing to do.
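The asymmetry between the two import forms can be checked with the built-in compile(). This is only a minimal sketch; `compiles` is a hypothetical helper name introduced for the illustration.

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Parenthesized name list after 'from ... import' (PEP 328): accepted.
from_import_ok = compiles("from os.path import (join,\n                     split)\n")

# Parenthesized module list after a plain 'import': rejected.
plain_import_ok = compiles("import (os,\n        sys)\n")

print(from_import_ok, plain_import_ok)
```

compile() only parses and compiles the source, so nothing is actually imported by these checks.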
However, while the multiple context expression use case is reasonable, there may be a grammar ambiguity problem in this case, since (unlike from-import) with statements allow arbitrary subexpressions.
Cool. I imagined this had to do with it.
Sorry, can you possibly clarify where the ambiguity might come in?
There is a discussion about this on Python-Ideas:
This has come up on Python-Ideas again:
This was closed without enough explanation. Suggesting people should use ExitStack due to a Python grammar deficiency is suboptimal to say the least.
This problem is coming back to users of Black due to Black's removal of backslashes. It's the only piece of our grammar where backslashes are required for readability which shows there's something wrong.
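For reference, the backslash form being discussed looks roughly like the sketch below. nullcontext is used here only as a stand-in for a long context expression such as open(...); that substitution is an assumption made so the snippet runs without touching any files.

```python
from contextlib import nullcontext

# Without grouping parentheses, the only way to break the header across
# lines is a backslash at the end of each continued line.
with nullcontext("a_really_long_foo") as foo, \
     nullcontext("a_really_long_bar") as bar:
    names = (foo, bar)

print(names)
```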
The syntax ambiguity that Nick is raising fortunately shouldn't be a problem because a single tuple is an invalid context manager. In other contexts if the organizational parentheses are matched by the with-statement and not by the underlying
Pablo has a working patch for this, we intend to fix this wart for Python 3.8.
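The observation that a single tuple is not a valid context manager can be checked directly. A minimal sketch, binding the tuple to a name first so that the grammar question does not come into it:

```python
from contextlib import nullcontext

managers = (nullcontext(), nullcontext())  # a plain tuple of two managers

# Tuples implement neither half of the context manager protocol.
tuple_has_protocol = hasattr(managers, "__enter__") or hasattr(managers, "__exit__")

try:
    with managers:  # the tuple itself is the context expression here
        entered = True
except (AttributeError, TypeError) as exc:
    # Which of the two exception types is raised depends on the Python version.
    entered = False
    error_name = type(exc).__name__

print(tuple_has_protocol, entered)
```

Since no working program can rely on entering a tuple, treating the outer parentheses as grouping should not change the meaning of code that currently runs.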
Especially since the dynamic flexibility of ExitStack comes at a genuine runtime cost when unwinding the resource stack.
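For comparison, the ExitStack workaround mentioned above would look something like this sketch. nullcontext again stands in for real resources (an assumption for the sake of a runnable example).

```python
from contextlib import ExitStack, nullcontext

with ExitStack() as stack:
    # enter_context() enters each manager and registers its __exit__,
    # to be called in reverse order when the stack unwinds.
    foo = stack.enter_context(nullcontext("a_really_long_foo"))
    bar = stack.enter_context(nullcontext("a_really_long_bar"))
    names = (foo, bar)

print(names)
```

This avoids the long header line, but the callbacks are managed dynamically at runtime instead of being compiled into the with statement, which is where the extra cost comes from.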
I also (very!) belatedly noticed that I never answered Julian's request for clarification about the potential grammar ambiguity, so going into detail about that now:
The first item in the grammar after the 'with' keyword is a 'test' node, which can already start with a parenthesis, which means a naive attempt at allowing grouping parentheses will likely fail to generate a valid LL(1) parser.
That doesn't mean a more sophisticated change isn't possible (and Pablo has apparently implemented one) - it just means that the required grammar update is going to be more complicated than just changing:
(That would need too much lookahead to decide whether an opening parenthesis belongs to the first 'with_item' in 'with_items' or if it's starting the alternative multi-line grouping construct)
The Python grammar is already not LL(1) strictly. Take for example the production for "argument":
argument: ( test [comp_for] | test '=' test | '**' test | '*' test )
Obviously the first sets of "test [comp_for]" and "test '=' test" are the same, so the rule is ambiguous, but the NDFAs are still able to produce DFAs that can generate a concrete syntax tree, which allows the AST generation to check that the test before the '=' is a NAME and not anything else.
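The effect of that later check is visible from Python code. A small sketch, using a hypothetical `compiles` helper; which stage of the compiler rejects the bad forms may differ between parser implementations, but the outcome is a SyntaxError either way:

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A NAME before '=' is a valid keyword argument.
keyword_with_name = compiles("f(x=1)")

# Anything other than a NAME before '=' is rejected.
keyword_with_literal = compiles("f(1=2)")
keyword_with_attribute = compiles("f(a.b=1)")

print(keyword_with_name, keyword_with_literal, keyword_with_attribute)
```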
The rule with_stmt: 'with' ( with_item (',' with_item)* | '(' with_item (',' with_item)* [','] ')' ) ':' suite
will generate a similar scenario. The NDFAs will generate DFAs that will ultimately allow us to just skip the outermost group of parentheses when generating the nodes. This makes all of these statements valid:
with (manager() as x, manager() as y):
    pass
with (manager() as x, manager() as y,):
    pass
with (manager()):
    pass
with (manager() as x):
    pass
with (((manager()))):
    pass
with ((((manager()))) as x):
but not this one:
with (((manager()))) as x:
the reason is that it assigns the first LPAR to the second production and it fails when searching for the one that is at the end. I think this limitation is OK.
If you want to play with that, here is a prototype of the implementation with some tests:
The DFA for the rule
with_stmt: 'with' ( with_item (',' with_item)* | '(' with_item (',' with_item)* [','] ')' ) ':' suite
DFA for with_stmt
It works because the transition from State 1 on a "(" is going to prioritize the path:
0 -> 1 -> "(" -> 2
over the path:
0 -> 1 -> with_item -> 3
Reviewing the thread, we never actually commented on thautwarm's proposal in https://bugs.python.org/issue12782#msg327875 that aims to strip out any INDENT, NEWLINE, and DEDENT tokens that appear between the opening "with" keyword and the statement header terminating ":".
The problem with that is that line continuations are actually handled by the tokenizer, *not* the compiler, and the tokenizer already switches off the INDENT/NEWLINE/DEDENT token generation based on the following rules:
By design, the tokenizer is generally unaware of which NAME tokens are actually keywords - it's only aware of async & await at the moment as part of the backwards compatibility dance that allowed those to be gradually converted to full keywords over the course of a couple of releases.
Hence why INDENT/NEWLINE/DEDENT never appear inside expressions in the Grammar: the tokenization rules mean that those tokens will never appear in those locations.
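The standard-library tokenize module can be used to observe this behaviour. The sketch below is illustrative only; `token_names` is a hypothetical helper, and tokenize reports line breaks inside brackets as NL tokens rather than NEWLINE.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names that tokenize produces for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# A line break plus leading whitespace inside parentheses.
inside_parens = token_names("x = (1 +\n        2)\n")

# A line break followed by an indented block.
block = token_names("if x:\n    y = 1\n")

print(inside_parens)
print(block)
```

In the parenthesized case there is a single NEWLINE (the end of the logical line) and no INDENT, even though the second physical line is indented. The block case produces both INDENT and DEDENT.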
And it isn't simply a matter of making the tokenizer aware of the combination of "with" and ":" as a new pairing that ignores linebreaks between them, as ":" can appear in many subexpressions (e.g. lambda functions, slice notation, and the new assignment expressions), and it's only the full parser that has enough context to tell which colon is the one that actually ends the statement header.
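To make the colon problem concrete, here is a small example of a with header that contains colons other than the terminating one. The managers and data are made up for the illustration.

```python
from contextlib import nullcontext

data = [10, 20, 30, 40]

# The header below contains three colons: one in a slice, one in a
# lambda, and the final one that actually ends the statement header.
with nullcontext(data[1:3]) as sliced, nullcontext(lambda: "called") as fn:
    result = (sliced, fn())

print(result)
```

A tokenizer that simply looked for the first ":" after "with" would stop inside the slice.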
Thus the design requirement is to come up with a grammar rule that allows this existing code to continue to compile and run correctly:
While also enabling new code constructs like the following:
with (nullcontext() as example):
    pass
with (nullcontext(), nullcontext()):
    pass
with (nullcontext() as example, nullcontext()):
    pass
with (nullcontext(), nullcontext() as example):
    pass
with (nullcontext() as example1, nullcontext() as example2):
    pass
If we can get the Grammar to allow those additional placements of parentheses, then the existing tokenizer will take care of the rest.
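One way to check both halves of that requirement on a given interpreter is to feed candidate headers to ast.parse. This is a rough sketch: `parses` is a hypothetical helper, and the "legacy" strings are illustrative examples of parentheses that belong to the context expression itself, not code quoted from this thread.

```python
import ast


def parses(source):
    """Return True if `source` parses as a module, False on SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Parentheses that are part of the context expression (illustrative).
legacy_forms = [
    "with (nullcontext()) as example:\n    pass\n",
    "with (nullcontext()), (nullcontext()):\n    pass\n",
]

# Parentheses that group whole with-items across lines.
grouped_forms = [
    "with (nullcontext() as example):\n    pass\n",
    "with (nullcontext() as example1,\n      nullcontext() as example2):\n    pass\n",
]

legacy_results = [parses(src) for src in legacy_forms]
# Whether these are accepted depends on the Python version in use.
grouped_results = [parses(src) for src in grouped_forms]

print(legacy_results, grouped_results)
```

The legacy forms should be accepted both before and after any grammar change; the grouped forms are only accepted by interpreters whose grammar includes the new rule.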
I can confirm Guido's words, now parentheses for continuation across lines are already supported.
Even without parentheses, multiline with items can be supported. I just implemented it here: https://github.com/thautwarm/cpython/blob/bpo-12782/Grammar/python.gram#L180-L187
from contextlib import contextmanager

@contextmanager
def f(x):
    try:
        yield x
    finally:
        pass

# Ok
with f('c') as a, f('a') as b: pass

# Ok
with f('c') as a, f('a') as b, f('a') as c: pass

# ERROR
with f('c') as a, f('a') as b, f('a') as c: x = 1 + 1
# ERROR with f('c') as a, f('a') as b, f('a') as c: x = 1 + 1
File "/home/thaut/github/cpython/../a.py", line 49
IndentationError: unexpected indent
The grammar is:
The restriction here is that, from the second 'with_item' through the end of 'statements', the expressions and statements have to keep the same indentation.
with item1, item2, ...: block
The indentation of 'item2', ..., 'block' should be the same.
This implementation leverages the new PEG and how the lexer deals with indent/dedent.