PEG discrepancy on 'if {x} {a}: pass' #85825
Comments
I just noticed a subtle discrepancy between the old parser and the PEG parser. Consider this syntax error:
The old parser places the caret at the '{':
The PEG parser puts it at 'a':
I don't think we should put much effort into fixing it -- it's just a curiosity. I suspect it's got to do with some lookahead.
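For anyone who wants to compare caret positions across interpreter versions, here is a minimal sketch that compiles the statement from the title and prints where the SyntaxError points. The helper names `syntax_error_location` and `show_caret` are made up for this illustration; the reported column depends on which parser the running interpreter uses, so no expected output is shown.

```python
def syntax_error_location(source):
    """Return (lineno, offset, text) for a SyntaxError, or None if source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.offset, exc.text
    return None


def show_caret(source):
    """Render the offending line with a caret under the reported column."""
    location = syntax_error_location(source)
    if location is None:
        return "no syntax error"
    lineno, offset, text = location
    line = (text or source).rstrip("\n")
    # offset is 1-based; fall back to column 1 if the interpreter gives none.
    return line + "\n" + " " * ((offset or 1) - 1) + "^"


if __name__ == "__main__":
    # The statement from the issue title; run under different Python
    # versions to see where each parser places the caret.
    print(show_caret("if {x} {a}: pass\n"))
```

Running the same script under 3.8 and 3.9 should make the difference described above visible without using the interactive prompt.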
I had a look at this. I have been testing with the input a{x}, which has the same problem. It's actually caused by an invalid_* rule. The call stack looks like this: ...

The invalid_comprehension rule accepts an LBRACE as its starting token and only fails after parsing it. That means the parser fails with three tokens in the tokens array: the NAME, which is valid; the LBRACE, which is consumed for the invalid_comprehension rule; and the NAME after it, at which point the parser fails and backs out of everything. We then look at the last token we've parsed, and that's where we place the caret.

Because of invalid_comprehension, we can even make the parser show a completely different error, for example:

➜ cpython git:(master) ✗ cat a.py

Or place the caret even further along:

➜ cpython git:(master) ✗ cat a.py

There's a simple fix, which is adding an alternative to the primary rule that parses something along the lines of
Minimal example
>>> a{ # or
>>> a {
In 3.8, this is immediately flagged as a SyntaxError. In 3.9 and master, a continuation prompt is issued. This strikes me as a parsing buglet that should preferably be fixed, as it implies that something valid *could* follow '{', thus misleading beginners. On the other hand, after scanning my keyboard, '{' seems unique in being a legal symbol, unlike `, $, and ?, or combinations like +*, that can AFAIK never follow a name. So it would need special handling.
Side note: for the same reason I dislike the { change, I like the generic 3.9 change for legal operators without a second operand.
>>> a *
Both flag as SyntaxError, but in 3.8, the caret is under '*', falsely implying that '*' cannot follow a name, while in 3.9, it is under the whitespace that follows, correctly implying that the * is legal and that the problem is the lack of a second expression (on the same line without continuation).
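The interactive behaviour discussed above (immediate SyntaxError versus a continuation prompt) can be probed without a REPL. The sketch below uses the standard library's `codeop.compile_command`, which returns `None` when the input is considered incomplete; the interactive interpreter's exact logic is assumed to be similar but is not guaranteed to match. The `classify` helper is a hypothetical name for this illustration, and the result for `a{` is expected to differ between versions, so no output is claimed.

```python
import codeop


def classify(source):
    """Classify source as 'complete', 'incomplete', or 'error'.

    'incomplete' corresponds to the case where an interactive session
    would show a continuation prompt instead of reporting an error.
    """
    try:
        code = codeop.compile_command(source, "<input>", "single")
    except (SyntaxError, ValueError, OverflowError):
        return "error"
    return "incomplete" if code is None else "complete"


if __name__ == "__main__":
    # 'a{' and 'a *' are the inputs from the comment above; the other
    # two are reference points with well-known behaviour.
    for snippet in ["a = 1", "a = (", "a{", "a *"]:
        print(repr(snippet), "->", classify(snippet))
```

A result of "incomplete" for `a{` would match the continuation prompt reported for 3.9 and master, while "error" would match the 3.8 behaviour.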