Make lib2to3 grammar better match Python, support the := walrus #80722
Comments
The grammar in lib2to3 is out of date and can't parse some newer syntax, including the := walrus operator. I'm unsure whether I need a separate bug per pull request, but I need at least one to get started. |
jreese reminded me of PEP-570, which will make more grammar changes. I'm open to the idea of replacing the grammar with the live one, plus porting the 2isms forward like print, eval, except with comma. My sincere hope is that everyone that depends on this structure will have tests (mine and lib2to3 do); the only big user I'm aware of is probably libfuturize. Definitely worth a changelog entry if this is the way forward. |
Here's approximately what it would look like to do the big change now: master...thatch:lib2to3-update-grammar (one test failing, and some helpers may need more test coverage) |
For the changes of PEP-570, please wait until I merge the implementation to do the grammar changes in lib2to3 for that. |
My strong preference would be getting the lib2to3 grammar to be the Python grammar + additions, to make future changes easier to merge. The strongest argument against doing that is the backwards-incompatibility of patterns (some won't compile, while others will compile but do something unexpected). It's good to hear (or at least infer) that parsing modern code is also a goal of lib2to3. |
Re: breakage due to changes in structure (https://bugs.python.org/issue36541#msg339669) ... this has already happened in the past (e.g., type annotations and async). It's probably a good idea to add some documentation that structure changes can be expected with each release of Python. |
Also the Grammar.txt diffs look about the same size as I've seen with other upgrades to lib2to3 when the Python grammar changed. |
Support for |
Parsing support for as lib2to3 is pending deprecation at this point, I'm not going to work on this. Anyone is welcome to pick it up. Modifying the lib2to3 grammar and any related code, and adding a test, is what's needed to parse that syntax. |
I made a suggestion for augmenting ast.parse with some of lib2to3's features, but nobody seemed interested. RIP lib2to3. Like many pieces of software, it was used for far more than it was originally intended for. |
I don't see the point of augmenting ast.parse, since we already have proper CST implementations outside core Python, such as github.com/davidhalter/parso/ or LibCST. Also, for basic refactorings, it is so easy to use tokens for the refactoring and the AST for the analysis! Even ast.unparse() can be partially used (first find the related segment of the code through AST analysis, build the corresponding variant, unparse it, then find the region of related tokens in the source code and replace them). There are also quite a few libraries (or wrappers) that use tokenize for different purposes, such as https://github.com/asottile/tokenize-rt or github.com/isidentical/brm. |
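A minimal sketch of the "AST for analysis, source positions for the rewrite" idea described above, using only the standard library (Python 3.8+ for the end offsets). The function name `rename_call` is hypothetical and chosen for illustration, and the sketch assumes ordinary newline line endings; it is not taken from any of the projects mentioned.

```python
import ast


def rename_call(source: str, old: str, new: str) -> str:
    """Rename calls to `old` into `new`, leaving all other text untouched."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)

    # Analysis step: use the AST to find the exact position of each call target.
    spans = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == old
        ):
            func = node.func
            spans.append((func.lineno - 1, func.col_offset, func.end_col_offset))

    # Rewrite step: patch the original text, right to left so that
    # earlier offsets on the same line stay valid.
    for line_no, start, end in sorted(spans, reverse=True):
        # AST column offsets are UTF-8 byte offsets, so slice the encoded line.
        raw = lines[line_no].encode("utf-8")
        patched = raw[:start] + new.encode("utf-8") + raw[end:]
        lines[line_no] = patched.decode("utf-8")
    return "".join(lines)


example = "x = foo(1)  # keep   this comment\ny = foo( foo(2) )\n"
rewritten = rename_call(example, "foo", "bar")
```

Because only the matched spans are replaced, comments and unusual spacing survive, which is exactly what a plain parse-then-unparse round trip would lose.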
Every piece of code that uses either lib2to3 or a parser derived from it (including parso and LibCST) will eventually not be able to upgrade the parser because PEG can handle grammars that LL(k) can't. That's why I proposed adding some functionality to ast.parse, to make the whitespace and token information easily available - this seems to be what @BTaskaya says is "easy" (maybe they mean it's easy using LibCST? It seems to be fiddly using ast.parse). The alternative is that all these projects (black, LibCST, yapf, etc.) will have to roll their own solutions, which doesn't seem a very productive use of people's time and makes version upgrades slow. If people are interested in using ast.parse extensions as a replacement for lib2to3, I suggest discussing at https://mail.python.org/archives/list/python-ideas@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/ |
Since these projects are external, depending on the functionality they need, they are free to roll their own parser implementations or use hacks to work around things. Or they can fork Grammar/python.gram to preserve all tokens and generate a Python parser from it.
I don't quite get what you are proposing here,
How would you do that? By augmenting the AST with the information retrieved from tokens? If so, check this out; https://github.com/leo-editor/leo-editor/blob/master/leo/core/leoAst.py and asttokens. Also, please move the discussion to somewhere else (like discuss.python.org etc.) since this is not the ideal place to discuss and people might be distracted! (feel free to cc me where you move the discussion) |
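As a rough illustration of "augmenting the AST with the information retrieved from tokens", here is a small standard-library sketch that recovers comments (which ast.parse discards) via tokenize and associates them with top-level statements by line range. The helper name `comments_by_statement` is hypothetical; this is not how leoAst.py or asttokens work internally, only a simplified stand-in for the general idea.

```python
import ast
import io
import tokenize


def comments_by_statement(source: str) -> dict:
    """Map each top-level statement's first line to the comments inside its line range."""
    tree = ast.parse(source)

    # The tokenizer keeps comments; the AST does not.
    comments = [
        (tok.start[0], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.COMMENT
    ]

    result = {}
    for stmt in tree.body:
        found = [
            text
            for line, text in comments
            if stmt.lineno <= line <= stmt.end_lineno
        ]
        if found:
            result[stmt.lineno] = found
    return result


sample = "x = 1  # first\n\ndef f():\n    # inside\n    return x\n"
mapping = comments_by_statement(sample)
```

Comments that fall between statements are not attached to anything here; a real tool would need a policy for those, which is part of why this is fiddly to do by hand.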
I also agree, and I don't support adding features from now on. If someone really needs this to be added [and backported], please re-open the issue. |
While I said I didn't care... and don't really want to... I found a reason to at least not omit PEP 570 positional-only argument parsing support, given that things like yapf still use it rather than forking their own copy. PR testing. |
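For readers unfamiliar with the syntax in question: PEP 570 adds a `/` marker to parameter lists, and parameters before it are positional-only. The sketch below uses ast.parse (Python 3.8+), not lib2to3, purely to show what a parser has to recognize; the function `f` and its parameter names are made up for illustration.

```python
import ast

# `a` and `b` are positional-only, `c` is a regular parameter, `d` is keyword-only.
tree = ast.parse("def f(a, b, /, c, *, d): pass")
arguments = tree.body[0].args

positional_only = [arg.arg for arg in arguments.posonlyargs]
regular = [arg.arg for arg in arguments.args]
keyword_only = [arg.arg for arg in arguments.kwonlyargs]
```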