Split/Remove Backward tokenization #6363

Closed
Tracked by #6502
MichaReiser opened this issue Aug 5, 2023 · 2 comments
Assignees
Labels
internal An internal refactor or improvement

Comments

@MichaReiser
Member

MichaReiser commented Aug 5, 2023

The SimpleTokenizer supports backward lexing. That implementation will break once we support Python 3.12's new f-string parsing, because comments can now appear in places that look like the inside of a string:

f"test{more  # quote
}"
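To make the failure mode concrete, here is a minimal sketch of a single-line heuristic that decides whether a line contains a comment by tracking quotes. The function `naive_comment_start` is hypothetical and is not taken from the SimpleTokenizer; it only illustrates why a lexer that knows nothing about f-string replacement fields treats the `#` on the first line of the snippet above as string content.

```python
from typing import Optional


def naive_comment_start(line: str) -> Optional[int]:
    """Return the index of the first '#' outside of quotes, or None.

    Hypothetical heuristic for illustration only: it tracks quote
    characters but knows nothing about f-string replacement fields.
    """
    quote = None  # quote character of the string we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None  # the string ends here
        elif ch in "\"'":
            quote = ch  # a string starts here
        elif ch == "#":
            return i
        i += 1
    return None


# An ordinary trailing comment is found.
plain = naive_comment_start("x = 1  # quote")

# The first line of the f-string example: the '#' sits after an
# unclosed '"', so the heuristic believes it is inside a string.
fstring = naive_comment_start('f"test{more  # quote')
```

With Python 3.12's f-string grammar the `# quote` part is a real comment, so any trivia detection built on this kind of quote tracking misses it.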

My preferred option would be to remove backward lexing altogether, but I'm unsure how to support is_parenthesized_expression in the formatter without it. The problem is that we need to look back from the start of the expression. One option I can think of is to integrate the parenthesization detection into the CommentsVisitor, where we already track the parent nodes (necessary to avoid mistaking a for a parenthesized expression in call(a)). There we could store the last start/position and start lexing forward from it. This would require skipping over some tokens, which I'm not sure we can handle.
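A rough sketch of the forward-lexing idea, under strong assumptions: the helper `has_open_paren_before` and its parameters are hypothetical, `safe_start` is an offset known to lie outside any string or comment (for example a position recorded by the visitor), and the text between `safe_start` and the expression contains only whitespace, comments, and parentheses. Skipping real tokens in that region, which is the open question above, is not handled.

```python
def has_open_paren_before(source: str, safe_start: int, expr_start: int) -> bool:
    """Scan forward from safe_start and report whether an unmatched '('
    is still open when the expression at expr_start begins.

    Hypothetical sketch: assumes the scanned region holds only
    whitespace, comments, and parentheses.
    """
    depth = 0
    i = safe_start
    while i < expr_start:
        ch = source[i]
        if ch == "#":
            # Scanning forward from a known-safe offset, a '#' always
            # starts a comment; skip to the end of the line.
            newline = source.find("\n", i)
            i = expr_start if newline == -1 else newline
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        i += 1
    return depth > 0


# `a` is parenthesized; the comment contains a stray '(' that is skipped.
src = "x = (  # (c\n  a)"
parenthesized = has_open_paren_before(src, safe_start=4, expr_start=src.index("a)"))

# In call(a) the '(' belongs to the call. The caller knows this from the
# parent node and starts scanning after it, so `a` is not parenthesized.
call = "call(a)"
argument = has_open_paren_before(call, safe_start=5, expr_start=5)
```

The second case is why the parent node matters: the scan itself cannot tell a call's parenthesis from a grouping parenthesis, so the caller has to choose a start offset past it.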

The other alternative is to split the SimpleTokenizer into one implementation that only supports forward lexing and a separate BackwardTokenizer that supports backward lexing but takes the CommentRanges as a second argument.
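A minimal sketch of what passing comment ranges to a backward scanner buys, assuming the ranges are sorted, non-overlapping, half-open `(start, end)` offset pairs. The function `previous_non_trivia` is hypothetical and only models the trivia-skipping step, not a full tokenizer.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def previous_non_trivia(
    source: str, offset: int, comment_ranges: List[Tuple[int, int]]
) -> Optional[int]:
    """Return the index of the last character before `offset` that is
    neither whitespace nor inside a known comment range, or None.

    Hypothetical sketch: comment_ranges must be sorted, non-overlapping,
    half-open (start, end) pairs.
    """
    starts = [start for start, _ in comment_ranges]
    i = offset - 1
    while i >= 0:
        # Find the last comment that starts at or before i.
        k = bisect_right(starts, i) - 1
        if k >= 0 and i < comment_ranges[k][1]:
            i = comment_ranges[k][0] - 1  # jump to just before the comment
            continue
        if source[i].isspace():
            i -= 1
            continue
        return i
    return None


# The f-string example from this issue; the comment spans offsets 13..20.
source = 'f"test{more  # quote\n}"'
brace = source.index("}")

with_ranges = previous_non_trivia(source, brace, [(13, 20)])
without_ranges = previous_non_trivia(source, brace, [])
```

With the ranges supplied, the scan lands on the last character of `more`; without them it stops inside the comment text. Because the comment ranges come from a forward pass, the backward scanner never has to guess whether a `#` is inside a string.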

CC: @dhruvmanila

@charliermarsh charliermarsh added the internal An internal refactor or improvement label Aug 8, 2023
@dhruvmanila
Member

/cc @konstin who's working on this, thanks for taking this up!

@charliermarsh
Member

I believe this landed.

4 participants