Python: lex soft keywords #1994

Merged
merged 1 commit into from
Dec 29, 2021

amitkummer commented Dec 20, 2021

Some notes:

  • This approach is very similar to what @taleinat did in python/cpython#25851 (bpo-44010: "IDLE: colorize pattern-matching soft keywords"). Thank you, this was a huge help!
    It is not perfect, but it's rather simple, and I can't think of an edge case other than the `match _:` one described in the last note.

  • I did not use the `words` function to create the regex matching the
    keywords list, because it returns a capturing group (`()`) and it
    needs to be non-capturing here (because of `bygroups` usage).

  • I chose to go to the 'soft-keywords-inner' state after both
    `match` and `case`, even though it's unnecessary for `match`
    (the inner state catches the `_` wildcard keyword, which appears
    only after a `case`).

    This is mostly harmless and saves us from writing the 'soft-keywords'
    regex twice, once for `match` and once for `case`, with the extra
    inner state just for `case`.

    The only piece of code this will lex incorrectly is `match _:`
    (`_` will be lexed as a keyword). I doubt, though, that pattern
    matching will be used like this.
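
To make the last note concrete, here is a minimal sketch of the two-step idea using only the standard `re` module. It is not the lexer code from this PR: the function name `soft_keyword_tokens` and both regexes are simplified assumptions for illustration, and the first step deliberately omits the checks that rule out uses such as `match = 1`.

```python
import re

# Step 1 (playing the role of the 'soft-keywords' state): a soft keyword
# at the start of a line, after optional indentation.
# Simplified assumption: no lookahead to exclude `match = 1` and similar.
SOFT_KEYWORD = re.compile(r"^([ \t]*)(match|case)\b")

# Step 2 (playing the role of 'soft-keywords-inner'): a standalone `_`
# later on the same line. It is shared by `match` and `case`.
WILDCARD = re.compile(r"\b_\b")


def soft_keyword_tokens(line):
    """Return the substrings of `line` this sketch would mark as keywords."""
    head = SOFT_KEYWORD.match(line)
    if head is None:
        return []
    found = [head.group(2)]
    # Because step 2 runs after both keywords, `_` is also picked up
    # after `match`, which is the edge case described above.
    tail = WILDCARD.search(line, head.end())
    if tail is not None:
        found.append(tail.group(0))
    return found


print(soft_keyword_tokens("    case _:"))
print(soft_keyword_tokens("match _:"))
print(soft_keyword_tokens("match command:"))
```

In this sketch `match _:` yields both `match` and `_`, mirroring the one input the notes say is lexed incorrectly, while `case foo_bar:` yields only `case` because the underscore there is part of an identifier.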

Fixes #1797.
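
On the `words` / `bygroups` note above: the general concern is that helpers which address regex groups by position are sensitive to any extra capturing group. The sketch below shows that effect with plain `re`; the keyword list, the `compile_rule` helper, and the trailing third group are assumptions made only to make the shift visible, not the regex used in this PR.

```python
import re

KEYWORDS = ["if", "else", "while"]  # stand-in list, for illustration only

capturing = "(" + "|".join(KEYWORDS) + ")"        # adds a numbered group
non_capturing = "(?:" + "|".join(KEYWORDS) + ")"  # adds no numbered group


def compile_rule(alternation):
    # Intended groups: 1 = indentation, 2 = soft keyword, 3 = rest of line.
    # The alternation sits inside a negative lookahead between 2 and 3.
    return re.compile(
        r"^([ \t]*)(match|case)\b(?![ \t]*" + alternation + r"\b)(.*)$"
    )


with_capture = compile_rule(capturing)
without_capture = compile_rule(non_capturing)

line = "match x:"
# With the capturing alternation there are four groups, and group 3 is the
# (unset) lookahead group rather than the rest of the line.
print(with_capture.groups, with_capture.match(line).group(3))
# With the non-capturing alternation, group 3 is the rest of the line.
print(without_capture.groups, without_capture.match(line).group(3))
```

Anything that maps the first, second, and third groups to token types in order would read the wrong group in the capturing variant, which is why a non-capturing alternation is the safer choice in such rules.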

@Anteru Anteru added this to the 2.11.0 milestone Dec 29, 2021
@Anteru Anteru merged commit 09589f6 into pygments:master Dec 29, 2021
Anteru commented Dec 29, 2021

Merged, thanks a lot!

@amitkummer amitkummer deleted the python-pattern-matching branch December 29, 2021 15:36
Successfully merging this pull request may close these issues.

support new Python "soft keywords": match, case and _