-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IDLE: highlight soft keywords #88176
Comments
As-per PEP-634, structural pattern matching is now in Python. This introduces the The problem is that these are listed in Do we change IDLE to use softkwlist, or move those keywords into kwlist? |
Hi, I'm no IDLE expert, but I think moving the new soft keywords into kwlist seems wrong: Soft keywords were added in Python 3.9 when the PEG parser became the default. The keyword list was also updated accordingly https://docs.python.org/3/library/keyword.html#keyword.softkwlist. This link provides an explanation of how soft keywords differ from normal keywords: https://docs.python.org/3.10/reference/lexical_analysis.html#soft-keywords Thanks |
Thanks for linking to the Lexical Analysis docs. Not quite sure how I missed this given it is directly below the normal keywords section. Given the distinction described there, it may instead be best for IDLE to highlight this as its own category (i.e. not grouping it with the standard keywords). |
match
/ case
syntax
Soft keywords are a huge nuisance for syntax highlighting as they need special case regexes and tests. Hard keywords are matched against complete words, regardless of whether the context is syntactically valid or not. If 'for' and 'else' were the only keywords, the keyword part of the IDLE colorizer regex would be as follows. >>> kw = r"\b" + colorizer.any("KEYWORD", ['for', 'else']) + r"\b"
>>> kw
'\\b(?P<KEYWORD>for|else)\\b' Both words in 'for.else' are highlighted as the tokenizer will see them as keywords. The parser will later see the combination as an error. The tag name in a "(?P<name>...) construct can only be used once in a regex. Since the word-boundary context is the same for all hard keywords, the alternation can be done within one such context and all (hard) keywords get the same match tag (dict key "KEYWORD"), making it easy to give all the same highlight. Soft keywords need different contexts to avoid false positives. 'match' and 'case' must be the first non-blank on a line and followed by ':'. '_' must follow 'case' and space. I believe each context will have to have its own tag name, so multiple keyword tags must be mapped to 'keyword'. skw1 = r"^[ \t]*(?P<SKEY1>match|case)[ \t]+:"
skw2 = r"case[ \t]+(?P<SKEY1>_)\b" Add skw1 and skw2 to the prog definition, which should use "|".join(...). In ColorDelegator.LoadTagDefs (which should be renamed), replace
with
"KEYWORD": keydef
"SKEY1": keydef
"SKEY2": keydef after first defining keydef with keydef = idleConf.GetHighlight(theme, "keyword") Some new tests will be needed. |
Terry, Elisha, does one of you intend to work on this? If not, I'd be willing to. |
I don't mind, would you like to Tal? (I probably won't be able to dedicate any serious time to it until mid-June). One thing I've been thinking is whether it's worth us highlighting regardless of context. For example, you can assign a variable to a builtin name (not that it's recommended) so we could just give soft keywords their own colour and (unofficially) recommend people don't use such words for variables. I think this would be more future-proof as we wouldn't need to update the regexes for each new soft keyword added. However, we might not want to highlight every time the user has an '_' variable (as is fairly common). |
I think it is rather crucial to have this with the 3.10 release. I'll try to get this working ASAP. I agree that a simple "good enough" solution could be a good start, but "_" will likely need special handling. |
My plan for the next day or two is to submit followup issue for Shell and formally code what I wrote. The only way to handle soft keywords correctly is with a custom re. I don't expect them to become common. They are different from builtins because they only have special meaning in (so far) definable situations. When builtin is 'redefined, it may or may not be appropriate to keep the highlight. Examples when it is: oldprint = print
def print(*args, **kwds:
log the print
oldprint(*args, **kwds)
def intsum(nums, int=int): # Localize int for speed.
<code that calls int multiple times> |
I agree with getting this in soon. A related request is to to syntax highlight field expressions in f strings. I don't think there is an existing issue. Apparently, at least some alternatives to IDLE do this. I am not sure I would really want it, but we need at least some mockups. Tal, what do you think and are you interested in trying to write a PR? |
Related, but separate, and IMO not quite as urgent. I can commit to working on this issue (soft keywords), but I'll have to see where things stand once this is finished before moving on to f-strings. |
I've created a PR (GH-25851) with a rather quick, working implementation. This includes some tests but I haven't thoroughly tested it yet. If anyone can take a look and give feedback on the approach, that would be great. |
Can we close this, or are we leaving it open for when (if) we do a colouriser refactor? |
IMO this can be closed, so I'm closing this. (Terry is welcome to reopen this if needed.) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: