Bracket Highlight does not detect word boundary for language using word as bracket #132162
Comments
It is difficult to not detect
In case of X/Y = Suggestions? |
@hediet One option would be for word-based block delimiters to require non-word boundaries (in regex, \W or \b, which treat a-z, A-Z, 0-9 and _ as word characters). This would restrict detection to keywords that stand alone or sit next to a non-word symbol (such as "("). For "(", detecting the symbol as is done now should be fine. I haven't seen any other odd behaviour so far, except that the highlighting should be disabled in comments. This does require slightly different matching rules for the boundaries surrounding a bracket pair. I'm not sure whether this is easily feasible with the current design of the bracket highlighter. But it seems reasonable, since these are two completely different ways of delimiting blocks in programming languages with a pair of symbols. Expecting the same matching logic to work for both seems difficult, as you showed.
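To make the suggestion concrete, here is a minimal sketch of the idea in TypeScript. It is not how the VS Code bracket highlighter is implemented; `escapeRegExp`, `bracketPattern` and `findBracket` are hypothetical helpers, and deciding "word bracket or not" with `/^\w+$/` is an assumption made for illustration.

```typescript
// Hypothetical helper (not a VS Code API): escape regex metacharacters.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern for one bracket token. Tokens made only of word
// characters (a-z, A-Z, 0-9, _) are wrapped in \b so they match only
// as whole words; symbol tokens such as "(" are matched as-is.
function bracketPattern(token: string): RegExp {
  const escaped = escapeRegExp(token);
  const isWordToken = /^\w+$/.test(token); // assumption: simple ASCII test
  return new RegExp(isWordToken ? `\\b${escaped}\\b` : escaped, "g");
}

// Return the start offset of every match of `token` in `line`.
function findBracket(line: string, token: string): number[] {
  const offsets: number[] = [];
  const pattern = bracketPattern(token);
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    offsets.push(match.index);
  }
  return offsets;
}

const sample = "class child extends parent; end (x)";
console.log(findBracket(sample, "end")); // [28]: the "end" inside "extends" is skipped
console.log(findBracket(sample, "("));   // [32]: symbols need no boundary check
```

Comments and strings are not handled here; in this sketch they would still produce matches.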
How would you decide which matching logic to use? Theoretically, there could even be brackets like We could use word-delimiters if and only if the bracket matches |
In the case of So, anytime there is a bracket character such as I am not familiar with unicode standard, so I cannot comment on the |
If you look at #132504 (comment), which I think is related to this, you can see a list of brackets with both word and non-word symbols in them taken from LaTeX, for example I would say you need three different strategies for brackets, depending on whether the bracket consists of:
For the first What do you think? |
I like the three way split. The How is |
I am a minor contributor to the VSCode-SystemVerilog extension and have been thinking about this issue since updating to the super speed brace matching. Mostly on whether there is a fix/workaround we could apply or whether we should remove the words as brackets altogether. For SystemVerilog, I believe @maxnordlund's solution would work for the words (i.e. 1. + To add a bit to the discussion here for SystemVerilog, per the standard there are two ways to specify an identifier:
But also note from the second regular expression that you could have a legal identifier in SystemVerilog using any printable ASCII character as long as it begins with |
Verification steps:
"[plaintext]": {
    "editor.language.colorizedBracketPairs": [
        ["class", "end"]
    ]
},
Use plain text mode and enable bracket pair colorization (editor.bracketPairColorization.enabled: true):
It works well with the "brackets" key already in The scenario I outlined above does cause it to slightly misidentify the keyword specifically for SystemVerilog: However, I suspect the use of escaped identifiers is minimal and this update covers the majority of scenarios for SystemVerilog. |
Sorry, I missed that comment. Unfortunately, our current design of the text-mate tokenizer only tracks comments/strings/regexp tokens semantically, which aren't really extendable. It is unlikely that we will introduce a mechanism to detect that However, feel free to open a new issue. |
Just to help decide if the complexity that I am a hardware designer and I have used SystemVerilog for many years (for multiple clients). The only time I saw the All coding style guides forbid this style of naming, since it may cause havoc with synthesis tools that do not expect this kind of naming scheme. Unless someone comes in with a compelling reason, supporting the |
Yep Michael. That is almost exactly where I've seen it before. In tool output where they express the physical netlist itself in Verilog. Most of the use of escaped identifiers looked to me as a way to maintain Verilog 1995/2001 backward compatibility where structs/enums/etc... are not supported. My suspicion is that this is necessary to keep the underlying engines happy as most of the core engine code in them dates back to the 80's. I agree that this style of destructuring is not something a hardware designer entering HDL into an editor would be using much of. The way I've solved/hacked it in the extension is to classify the escaped identifier as a regular expression as that appears to be a fenced off region already for the bracket matching and SystemVerilog has no regular expressions. I then recolor the token either in the settings or through semantic highlighting to make it appear "technically correct" as an identifier. |
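For anyone following along, here is a rough TypeScript sketch of why fencing off escaped identifiers helps. It is not the extension's actual implementation (that works through TextMate scopes); it only imitates the effect by blanking out escaped identifiers before a whole-word match. It assumes the SystemVerilog lexical rule that an escaped identifier starts with a backslash and runs up to the next whitespace, and `maskEscapedIdentifiers` and `findWordBracket` are hypothetical names.

```typescript
// Assumption: an escaped identifier is "\" followed by non-whitespace
// characters, terminated by whitespace.
const ESCAPED_IDENTIFIER = /\\\S+/g;

// Replace each escaped identifier with spaces of the same length so
// that offsets in the rest of the line stay unchanged.
function maskEscapedIdentifiers(line: string): string {
  return line.replace(ESCAPED_IDENTIFIER, (id) => " ".repeat(id.length));
}

// Whole-word search; `keyword` is assumed to contain word characters only.
function findWordBracket(line: string, keyword: string): number[] {
  const offsets: number[] = [];
  const pattern = new RegExp(`\\b${keyword}\\b`, "g");
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    offsets.push(match.index);
  }
  return offsets;
}

// The source text is: begin \my-end = 1; end
const source = "begin \\my-end = 1; end";

// "-" is a non-word character, so \b alone still flags the "end"
// inside the escaped identifier.
console.log(findWordBracket(source, "end")); // [10, 19]

// With escaped identifiers masked, only the real keyword remains.
console.log(findWordBracket(maskEscapedIdentifiers(source), "end")); // [19]
```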
@jecassis If your solution works with the full identifier naming rule, I will gladly take it. I was just warning against putting too much focus on a fringe case. I am a happy user of VSCode-SystemVerilog, so if the colouring works for tool-generated code, that is great. It will help me when debugging synthesis issues. Thank you for your work.
Issue Type: Bug
When using the SystemVerilog language, bracket colorization highlights part of a word that matches a bracket word.
The language uses begin..end as brackets, and every instance of "end" is highlighted.
The "end" in the word "extends" should not be highlighted.
Here is a small sample that triggers the issue for the words "class" and "end"
You can find the list of bracket words used in the language.
VS Code version: Code 1.60.0 (e7d7e9a, 2021-09-01T10:54:53.442Z)
OS version: Darwin x64 20.6.0
Restricted Mode: No
Remote OS version: Linux x64 3.10.0-1160.41.1.el7.x86_64
System Info
gpu_compositing: enabled
metal: disabled_off
multiple_raster_threads: enabled_on
oop_rasterization: enabled
opengl: enabled_on
rasterization: enabled
skia_renderer: disabled_off_ok
video_decode: enabled
webgl: enabled
webgl2: enabled