Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing syntax behavior for single quote to allow binary and hex literal separator #3310

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion runtime/syntax/c.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,8 @@ rules:
end: "'"
skip: "\\\\."
rules:
- error: "..+"
# TODO: Revert back to - error: "..+" once #3127 is merged
- error: "[[:graph:]]{2,}'"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need it in c.yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work with something like

const char charArray[] = {'(', ')', ';', '{', '}', '/', '#', ',', '=', '<', '>'};

right now

- constant.specialChar: "\\\\([\"'abfnrtv\\\\]|[0-3]?[0-7]{1,2}|x[0-9A-Fa-f]{1,2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})"

- comment:
Expand Down
37 changes: 31 additions & 6 deletions runtime/syntax/cpp.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ filetype: c++

detect:
filename: "(\\.c(c|pp|xx)$|\\.h(h|pp|xx)?$|\\.ii?$|\\.(def)$)"
signature: "namespace|template|public|protected|private"
signature: "\\b(namespace|class|public|protected|private|template|constexpr|noexcept|nullptr|throw)\\b"

rules:
- identifier: "\\b[A-Z_][0-9A-Z_]*\\b"
Expand All @@ -29,12 +29,36 @@ rules:
- symbol.operator: "[-+*/%=<>.:;,~&|^!?]|\\b(sizeof|alignof|typeid|(and|or|xor|not)(_eq)?|bitor|compl|bitand|(const|dynamic|reinterpret|static)_cast)\\b"
# Parenthetical Color
- symbol.brackets: "[(){}]|\\[|\\]"

# Integer Literals
- constant.number: "(\\b([1-9][0-9']*|0[0-7']*|0[Xx][0-9a-fA-F']+|0[Bb][01]+)([Uu]?[Ll][Ll]?|[Ll][Ll]?[Uu]?)?\\b)"
- constant.number: "(\\b([0-9]|0[0-7]|0[Xx][0-9A-Fa-f]|0[Bb][01])([Uu][Ll]?[Ll]?|[Ll][Ll]?[Uu]?)?\\b)" # Base case (Without ' separtor)
- constant.number: "(\\b([1-9][0-9']*[0-9])([Uu][Ll]?[Ll]?|[Ll][Ll]?[Uu]?)?\\b)" # Decimal
- constant.number: "(\\b(0[0-7][0-7']*[0-7])([Uu][Ll]?[Ll]?|[Ll][Ll]?[Uu]?)?\\b)" # Oct
- constant.number: "(\\b(0[Xx][0-9A-Fa-f][0-9A-Fa-f']*[0-9A-Fa-f])([Uu][Ll]?[Ll]?|[Ll][Ll]?[Uu]?)?\\b)" # Hex
- constant.number: "(\\b(0[Bb][01][01']*[01])([Uu][Ll]?[Ll]?|[Ll][Ll]?[Uu]?)?\\b)" # Binary

# Decimal Floating-point Literals
- constant.number: "(\\b(([0-9']*[.][0-9']+|[0-9']+[.][0-9']*)([Ee][+-]?[0-9']+)?|[0-9']+[Ee][+-]?[0-9']+)[FfLl]?\\b)"
- constant.number: "(([0-9]?[.]?[0-9]+)([Ee][+-]?[0-9]+)?[FfLl]?\\b)" # Base case optional interger part with exponent base case
- constant.number: "(\\b([0-9]+[.][0-9]?)([Ee][+-]?[0-9]+)?[FfLl]?)" # Base case optional fractional part with exponent base case
- constant.number: "(([0-9]?[.]?[0-9]+)([Ee][+-]?[0-9][0-9']*[0-9])?[FfLl]?\\b)" # Base case optional interger part with exponent
- constant.number: "(\\b([0-9]+[.][0-9]?)([Ee][+-]?[0-9][0-9']*[0-9])?[FfLl]?)" # Base case optional fractional part with exponent

- constant.number: "(([0-9][0-9']*[0-9])?[.]?([0-9][0-9']*[0-9])+([Ee][+-]?[0-9]+)?[FfLl]?\\b)" # Optional interger part with exponent base case
- constant.number: "(\\b([0-9][0-9']*[0-9])+[.]([0-9][0-9']*[0-9])?([Ee][+-]?[0-9]+)?[FfLl]?)" # Optional fractional part with exponent base case
- constant.number: "(([0-9][0-9']*[0-9])?[.]?([0-9][0-9']*[0-9])+([Ee][+-]?[0-9][0-9']*[0-9])?[FfLl]?\\b)" # Optional interger part with exponent
- constant.number: "(\\b([0-9][0-9']*[0-9])+[.]([0-9][0-9']*[0-9])?([Ee][+-]?[0-9][0-9']*[0-9])?[FfLl]?)" # Optional fractional part with exponent

# Hexadecimal Floating-point Literals
- constant.number: "(\\b0[Xx]([0-9a-zA-Z']*[.][0-9a-zA-Z']+|[0-9a-zA-Z']+[.][0-9a-zA-Z']*)[Pp][+-]?[0-9']+[FfLl]?\\b)"
- constant.number: "(\\b0[Xx]([0-9a-zA-Z]?[.]?[0-9a-zA-Z]+)([Pp][+-]?[0-9]+)?[FfLl]?\\b)" # Base case optional interger part with exponent base case
- constant.number: "(\\b0[Xx]([0-9a-zA-Z]+[.][0-9a-zA-Z]?)([Pp][+-]?[0-9]+)?[FfLl]?)" # Base case optional fractional part with exponent base case
- constant.number: "(\\b0[Xx]([0-9a-zA-Z]?[.]?[0-9a-zA-Z]+)([Pp][+-]?[0-9][0-9']*[0-9])?[FfLl]?\\b)" # Base case optional interger part with exponent
- constant.number: "(\\b0[Xx]([0-9a-zA-Z]+[.][0-9a-zA-Z]?)([Pp][+-]?[0-9][0-9']*[0-9])?[FfLl]?)" # Base case optional fractional part with exponent

- constant.number: "(\\b0[Xx]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])?[.]?([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])+([Pp][+-]?[0-9]+)?[FfLl]?\\b)" # Optional interger part with exponent base case
- constant.number: "(\\b0[Xx]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])+[.]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])?([Pp][+-]?[0-9]+)?[FfLl]?)" # Optional fractional part with exponent base case
- constant.number: "(\\b0[Xx]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])?[.]?([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])+([Pp][+-]?[0-9][0-9']*[0-9])?[FfLl]?\\b)" # Optional interger part with exponent
- constant.number: "(\\b0[Xx]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])+[.]([0-9a-zA-Z][0-9a-zA-Z']*[0-9a-zA-Z])?([Pp][+-]?[0-9][0-9']*[0-9])?[FfLl]?)" # Optional fractional part with exponent

- constant.bool: "(\\b(true|false|NULL|nullptr|TRUE|FALSE)\\b)"

- constant.string:
Expand All @@ -47,9 +71,10 @@ rules:
- constant.string:
start: "'"
end: "'"
skip: "\\\\."
skip: "(\\\\.)|(\\b[1-9][0-9']+[0-9]|0[0-7']+[0-7]|0[Xx][0-9A-Fa-f][0-9A-Fa-f']+[0-9A-Fa-f]|0[Bb][01][01']*[01]\\b)"
rules:
- error: "..+"
# TODO: Revert back to - error: "..+" once #3127 is merged
- error: "[[:graph:]]{2,}'"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, this is buggy: 'ab' is no longer highlighted as an error. The point of the original "..+" regex is that any character literal containing more than one character is an error (unless it is an escaped character like '\n', which is covered by the skip regex).

Second, what is this for anyway? The skip regex should already ensure that numbers with single-quotes are skipped, so we should not need to mess with this error regex.

Actually I get what this is for: it is a nasty workaround for the bug in micro causing the skip regex being insufficient in some cases (e.g. 123'45 'a'). As I said previously, this bug seems to be fixed in #3127. I've checked that with #3127 + your PR + this error regex changed back to "..+" it works correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmaluka
Could you confirm this again? - error: "[[:graph:]]{2,}'" is highlighting 'ab' for me.

The whole point of this change is because "..+" is not working properly with

const std::unordered_set<char> CharTokens = {'(', ')', ';', '{', '}', '/', '#', ',', '=', '<', '>'};

I have not tried the change in #3127 but if that indeed fixes, I don't mind going back to "..+"
But that means this PR would need to be merged after it though otherwise the line I mentioned above will be highlighted as error

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've double-checked: with your PR as is, 'ab' is correctly highlighted as error. But with your PR applied on top of #3127 it is not. And then, after I replace your error: "[[:graph:]]{2,}'" with the original error: "..+", it is correctly highlighted as error again.

I think this "[[:graph:]]{2,}'" is obviously incorrect (the error regex should not include the single quote, since the single quote is not a part of the string that is highlighted as an error), and happens to somehow work correctly just because, as I said, there is a bug in micro.

- constant.specialChar: "\\\\([\"'abfnrtv\\\\]|[0-3]?[0-7]{1,2}|x[0-9A-Fa-f]{1,2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})"

- comment:
Expand Down
Loading