Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tcl lexer: \ and $ interpreted as errors fix, add string escape handling, adding test case (fixing issue #2686) #2728

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

trungtruong1
Copy link

@trungtruong1 trungtruong1 commented Jun 10, 2024

Update posted in the comments!

Good day! I am proposing a fix for the issue #2686.

I have added some changes to the lexer so that it no longer considers backslashes as errors. The lexer will now consider \ to be String.Escape. Standalone $ will be considered as Text.

Details changes
Path: pygments/lexers/tcl.py
Change

Result:
Before:
Before

After:
After

Tokens:

'set'         Keyword
' '           Text.Whitespace
'amount'      Text
' '           Text.Whitespace
'\\$'         Literal.String.Escape
'16.42'       Literal.Number.Float
'\n'          Text

'puts'        Name.Builtin
' '           Text.Whitespace
'lots'        Text
'\\n'         Literal.String.Escape
'of'          Text
'\\n'         Literal.String.Escape
'\\n'         Literal.String.Escape
'\\n'         Literal.String.Escape
'\\n'         Literal.String.Escape
'\\n'         Literal.String.Escape
'\\n'         Literal.String.Escape
'newlines'    Text
'\n'          Text

'set'         Keyword
' '           Text.Whitespace
'somevar'     Text
' '           Text.Whitespace
'{'           Keyword
'\n    '      Text.Whitespace
'This'        Name.Variable
' '           Text.Whitespace
'is'          Text
' '           Text.Whitespace
'a'           Text
' '           Text.Whitespace
'literal'     Text
' '           Text.Whitespace
'$ '          Text
'sign,'       Text
' '           Text.Whitespace
'and'         Text
' '           Text.Whitespace
'this'        Text
' '           Text.Whitespace
'\\}'         Literal.String.Escape
' '           Text.Whitespace
'escaped'     Text
'\n'          Text

'    '        Text.Whitespace
'brace'       Name.Variable
' '           Text.Whitespace
'remains'     Text
' '           Text.Whitespace
'uninterpreted' Text
'\n'          Text

'}'           Keyword
'\n'          Text

Please inform me if any alterations or modifications are necessary. I would greatly appreciate any feedback or comments!

Copy link

@McSinyx McSinyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just driving by, but since tests for Tcl are rather lacking, I'd suggest adding the relevant ones.

@@ -77,6 +77,8 @@ def _gen_command_rules(keyword_cmds_re, builtin_cmds_re, context=""):
(r'\(', Keyword, 'paren'),
(r'\[', Keyword, 'bracket'),
(r'\{', Keyword, 'brace'),
(r'\\(.)', String.Escape),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider also line break and character escape sequences, e.g.

puts \
\164\x68\u1eed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, thank you for the comment, I have made the changes required to address the issue.

pygments/lexers/tcl.py Outdated Show resolved Hide resolved
@trungtruong1
Copy link
Author

trungtruong1 commented Jun 13, 2024

Update new change:

Update_change

@McSinyx have pointed out that \ character at the end of a line is still considered as an error, I have added another rule to the lexer to address this.

Also, his example:

puts \
\164\x68\u1eed

has shown that my change is not considering the case of 16-bit Unicode escape sequences, 8-bit Hexadecimal escape sequences, and 8-bit Octal escape sequences properly.

This is the new output for the updated lexer:
image

@trungtruong1
Copy link
Author

Additionally, I'm adding a new file named test_string_escape.txt, which contains examples of string escapes related to this issue. This addition will enhance our test suite, as the current test coverage for the Tcl lexer is relatively limited.

@trungtruong1 trungtruong1 changed the title Fix for issue #2686 Tcl lexer: \ and $ interpreted as errors fix, add string escape handling (fixing issue #2686 and more) Jun 13, 2024
@trungtruong1 trungtruong1 changed the title Tcl lexer: \ and $ interpreted as errors fix, add string escape handling (fixing issue #2686 and more) Tcl lexer: \ and $ interpreted as errors fix, add string escape handling (fixing issue #2686 and adding test) Jun 13, 2024
@trungtruong1 trungtruong1 changed the title Tcl lexer: \ and $ interpreted as errors fix, add string escape handling (fixing issue #2686 and adding test) Tcl lexer: \ and $ interpreted as errors fix, add string escape handling, adding test case (fixing issue #2686) Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants