syntax highlighting : standard strings in Python #50
Yeah, that's probably because nobody ever bothered to make them better :D
I am in the process of making changes based on ideas gained over the year, so the readme is a bit out of date. It will be updated once I feel I've exhausted my ideas.
Here is how end works: it's a 16-byte array of chars, and each char can take one of 3 states: -1, 0, or 1. It determines where scanning continues after a pattern match. The 16 entries correspond to the 16 groups that can be defined, which is more than enough. 0 (the default) means nothing special happens: sp advances by the length of the matched group.
blkend tells which group terminates the multiline block. If it's negative, the same group that started the block terminates it.
There are a few gotchas. For example, when the screen is positioned by moving up instead of down, it is always assumed that group 1 does the termination.
Hope this helps. I know it's arcane, but the purpose is to have a minimal and lean implementation.
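For readers following along, here is a minimal sketch of the rules described in this comment. It is not the editor's actual code: the struct `hl_rule`, the functions `hl_block_terminator` and `hl_advance_default`, and the `scrolling_up` flag are hypothetical names introduced only for illustration. The meanings of the -1 and 1 states of end are not spelled out in this thread, so the sketch covers only the default (0) case.

```c
#include <stddef.h>

enum { HL_NGROUPS = 16 };  /* one slot per definable group, as described above */

/* Hypothetical stand-in for a highlight rule; not the editor's real struct. */
struct hl_rule {
    signed char end[HL_NGROUPS];  /* each entry is -1, 0, or 1 */
    signed char blkend;           /* group that terminates a multiline block */
};

/* Which group is expected to close a block that start_group opened. */
static int hl_block_terminator(const struct hl_rule *rule,
                               int start_group, int scrolling_up)
{
    if (scrolling_up)          /* gotcha: group 1 is assumed when moving up */
        return 1;
    if (rule->blkend < 0)      /* negative: the opening group also closes */
        return start_group;
    return rule->blkend;       /* otherwise blkend names the closing group */
}

/* Default case (end[group] == 0): advance sp by the matched group's length. */
static size_t hl_advance_default(size_t sp, size_t group_len)
{
    return sp + group_len;
}
```

With `blkend` set to -1, as in the multi-line rules posted in this thread, a block opened by group 3 would be closed by group 3 when scanning downward and by group 1 when the screen moves up.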
Hi,
As I continue to experiment, I am often surprised by the results. The order
changes the behaviour: putting the triple-quoted string patterns before the
single-quoted string patterns produces a better result. And I have returned
to the original patterns for the single-quoted strings. :-)
Here are the patterns I am using for Python.
/* python */
{"py", NULL, {14 | SYN_BD}, {1}, 0, 2},
{"py", NULL, {9}, {0}, 0, 1},
{"py", "#.*", {2}},
{"py",
"\\<(?:and|break|class|continue|def|del|elif|else|except|finally|\
for|from|global|if|import|in|is|lambda|not|or|pass|print|raise|return|try|while)\\>",
{5}},
{"py", "([a-zA-Z0-9_]+)\\(", {0, SYN_BD}},
/* triple-quoted strings */
{"py", "\"{3}.*?\"{3}", {4 | SYN_IT}, {0}}, /* works on one line */
/* multi-line experimentation */
{"py", "((?:[!\"\"\"]*\"{3}\n$)|(?:^\"{3}.*)|\"{3})", {4 | SYN_IT},
{0}, -1},
/* single-quoted strings */
{"py", "[\"](\\\\\"|[^\"])*?[\"]", {4}},
{"py", "['](\\\\'|[^'])*?[']", {4}},
I have attached the old Python program which I have been using for testing;
I added extra strings near the top. The highlighting is quite good, but the
result is sometimes bizarre: on line 591, the *fd.write* call uses a
triple-quoted string; the string ends on line 616, but the highlighting
continues until line 626. The highlighting changes as I move around in the
file. Is that expected? Is this all because multi-line is difficult?
I have not worked this much with C and regular expressions in a long time.
Have a good day.
P. S. I am using yesterday's master and building with the *build.sh*
script. My computer runs Debian 11 on an AMD64.
P. P. S. I am going to use the following since the result is the least
surprising. The triple quotes of multi-line strings are highlighted and the
contents are not; single-line strings are fully highlighted.
/* triple-quoted strings */
{"py", "\"{3}.*?\"{3}", {4 | SYN_IT}, {0}}, /* works on one line */
/* multi-line experimentation */
{"py", "(\"{3})", {4 | SYN_RV}},
--
Claude Marinier
Use the null pointer (https://0x0.st/); I don't think GitHub reply attachments work (or you forgot to attach it).
Moving the screen up won't get it right if the pattern is not fully enclosed (screen-wise).
Btw, my multiline pattern now looks like this:
{"py", "((?:[!\"\"\"]*\"{3}\n$)|(?:\"{3}[!\"\"\"]*)|\"{3})", {6}, {0}, -1},
Did you happen to test it? I think it's better because it handles this:
fsjfjsdj """ jdfsjfj
"""
jdfsjfj will be highlighted.
I use 6 for the color: a separate color for a special kind of string, I guess, ha ha.
-Kyryl
File uploaded: https://0x0.st/HLkh.py
On Mon, Aug 21, 2023 at 9:04 AM Kyryl ***@***.***> wrote:
> > I have attached the old Python program which I have been using for testing;
> Use null pointer, I don't think github reply attachments work (or you
> forgot to attach it).
> https://0x0.st/
> > I added extra strings near the top. The highlighting is quite good, but
> > the result is sometimes bizarre: on line 591, the *fd.write* call uses a
> > triple-quoted string; the string ends on line 616, but the highlighting
> > continues until line 626. The highlighting changes as I move around in the
> > file. Is that expected?
> Moving up screen up won't do it right if the pattern is not fully enclosed
> (screen wise)
> Btw, my multiline pattern now looks like this:
> {"py", "((?:[!\"\"\"]*\"{3}\n$)|(?:\"{3}[!\"\"\"]*)|\"{3})", {6}, {0}, -1},
> Did you happen to test it? I think its better because it handles this:
> fsjfjsdj """ jdfsjfj
> """
> jdfsjfj will be highlighted.
> I use 6 for color, a special string a separate color I guess ha ha.
> -Kyryl
--
Claude Marinier
Hi,
The syntax highlighting pattern for standard Python strings fails with embedded escaped quotes.
These two patterns produce the expected highlighting. I used separate patterns because it is easier for me to read.
{"py", "[\"]([^\"]|\\\")*[\"]", {4}},
{"py", "[']([^']|\\')*[']", {4}},
I used the following strings to test this.
"testing \"this\" embedded double-quoted"
'testing \'this\' embedded single-quoted'
Before this change, the word this was white; now the whole string is blue. I changed the number of backslashes.
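A note for anyone reusing these two patterns: once the C string escapes are removed, `\\\"` leaves the regex `\"`, which most engines read as a plain quote rather than as backslash-plus-quote. The alternation then accepts any character, so the match can run to the last quote on the line. The small sketch below illustrates this with POSIX `regcomp`/`regexec` as a stand-in; the editor's own regex engine may behave differently, and the helper `first_match` is a hypothetical name. The loose pattern is written with a bare `"` in place of `\"`, on the assumption that the two are equivalent, and the lazy `*?` from the later patterns is dropped because POSIX ERE does not support it.

```c
#include <regex.h>

/* Regex "([^"]|")*" : the alternation accepts any character, quotes included. */
static const char *LOOSE_PATTERN = "\"([^\"]|\")*\"";

/* Regex "(\\"|[^"])*" : a quote is consumed only when a backslash precedes it. */
static const char *ESCAPE_AWARE_PATTERN = "\"(\\\\\"|[^\"])*\"";

/*
 * Find the first match of an extended regex in text.
 * Stores the start offset in *start and returns the match length,
 * or returns -1 if the pattern fails to compile or does not match.
 */
static int first_match(const char *pattern, const char *text, int *start)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0) {
        *start = (int)m.rm_so;
        len = (int)(m.rm_eo - m.rm_so);
    }
    regfree(&re);
    return len;
}
```

On a line holding two strings, such as `x = "a \"b\" c" + "d"`, the loose pattern covers everything from the first quote to the last one, while the escape-aware pattern stops at the end of the first string. That is consistent with the later comment in this thread about returning to the original `(\\\\\"|[^\"])*?` form.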
Some people use triple-quoted strings when they need to embed quotes, or they use one type of quote outside and the other type inside, but not always - sometimes they escape the clashing quotes.
If this makes sense, please consider updating conf.c.
Thanks.
P. S. I am trying to understand how the highlight structure's end and blkend work. Please give me some hints.