syntax highlighting : standard strings in Python #50

Closed
ghost opened this issue Aug 20, 2023 · 4 comments

ghost commented Aug 20, 2023

Hi,

The syntax highlighting pattern for standard Python strings fails with embedded escaped quotes.

These two patterns produce the expected highlighting. I used separate patterns because it is easier for me to read.

{"py", "[\"]([^\"]|\\\")*[\"]", {4}},
{"py", "[']([^']|\\')*[']", {4}},

I used the following strings to test this.

"testing \"this\" embedded double-quoted"
'testing \'this\' embedded single-quoted'

Before this change, the word "this" was white; now the whole string is blue. The only thing I changed was the number of backslashes.
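For comparison, here is a minimal sketch of a common way to match quoted strings with escaped quotes, written in Python's `re` syntax rather than for the editor's own regex engine. The names `DOUBLE_QUOTED` and `SINGLE_QUOTED` are hypothetical, and the patterns would still need C string escaping (and possibly syntax changes) before they could go into conf.c. The idea is to exclude both the quote and the backslash from the "ordinary character" class, and to treat a backslash plus any character as one unit.

```python
import re

# Hypothetical patterns in Python `re` syntax (not the editor's engine).
# String body: either a character that is neither the quote nor a
# backslash, or a backslash followed by any single character.
DOUBLE_QUOTED = re.compile(r'"(?:[^"\\]|\\.)*"')
SINGLE_QUOTED = re.compile(r"'(?:[^'\\]|\\.)*'")

samples = [
    r'x = "testing \"this\" embedded double-quoted"',
    r"y = 'testing \'this\' embedded single-quoted'",
]

for line in samples:
    for pattern in (DOUBLE_QUOTED, SINGLE_QUOTED):
        m = pattern.search(line)
        if m:
            # Prints the whole string literal, escaped quotes included.
            print(m.group(0))
```

Because the backslash is excluded from the ordinary-character class, the match stops at the first unescaped closing quote, so two separate strings on one line are matched separately.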

Some people use triple-quoted strings when they need to embed quotes, or they use one type of quote outside and the other type inside, but not always - sometimes they escape the clashing quotes.

If this makes sense, please consider updating conf.c.

Thanks.

P. S. I am trying to understand how the highlight structure's end and blkend work. Please give me some hints.

kyx0r (Owner) commented Aug 20, 2023

Hi,

> The syntax highlighting pattern for standard Python strings fails with embedded escaped quotes.

Yeah, that's probably because nobody ever bothered to make them better :D
For instance, the C quotes pattern is exceptionally robust. I rarely deal with Python in my work,
so it has had less of my attention.

> P. S. I am trying to understand how the highlight structure's end and blkend work. Please give me some hints.

I am in the process of making changes based on some ideas gained over the year, so the README is a bit out of date. It will be updated once I feel like I've exhausted my ideas.

Here is how end works: it's a 16-byte array of chars, and each char can take one of three states: -1, 0, or 1. The value determines where matching continues after a pattern match. The 16 entries correspond to the 16 groups that can be defined, which is more than enough.

0 means nothing special happens (the default): sp advances by the length of the matched group.
-1 means the pattern group that matched will be highlighted but otherwise treated as if it didn't match, so the string pointer is set to sp+1. Notice the +1 is necessary; otherwise we would be stuck in an infinite loop matching the same pattern over and over. So it's not exactly a no-match operation, more like a way to let other patterns pass through.
1 is the same concept as -1, except we advance to the starting location of the group match, i.e. if the match was (5,10), sp becomes sp+5.
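The three states can be sketched as a small function. This is only an illustration of the description above, not code from the project: `next_sp` and its parameters are hypothetical names, match offsets are assumed to be relative to sp as in the (5,10) example, and "the length of the matched group" for state 0 is read here as the end offset of the match.

```python
def next_sp(sp, group_start, group_end, end_flag):
    """Return where scanning resumes after a group matched.

    Hypothetical helper. group_start and group_end are offsets
    relative to sp, as in the (5, 10) example.
    """
    if end_flag == 0:
        # Default: skip past the matched group (assumed to mean
        # "resume at the end offset of the match").
        return sp + group_end
    if end_flag == -1:
        # Highlight, but step only one char so that other patterns
        # still get a chance to match the same text.
        return sp + 1
    if end_flag == 1:
        # Resume at the start of the matched group. How the real
        # code handles group_start == 0 is not described here.
        return sp + group_start
    raise ValueError("end_flag must be -1, 0, or 1")


# With a match at (5, 10) starting from sp = 0:
for flag in (0, -1, 1):
    print(flag, next_sp(0, 5, 10, flag))
```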

blkend tells which group is going to terminate the multiline block. If it's negative, the same group that started the block terminates the block. There are a few gotchas: for example, when the screen is moving up instead of down, it is always assumed that group 1 will do the termination.

Hope this helps. I know it's arcane, but the purpose is to have a minimal and lean implementation.
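As a rough sketch of that rule, assuming a non-negative blkend is simply the index of the terminating group (the function and parameter names are hypothetical, and the real code may differ in details not covered here):

```python
def terminating_group(blkend, start_group, scrolling_up=False):
    """Pick the group that ends a multiline block.

    Hypothetical helper that only mirrors the prose description.
    """
    if scrolling_up:
        # Gotcha: when the screen moves up, group 1 is assumed.
        return 1
    if blkend < 0:
        # Negative: the group that opened the block also closes it.
        return start_group
    # Assumption: otherwise blkend names the closing group directly.
    return blkend


print(terminating_group(-1, start_group=1))
print(terminating_group(2, start_group=1))
print(terminating_group(2, start_group=3, scrolling_up=True))
```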

ghost (Author) commented Aug 21, 2023 via email

kyx0r (Owner) commented Aug 21, 2023

> I have attached the old Python program which I have been using for testing;

Use the Null Pointer; I don't think GitHub reply attachments work (or you forgot to attach it).
https://0x0.st/

> I added extra strings near the top. The highlighting is quite good, but the result is sometimes bizarre: on line 591, the fd.write call uses a triple-quoted string; the string ends on line 616, but the highlighting continues until line 626. The highlighting changes as I move around in the file. Is that expected?

Moving the screen up won't highlight it correctly if the pattern is not fully enclosed (screen-wise).

Btw, my multiline pattern now looks like this:
{"py", "((?:[!\"\"\"]*\"{3}\n$)|(?:\"{3}[!\"\"\"]*)|\"{3})", {6}, {0}, -1},

Did you happen to test it? I think it's better because it handles this:

fsjfjsdj """ jdfsjfj

"""

jdfsjfj will be highlighted.
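To illustrate the behavior in that example, here is a minimal line-by-line sketch that toggles an "inside block" state on every `"""`. It is not the editor's algorithm and does not use the pattern above; `highlight_spans` is a hypothetical name, and the sketch ignores escaped quotes and `'''` strings.

```python
TRIPLE = '"""'


def highlight_spans(lines):
    """Return (line_no, start, end) spans inside triple-quoted blocks.

    Hypothetical helper: the delimiters themselves are included in
    the spans, and state carries over from one line to the next.
    """
    spans = []
    in_block = False
    for line_no, line in enumerate(lines):
        # A block carried over from a previous line starts at column 0.
        start = 0 if in_block else None
        pos = 0
        while True:
            idx = line.find(TRIPLE, pos)
            if idx == -1:
                break
            if in_block:
                # Closing delimiter: emit the span up to and including it.
                spans.append((line_no, start, idx + len(TRIPLE)))
                in_block = False
            else:
                # Opening delimiter: remember where the block starts.
                start = idx
                in_block = True
            pos = idx + len(TRIPLE)
        # Block still open at end of line: highlight the rest of the line.
        if in_block and len(line) > start:
            spans.append((line_no, start, len(line)))
    return spans


lines = ['fsjfjsdj """ jdfsjfj', '', '"""']
for line_no, start, end in highlight_spans(lines):
    print(line_no, repr(lines[line_no][start:end]))
```

On the three example lines this yields the spans `(0, 9, 20)` and `(2, 0, 3)`, so the text after the opening `"""` on the first line is covered. Running the same function on `lines[1:]` treats the closing `"""` as an opener, which gives one possible intuition for why a scan that starts partway through a block can highlight incorrectly.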

I use 6 for the color: a special string gets a separate color, I guess, ha ha.

-Kyryl

ghost (Author) commented Aug 21, 2023 via email

@kyx0r kyx0r closed this as completed Jul 15, 2024