highlight Python docstrings #49

Closed
ghost opened this issue Aug 19, 2023 · 8 comments

ghost commented Aug 19, 2023

Hi,

With the addition of the exclamation mark and equal in regex, it should be possible to highlight multiline docstrings. I have been experimenting and have not succeeded.

Has anyone done this? Does anyone have suggestions?

I thought this might work (simple case with double quotes only), but it does not.

{"py", "\"\"\"[!\"\"\"]*?\"\"\"", {2 | SYN_IT}},

I am using this as a stop-gap measure.

{"py", "^[ \t]*\"\"\".*$", {2 | SYN_BD | SYN_RV}},

Thanks.

Owner

kyx0r commented Aug 19, 2023

At the moment this is not possible due to pattern invariance, i.e. the start and end patterns consist of the same string, which makes them ambiguous. And what is possible won't be that great, for example:

{"py", "(\"\"\"[!\"\"\"]*)|(^[ \t]*\"\"\".*$)", {4 | SYN_IT}, {0}, 2},

In this case it will only work if the starting string does not contain [ \t] and the ending pattern does. Otherwise priority is always given to the ending pattern because of the shortest-match-first regex rule, so the start of a multiline block is never triggered.

No matter how you look at it the patterns are the same (or could be the same theoretically).

One solution is to provide a way for the user to specify the starting pattern such that the start and end are the same match group, and since they will be the same, the 2nd match will terminate the blockhl.
This will be enough to terminate the block, but I'm not sure if highlighting """.* and .*""" will work. That's the case where everything before the first """ and after the last """ must not match .

I have to revisit this in the future.
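
To make the "2nd match terminates the block" idea concrete, here is a minimal sketch in C. It is not nextvi code: `count_triple_quotes` and `inside_after_line` are hypothetical helpers, and the sketch only tracks whether a line leaves us inside a block; it does not deal with which parts of the line get highlighted.

```c
#include <string.h>

/* Hypothetical helper, not part of nextvi: count non-overlapping
 * occurrences of the delimiter """ in one line. */
static int count_triple_quotes(const char *line)
{
	int n = 0;
	const char *p = line;

	while ((p = strstr(p, "\"\"\"")) != NULL) {
		n++;
		p += 3;	/* skip past this delimiter */
	}
	return n;
}

/* Given whether we were inside a block before this line, return whether
 * we are inside one after it. Every delimiter flips the state, so the
 * same token acts as opener and closer. */
static int inside_after_line(int inside, const char *line)
{
	if (count_triple_quotes(line) % 2)
		inside = !inside;
	return inside;
}
```

Feeding lines in order and carrying the returned state forward is enough to find where a block ends, even though the opener and closer are identical. A line with two delimiters, such as a single-line docstring, leaves the state unchanged.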

Author

ghost commented Aug 19, 2023 via email

Owner

kyx0r commented Aug 19, 2023

Hi Claude,

I've made a promising change. Pull the last commit and try this pattern out:

{"/", "\"{3}.*\"{3}", {5 | SYN_IT}, {0}},
{"/", "((?:[!\"\"\"]*\"{3}\n$)|(?:^\"{3}.*)|\"{3})", {4 | SYN_IT}, {0}, -1},

It comes pretty close to what you want, here are the test cases:

   ashds """ sdhash
123
asdas
asd
hasdhas """ sdhadh

Not quite: hasdhas and sdhash are not highlighted.

""" sdhash
123
asdas
asd
hasdhas """

All good.

""" sdhash
123
asdas
asd
hasdhas """df

Not quite: hasdhas is not highlighted, but the block is terminated.

""" sdhash
123
asdas
asd
"""

All good.

"""
123
asdas
asd
"""

All good.

absbdabs """ 123 asdas asd """ sdf s

All good.

""" 123 asdas asd """

All good.

hh """ 123 asdas asd """

All good.

""" 123 asdas asd """ hsdfhshd

All good.

Author

ghost commented Aug 19, 2023 via email

Owner

kyx0r commented Aug 19, 2023

> Are these two patterns working together? It looks like the first one will detect a single line docstring, but not trigger on a multi-line docstring. I am assuming the .* will not include newlines.

Yes, multiline patterns are those that contain a nonzero value for the blkend struct member.

> In more general contexts, some have suggested this pattern. Note the use of the non-greedy ? which nextvi supports. """[\s\S]*?""" I think nextvi's regex does not support meta characters, right? If so, is there an equivalent to the [\s\S] ?

Right, [\s\S] is not supported because it can be achieved by just hard entering the character values into [].
I don't remember exactly, but for example \s would be equivalent to something like [ \t-\r]
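
The [ \t-\r] suggestion can be checked with a small C sketch. It assumes \s is meant to cover the same characters as C's isspace() in the default "C" locale (space plus the range '\t' through '\r'); `in_space_set` and `count_mismatches` are hypothetical names used only for this check.

```c
#include <ctype.h>

/* Hypothetical helper: membership test for the bracket set [ \t-\r],
 * i.e. a space or any byte from '\t' (9) through '\r' (13). */
static int in_space_set(int c)
{
	return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Count the ASCII values on which the set and isspace() disagree. */
static int count_mismatches(void)
{
	int c, bad = 0;

	for (c = 0; c < 128; c++)
		if (in_space_set(c) != (isspace(c) != 0))
			bad++;
	return bad;
}
```

Under that assumption the two agree on every ASCII value, so the range form is a drop-in replacement inside a bracket expression.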

> There remains the difficulty of the identical docstring start and end markers. The second pattern has three options: a line ending with """, a line beginning with """, and just """. Is the order of these three options significant?

The order is significant, what comes first gets tested first.
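
As a loose illustration of "what comes first gets tested first", the C sketch below checks the three options described in the question in the same order and reports which one fires first. It is a simplification, not how nextvi's regex engine works: `first_matching_rule` is a hypothetical function that classifies a whole line (without its trailing newline) with plain string functions.

```c
#include <string.h>

/* Hypothetical classifier: returns 1, 2 or 3 for the first rule that
 * applies to the line, or 0 if none does.
 * 1: line ends with """   2: line begins with """   3: """ anywhere */
static int first_matching_rule(const char *line)
{
	size_t len = strlen(line);

	if (len >= 3 && strcmp(line + len - 3, "\"\"\"") == 0)
		return 1;
	if (strncmp(line, "\"\"\"", 3) == 0)
		return 2;
	if (strstr(line, "\"\"\"") != NULL)
		return 3;
	return 0;
}
```

A line consisting only of """ satisfies all three rules, and the result depends purely on which rule is listed first. Reordering the checks would change the answer for that line.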

> This covers all the common uses of a docstring (in my limited experience). I could run into imaginative uses of docstrings. :-) Excellent! Thanks. P.S. I added the following to the Python section of conf. I like to highlight docstrings as comments.

Aye, I think I will add this to the base version as it demonstrates a single multiline pattern group. It's only valid if blkend is negative.

-Kyryl

Author

ghost commented Aug 19, 2023 via email

Owner

kyx0r commented Aug 19, 2023

Ah, I see. If the docstring must be preceded by ): then maybe a different approach could be used.

Author

ghost commented Aug 20, 2023 via email

kyx0r closed this as completed Jul 15, 2024