highlight Python docstrings #49
At the moment this is not possible due to pattern invariance, i.e. the start and end patterns consist of the same string, which is ambiguous. And what is possible won't be that great, for example:
{"py", "(\"\"\"[!\"\"\"]*)|(^[ \t]*\"\"\".*$)", {4 | SYN_IT}, {0}, 2},
In this case it will only work if the starting string does not contain [ \t] and the ending pattern has it. Otherwise priority is always given to the ending pattern because of the shortest-match-first regex rule, which won't trigger the start of a multiline block. No matter how you look at it, the patterns are the same (or theoretically could be the same).
One solution is to provide a way for the user to specify the starting pattern such that the start and end are the same match group; since they will be the same, the second match will terminate the blockhl. I have to revisit this in the future. |
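The idea of letting the second match of the same delimiter terminate the block can be sketched outside the editor. The following is a minimal Python illustration, not nextvi's implementation: the function name `docstring_lines` and the sample input are hypothetical, and the state is tracked per line rather than per regex match group.

```python
DELIM = '"""'

def docstring_lines(lines):
    """Return the indices of lines that touch a triple-quoted block."""
    inside = False
    marked = []
    for i, line in enumerate(lines):
        count = line.count(DELIM)
        # A line is highlighted if a block is open or a delimiter appears on it.
        if inside or count:
            marked.append(i)
        # Each delimiter flips the state, so an odd count toggles it.
        if count % 2 == 1:
            inside = not inside
    return marked

sample = [
    'def f():',
    '    """',
    '    Summary line.',
    '    """',
    '    return 1',
    'x = """one line"""',
]
print(docstring_lines(sample))
```

Because the opening and closing delimiters are identical, the only information available is how many have been seen so far; that is why a single toggling state is enough here, and also why text before the first or after the last delimiter on a line cannot be excluded by this approach alone.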
Thank you.
On Sat, Aug 19, 2023 at 12:19 AM Kyryl ***@***.***> wrote:
At the moment this is not possible due to pattern invariance, i.e. the
start and end pattern consist of the same string pattern which is
ambiguous. And what's possible won't be that great, for example:
{"py", "(\"\"\"[!\"\"\"]*)|(^[ \t]*\"\"\".*$)", {4 | SYN_IT}, {0}, 2},
In this case it will only work if starting string does not contain [ \t]
and the ending pattern has it. Otherwise the priority is always given to
the ending pattern due to shortest match first regex rule which won't
trigger a start of a multiline block.
No matter how you look at it the patterns are the same (or could be the
same theoretically).
One solution is to provide a way for user to specify the starting pattern
such that the start and end are the same match group, and since they will
be the same the 2nd match will terminate the blockhl.
This will be enough to terminate the block, but I'm not sure if
highlighting """.* and .*""" will work. That's the case where everything
before the first """ and after the last """ must not match .
I have to revisit this in the future.
--
Claude Marinier
|
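Hi Claude, I've made a promising change. Pull the last commit and try this pattern out:
{"/", "\"{3}.*\"{3}", {5 | SYN_IT}, {0}},
{"/", "((?:[!\"\"\"]*\"{3}\n$)|(?:^\"{3}.*)|\"{3})", {4 | SYN_IT}, {0}, -1},
It comes pretty close to what you want, here are the test cases: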
|
On 2023-08-19, Kyryl wrote:
> Hi Claude,
>
> I've made a promising change. Pull the last commit and try
> this pattern out:
>
> {"/", "\"{3}.*\"{3}", {5 | SYN_IT}, {0}},
> {"/", "((?:[!\"\"\"]*\"{3}\n$)|(?:^\"{3}.*)|\"{3})", {4 | SYN_IT}, {0}, -1},
Are these two patterns working together? It looks like the first
one will detect a single line docstring, but not trigger on a
multi-line docstring. I am assuming the .* will not include
newlines.
In more general contexts, some have suggested this pattern. Note
the use of the non-greedy *? which nextvi supports.
"""[\s\S]*?"""
I think nextvi's regex does not support meta characters, right?
If so, is there an equivalent to the [\s\S] ? There remains the
difficulty of the identical docstring start and end markers.
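To make the behaviour of the suggested pattern concrete, here is a small sketch using Python's own `re` module (not nextvi's regex engine, whose rules differ). It compares the non-greedy form with a greedy form and with a plain `.`, which does not cross newlines by default. The sample source text is made up for illustration.

```python
import re

source = '''def f():
    """First
    docstring."""
    return 1

def g():
    """Second docstring."""
'''

# Non-greedy: stops at the nearest closing delimiter.
lazy = re.compile(r'"""[\s\S]*?"""')
# Greedy: runs to the last delimiter in the text.
greedy = re.compile(r'"""[\s\S]*"""')
# Plain dot: cannot span lines unless re.DOTALL is given.
dot = re.compile(r'""".*?"""')

lazy_matches = lazy.findall(source)
greedy_matches = greedy.findall(source)
dot_matches = dot.findall(source)

print(len(lazy_matches), len(greedy_matches), dot_matches)
```

The non-greedy form finds each docstring separately, the greedy form swallows everything between the first and last delimiter, and the plain-dot form only sees the single-line docstring.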
The second pattern has three options: a line ending with """, a
line beginning with """, and just """. Is the order of these
three options significant?
> It comes pretty close to what you want, here are the test cases:
This covers all the common uses of a docstring (in my limited
experience). I could run into imaginative uses of docstrings.
:-)
Excellent! Thanks.
P. S. I added the following to the Python section of conf. I
like to highlight docstrings as comments.
{"py", "\"{3}.*\"{3}", {2 | SYN_IT}, {0}},
{"py", "((?:[!\"\"\"]*\"{3}\n$)|(?:^\"{3}.*)|\"{3})", {2 | SYN_IT}, {0}, -1},
|
Yes, multiline patterns are those that contain a nonzero value for the blkend struct member.
Right, [\s\S] is not supported because the same effect can be achieved by entering the character values directly into [].
The order is significant: what comes first gets tested first.
Aye, I think I will add this to the base version, as it demonstrates a single multiline pattern group. It's only valid if blkend is negative. -Kyryl |
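The point that alternation order matters can be seen in isolation with a tiny example. This sketch uses Python's `re` module, which also tries alternatives left to right; it only illustrates the general principle and makes no claim about the other matching rules of nextvi's engine. The sample text is hypothetical.

```python
import re

text = '"""summary'

# The bare delimiter comes first, so it wins even though a longer match exists.
short_first = re.match(r'"""|"""\w+', text).group()

# With the longer alternative first, the whole token is matched.
long_first = re.match(r'"""\w+|"""', text).group()

print(repr(short_first), repr(long_first))
```

This is why the bare `\"{3}` alternative sits last in the pattern above: placed earlier, it would shadow the more specific alternatives.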
Since triple double-quotes are also used for special strings, I
need to reconsider this. That makes this more difficult: a
docstring is a bare string appearing at the beginning of a file
and following def, class, etc.
https://peps.python.org/pep-0257/#what-is-a-docstring
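The distinction drawn by PEP 257 depends on where the string sits in the program, which a parser can see and a line-based pattern cannot. As an illustration only (this is not something a regex highlighter can do), Python's standard `ast` module separates docstrings from ordinary triple-quoted strings. The sample source is made up.

```python
import ast

source = '''
"""Module docstring."""

def f():
    """Function docstring."""
    text = """Just a
    multiline string."""
    return text
'''

tree = ast.parse(source)
docstrings = {}
for node in ast.walk(tree):
    # Only these node types can carry a docstring.
    if isinstance(node, (ast.Module, ast.FunctionDef,
                         ast.AsyncFunctionDef, ast.ClassDef)):
        doc = ast.get_docstring(node)
        if doc is not None:
            docstrings[getattr(node, "name", "<module>")] = doc

print(docstrings)
```

The string assigned to `text` uses the same delimiters but is not reported, because it is not the first statement of a module, function, or class.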
|
Ah, I see. If the docstring must be preceded by ): maybe a different approach could be used. |
My idea of treating docstrings differently from other triple-quoted strings
is not pythonic and deviates from what vim and emacs are doing. I was
wrong. Triple-quoted strings are meant for multiline strings; docstrings
are multiline strings and often have the triple-quotes on separate lines.
The current solution works well enough for docstrings; as you pointed out,
it cannot handle situations where things precede or follow the
triple-quotes. It is starting to look like this is beyond the capabilities
of regular expressions. Python is an exception.
I will continue to think about this: what can a regular expression do that
produces an acceptable compromise?
Thank you for your help.
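For comparison, here is what a tokenizer sees in the cases that trouble the regex approach, namely code before or after the triple quotes on the same line. This is a sketch using Python's standard `tokenize` module, shown only to mark the gap between a tokenizer and a line-oriented pattern; the helper name `triple_quoted_spans` and the sample input are hypothetical, and f-strings are deliberately left out because newer Python versions tokenize them differently.

```python
import io
import tokenize

source = 'x = """one\ntwo"""; y = 1\nprint("""inline""", "plain")\n'

def triple_quoted_spans(code):
    """Return (start, end) positions of triple-quoted string tokens."""
    spans = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type != tokenize.STRING:
            continue
        # Strip an optional prefix such as r, b, or u before checking quotes.
        body = tok.string.lstrip("rbuRBU")
        if body.startswith(('"""', "'''")):
            spans.append((tok.start, tok.end))
    return spans

spans = triple_quoted_spans(source)
print(spans)
```

Each span is a pair of (row, column) positions, so the assignment before the first string and the `; y = 1` after it are excluded exactly.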
On Sat, Aug 19, 2023 at 3:58 PM Kyryl ***@***.***> wrote:
Ah, I see if docstring must be preceded by ): maybe a different approach
could be used.
--
Claude Marinier
|
Hi,
With the addition of the exclamation mark and equal in regex, it should be possible to highlight multiline docstrings. I have been experimenting and have not succeeded.
Has anyone done this? Does anyone have suggestions?
I thought this might work (simple case with double quotes only), but it does not.
{"py", "\"\"\"[!\"\"\"]*?\"\"\"", {2 | SYN_IT}},
I am using this as a stop-gap measure.
{"py", "^[ \t]*\"\"\".*$", {2 | SYN_BD | SYN_RV}},
Thanks.