BTW, I've bought both RegexBuddy and RegexMagic from JGS (author of the website you linked), so if you need me to test some regexes for you, I'll happily do it. Both tools emulate the major RegEx engines across their versions (so you can test backward compatibility issues with any engine), plus JGS's own custom engine, which is very powerful and is also documented on the website.
One of these two programs also lets you debug a RegEx step by step, in case you need to compare the expected behaviour of your code with the actual behaviour of other engines.
As for the shorthand classes to implement, it really depends on what your engine's goals are, which I'm guessing are mostly oriented toward lexer creation?
I'm not sure that \h and \v would be all that useful: vertical tabs are rarely used in Western languages, and \t should suffice in place of \h. These two also tend to have different meanings across engines.
In addition to character classes, there will also be shorthand character classes. However, I'm not quite sure yet which ones there should be and which characters they should cover.
According to this website, the different RegEx engines cover different characters in the shorthand character classes:
https://www.regular-expressions.info/shorthand.html
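As a small illustration of how much one engine's shorthand classes can vary, here is a sketch using Python's standard `re` module (chosen only as a convenient example engine, not as the target of this project). For `str` patterns, Python's `\d` covers Unicode decimal digits unless the `re.ASCII` flag is set, and its `\s` includes the vertical tab even in ASCII mode.

```python
import re

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, Unicode category Nd

# Default (Unicode) mode: \d matches any Unicode decimal digit.
unicode_digit = re.fullmatch(r"\d", arabic_three) is not None

# ASCII mode: \d is restricted to [0-9].
ascii_digit = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None

# Even in ASCII mode, Python's \s covers the vertical tab (\x0b).
vtab_is_space = re.fullmatch(r"\s", "\x0b", re.ASCII) is not None

print(unicode_digit, ascii_digit, vtab_is_space)
```

Differences like these are why the character set behind each shorthand needs to be decided explicitly rather than assumed.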
The current listing:
- `\d` for `[0-9]`
- `\D` for `[^\d]`
- `\t` for the tab character
- `\r` for carriage return (CR)
- `\n` for linefeed (LF)
- `\f` for form feed
- `\s` for `[ \t\r\n\f]`
- `\S` for `[^\s]`
- `\w` for `[A-Za-z0-9_]`
- `\W` for `[^\w]`
- `\h` for `[ \t]`
- `\v` for `[\r\n\f]`
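The listing above can be sketched as a lookup table, which makes it easy to check exactly which characters each shorthand covers. This is only an illustrative sketch in Python, not the project's implementation; the names `SHORTHAND`, `NEGATED`, `SINGLE` and `matches_shorthand` are hypothetical, and the sets are ASCII-only, as in the listing.

```python
import string

# Hypothetical tables mirroring the listing above (ASCII only).
DIGITS = set(string.digits)
WORD = set(string.ascii_letters) | DIGITS | {"_"}
SPACE = set(" \t\r\n\f")

# Shorthand letter -> set of characters it covers.
SHORTHAND = {
    "d": DIGITS,
    "w": WORD,
    "s": SPACE,
    "h": set(" \t"),
    "v": set("\r\n\f"),
}

# Negated shorthand letter -> the class it negates.
NEGATED = {"D": "d", "W": "w", "S": "s"}

# Escapes that stand for exactly one character.
SINGLE = {"t": "\t", "r": "\r", "n": "\n", "f": "\f"}


def matches_shorthand(escape: str, ch: str) -> bool:
    """Return True if character ch is covered by the escape letter (e.g. "d" for \\d)."""
    if escape in SHORTHAND:
        return ch in SHORTHAND[escape]
    if escape in NEGATED:
        return ch not in SHORTHAND[NEGATED[escape]]
    if escape in SINGLE:
        return ch == SINGLE[escape]
    raise ValueError(f"unknown shorthand: \\{escape}")
```

For example, `matches_shorthand("s", "\f")` is true under this listing, while `matches_shorthand("h", "\n")` is false. Keeping the sets in one table also means a later decision (say, dropping `\h` and `\v`) is a one-line change.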