[\s\S] doesn't seem to work #4
Comments
It looks like the same issue as https://bugs.freepascal.org/view.php?id=34130
In this implementation of regexpr you cannot use 'not-space' inside a character class. You can invert a character class, as in [^a-z], but I do not understand what you want to do. For example, the expression '(?m)Test:\s*(.*?)\s;' will be found in the input text 'Test: hel'#$d#$a'lo ;' and returns 'Test: hel'#$d#$a'lo ;'
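The problem with using "(.?)" instead of "([\s\S]?)" is that it doesn't return results that include line breaks, even with the multiline flag enabled (as demonstrated in my second to last screenshot above).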
(.) actually works. I even added a test for it ((?m) switches on the multiline flag): 3bf2b36
But of course you need '*' because there is more than one char in your example.

Please help me reproduce the problem and I will fix it. The best way is to give me, or send as a GitHub pull request, a test that fails but that you think should pass.

[\s\S] just won't work because TRegExpr does not handle metachars inside a char class (square brackets '[]'). TRegExpr expects just chars or intervals (like [a-z]) inside a char class.

22.10.2018, 16:09, "Twigpig" <notifications@github.com>:
The problem with using "(.?)" instead of "([\s\S]?)" is that it doesn't return results that include line breaks, even with the multiline flag enabled (as demonstrated in my second to last screenshot above).
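For readers comparing engines, here is a minimal sketch of the behavior the reporter expected, written against Python's `re` module rather than TRegExpr. It assumes the asterisks in the quoted patterns were lost in formatting, so the lazy `*?` quantifier is restored; the helper `capture` and the variable names are hypothetical and only serve the illustration.

```python
import re

# Assumption: the patterns quoted in this thread lost their asterisks,
# so the lazy quantifier `*?` is restored here for illustration.
CLASS_PATTERN = r"Test:\s*([\s\S]*?)\s;"
DOT_PATTERN = r"Test:\s*(.*?)\s;"


def capture(pattern, text, flags=0):
    """Return the first captured group, or None when there is no match."""
    m = re.search(pattern, text, flags)
    return m.group(1) if m else None


single_line = "Test: hello ;"
multi_line = "Test: hel\r\nlo ;"

# [\s\S] matches any character, including line breaks.
class_single = capture(CLASS_PATTERN, single_line)
class_multi = capture(CLASS_PATTERN, multi_line)

# In Python, `.` does not match "\n" unless DOTALL is set;
# MULTILINE only changes how ^ and $ behave.
dot_multi = capture(DOT_PATTERN, multi_line)
dot_multi_m = capture(DOT_PATTERN, multi_line, re.MULTILINE)
dot_multi_s = capture(DOT_PATTERN, multi_line, re.DOTALL)
```

In this engine the multiline flag does not help the `.` version, while the dot-all flag makes it equivalent to `[\s\S]` for this input. Whether TRegExpr's modifiers behave the same way is not something this sketch shows.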
I was classing this as a bug because I wrongly expected this kind of syntax to be consistent across all variations of RegEx and other implementations of RegEx allow it. However it sounds like you're suggesting it's intentionally not supported in TRegExpr. If you don't mind me asking, is there a reason? Was it by design?

My aim was to produce a variation of the above RegEx that could be used in both TRegExpr and http://www.regexr.com/ but it seems that the required syntax for each is incompatible with the other (and I suppose that's okay).

Thank you for taking the time to look into this and getting back to me. Much appreciated.
I wrote this library 20 years ago, and at the time I implemented just the regex subset that I needed. I no longer use Pascal in my everyday life, so if we are going to continue development we need somebody with current Pascal skills who wishes to join TRegExpr development. In fact, that's the main reason why I published it on GitHub. Meanwhile I am going to fix bugs .. if you can find one, after 20 years of the library's exposure ;)
Ah, I see. So it's just something it doesn't currently handle, rather than something it shouldn't allow on principle? If that's the case, perhaps I'll implement the feature myself if I can find the time. Thanks again.
As far as I can see, there is no such thing in the POSIX basic standard. As for extensions, this is [:blank:] in POSIX, \s in vim and no such thing in perl. Or this is [:digit:] for POSIX and \d for vim and perl. Maybe the perl way is better because POSIX is too verbose.
It's very interesting, because the latest trunk version works correctly with UTF8 chars in Lazarus 2.0.3 / fpc 3.0.5, and there is no longer any need for the complicated UTF8<->unicodestring conversion. Thank you for this library!
\S \D \W not allowed in
We cannot add handling of \D, \W and \S here: too many chars would be needed in the param (65k minus a few).
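The "65k minus a few" figure can be checked with a short sketch. It is written in Python, not Pascal, and rests on an assumption that is my reading of the note rather than a confirmed detail of TRegExpr: that a character class is stored as an explicit list of 16-bit characters, so a negated shorthand such as \S would have to be expanded into every non-matching character. The names below are hypothetical.

```python
import re

# Every 16-bit code unit value, matching the "65k" figure in the note above.
ALL_CODE_UNITS = range(0x10000)

# Characters that Python's \s considers whitespace (a small set).
whitespace = [c for c in ALL_CODE_UNITS if re.fullmatch(r"\s", chr(c))]

# Expanding \S into an explicit list would need all the remaining characters.
non_whitespace_count = len(ALL_CODE_UNITS) - len(whitespace)


def is_non_space(ch):
    """Predicate form of \\S: no enumeration, just "not \\s"."""
    return re.fullmatch(r"\s", ch) is None


print(len(whitespace), non_whitespace_count)
```

The predicate at the end hints at one alternative design: testing class membership with a function instead of an enumerated list. Whether that fits TRegExpr's internals is an open question here.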
Using the regex "Test:\s*([\s\S]?)\s;" (without quotes, obviously) with an input of "Test: hello ;" correctly returns "hello" on other regex tools (e.g. http://www.regexr.com/) but returns no results using TRegExpr.
Using "Test:\s*(.?)\s;" works for this case in TRegExpr but obviously wouldn't do the same job if you were using a multi-line input string.
vs
Unless I'm mistaken, the below should return "hell\nlo":