Hello! First, thanks for this great library - this is an impressive feat!
I needed an equivalent of https://golang.org/pkg/regexp/#Regexp.FindAllString, which ideally would be part of this library but doesn't exist today. I took a stab at implementing it (without the n parameter):
At first glance this seemed correct and appeared to work. However, I realized that it is in fact incompatible with Unicode, because match.Length appears to report the length in runes, not bytes. I'm not sure whether Capture.Index reports bytes or runes either, and the docs don't define this:
// the position in the original string where the first character of
// captured substring was found.
Index int
// the length of the captured substring.
Length int
From testing, it appears that Capture.Index is, oddly, in bytes and not runes. A corrected implementation is:
 func regexp2FindAllString(re *regexp2.Regexp, s string) []string {
 	var matches []string
 	for {
 		match, _ := re.FindStringMatch(s)
 		if match == nil {
 			break
 		} else {
 			matches = append(matches, match.String())
-			s = s[match.Index+match.Length:]
+			s = s[match.Index+len(match.String()):]
 		}
 	}
 	return matches
 }
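To illustrate why mixing rune counts and byte offsets breaks string slicing in Go, here is a minimal sketch that uses only the standard library. It does not call regexp2 at all. The helper name runeOffsetToByteOffset is hypothetical, and the sketch assumes the offset being converted is counted in runes from the start of the string.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeOffsetToByteOffset (hypothetical helper) converts an offset counted
// in runes into the matching byte offset in s. It returns len(s) when
// runeOffset is at or past the end of the string.
func runeOffsetToByteOffset(s string, runeOffset int) int {
	if runeOffset <= 0 {
		return 0
	}
	seen := 0
	// Ranging over a string yields the byte offset at which each rune starts.
	for byteOffset := range s {
		if seen == runeOffset {
			return byteOffset
		}
		seen++
	}
	return len(s)
}

func main() {
	s := "héllo wörld"

	// len counts bytes, RuneCountInString counts runes.
	fmt.Println(len(s), utf8.RuneCountInString(s)) // prints "13 11"

	// "wörld" starts at rune 6 and is 5 runes long. Go slices strings by
	// byte, so both ends are converted before slicing.
	start := runeOffsetToByteOffset(s, 6)
	end := runeOffsetToByteOffset(s, 6+5)
	fmt.Println(s[start:end]) // prints "wörld"
}
```

Slicing with s[6:11] directly would cut at byte positions instead, which in this string lands one byte before the "w" and stops in the middle of the word.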
This brings me to my points of feedback:
Index in bytes and Length in runes is an odd inconsistency; I imagine they should use the same unit.
The docstrings should ideally clarify this.
It would be great if the library exposed a FindAllString implementation.
Thanks again for the great library!
I'm going to close this issue since this looks like a pretty simple function to implement. However, I will update the README.md to give an example usage of FindNextMatch.