missing matches in RE alternative cause UTF-8 decode error #171

kquick opened this issue Oct 20, 2019 · 0 comments


@kquick kquick commented Oct 20, 2019

If a match target appears in an alternative, an error is thrown:

$ ghci
Prelude> import Text.RE.PCRE.String
Prelude PCRE.String> r = [re|foo(A${here}(.*)B|C${there}(.*)D)|]
Prelude PCRE.String> allMatches ("foobar" *=~ r)
Prelude PCRE.String> allMatches ("fooAoneB" *=~ r)
[ Match {matchSource = "fooAoneB", .... *** Exception: utf8_correct_bs: UTF-8 decoding error
CallStack (from HasCallStack):
  error, called at ./Text/RE/ZeInternals/Types/Match.lhs:248:13 in regex-
Prelude PCRE.String> allMatches ("fooCtwoD" *=~ r)
[ Match {matchSource = "fooCtwoD", ... [same error]

This seems to be related to the branch where the match is not found:

PCRE.String> r = [re|foo(A${here}(.*)B|CD)|]
PCRE.String> allMatches ("foobar" *=~ r)
PCRE.String> allMatches ("fooAbarB" *=~ r)
... valid match, no error ...
PCRE.String> allMatches ("fooCD" *=~ r)
... error as above...

It's possible this is an invalid usage on my part, but I would expect a different type of error than a UTF-8 decoding error. Additionally, I originally had the same match name on both alternatives and got the same error, so I should have had a valid match regardless of which alternative matched.
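In case it helps with reproducing this outside of ghci, here is the second session as a standalone program. This is only a sketch: it assumes the module comes from the regex-with-pcre package and that the QuasiQuotes extension is enabled (needed for the `[re|...|]` syntax). The `main` wrapper and the `print` calls are my additions; the pattern and inputs are the same as above.

```haskell
{-# LANGUAGE QuasiQuotes #-}
-- Standalone form of the ghci session above.
-- Assumption: Text.RE.PCRE.String is provided by the regex-with-pcre package.
module Main (main) where

import Text.RE.PCRE.String

main :: IO ()
main = do
  -- The named capture ${here} exists only in the first alternative.
  let r = [re|foo(A${here}(.*)B|CD)|]
  -- No match at all: no error seen in ghci.
  print (allMatches ("foobar" *=~ r))
  -- First alternative matches: valid match, no error in ghci.
  print (allMatches ("fooAbarB" *=~ r))
  -- Second alternative matches: this is the case that raised the
  -- "utf8_correct_bs: UTF-8 decoding error" exception in ghci.
  print (allMatches ("fooCD" *=~ r))
```

The exception showed up in ghci only when the Match value was displayed, so the `print` calls are there to force the same evaluation.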

regex version
