UTF8 RegexFix doesn't correct capture's text #170

Open
chekoopa opened this issue Jul 17, 2019 · 1 comment

Comments

@chekoopa

commented Jul 17, 2019

convertMatchText in Text.RE.ZeInternals.Types.Match correctly fixes the captures' offsets and lengths, but capturedText is left untouched (at this very line you can see it is copied straight from the input). This may cause further problems when using the library, mostly with Text.RE.PCRE.Text.

The workaround is take (captureLength c) $ drop (captureOffset c) $ (captureSource c), but it's kind of lame. Incorporating similar code into RegexFix would make the fix transparent to users but may hurt performance.
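For anyone who wants to apply the workaround in one place, here is a minimal sketch. It assumes a simplified stand-in Capture record specialised to String, not the library's own type; fixCapturedText and example are hypothetical names introduced only for illustration, and the stale "afé" text is a made-up wrong value rather than actual library output.

```haskell
-- Hypothetical stand-in for the library's Capture record. The field names
-- mirror those used in the workaround above, specialised to String.
data Capture = Capture
  { captureSource :: String  -- the whole text that was searched
  , capturedText  :: String  -- the text of the capture (possibly stale)
  , captureOffset :: Int     -- offset of the capture, in characters
  , captureLength :: Int     -- length of the capture, in characters
  } deriving (Eq, Show)

-- Rebuild capturedText from the already-corrected offset and length,
-- using the same take/drop expression as the workaround.
fixCapturedText :: Capture -> Capture
fixCapturedText c = c { capturedText = slice }
  where
    slice = take (captureLength c) $ drop (captureOffset c) $ captureSource c

-- Illustrative capture: offset and length are correct, the text is not.
example :: Capture
example = Capture
  { captureSource = "naïve café"
  , capturedText  = "afé"   -- deliberately wrong, for illustration only
  , captureOffset = 6
  , captureLength = 4
  }
```

Applying fixCapturedText to example replaces the stale text with the slice of captureSource described by the offset and length. With Text instead of String the same idea would use the take and drop from Data.Text.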

@cdornan

Contributor

commented Jul 17, 2019

@chekoopa thanks for the clear analysis. I am far too busy to work on this at the moment but am amenable to carving out some time. The more demand there is, the sooner I am likely to get to this, so please shout if anybody needs this fixed.
