machtigBE

Changing \R to \r seems to fix the output of Unicode characters

@Mistralys
Owner

Hi, thanks for the pull request!

After doing a bit of research, I think I will add some tests first. The \R escape sequence should be the correct one, as it matches a range of newline characters rather than just the carriage return. I suspect that the solution can be found in the Unicode regex switch, but I will confirm it first.

@Mistralys
Owner

The official PCRE2 documentation has this to say:

"by default, the escape sequence \R matches any Unicode newline sequence. In 8-bit non-UTF-8 mode \R is equivalent to the following: (?>\r\n|\n|\x0b|\f|\r|\x85)"

http://www.pcre.org/current/doc/html/pcre2pattern.html#newlineseq

So in theory, the regex is correct. I will confirm :)
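
To illustrate the quoted passage, here is a minimal Python sketch of how the 8-bit form of \R could damage UTF-8 text. That this is the mechanism behind the reported output problem is an assumption; nothing in this thread confirms it. Python's `re` module has no \R escape, so the alternation from the documentation is spelled out by hand (with a plain non-capturing group standing in for the atomic group), and the function names are hypothetical.

```python
import re

# 8-bit, non-UTF-8 equivalent of \R, spelled out from the PCRE2 documentation.
# Assumption: a non-capturing group is close enough to the atomic group here.
NEWLINE_8BIT = re.compile(rb"(?:\r\n|\n|\x0b|\f|\r|\x85)")

# Unicode-aware counterpart: U+0085, U+2028 and U+2029 are matched as
# code points, not as individual bytes.
NEWLINE_UNICODE = re.compile(r"(?:\r\n|[\n\x0b\f\r\x85\u2028\u2029])")


def split_lines_bytes(data: bytes) -> list:
    """Split raw bytes on the 8-bit newline set (hypothetical helper)."""
    return NEWLINE_8BIT.split(data)


def split_lines_text(text: str) -> list:
    """Split decoded text on Unicode newline sequences (hypothetical helper)."""
    return NEWLINE_UNICODE.split(text)


# U+00C5 (A with ring above) is encoded in UTF-8 as the two bytes C3 85.
sample = "\u00c5\nB"

byte_parts = split_lines_bytes(sample.encode("utf-8"))
text_parts = split_lines_text(sample)

print(byte_parts)  # the character is cut in half at the 0x85 byte
print(text_parts)  # the character survives; only the real newline splits
```

In byte mode the 0x85 byte inside the encoded character is treated as a newline, so the character is split in two and can no longer be decoded. In Unicode mode the same input is split only at the real line break.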

@Mistralys
Owner

Hi @machtigBE,

I have pushed some changes and published a new release that fixes the issue.

@Mistralys Mistralys closed this May 31, 2023
@machtigBE
Author

Works perfectly, thank you!

@machtigBE machtigBE deleted the patch-1 branch May 31, 2023 19:01