Lazy matching using ? not enabled in bracketed subexpressions #11

Open
nalimilan opened this issue Sep 8, 2013 · 0 comments
I ran into this while using R, which bundles a copy of TRE dating back to 2009. Apologies if it has been fixed since then.

The idea is that I want to remove the inner tag in the following expression, while keeping the outer one. I do not know beforehand what may appear in the tags after "class=" and "style=".

"ab"

Thus, the expected result is:

"ab"

The expression I tried is the following, and it works with PCRE. With TRE, the first .* is always greedy:

(gsub() matches the pattern given as its first argument against the string given as its third argument, and replaces each match with its second argument.)

gsub("(?U)(.*)", "\\1", "ab")
[1] "b"

gsub("(.*?)", "\\1", "ab")
[1] "b"

# Use PCRE instead of TRE

gsub("(.*?)", "\\1", "ab", perl=TRUE)
[1] "ab"
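For reference, here is a minimal sketch of the lazy versus greedy difference under PCRE, which is the behaviour the report expects from TRE as well. The input string, the tag names, and the attribute values are stand-ins chosen for illustration; they are assumptions, not the markup from the original report.

```r
# Hypothetical input: an outer tag wrapping two inner tags.
# Tag names and attribute values are illustrative assumptions.
x <- '<p class="outer"><span style="a">x</span><span style="b">y</span></p>'

# Lazy quantifiers: each inner tag is matched and unwrapped separately.
lazy <- gsub('<span style=".*?">(.*?)</span>', "\\1", x, perl = TRUE)

# Greedy quantifiers: the first .* runs on to the last '">' in the string,
# so a single match swallows both inner tags and only "y" is kept.
greedy <- gsub('<span style=".*">(.*)</span>', "\\1", x, perl = TRUE)

print(lazy)    # [1] "<p class=\"outer\">xy</p>"
print(greedy)  # [1] "<p class=\"outer\">y</p>"
```

Both calls pass perl=TRUE, so this only shows what the quantifiers should do; it says nothing about what the bundled TRE returns for the same patterns.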

Moreover, it looks like the parentheses around the second .* change the result:

gsub("(.*?)", "", "ab")
[1] ""

gsub(".*?", "", "ab")
[1] "b"

gsub("(?U)(.*)", "", "ab")
[1] ""

gsub("(?U).*", "", "ab")
[1] "b"
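Until lazy matching behaves as expected, one possible workaround is to avoid lazy quantifiers and use negated bracket expressions, which cannot run past the delimiter whichever engine is used. This is only a sketch: the input string is again a hypothetical stand-in, and it assumes the attribute value contains no double quote and the inner text contains no "<".

```r
# Hypothetical input; tag names and attribute values are assumptions.
x <- '<p class="outer"><span style="color: red">ab</span></p>'

# [^"]* stops at the closing quote and [^<]* stops at the next tag,
# so greedy matching is harmless and no ? modifier is needed.
pattern <- '<span style="[^"]*">([^<]*)</span>'

tre_result  <- gsub(pattern, "\\1", x)               # default engine (TRE)
pcre_result <- gsub(pattern, "\\1", x, perl = TRUE)  # PCRE

print(tre_result)   # [1] "<p class=\"outer\">ab</p>"
print(pcre_result)  # [1] "<p class=\"outer\">ab</p>"
```

This sidesteps the problem rather than fixing it, and it does not handle nested inner tags.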
