/(.(?2))((?<=(?=(?1)).))/ hangs and eats all available RAM #14935
Comments
From victor@drawall.cc
Created by @Grimy
How to reproduce
Expected behavior
Perl should immediately die with the following diagnostic:
This is the current behavior of almost everything that would cause a regex to
Actual behavior
Perl becomes unresponsive, allocating more and more RAM.
Perl Info
|
From @khwilliamson
I am unable to reproduce this. I've tried on both -DEBUGGING builds, and not, and in 5.22.2
The regex compiles in all builds I tried it as
Final program:
-- |
The RT System itself - Status changed from 'new' to 'open' |
From victor@drawall.cc
My bad, my “How to reproduce” section is wrong. When matched against
perl -e '"a"=~/(.(?2))((?<=(?=(?1)).))/'
I’ve just reproduced it on the latest blead.
I’d recommend doing a `ulimit -Sv 1048576` or some such before running |
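One way to follow the `ulimit` advice without lowering the limit for the whole shell session is to apply it only to a child process. A minimal sketch, assuming a POSIX `sh` whose `ulimit` accepts `-Sv` (as on Linux); `run_capped` is a hypothetical helper name, not part of perl:

```perl
use strict;
use warnings;

# Hypothetical helper: run a one-liner in a child perl whose soft
# virtual-memory limit is capped at $kb kilobytes, and return the raw
# wait status from system().
sub run_capped {
    my ($code, $kb) = @_;
    # "$0" is the perl binary, "$1" the code and "$2" the limit, so nothing
    # is interpolated into the shell command itself.
    my $sh = 'ulimit -Sv "$2"; exec "$0" -e "$1"';
    return system('sh', '-c', $sh, $^X, $code, $kb);
}

my $status = run_capped(q{"a" =~ /(.(?2))((?<=(?=(?1)).))/}, 1048576);
if ($status == -1) {
    print "could not start child: $!\n";
}
else {
    printf "exit code %d, signal %d\n", $status >> 8, $status & 127;
}
```

What the child reports depends on the perl being tested, so no particular exit code is assumed here; the point is only that a runaway match is stopped by the cap instead of exhausting the machine.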
From @demerphq
On 14 October 2015 at 08:05, Victor ADAM <victor@drawall.cc> wrote:
This should trigger an infinite recursion error.
yves
-- |
From victor@drawall.cc
Yes, it *should* trigger an infinite recursion error, but it actually doesn’t. |
From @khwilliamson
On 10/14/2015 03:11 AM, Victor ADAM wrote:
Thanks for the ulimit hint.
This is an area of the engine that I have never looked at, so it would
Matching REx "(.(?2))((?<=(?=(?1)).))" against "a" |
From @demerphq
On 14 October 2015 at 18:01, Karl Williamson <public@khwilliamson.com> wrote:
I have a fix. (Bit vector of visited nodes).
Exactly. I am still investigating. Might take me a few days, but I'll get there.
Yves |
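The "bit vector of visited nodes" idea can be illustrated with a toy model. This sketch is not the actual patch and does not reflect perl's internals; it assumes, hypothetically, that a recursive call is identified by a (group number, string position) pair, and that re-entering a pair that is still active means the recursion made no progress. The names `new_guard`, `enter`, and `leave` are invented for illustration.

```perl
use strict;
use warnings;

# One bit per (group, position) pair; the bit string starts empty and
# grows on demand as vec() assigns to it.
sub new_guard {
    my ($str_len) = @_;
    return { width => $str_len + 1, bits => '' };
}

# Mark the pair as active. Returns false if it was already active, which in
# this model means the recursion came back without consuming any input.
sub enter {
    my ($guard, $group, $pos) = @_;
    my $bit = $group * $guard->{width} + $pos;
    return 0 if vec($guard->{bits}, $bit, 1);
    vec($guard->{bits}, $bit, 1) = 1;
    return 1;
}

# Clear the pair again when the call returns.
sub leave {
    my ($guard, $group, $pos) = @_;
    vec($guard->{bits}, $group * $guard->{width} + $pos, 1) = 0;
    return;
}

# Hand-traced calls for /(.(?2))((?<=(?=(?1)).))/ against "a":
my $guard = new_guard(length "a");
my @trace = (
    [1, 0],   # group 1 starts at position 0 and consumes "a"
    [2, 1],   # (?2) is called at position 1
    [1, 0],   # the lookbehind steps back to 0 and calls (?1) again
);
for my $call (@trace) {
    my ($group, $pos) = @$call;
    if (enter($guard, $group, $pos)) {
        print "enter group $group at $pos\n";
    }
    else {
        print "group $group at $pos is already active: infinite recursion\n";
        last;
    }
}
```

A bit vector keeps the check cheap: testing and setting a pair is a single `vec` access, and memory use is bounded by the number of groups times the string length rather than by recursion depth.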
From @khwilliamson
On 10/14/2015 11:09 AM, demerphq wrote:
Yves++ |
From @khwilliamson
On 10/14/2015 11:09 AM, demerphq wrote:
How is this coming? |
From @demerphq
I'm really sorry, a series of personal issues, family in hospital type
If it would help I will try to find time to upload what
Yves
On 26 December 2015 at 06:24, Karl Williamson <public@khwilliamson.com> wrote:
-- |
From @khwilliamson
I've added this to the 5.24 blockers list. Yves, if you don't have the tuits for this, could you post what you've got so far? |
From @jkeenan
On Tue Oct 13 23:05:46 2015, victor@drawall.cc wrote:
And here it is at today's blead (on Linux x86_64):
$ ulimit -Sv 1048576 && ./perl -Ilib -e '"a"=~/(.(?2))((?<=(?=(?1)).))/'
-- |
From @demerphq
On 2 March 2016 at 00:58, James E Keenan via RT
Yeah, I know. This is still on my todo list.
All I can say is that I
Yves
-- |
From @demerphq
On 2 March 2016 at 11:01, demerphq <demerphq@gmail.com> wrote:
And this should be fixed now with:
This might need a doc fix. It is no longer illegal to do "infinite
so for instance this:
/a(?R)?b/
now matches "ab", "aabb", etc.
I am not sure this is the best wording, but now a pattern sub-routine
Eg:
"foobar"=~/(?&y)(?(DEFINE)(?<x>(?&y)?foo)(?<y>(?&x)?bar))/
matches. However:
"foobar"=~/(?&y)(?(DEFINE)(?<x>(?&y)foo)(?<y>(?&x)bar))/
does not, and produces output like this:
floating "foobar" at 0..9223372036854775807 (checking floating) minlen 6
So I think this ticket can be closed.
cheers,
-- |
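The case where recursion consumes input before calling itself can be checked directly. A small sketch, using an anchored numbered-group variant of `/a(?R)?b/` (the anchors and the `(?1)` form are additions for illustration, so that only the whole string counts as a match):

```perl
use strict;
use warnings;

# Each level of recursion consumes an "a" before calling itself, so the
# recursion always makes progress through the string.
my $nested = qr/\A(a(?1)?b)\z/;

for my $str ("ab", "aabb", "aaabbb", "aab", "") {
    printf "%-8s %s\n", qq{"$str"},
        $str =~ $nested ? "matches" : "does not match";
}
```

The patterns discussed in this ticket differ in exactly this respect: they reach the recursive call again at the same string position, so the sketch above says nothing about how they behave.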
From zefram@fysh.org
demerphq wrote:
This is broken behaviour, in multiple respects. Though we already have
If a recursion occurs without moving through the string, this does
"ab" =~ /a(?(1)z|(?(2)()|()))*\1b/
should match, with exactly two iterations of the * loop, each consuming
"axxb" =~ /a(?:x(?(1)z|(?(2)()|())))*\1b/
So perl has a bug here in failing to perform the earlier match.
Now let's consider cases that actually could recurse infinitely.
"ab" =~ /a(?:)*b/
Note that in actual *regular* expressions, in the sense of matching a
perl does have a problem with infinite iteration, because it uses an NFA
Incidentally, if we change the * to a *?, preferring lower numbers
We can muck about with the loop structure to get worse misbehaviour
"ab" =~ /a(?:()|())*\1\2b/
should match, and in fact has infinitely many ways to match. It requires
"ab" =~ /a(?(2)()|())*?\1\2b/
doesn't make perl recognise the match.
So there are many cases here where the extra logic put in to avoid hanging
Detecting something that would otherwise hang is nice, but not essential.
-zefram |
From @demerphq
On 8 March 2016 at 01:36, Zefram <zefram@fysh.org> wrote:
I understand your argument, but my reading of the docs says that it is
Specifically:
"To break the loop, the following
Except see the documentation I quote above.
I am sympathetic to what you say here. But there is a subtle difference,
Patterns are often supplied by the
So, to me, making every Perl program that allows user supplied regexes
Anyway, much of the dialog that you posted here IMO is unrelated to
Can you please reframe your dialog to be purely about regex recursion?
FWIW, I am open to restoring the old behavior, and making it a fatal
But I do think that /(?<foo>(?&foo)foo)/ infinite looping and then
Note that /(?<foo>a(?&foo)?b)/ works just fine.
The only problems are things like /(?<foo>(?&foo))/.
Thanks for your feedback.
I will say that independently of this, we *CAN* implement some form of
cheers,
-- |
From zefram@fysh.org
demerphq wrote:
It's almost precisely equivalent. The problems you're addressing in (?R)
Just using left recursion, not emulating a zero-length iteration, consider
"abbc" =~ /a((?:(?1)b)?)c/
Mathematically this does match, in exactly one way. But the naive
The left recursion can be rescued by changing the ? to ??, so preferring
Looking at some zero-length-iteration cases, I found some curious
"abc" =~ /a((?1)??)c/
5.22 signals "Infinite recursion" for this, not surprisingly, but
With unconditional recursion, that has no way to match using a finite
-zefram |
From @demerphq
On 8 March 2016 at 18:20, Zefram <zefram@fysh.org> wrote:
Ok, then I will restore the "Infinite recursion in regex" so we do not
Hmm.
Yes, me too. I am looking into why it doesn't.
Could you help me with this and post some tests you think should pass?
I am digging into the example you gave to figure out why it doesn't get
I appreciate that concern and I share it too.
cheers,
-- |
From @khwilliamson
In http://nntp.perl.org/group/perl.perl5.porters/235003, which got attached to the wrong ticket, it was proposed to possibly revert the patch. I do not believe that it is acceptable to eat up all the machine's available memory, which would be one consequence of that reversion. |
From @demerphq
On Thu Mar 10 10:33:39 2016, khw wrote:
I believe that this ticket can be closed now. Outside of an issue with Solaris, as far as I know I fixed the issues raised in this thread, including the ones from zefram.
and especially
If the OP concurs then I think this is done. |
From @khwilliamson
On Mon Apr 04 03:35:12 2016, demerphq wrote:
The OP has not responded one way or the other. I am taking this ticket, and if the OP continues to not respond, after at least 7 days, I will close the ticket
-- |
From @khwilliamson
Resolved, as I threatened earlier |
@khwilliamson - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#126182 (status was 'resolved')
Searchable as RT126182$