Possible regexp memory explosion in 5.20.0 #13984
From @hvds
Created by @hvds
I've been experimenting with an attempt to take a SQL grammar expressed
The code below is (60%) cut down from an interim stage in that process;
Damian and I are looking into it, but he suggested I perlbug it as a
zen% ulimit -v # I've set a 1GB process-size limit
my $g = qr{ <rule: simple_Latin_letter> <simple_Latin_upper_case_letter> | <simple_Latin_lower_case_letter> }x;
Perl Info |
From DCONWAY@cpan.org
Further to Hugo's report...
I have now had the opportunity to investigate the problem, and have concluded that this has nothing to do with Regexp::Grammars per se, except that R::G is generating the enormous regex that 5.20 is failing to compile.
The attached example (constructed by removing all the R::G syntactic sugar from Hugo's original) does not make use of Regexp::Grammars at all, and still leaks endlessly under 5.20...whereas it compiles rapidly and without complaint under all of: 5.10.1
I hope this additional information may be of help in tracking down the regression.
Damian |
The RT System itself - Status changed from 'new' to 'open' |
From @arc
Damian Conway via RT <perlbug-followup@perl.org> wrote:
Yes, it looks that way to me too. Thanks for supplying that reduction.
The file attached cuts down this regex further still, by removing all
The symptoms I observed seem to be the same, though I also get a
In blead, it looks like the immediate culprit is study_chunk(), it
-- |
From @arc |
From @iabyn
On Mon, Jul 14, 2014 at 12:34:06AM +0100, Aaron Crane wrote:
It bisects to the following. Yves...?
commit 099ec7d
Fix RT #120600: Variable length lookbehind is not variable
-- |
From @demerphq
The only useful thing I have to add /right now/ is that I am glad I wrote a
On 14 July 2014 12:13, Dave Mitchell <davem@iabyn.com> wrote:
-- |
From @khwilliamson
On 07/14/2014 04:13 AM, Dave Mitchell wrote:
I'm curious as to how you bisected this. When I tried running Aaron's |
From @iabyn
On Mon, Jul 14, 2014 at 12:15:46PM -0600, Karl Williamson wrote:
I just started a new shell and did
$ ulimit -v 500000
then ran the bisect.
(I had to experiment for a minute or so to find a suitable limit that
-- |
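The approach described above (cap the process's virtual memory, then treat an out-of-memory failure as the "bad" outcome) can be sketched as a small harness. This is a hedged illustration in Python, not something used in the thread: `run_with_memory_limit` is a hypothetical helper, and it assumes a Unix-like system where `RLIMIT_AS` is enforced.

```python
import resource
import subprocess


def run_with_memory_limit(cmd, limit_kib):
    """Run cmd with its address space capped at limit_kib KiB; return its exit status."""
    limit_bytes = limit_kib * 1024  # `ulimit -v` counts in KiB, setrlimit in bytes

    def apply_limit():
        # Runs in the child just before exec. Only the soft limit is changed,
        # and it is never set above an existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = limit_bytes
        else:
            soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    # A non-zero status (including a negative one for death by signal)
    # means the command failed under the cap.
    return subprocess.run(cmd, preexec_fn=apply_limit).returncode
```

A bisect test script could call this with the perl under test and the reduced regex file, and exit non-zero whenever the returned status is non-zero. As noted in the comment above, a suitable limit has to be found by experiment.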
From dana@acm.org
A tool I found useful for this is massif from the valgrind suite, e.g.:
valgrind --tool=massif --stacks=yes --alloc-fn=Perl_safesysmalloc --alloc-fn=Perl_safesyscalloc --alloc-fn=Perl_safesysrealloc --alloc-fn=Perl_Slab_Alloc --time-unit=B --max-snapshots=1000 perl -MMath::Prime::XS=:all -E 'say 1'
(in this case, load up a module and do basically nothing else), then:
massif-visualizer massif.out.####
[#### depending on the file]
to see the graphical results. This shows, for example, a memory spike from Perl__invlist_union_maybe_complement_2nd that shows up in 5.20.0 and 5.21.2 that is not in 5.19.7 when processing constant.pm's:
my $normal_constant_name = qr/^_?[^\W_0-9]\w*\z/;
which means it hits lots of modules.
Tracking down the Perl source that causes given memory behavior is not very straightforward, but I think the tool is pretty valuable for seeing how memory is being used, and what is causing the use, over time. |
From @khwilliamson
On 07/22/2014 12:29 PM, Dana Jacobsen via RT wrote:
The memory spike occurs when taking the union of two lists. At the
This is done to avoid having to ask for extra memory in the middle of |
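The explanation above is truncated, so the following is only one plausible reading of it: the output of the union is sized for the worst case up front, so that no reallocation is needed while merging. The toy Python sketch below shows that idea on inversion lists (sorted boundaries where even indices start a range and odd indices end it, exclusively). The function name and the `len(a) + len(b)` sizing are assumptions for illustration, not Perl's actual code.

```python
def invlist_union(a, b):
    """Union of two inversion lists, writing into a buffer preallocated for the worst case."""
    # Worst case: the inputs are disjoint, so every boundary survives.
    out = [None] * (len(a) + len(b))
    i = j = n = 0
    depth = 0  # how many input ranges currently cover the scan position
    while i < len(a) or j < len(b):
        # Take the smaller next boundary; on a tie, take a range start first
        # so that touching ranges such as [1,5) and [5,9) are merged.
        take_a = j >= len(b) or (
            i < len(a) and (a[i] < b[j] or (a[i] == b[j] and i % 2 == 0))
        )
        if take_a:
            cp, is_start = a[i], i % 2 == 0
            i += 1
        else:
            cp, is_start = b[j], j % 2 == 0
            j += 1
        if is_start:
            depth += 1
            if depth == 1:  # entering covered territory: the union range starts
                out[n] = cp
                n += 1
        else:
            depth -= 1
            if depth == 0:  # leaving covered territory: the union range ends
                out[n] = cp
                n += 1
    return out[:n]  # trim the unused tail of the preallocated buffer
```

The trade-off this illustrates is that the buffer can be much larger than the final result when the inputs overlap heavily, which would show up as a temporary spike in a memory profile.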
From @demerphq
On 14 July 2014 12:13, Dave Mitchell <davem@iabyn.com> wrote:
I will try to find some time for this. Yves |
From @demerphq
On 14 July 2014 12:13, Dave Mitchell <davem@iabyn.com> wrote:
Some basic details on this issue:
In order to detect infinite recursion, and to expand certain constructs to
So for instance if we have node C which uses D which uses E, and we can
One could probably argue this is wrong and that we should somehow store
This is coupled with the naive bitmask strategy for marking which nodes we
More to come later.
Yves |
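To make the C, D, E example concrete, here is a toy model in Python of walking a grammar of rule references with a per-path "seen" bitmask. It is a sketch under stated assumptions, not the logic of study_chunk(): the grammar representation and both function names are hypothetical, and it only shows how re-walking every reference, rather than remembering a rule's result, multiplies the work.

```python
def count_visits_per_path(grammar, start):
    """Walk every rule reference; a bitmask marks rules on the current path only."""
    index = {name: i for i, name in enumerate(grammar)}
    visits = 0

    def walk(name, seen):
        nonlocal visits
        bit = 1 << index[name]
        if seen & bit:  # rule is already on this path: recursion detected, stop
            return
        visits += 1
        for callee in grammar[name]:
            walk(callee, seen | bit)  # each branch gets its own copy of the mask

    walk(start, 0)
    return visits


def count_visits_memoised(grammar, start):
    """Same walk, but each rule is examined once and then remembered."""
    done = set()

    def walk(name):
        if name in done:
            return
        done.add(name)
        for callee in grammar[name]:
            walk(callee)

    walk(start)
    return len(done)


# C uses D twice, D uses E twice: the per-path walk sees E four times.
toy = {"C": ["D", "D"], "D": ["E", "E"], "E": []}
```

With `toy`, the per-path walk makes 7 visits against 3 for the memoised one. The gap doubles with every further level of nesting, which is the kind of growth a large generated grammar would trigger.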
From @demerphq
On 13 July 2014 16:27, Hugo van der Sanden <perlbug-followup@perl.org>
You really should use character classes here, and not use regex subs for
$digit= "[0-9]"
Similar for (?&ws) and similar patterns.
Anyway, I have pushed the following commit which should fix this. Please
commit a51d618
rt 122283 - do not recurse into GOSUB/GOSTART when not SCF_DO_SUBSTR
See also comments in patch.
A complex regex "grammar" like that in
Unfortunately I could not track down exactly why this occurred, but I have not thought of a good way to test this change, so this patch
Ticket closers: please don't close the ticket until I have reported that I
cheers,
-- |
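The suggestion above (keep simple pieces such as `$digit` as plain character-class strings and interpolate them, rather than defining each as a regex sub) can be illustrated as follows. This sketch uses Python's `re` as a stand-in for Perl. The `WS` definition and the `NUMBER` and `ASSIGN` patterns are hypothetical, since the thread does not show what `(?&ws)` or the surrounding rules match.

```python
import re

# Leaf "rules" kept as plain strings and spliced into larger patterns,
# so the compiled regex contains no sub-pattern call for them.
DIGIT = "[0-9]"
WS = r"\s+"  # assumption: a whitespace rule is one or more whitespace characters

NUMBER = rf"{DIGIT}+(?:\.{DIGIT}+)?"
ASSIGN = re.compile(rf"(?P<name>[A-Za-z_]\w*){WS}={WS}(?P<value>{NUMBER})")

match = ASSIGN.fullmatch("rate = 3.14")
```

Here `match.group("value")` is `"3.14"`. The design point is that interpolation happens once, when the pattern string is built, so the regex compiler only sees ordinary character classes for these leaves.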
From @hvds
demerphq <demerphq@gmail.com> wrote:
Thanks Yves, I'll try this out over the weekend.
Hugo |
From @cpansprout
On Thu Sep 25 00:42:31 2014, demerphq wrote:
You applied tests as d9a72fc. Does that mean this can be closed?
(a51d618 is the cause of bug #122890, but I don’t think this needs to stay open because of that.)
--
Father Chrysostomos |
From @demerphq
On 5 October 2014 03:55, Father Chrysostomos via RT <
It is up to you. I would leave it open, but if you think it's better to
Yves
-- |
@cpansprout - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#122283 (status was 'resolved')
Searchable as RT122283$