memory leak in regular expressions (perl > 5.24) #17218

Closed
mumpitzstuff opened this issue Oct 24, 2019 · 13 comments

@mumpitzstuff

mumpitzstuff commented Oct 24, 2019

Description
The following code leads to huge memory leaks for perl versions above 5.24.
no leak (tested): v5.20, v5.22, v5.24
leak (tested): v5.28, v5.30

Steps to Reproduce

#!/usr/bin/perl
use strict;
use warnings; 
use utf8;
use Encode qw(encode_utf8 decode_utf8);


sub leak($$)
{
  my ($data, $regex) = @_;
  
  my @matches = ($data =~ /$regex/);
  
  print "matches = ".join(", ", @matches)."\n" if (@matches);
}

sub noleak($$)
{
  my ($data, $regex) = @_;
  
  $regex = decode_utf8($regex);
  
  my @matches = ($data =~ /$regex/);
  
  print "matches = ".join(", ", @matches)."\n" if (@matches);
}


for (my $i = 0; $i < 9999; $i++)
{
  # replace both leak calls by noleak and the error is gone
  leak('title="SAT.1"><img<td class="timeRow"><div<div class="content"><a>match_first</a>', 'title="SAT.1"><img[\w\W]*?<td class="time[\w\W]*?Row">[\w\W]*?<div[\w\W]*?<div class="content">\s*<a[\w\W]*?>\s*(.*?)\s*<\/a>');
  leak('title="ORF 3"><img<div class="content"><a>match_second</a>', 'title="ORF 3"><img[\w\W]*?<div class="content">\s*<a[\w\W]*?>\s*(.*?)\s*<\/a>');
}

Expected behavior
The script should run with roughly constant memory use. Instead, with Perl 5.28 it fills the whole RAM on my Raspberry Pi, which should never happen.

Workaround: If both leak() function calls within the for loop are replaced by noleak(), the memory leak is gone.

Perl configuration

Summary of my perl5 (revision 5 version 28 subversion 1) configuration:
   
  Platform:
    osname=linux
    osvers=4.9.0
    archname=arm-linux-gnueabihf-thread-multi-64int
    uname='linux localhost 4.9.0 #1 smp debian 4.9.0 armv7l gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dcc=arm-linux-gnueabihf-gcc -Dcpp=arm-linux-gnueabihf-cpp -Dld=arm-linux-gnueabihf-gcc -Dccflags=-DDEBIAN -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=/build/perl-v3uRQW/perl-5.28.1=. -fstack-protector-strong -Wformat -Werror=format-security -Dldflags= -Wl,-z,relro -Dlddlflags=-shared -Wl,-z,relro -Dcccdlflags=-fPIC -Darchname=arm-linux-gnueabihf -Dprefix=/usr -Dprivlib=/usr/share/perl/5.28 -Darchlib=/usr/lib/arm-linux-gnueabihf/perl/5.28 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/arm-linux-gnueabihf/perl5/5.28 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.28.1 -Dsitearch=/usr/local/lib/arm-linux-gnueabihf/perl/5.28.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -Ui_xlocale -Uversiononly -DDEBUGGING=-g -Doptimize=-O2 -dEs -Duseshrplib -Dlibperl=libperl.so.5.28.1'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='arm-linux-gnueabihf-gcc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2 -g'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion=''
    gccversion='8.2.0'
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=4
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=8
    longdblkind=0
    ivtype='long long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='arm-linux-gnueabihf-gcc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/arm-linux-gnueabihf/8/include-fixed /usr/include/arm-linux-gnueabihf /usr/lib /lib/arm-linux-gnueabihf /lib /usr/lib/arm-linux-gnueabihf
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=libc-2.28.so
    so=so
    useshrplib=true
    libperl=libperl.so.5.28
    gnulibc_version='2.28'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_IMPLICIT_CONTEXT
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
  Locally applied patches:
    DEBPKG:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.
    DEBPKG:debian/db_file_ver - https://bugs.debian.org/340047 Remove overly restrictive DB_File version check.
    DEBPKG:debian/doc_info - Replace generic man(1) instructions with Debian-specific information.
    DEBPKG:debian/enc2xs_inc - https://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @INC directories.
    DEBPKG:debian/errno_ver - https://bugs.debian.org/343351 Remove Errno version check due to upgrade problems with long-running processes.
    DEBPKG:debian/libperl_embed_doc - https://bugs.debian.org/186778 Note that libperl-dev package is required for embedded linking
    DEBPKG:fixes/respect_umask - Respect umask during installation
    DEBPKG:debian/writable_site_dirs - Set umask approproately for site install directories
    DEBPKG:debian/extutils_set_libperl_path - EU:MM: set location of libperl.a under /usr/lib
    DEBPKG:debian/no_packlist_perllocal - Don't install .packlist or perllocal.pod for perl or vendor
    DEBPKG:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets.
    DEBPKG:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor.
    DEBPKG:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy.
    DEBPKG:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable.
    DEBPKG:debian/perlivp - https://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local
    DEBPKG:debian/squelch-locale-warnings - https://bugs.debian.org/508764 Squelch locale warnings in Debian package maintainer scripts
    DEBPKG:debian/patchlevel - https://bugs.debian.org/567489 List packaged patches for 5.28.1-6 in patchlevel.h
    DEBPKG:fixes/document_makemaker_ccflags - https://bugs.debian.org/628522 [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags}
    DEBPKG:debian/find_html2text - https://bugs.debian.org/640479 Configure CPAN::Distribution with correct name of html2text
    DEBPKG:debian/perl5db-x-terminal-emulator.patch - https://bugs.debian.org/668490 Invoke x-terminal-emulator rather than xterm in perl5db.pl
    DEBPKG:debian/cpan-missing-site-dirs - https://bugs.debian.org/688842 Fix CPAN::FirstTime defaults with nonexisting site dirs if a parent is writable
    DEBPKG:fixes/memoize_storable_nstore - [rt.cpan.org #77790] https://bugs.debian.org/587650 Memoize::Storable: respect 'nstore' option not respected
    DEBPKG:debian/makemaker-pasthru - https://bugs.debian.org/758471 Pass LD settings through to subdirectories
    DEBPKG:debian/makemaker-manext - https://bugs.debian.org/247370 Make EU::MakeMaker honour MANnEXT settings in generated manpage headers
    DEBPKG:debian/kfreebsd-softupdates - https://bugs.debian.org/796798 Work around Debian Bug#796798
    DEBPKG:fixes/autodie-scope - https://bugs.debian.org/798096 Fix a scoping issue with "no autodie" and the "system" sub
    DEBPKG:fixes/memoize-pod - [rt.cpan.org #89441] Fix POD errors in Memoize
    DEBPKG:debian/hurd-softupdates - https://bugs.debian.org/822735 Fix t/op/stat.t failures on hurd
    DEBPKG:fixes/math_complex_doc_great_circle - https://bugs.debian.org/697567 [rt.cpan.org #114104] Math::Trig: clarify definition of great_circle_midpoint
    DEBPKG:fixes/math_complex_doc_see_also - https://bugs.debian.org/697568 [rt.cpan.org #114105] Math::Trig: add missing SEE ALSO
    DEBPKG:fixes/math_complex_doc_angle_units - https://bugs.debian.org/731505 [rt.cpan.org #114106] Math::Trig: document angle units
    DEBPKG:fixes/cpan_web_link - https://bugs.debian.org/367291 CPAN: Add link to main CPAN web site
    DEBPKG:debian/hppa_op_optimize_workaround - https://bugs.debian.org/838613 Temporarily lower the optimization of op.c on hppa due to gcc-6 problems
    DEBPKG:debian/installman-utf8 - https://bugs.debian.org/840211 Generate man pages with UTF-8 characters
    DEBPKG:fixes/getopt-long-4 - https://bugs.debian.org/864544 [rt.cpan.org #122068] Fix issue #122068.
    DEBPKG:debian/hppa_opmini_optimize_workaround - https://bugs.debian.org/869122 Lower the optimization level of opmini.c on hppa
    DEBPKG:debian/sh4_op_optimize_workaround - https://bugs.debian.org/869373 Also lower the optimization level of op.c and opmini.c on sh4
    DEBPKG:debian/perldoc-pager - https://bugs.debian.org/870340 [rt.cpan.org #120229] Fix perldoc terminal escapes when sensible-pager is less
    DEBPKG:debian/prune_libs - https://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need.
    DEBPKG:debian/mod_paths - Tweak @INC ordering for Debian
    DEBPKG:debian/configure-regen - https://bugs.debian.org/762638 Regenerate Configure et al. after probe unit changes
    DEBPKG:debian/deprecate-with-apt - https://bugs.debian.org/747628 Point users to Debian packages of deprecated core modules
    DEBPKG:debian/disable-stack-check - https://bugs.debian.org/902779 [perl #133327] Disable debugperl stack extension checks for binary compatibility with perl
    DEBPKG:debian/gdbm-fatal - [perl #133295] https://bugs.debian.org/904005 Temporarily skip GDBM_File fatal.t for gdbm >= 1.15 compatibility
    DEBPKG:fixes/storable-recursion - https://bugs.debian.org/912900 [perl #133326] [120060c] (perl #133326) fix and clarify handling of recurs_sv.
    DEBPKG:fixes/caretx-fallback - https://bugs.debian.org/913347 [perl #133573] [03b94aa] RT#133573: $^X fallback when platform-specific technique fails
    DEBPKG:fixes/eumm-usrmerge - https://bugs.debian.org/913637 Avoid mangling /bin non-perl shebangs on merged-/usr systems
    DEBPKG:fixes/errno-include-path - [6c5080f] [perl #133662] https://bugs.debian.org/875921 Make Errno_pm.PL compatible with /usr/include/<ARCH>/errno.h
    DEBPKG:fixes/kfreebsd-renameat - [a3c63a9] https://bugs.debian.org/912521 [perl #133668] Also work around renameat() kernel bug on GNU/kFreeBSD
    DEBPKG:fixes/time-local-2020 - https://bugs.debian.org/915209 [rt.cpan.org #124787] Fix Time::Local tests
    DEBPKG:fixes/inplace-editing-bugfix/part1 - https://bugs.debian.org/914651 (perl #133659) move argvout cleanup to a new function
    DEBPKG:fixes/inplace-editing-bugfix/part2 - https://bugs.debian.org/914651 (perl #133659) tests for global destruction handling of inplace editing
    DEBPKG:fixes/inplace-editing-bugfix/part3 - https://bugs.debian.org/914651 (perl #133659) make an in-place edit successful if the exit status is zero
    DEBPKG:fixes/fix-manifest-failures - https://bugs.debian.org/914962 Fix t/porting/manifest.t failures when run in a foreign git checkout
    DEBPKG:fixes/pipe-open-bugfix/part1 - [perl #133726] https://bugs.debian.org/916313 Always mark pipe in pipe-open as inherit-on-exec
    DEBPKG:fixes/pipe-open-bugfix/part2 - [perl #133726] https://bugs.debian.org/916313 Always mark pipe in list pipe-open as inherit-on-exec
    DEBPKG:fixes/storable-probing/prereq1 - [3f4cad1] Storable: fix for strawberry build failures:
    DEBPKG:fixes/storable-probing/prereq2 - [perl #133411] [edf639f] (perl #133411) don't try to load Storable with -Dusecrosscompile
    DEBPKG:fixes/storable-probing/disable-probing - https://bugs.debian.org/914133 [perl #133708] [2a0bbd3] (perl #133708) remove build-time probing for stack limits for Storable
    DEBPKG:debian/perlbug-editor - https://bugs.debian.org/922609 Use "editor" as the default perlbug editor, as per Debian policy
    DEBPKG:fixes/posix-mbrlen - [25d7b7a] https://bugs.debian.org/924517 [perl #133928] Fix POSIX::mblen mbstate_t initialization on threaded perls with glibc
  Built under linux
  Compiled at Mar 31 2019 11:51:22
  @INC:
    /etc/perl
    /usr/local/lib/arm-linux-gnueabihf/perl/5.28.1
    /usr/local/share/perl/5.28.1
    /usr/lib/arm-linux-gnueabihf/perl5/5.28
    /usr/share/perl5
    /usr/lib/arm-linux-gnueabihf/perl/5.28
    /usr/share/perl/5.28
    /usr/local/lib/site_perl
    /usr/lib/arm-linux-gnueabihf/perl-base

@jkeenan
Contributor

jkeenan commented Oct 24, 2019

The config_args section in your metadata appears to be truncated; it's missing the closing single quote.

config_args='-Dusethreads -Duselargefiles -Dcc=arm-linux-gnueabihf-gcc -Dcpp

Can you double-check that information? Was something cut off?

Thank you very much.
Jim Keenan

@mumpitzstuff
Author

Okay, hopefully fixed now.

@dur-randir
Member

Bisect points to 02517e3 as the first bad commit:
commit 02517e3
Author: Karl Williamson <khw@cpan.org>
Date: Tue Jul 12 21:15:07 2016 -0600

regcomp.c: Refactor code dealing with m/[...]/d

This consolidates some code that deals with bracketed character classes
under /d.  As a result, some throw-away steps can be omitted, and things
aren't scattered about.  The earlier version skipped doing some things
if the class is to be inverted.  The reason turns out to not be because
it was necessary, but that the dump of the compiled pattern was unclear.
Previous commits have fixed that, so this now handles inverted character
classes.

@dur-randir
Member

PS: this is a commit after 5.26, 5.24 doesn't leak for me.

@mumpitzstuff
Author

mumpitzstuff commented Oct 24, 2019

5.24 and below do not leak for me either. I can confirm the bug for 5.28 and 5.30. I found nobody who could test it with 5.26. So it is plausible that it has something to do with the refactoring of regcomp.c.

@pos-ei-don

Debian Buster, perl 5.28, affected as well

@mumpitzstuff
Author

mumpitzstuff commented Oct 25, 2019

macOS results:

/usr/bin/time -l /opt/local/bin/perl5.24 Desktop/leaktest.pl > /dev/null
5017600 maximum resident set size

/usr/bin/time -lp /opt/local/bin/perl5.26 Desktop/leaktest.pl > /dev/null
807833600 maximum resident set size <- affected

/usr/bin/time -lp /opt/local/bin/perl5.28 Desktop/leaktest.pl > /dev/null
802365440 maximum resident set size <- affected

/usr/bin/time -lp /opt/local/bin/perl5.30 Desktop/leaktest.pl > /dev/null
657444864 maximum resident set size <- partly fixed but also affected
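
For the Linux reports in this thread (Raspberry Pi, Debian Buster), a rough in-process check could look like the sketch below. It assumes a Linux /proc filesystem, so it will not work on macOS; parse_vmrss_kb and current_rss_kb are hypothetical helper names, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the VmRSS value (in kB) out of the text of
# /proc/<pid>/status. Returns undef if no VmRSS line is present.
sub parse_vmrss_kb {
    my ($status_text) = @_;
    return $status_text =~ /^VmRSS:\s+(\d+)\s+kB/m ? $1 : undef;
}

# Hypothetical helper: resident set size of the current process in kB.
# Assumes Linux; yields undef where /proc/self/status cannot be read.
sub current_rss_kb {
    open(my $fh, '<', '/proc/self/status') or return;
    local $/;    # slurp the whole file in one read
    my $text = <$fh>;
    close($fh);
    return parse_vmrss_kb($text);
}

my $rss = current_rss_kb();
print defined $rss ? "RSS: $rss kB\n" : "RSS unavailable on this platform\n";
```

Calling current_rss_kb() before and after the for loop of the test script would show whether memory grows with the number of iterations.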

@mumpitzstuff
Author

The /a modifier can also be used as a workaround.
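
A minimal sketch of that workaround, assuming the same kind of input as the test script: the sub name match_with_a is hypothetical, and the only difference from leak() is the /a modifier on the match (available since Perl 5.14).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of leak() from the report: the interpolated
# pattern is compiled with the /a modifier and the matches are returned.
sub match_with_a {
    my ($data, $regex) = @_;
    my @matches = ($data =~ /$regex/a);
    return @matches;
}

# Same data and pattern as the second leak() call in the test script.
my @m = match_with_a(
    'title="ORF 3"><img<div class="content"><a>match_second</a>',
    'title="ORF 3"><img[\w\W]*?<div class="content">\s*<a[\w\W]*?>\s*(.*?)\s*<\/a>',
);
print "matches = " . join(", ", @m) . "\n" if @m;
```

Note that /a restricts \w, \s and \d to ASCII, so this is only a drop-in replacement when the patterns do not rely on Unicode matching for those classes.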

@mumpitzstuff
Author

> PS: this is a commit after 5.26, 5.24 doesn't leak for me.

Seems to be a commit before 5.26 was released.

@dur-randir
Member

Yeah, I meant "for 5.26", not "after 5.26".

@khwilliamson
Contributor

Now fixed by commit 0463f3a

@Dieken

Dieken commented Dec 20, 2019

@khwilliamson any plans to backport this one-line fix to 5.26, 5.28 and 5.30? Currently Debian, Ubuntu, CentOS 8, Gentoo, ArchLinux, openSUSE, FreeBSD, NetBSD, OpenBSD and Homebrew are all affected by this issue.

I was bitten when I used GNU Parallel "parallel --pipe --roundrobin" with 5.30.1.

toddr modified the milestones: 5.30.3, 5.30.2 on Jan 6, 2020
@toddr
Member

toddr commented Jan 6, 2020

I've added a backport tag so we can see what is recommended for backport.

oflimm added a commit to oflimm/openbib that referenced this issue Aug 31, 2022
instead of OpenBib::Common::Util because of the current memory leak

Perl/perl5#17218
oflimm added a commit to oflimm/openbib that referenced this issue Aug 31, 2022
(normalize, normalize_lang) into a separate object because of the current regexp
memory leak in our Perl version 5.28

Perl/perl5#17218