New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: onepass regexp.Match allocates memory unnecessarily #19573

Closed
haya14busa opened this Issue Mar 16, 2017 · 3 comments

Comments

Projects
None yet
3 participants
@haya14busa
Contributor

haya14busa commented Mar 16, 2017

Please answer these questions before submitting your issue. Thanks!

What did you do?

I noticed that regexp.Match allocates memory unnecessarily while profiling a
program which uses "onepass regex" a lot (e.g. ^[a-z]$).

ROUTINE ======================== regexp.(*Regexp).doExecute in /usr/lib/go/src/regexp/exec.go
   95.50MB    96.02MB (flat, cum)  7.37% of Total
         .          .    434:           }
         .          .    435:   } else if size < m.maxBitStateLen && r == nil {
         .          .    436:           if m.b == nil {
         .          .    437:                   m.b = newBitState(m.p)
         .          .    438:           }
         .   528.17kB    439:           if !m.backtrack(i, pos, size, ncap) {
         .          .    440:                   re.put(m)
         .          .    441:                   return nil
         .          .    442:           }
         .          .    443:   } else {
         .          .    444:           m.init(ncap)
         .          .    445:           if !m.match(i, pos) {
         .          .    446:                   re.put(m)
         .          .    447:                   return nil
         .          .    448:           }
         .          .    449:   }
   95.50MB    95.50MB    450:   dstCap = append(dstCap, m.matchcap...)
         .          .    451:   if dstCap == nil {
         .          .    452:           // Keep the promise of returning non-nil value on match.
         .          .    453:           dstCap = arrayNoInts[:0]
         .          .    454:   }
         .          .    455:   re.put(m)

(This is go tool pprof output profiling against my program, so it contains m.backtrack allocations too, but please ignore it (L439).)

For example, regexp.MustCompile("^[a-z]$").Match([]byte("a")) should not allocate memory unnecessarily at L450 (dstCap = append(dstCap, m.matchcap...)), but allocation occurs because len(m.matchcap) == 2.

As for matching of non onepass regex (e.g. .*[a-z] or something), m.backtrack(i, pos, size, ncap) or m.init(ncap) reset m.matchcap length to zero (ncap==0 for regexp.Match) and doesn't allocate memory at L450.

What did you expect to see?

No allocation for regexp.Match of onepass regexp.

See CL for benchmark script.

benchmark                                    old ns/op      new ns/op      delta
BenchmarkMatch_match_onepass_regex/32-4      6465           4628           -28.41%
BenchmarkMatch_match_onepass_regex/1K-4      208324         151558         -27.25%
BenchmarkMatch_match_onepass_regex/32K-4     7230259        5834492        -19.30%
BenchmarkMatch_match_onepass_regex/1M-4      234379810      166310682      -29.04%
BenchmarkMatch_match_onepass_regex/32M-4     7903529363     4981119950     -36.98%

benchmark                                    old MB/s     new MB/s     speedup
BenchmarkMatch_match_onepass_regex/32-4      4.95         6.91         1.40x
BenchmarkMatch_match_onepass_regex/1K-4      4.92         6.76         1.37x
BenchmarkMatch_match_onepass_regex/32K-4     4.53         5.62         1.24x
BenchmarkMatch_match_onepass_regex/1M-4      4.47         6.30         1.41x
BenchmarkMatch_match_onepass_regex/32M-4     4.25         6.74         1.59x

benchmark                                    old allocs     new allocs     delta
BenchmarkMatch_match_onepass_regex/32-4      32             0              -100.00%
BenchmarkMatch_match_onepass_regex/1K-4      1024           0              -100.00%
BenchmarkMatch_match_onepass_regex/32K-4     32768          0              -100.00%
BenchmarkMatch_match_onepass_regex/1M-4      1048576        0              -100.00%
BenchmarkMatch_match_onepass_regex/32M-4     104559255      0              -100.00%

benchmark                                    old bytes      new bytes     delta
BenchmarkMatch_match_onepass_regex/32-4      512            0             -100.00%
BenchmarkMatch_match_onepass_regex/1K-4      16384          0             -100.00%
BenchmarkMatch_match_onepass_regex/32K-4     524288         0             -100.00%
BenchmarkMatch_match_onepass_regex/1M-4      16777216       0             -100.00%
BenchmarkMatch_match_onepass_regex/32M-4     2019458128     0             -100.00%

What did you see instead?

1 allocation for regexp.Match of onepass regexp at L450 (dstCap = append(dstCap, m.matchcap...)).
It's only one needless allocation, but regex will be often called a lot of times.
My program which heavily uses onepass regexp Match actually speedup by about 10% by removing the needless allocation.

System details

go version go1.8 linux/amd64
GOARCH="amd64"
GOBIN="/home/haya14busa/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/haya14busa"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build371996710=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
GOROOT/bin/go version: go version go1.8 linux/amd64
GOROOT/bin/go tool compile -V: compile version go1.8 X:framepointer
uname -sr: Linux 4.4.48-1-MANJARO
LSB Version:	n/a
Distributor ID:	ManjaroLinux
Description:	Manjaro Linux
Release:	16.10
Codename:	Fringilla
/usr/lib/libc.so.6: GNU C Library (GNU libc) stable release version 2.24, by Roland McGrath et al.
gdb --version: GNU gdb (GDB) 7.12.1

@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 16, 2017

@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz
Member

bradfitz commented Mar 16, 2017

@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot commented Mar 16, 2017

CL https://golang.org/cl/38270 mentions this issue.

@gopherbot gopherbot closed this in 8a16d7d Mar 23, 2017

@haya14busa

This comment has been minimized.

Show comment
Hide comment
@haya14busa

haya14busa Mar 23, 2017

Contributor

Thank you for including it 👍

Contributor

haya14busa commented Mar 23, 2017

Thank you for including it 👍

@Mygod Mygod referenced this issue Feb 25, 2018

Closed

liboverture.so memory leak #1639

6 of 17 tasks complete

@golang golang locked and limited conversation to collaborators Mar 23, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.