regexp: investigate further performance improvements #26623

Open
hsluoyz opened this issue Jul 26, 2018 · 23 comments
Labels
NeedsInvestigation Performance
Milestone

Comments

@hsluoyz

hsluoyz commented Jul 26, 2018

Languages Regex Benchmark:

Language        Email (ms)  URI (ms)  IP (ms)  Total (ms)
C PCRE2              25.00     25.02     5.65       55.66
Rust                 31.31     31.73     6.75       69.79
PHP                  54.39     50.22     5.80      110.40
Javascript           74.88     63.09     2.02      140.00
D ldc               146.01    140.03     5.19      291.24
D dmd               205.52    200.30     5.59      411.41
Perl                246.91    170.74    45.60      463.24
Crystal             339.79    280.74    27.03      647.56
Python PyPy         207.96    177.18   329.85      714.99
Ruby                354.16    308.55    52.73      715.44
Java                382.57    456.34   297.66     1136.57
Kotlin              395.23    474.31   293.53     1163.07
Python 2            368.85    286.70   514.10     1169.65
Python 3            565.71    416.32   493.07     1475.09
Go                  423.53    415.45   722.53     1561.51
C# .Net Core       1952.13   1681.00   111.32     3744.45
C# Mono            2463.84   2088.87   153.78     4706.49

In the above benchmark, Go's regex is even slower than Python's (1561.51 ms total for Go versus 1475.09 ms for Python 3). That is surprising: Python is an interpreted scripting language and Go is a statically compiled one, so I would expect Go to be faster.

I noticed a related issue, #19629, where someone said the reason is that Python uses C for regex, and C is faster than Go. But Python is a cross-platform language and it can still use the C regex implementation on all platforms. Why can't Go do the same thing? This may be a stupid question, but I don't understand why Go has to use cgo to call C code while Python doesn't have this limitation. Thanks.

@ianlancetaylor ianlancetaylor changed the title Go's regex is even slower than Python regexp: Go's regex is even slower than Python Jul 26, 2018
@ianlancetaylor
Contributor

ianlancetaylor commented Jul 26, 2018

Calling out to C carries a cost. We don't want to do it for a basic package like regexp. We're much more interested in speeding up Go's regexp package. If people want to work on that, that would be great.

Note that one reason that Go's regexp package may be slower is that it works on UTF-8 characters, not ASCII bytes. I don't know what Python does.
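
A small illustration of the UTF-8 point, as a sketch only: the helper name matchesSingleChar is made up for this example. In Go's regexp, `.` consumes one decoded code point, so a character that takes two bytes in UTF-8 still counts as a single character.

```go
package main

import (
	"fmt"
	"regexp"
)

// singleChar matches text that consists of exactly one character
// (one Unicode code point, excluding newline).
var singleChar = regexp.MustCompile(`^.$`)

// matchesSingleChar is a hypothetical helper used only in this example.
func matchesSingleChar(s string) bool {
	return singleChar.MatchString(s)
}

func main() {
	s := "\u00e9" // e with acute accent: one code point, two bytes in UTF-8
	fmt.Println(len(s), matchesSingleChar(s))
}
```

The matcher has to decode runes as it goes, which is extra work compared with an engine that compares raw bytes.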

Also note that Go is committed to using regexps that scale well (see https://swtch.com/~rsc/regexp/). I don't know what Python does.
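
To make the "scale well" point concrete, here is a minimal sketch. The pattern and the helper name matchNearMiss are chosen for illustration and do not come from the benchmark. Nested repetition followed by a near-miss input is a classic trigger for exponential running time in backtracking engines; Go's regexp package documents running time linear in the input size, so the call below is expected to return promptly.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nested uses nested repetition, which can blow up in backtracking matchers.
var nested = regexp.MustCompile(`^(a+)+$`)

// matchNearMiss (hypothetical helper) matches n copies of 'a' followed by
// a '!' that prevents any overall match.
func matchNearMiss(n int) bool {
	return nested.MatchString(strings.Repeat("a", n) + "!")
}

func main() {
	fmt.Println(matchNearMiss(5000))
}
```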

I'm not sure it's useful to leave a general issue like this open. It doesn't suggest any specific action to take. Are you interested in examining the regexp code to understand why Python does better on this benchmark?

@andybons andybons added the WaitingForInfo label Jul 26, 2018
@andybons andybons added this to the Unplanned milestone Jul 26, 2018
@mvdan
Member

mvdan commented Jul 26, 2018

The benchmark code includes compiling the regex. In a common use of regexp, one would compile the regex once and run it many times, so the benchmark numbers aren't very helpful.

Also note that the benchmark numbers are almost a year old at this point, and Go does two releases per year.

@Azareal

Azareal commented Jul 27, 2018

I might be mistaken, but doesn't PCRE have a JIT compiler? That might explain it, at least for a couple of the top ones (I know PHP uses PCRE).

@Azareal

Azareal commented Jul 27, 2018

The benchmark code includes compiling the regex. In a common use of regexp, one would compile the regex once and run it many times, so the benchmark numbers aren't very helpful.

I found an example in the same repository, mariomka/regex-benchmark#2, which apparently excludes compilation, but it doesn't look too scientific (only ten executions). It shows more or less the same results, which is odd, as I'd have thought that compilation would have more of an impact on the times.

@mvdan
Member

mvdan commented Jul 27, 2018

The compilation does have a large impact on the speed:

$ cat f_test.go
package p

import (
        "regexp"
        "testing"
)

var Sink bool

func BenchmarkCompileRun(b *testing.B) {
        for i := 0; i < b.N; i++ {
                rx := regexp.MustCompile(`[\w\.+-]+@[\w\.-]+\.[\w\.-]+`)
                Sink = rx.MatchString("123456789 foo@bar.etc")
        }
}

func BenchmarkRun(b *testing.B) {
        rx := regexp.MustCompile(`[\w\.+-]+@[\w\.-]+\.[\w\.-]+`)
        for i := 0; i < b.N; i++ {
                Sink = rx.MatchString("123456789 foo@bar.etc")
        }
}
$ go test -bench=.
goos: linux
goarch: amd64
pkg: mvdan.cc/p
BenchmarkCompileRun-4             100000             14160 ns/op
BenchmarkRun-4                   1000000              1121 ns/op
PASS
ok      mvdan.cc/p      2.693s

I presume it doesn't show up in the original numbers because the input data is very large, though.

I agree with @ianlancetaylor that a generic issue like this isn't very helpful. If specific parts of the regexp package could be improved, or certain edge cases are orders of magnitude slower than they should be, we should use separate issues to tackle those. For example, we already have some like #24411 and #21463.

@CAFxX
Contributor

CAFxX commented Aug 9, 2018

While it's true that this issue is less than actionable as-is, it is also true that the results of the benchmark reported here are not too dissimilar from the ones on https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/regexredux.html (where they use 1.10). I agree it's unfortunate that all those benchmarks include pattern compilation (although it seems it's not so significant).

@junyer
Contributor

junyer commented Aug 18, 2018

My comments on #26943:

Would it be feasible to move the syntax.InstRune* match checks from step() to add()? A thread failing in step() constitutes wasted work – even for a regular expression as simple as [+-]?[0-9]+.

Also, what about using slice assignment when a thread is enqueued? Let cap have copy-on-write semantics.

Also also, it might be worth evaluating the benefit of using a slice as a stack instead of recursing. Anything to reduce the overhead of syntax.InstAlt instructions.
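
A rough sketch of two of these ideas, under stated assumptions: this is not the regexp package's internal add() or step(), and the function names closure and compile are made up here. It walks a real syntax.Prog with an explicit stack instead of recursion, and it applies the rune check while threads are being enqueued, so a thread that cannot accept the next rune is never queued. Empty-width assertions are followed unconditionally and captures are ignored to keep it short.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// closure returns the instructions reachable from start without consuming
// input, keeping a rune instruction only if it accepts next.
// Simplifications: empty-width assertions are treated as always satisfied
// and capture positions are not recorded.
func closure(prog *syntax.Prog, start uint32, next rune) []uint32 {
	visited := make([]bool, len(prog.Inst))
	stack := []uint32{start}
	var threads []uint32
	for len(stack) > 0 {
		pc := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if visited[pc] {
			continue
		}
		visited[pc] = true
		inst := &prog.Inst[pc]
		switch inst.Op {
		case syntax.InstAlt, syntax.InstAltMatch:
			// Push Arg first so that Out, the preferred branch, is popped first.
			stack = append(stack, inst.Arg, inst.Out)
		case syntax.InstCapture, syntax.InstNop, syntax.InstEmptyWidth:
			stack = append(stack, inst.Out)
		case syntax.InstRune, syntax.InstRune1, syntax.InstRuneAny, syntax.InstRuneAnyNotNL:
			// The check that would otherwise be made later, in the step phase.
			if inst.MatchRune(next) {
				threads = append(threads, pc)
			}
		case syntax.InstMatch:
			threads = append(threads, pc)
		case syntax.InstFail:
			// Dead end: nothing to enqueue.
		}
	}
	return threads
}

// compile parses and compiles pattern the way a caller of regexp/syntax can.
func compile(pattern string) (*syntax.Prog, error) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return nil, err
	}
	return syntax.Compile(re.Simplify())
}

func main() {
	prog, err := compile(`[+-]?[0-9]+`)
	if err != nil {
		panic(err)
	}
	for _, r := range "7+x" {
		n := len(closure(prog, uint32(prog.Start), r))
		fmt.Printf("next rune %q: %d thread(s) enqueued\n", r, n)
	}
}
```

Without the filter, both the [+-] and the [0-9] instruction would be enqueued at the start position regardless of the next rune; with it, at most one of them survives.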

@gopherbot

gopherbot commented Aug 22, 2018

Change https://golang.org/cl/130417 mentions this issue: regexp/syntax: don't do both linear and binary sesarch in MatchRunePos

gopherbot pushed a commit that referenced this issue Aug 22, 2018
MatchRunePos is a significant element of regexp performance, so some
attention to optimization is appropriate. Before this CL, a
non-matching rune would do both a linear search in the first four
entries, and a binary search over all the entries. Change the code to
optimize for the common case of two runes, to only do a linear search
when there are up to four entries, and to only do a binary search when
there are more than four entries.

Updates #26623
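
The search strategy described in this commit message can be sketched as follows. This is an illustration, not the code from the CL: the function name matchRangePos is made up, and reading "entries" as [lo, hi] range pairs is an assumption.

```go
package main

import "fmt"

// matchRangePos returns the index of the [lo, hi] pair in ranges that
// contains r, or -1 if there is none. ranges holds sorted, non-overlapping
// pairs laid out as lo0, hi0, lo1, hi1, ...
func matchRangePos(ranges []rune, r rune) int {
	n := len(ranges) / 2
	switch {
	case n == 0:
		return -1
	case n == 1:
		// Common case: a single range, no loop needed.
		if ranges[0] <= r && r <= ranges[1] {
			return 0
		}
		return -1
	case n <= 4:
		// Few pairs: linear search only.
		for j := 0; j < n; j++ {
			if r < ranges[2*j] {
				return -1 // sorted, so r cannot be in a later pair
			}
			if r <= ranges[2*j+1] {
				return j
			}
		}
		return -1
	}
	// Many pairs: binary search only.
	lo, hi := 0, n
	for lo < hi {
		m := lo + (hi-lo)/2
		switch {
		case r < ranges[2*m]:
			hi = m
		case r > ranges[2*m+1]:
			lo = m + 1
		default:
			return m
		}
	}
	return -1
}

func main() {
	word := []rune{'0', '9', 'A', 'Z', '_', '_', 'a', 'z'} // like [0-9A-Z_a-z]
	fmt.Println(matchRangePos(word, 'q'), matchRangePos(word, '-'))
}
```

The point of the change is that a non-matching rune pays for one search, not two.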

name                             old time/op    new time/op    delta
Find-12                             260ns ± 1%     275ns ± 7%   +5.84%  (p=0.000 n=8+10)
FindAllNoMatches-12                 144ns ± 9%     143ns ±12%     ~     (p=0.187 n=10+10)
FindString-12                       256ns ± 4%     254ns ± 1%     ~     (p=0.357 n=9+8)
FindSubmatch-12                     587ns ±12%     593ns ±11%     ~     (p=0.516 n=10+10)
FindStringSubmatch-12               534ns ±12%     525ns ±14%     ~     (p=0.565 n=10+10)
Literal-12                          104ns ±14%     106ns ±11%     ~     (p=0.145 n=10+10)
NotLiteral-12                      1.51µs ± 8%    1.47µs ± 2%     ~     (p=0.508 n=10+9)
MatchClass-12                      2.47µs ± 1%    2.26µs ± 6%   -8.55%  (p=0.000 n=8+10)
MatchClass_InRange-12              2.18µs ± 5%    2.25µs ±11%   +2.85%  (p=0.009 n=9+10)
ReplaceAll-12                      2.35µs ± 6%    2.08µs ±23%  -11.59%  (p=0.010 n=9+10)
AnchoredLiteralShortNonMatch-12    93.2ns ± 9%    93.2ns ±11%     ~     (p=0.716 n=10+10)
AnchoredLiteralLongNonMatch-12      118ns ±10%     117ns ± 9%     ~     (p=0.802 n=10+10)
AnchoredShortMatch-12               142ns ± 1%     141ns ± 1%   -0.53%  (p=0.007 n=8+8)
AnchoredLongMatch-12                303ns ± 9%     304ns ± 6%     ~     (p=0.724 n=10+10)
OnePassShortA-12                    620ns ± 1%     618ns ± 9%     ~     (p=0.162 n=8+10)
NotOnePassShortA-12                 599ns ± 8%     568ns ± 1%   -5.21%  (p=0.000 n=10+8)
OnePassShortB-12                    525ns ± 7%     489ns ± 1%   -6.93%  (p=0.000 n=10+8)
NotOnePassShortB-12                 449ns ± 9%     431ns ±11%   -4.05%  (p=0.033 n=10+10)
OnePassLongPrefix-12                119ns ± 6%     114ns ± 0%   -3.88%  (p=0.006 n=10+9)
OnePassLongNotPrefix-12             420ns ± 9%     410ns ± 7%     ~     (p=0.645 n=10+9)
MatchParallelShared-12              376ns ± 0%     375ns ± 0%   -0.45%  (p=0.003 n=8+10)
MatchParallelCopied-12             39.4ns ± 1%    39.1ns ± 0%   -0.55%  (p=0.004 n=10+9)
QuoteMetaAll-12                     139ns ± 7%     142ns ± 7%     ~     (p=0.445 n=10+10)
QuoteMetaNone-12                   56.7ns ± 0%    61.3ns ± 7%   +8.03%  (p=0.001 n=8+10)
Match/Easy0/32-12                  83.4ns ± 7%    83.1ns ± 8%     ~     (p=0.541 n=10+10)
Match/Easy0/1K-12                   417ns ± 8%     394ns ± 6%     ~     (p=0.059 n=10+9)
Match/Easy0/32K-12                 7.05µs ± 8%    7.30µs ± 9%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                   291µs ±17%     284µs ±10%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12                 9.89ms ± 4%   10.27ms ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12                 1.13µs ± 1%    1.14µs ± 1%   +1.51%  (p=0.000 n=8+8)
Match/Easy0i/1K-12                 35.7µs ±11%    36.8µs ±10%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12                1.70ms ± 7%    1.72ms ± 7%     ~     (p=0.776 n=9+6)

name                             old alloc/op   new alloc/op   delta
Find-12                             0.00B          0.00B          ~     (all equal)
FindAllNoMatches-12                 0.00B          0.00B          ~     (all equal)
FindString-12                       0.00B          0.00B          ~     (all equal)
FindSubmatch-12                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
FindStringSubmatch-12               32.0B ± 0%     32.0B ± 0%     ~     (all equal)

name                             old allocs/op  new allocs/op  delta
Find-12                              0.00           0.00          ~     (all equal)
FindAllNoMatches-12                  0.00           0.00          ~     (all equal)
FindString-12                        0.00           0.00          ~     (all equal)
FindSubmatch-12                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
FindStringSubmatch-12                1.00 ± 0%      1.00 ± 0%     ~     (all equal)

name                             old speed      new speed      delta
QuoteMetaAll-12                   101MB/s ± 8%    99MB/s ± 7%     ~     (p=0.529 n=10+10)
QuoteMetaNone-12                  458MB/s ± 0%   425MB/s ± 8%   -7.22%  (p=0.003 n=8+10)
Match/Easy0/32-12                 385MB/s ± 7%   386MB/s ± 7%     ~     (p=0.579 n=10+10)
Match/Easy0/1K-12                2.46GB/s ± 8%  2.60GB/s ± 6%     ~     (p=0.065 n=10+9)
Match/Easy0/32K-12               4.66GB/s ± 7%  4.50GB/s ±10%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                3.63GB/s ±15%  3.70GB/s ± 9%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12               3.40GB/s ± 4%  3.28GB/s ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12               28.4MB/s ± 1%  28.0MB/s ± 1%   -1.50%  (p=0.000 n=8+8)
Match/Easy0i/1K-12               28.8MB/s ±10%  27.9MB/s ±11%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12              19.0MB/s ±14%  19.1MB/s ± 8%     ~     (p=1.000 n=10+6)

Change-Id: I238a451b36ad84b0f5534ff0af5c077a0d52d73a
Reviewed-on: https://go-review.googlesource.com/130417
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot

gopherbot commented Aug 26, 2018

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@josharian
Contributor

josharian commented Aug 26, 2018

Reopening to prevent @junyer having to relocate his ideas yet again. :)

@josharian josharian reopened this Aug 26, 2018
@mvdan mvdan removed the WaitingForInfo label Aug 26, 2018
@mvdan mvdan reopened this Aug 26, 2018
@mvdan
Member

mvdan commented Aug 26, 2018

I think @gopherbot needs to be taught some manners.

@agnivade
Contributor

agnivade commented Nov 15, 2018

That benchmark site has not been updated in a year.

I just ran the input with the tip compiler (go version devel +aa20ae4853 Mon Nov 12 23:07:25 2018 +0530 linux/amd64), along with converting the benchmark to an idiomatic one.

Code -

package main

import (
	"bytes"
	"log"
	"os"
	"regexp"
	"testing"
)

var matches []string
var count int

func measure(data string, pattern string, b *testing.B) {
	r, err := regexp.Compile(pattern)
	if err != nil {
		log.Fatal(err)
	}

	for i := 0; i < b.N; i++ {
		matches = r.FindAllString(data, -1)
		count = len(matches)
	}
}

func BenchmarkAll(b *testing.B) {
	filerc, err := os.Open(os.Getenv("FILE_NAME"))
	if err != nil {
		log.Fatal(err)
	}
	defer filerc.Close()

	buf := new(bytes.Buffer)
	buf.ReadFrom(filerc)
	data := buf.String()
	// Email
	b.Run("Email", func(b *testing.B) {
		measure(data, `[\w\.+-]+@[\w\.-]+\.[\w\.-]+`, b)
	})

	// URI
	b.Run("URI", func(b *testing.B) {
		measure(data, `[\w]+://[^/\s?#]+[^\s?#]+(?:\?[^\s#]*)?(?:#[^\s]*)?`, b)
	})

	// IP
	b.Run("IP", func(b *testing.B) {
		measure(data, `(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])`, b)
	})
}

With that, if I compare with 1.10, there is a substantial improvement now:

$ benchstat go1.10.txt tip.txt
name         old time/op  new time/op  delta
All/Email-4   507ms ± 1%   410ms ± 1%  -19.03%  (p=0.008 n=5+5)
All/URI-4     496ms ± 1%   398ms ± 1%  -19.86%  (p=0.008 n=5+5)
All/IP-4      805ms ± 0%   607ms ± 1%  -24.63%  (p=0.008 n=5+5)

The total is now 1415 ms, which puts Go ahead of Python 3 (1475.09 ms in the original table). Going by the original issue title, I'd say it is pretty much resolved.

Only @junyer's comments here have some concrete suggestions to improve.

I don't know whether they are still applicable at the current tip. I will let someone investigate that and repurpose the issue to that effect.

@agnivade agnivade changed the title regexp: Go's regex is even slower than Python regexp: investigate further performance improvements Nov 15, 2018
@agnivade agnivade added the NeedsInvestigation label Nov 15, 2018
@junyer
Contributor

junyer commented Nov 27, 2018

As per https://perf.golang.org/search?q=upload:20181127.2, moving the syntax.InstRune* match checks from step() to add() seems helpful.

Note that this is the regular expression used for the Match/Hard1/* benchmarks:

        {"Hard1", "ABCD|CDEF|EFGH|GHIJ|IJKL|KLMN|MNOP|OPQR|QRST|STUV|UVWX|WXYZ"},

@junyer
Contributor

junyer commented Nov 27, 2018

As per https://perf.golang.org/search?q=upload:20181127.3, letting cap have copy-on-write semantics seems additionally helpful.
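
One possible reading of "copy-on-write semantics" for cap, as a toy sketch: the thread type and its methods below are invented for illustration and are not the regexp package's internals. Threads share one capture slice when they fork, and a copy is made only when a thread needs to record a new position.

```go
package main

import "fmt"

// thread is a simplified NFA thread: a program counter plus capture
// positions. The cap slice may be shared with other threads.
type thread struct {
	pc  int
	cap []int
}

// fork returns a thread at pc that shares t's capture slice without copying.
func (t thread) fork(pc int) thread {
	return thread{pc: pc, cap: t.cap}
}

// setCap returns a thread with slot i set to pos. The slice is copied
// first, so threads still sharing the old slice are unaffected.
func (t thread) setCap(i, pos int) thread {
	c := make([]int, len(t.cap))
	copy(c, t.cap)
	c[i] = pos
	t.cap = c
	return t
}

func main() {
	a := thread{pc: 0, cap: []int{-1, -1}}
	b := a.fork(1)      // shares a's slice
	c := b.setCap(0, 3) // copies, then writes
	fmt.Println(a.cap, b.cap, c.cap)
}
```

A real implementation would presumably reuse buffers instead of allocating on every write; the sketch only shows the sharing rule.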

@junyer
Contributor

junyer commented Nov 27, 2018

Both of those are quite trivial changes. I also suggested reducing the overhead of syntax.InstAlt instructions, but that would require some design discussion, so I'm not tinkering with that tonight.

@CAFxX
Contributor

CAFxX commented Aug 28, 2019

This may be a bit of a wild idea (that would likely need to be tracked in a separate issue, but as I'm not even sure how feasible it is and as it's related to this pretty open-ended issue I'm dumping it here) but it came to mind while reading this old article from @dgryski: would it make sense, for regexp source expressions known at compile time, having the go compiler compile the regexp and then emit native machine code implementing doExecute for that specific expression (kind-of like ragel)? At runtime regexp.Compile would somehow discover that the expression has been already compiled to native code, and would use the precompiled doExecute.

To avoid having to generate native code directly just for the regexp it would probably be ok to generate go code and let the rest of the go compiler handle that.

@smasher164
Member

smasher164 commented Sep 8, 2019

having the go compiler compile the regexp and then emit native machine code

There is precedent for this, namely CTRE for C++, and what used to be rust's compile-time regex macro. The optimization doesn't seem too farfetched to me, since it's similar to how intrinsics and SSA rules can redirect usage of the standard library to different implementations.

As a side note, it would be an interesting project for someone to put together a go generate tool and ctre package that mirrored regexp's API, but generated code at compile time.
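
To give a feel for what such a generator might emit, here is a hand-written sketch for the anchored pattern ^[+-]?[0-9]+$. No such tool or ctre package is known to exist here; the function name matchSignedInt and the shape of the output are assumptions. The ordinary regexp is kept only as a reference to cross-check the specialized code.

```go
package main

import (
	"fmt"
	"regexp"
)

// reference is the normal runtime-compiled pattern, used only for comparison.
var reference = regexp.MustCompile(`^[+-]?[0-9]+$`)

// matchSignedInt is the kind of straight-line code a hypothetical generator
// could produce for ^[+-]?[0-9]+$. Indexing bytes is safe here because
// every character the pattern accepts is ASCII.
func matchSignedInt(s string) bool {
	i := 0
	// [+-]? : optional sign
	if i < len(s) && (s[i] == '+' || s[i] == '-') {
		i++
	}
	// [0-9]+ : one or more digits
	start := i
	for i < len(s) && s[i] >= '0' && s[i] <= '9' {
		i++
	}
	// $ : at least one digit was consumed and nothing is left over
	return i > start && i == len(s)
}

func main() {
	for _, s := range []string{"42", "-7", "+", "", "12a", "--1"} {
		fmt.Printf("%q generated=%v regexp=%v\n", s, matchSignedInt(s), reference.MatchString(s))
	}
}
```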

@junyer
Contributor

junyer commented Sep 8, 2019

CTRE is a veritable nightmare of C++ template metaprogramming. Let us never speak of it again.

The state of the art for code generation is probably Trofimovich's TDFA work for RE2C. See http://re2c.org/2017_trofimovich_tagged_deterministic_finite_automata_with_lookahead.pdf.

@prasad83

prasad83 commented Feb 13, 2020

This PCRE-based regex gave better performance than regexp; still, PHP's preg_match outperformed it in my test (Apache access log parsing).

ayoyu added a commit to ayoyu/flashtext that referenced this issue Jan 10, 2022
golang regex is so slow: golang/go#26623
---------------------------------
keys_size  | FlashText (s) | Regex (s)
10         | 0.00053121    | 0.007381449
1010       | 0.000902698   | 1.021105121
2010       | 0.001164453   | 2.155324188
3010       | 0.001272009   | 3.189556999
4010       | 0.001415052   | 4.489287341
5010       | 0.00151844    | 5.662644436
6010       | 0.001601235   | 6.820220812
7010       | 0.001711219   | 7.845579981
8010       | 0.001785076   | 9.740038207
...
@candlerb

candlerb commented Feb 4, 2022

@CAFxX:

would it make sense, for regexp source expressions known at compile time, having the go compiler compile the regexp and then emit native machine code implementing doExecute for that specific expression

That makes me think of a couple of less extreme optimisations:

  1. for an expression of the form regexp.MustCompile(string_literal), at compile-time build the result value, and either store it immutably in the code segment, or clone at runtime. In fact, I guess this could be extended to any function call which can be marked "pure" and has constant arguments.

    Unfortunately this won't help benchmarks which loop over lists of strings to test as regexps. And it won't help real-world code much; they can get almost the same benefit if they do var xxx = regexp.MustCompile(...) globally.

  2. regexp.MustCompile keeps a cache, i.e. map[string]*Regexp. It would be interesting to modify the benchmarks to do this explicitly, and see what difference it makes. The cache should have a size limit to avoid problems with dynamically-generated regexps.
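
The second idea, done explicitly in user code, might look like the sketch below. The type regexpCache and its methods are hypothetical, and the eviction policy (drop one arbitrary entry when full) is a placeholder for something smarter such as LRU.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// regexpCache is a hypothetical size-limited cache of compiled patterns.
type regexpCache struct {
	mu  sync.Mutex
	max int
	m   map[string]*regexp.Regexp
}

func newRegexpCache(max int) *regexpCache {
	return &regexpCache{max: max, m: make(map[string]*regexp.Regexp)}
}

// get returns the cached *regexp.Regexp for pattern, compiling it on a miss.
// Compilation happens under the lock, which keeps the sketch simple but
// serializes concurrent misses.
func (c *regexpCache) get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.m[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	if c.max <= 0 {
		return re, nil // caching disabled
	}
	if len(c.m) >= c.max {
		// Crude eviction: remove one arbitrary entry to stay within max.
		for k := range c.m {
			delete(c.m, k)
			break
		}
	}
	c.m[pattern] = re
	return re, nil
}

func main() {
	cache := newRegexpCache(2)
	a, _ := cache.get(`[0-9]+`)
	b, _ := cache.get(`[0-9]+`)
	fmt.Println(a == b) // same compiled value on the second call
}
```

The size limit is what keeps dynamically generated patterns from growing the map without bound.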

@zikaeroh
Contributor

zikaeroh commented Feb 8, 2022

FWIW there's a good number of unreviewed CLs by @bboreham improving regexp performance that have been sitting for a cycle or so:

Of the list, CL 355789 is exceptionally small (take the address of a big struct instead of copying it; 4 line change) compared to its 30-40% perf benefit.

@zikaeroh
Contributor

zikaeroh commented Feb 9, 2022

And, the benchstat with those 5 applied:

Time
name                            old time/op    new time/op     delta
pkg:regexp goos:linux goarch:amd64
Find-8                             158ns ±27%      130ns ±29%    -17.37%  (p=0.023 n=10+10)
FindAllNoMatches-8                62.1ns ± 4%     59.2ns ± 1%     -4.58%  (p=0.000 n=8+8)
FindString-8                       148ns ±23%      128ns ±28%    -13.39%  (p=0.023 n=10+10)
FindSubmatch-8                     184ns ± 8%      172ns ± 5%     -6.45%  (p=0.017 n=10+10)
FindStringSubmatch-8               175ns ±11%      194ns ±29%       ~     (p=1.000 n=10+10)
Literal-8                         39.3ns ± 5%     39.6ns ±15%       ~     (p=0.315 n=9+10)
NotLiteral-8                       865ns ±13%      773ns ±20%    -10.72%  (p=0.007 n=10+10)
MatchClass-8                      1.05µs ± 1%     1.01µs ± 7%     -4.22%  (p=0.016 n=9+10)
MatchClass_InRange-8              1.01µs ± 2%     1.00µs ± 6%       ~     (p=0.734 n=9+10)
ReplaceAll-8                       667ns ± 9%      652ns ± 5%       ~     (p=0.156 n=9+10)
AnchoredLiteralShortNonMatch-8    38.1ns ±25%     20.3ns ±11%    -46.69%  (p=0.000 n=10+10)
AnchoredLiteralLongNonMatch-8     46.8ns ± 0%     19.5ns ± 1%    -58.26%  (p=0.001 n=6+8)
AnchoredShortMatch-8              57.9ns ± 1%     43.1ns ± 2%    -25.64%  (p=0.000 n=8+9)
AnchoredLongMatch-8                143ns ±12%       43ns ± 1%    -70.15%  (p=0.000 n=10+8)
OnePassShortA-8                    286ns ± 2%      215ns ± 2%    -24.82%  (p=0.000 n=10+10)
NotOnePassShortA-8                 390ns ±29%      282ns ±38%    -27.70%  (p=0.001 n=10+10)
OnePassShortB-8                    211ns ± 2%      176ns ± 1%    -16.76%  (p=0.000 n=10+10)
NotOnePassShortB-8                 235ns ±20%      175ns ± 1%    -25.32%  (p=0.000 n=10+10)
OnePassLongNotPrefix-8             166ns ± 3%      124ns ± 2%    -25.02%  (p=0.000 n=9+10)
MatchParallelShared-8             53.1ns ± 1%     39.2ns ± 5%    -26.07%  (p=0.000 n=8+8)
MatchParallelCopied-8             42.0ns ±25%     52.4ns ±46%       ~     (p=0.063 n=10+10)
QuoteMetaAll-8                    69.2ns ±25%     59.1ns ± 2%       ~     (p=0.968 n=10+9)
QuoteMetaNone-8                   31.9ns ± 2%     43.0ns ±24%    +34.55%  (p=0.000 n=9+10)
Compile/Onepass-8                 3.61µs ±18%     3.53µs ±25%       ~     (p=0.739 n=10+10)
Compile/Medium-8                  5.70µs ± 2%     9.31µs ± 1%    +63.16%  (p=0.000 n=10+8)
Compile/Hard-8                    52.5µs ±24%     50.5µs ±27%       ~     (p=0.739 n=10+10)
Match/Easy0/16-8                  2.87ns ±38%     2.43ns ± 1%    -15.41%  (p=0.005 n=10+10)
Match/Easy0/32-8                  29.9ns ± 1%     29.3ns ± 1%     -2.06%  (p=0.000 n=8+8)
Match/Easy0/1K-8                   191ns ± 5%      192ns ± 1%       ~     (p=0.175 n=10+9)
Match/Easy0/32K-8                 2.54µs ± 1%     2.52µs ± 1%     -0.67%  (p=0.004 n=10+10)
Match/Easy0/1M-8                   159µs ± 7%      148µs ± 1%     -7.06%  (p=0.000 n=10+10)
Match/Easy0/32M-8                 5.94ms ± 0%     5.40ms ± 1%     -9.01%  (p=0.000 n=9+10)
Match/Easy0i/16-8                 2.49ns ± 1%     2.56ns ± 5%       ~     (p=0.083 n=9+10)
Match/Easy0i/32-8                  515ns ± 1%       61ns ± 7%    -88.08%  (p=0.000 n=9+10)
Match/Easy0i/1K-8                 15.3µs ± 4%      7.4µs ± 1%    -51.46%  (p=0.000 n=10+10)
Match/Easy0i/32K-8                 634µs ± 1%      275µs ± 1%    -56.68%  (p=0.000 n=9+10)
Match/Easy0i/1M-8                 21.7ms ±17%      8.9ms ± 1%    -58.96%  (p=0.000 n=10+10)
Match/Easy0i/32M-8                 686ms ± 2%      285ms ± 0%    -58.47%  (p=0.000 n=9+8)
Match/Easy1/16-8                  2.54ns ± 3%     2.46ns ± 1%     -3.12%  (p=0.016 n=10+10)
Match/Easy1/32-8                  28.5ns ± 5%     26.3ns ± 1%     -7.70%  (p=0.000 n=8+8)
Match/Easy1/1K-8                   375ns ± 3%      385ns ±11%       ~     (p=0.888 n=9+10)
Match/Easy1/32K-8                 19.2µs ± 1%     19.8µs ± 2%     +3.30%  (p=0.000 n=9+9)
Match/Easy1/1M-8                   730µs ±15%      677µs ± 5%     -7.25%  (p=0.002 n=9+10)
Match/Easy1/32M-8                 23.9ms ± 0%     21.9ms ± 2%     -8.36%  (p=0.000 n=8+8)
Match/Medium/16-8                 2.60ns ± 1%     2.46ns ± 1%     -5.60%  (p=0.000 n=9+10)
Match/Medium/32-8                  401ns ± 1%      410ns ± 2%     +2.18%  (p=0.000 n=10+7)
Match/Medium/1K-8                 16.0µs ± 8%     15.0µs ± 7%     -5.74%  (p=0.013 n=10+9)
Match/Medium/32K-8                 755µs ±18%      683µs ± 1%     -9.64%  (p=0.000 n=10+8)
Match/Medium/1M-8                 22.6ms ± 1%     24.7ms ±13%       ~     (p=1.000 n=9+10)
Match/Medium/32M-8                 767ms ± 1%      699ms ± 1%     -8.95%  (p=0.000 n=8+9)
Match/Hard/16-8                   2.60ns ± 1%     2.46ns ± 1%     -5.60%  (p=0.000 n=9+10)
Match/Hard/32-8                    770ns ± 5%      739ns ±29%       ~     (p=0.083 n=8+10)
Match/Hard/1K-8                   25.2µs ±24%     23.9µs ±22%     -5.35%  (p=0.029 n=10+10)
Match/Hard/32K-8                  1.40ms ±67%     0.97ms ± 6%    -31.04%  (p=0.004 n=10+8)
Match/Hard/1M-8                   30.9ms ± 2%     30.7ms ± 2%       ~     (p=0.167 n=8+9)
Match/Hard/32M-8                   1.08s ± 4%      1.02s ±15%     -6.12%  (p=0.040 n=9+9)
Match/Hard1/16-8                  2.29µs ± 5%     1.93µs ± 1%    -15.73%  (p=0.000 n=9+9)
Match/Hard1/32-8                  4.25µs ± 5%     4.51µs ±24%       ~     (p=0.515 n=8+10)
Match/Hard1/1K-8                   124µs ± 1%      122µs ± 1%     -2.26%  (p=0.000 n=8+9)
Match/Hard1/32K-8                 5.85ms ±30%     4.84ms ± 3%       ~     (p=0.146 n=10+8)
Match/Hard1/1M-8                   192ms ±37%      152ms ± 2%       ~     (p=0.083 n=10+8)
Match/Hard1/32M-8                  5.02s ± 4%      4.99s ± 5%       ~     (p=0.605 n=9+9)
Match_onepass_regex/16-8           207ns ± 3%      164ns ± 6%    -20.52%  (p=0.000 n=10+10)
Match_onepass_regex/32-8           367ns ± 2%      270ns ± 1%    -26.53%  (p=0.000 n=9+10)
Match_onepass_regex/1K-8          9.81µs ± 1%     7.39µs ± 1%    -24.67%  (p=0.000 n=9+10)
Match_onepass_regex/32K-8          313µs ± 1%      235µs ± 1%    -24.88%  (p=0.000 n=10+10)
Match_onepass_regex/1M-8          10.8ms ± 2%      7.5ms ± 1%    -30.34%  (p=0.000 n=8+10)
Match_onepass_regex/32M-8          339ms ± 1%      259ms ± 2%    -23.76%  (p=0.000 n=10+10)
pkg:regexp/syntax goos:linux goarch:amd64
EmptyOpContext-8                  93.3ns ± 1%     93.3ns ± 1%       ~     (p=0.986 n=10+10)
Alloc
name                            old alloc/op   new alloc/op    delta
pkg:regexp goos:linux goarch:amd64
Find-8                             0.00B           0.00B            ~     (all equal)
FindAllNoMatches-8                 0.00B           0.00B            ~     (all equal)
FindString-8                       0.00B           0.00B            ~     (all equal)
FindSubmatch-8                     48.0B ± 0%      48.0B ± 0%       ~     (all equal)
FindStringSubmatch-8               32.0B ± 0%      32.0B ± 0%       ~     (all equal)
Literal-8                          0.00B           0.00B            ~     (all equal)
NotLiteral-8                       0.00B           0.00B            ~     (all equal)
MatchClass-8                       0.00B           0.00B            ~     (all equal)
MatchClass_InRange-8               0.00B           0.00B            ~     (all equal)
ReplaceAll-8                       96.0B ± 0%      96.0B ± 0%       ~     (all equal)
AnchoredLiteralShortNonMatch-8     0.00B           0.00B            ~     (all equal)
AnchoredLiteralLongNonMatch-8      0.00B           0.00B            ~     (all equal)
AnchoredShortMatch-8               0.00B           0.00B            ~     (all equal)
AnchoredLongMatch-8                0.00B           0.00B            ~     (all equal)
OnePassShortA-8                    0.00B           0.00B            ~     (all equal)
NotOnePassShortA-8                 0.00B           0.00B            ~     (all equal)
OnePassShortB-8                    0.00B           0.00B            ~     (all equal)
NotOnePassShortB-8                 0.00B           0.00B            ~     (all equal)
OnePassLongNotPrefix-8             0.00B           0.00B            ~     (all equal)
MatchParallelShared-8              0.00B           0.00B            ~     (all equal)
MatchParallelCopied-8              0.00B           0.00B            ~     (all equal)
QuoteMetaAll-8                     64.0B ± 0%      64.0B ± 0%       ~     (all equal)
QuoteMetaNone-8                    0.00B           0.00B            ~     (all equal)
Compile/Onepass-8                 4.02kB ± 0%     4.04kB ± 0%     +0.40%  (p=0.000 n=10+10)
Compile/Medium-8                  9.39kB ± 0%     9.41kB ± 0%     +0.17%  (p=0.000 n=10+10)
Compile/Hard-8                    84.7kB ± 0%     84.7kB ± 0%     +0.02%  (p=0.000 n=10+10)
Match/Easy0/16-8                   0.00B           0.00B            ~     (all equal)
Match/Easy0/32-8                   0.00B           0.00B            ~     (all equal)
Match/Easy0/1K-8                   0.00B           0.00B            ~     (all equal)
Match/Easy0/32K-8                  0.00B           0.00B            ~     (all equal)
Match/Easy0/1M-8                   0.00B           0.00B            ~     (all equal)
Match/Easy0/32M-8                  18.2B ±76%      18.8B ±79%       ~     (p=0.125 n=10+10)
Match/Easy0i/16-8                  0.00B           0.00B            ~     (all equal)
Match/Easy0i/32-8                  0.00B           0.00B            ~     (all equal)
Match/Easy0i/1K-8                  0.00B           0.00B            ~     (all equal)
Match/Easy0i/32K-8                1.20B ±150%     0.40B ±150%       ~     (p=0.626 n=10+10)
Match/Easy0i/1M-8                  76.7B ±79%      35.2B ±80%       ~     (p=0.096 n=10+10)
Match/Easy0i/32M-8                2.44kB ±79%     1.06kB ±76%    -56.37%  (p=0.009 n=10+10)
Match/Easy1/16-8                   0.00B           0.00B            ~     (all equal)
Match/Easy1/32-8                   0.00B           0.00B            ~     (all equal)
Match/Easy1/1K-8                   0.00B           0.00B            ~     (all equal)
Match/Easy1/32K-8                  0.00B           0.00B            ~     (all equal)
Match/Easy1/1M-8                   3.70B ±19%     1.90B ±111%    -48.65%  (p=0.005 n=10+10)
Match/Easy1/32M-8                 67.6B ±132%     118.7B ± 3%       ~     (p=0.476 n=10+9)
Match/Medium/16-8                  0.00B           0.00B            ~     (all equal)
Match/Medium/32-8                  0.00B           0.00B            ~     (all equal)
Match/Medium/1K-8                  0.00B           0.00B            ~     (all equal)
Match/Medium/32K-8                1.90B ±111%     1.20B ±150%       ~     (p=0.465 n=10+10)
Match/Medium/1M-8                  81.7B ±78%      68.0B ±76%       ~     (p=0.165 n=10+10)
Match/Medium/32M-8                2.16kB ±76%    1.62kB ±102%       ~     (p=0.656 n=10+10)
Match/Hard/16-8                    0.00B           0.00B            ~     (all equal)
Match/Hard/32-8                    0.00B           0.00B            ~     (all equal)
Match/Hard/1K-8                    0.00B           0.00B            ~     (all equal)
Match/Hard/32K-8                  0.25B ±300%     3.00B ±100%       ~     (p=0.056 n=8+10)
Match/Hard/1M-8                   87.5B ±130%      98.7B ±77%       ~     (p=0.814 n=10+10)
Match/Hard/32M-8                  4.97kB ±79%    2.64kB ±152%    -46.88%  (p=0.027 n=10+10)
Match/Hard1/16-8                   0.00B           0.00B            ~     (all equal)
Match/Hard1/32-8                   0.00B           0.00B            ~     (all equal)
Match/Hard1/1K-8                   0.00B          2.00B ±100%      +Inf%  (p=0.023 n=8+10)
Match/Hard1/32K-8                 20.0B ±130%      16.4B ±83%       ~     (p=0.535 n=10+10)
Match/Hard1/1M-8                    607B ±76%       772B ±81%       ~     (p=0.650 n=10+10)
Match/Hard1/32M-8                 1.04kB ± 4%     5.44kB ±81%   +424.01%  (p=0.008 n=8+10)
Match_onepass_regex/16-8           0.00B           0.00B            ~     (all equal)
Match_onepass_regex/32-8           0.00B           0.00B            ~     (all equal)
Match_onepass_regex/1K-8           0.00B           0.00B            ~     (all equal)
Match_onepass_regex/32K-8          0.00B           0.00B            ~     (all equal)
Match_onepass_regex/1M-8           10.7B ±16%       7.0B ± 0%    -34.38%  (p=0.000 n=9+8)
Match_onepass_regex/32M-8           363B ± 5%       269B ±16%    -25.91%  (p=0.000 n=10+10)
pkg:regexp/syntax goos:linux goarch:amd64
EmptyOpContext-8                   0.00B           0.00B            ~     (all equal)

name                            old allocs/op  new allocs/op   delta
pkg:regexp goos:linux goarch:amd64
Find-8                              0.00            0.00            ~     (all equal)
FindAllNoMatches-8                  0.00            0.00            ~     (all equal)
FindString-8                        0.00            0.00            ~     (all equal)
FindSubmatch-8                      1.00 ± 0%       1.00 ± 0%       ~     (all equal)
FindStringSubmatch-8                1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Literal-8                           0.00            0.00            ~     (all equal)
NotLiteral-8                        0.00            0.00            ~     (all equal)
MatchClass-8                        0.00            0.00            ~     (all equal)
MatchClass_InRange-8                0.00            0.00            ~     (all equal)
ReplaceAll-8                        5.00 ± 0%       5.00 ± 0%       ~     (all equal)
AnchoredLiteralShortNonMatch-8      0.00            0.00            ~     (all equal)
AnchoredLiteralLongNonMatch-8       0.00            0.00            ~     (all equal)
AnchoredShortMatch-8                0.00            0.00            ~     (all equal)
AnchoredLongMatch-8                 0.00            0.00            ~     (all equal)
OnePassShortA-8                     0.00            0.00            ~     (all equal)
NotOnePassShortA-8                  0.00            0.00            ~     (all equal)
OnePassShortB-8                     0.00            0.00            ~     (all equal)
NotOnePassShortB-8                  0.00            0.00            ~     (all equal)
OnePassLongNotPrefix-8              0.00            0.00            ~     (all equal)
MatchParallelShared-8               0.00            0.00            ~     (all equal)
MatchParallelCopied-8               0.00            0.00            ~     (all equal)
QuoteMetaAll-8                      2.00 ± 0%       2.00 ± 0%       ~     (all equal)
QuoteMetaNone-8                     0.00            0.00            ~     (all equal)
Compile/Onepass-8                   52.0 ± 0%       52.0 ± 0%       ~     (all equal)
Compile/Medium-8                     112 ± 0%        112 ± 0%       ~     (all equal)
Compile/Hard-8                       424 ± 0%        424 ± 0%       ~     (all equal)
Match/Easy0/16-8                    0.00            0.00            ~     (all equal)
Match/Easy0/32-8                    0.00            0.00            ~     (all equal)
Match/Easy0/1K-8                    0.00            0.00            ~     (all equal)
Match/Easy0/32K-8                   0.00            0.00            ~     (all equal)
Match/Easy0/1M-8                    0.00            0.00            ~     (all equal)
Match/Easy0/32M-8                   0.00            0.00            ~     (all equal)
Match/Easy0i/16-8                   0.00            0.00            ~     (all equal)
Match/Easy0i/32-8                   0.00            0.00            ~     (all equal)
Match/Easy0i/1K-8                   0.00            0.00            ~     (all equal)
Match/Easy0i/32K-8                  0.00            0.00            ~     (all equal)
Match/Easy0i/1M-8                   0.00            0.00            ~     (all equal)
Match/Easy0i/32M-8                  5.20 ±81%      1.20 ±100%    -76.92%  (p=0.009 n=10+10)
Match/Easy1/16-8                    0.00            0.00            ~     (all equal)
Match/Easy1/32-8                    0.00            0.00            ~     (all equal)
Match/Easy1/1K-8                    0.00            0.00            ~     (all equal)
Match/Easy1/32K-8                   0.00            0.00            ~     (all equal)
Match/Easy1/1M-8                    0.00            0.00            ~     (all equal)
Match/Easy1/32M-8                   0.00            0.00            ~     (all equal)
Match/Medium/16-8                   0.00            0.00            ~     (all equal)
Match/Medium/32-8                   0.00            0.00            ~     (all equal)
Match/Medium/1K-8                   0.00            0.00            ~     (all equal)
Match/Medium/32K-8                  0.00            0.00            ~     (all equal)
Match/Medium/1M-8                   0.00            0.00            ~     (all equal)
Match/Medium/32M-8                  4.60 ±78%      3.40 ±106%       ~     (p=0.656 n=10+10)
Match/Hard/16-8                     0.00            0.00            ~     (all equal)
Match/Hard/32-8                     0.00            0.00            ~     (all equal)
Match/Hard/1K-8                     0.00            0.00            ~     (all equal)
Match/Hard/32K-8                    0.00            0.00            ~     (all equal)
Match/Hard/1M-8                     0.00            0.00            ~     (all equal)
Match/Hard/32M-8                    13.9 ±86%       7.1 ±168%    -48.92%  (p=0.027 n=10+10)
Match/Hard1/16-8                    0.00            0.00            ~     (all equal)
Match/Hard1/32-8                    0.00            0.00            ~     (all equal)
Match/Hard1/1K-8                    0.00            0.00            ~     (all equal)
Match/Hard1/32K-8                   0.00            0.00            ~     (all equal)
Match/Hard1/1M-8                   2.50 ±100%      3.50 ±100%       ~     (p=0.650 n=10+10)
Match/Hard1/32M-8                   2.25 ±78%      29.30 ±93%  +1202.22%  (p=0.008 n=8+10)
Match_onepass_regex/16-8            0.00            0.00            ~     (all equal)
Match_onepass_regex/32-8            0.00            0.00            ~     (all equal)
Match_onepass_regex/1K-8            0.00            0.00            ~     (all equal)
Match_onepass_regex/32K-8           0.00            0.00            ~     (all equal)
Match_onepass_regex/1M-8            0.00            0.00            ~     (all equal)
Match_onepass_regex/32M-8          0.60 ±100%       0.00        -100.00%  (p=0.011 n=10+10)
pkg:regexp/syntax goos:linux goarch:amd64
EmptyOpContext-8                    0.00            0.00            ~     (all equal)
Speed
name                            old speed      new speed       delta
pkg:regexp goos:linux goarch:amd64
QuoteMetaAll-8                   211MB/s ±27%    237MB/s ± 2%       ~     (p=0.968 n=10+9)
QuoteMetaNone-8                  814MB/s ± 2%    640MB/s ±24%    -21.36%  (p=0.000 n=9+10)
Match/Easy0/16-8                5.80GB/s ±30%   6.60GB/s ± 1%    +13.75%  (p=0.007 n=10+10)
Match/Easy0/32-8                1.07GB/s ± 1%   1.09GB/s ± 1%     +2.09%  (p=0.000 n=8+8)
Match/Easy0/1K-8                5.36GB/s ± 4%   5.33GB/s ± 1%       ~     (p=0.182 n=10+9)
Match/Easy0/32K-8               12.9GB/s ± 1%   13.0GB/s ± 1%     +0.68%  (p=0.004 n=10+10)
Match/Easy0/1M-8                6.59GB/s ± 7%   7.08GB/s ± 1%     +7.46%  (p=0.000 n=10+10)
Match/Easy0/32M-8               5.65GB/s ± 0%   6.21GB/s ± 1%     +9.90%  (p=0.000 n=9+10)
Match/Easy0i/16-8               6.44GB/s ± 1%   6.25GB/s ± 5%       ~     (p=0.079 n=9+10)
Match/Easy0i/32-8               62.2MB/s ± 1%  522.6MB/s ± 7%   +740.58%  (p=0.000 n=9+10)
Match/Easy0i/1K-8               67.1MB/s ± 4%  138.1MB/s ± 1%   +105.87%  (p=0.000 n=10+10)
Match/Easy0i/32K-8              51.7MB/s ± 1%  119.4MB/s ± 1%   +130.83%  (p=0.000 n=9+10)
Match/Easy0i/1M-8               48.7MB/s ±15%  117.9MB/s ± 1%   +142.15%  (p=0.000 n=10+10)
Match/Easy0i/32M-8              48.9MB/s ± 2%  117.9MB/s ± 0%   +140.78%  (p=0.000 n=9+8)
Match/Easy1/16-8                6.31GB/s ± 3%   6.51GB/s ± 1%     +3.14%  (p=0.019 n=10+10)
Match/Easy1/32-8                1.10GB/s ±14%   1.22GB/s ± 1%    +10.11%  (p=0.000 n=9+8)
Match/Easy1/1K-8                2.73GB/s ± 3%   2.67GB/s ±11%       ~     (p=0.905 n=9+10)
Match/Easy1/32K-8               1.71GB/s ± 1%   1.66GB/s ± 2%     -3.19%  (p=0.000 n=9+9)
Match/Easy1/1M-8                1.42GB/s ±14%   1.55GB/s ± 5%     +9.12%  (p=0.001 n=10+10)
Match/Easy1/32M-8               1.41GB/s ± 0%   1.53GB/s ± 2%     +9.13%  (p=0.000 n=8+8)
Match/Medium/16-8               6.15GB/s ± 1%   6.51GB/s ± 1%     +5.94%  (p=0.000 n=9+10)
Match/Medium/32-8               79.8MB/s ± 1%   78.1MB/s ± 2%     -2.12%  (p=0.000 n=10+7)
Match/Medium/1K-8               64.3MB/s ± 8%   68.1MB/s ± 6%     +5.93%  (p=0.013 n=10+9)
Match/Medium/32K-8              43.7MB/s ±16%   48.0MB/s ± 1%     +9.83%  (p=0.000 n=10+8)
Match/Medium/1M-8               46.4MB/s ± 1%   43.1MB/s ±13%       ~     (p=1.000 n=9+10)
Match/Medium/32M-8              43.7MB/s ± 1%   48.0MB/s ± 1%     +9.83%  (p=0.000 n=8+9)
Match/Hard/16-8                 6.15GB/s ± 1%   6.51GB/s ± 1%     +5.93%  (p=0.000 n=9+10)
Match/Hard/32-8                 41.6MB/s ± 5%   44.0MB/s ±24%       ~     (p=0.083 n=8+10)
Match/Hard/1K-8                 41.5MB/s ±21%   43.7MB/s ±20%     +5.24%  (p=0.028 n=10+10)
Match/Hard/32K-8                25.5MB/s ±45%   33.2MB/s ±17%    +30.12%  (p=0.008 n=10+9)
Match/Hard/1M-8                 33.9MB/s ± 2%   34.2MB/s ± 2%       ~     (p=0.145 n=8+9)
Match/Hard/32M-8                31.0MB/s ± 4%   32.3MB/s ±24%       ~     (p=0.133 n=9+10)
Match/Hard1/16-8                6.98MB/s ± 5%   8.27MB/s ± 1%    +18.62%  (p=0.000 n=9+9)
Match/Hard1/32-8                7.53MB/s ± 5%   7.29MB/s ±21%       ~     (p=0.500 n=8+10)
Match/Hard1/1K-8                8.23MB/s ± 1%   8.42MB/s ± 1%     +2.28%  (p=0.000 n=8+9)
Match/Hard1/32K-8               5.83MB/s ±26%   6.78MB/s ± 3%       ~     (p=0.152 n=10+8)
Match/Hard1/1M-8                5.69MB/s ±30%   6.91MB/s ± 2%       ~     (p=0.064 n=10+8)
Match/Hard1/32M-8               6.69MB/s ± 4%   6.74MB/s ± 5%       ~     (p=0.649 n=9+9)
Match_onepass_regex/16-8        77.4MB/s ± 3%   97.4MB/s ± 6%    +25.93%  (p=0.000 n=10+10)
Match_onepass_regex/32-8        87.2MB/s ± 2%  118.7MB/s ± 1%    +36.10%  (p=0.000 n=9+10)
Match_onepass_regex/1K-8         104MB/s ± 1%    139MB/s ± 1%    +32.75%  (p=0.000 n=9+10)
Match_onepass_regex/32K-8        105MB/s ± 1%    139MB/s ± 1%    +33.12%  (p=0.000 n=10+10)
Match_onepass_regex/1M-8        96.9MB/s ± 2%  139.1MB/s ± 1%    +43.54%  (p=0.000 n=8+10)
Match_onepass_regex/32M-8       98.9MB/s ± 1%  129.8MB/s ± 2%    +31.19%  (p=0.000 n=10+10)
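
For anyone reading the speed table above without the benchmark source at hand, here is a minimal sketch of how an MB/s figure and a delta column can be derived. It is not the actual `regexp` benchmark code. The pattern is assumed to resemble the Easy0i case, the input text is made up, and `throughputMBps`, `percentDelta`, and `measure` are hypothetical helper names. The values printed in the table are rounded, so recomputing a delta from them only approximates the reported percentage.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"time"
)

// throughputMBps converts a byte count and an elapsed time into MB/s,
// using 1 MB = 1e6 bytes (the convention Go's testing package uses).
func throughputMBps(n int64, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return float64(n) / 1e6 / elapsed.Seconds()
}

// percentDelta returns the relative change from old to new, in percent.
func percentDelta(old, new float64) float64 {
	return (new/old - 1) * 100
}

// measure runs re.Match over text iters times and reports the throughput.
func measure(re *regexp.Regexp, text []byte, iters int) float64 {
	start := time.Now()
	for i := 0; i < iters; i++ {
		_ = re.Match(text)
	}
	return throughputMBps(int64(len(text))*int64(iters), time.Since(start))
}

func main() {
	// Assumed pattern: a case-insensitive literal anchored at the end.
	re := regexp.MustCompile(`(?i)ABCDEFGHIJKLMNOPQRSTUVWXYZ$`)
	// Hypothetical 1000-byte input that never matches the pattern.
	text := bytes.Repeat([]byte("abcdefghij"), 100)

	fmt.Printf("%.1f MB/s\n", measure(re, text, 1000))
	// Delta recomputed from the rounded Easy0i/32 speeds in the table.
	fmt.Printf("%+.2f%%\n", percentDelta(62.2, 522.6))
}
```

A hand-rolled timing loop like this is much noisier than `go test -bench` combined with benchstat, so treat it only as an illustration of the arithmetic.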

Contributor

zikaeroh commented Jun 5, 2022

CL 355789 (the "exceptionally small" yet powerful CL mentioned previously) has been merged for Go 1.19! 🎉

(The CL didn't mention this issue, so the commit didn't trigger a thread update. I'm mentioning it here because so many people are following this thread. The other 4 CLs are still pending review.)
