Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: expand TestIntendedInlining to more packages and funcs #21851

Closed
mvdan opened this issue Sep 12, 2017 · 22 comments
Closed

cmd/compile: expand TestIntendedInlining to more packages and funcs #21851

mvdan opened this issue Sep 12, 2017 · 22 comments

Comments

@mvdan
Copy link
Member

mvdan commented Sep 12, 2017

#17566 is about improving the inlining cost model. Making changes to the compiler, especially touching the inlining heuristics, always has the risk of unintentionally making some funcs non-inlineable, potentially resulting in noticeable performance regressions.

That's why @josharian added TestIntendedInlining: https://go-review.googlesource.com/c/go/+/57410

It basically runs go build -a -gcflags=-m runtime and checks that certain funcs are printed in a line along with can inline.

However, the test is limited to the runtime. There are other small functions in the standard library that we want to be inlineable, and which we even tweak to appease the current heuristic. A recent example: https://go-review.googlesource.com/c/go/+/63332

I propose that we add a way to cover more funcs in other packages within std.

Thoughts:

  • We probably want to focus on low-level and basic packages first, like unicode/utf8 and bytes
  • Should we modify TestIntendedInlining, or add a new test? (given how it uses build -a, I presume a single test would be the fastest)
  • If it covers more than just runtime, I presume we would want the test out of the runtime package (where? cmd/compile?)

Suggestions of funcs/packages to add welcome. Happy to work on it if pointers on the above are given.

CC @TocarIP @josharian @martisch

@randall77
Copy link
Contributor

You can modify that test and move it to cmd/compile if you want.
My only worry is that "build -a std" would be slow. Try to restrict to just the packages that want a test.

@mvdan
Copy link
Member Author

mvdan commented Sep 12, 2017

Yes - this is partly why I mentioned focusing on certain packages first, e.g. the ones closest to runtime and with small dependency trees.

@TocarIP
Copy link
Contributor

TocarIP commented Sep 12, 2017

We can start with individual packages and switch to "build -a std" once enough packages are covered

@mvdan mvdan self-assigned this Sep 12, 2017
@mvdan mvdan changed the title Expand TestIntendedInlining to other packages in std cmd/compile: expand TestIntendedInlining to other packages in std Sep 12, 2017
@mvdan
Copy link
Member Author

mvdan commented Sep 13, 2017

@TocarIP I would assume that even after adding all the funcs we want, we still wouldn't need all of std. In any case, we can always look at that once the package list has grown large and we can measure the difference in test run time.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/63550 mentions this issue: cmd/compile: add TestIntendedInlining from runtime

gopherbot pushed a commit that referenced this issue Sep 13, 2017
Move it from the runtime package, as we will soon add more packages and
functions for it to check.

The test used the testEnv func, which cleaned certain environment
variables from a command, so it was moved to internal/testenv under a
more descriptive (and less ambiguous) name. Add a simple godoc to it
too.

For #21851.

Change-Id: I6f39c1f23b45377718355fafe66ffd87047d8ab6
Reviewed-on: https://go-review.googlesource.com/63550
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/63611 mentions this issue: cmd/compile: expand inlining test to multiple pkgs

gopherbot pushed a commit that referenced this issue Sep 14, 2017
Rework the test to work with any number of std packages. This was done
to include a few funcs from unicode/utf8. Adding more will be much
simpler too.

While at it, add more runtime funcs by searching for "inlined" or
"inlining" in the git log of its directory. These are: addb, subtractb,
fastrand and noescape.

Updates #21851.

Change-Id: I4fb2bd8aa6a5054218f9b36cb19d897ac533710e
Reviewed-on: https://go-review.googlesource.com/63611
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/63810 mentions this issue: cmd/compile: collect reasons in inlining test

gopherbot pushed a commit that referenced this issue Sep 22, 2017
If we use -gcflags='-m -m', the compiler should give us a reason why a
func couldn't be inlined. Add the extra -m necessary for that extra info
and use it to give better test failures. For example, for the func in
the TODO:

	--- FAIL: TestIntendedInlining (1.53s)
		inl_test.go:104: runtime.nextFreeFast was not inlined: function too complex

We might increase the number of -m flags to get more information at some
later point, such as getting details on how close the func was to the
inlining budget.

Also started using regexes, as the output parsing is getting a bit too
complex for manual string handling.

While at it, also refactored the test to not buffer the entire output
into memory. This is fine in practice, but it won't scale well as we add
more packages or we depend more on the compiler's debugging output.

For example, "go build -a -gcflags='-m -m' std" prints nearly 40MB of
plaintext - and we only need to see the output line by line anyway.

Updates #21851.

Change-Id: I00986ff360eb56e4e9737b65a6be749ef8540643
Reviewed-on: https://go-review.googlesource.com/63810
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
@mvdan mvdan changed the title cmd/compile: expand TestIntendedInlining to other packages in std cmd/compile: expand TestIntendedInlining to more packages and funcs Sep 22, 2017
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/65351 mentions this issue: cmd/compile: add more runtime funcs to inline test

gopherbot pushed a commit that referenced this issue Sep 22, 2017
This is based from a list that Keith Randall provided in mid-2016. These
are all funcs that, at the time, were important and small enough that
they should be clearly inlined.

The runtime has changed a bit since then. Ctz16 and Ctz8 were removed,
so don't add them. stringtoslicebytetmp was moved to the backend, so
it's no longer a Go function. And itabhash was moved to itabHashFunc.

The only other outlier is adjustctxt, which is not inlineable at the
moment. I've added a TODO and will address it myself in a separate
commit.

While at it, error if any funcs in the input table are duplicated.
They're never useful and typos could lead to unintentionally thinking a
function is inlineable when it actually isn't.

And, since the lists are getting long, start sorting alphabetically.

Finally, rotl_31 is only defined on 64-bit architectures, and the added
runtime/internal/sys funcs are assembly on 386 and thus non-inlineable
in that case.

Updates #21851.

Change-Id: Ib99ab53d777860270e8fd4aefc41adb448f13662
Reviewed-on: https://go-review.googlesource.com/65351
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/65654 mentions this issue: cmd/compile: add runtime GC funcs to inlining test

@stemar94
Copy link

stemar94 commented Sep 23, 2017

In https://golang.org/cl/42813 we made a change to have a method of bytes.Buffer to be inlined. We also added a test for that. Would this be a candidate here, and therefore remove the test in bytes package?

@mvdan
Copy link
Member Author

mvdan commented Sep 23, 2017

Yes, that would be good. There is no need to duplicate the logic to determine whether or not a func is inlineable (and all the extras, such as reporting why it isn't so).

Then again, technically the tests are checking for different things. TestIntendedInlining checks if a func is inlineable, which is a decision made at compile-time. Your bytes test checks the binary to see if the symbol exists, so it checks if the func was inlined. Are all funcs that are inlineable always inlined? I honestly don't know the answer to that question.

If the answer is yes, then it's safe to merge with this test.

@mvdan
Copy link
Member Author

mvdan commented Sep 23, 2017

I also grepped the Go repository, and did not find any other occurences of go tool nm to check if funcs are inlined.

@stemar94
Copy link

I also grepped the Go repository, and did not find any other occurences of go tool nm to check if funcs are inlined.

I know of at least
https://golang.org/src/net/http/http_test.go#L80

@mvdan
Copy link
Member Author

mvdan commented Sep 23, 2017

Well, I did see that one, but that checks for linker behavior - and it checks whether or not some funcs were linked into a binary, I assume because they should never be part of the binary in the first place (not even inlined).

@randall77
Copy link
Contributor

Inlineable functions might still appear in the binary if a closure is made out of them.

@mvdan
Copy link
Member Author

mvdan commented Sep 24, 2017

Thanks, @randall77. I'll assume it is fine to move the check then. CL incoming.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/65658 mentions this issue: cmd/compile: merge bytes inline test with the rest

gopherbot pushed a commit that referenced this issue Sep 25, 2017
In golang.org/cl/42813, a test was added in the bytes package to check
if a Buffer method was being inlined, using 'go tool nm'.

Now that we have a compiler test that verifies that certain funcs are
inlineable, merge it there. Knowing whether the funcs are inlineable is
also more reliable than whether or not their symbol appears in the
binary, too. For example, under some circumstances, inlineable funcs
can't be inlined, such as if closures are used.

While at it, add a few more bytes.Buffer methods that are currently
inlined and should clearly stay that way.

Updates #21851.

Change-Id: I62066e32ef5542d37908bd64f90bda51276da4de
Reviewed-on: https://go-review.googlesource.com/65658
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/65491 mentions this issue: regexp: make (*bitState).push inlinable

gopherbot pushed a commit that referenced this issue Sep 25, 2017
By refactoring job.arg from int with 0/1 as the only valid values into bool
and simplifying (*bitState).push, we reduce the number of nodes below the inlining threshold.
This improves backtracking regexp performance by 5-10% and go1 geomean  by 1.7%
Full performance data below:

name                                      old time/op    new time/op     delta
Find-6                                       510ns ± 0%      480ns ± 1%   -5.90%  (p=0.000 n=10+10)
FindString-6                                 504ns ± 1%      479ns ± 1%   -5.10%  (p=0.000 n=10+10)
FindSubmatch-6                               689ns ± 1%      659ns ± 1%   -4.27%  (p=0.000 n=9+10)
FindStringSubmatch-6                         659ns ± 0%      628ns ± 1%   -4.69%  (p=0.000 n=8+10)
Literal-6                                    174ns ± 1%      171ns ± 1%   -1.50%  (p=0.000 n=10+10)
NotLiteral-6                                2.89µs ± 1%     2.72µs ± 0%   -5.84%  (p=0.000 n=10+9)
MatchClass-6                                4.65µs ± 1%     4.28µs ± 1%   -7.96%  (p=0.000 n=10+10)
MatchClass_InRange-6                        4.15µs ± 1%     3.80µs ± 0%   -8.61%  (p=0.000 n=10+8)
ReplaceAll-6                                2.72µs ± 1%     2.60µs ± 1%   -4.68%  (p=0.000 n=10+10)
AnchoredLiteralShortNonMatch-6               158ns ± 1%      153ns ± 1%   -3.03%  (p=0.000 n=10+10)
AnchoredLiteralLongNonMatch-6                176ns ± 1%      176ns ± 0%     ~     (p=1.000 n=10+9)
AnchoredShortMatch-6                         260ns ± 0%      255ns ± 1%   -1.84%  (p=0.000 n=9+10)
AnchoredLongMatch-6                          456ns ± 0%      455ns ± 0%   -0.19%  (p=0.008 n=8+10)
OnePassShortA-6                             1.13µs ± 1%     1.12µs ± 0%   -0.57%  (p=0.046 n=10+8)
NotOnePassShortA-6                          1.14µs ± 1%     1.14µs ± 1%     ~     (p=0.162 n=10+10)
OnePassShortB-6                              908ns ± 0%      893ns ± 0%   -1.60%  (p=0.000 n=8+9)
NotOnePassShortB-6                           857ns ± 0%      803ns ± 1%   -6.34%  (p=0.000 n=8+10)
OnePassLongPrefix-6                          190ns ± 0%      190ns ± 1%     ~     (p=0.059 n=8+10)
OnePassLongNotPrefix-6                       722ns ± 1%      722ns ± 1%     ~     (p=0.451 n=10+10)
MatchParallelShared-6                        810ns ± 2%      807ns ± 2%     ~     (p=0.643 n=10+10)
MatchParallelCopied-6                       72.1ns ± 1%     69.4ns ± 1%   -3.81%  (p=0.000 n=10+10)
QuoteMetaAll-6                               213ns ± 2%      216ns ± 3%     ~     (p=0.284 n=10+10)
QuoteMetaNone-6                             89.7ns ± 1%     89.8ns ± 1%     ~     (p=0.616 n=10+10)
Match/Easy0/32-6                             127ns ± 1%      127ns ± 1%     ~     (p=0.977 n=10+10)
Match/Easy0/1K-6                             566ns ± 0%      566ns ± 0%     ~     (p=1.000 n=8+8)
Match/Easy0/32K-6                           9.30µs ± 1%     9.28µs ± 1%     ~     (p=0.529 n=10+10)
Match/Easy0/1M-6                             460µs ± 1%      460µs ± 1%     ~     (p=0.853 n=10+10)
Match/Easy0/32M-6                           15.0ms ± 0%     15.1ms ± 0%   +0.77%  (p=0.000 n=9+8)
Match/Easy0i/32-6                           2.10µs ± 1%     1.98µs ± 0%   -6.02%  (p=0.000 n=10+8)
Match/Easy0i/1K-6                           61.5µs ± 0%     57.2µs ± 0%   -6.97%  (p=0.000 n=10+9)
Match/Easy0i/32K-6                          2.75ms ± 0%     2.72ms ± 0%   -1.10%  (p=0.000 n=9+9)
Match/Easy0i/1M-6                           88.0ms ± 0%     86.9ms ± 1%   -1.29%  (p=0.000 n=8+10)
Match/Easy0i/32M-6                           2.82s ± 0%      2.77s ± 1%   -1.81%  (p=0.000 n=8+10)
Match/Easy1/32-6                             123ns ± 1%      124ns ± 1%   +0.90%  (p=0.001 n=10+10)
Match/Easy1/1K-6                            1.70µs ± 1%     1.65µs ± 0%   -3.18%  (p=0.000 n=9+10)
Match/Easy1/32K-6                           69.1µs ± 0%     68.4µs ± 1%   -0.95%  (p=0.000 n=8+10)
Match/Easy1/1M-6                            2.46ms ± 1%     2.42ms ± 1%   -1.66%  (p=0.000 n=10+10)
Match/Easy1/32M-6                           78.4ms ± 1%     77.5ms ± 0%   -1.08%  (p=0.000 n=10+9)
Match/Medium/32-6                           2.07µs ± 1%     1.91µs ± 1%   -7.69%  (p=0.000 n=10+10)
Match/Medium/1K-6                           62.8µs ± 0%     58.0µs ± 1%   -7.70%  (p=0.000 n=8+10)
Match/Medium/32K-6                          2.63ms ± 1%     2.58ms ± 1%   -2.14%  (p=0.000 n=10+10)
Match/Medium/1M-6                           84.6ms ± 0%     82.5ms ± 0%   -2.37%  (p=0.000 n=8+9)
Match/Medium/32M-6                           2.71s ± 0%      2.64s ± 0%   -2.46%  (p=0.000 n=10+9)
Match/Hard/32-6                             3.26µs ± 1%     2.98µs ± 1%   -8.49%  (p=0.000 n=10+10)
Match/Hard/1K-6                              100µs ± 0%       90µs ± 1%   -9.55%  (p=0.000 n=9+10)
Match/Hard/32K-6                            3.82ms ± 0%     3.82ms ± 1%     ~     (p=0.515 n=8+10)
Match/Hard/1M-6                              122ms ± 1%      123ms ± 0%   +0.66%  (p=0.000 n=10+8)
Match/Hard/32M-6                             3.89s ± 1%      3.91s ± 1%     ~     (p=0.105 n=10+10)
Match/Hard1/32-6                            18.1µs ± 1%     16.1µs ± 1%  -11.31%  (p=0.000 n=10+10)
Match/Hard1/1K-6                             565µs ± 0%      493µs ± 1%  -12.65%  (p=0.000 n=8+10)
Match/Hard1/32K-6                           18.8ms ± 0%     18.8ms ± 1%     ~     (p=0.905 n=9+10)
Match/Hard1/1M-6                             602ms ± 1%      602ms ± 1%     ~     (p=0.278 n=9+10)
Match/Hard1/32M-6                            19.1s ± 1%      19.2s ± 1%   +0.31%  (p=0.035 n=9+10)
Match_onepass_regex/32-6                    6.32µs ± 1%     6.34µs ± 1%     ~     (p=0.060 n=10+10)
Match_onepass_regex/1K-6                     204µs ± 1%      204µs ± 1%     ~     (p=0.842 n=9+10)
Match_onepass_regex/32K-6                   6.53ms ± 0%     6.55ms ± 1%   +0.36%  (p=0.005 n=10+10)
Match_onepass_regex/1M-6                     209ms ± 0%      208ms ± 1%   -0.65%  (p=0.034 n=8+10)
Match_onepass_regex/32M-6                    6.72s ± 0%      6.68s ± 1%   -0.74%  (p=0.000 n=9+10)
CompileOnepass/^(?:(?:(?:.(?:$))?))...-6    7.02µs ± 1%     7.02µs ± 1%     ~     (p=0.671 n=10+10)
CompileOnepass/^abcd$-6                     5.65µs ± 1%     5.65µs ± 1%     ~     (p=0.411 n=10+9)
CompileOnepass/^(?:(?:a{0,})*?)$-6          7.06µs ± 1%     7.06µs ± 1%     ~     (p=0.912 n=10+10)
CompileOnepass/^(?:(?:a+)*)$-6              6.40µs ± 1%     6.41µs ± 1%     ~     (p=0.699 n=10+10)
CompileOnepass/^(?:(?:a|(?:aa)))$-6         8.18µs ± 2%     8.16µs ± 1%     ~     (p=0.529 n=10+10)
CompileOnepass/^(?:[^\s\S])$-6              5.08µs ± 1%     5.17µs ± 1%   +1.77%  (p=0.000 n=9+10)
CompileOnepass/^(?:(?:(?:a*)+))$-6          6.86µs ± 1%     6.85µs ± 0%     ~     (p=0.190 n=10+9)
CompileOnepass/^[a-c]+$-6                   5.14µs ± 1%     5.11µs ± 0%   -0.53%  (p=0.041 n=10+10)
CompileOnepass/^[a-c]*$-6                   5.62µs ± 1%     5.63µs ± 1%     ~     (p=0.382 n=10+10)
CompileOnepass/^(?:a*)$-6                   5.76µs ± 1%     5.73µs ± 1%   -0.41%  (p=0.008 n=9+10)
CompileOnepass/^(?:(?:aa)|a)$-6             7.89µs ± 1%     7.84µs ± 1%   -0.66%  (p=0.020 n=10+10)
CompileOnepass/^...$-6                      5.38µs ± 1%     5.38µs ± 1%     ~     (p=0.857 n=9+10)
CompileOnepass/^(?:a|(?:aa))$-6             7.80µs ± 2%     7.82µs ± 1%     ~     (p=0.342 n=10+10)
CompileOnepass/^a((b))c$-6                  7.75µs ± 1%     7.78µs ± 1%     ~     (p=0.172 n=10+10)
CompileOnepass/^a.[l-nA-Cg-j]?e$-6          8.39µs ± 1%     8.42µs ± 1%     ~     (p=0.138 n=10+10)
CompileOnepass/^a((b))$-6                   6.92µs ± 1%     6.95µs ± 1%     ~     (p=0.159 n=10+10)
CompileOnepass/^a(?:(b)|(c))c$-6            10.0µs ± 1%     10.0µs ± 1%     ~     (p=0.896 n=10+10)
CompileOnepass/^a(?:b|c)$-6                 5.62µs ± 1%     5.66µs ± 1%   +0.71%  (p=0.023 n=10+10)
CompileOnepass/^a(?:b?|c)$-6                8.49µs ± 1%     8.43µs ± 1%   -0.69%  (p=0.010 n=10+10)
CompileOnepass/^a(?:b?|c+)$-6               9.26µs ± 1%     9.28µs ± 1%     ~     (p=0.448 n=10+10)
CompileOnepass/^a(?:bc)+$-6                 6.52µs ± 1%     6.46µs ± 2%   -1.02%  (p=0.003 n=10+10)
CompileOnepass/^a(?:[bcd])+$-6              6.29µs ± 1%     6.32µs ± 1%     ~     (p=0.256 n=10+10)
CompileOnepass/^a((?:[bcd])+)$-6            7.77µs ± 1%     7.79µs ± 1%     ~     (p=0.105 n=10+10)
CompileOnepass/^a(:?b|c)*d$-6               14.0µs ± 1%     13.9µs ± 1%   -0.69%  (p=0.003 n=10+10)
CompileOnepass/^.bc(d|e)*$-6                8.96µs ± 1%     9.06µs ± 1%   +1.20%  (p=0.000 n=10+9)
CompileOnepass/^loooooooooooooooooo...-6     219µs ± 1%      220µs ± 1%   +0.63%  (p=0.006 n=9+10)
[Geo mean]                                  31.6µs          31.1µs        -1.82%

name                                      old speed      new speed       delta
QuoteMetaAll-6                            65.5MB/s ± 2%   64.8MB/s ± 3%     ~     (p=0.315 n=10+10)
QuoteMetaNone-6                            290MB/s ± 1%    290MB/s ± 1%     ~     (p=0.755 n=10+10)
Match/Easy0/32-6                           250MB/s ± 0%    251MB/s ± 1%     ~     (p=0.277 n=8+9)
Match/Easy0/1K-6                          1.81GB/s ± 0%   1.81GB/s ± 0%     ~     (p=0.408 n=8+10)
Match/Easy0/32K-6                         3.52GB/s ± 1%   3.53GB/s ± 1%     ~     (p=0.529 n=10+10)
Match/Easy0/1M-6                          2.28GB/s ± 1%   2.28GB/s ± 1%     ~     (p=0.853 n=10+10)
Match/Easy0/32M-6                         2.24GB/s ± 0%   2.23GB/s ± 0%   -0.76%  (p=0.000 n=9+8)
Match/Easy0i/32-6                         15.2MB/s ± 1%   16.2MB/s ± 0%   +6.43%  (p=0.000 n=10+9)
Match/Easy0i/1K-6                         16.6MB/s ± 0%   17.9MB/s ± 0%   +7.48%  (p=0.000 n=10+9)
Match/Easy0i/32K-6                        11.9MB/s ± 0%   12.0MB/s ± 0%   +1.11%  (p=0.000 n=9+9)
Match/Easy0i/1M-6                         11.9MB/s ± 0%   12.1MB/s ± 1%   +1.31%  (p=0.000 n=8+10)
Match/Easy0i/32M-6                        11.9MB/s ± 0%   12.1MB/s ± 1%   +1.84%  (p=0.000 n=8+10)
Match/Easy1/32-6                           260MB/s ± 1%    258MB/s ± 1%   -0.91%  (p=0.001 n=10+10)
Match/Easy1/1K-6                           601MB/s ± 1%    621MB/s ± 0%   +3.28%  (p=0.000 n=9+10)
Match/Easy1/32K-6                          474MB/s ± 0%    479MB/s ± 1%   +0.96%  (p=0.000 n=8+10)
Match/Easy1/1M-6                           426MB/s ± 1%    433MB/s ± 1%   +1.68%  (p=0.000 n=10+10)
Match/Easy1/32M-6                          428MB/s ± 1%    433MB/s ± 0%   +1.09%  (p=0.000 n=10+9)
Match/Medium/32-6                         15.4MB/s ± 1%   16.7MB/s ± 1%   +8.23%  (p=0.000 n=10+9)
Match/Medium/1K-6                         16.3MB/s ± 1%   17.7MB/s ± 1%   +8.43%  (p=0.000 n=9+10)
Match/Medium/32K-6                        12.5MB/s ± 1%   12.7MB/s ± 1%   +2.15%  (p=0.000 n=10+10)
Match/Medium/1M-6                         12.4MB/s ± 0%   12.7MB/s ± 0%   +2.44%  (p=0.000 n=8+9)
Match/Medium/32M-6                        12.4MB/s ± 0%   12.7MB/s ± 0%   +2.52%  (p=0.000 n=10+9)
Match/Hard/32-6                           9.82MB/s ± 1%  10.73MB/s ± 1%   +9.29%  (p=0.000 n=10+10)
Match/Hard/1K-6                           10.2MB/s ± 0%   11.3MB/s ± 1%  +10.56%  (p=0.000 n=9+10)
Match/Hard/32K-6                          8.58MB/s ± 0%   8.58MB/s ± 1%     ~     (p=0.554 n=8+10)
Match/Hard/1M-6                           8.59MB/s ± 1%   8.53MB/s ± 0%   -0.70%  (p=0.000 n=10+8)
Match/Hard/32M-6                          8.62MB/s ± 1%   8.59MB/s ± 1%     ~     (p=0.098 n=10+10)
Match/Hard1/32-6                          1.77MB/s ± 1%   1.99MB/s ± 1%  +12.40%  (p=0.000 n=10+8)
Match/Hard1/1K-6                          1.81MB/s ± 1%   2.08MB/s ± 1%  +14.55%  (p=0.000 n=10+10)
Match/Hard1/32K-6                         1.74MB/s ± 0%   1.74MB/s ± 0%     ~     (p=0.108 n=9+10)
Match/Hard1/1M-6                          1.74MB/s ± 0%   1.74MB/s ± 1%     ~     (p=1.000 n=9+10)
Match/Hard1/32M-6                         1.75MB/s ± 0%   1.75MB/s ± 1%     ~     (p=0.157 n=9+10)
Match_onepass_regex/32-6                  5.05MB/s ± 0%   5.05MB/s ± 1%     ~     (p=0.262 n=8+10)
Match_onepass_regex/1K-6                  5.02MB/s ± 1%   5.02MB/s ± 1%     ~     (p=0.677 n=9+10)
Match_onepass_regex/32K-6                 5.02MB/s ± 0%   4.99MB/s ± 0%   -0.47%  (p=0.000 n=10+9)
Match_onepass_regex/1M-6                  5.01MB/s ± 0%   5.04MB/s ± 1%   +0.68%  (p=0.017 n=8+10)
Match_onepass_regex/32M-6                 4.99MB/s ± 0%   5.03MB/s ± 1%   +0.74%  (p=0.000 n=10+10)
[Geo mean]                                29.1MB/s        29.8MB/s        +2.44%

go1 data for reference

name                     old time/op    new time/op    delta
BinaryTree17-6              4.39s ± 1%     4.37s ± 0%   -0.58%  (p=0.006 n=9+9)
Fannkuch11-6                5.13s ± 0%     5.18s ± 0%   +0.87%  (p=0.000 n=8+8)
FmtFprintfEmpty-6          74.2ns ± 0%    71.7ns ± 3%   -3.41%  (p=0.000 n=10+10)
FmtFprintfString-6          120ns ± 1%     122ns ± 2%     ~     (p=0.333 n=10+10)
FmtFprintfInt-6             127ns ± 1%     127ns ± 1%     ~     (p=0.809 n=10+10)
FmtFprintfIntInt-6          186ns ± 0%     188ns ± 1%   +1.02%  (p=0.002 n=8+10)
FmtFprintfPrefixedInt-6     223ns ± 1%     222ns ± 2%     ~     (p=0.421 n=10+10)
FmtFprintfFloat-6           374ns ± 0%     376ns ± 1%   +0.43%  (p=0.030 n=8+10)
FmtManyArgs-6               795ns ± 0%     788ns ± 1%   -0.79%  (p=0.000 n=8+9)
GobDecode-6                10.9ms ± 1%    10.9ms ± 0%     ~     (p=0.079 n=10+9)
GobEncode-6                8.60ms ± 1%    8.56ms ± 0%   -0.52%  (p=0.004 n=10+10)
Gzip-6                      378ms ± 1%     386ms ± 1%   +2.28%  (p=0.000 n=10+10)
Gunzip-6                   63.7ms ± 0%    62.3ms ± 0%   -2.22%  (p=0.000 n=9+8)
HTTPClientServer-6          120µs ± 3%     114µs ± 3%   -4.99%  (p=0.000 n=10+10)
JSONEncode-6               20.3ms ± 1%    19.9ms ± 0%   -1.90%  (p=0.000 n=9+10)
JSONDecode-6               84.3ms ± 0%    83.7ms ± 0%   -0.76%  (p=0.000 n=8+8)
Mandelbrot200-6            6.91ms ± 0%    6.89ms ± 0%   -0.31%  (p=0.000 n=9+8)
GoParse-6                  5.49ms ± 0%    5.47ms ± 1%     ~     (p=0.101 n=8+10)
RegexpMatchEasy0_32-6       130ns ± 0%     128ns ± 0%   -1.54%  (p=0.002 n=8+10)
RegexpMatchEasy0_1K-6       322ns ± 1%     322ns ± 0%     ~     (p=0.525 n=10+9)
RegexpMatchEasy1_32-6       124ns ± 0%     124ns ± 0%   -0.32%  (p=0.046 n=8+10)
RegexpMatchEasy1_1K-6       570ns ± 0%     548ns ± 1%   -3.76%  (p=0.000 n=10+10)
RegexpMatchMedium_32-6      196ns ± 0%     183ns ± 1%   -6.61%  (p=0.000 n=8+10)
RegexpMatchMedium_1K-6     64.3µs ± 0%    59.0µs ± 1%   -8.31%  (p=0.000 n=8+10)
RegexpMatchHard_32-6       3.08µs ± 0%    2.80µs ± 0%   -8.96%  (p=0.000 n=8+9)
RegexpMatchHard_1K-6       93.0µs ± 0%    84.5µs ± 1%   -9.17%  (p=0.000 n=8+9)
Revcomp-6                   647ms ± 2%     646ms ± 1%     ~     (p=0.720 n=10+9)
Template-6                 92.3ms ± 0%    91.7ms ± 0%   -0.65%  (p=0.000 n=8+8)
TimeParse-6                 490ns ± 0%     488ns ± 0%   -0.43%  (p=0.000 n=10+10)
TimeFormat-6                513ns ± 0%     513ns ± 1%     ~     (p=0.144 n=9+10)
[Geo mean]                 79.1µs         77.7µs        -1.73%

name                     old speed      new speed      delta
GobDecode-6              70.1MB/s ± 1%  70.3MB/s ± 0%     ~     (p=0.078 n=10+9)
GobEncode-6              89.2MB/s ± 1%  89.7MB/s ± 0%   +0.52%  (p=0.004 n=10+10)
Gzip-6                   51.4MB/s ± 1%  50.2MB/s ± 1%   -2.23%  (p=0.000 n=10+10)
Gunzip-6                  304MB/s ± 0%   311MB/s ± 0%   +2.27%  (p=0.000 n=9+8)
JSONEncode-6             95.8MB/s ± 1%  97.7MB/s ± 0%   +1.93%  (p=0.000 n=9+10)
JSONDecode-6             23.0MB/s ± 0%  23.2MB/s ± 0%   +0.76%  (p=0.000 n=8+8)
GoParse-6                10.6MB/s ± 0%  10.6MB/s ± 1%     ~     (p=0.111 n=8+10)
RegexpMatchEasy0_32-6     244MB/s ± 0%   249MB/s ± 0%   +2.06%  (p=0.000 n=9+10)
RegexpMatchEasy0_1K-6    3.18GB/s ± 1%  3.17GB/s ± 0%     ~     (p=0.211 n=10+9)
RegexpMatchEasy1_32-6     257MB/s ± 0%   258MB/s ± 0%   +0.37%  (p=0.000 n=8+8)
RegexpMatchEasy1_1K-6    1.80GB/s ± 0%  1.87GB/s ± 1%   +3.91%  (p=0.000 n=10+10)
RegexpMatchMedium_32-6   5.08MB/s ± 0%  5.43MB/s ± 1%   +7.03%  (p=0.000 n=8+10)
RegexpMatchMedium_1K-6   15.9MB/s ± 0%  17.4MB/s ± 1%   +9.08%  (p=0.000 n=8+10)
RegexpMatchHard_32-6     10.4MB/s ± 0%  11.4MB/s ± 0%   +9.82%  (p=0.000 n=8+9)
RegexpMatchHard_1K-6     11.0MB/s ± 0%  12.1MB/s ± 1%  +10.10%  (p=0.000 n=8+9)
Revcomp-6                 393MB/s ± 2%   394MB/s ± 1%     ~     (p=0.720 n=10+9)
Template-6               21.0MB/s ± 0%  21.2MB/s ± 0%   +0.66%  (p=0.000 n=8+8)
[Geo mean]               74.2MB/s       76.2MB/s        +2.70%

Updates #21851

Change-Id: Ie88455db925f422a828f8528293790726a9c036b
Reviewed-on: https://go-review.googlesource.com/65491
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 25, 2017
This is based on a list that Austin Clements provided in mid-2016. It is
mostly untouched, except for the fact that the wbufptr funcs were
removed from the runtime thus removed from the lits here too.

Add a section for these GC funcs, since there are quite a lot of them
and the runtime has tons of funcs that we want to inline. As before,
sort this section too.

Also place some of these funcs out of the GC section, as they are not
directly related to the GC.

Updates #21851.

Change-Id: I35eb777a4c50b5f655618920dc2bc568c7c30ff5
Reviewed-on: https://go-review.googlesource.com/65654
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/66990 mentions this issue: cmd/compile: add reflect to TestIntendedInlining

gopherbot pushed a commit that referenced this issue Sep 28, 2017
Add the package to the table and start it off with a few small, basic
functions. Inspired by CL 66331, which added flag.ro.

Updates #21851.

Change-Id: I3995cde1ff7bb09a718110473bed8b193c2232a5
Reviewed-on: https://go-review.googlesource.com/66990
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/68650 mentions this issue: cmd/compile: add encoding/binary to TestIntendedInlining

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/72190 mentions this issue: cmd/compile: test inlineability of math/bits funcs

@mvdan
Copy link
Member Author

mvdan commented Jan 4, 2018

I have run out of ideas to add to the table, so I'm going to close this issue for now. Of course, feel free to still add more to it - but an umbrella issue to expand the table is no longer useful.

I believe it was @aclements who suggested that we could use benchmarks to identify what runtime functions should always be inlineable, as opposed to human guesses. That could be a way forward if someone wants to look into it.

@mvdan mvdan closed this as completed Jan 4, 2018
@golang golang locked and limited conversation to collaborators Jan 4, 2019
@rsc rsc unassigned mvdan Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants