
testing: document best practices for avoiding compiler optimizations in benchmarks #27400

Open
nicksnyder opened this issue Aug 31, 2018 · 9 comments


@nicksnyder commented Aug 31, 2018

What did you do?

  1. I wanted to make sure I was creating accurate benchmarks.
  2. I found and read Dave Cheney's 2013 blog post on how to write benchmarks in Go. In the "A note on compiler optimisations" section he says it is best practice to assign the result of the benchmarked call to a local variable, and then to a package-level variable, so the compiler cannot eliminate the work being measured (a minimal sketch of that pattern follows this list).
  3. I went to https://golang.org/pkg/testing/#hdr-Benchmarks
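
For reference, here is a minimal sketch of that pattern, assuming a trivial stand-in function Fib (my illustration, not code copied from the post):

package main

import "testing"

// Fib is an illustrative stand-in for the code under test.
func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

// Package-level "sink": assigning the final result here keeps the
// compiler from proving the benchmarked work is unused.
var result int

func BenchmarkFib(b *testing.B) {
	var r int
	for i := 0; i < b.N; i++ {
		// Assign to a local inside the loop...
		r = Fib(20)
	}
	// ...then copy the local to the package-level variable.
	result = r
}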

What did you expect to see?

I expected to see documentation on how to correctly write benchmarks that avoid compiler optimizations and examples that reflect best practices.

If the techniques described in Dave's blog post are no longer necessary, I expected to see explicit documentation to that effect.

What did you see instead?

Neither of those things.

@meirf added and then removed the help wanted label Aug 31, 2018

@meirf (Contributor) commented Aug 31, 2018

@davecheney,

Is that kind of optimization avoidance still recommended?

If so, do you think that info should be put in https://golang.org/pkg/testing/#hdr-Benchmarks? I don't see an official benchmarking wiki, so it seems useful to give a short explanation in the testing doc.

@FiloSottile added this to the Unplanned milestone Aug 31, 2018

@as (Contributor) commented Sep 1, 2018

Couldn't reproduce benchmark code elision in currently-supported Go versions.

@josharian (Contributor) commented Sep 1, 2018

@as I believe it can happen now with inlined calls. There is discussion of purity analysis, which might impact non-inlined pure calls later.

@gbbr (Member) commented Oct 25, 2019

How come no action has been taken here? This does seem worth documenting. Does it warrant a doc update?

@randall77 (Contributor) commented Oct 25, 2019

@gbbr I think adding some documentation around this would be fine. No one has gotten to it yet. Want to send a CL?

@nicksnyder (Author) commented Oct 25, 2019

Based on the conversation in this thread it is still unclear to me what the documentation should say. Does Dave’s 2013 blog post reflect today’s best practices?

@gbbr (Member) commented Oct 25, 2019

I am not able to come up with an example that illustrates the problem, and I'm not sure it really is a problem today. The example here is wrong, as is #14813, because it uses the loop's index as an argument to the function call. The other example in Dave's post here also does not show any noticeable difference between the two solutions.

@cespare (Contributor) commented Oct 25, 2019

Here's an example that demonstrates the problem. You have to be careful to ensure that the function is inlined but also that the result cannot be computed at compile time.

package main

import (
	"runtime"
	"testing"
	"time"
)

func BenchmarkX(b *testing.B) {
	for i := 0; i < b.N; i++ {
		f()
	}
}

// Package-level sink: assigning to it keeps f's result "used".
var sink int

func BenchmarkXSink(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = f()
	}
}

func BenchmarkXKeepAlive(b *testing.B) {
	for i := 0; i < b.N; i++ {
		runtime.KeepAlive(f())
	}
}

var input float64

// Initialize input at run time so f's result cannot be constant-folded.
func init() {
	if time.Now().Year() > 1900 {
		input = 123412341234123
	}
}

func f() int {
	x := input
	x /= 7.3
	x /= 7.3
	x /= 7.3
	x /= 7.3
	return int(x)
}

Here's what I get using gc (tip) and gccgo (8.3.0):

$ go test -bench . bench_test.go
goos: linux
goarch: amd64
BenchmarkX-4                    1000000000               0.281 ns/op
BenchmarkXSink-4                43315279                24.5 ns/op
BenchmarkXSinkExported-4        49105191                24.5 ns/op
BenchmarkXKeepAlive-4           48882840                24.3 ns/op
PASS
ok      command-line-arguments  5.823s
$ go test -compiler gccgo -bench . bench_test.go
goos: linux
goarch: amd64
BenchmarkX-4                    50000000                24.4 ns/op
BenchmarkXSink-4                50000000                24.4 ns/op
BenchmarkXSinkExported-4        50000000                24.5 ns/op
BenchmarkXKeepAlive-4           20000000                68.6 ns/op
PASS
ok      command-line-arguments  5.286s

@cespare (Contributor) commented Oct 25, 2019

I started one of the other discussions about this a few years ago on golang-dev. I think the situation is still quite unfortunate. To the best of my knowledge, the situation is:

  1. Examples where the compiler breaks benchmarks (as in my code above) are fairly rare in the real world today.
  2. There are a variety of approaches for mitigating this hypothetical problem which have entered the lore, some of which could themselves be thwarted by a sufficiently clever compiler.
  3. If certain optimizations appeared in the gc compiler, they would probably break loads of benchmarks that already exist. (If Go could inline functions with for loops (#14768), which is an important optimization that hasn't been implemented yet, then I expect this issue would suddenly apply to a whole lot more functions out there.)

Given (1) and (2), I think a lot of the current "sink" approaches are not good since they make the code uglier and they are hard to justify (they protect the benchmark against some, but not all, hypothetical future optimizations).

More recently some people have suggested that using runtime.KeepAlive to mark values that you want to always be computed in the benchmark is the best approach (such as @randall77's comment here). That seems better, but is still not completely ideal in two ways:

  • KeepAlive might not convey the meaning as well as something that is named/documented specifically for this purpose. (@randall77 suggested adding a testing.B method which could just delegate to runtime.KeepAlive; a rough sketch of that idea follows this list.)
  • KeepAlive adds a little overhead that seems unnecessary (#33442).
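
To make that suggestion concrete, here is a rough, hypothetical sketch; the helper name Keep and its shape are my invention, and no such testing API exists today. It simply delegates to runtime.KeepAlive:

package main

import (
	"runtime"
	"testing"
)

// Keep is a hypothetical, locally defined helper with a purpose-built name.
// A real version might be a method on *testing.B; today it can only wrap
// runtime.KeepAlive.
func Keep(v interface{}) {
	runtime.KeepAlive(v)
}

// f stands in for the code under test.
func f() int { return len("benchmark me") }

func BenchmarkFKeep(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// Marking the result explicitly documents that it must be
		// computed, without introducing a package-level sink variable.
		Keep(f())
	}
}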

Some comments in these discussions have suggested that writing microbenchmarks is an advanced task for experienced programmers, one that always requires some knowledge of the compiler and the system underneath, and that therefore there's nothing to be done here. I don't really buy that, though: I think that even beginners can reasonably want to write benchmarks which measure how long their function f takes to run and compare different approaches to writing f, and we should make it as easy as possible to avoid any future pitfalls where the obvious benchmark code becomes broken.

I advocate that we decide on a single best approach here (which seems to be using runtime.KeepAlive or else a new helper in the testing package), document it thoroughly, and adopt it as widely as possible.
