proposal: testing: custom benchmark labels #28398

Closed
mmcloughlin opened this issue Oct 25, 2018 · 21 comments

@mmcloughlin
Contributor

commented Oct 25, 2018

Currently, Go benchmark output reports some labels:

$ go version
go version go1.11.1 linux/amd64
$ go test -run NONE -bench . crypto/aes
goos: linux
goarch: amd64
pkg: crypto/aes
BenchmarkEncrypt-8   	100000000	        10.5 ns/op	1516.95 MB/s
BenchmarkDecrypt-8   	100000000	        10.2 ns/op	1566.61 MB/s
BenchmarkExpand-8    	20000000	        57.3 ns/op
PASS
ok  	crypto/aes	3.310s

These are added by go/src/testing/benchmark.go, lines 277 to 283 at dd78955:

labelsOnce.Do(func() {
	fmt.Fprintf(b.w, "goos: %s\n", runtime.GOOS)
	fmt.Fprintf(b.w, "goarch: %s\n", runtime.GOARCH)
	if b.importPath != "" {
		fmt.Fprintf(b.w, "pkg: %s\n", b.importPath)
	}
})

It would be great if the user could add custom labels relevant to the performance of their code. For example:

  • CPU type/Microarchitecture/CPUID bits
  • Software version (perhaps git tag)
  • Presumably many others

I have noticed that the perf.golang.org service appears to support this, but I haven't looked into exactly how. Apologies if I've overlooked some existing feature.

@josharian

Contributor

commented Oct 26, 2018

Dup of #26037? I’d love to see that go in for 1.12 but I don’t think either @aclements or I have the bandwidth to push it through.

@josharian

Contributor

commented Oct 26, 2018

Oh, reading closer, not a dup. Sorry. At least I’ve now cc’d Austin. :)

@mmcloughlin mmcloughlin changed the title testing: custom benchmark labels proposal: testing: custom benchmark labels Oct 26, 2018

@gopherbot gopherbot added this to the Proposal milestone Oct 26, 2018

@gopherbot gopherbot added the Proposal label Oct 26, 2018

@mmcloughlin

Contributor Author

commented Oct 26, 2018

It seemed this should have the proposal: prefix. Regarding bandwidth, I would be happy to put together a CL if the proposal is accepted.

@ianlancetaylor

Contributor

commented Oct 31, 2018

What exactly are you proposing? Adding an option to go test which prints labels? It seems that you could print those yourself just as easily.

@mmcloughlin

Contributor Author

commented Oct 31, 2018

An option to go test seems less useful to me. This feature matters most for labels that must be derived in code (such as hardware parameters).

Proposal

My concrete proposal is an additional function in the testing package.
Perhaps something like:

// SetBenchmarkLabel records a value relevant to benchmark performance. It
// will be included in benchmark output ahead of individual benchmark results.
// The labels "goos", "goarch" and "pkg" are set by default.
func SetBenchmarkLabel(key, value string)

Not sure about naming, but hopefully this gives the idea.
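
To make the shape concrete, here is a minimal sketch of how this could sit inside the testing package. The names (benchmarkLabels, printLabels) and the hookup to the existing once-guarded printing are assumptions for illustration, not the actual CL:

package testing // sketch only; names and integration points are assumed

import (
	"fmt"
	"io"
	"sort"
	"sync"
)

var (
	labelsMu        sync.Mutex
	benchmarkLabels = map[string]string{}
)

// SetBenchmarkLabel records a key/value pair to be printed alongside the
// default "goos", "goarch" and "pkg" configuration lines.
func SetBenchmarkLabel(key, value string) {
	labelsMu.Lock()
	defer labelsMu.Unlock()
	benchmarkLabels[key] = value
}

// printLabels would run once, before the first benchmark result, from the
// same place as the existing labelsOnce.Do block.
func printLabels(w io.Writer) {
	labelsMu.Lock()
	defer labelsMu.Unlock()
	keys := make([]string, 0, len(benchmarkLabels))
	for k := range benchmarkLabels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output order
	for _, k := range keys {
		fmt.Fprintf(w, "%s: %s\n", k, benchmarkLabels[k])
	}
}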

"Case Study"

I recently implemented Meow hash in Go. This hash is designed to take advantage of hardware AES instructions, so the implementation dispatches to one of three backends based on CPUID flags at runtime (pure Go, AES-NI, and VAES-256/512).

To provide useful context in the test/benchmark output, I have two test functions that exist
purely to dump internal parameters:

https://github.com/mmcloughlin/meow/blob/7358a9c1d22772fbddc8c47fd74cf501c0136ec1/dispatch_amd64_test.go#L10-L17
https://github.com/mmcloughlin/meow/blob/7358a9c1d22772fbddc8c47fd74cf501c0136ec1/meow_test.go#L23-L25

I would prefer to include these as "official" labels:

testing.SetBenchmarkLabel("backend", implementation)
testing.SetBenchmarkLabel("hasaesni", cpu.HasAES)
testing.SetBenchmarkLabel("hasavx", cpu.HasAVX)
...

Comparison: Google Benchmark

Note that Google Benchmark also prints context with benchmark runs. For example:

$ ./benchmark
2018-10-31 12:31:20
Running ./benchmark
Run on (4 X 3500 MHz CPU s)
CPU Caches:
  L1 Data 32K (x2)
  L1 Instruction 32K (x2)
  L2 Unified 262K (x2)
  L3 Unified 4194K (x1)
--------------------------------------------------------------------
Benchmark                             Time           CPU Iterations
--------------------------------------------------------------------
meow_hash_benchmark/1                14 ns         14 ns   45173239   66.5225MB/s
meow_hash_benchmark/2                15 ns         15 ns   46804897   125.632MB/s
meow_hash_benchmark/4                16 ns         16 ns   44735009   240.535MB/s
...

It would be great if it were also possible to include CPU/cache information in Go benchmark output.

Benchmark Format

Note that this output has been formalized in the following document:

https://github.com/golang/proposal/blob/master/design/14313-benchmark-format.md#configuration-lines

In particular these "labels" are called "Configuration Lines".

Moreover, this document makes go test benchmark output the official format for representing benchmark results. While it is always possible to print this kind of metadata from a separate program, that makes it harder to integrate with automated tooling for running and storing benchmarks.

@gopherbot

commented Nov 2, 2018

Change https://golang.org/cl/146897 mentions this issue: testing: add SetBenchmarkLabel

@mmcloughlin

Contributor Author

commented Nov 2, 2018

Added a CL to concretely demonstrate the proposal (and get something on the record before the freeze).

@seebs

Contributor

commented Nov 14, 2018

I'd use it if it existed, which I like to think is an argument in favor.

@rsc

Contributor

commented Nov 28, 2018

Does SetBenchmarkLabel keep track of what's already been printed and not repeat prints? In theory you can change the labels between runs.

@mmcloughlin

Contributor Author

commented Nov 28, 2018

The existing implementation (and the prototype CL) print the labels inside a sync.Once, so any labels added or modified after the first print are ignored. My expectation was that this would be used in init or other setup functions. This behavior seems reasonable to me, but it should be documented for clarity, perhaps:

// SetBenchmarkLabel records a value relevant to benchmark performance. It
// will be included in benchmark output ahead of individual benchmark results.
// Any SetBenchmarkLabel calls made after the first benchmark has executed are
// ignored. The labels "goos", "goarch" and "pkg" are set by default.
func SetBenchmarkLabel(key, value string)

Did you have some other behavior in mind?

@rsc

Contributor

commented Dec 12, 2018

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.
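
For illustration, a minimal sketch of that behavior; printedLabels and the print destination are assumptions, not taken from the prototype CL:

import (
	"fmt"
	"sync"
)

var (
	printedMu     sync.Mutex
	printedLabels = map[string]string{}
)

// SetBenchmarkLabel prints "key: value" as a configuration line, silencing
// exact duplicates but re-printing whenever the value for a key changes.
func SetBenchmarkLabel(key, value string) {
	printedMu.Lock()
	defer printedMu.Unlock()
	if prev, ok := printedLabels[key]; ok && prev == value {
		return // already printed with this exact value
	}
	printedLabels[key] = value
	fmt.Printf("%s: %s\n", key, value)
}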

Otherwise this seems OK to me. @aclements?

@rsc

Contributor

commented Jan 23, 2019

Ping @aclements

@mmcloughlin

Contributor Author

commented Jan 24, 2019

Yes, it would be great to get @aclements' input.

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.

This seems easy enough, but I'd like to understand the use case you have in mind. Are you thinking about SetBenchmarkLabel calls from inside a benchmark function?

@ianlancetaylor

Contributor

commented Mar 20, 2019

Is this addressed by the recent fix for #26037?

CC @aclements

@josharian

Contributor

commented Mar 20, 2019

Nope, this is not a dup of #26037. That was my initial reaction as well though. :) See the beginning of the thread.

This is about the labels that are printed prior to the benchmark lines, not the information that is included in the benchmark lines.

@andybons

Member

commented Mar 27, 2019

Adding to the ping party for @aclements

@aclements

Member

commented Mar 27, 2019

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.

I'd like to understand what @rsc is proposing here a bit better, too.

It does seem a bit odd that SetBenchmarkLabel is only meaningful before the first benchmark run, but I also don't see how it's useful to call it between benchmarks (or even when you would be able to do that). If a benchmark has multiple variations, I would expect it to use sub-benchmarks and the line-oriented "BenchmarkName/k=v/..." syntax, as in the sketch below.
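
For reference, a sketch of that sub-benchmark style with a made-up benchmark body; only b.Run and the name syntax are the point here:

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

func BenchmarkHash(b *testing.B) {
	for _, size := range []int{64, 1024} {
		// The sub-benchmark name carries a key=value pair, producing result
		// lines like "BenchmarkHash/size=64-8" that the benchmark format
		// spec (and tools such as benchstat) understand.
		b.Run(fmt.Sprintf("size=%d", size), func(b *testing.B) {
			data := make([]byte, size)
			b.SetBytes(int64(size))
			for i := 0; i < b.N; i++ {
				_ = sha256.Sum256(data)
			}
		})
	}
}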

SetBenchmarkLabel should specify that it constrains the syntax of the key and the value to legal configuration lines according to the benchmark format spec. (We did this in #26037 for the metric names.)

@rsc

Contributor

commented May 28, 2019

I was thinking that this would be something done inside a Benchmark function, but Austin points out that at that point the "BenchmarkName " prefix has been printed, making it inconvenient to print new key:value pairs. Also, the examples in the initial request are all global to the process, so maybe it makes sense to scope down to something global to the process that gets printed before any benchmarks start.

Then the only question is how we know benchmarks are about to start. TestMain could print them unconditionally even today (see the sketch below). If we give TestMain more access to the list of intended tests and benchmarks (see #28592), then it could avoid the print if no benchmarks will run. But maybe the lines in question would not be so bad to print always, so maybe we don't need to do anything special at all. For that matter, func init would work too.
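
A sketch of that TestMain route, assuming a hypothetical cpuModelName helper for the label value:

import (
	"fmt"
	"os"
	"testing"
)

func TestMain(m *testing.M) {
	// Printed unconditionally, before any tests or benchmarks run.
	fmt.Printf("cpu: %s\n", cpuModelName()) // cpuModelName is hypothetical
	os.Exit(m.Run())
}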

As far as the case study, that could be done today with:

func init() {
    fmt.Printf("backend: %s\n", implementation)
    fmt.Printf("hasaesni: %v\n", cpu.HasAES)
    fmt.Printf("hasavx: %v\n", cpu.HasAVX)
}

@rsc

Contributor

commented Jun 4, 2019

Based on my comment from last week, I'm leaning toward declining this proposal. The answer to the original question is just "print statements".

@mmcloughlin

Contributor Author

commented Jun 4, 2019

In my opinion there's value in making this a "first-class citizen". Printing works but feels hacky.

Here's a hypothetical (but I believe not far-fetched) reason why this could matter down the line. Google Benchmark offers --benchmark_out_format=<json|console|csv>. What if go test wanted to offer a similar feature? (The behavior of the -json option to go test is truly perplexing to me, but that's a topic for another time.) So if we wanted a go test -bench . -benchfmt json option, the output might look like this:

{
	"labels": {
		"goos": "linux",
		"goarch": "amd64",
		"pkg": "crypto/aes"
	},
	"benchmarks": [
		{
			"full_name": "BenchmarkFib-4",
			"n": 5000000,
			"name": "Fib",
			"ns_per_op": 313,
			"procs": 4
		},
		...
		{
			"full_name": "BenchmarkFrobInPlace-4",
			"mb_per_sec": 2155.61,
			"n": 3000000,
			"name": "FrobInPlace",
			"ns_per_op": 475,
			"procs": 4
		}
	]
}

Now my print statements give me:

backend: aes-ni
hasaesni: true
hasavx: true
{
	"labels": {
		"goos": "linux",
		"goarch": "amd64",
		"pkg": "crypto/aes"
	},
	...
}

I'm not arguing for this specific feature right now. My point is just that if people are printing labels, you've lost the semantics, and are unable to manipulate the labels in any structured way.
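
To illustrate what structure buys, a minimal sketch of recovering labels from benchmark output, assuming they follow the "key: value" configuration-line syntax from the format spec; ad-hoc prints that deviate from that syntax are simply lost:

import (
	"bufio"
	"io"
	"strings"
)

// parseLabels collects "key: value" configuration lines that appear before
// the first benchmark result line.
func parseLabels(r io.Reader) (map[string]string, error) {
	labels := map[string]string{}
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "Benchmark") {
			break // results have started; configuration lines are done
		}
		if key, value, ok := strings.Cut(line, ": "); ok {
			labels[key] = value
		}
	}
	return labels, sc.Err()
}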

Finally, I think adding this as a function encourages people to use it, which I think is a positive too.

@rsc

Contributor

commented Jun 25, 2019

The JSON output is produced by a different program that parses the actual benchmark output, so printing will not affect it. Let's let people use print statements for now and revisit once we can see more usage.

@rsc rsc closed this Jun 25, 2019
