-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/go: decide, document standard benchmark data format #14313
Comments
For golang/go#14313. Change-Id: Ib9483714bbd004ff2be6cfa0d6e730d2d7f5da42 Reviewed-on: https://go-review.googlesource.com/19490 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
Why have the key start with a lowercase letter rather than an uppercase one? Wouldn't having it start with an uppercase letter make it easier to write go code that unmarshals the configuration (for example, not needing struct tags like in json)? |
A line from the example output: BenchmarkDecode/text:digits/level:speed/size:1e4/gomaxprocs:8 100 154125 ns/op 64.88 MB/s 40418 B/op 7 allocs/op is it possible to change "100" to "100 iters" or "100 runs" so that every number also has a unit? Or will that be backwards incompatible? |
@nigeltao, unfortunately that would be backwards incompatible. |
What's the reasoning for this restriction?
It doesn't seem like it makes the job of a parser author harder to just drop this restriction (and generation code can pick an appropriate unit per line, if desired). |
@cespare, in the general case dropping that restriction would make the job of the parser much harder, if not impossible. Since we're proposing that benchmarks can emit custom metrics, that would require the parser to know how to conform arbitrary scalings of arbitrary units. It would also require that we specify much more precisely the syntax of units. Currently we've specified the unit syntax enough to make "best effort" rescaling possible in benchmark processors, but even if a processor doesn't understand a unit, it can at least expect it to be spelled consistently so it can match up results. |
@aclements got it, thanks. |
This proposal adds one new kind of line: configuration. What do you think about also standardizing a column header line format? Right now the format doesn't provide any mechanism for carrying the context around with it, so a line like this
only makes sense given the current benchmark conventions. Now, given that lines other than configuration lines and benchmark result lines are ignored, it's possible for someone to make up their own format for column headers. But might it be better to settle on a single convention from the start? (A slightly different convention would be to decide on a well-known configuration key ( |
By the way, I commented on a few typos in the CL (https://go-review.googlesource.com/#/c/19490/) after it was already submitted. |
I wonder whether the specification should call out that the file is UTF-8 encoded and that lines are separated by line feeds. It may also be helpful to specify whether or not a tool can assume that value units are always ratios, so must include a slash character. There is also the question whether a value unit with multiple slashes is undefined, forbidden or something else. |
I suggest to require all lines beginning with a # character to be ignored. Currently
is a valid configuration line. It cannot be ignored because of the word other in
|
LGTM overall. Seems very useful. Some nits and questions:
0 or more space characters, or 1 or more space characters?
Presumably if the answer to the above is "1 or more", then the spaces may also be omitted in this case.
It'd be better to require a following capital letter. If the name is just
This is a mismatch with running benchmarks over multiple packages right now. Using tip: $ go test -bench=. encoding/base32 encoding/base64
PASS
BenchmarkEncodeToString-8 30000 42033 ns/op 194.89 MB/s
BenchmarkDecodeString-8 10000 182128 ns/op 71.99 MB/s
ok encoding/base32 3.593s
PASS
BenchmarkEncodeToString-8 100000 22681 ns/op 361.17 MB/s
BenchmarkDecodeString-8 20000 99789 ns/op 109.47 MB/s
ok encoding/base64 5.487s According to the proposal, this would be interpreted incorrectly as two benchmarks run twice, rather than four benchmarks run once. If this proposal were adopted, however, we could add to the output a config line to disambiguate, perhaps like: $ go test -bench=. encoding/base32 encoding/base64
PASS
package: encoding/base32
BenchmarkEncodeToString-8 30000 42033 ns/op 194.89 MB/s
BenchmarkDecodeString-8 10000 182128 ns/op 71.99 MB/s
ok encoding/base32 3.593s
PASS
package: encoding/base64
BenchmarkEncodeToString-8 100000 22681 ns/op 361.17 MB/s
BenchmarkDecodeString-8 20000 99789 ns/op 109.47 MB/s
ok encoding/base64 5.487s
Link to issue 10669.
Will https://godoc.org/golang.org/x/tools/benchmark/parse be updated? Deprecated? Replaced? |
Thanks for the comments.
Sure.
It's a fascinating question, but one that maybe we should leave open, at least at first. I can't think of any units that wouldn't have exactly one slash, but maybe (probably) I am just not thinking hard enough.
Nice catch. Thank you. |
1 or more ("x:y: 12" sets "x:y" = "12").
Yes.
Interesting corner case, thanks. Actually the go command allows plain Benchmark. The tool will have to rename it to something. The go command does require that if something follows "Benchmark", that something must not be a lower case letter: "Benchmarked" is not a benchmark.
Nice.
I expect deprecated in favor of a new parser in x/perf (see #14304). The code in package parse is very tied to benchcmp's view of the world (unsurprising since it was basically an mv from cmd/benchcmp). Worse, benchcmp's view of the world - which I accept full blame for - is fundamentally flawed, since it doesn't understand the idea of running a benchmark multiple times and therefore cannot do any statistical validity checks. A general parsing library should really provide something closer to the raw input, leaving analysis to the code invoking the parser. For benchstat, I started by using package parse and trying to recook the results into the form I wanted, but I found that it was easier to parse the raw data from scratch. For that matter, benchstat will move to x/perf, but probably, due to the statistical failings, benchcmp will be left behind, deprecated, and eventually removed. Not for a while though, and somewhat off-topic for the current proposal. |
Thanks for the comments! I believe I've addressed all of them in https://go-review.googlesource.com/21030.
Good catch. I made the rule a bit stricter than you proposed by requiring that the first character satisfy unicode.IsLower(). If there's a compelling case to weaken this, we can do so in the future.
There are useful units with zero slashes, such as binary size in a compiler benchmark and median GC utilization (though you could argue these are "per iteration"). There are also useful units with multiple slashes, such as bytes/CPU/sec. You could represent this as bytes/(CPU*sec), but this goes one step deeper into the rabbit hole of specifying unit syntax, which we're trying to avoid at least for now. I've been using some of these more complex units in the GC benchmark suite I've been putting together (see metrics.go). |
I'm not sure I understand what you're proposing. What would be in the column header line format that isn't already in the benchmark line? Could you give a concrete example? |
@aclements the meaning of the columns isn't given in the benchmark line. The unit alone does not tell you what is being measured. A header line would give this information:
(Apologies for the lousy names.) It's less relevant today because Go benchmarks have a known set (and ordering) of columns, but your format says that this is also a valid benchmark line:
and without more context one cannot know what the 12.34 MB/s represents. (Compression rate? alloc rate? network throughput?) Anyway, as I mentioned, there's plenty of leeway in the specification for anyone to create their own header convention; I was just wondering if it makes sense to prescribe one from the outset. |
@cespare, interesting. I think I understand. This is the difference between dimensions and units: one tells you what's being measured (e.g., "time"), while the other relates a value to a specific physical quantity (e.g., "minutes", and "seconds"). I've actually been fighting with this in a general unit parser/renormalizer I wrote for benchstat, since benchstat wants to show a dimension for its own column headers, and values with nicely (and differently) scaled units in the rows. Specifying it explicitly feels almost unnecessary because you can usually derive the dimension from the unit, but you're right that some of the units used by So, another solution is to say that we know what ns/op, MB/s, B/op, and allocs/op mean, but any custom metrics have to have "better" units. |
I just hit this myself. rsc/benchstat#5 |
I just encountered one annoying problem with this format. If there are no tests (such as in the go1 benchmarks), the testing package prints "testing: warning: no tests to run", which this format interprets as a configuration line with key "testing" and value "warning: no tests to run". |
@mpvl can we change that print to use the SKIP form you had suggested? |
Or we can specifically reserve and ignore the "testing" configuration key,
which is reasonable given that "testing" is the package name.
|
The 'testing ...' line is written to stderr. It should not be present in On Thu, Apr 21, 2016 at 3:59 PM, Minux Ma notifications@github.com wrote:
|
True, but I usually capture both when I'm logging benchmarks because I want to see the errors in the log. |
I'm sorry. This was my misunderstanding, I thought that you were piping the On Fri, 22 Apr 2016, 00:19 Austin Clements, notifications@github.com
|
Closing this as a dup of #2981 (JSON as the standard format?), because both would likely be resolved regardless of what is decided. Feel free to reopen this if you feel otherwise. |
@bradfitz, to be clear, the currently proposed JSON format doesn't capture most of what's interesting about this proposal (custom metrics and benchmark metadata), and according to #2981 (comment), it seems these things are out of scope for the current JSON proposal. Not that there's any harm in closing this issue. We're already using this format. But it's not a dup of #2981 since it covers different ground. |
I agree—not a dup. There's a lot happening here beyond mere encoding. And it's not clear to me how the JSON proposal handles the header lines discussed here, which are not tied to specific benchmark runs. Re-opening. |
Then I'm marking this as "Proposal-Accepted" so it doesn't come up in future proposal review searches. This is affected accepted because you're all using it anyway. I still think there's value in a standard format like JSON, but that discussion can continue in #2981. We can keep this bug open to document the design proposal here. |
CL https://golang.org/cl/33570 mentions this issue. |
The benchmark format proposal was written in parallel with the sub-benchmarks support and the two didn't quite agree on the format of benchmark names. This commit makes two changes to the benchmark format document so that it agrees with the practice and implementation of sub-benchmarks in the testing package: * It replaces key:value in the benchmark name with key=value, which is the format suggested by the sub-benchmarks documentation. * It removes the special rule about reporting gomaxprocs as a special key=value pair. This was never implemented in the testing package, so it still emits a -N suffix for the gomaxprocs value. Technically this syntax is ambiguous, but it's what we're doing. Updates golang/go#14313. Change-Id: I67cbd735de7ab145bc7c3a93b5dadbe05120929c Reviewed-on: https://go-review.googlesource.com/33570 Reviewed-by: Marcel van Lohuizen <mpvl@golang.org>
Is there anything left to be done for this proposal? (Does the documentation need to be moved to some more prominent or searchable location?) |
Hi, I didn't find one, so I wrote an API for encoding/decoding/streaming the benchmark output as defined by the spec readme at https://github.com/cep21/benchparse. I'm happy to contribute it if desired. |
Ping @rsc @aclements . |
Looks like you want to reinvent a wheel. I know about at least these formats: JMeter, Skia, Chrome, BenchmarkAnything, Google Benchmark, dstat. P.S. I have overlooked that it has been already merged. |
So… What’s left in this issue? |
We propose to make the current output of
go test -bench
the defined format for recording all Go benchmark data. Having a defined format allows benchmark measurement programs and benchmark analysis programs to interoperate while evolving independently.See golang.org/design/14313-benchmark-format for details.
The text was updated successfully, but these errors were encountered: