x/perf/benchstat: add a mode to compare across git refs #34311

vancluever · 2019-09-15T20:22:00Z

Currently, benchstat requires input files used for comparison be generated and supplied "manually", ie: out-of-band from benchstat itself.

A workflow for a git comparison would follow something like:

Add your benchmarks and commit them, so you have a baseline.
Run go test -count=SOME_SIGNIFICANT_COUNT -bench=. ... > old.txt.
Make the changes that will improve or otherwise affect performance. Possibly commit these.
Run go test -count=SOME_SIGNIFICANT_COUNT -bench=. ... > new.txt.

A while ago I decided to make a change to benchstat to help eliminate some of this busywork, in addition to help ease the benchmarking of two arbitrary points - say you wanted to compare a benchmark between two arbitrary commits to check for regressions, etc.

To see what the changes would look like, the commit is here: https://github.com/vancluever/perf/commit/4539336fbe2794f1f28d49e87566915746ab69fa

Basically, the workflow for "git mode" would be:

# Compare current HEAD with old branch or ref
benchstat -git -benchargs="-count=5 -run='^$' -bench=BenchmarkStuff ." old-branch

# Compare two branches
benchstat -git -benchargs="-count=5 -run='^$' -bench=BenchmarkStuff ." old-branch new-branch

# Compare two commits (ensure they are in correct chronological order on your branch)
benchstat -git -benchargs="-count=5 -run='^$' -bench=BenchmarkStuff ." abcd012 012abcd

Basically, all of the above commit is lacking to be CL-worthy is:

Some tests for the new behavior.
Some care taken around discovery of the current ref when it's set to a detached ref (such as a tag). In this case, git rev-parse --abbrev-ref HEAD gives HEAD, not the ref (ie: tag or commit). This actually prevents benchstat from testing the correct refs when only supplied an old ref, or switching back to the previous HEAD when supplied two.

If this proposal is accepted, I'll work on fixing detached HEAD discovery, add the tests, and put in the CL. 🙂

To show it completely in action, below is the output of me using it to run the benchmarks from https://go-review.googlesource.com/c/go/+/163862, between when they were added, and when https://go-review.googlesource.com/c/go/+/163599 was merged, correcting the HTTP/1.1 Transport's ReaderFrom implementation issues. These are the darwin/amd64 tests, but as they are run on my local laptop which is not necessarily the most controlled environment (in addition to using go 1.12.9 built with 1.13), there might be some skew from what is in 163599, which was submitted some 6 months ago. Since we're comparing across commits as well, it's also including other commits that would not be included by comparing a single commit to master.

$ benchstat -benchargs="-count=10 -run='^$' -bench=BenchmarkFileAndServer ./src/net/http/" -git 312bfc5 6ebfbba
running in 312bfc5: go test -count=10 -run='^$' -bench=BenchmarkFileAndServer ./src/net/http/
running in 6ebfbba: go test -count=10 -run='^$' -bench=BenchmarkFileAndServer ./src/net/http/
name                        old time/op    new time/op      delta
FileAndServer_1KB/NoTLS-8      120µs ±74%       109µs ±13%       ~     (p=0.912 n=10+10)
FileAndServer_1KB/TLS-8       89.6µs ± 7%     114.5µs ± 4%    +27.73%  (p=0.000 n=10+10)
FileAndServer_16MB/NoTLS-8     246ms ± 8%        10ms ± 9%    -96.12%  (p=0.000 n=8+10)
FileAndServer_16MB/TLS-8      45.8ms ±23%      16.5ms ± 5%    -63.86%  (p=0.000 n=10+9)
FileAndServer_64MB/NoTLS-8     1.06s ±27%       0.04s ± 3%    -96.67%  (p=0.000 n=10+8)
FileAndServer_64MB/TLS-8       206ms ±30%        64ms ± 3%    -69.08%  (p=0.000 n=10+10)

name                        old speed      new speed        delta
FileAndServer_1KB/NoTLS-8   9.64MB/s ±49%    9.44MB/s ±12%       ~     (p=0.912 n=10+10)
FileAndServer_1KB/TLS-8     11.4MB/s ± 7%     9.0MB/s ± 4%    -21.77%  (p=0.000 n=10+10)
FileAndServer_16MB/NoTLS-8  68.3MB/s ± 8%  1762.6MB/s ± 8%  +2480.11%  (p=0.000 n=8+10)
FileAndServer_16MB/TLS-8     372MB/s ±20%    1009MB/s ± 3%   +171.54%  (p=0.000 n=10+8)
FileAndServer_64MB/NoTLS-8  65.2MB/s ±34%  1903.2MB/s ± 3%  +2820.52%  (p=0.000 n=10+8)
FileAndServer_64MB/TLS-8     335MB/s ±27%    1056MB/s ± 3%   +215.13%  (p=0.000 n=10+10)

The text was updated successfully, but these errors were encountered:

vancluever · 2019-09-15T20:40:38Z

Just to compare as well, here is the full output of the equivalent "manual" run:

git checkout 312bfc5 && \
  go test -count=10 -run='^$' -bench=BenchmarkFileAndServer ./src/net/http/ > old.txt && \
  git checkout 6ebfbba && \
  go test -count=10 -run='^$' -bench=BenchmarkFileAndServer ./src/net/http/ > new.txt && \
  benchstat old.txt new.txt
Checking out files: 100% (3109/3109), done.
Note: checking out '312bfc5'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 312bfc5d55 net/http: add request file upload benchmarks
Previous HEAD position was 312bfc5d55 net/http: add request file upload benchmarks
HEAD is now at 6ebfbbaadf net/http: let Transport request body writes use sendfile
name                        old time/op    new time/op      delta
FileAndServer_1KB/NoTLS-8      125µs ±61%        80µs ± 2%    -35.76%  (p=0.034 n=10+8)
FileAndServer_1KB/TLS-8       88.8µs ± 5%      87.7µs ± 3%       ~     (p=0.280 n=10+10)
FileAndServer_16MB/NoTLS-8     231ms ±31%         8ms ± 2%    -96.40%  (p=0.000 n=10+9)
FileAndServer_16MB/TLS-8      41.4ms ±11%      15.3ms ± 1%    -63.15%  (p=0.000 n=10+9)
FileAndServer_64MB/NoTLS-8     819ms ±39%        34ms ± 9%    -95.86%  (p=0.000 n=10+10)
FileAndServer_64MB/TLS-8       170ms ±16%        61ms ± 1%    -64.28%  (p=0.000 n=10+9)

name                        old speed      new speed        delta
FileAndServer_1KB/NoTLS-8   9.35MB/s ±46%   12.73MB/s ± 2%    +36.12%  (p=0.034 n=10+8)
FileAndServer_1KB/TLS-8     11.5MB/s ± 5%    11.7MB/s ± 3%       ~     (p=0.271 n=10+10)
FileAndServer_16MB/NoTLS-8  71.2MB/s ±23%  2016.6MB/s ± 2%  +2733.20%  (p=0.000 n=9+9)
FileAndServer_16MB/TLS-8     407MB/s ±10%    1099MB/s ± 1%   +170.23%  (p=0.000 n=10+9)
FileAndServer_64MB/NoTLS-8  88.7MB/s ±45%  1981.3MB/s ±10%  +2132.70%  (p=0.000 n=10+10)
FileAndServer_64MB/TLS-8     397MB/s ±14%    1103MB/s ± 1%   +178.15%  (p=0.000 n=10+9)

mvdan · 2019-09-16T08:26:52Z

I sympathise with the idea and effort here; automating this would be nice.

I don't think benchstat is the right tool, though. All it does is compare benchmark numbers; it is not at all specific to Go, or to running Go benchmarks via go test. Similarly, it doesn't know about git, and shouldn't require it for its features.

I think your idea is good, but it should be a tool on top of benchstat. For example, imagine a program that depends on git, go test, and benchstat. If that works well and people find it useful, we can then consider including it under x/perf, if you come up with a good name for it :)

zdjones · 2019-09-16T08:37:49Z

I've been wanting to do something similar, but more general. I need to be able to run arbitrary go commands against different commits, not just benchmarks. I've just been writing bash scripts to do it, all the while knowing there must be a more elegant solution.

zdjones · 2019-09-16T08:58:40Z

Take a look at https://github.com/josharian/compilecmp, which runs compiler benchmarks across different commits.

josharian · 2019-09-16T14:42:54Z

https://godoc.org/github.com/aclements/go-misc/benchmany

vancluever · 2019-09-16T16:23:57Z

we can then consider including it under x/perf, if you come up with a good name for it :)

😆 sound good @mvdan, I'll work on this when I have a chance, spin it off, and link it when it's ready.

mvdan · 2019-09-16T21:45:00Z

@vancluever have you checked the tools above? It's probable that you can either directly use (or quickly adapt) either of them for your needs.

vancluever · 2019-09-16T22:09:17Z

@mvdan I did take a quick look over, and both definitely seem to do things that would cover this case.

Honestly, this was more about seeing if the work that I did here (I made this 6 months ago to help with gathering benchmarks for the above CLs) would be useful to contribute - but I do agree it it's overloading x/perf's concerns. In light of that and the prior art, I'm just going to close this and if I really feel like there's something that would make sense to include, I'll open a new proposal.

Thanks! And thanks @zdjones and @josharian for the suggestions.

gopherbot added this to the Unreleased milestone Sep 15, 2019

vancluever closed this as completed Sep 17, 2019

golang locked and limited conversation to collaborators Sep 16, 2020

gopherbot added the FrozenDueToAge label Sep 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x/perf/benchstat: add a mode to compare across git refs #34311

x/perf/benchstat: add a mode to compare across git refs #34311

vancluever commented Sep 15, 2019 •

edited

Loading

vancluever commented Sep 15, 2019

mvdan commented Sep 16, 2019

zdjones commented Sep 16, 2019

zdjones commented Sep 16, 2019

josharian commented Sep 16, 2019

vancluever commented Sep 16, 2019

mvdan commented Sep 16, 2019

vancluever commented Sep 16, 2019 •

edited

Loading

x/perf/benchstat: add a mode to compare across git refs #34311

x/perf/benchstat: add a mode to compare across git refs #34311

Comments

vancluever commented Sep 15, 2019 • edited Loading

vancluever commented Sep 15, 2019

mvdan commented Sep 16, 2019

zdjones commented Sep 16, 2019

zdjones commented Sep 16, 2019

josharian commented Sep 16, 2019

vancluever commented Sep 16, 2019

mvdan commented Sep 16, 2019

vancluever commented Sep 16, 2019 • edited Loading

vancluever commented Sep 15, 2019 •

edited

Loading

vancluever commented Sep 16, 2019 •

edited

Loading