cmd/benchstat: import from rsc.io/benchstat
I copied the code from various dependencies in go-moremath
into a single 'internal/stats' package. That package is at the top
level of the repo because I expect to pull much of benchcmp
into an importable package.

For golang/go#14304.

Change-Id: Ie114839b2901f5060c202feb3ffc768bf43ce5da
Reviewed-on: https://go-review.googlesource.com/35503
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
rsc committed Jan 25, 2017
1 parent 111d966 commit 1da04cc
Showing 22 changed files with 3,364 additions and 0 deletions.
85 changes: 85 additions & 0 deletions cmd/benchstat/README.md
@@ -0,0 +1,85 @@
# Benchstat

Benchstat computes and compares statistics about benchmarks.

Usage:

    benchstat [-delta-test name] [-geomean] [-html] old.txt [new.txt] [more.txt ...]

Each input file should contain the concatenated output of a number of runs
of `go test -bench`. For each different benchmark listed in an input file,
benchstat computes the mean, minimum, and maximum run time, after removing
outliers using the interquartile range rule.
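
As a rough illustration of the interquartile range rule, the sketch below drops samples outside [Q1 - 1.5·IQR, Q3 + 1.5·IQR] and then reports mean, minimum, and maximum. The helper names (`quantileR8`, `removeOutliers`, `summarize`) are hypothetical, and the choice of the Hyndman-Fan type 8 quantile estimate is an assumption; this README does not say which quantile definition benchstat applies.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// quantileR8 estimates the p-quantile of an already sorted, non-empty sample
// using the Hyndman-Fan type 8 definition. ASSUMPTION: the README does not
// specify the quantile estimate; another definition may give other fences.
func quantileR8(sorted []float64, p float64) float64 {
	n := float64(len(sorted))
	h := 1.0/3.0 + p*(n+1.0/3.0)
	kf, frac := math.Modf(h)
	k := int(kf)
	if k <= 0 {
		return sorted[0]
	}
	if k >= len(sorted) {
		return sorted[len(sorted)-1]
	}
	return sorted[k-1] + frac*(sorted[k]-sorted[k-1])
}

// removeOutliers keeps only values inside [Q1-1.5*IQR, Q3+1.5*IQR].
func removeOutliers(xs []float64) []float64 {
	if len(xs) == 0 {
		return nil
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	q1, q3 := quantileR8(sorted, 0.25), quantileR8(sorted, 0.75)
	lo, hi := q1-1.5*(q3-q1), q3+1.5*(q3-q1)
	var kept []float64
	for _, x := range xs {
		if lo <= x && x <= hi {
			kept = append(kept, x)
		}
	}
	return kept
}

// summarize returns the mean, minimum, and maximum of a non-empty sample.
func summarize(xs []float64) (mean, lowest, highest float64) {
	lowest, highest = xs[0], xs[0]
	sum := 0.0
	for _, x := range xs {
		sum += x
		lowest = math.Min(lowest, x)
		highest = math.Max(highest, x)
	}
	return sum / float64(len(xs)), lowest, highest
}

func main() {
	// Made-up ns/op samples with one obvious outlier.
	samples := []float64{10, 11, 12, 11, 10, 100}
	kept := removeOutliers(samples)
	mean, lowest, highest := summarize(kept)
	fmt.Printf("n=%d mean=%.1f min=%.0f max=%.0f\n", len(kept), mean, lowest, highest)
}
```

With these made-up samples the value 100 falls above the upper fence and is excluded before the summary is computed.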

If invoked on a single input file, benchstat prints the per-benchmark
statistics for that file.

If invoked on a pair of input files, benchstat adds to the output a column
showing the statistics from the second file and a column showing the percent
change in mean from the first to the second file. Next to the percent
change, benchstat shows the p-value and sample sizes from a test of the two
distributions of benchmark times. Small p-values indicate that the two
distributions are significantly different. If the test indicates that there
was no significant change between the two benchmarks (defined as p > 0.05),
benchstat displays a single ~ instead of the percent change.
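
The delta column logic described above can be sketched as follows. The function names and the `%+.2f%%` formatting are assumptions made for illustration, not benchstat's actual code; only the p > 0.05 threshold and the `~` marker come from the text. The p-values are passed in as given, and `oldMean` is assumed to be non-zero.

```go
package main

import "fmt"

// mean returns the arithmetic mean of a non-empty sample.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// formatDelta returns "~" when the change is not significant (p > alpha),
// and otherwise the percent change in mean from old to new.
// Hypothetical helper; the output format is an assumption.
func formatDelta(oldMean, newMean, p, alpha float64) string {
	if p > alpha {
		return "~"
	}
	return fmt.Sprintf("%+.2f%%", (newMean-oldMean)/oldMean*100)
}

func main() {
	// ns/op samples for GobEncode taken from the example files in this README.
	oldGob := []float64{13552735, 13553943, 13606356, 13683198}
	newGob := []float64{11773189, 11942588, 11786159, 11628583, 11815924}

	fmt.Println(formatDelta(mean(oldGob), mean(newGob), 0.016, 0.05)) // -13.31%
	fmt.Println(formatDelta(32.1e6, 31.8e6, 0.286, 0.05))             // ~
}
```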

The -delta-test option controls which significance test is applied: utest
(Mann-Whitney U-test), ttest (two-sample Welch t-test), or none. The default
is the U-test, sometimes also referred to as the Wilcoxon rank sum test.
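
For small samples without ties, a two-sided Mann-Whitney U-test can be computed exactly by counting arrangements. The sketch below is a minimal version under those assumptions (no ties, exact null distribution, two-sided p-value as twice the smaller tail); the function names are hypothetical and this is not the code in the `internal/stats` package.

```go
package main

import "fmt"

// uStatistic counts the pairs (x, y) with x > y. Ties are assumed absent.
func uStatistic(xs, ys []float64) int {
	u := 0
	for _, x := range xs {
		for _, y := range ys {
			if x > y {
				u++
			}
		}
	}
	return u
}

// nullCounts returns, for each u in 0..n1*n2, the number of orderings of
// n1 xs and n2 ys that yield U = u. It uses the recurrence
// N(u; i, j) = N(u-j; i-1, j) + N(u; i, j-1), which conditions on whether
// the largest combined value is an x (adds j to U) or a y (adds 0).
func nullCounts(n1, n2 int) []float64 {
	maxU := n1 * n2
	table := make([][][]float64, n1+1)
	for i := range table {
		table[i] = make([][]float64, n2+1)
		for j := range table[i] {
			table[i][j] = make([]float64, maxU+1)
			if i == 0 || j == 0 {
				table[i][j][0] = 1 // an empty side always gives U = 0
				continue
			}
			for u := 0; u <= maxU; u++ {
				c := table[i][j-1][u]
				if u >= j {
					c += table[i-1][j][u-j]
				}
				table[i][j][u] = c
			}
		}
	}
	return table[n1][n2]
}

// exactPValue returns the two-sided exact p-value for an observed U.
func exactPValue(u, n1, n2 int) float64 {
	counts := nullCounts(n1, n2)
	small := u
	if other := n1*n2 - u; other < small {
		small = other
	}
	total, tail := 0.0, 0.0
	for k, c := range counts {
		total += c
		if k <= small {
			tail += c
		}
	}
	p := 2 * tail / total
	if p > 1 {
		p = 1
	}
	return p
}

func main() {
	// ns/op samples for GobEncode taken from the example files in this README.
	oldGob := []float64{13552735, 13553943, 13606356, 13683198}
	newGob := []float64{11773189, 11942588, 11786159, 11628583, 11815924}
	u := uStatistic(oldGob, newGob)
	fmt.Printf("U=%d p=%.3f\n", u, exactPValue(u, len(oldGob), len(newGob))) // U=20 p=0.016
}
```

Every old GobEncode sample is slower than every new one, so U takes its extreme value and the two-sided p-value is 2/126, or about 0.016.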

If invoked on more than two input files, benchstat prints the per-benchmark
statistics for all the files, showing one column of statistics for each
file, with no column for percent change or statistical significance.

The -html option causes benchstat to print the results as an HTML table.

## Example

Suppose we collect benchmark results from running `go test -bench=Encode`
five times before and after a particular change.

The file old.txt contains:

    BenchmarkGobEncode   100  13552735 ns/op  56.63 MB/s
    BenchmarkJSONEncode   50  32395067 ns/op  59.90 MB/s
    BenchmarkGobEncode   100  13553943 ns/op  56.63 MB/s
    BenchmarkJSONEncode   50  32334214 ns/op  60.01 MB/s
    BenchmarkGobEncode   100  13606356 ns/op  56.41 MB/s
    BenchmarkJSONEncode   50  31992891 ns/op  60.65 MB/s
    BenchmarkGobEncode   100  13683198 ns/op  56.09 MB/s
    BenchmarkJSONEncode   50  31735022 ns/op  61.15 MB/s

The file new.txt contains:

    BenchmarkGobEncode   100  11773189 ns/op  65.19 MB/s
    BenchmarkJSONEncode   50  32036529 ns/op  60.57 MB/s
    BenchmarkGobEncode   100  11942588 ns/op  64.27 MB/s
    BenchmarkJSONEncode   50  32156552 ns/op  60.34 MB/s
    BenchmarkGobEncode   100  11786159 ns/op  65.12 MB/s
    BenchmarkJSONEncode   50  31288355 ns/op  62.02 MB/s
    BenchmarkGobEncode   100  11628583 ns/op  66.00 MB/s
    BenchmarkJSONEncode   50  31559706 ns/op  61.49 MB/s
    BenchmarkGobEncode   100  11815924 ns/op  64.96 MB/s
    BenchmarkJSONEncode   50  31765634 ns/op  61.09 MB/s

The order of the lines in the file does not matter, except that the output
lists benchmarks in order of appearance.
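
To make the input handling concrete, here is a minimal sketch of reading such lines and grouping the ns/op values by benchmark name while remembering the order of first appearance. The function `parseBenchLines` and its simple field-position check are hypothetical; benchstat's real parser may accept more formats than this.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseBenchLines collects ns/op values per benchmark. names lists the
// benchmarks in order of first appearance. Lines that do not look like
// "BenchmarkX <iterations> <value> ns/op ..." are skipped.
// Hypothetical helper for illustration only.
func parseBenchLines(input string) (names []string, nsPerOp map[string][]float64) {
	nsPerOp = make(map[string][]float64)
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") || f[3] != "ns/op" {
			continue
		}
		v, err := strconv.ParseFloat(f[2], 64)
		if err != nil {
			continue
		}
		name := strings.TrimPrefix(f[0], "Benchmark")
		if _, seen := nsPerOp[name]; !seen {
			names = append(names, name)
		}
		nsPerOp[name] = append(nsPerOp[name], v)
	}
	return names, nsPerOp
}

func main() {
	input := `BenchmarkGobEncode 100 13552735 ns/op 56.63 MB/s
BenchmarkJSONEncode 50 32395067 ns/op 59.90 MB/s
BenchmarkGobEncode 100 13553943 ns/op 56.63 MB/s
BenchmarkJSONEncode 50 32334214 ns/op 60.01 MB/s`
	names, values := parseBenchLines(input)
	for _, name := range names {
		fmt.Println(name, len(values[name]), "samples")
	}
}
```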

If run with just one input file, benchstat summarizes that file:

    $ benchstat old.txt
    name        time/op
    GobEncode   13.6ms ± 1%
    JSONEncode  32.1ms ± 1%
    $

If run with two input files, benchstat summarizes and compares:

    $ benchstat old.txt new.txt
    name        old time/op  new time/op  delta
    GobEncode   13.6ms ± 1%  11.8ms ± 1%  -13.31%  (p=0.016 n=4+5)
    JSONEncode  32.1ms ± 1%  31.8ms ± 1%     ~     (p=0.286 n=4+5)
    $

Note that the JSONEncode result is reported as statistically insignificant
instead of a -0.93% delta.

5 comments on commit 1da04cc

@zchee zchee commented on 1da04cc Jan 25, 2017

@rsc Hi, thanks for starting work on the perf package.
BTW, this isn't a big bug, so I'll just use this comment area.

I understand that you said "That package is at the top level of the repo because I expect to pull much of benchcmp into an importable package".
But right now it seems that benchstat's main.go can't import golang.org/x/perf/internal/stats, because that is a top-level internal package.

Is that intended? Will it be fixed later?

$ go get -u -v -x github.com/golang/perf/cmd/bench{save,stat}
.
.
.
# It's $GOPATH/github.com/golang/perf/cmd/benchstat/main.go
main.go:105:2: use of internal package not allowed

@rsc rsc commented on 1da04cc Jan 25, 2017

Hi. Apparently I haven't added the right import restrictions. The import should be done as

go get -u -v -x golang.org/x/perf/cmd/benchstat

@zchee zchee commented on 1da04cc Jan 25, 2017

@rsc Ah, I made the mistake because the "[mirror]" text was not in the perf repository description.
It works! Sorry for the noise.

@rsc rsc commented on 1da04cc Jan 25, 2017

Added "[mirror]", thanks.

@zchee zchee commented on 1da04cc Jan 25, 2017

Thanks too :)
