
proposal: testing: Sub-benchmark that returns testing.BenchResults struct; maybe testing.B.Execute? #59300

ayang64 opened this issue Mar 28, 2023 · 1 comment


ayang64 commented Mar 28, 2023

In general, benchmarks in isolation aren't very useful. To compare benchmarks we need an external tool (like benchstat). It'd be nice if we could do comparative benchmarking from within the go test tool itself.

When refactoring code, it is often useful to maintain an old version of a function or method to test for regressions or change in functionality.

Likewise, when refactoring for performance we should test for functional regressions AND performance regressions. That is: We should ensure that a change represents a performance improvement and make sure that it remains the case over time.

For example, let's say we have a function fib() that we think we've sped up. I believe a reasonable thing to do would be to create a test comparing our new version of fib() with our old version. That might entail something like:

// in fib_test.go

// oldFib preserves the previous implementation of fib for comparison.
func oldFib(n int) int {
	// insert old fib() implementation here...
}

// Imagine our "new" fib() function has replaced the old implementation of fib() -- presumably in fib.go.

// TestFibRegression ensures that our new optimized fib is bug-for-bug compatible with the previous version.
func TestFibRegression(t *testing.T) {
	for i := 0; i < 50; i++ {
		if got, expected := fib(i), oldFib(i); got != expected {
			t.Fatalf("regression: fib(%d) returned %d; oldFib(%[1]d) returned %d", i, got, expected)
		}
	}
}
So maybe that's a reasonable way to ensure that our new fib() is a valid replacement for the old one. But how do we ensure that our new fib() is always faster than the old one?

How about this:

// BenchmarkFib benchmarks and compares our optimized and unoptimized fib implementations.
// If, somehow, the optimized version becomes slower, this benchmark should fail.
func BenchmarkFib(b *testing.B) {
	funcs := map[string]func(int) int{
		"oldFib": oldFib,
		"fib":    fib,
	}

	r := map[string]testing.BenchmarkResults{}

	for name, f := range funcs {
		r[name] = b.Execute(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				f(100) // whatever -- I know, inlining etc.
			}
		})
	}

	if newns, oldns := r["fib"].NsPerOp(), r["oldFib"].NsPerOp(); oldns > 0 && newns > oldns {
		b.Fatalf("regression: new fib is %d ns/op; should be less than old fib at %d ns/op", newns, oldns)
	}
}

The functions can be compared on the same machine and at as near the same time as possible. oldFib can be excluded from the benchmark if necessary by excluding sub-tests whose names start with old, and we get an easy way to be alerted if the optimization regresses somehow (maybe a new compiler optimization or standard library change, etc.).

The goal is to allow comparative benchmarks without external tools.

I propose this instead of using testing.Benchmark() unless testing.Benchmark() can be called from... an actual benchmark in a meaningful way.

Otherwise, I don't think we should have to run benchmarks from a Test function for the same reason we generally don't want to run benchmarks when we test.

Maybe additional methods could be used to compute and output differences in a way similar to benchstat.

@gopherbot gopherbot added this to the Proposal milestone Mar 28, 2023

CC @aclements @golang/runtime

Status: Incoming
