x/perf/benchstat: Return status code 1 when benchmarks change significantly #20728
Comments
If you run the benchmarks enough times, you could get a change as small as You could change the flag to be a threshold, but I'm not sure if that would be a good solution. Do you have an idea of how we could deal with these? I feel like the flag would be fairly useless given the high likelihood of false positives.
Oh, forget my point on the threshold - I didn't know about
+1 on this. Currently have a non-voting job on CI for exactly this, parsing the result as described in the initial description, which is not ideal.
I'm not sure this makes statistical sense. With the default alpha threshold, you expect a benchmark with no changes to show a "significant" change 5% of the time by random chance. If you're running multiple benchmarks, the chance that at least one of them will appear significant amplifies (unless you apply a correction for multiple hypothesis testing, which benchstat currently won't do automatically for you). So is this actually useful for CI?

Note that there is also a CSV output, so it wouldn't be hard to write a tool to parse that output. I'm also pulling all of the benchmark stats out into their own package that could be reused by another tool directly.

On the topic of the threshold, note that statistical significance does not mean that a change is "big", just that it's unlikely to be from random chance. It could be a very small change, but there was enough data and low enough noise to determine that there probably was a change.
Change https://golang.org/cl/283616 mentions this issue:
Updates golang/go#20728.

Change-Id: I4c33e64d5959cadfbb97ca6a2274e0c060e87d29
Reviewed-on: https://go-review.googlesource.com/c/perf/+/283616
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Running benchstat on a CI server to detect anomalies relies on the user parsing the command's output to pick up any deltas. To make this process simpler, I propose that benchstat return status code 1 when any of the benchmarks shows a significant change.
In the event that backwards compatibility is required, a new flag could be added to activate this behaviour.
Example: