testing: add ratcheting variants #7465

Open
josharian opened this Issue Mar 4, 2014 · 4 comments

Contributor

josharian commented Mar 4, 2014

For some testing and benchmark purposes, a ratchet is better suited than an average.

https://golang.org/cl/67870053/ bumps up the number of AllocsPerRun runs of an
http test to avoid flakiness. This test would be more reliable using a lower number of
runs if it could measure the best run rather than the average. In addition, it could set
an explicit (rather than comparative) goal for the number of allocs, which would allow
it to catch other regressions. With care, MinAllocsPerRun could even use heuristics to
avoid requiring the user to pass an explicit number of runs.

For benchmarking tightly CPU-bound code with minimal scheduler/OS interactions, a
ratcheting benchmark will often yield more stable, useful results than an averaging
benchmark.

Contributor

ianlancetaylor commented May 9, 2014

Comment 1:

Labels changed: added repo-main, release-none.

Member

minux commented May 9, 2014

Comment 2:

i'd expect that using the best result of a few runs might introduce yet another kind of
flakiness, i.e. the false-positive kind. compared to the false-negative flaky results we are
getting, i'd rather get the latter.

Contributor

rsc commented Mar 7, 2017

For allocs, I agree that it would be nice to fix AllocsPerRun in some ideal world, although we're a bit stuck with it now. I'm also not sure we can build an API with no runs parameter: it seems like at the least you need a max count. If f is expensive then you might not want to run it very many times, and if f is unstable then you need to cut it off at some point. It might be nice to sketch out a func CountAllocs(f func()) int, but I'd be worried about these kinds of complications. In contrast, AllocsPerRun is very easy to specify and understand. There's no magic that can break.

For CPU, I think the number of times when you actually want just a ratchet is pretty low. Modern systems are weird enough that even the lowest possible observed time can be misleading. Maybe 99% of the time the top takes 5ns but occasionally the stars align just right and it takes 3ns. I've seen craziness like this. Then the min of all the runs is noisier than the average. I do think we should expose the underlying distribution, as in #19128, which is much better than any one number.

Given #19128, can we trim this issue down to being just about allocation counting?

Contributor

josharian commented Mar 7, 2017

> For CPU, I think the number of times when you actually want just a ratchet is pretty low.

Fair enough. And my benchmarking interests are probably atypical.

> Given #19128, can we trim this issue down to being just about allocation counting?

Yes.
