Suppose we have a benchmark with high variance. We use our estimate of the cost per iteration to pick a b.N that should land near the benchtime. A particularly slow run is then more likely to exceed the benchtime and be accepted as the final result, while a particularly fast run is more likely to fall short and trigger another benchmark run.
The current approach thus introduces bias.
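To make the mechanism concrete, here is a minimal, self-contained simulation of the accept-only-if-slow rule. It is not the real testing driver: the noise model, the 1.2x headroom, and all constants are made up. It plays a grow-and-retry loop against a benchmark whose true cost is 1000 ns/op and reports the mean of the accepted results, which should come out noticeably above the true mean.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	benchtime = float64(time.Second) // goal, in ns (default -benchtime=1s)
	trueMean  = 1000.0               // true cost per op, in ns
)

// run simulates timing n iterations of a noisy benchmark: the whole
// run comes in somewhere between 20% under and 20% over its true cost.
func run(n int, rng *rand.Rand) float64 {
	return float64(n) * trueMean * (0.8 + 0.4*rng.Float64())
}

func main() {
	rng := rand.New(rand.NewSource(1))
	const trials = 10000
	var sum float64
	for t := 0; t < trials; t++ {
		n := 1
		for {
			elapsed := run(n, rng)
			if elapsed >= benchtime {
				// Only runs that meet the goal are reported, so the
				// reported sample is truncated on the fast side.
				sum += elapsed / float64(n)
				break
			}
			// Re-estimate n from this run and retry with some headroom,
			// loosely like the real driver's retry loop.
			n = max(n+1, int(1.2*benchtime*float64(n)/elapsed))
		}
	}
	fmt.Printf("true ns/op: %.0f  mean reported ns/op: %.1f\n",
		trueMean, sum/trials)
}
```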
One simple way to fix this would be to decide when our estimate is "close enough", that is, when we are one iteration away from being done, and then stick with that final iteration even if it falls short of the benchtime.
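A hedged sketch of what that could look like, reusing run, benchtime, and rng from the simulation above (the 100x growth cap here is a made-up stand-in, not the driver's actual policy):

```go
// runCommitted grows n as usual, but once the predicted n for the
// goal is within one growth step, it commits: it runs that final n
// and accepts the result even if it falls short of benchtime,
// removing the slow-run filter.
func runCommitted(rng *rand.Rand) float64 {
	n := 1
	for {
		elapsed := run(n, rng)
		if elapsed >= benchtime {
			return elapsed / float64(n) // already long enough
		}
		next := int(benchtime * float64(n) / elapsed)
		if next <= 100*n {
			// One iteration away: stick with this final run,
			// whether or not it reaches the goal.
			elapsed = run(next, rng)
			return elapsed / float64(next)
		}
		n = 100 * n
	}
}
```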
This issue is to follow up on this concern, independently of #24735, which is really about something else.
Still needs investigation. I think the theoretical concern is real; the question is whether it matters in practice.
While I'm thinking of it, another simple fix for this would be to accept benchmark execution times that are within a range of the goal (say +/- 10%) rather than requiring that they be strictly >= the goal. That'd also reduce pressure to overestimate b.N (see also CL 112155 and #27217) and make benchmarks run faster overall.
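As a sketch, again on top of the simulation above, the acceptance check just gains a tolerance (the 10% figure is the example from this comment, not a real testing-package knob):

```go
// runTolerant accepts a run once it gets within 10% under the goal;
// runs at or over the goal are accepted as before. This widens the
// acceptance window, so fewer fast runs are thrown away and rerun.
func runTolerant(rng *rand.Rand) float64 {
	n := 1
	for {
		elapsed := run(n, rng)
		if elapsed >= 0.9*benchtime {
			return elapsed / float64(n)
		}
		// Re-estimate with headroom and retry, as in the simulation.
		n = max(n+1, int(1.2*benchtime*float64(n)/elapsed))
	}
}
```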