proposal: testing: allow B to use real execution duration to decide N #40227
I usually use https://golang.org/pkg/testing/#B.ResetTimer after setup code and before starting the benchmark loop. Does that work better for your use case?
It introduces the same issue; the logic behind both is the same.
When predicting b.N, only the built-in timer's recorded duration is taken into consideration, rather than the actual execution time. This makes the benchmark batch sizes grow wildly.
I can't think of a scenario where this kind of per-size setup would lead to valid benchmark results. You're kind of lucky to be getting a timeout - the alternative would be completely bogus results. If the setup for a loop of length b.N takes longer than the thing you're timing, benchmarks are going to have a hard time getting a precise read. The implication is that the actual operation you are doing is changing based on b.N, which invalidates the entire benchmark timing computation: a buffer of 1 GB is going to have very different cache performance than a buffer of 1 MB, but the benchmark routine depends fundamentally on b.N=1<<30 doing exactly the same operation as b.N=1<<20, just 1<<10 more times.
@rsc It's NOT per-size setup. It's what happened (which should not). I was merely trying to measure read and write performance separately. The current benchmark prediction algorithm was not able to handle it correctly (when expensive operations are excluded). I revised the code comment in the description; it should be more intuitive now.
There are two problems here. The first problem is that if read is cheap, then testing needs to do more of them in the loop to get an accurate per-operation iteration count. What you are suggesting would choose a very small N in the Read benchmark because of the (untimed) expensive writes, but then the calculation (timed span) / b.N would be much less accurate than usual. To make this benchmark produce accurate results, you need to find a way to move the expensive setup out of the benchmark entirely. The second problem is that, despite the claims to the contrary, this is absolutely per-size setup. If you do N ExpensiveWrite followed by N Reads, then between those two there is some buffer somewhere containing an amount of memory that scales with N. That is exactly what you can't do in benchmarks, because the work involved in storing the memory will have different characteristics for different N. To make this benchmark produce accurate results, the actual work has to be the same, just repeated N times. A more accurate benchmark would look like:
This proposal has been added to the active column of the proposals project.
Based on the discussion above, this proposal seems like a likely decline.
No change in consensus, so declined.
When trying something like the code below, the benchmark can time out, because only the recorded duration, rather than the actual execution duration, is taken into account when predicting the next b.N.
go/src/testing/benchmark.go
Line 125 in c5d7f2f
go/src/testing/benchmark.go
Line 135 in c5d7f2f
go/src/testing/benchmark.go
Lines 296 to 324 in c5d7f2f
Maybe there could be a b.ResetDuration() which only resets the reported duration without polluting the prediction algorithm.