proposal: testing: add B.Lap for phases of benchmarks #33194
Comments
/cc @aclements @mpvl
@seebs The testing package supports https://golang.org/pkg/testing/#B.ReportMetric, which can be used to report custom metrics. As an example, https://godoc.org/github.com/loov/hrtime/hrtesting adds percentiles to the output, so that it gives:
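(The hrtesting output itself isn't reproduced in this thread. As a stand-in, here is a minimal illustration of how B.ReportMetric adds a custom column; doWork is a hypothetical placeholder for the code under test.)

```go
// foo_test.go (hypothetical file in the package under test)
package foo_test

import "testing"

// doWork stands in for the code being benchmarked; it returns a
// per-iteration count that we want to surface as a custom metric.
func doWork() int64 { return 1 }

func BenchmarkDoWork(b *testing.B) {
	var compares int64
	for i := 0; i < b.N; i++ {
		compares += doWork()
	}
	// Adds an extra "compares/op" column next to the standard ns/op.
	b.ReportMetric(float64(compares)/float64(b.N), "compares/op")
}
```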
A similar package that would report lap times wouldn't be much more complicated:
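(The original example isn't preserved here either. A rough sketch of such a lap-time reporting helper, with assumed names, might look like this.)

```go
// Package laptiming is a hypothetical sketch, not an existing library.
package laptiming

import (
	"testing"
	"time"
)

// Timer accumulates wall time per named phase across benchmark iterations.
type Timer struct {
	b      *testing.B
	last   time.Time
	totals map[string]time.Duration
}

func New(b *testing.B) *Timer {
	return &Timer{b: b, last: time.Now(), totals: map[string]time.Duration{}}
}

// Lap charges the time since the previous Lap (or New) to the named phase.
func (t *Timer) Lap(name string) {
	now := time.Now()
	t.totals[name] += now.Sub(t.last)
	t.last = now
}

// Report emits one custom metric per phase, in nanoseconds per iteration.
func (t *Timer) Report() {
	for name, d := range t.totals {
		t.b.ReportMetric(float64(d.Nanoseconds())/float64(t.b.N), name+"-ns/op")
	}
}
```

Inside the benchmark loop you would call t.Lap("setup"), t.Lap("inner"), and so on, then t.Report() once after the loop.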
The reason it's done that way is that it can have lower measurement overhead.
Depending on your situation, you can also measure the overhead separately:
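(Again, the example isn't shown in the thread. One plausible shape, with setup, work, and teardown as hypothetical stand-ins, is to benchmark the full procedure and the overhead-only path separately and subtract.)

```go
// foo_test.go (hypothetical file in the package under test)
package foo_test

import "testing"

// Hypothetical stand-ins for the pieces of the procedure.
func setup()    {}
func work()     {}
func teardown() {}

// Benchmarks the whole procedure, including setup and teardown.
func BenchmarkFull(b *testing.B) {
	for i := 0; i < b.N; i++ {
		setup()
		work()
		teardown()
	}
}

// Benchmarks only the overhead; subtracting its ns/op from BenchmarkFull's
// gives an estimate of the cost of work() alone.
func BenchmarkOverhead(b *testing.B) {
	for i := 0; i < b.N; i++ {
		setup()
		teardown()
	}
}
```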
It can be made more concise with a helper package.
My concern with that second example is that I'm still running the overhead an extra time just to measure it, rather than getting the measurements as a byproduct of the main run. Also, it's not obvious to me how I could implement that "laps" package -- I could record timestamps, I guess. But what about recording allocations, for instance? It just seems like being able to record sub-benchmarks, and sum them, would be a lot simpler. (Especially if I start adding custom metrics, because then I have to manage my custom metrics and my lap timer.)
It's unclear that there's a new core concept here that belongs in the testing package proper. It does seem like the timing of subpieces can be done outside the package, as custom metrics, and maybe that would be a good place to start. (It's also true that things like counting allocations for different phases can't be done as easily as custom metrics, but it's hard to make the phases completely orthogonal from all the things you might want to do.)
(Also, @ianlancetaylor points out b.Lap is the wrong name. A lap is something that keeps getting repeated. Maybe each of the b.N top-level iterations is a lap.)

I wrote last week that it seemed like sub-part timing should start outside this package. No responses. Does anyone object to that?
Yeah, "Lap" is probably the wrong name. "Segment" maybe.

I don't see a way for an external package to hook into the benchmark package's ability to measure allocations, for instance; would we be assuming that external packages would be duplicating that logic, then? I think the problem here is that I suspect that a really nice implementation of this may need to be inside testing to get access to the knobs and levers it wants to do a good job of accumulating sample data; at a bare minimum, it'll have to duplicate a lot of code. But it may also be a good idea to have that experience to draw on before trying to design an interface for a thing like this that will work in practice. But on the Nth hand, it might be that an interface which is possible without that internal access would be very different from one that works with that internal access.

So I think the answer is "I may well some day have a very good objection to this being outside testing, but I don't think I'll know what it is unless I first try to write the external package and see how it works."
@seebs There's https://golang.org/pkg/runtime/#ReadMemStats for reading memory stats. Based on my experience writing hrtesting, I would expect it to need to duplicate very little code. Here's a quick example for implementing phases: https://play.golang.org/p/wVZIA4TODxL. The API can be improved, but it should suffice as a proof of concept.
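(The playground code isn't reproduced here. To illustrate the claim that little needs to be duplicated, a rough sketch of an external phase-aware allocation tracker built on ReadMemStats and ReportMetric could look like the following; the names and API are assumptions, not necessarily what the link contains.)

```go
// Package phases is a hypothetical sketch of a phase-aware allocation tracker.
package phases

import (
	"runtime"
	"testing"
)

// AllocTracker attributes heap allocations to named phases using
// runtime.ReadMemStats, similar to what testing does for -benchmem.
type AllocTracker struct {
	b       *testing.B
	mem     runtime.MemStats
	current string
	mallocs uint64
	totals  map[string]uint64
}

func New(b *testing.B) *AllocTracker {
	t := &AllocTracker{b: b, totals: map[string]uint64{}}
	runtime.ReadMemStats(&t.mem)
	t.mallocs = t.mem.Mallocs
	return t
}

// Phase charges allocations since the previous call to the previous phase
// and starts attributing new allocations to the named one.
func (t *AllocTracker) Phase(name string) {
	runtime.ReadMemStats(&t.mem)
	if t.current != "" {
		t.totals[t.current] += t.mem.Mallocs - t.mallocs
	}
	t.current = name
	t.mallocs = t.mem.Mallocs
}

// Report closes the final phase and emits allocs/op per phase.
func (t *AllocTracker) Report() {
	t.Phase("")
	for name, allocs := range t.totals {
		t.b.ReportMetric(float64(allocs)/float64(t.b.N), name+"-allocs/op")
	}
}
```

In a benchmark you would call Phase("setup"), Phase("inner"), and so on at the start of each section inside the loop, then Report() once after the loop.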
I tried doing ReadMemStats in a loop at one point and it was about a two-orders-of-magnitude performance hit on a loop that I had thought would be "expensive" to run. The issue, I think, is that it's doing a ton of additional work that isn't actually necessary for the benchmem results.
The benchmarking code uses the same ReadMemStats func: https://github.com/golang/go/blob/master/src/testing/benchmark.go#L122. One option would be to exclude the time spent inside ReadMemStats from the results, e.g. add that time to some "overhead" phase.
... huh, now I'm curious how expensive ReadMemStats is itself, and whether something else went wrong in my thing that had performance problems when I was using it. Answer: it stops the world and runs on the system stack, which is very cheap in a benchmark, and insanely expensive in a production app actually running 30+ active goroutines.
Related: #20875
Also possibly related to your last comment: #19812
I asked last week if anyone objected to this support starting outside the testing package. I don't see any objections to that. Good discussion, but no objections. Based on the discussion, this seems like a likely decline.
No change in consensus; declined.
What version of Go are you using (go version)?
N/A (but 1.12.x)
What did you do?
I'm benchmarking some stuff. I have a longish and complicated process with multiple parts which are hard to separate or run independently. (Approximately in the vicinity of setup/teardown code, where each setup has to be matched by a paired teardown.) I was looking at benchmarking options.
I thought of a thing that I'm not sure is all that useful, but the more I think about it, the more I think it would have desirable consequences.
What did you expect to see?
"Expect" is a strong term.
What I'd like is a way to handle sub-benchmarks where it's not possible to run each one 1..N times separately; the entire procedure has to be run 1..N times, but I'd like to track its different parts.
What did you see instead?
This is sort of possible to emulate using StartTimer/StopTimer, but there's a performance problem: if one of the sections takes significantly less time than the others, running the entire procedure enough times that that section alone accumulates over a second of runtime takes a very long time, and is highly redundant. This is the sort of thing for which stopwatches have lap timers -- sub-timers that can be captured while the main timer continues, and then reviewed separately.
So the idea I have is something like:
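(The code block from the original issue isn't preserved. Based on the surrounding description, the proposed usage presumably looked roughly like the following; b.Lap does not exist in the testing package today, and the signature shown here is only an assumption.)

```go
// Hypothetical sketch of the proposed API; this does not compile against
// today's testing package. setup, work, and teardown are stand-ins.
func BenchmarkProcedure(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.Lap("setup", func(lb *testing.B) {
			setup()
		})
		b.Lap("inner", func(lb *testing.B) {
			work() // the part whose ns/op we care most about
		})
		b.Lap("teardown", func(lb *testing.B) {
			teardown()
		})
	}
	// Each named lap would be summed across iterations and reported as its
	// own result line, without running any section in isolation.
}
```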
And now we get to the kicker: The reason I think this is actually a sort of good idea.
Right now, the above would probably be written as:
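(This block also isn't preserved; presumably it was the usual StartTimer/StopTimer pattern, along these lines, again with setup, work, and teardown as hypothetical stand-ins.)

```go
package foo_test

import "testing"

func BenchmarkProcedure(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		setup() // excluded from the reported time, but still costs wall-clock time
		b.StartTimer()

		work() // the only part reflected in the reported ns/op

		b.StopTimer()
		teardown() // also excluded, also still paid for
		b.StartTimer()
	}
}
```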
Often, when I've had problems figuring out why benchmarks take so long to run even though the reported performance is good, the reason has been performance issues in the setup/cleanup code.
If this feature existed, it would be close to free to report on the performance of those sections too -- and in several cases, if I'd been tracking that performance, I'd have noticed a serious performance regression in something that I wasn't consciously thinking about, but which was impacted by a change I was making.
Basically, I want to encourage people to collect the data, since for some of the benchmarks I do, the setup or cleanup code can easily be 80% of the runtime of a benchmark, and if I'm gonna be spending the CPU cycles to run it, it'd be nice to be able to track it without preventing me from getting good data on the performance of the inner bit.
(I haven't thought very carefully about the API, but I think it probably does want to get an inner b, because it might want to record custom stats, etc.)