runtime: go.uber.org/zap/zapcore BenchmarkZapJSON performance regression in 1.19 #54076
Comments
Ah, sorry, I actually looked into this one in the memory regressions issue. IIRC, yes, the pacer is just doing its job a little better (in the tail of your trace, note the difference in assist CPU %). I'm inclined to close this.
Ah, you did note that. :) I should learn to read. The contention is not great, but with fewer assists, contention is going to increase because the allocator is hit harder.
The additional GC cycles come from a difference in overshoot vs. undershoot. Go 1.18 overshoots on average; Go 1.19 undershoots on average, but with more variance in general. That might be worth investigating, but I'm not sure it makes this a critical regression (it's a microbenchmark, after all). In the image below, the first plot is Go 1.18, the second is Go 1.19. Here's the raw text (I added the image because the distribution is difficult to see with the Braille characters on GitHub):
[image: overshoot/undershoot distributions, Go 1.18 (first) and Go 1.19 (second); raw text omitted]
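For anyone who wants to see over/undershoot on their own workload, here is a rough sketch using `runtime/metrics`. The sampling is coarse and the allocation loop is a stand-in, so treat it as an illustration, not the methodology behind the plots above:

```go
// Sketch: sample the live heap against the heap goal while allocating.
// Ratios above 1.0 mean the heap climbed past the goal (overshoot).
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
)

var sink [][]byte // keeps some allocations live so the GC has work to do

func main() {
	goal := []metrics.Sample{{Name: "/gc/heap/goal:bytes"}}
	maxRatio := 0.0

	for i := 0; i < 500_000; i++ {
		sink = append(sink, make([]byte, 1024))
		if len(sink) > 4096 {
			sink = nil // drop everything and let the GC reclaim it
		}
		if i%5_000 == 0 {
			metrics.Read(goal)
			var ms runtime.MemStats
			runtime.ReadMemStats(&ms)
			if r := float64(ms.HeapAlloc) / float64(goal[0].Value.Uint64()); r > maxRatio {
				maxRatio = r
			}
		}
	}
	fmt.Printf("max heap/goal ratio observed: %.2f\n", maxRatio)
}
```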
As for the reason for the overshoot vs. undershoot: it looks like Go 1.18 thinks the cons/mark ratio is lower than Go 1.19 thinks it is. Note that Go 1.19's estimate also has higher variance, which directly leads to the overshoot/undershoot variance. Again, an image, with the raw text below:
[image: cons/mark measurements, Go 1.18 vs Go 1.19; raw text omitted]
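Loosely, cons/mark compares how fast the mutator allocates ("constructs") against how fast the GC marks. A toy model, much simpler than the real pacer, shows why overestimating it produces undershoot:

```go
// Toy model of GC trigger selection. NOT the actual Go pacer, just the
// intuition: the GC must start marking early enough that, while it scans
// scanWork bytes, the mutator's concurrent allocation (consMark bytes
// allocated per byte marked) brings the heap to the goal right as
// marking finishes.
package main

import "fmt"

// trigger returns the heap size at which to start a cycle, given the
// heap goal, the expected scan work, and the estimated cons/mark ratio.
func trigger(goal, scanWork, consMark float64) float64 {
	// Runway needed: consMark*scanWork bytes get allocated during marking.
	return goal - consMark*scanWork
}

func main() {
	const goal, scanWork = 100.0, 20.0 // arbitrary units

	for _, est := range []float64{0.5, 1.0, 2.0} {
		t := trigger(goal, scanWork, est)
		// Suppose the *actual* cons/mark during the cycle is 1.0.
		finish := t + 1.0*scanWork
		fmt.Printf("estimated cons/mark %.1f: trigger at %5.1f, cycle ends at %5.1f (goal %.0f)\n",
			est, t, finish, goal)
	}
}
```

With the actual ratio at 1.0, an estimate of 0.5 ends the cycle past the goal (overshoot), while an estimate of 2.0 ends it well below the goal (undershoot), which pulls the next trigger earlier and raises the total cycle count. A noisy estimate produces the mix of both seen in the plots.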
I think the increase in cons/mark (and variance) is a result of:
[list of suspected causes omitted]
This would explain everything, assuming the allocation rate is roughly the same. There are more GC cycles because the GC expects the costs to be offset by shorter cycles. But the cycles aren't really shorter, so to the GC it just looks like the GC is slower. As a side note: I think this benchmark also has some kind of memory leak (or is actually storing log data somewhere). I noticed that in every run-of-10 I did, the timings went up slightly with each run. When I look at the […]
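One quick way to check for that run-over-run drift in-process (the loop body below is a placeholder, not the actual benchmark):

```go
// Sketch: run the same workload ten times and watch ns/op and retained
// heap. A steady climb across runs suggests state accumulating somewhere.
package main

import (
	"fmt"
	"runtime"
	"testing"
)

var sink []byte

func main() {
	for run := 0; run < 10; run++ {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = make([]byte, 256) // stand-in for the real benchmark body
			}
		})
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms)
		fmt.Printf("run %d: %d ns/op, live heap %d KiB\n",
			run, res.NsPerOp(), ms.HeapAlloc/1024)
	}
}
```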
Thanks for the detailed analysis!
BenchmarkZapJSON is currently our largest regression for Go 1.19 on the performance dashboard.
Given the big jumps in its history, this benchmark seems to be rather sensitive to changes.
The regression reproduces easily locally (I get ~500ns/op on Go 1.18 vs ~650ns/op on Go 1.19 on my machine).
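For reference, a rough in-process stand-in for the benchmark (simplified; the real BenchmarkZapJSON just exercises zapcore's JSON encoding path). Running this under a go1.18 and a go1.19 toolchain should show the same gap:

```go
// Sketch: push log entries through zapcore's JSON encoder to io.Discard
// and report the timing, approximating what BenchmarkZapJSON measures.
package main

import (
	"fmt"
	"io"
	"testing"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

func main() {
	enc := zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig())
	core := zapcore.NewCore(enc, zapcore.AddSync(io.Discard), zapcore.InfoLevel)
	logger := zap.New(core)

	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			logger.Info("a message", zap.Int("count", i), zap.String("kind", "json"))
		}
	})
	fmt.Println(res)
}
```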
The immediate cause of the regression appears to be additional GC cycles. For the same amount of work (`-test.count=3 -test.benchtime=2000000x`), Go 1.19 runs nearly 25% more GC cycles. End of GC traces:
1.18: [gctrace output omitted]
1.19: [gctrace output omitted]
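The cycle counts can also be read programmatically instead of eyeballing `GODEBUG=gctrace=1` output; one way, via `runtime/metrics` (fixed allocation loop standing in for the benchmark):

```go
// Sketch: count GC cycles for a fixed amount of work.
package main

import (
	"fmt"
	"runtime/metrics"
)

var sink []byte

func gcCycles() uint64 {
	s := []metrics.Sample{{Name: "/gc/cycles/total:gc-cycles"}}
	metrics.Read(s)
	return s[0].Value.Uint64()
}

func main() {
	before := gcCycles()
	for i := 0; i < 2_000_000; i++ {
		sink = make([]byte, 512) // fixed workload, mirroring -test.benchtime=2000000x
	}
	fmt.Printf("GC cycles for this workload: %d\n", gcCycles()-before)
}
```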
I'm not the best at interpreting these, but it seems that 1.19 does manage to use a lower GC CPU percentage for each cycle, and fewer assists, which is good, but the extra cycles hurt.
Another interesting point is that this benchmark has heavy contention on the heap lock (in both versions; a tiny bit higher in 1.19):
[profile output omitted]
I wonder if that contention may be messing with pacing a bit, since adding more concurrency is not helping as much as it should.
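The heap lock is runtime-internal, so its contention won't appear in the user-level mutex profile; one way to see it is a CPU profile, where it shows up as time under the runtime's internal locking functions. A sketch of capturing one around an allocation-heavy parallel workload (inspect afterwards with `go tool pprof cpu.out`):

```go
// Sketch: CPU-profile several goroutines hammering the allocator.
package main

import (
	"os"
	"runtime/pprof"
	"sync"
)

var sinks [8][]byte

func main() {
	f, err := os.Create("cpu.out")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()

	var wg sync.WaitGroup
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			for i := 0; i < 1_000_000; i++ {
				sinks[g] = make([]byte, 4096)
			}
		}(g)
	}
	wg.Wait()
}
```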
The overall 1.18 vs. 1.19 results for zapcore on my machine look like this:
[benchmark comparison omitted]
I suspect that we have overly sensitive benchmarks here and there isn't much for us to do (except perhaps investigate the heap lock contention), but I'd like to hear @mknyszek's thoughts.