runtime: decreasing heap makes performance worse #27545
Closed
Hi @LK4D4. Could you post the output of GODEBUG=gctrace=1 for both benchmark configurations? I'm particularly wondering about the heap size and how it compares. If the heap is relatively small, there's a known amortization failure (#19839 and #23044) that can make GC more expensive on smaller heaps.
Those look like just benchmark results and not gctraces. But if you confirmed that the heaps are small, then, yeah, I'd say this is the GC amortization failure.
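For readers who want to check the heap size from inside a program rather than from a gctrace, here is a minimal sketch using `runtime.ReadMemStats`. The `heapSnapshot` type and the `churn` workload are hypothetical names made up for illustration; they are not part of the benchmark discussed in this issue.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapSnapshot keeps the few runtime.MemStats fields relevant to heap sizing.
// (Hypothetical helper type, not from the issue.)
type heapSnapshot struct {
	HeapAlloc uint64 // bytes of allocated heap objects
	NextGC    uint64 // heap size target for the next GC cycle
	NumGC     uint32 // number of completed GC cycles
}

// snapshot reads the current memory statistics from the runtime.
func snapshot() heapSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return heapSnapshot{HeapAlloc: m.HeapAlloc, NextGC: m.NextGC, NumGC: m.NumGC}
}

// sink keeps the most recent allocation reachable so it lands on the heap.
var sink []byte

// churn allocates n short-lived buffers of the given size.
// It stands in for a real workload (assumption for illustration).
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
}

func main() {
	before := snapshot()
	churn(200000, 1024)
	after := snapshot()
	fmt.Printf("heap=%d KB nextGC=%d KB gcCycles=%d\n",
		after.HeapAlloc/1024, after.NextGC/1024, after.NumGC-before.NumGC)
}
```

A small `NextGC` combined with a high cycle count for the same amount of allocation would be consistent with the small-heap situation described above, although the gctrace output remains the more detailed source.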
@aclements sorry :/ Have so many different outputs at this point.
Hello, I'm trying to mitigate #18155 - trying to remove some persistent buffers, and encountered something that I don't understand.
What version of Go are you using (go version)?
Tried Go 1.10.4 and Go 1.11
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
So, I have two branches, diff between them looks like
That is, I'm removing an unused buffer field that is instantiated with a 16 KB []byte slice. This makes performance worse: the benchmark shows a 10-20% increase in running time.
The CPU profile shows that the version without the buffer spends much more time in the function gcAssistAlloc, and the blocking profile shows that much more time is spent in runtime.gcParkAssist.
I wrote a script to reproduce the problem (it's not that isolated code, sadly :():
Copy it somewhere, chmod +x it, and run it. It will produce benchcmp output, CPU profiles, and traces in /tmp/gc_vitess_result. govendor sync might take some time, but it will finish eventually. You might need to run the script a couple of times (to warm things up) before it starts to produce consistent results.
Also, you might notice the line
+ //_ [connBufferSize]byte
in my diff: if I uncomment it, there is no performance hit. I tried to allocate the same amount of memory in the benchmark function itself, with no luck.
When I disable GC, performance becomes better or the same.
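To make the effect of the commented-out padding line concrete, here is a minimal sketch of how a blank array field changes the size of a struct. The `connPlain` and `connPadded` types and the `id` field are hypothetical stand-ins, and `connBufferSize` is assumed to be 16 KB based on the buffer size mentioned above; the real struct in the diff has other fields.

```go
package main

import (
	"fmt"
	"unsafe"
)

// connBufferSize is assumed to be 16 KB, matching the buffer size
// mentioned in the issue text.
const connBufferSize = 16 * 1024

// connPlain stands in for the struct after the buffer is removed
// (hypothetical, heavily simplified).
type connPlain struct {
	id uint32
}

// connPadded adds a blank array field. The field is never read or
// written, but it still occupies space in every allocated value.
type connPadded struct {
	id uint32
	_  [connBufferSize]byte
}

// paddingOverhead reports how many extra bytes the blank field adds.
func paddingOverhead() uintptr {
	return unsafe.Sizeof(connPadded{}) - unsafe.Sizeof(connPlain{})
}

func main() {
	fmt.Println("extra bytes per value:", paddingOverhead())
}
```

Each padded value keeps the extra bytes in the heap for as long as the value is reachable. One possible reading, not confirmed in this thread, is that this larger live heap raises the GC trigger point and so reduces how often collection runs.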
Also, there are another two fields in this struct, *bufio.Reader and *bufio.Writer. Removing them (actually, pooling them with sync.Pool) has the same consequences, and removing all three makes the benchmark 50-70% worse.
Let me know if I can provide more info. I'll try to write a more isolated benchmark in the meantime.
Benchmark code
What did you expect to see?
Performance is better or the same after removing the allocation.
What did you see instead?
Performance is worse after removing allocation.
@RLH @aclements I would appreciate any input on this problem.
Thanks!
/cc @Tahara @sougou