cmd/compile: large type switches that always match the first value are slower in Go 1.22 #68125
Comments
I've bisected this degradation to two commits. The first is certainly related to type switches; the second one seems wild, although I have a theory of how it can influence things. The first commit is 28f4ea1.

The benchmark results at commit 28f4ea1:

The benchmark results at commit 10da3b6, which immediately precedes 28f4ea1:
The second commit surprised me, but it also degrades the benchmark noticeably:
I get the following results when benchmarking fa4f951:
And these are the results of 9bfaaa1:
Go 1.23 (commit 44f1870) performs the same way in this benchmark as Go 1.22.0.
% go1.22-9bfaaa15fd version
go version devel go1.22-9bfaaa15fd Fri Sep 29 14:58:44 2023 +0000 darwin/arm64
%
% go1.22-fa4f951026 version
go version devel go1.22-fa4f951026 Fri Sep 29 18:51:35 2023 +0000 darwin/arm64
% go1.22-9bfaaa15fd test -v -run=BenchmarkDeepValueSameGoRoutine -bench=BenchmarkDeepValueSameGoRoutine -count 10 > 9bfaaa15fd.out
% go1.22-fa4f951026 test -v -run=BenchmarkDeepValueSameGoRoutine -bench=BenchmarkDeepValueSameGoRoutine -count 10 > fa4f951026.out
% ~/go/bin/benchstat 9bfaaa15fd.out fa4f951026.out
goos: darwin
goarch: arm64
pkg: context
│ 9bfaaa15fd.out │ fa4f951026.out │
│ sec/op │ sec/op vs base │
DeepValueSameGoRoutine/depth=10-8 27.84n ± 1% 27.61n ± 1% ~ (p=0.171 n=10)
DeepValueSameGoRoutine/depth=20-8 66.41n ± 2% 67.03n ± 1% ~ (p=0.494 n=10)
DeepValueSameGoRoutine/depth=30-8 94.55n ± 1% 93.91n ± 2% ~ (p=0.305 n=10)
DeepValueSameGoRoutine/depth=50-8 152.2n ± 0% 150.0n ± 2% -1.48% (p=0.033 n=10)
DeepValueSameGoRoutine/depth=100-8 293.7n ± 0% 289.0n ± 2% -1.58% (p=0.000 n=10)
geomean

I tried versions 9bfaaa1 and fa4f951 and found that there was no performance degradation; in fact, it got better. I don't know if there is any problem with my operation.
@callthingsoff I do not think there is a problem with your operation. You've tested with a different CPU, which suggests that this benchmark got unlucky with icache aliasing after some code was moved around. I would not bother too much with fa4f951.
I'm pretty sure this is because we're now compiling the type switch using a jump table. In any case, the reason it is slower is that the benchmark is basically optimal for linear search, since the case selected is always the first one. Using a jump table requires more overhead, but it should do better when other cases are the common ones. Just to verify, I changed the code of the switch, and the results bore this out.

So it's not entirely clear what should be done here. Perhaps, if value contexts are really the common case, we can promote that case out of the switch into its own if statement.
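To make the suggestion concrete, here is a minimal sketch of the "promote the hot case" pattern, using hypothetical node types rather than the actual context package internals: the dominant case is checked with a plain type assertion before the switch, so the hot path stays a single comparison regardless of how the remaining cases are compiled.

```go
package main

import "fmt"

// node stands in for context.Context; valueNode plays the role of the
// hot case (like *valueCtx in package context).
type node interface{ isNode() }

type valueNode struct{ next node }
type cancelNode struct{ next node }
type timerNode struct{ next node }
type rootNode struct{}

func (*valueNode) isNode()  {}
func (*cancelNode) isNode() {}
func (*timerNode) isNode()  {}
func (rootNode) isNode()    {}

// lookup hoists the dominant case out of the type switch: one type
// assertion on the hot path, with the switch handling the long tail.
func lookup(n node) int {
	depth := 0
	for {
		// Hot path: a single comparison, independent of the switch lowering.
		if v, ok := n.(*valueNode); ok {
			n = v.next
			depth++
			continue
		}
		// Cold path: the remaining cases can still be a jump table.
		switch c := n.(type) {
		case *cancelNode:
			n = c.next
		case *timerNode:
			n = c.next
		default:
			return depth
		}
	}
}

func main() {
	var chain node = &valueNode{&valueNode{&valueNode{rootNode{}}}}
	fmt.Println(lookup(chain)) // prints 3
}
```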
Well, value contexts are really common. I found this benchmark while figuring out why OpenTelemetry added so much overhead in a project of mine. It is a very typical pattern to have a chain of contexts like "context with parent span info", "context with my current span", "context with one more stats collector", "context with logger fields", etc. OpenTelemetry does that a lot, as do Sentry, the Azure SDK, and others.
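As a sketch of that pattern (made-up keys, but the shape matches what tracing libraries do): every layer wraps the previous context with one more value, so a lookup always unwinds through value contexts, hitting the first case of the internal type switch on each hop.

```go
package main

import (
	"context"
	"fmt"
)

type ctxKey string

func main() {
	// A typical instrumentation chain: each layer adds one value context.
	ctx := context.Background()
	ctx = context.WithValue(ctx, ctxKey("parent-span"), "span-1")
	ctx = context.WithValue(ctx, ctxKey("current-span"), "span-2")
	ctx = context.WithValue(ctx, ctxKey("stats"), "collector")
	ctx = context.WithValue(ctx, ctxKey("logger-fields"), "request-id=42")

	// Value walks the chain outermost-first; every hop is a value context,
	// so the deepest key pays the full cost of the per-hop dispatch.
	fmt.Println(ctx.Value(ctxKey("parent-span"))) // span-1
}
```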
Different coding patterns are going to have different frequencies for context types, and it's not clear to me that this benchmark is representative. That said, @randall77, how did you pick the value 5 for the cutoff?
It was basically chosen using benchmarks.
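For illustration, a cutoff like that can be probed with a microbenchmark of roughly this shape (hypothetical types and names, not the compiler's actual methodology): one switch that always hits the first case versus one that always hits the last.

```go
package dispatch

import "testing"

type shape interface{ area() int }

type square struct{ s int }
type circle struct{ r int }
type rect struct{ w, h int }
type tri struct{ b, h int }
type hexa struct{ s int }

func (v square) area() int { return v.s * v.s }
func (v circle) area() int { return 3 * v.r * v.r }
func (v rect) area() int   { return v.w * v.h }
func (v tri) area() int    { return v.b * v.h / 2 }
func (v hexa) area() int   { return 3 * v.s * v.s }

// classify has five concrete cases, echoing the cutoff value under discussion.
func classify(s shape) int {
	switch v := s.(type) {
	case square:
		return v.area()
	case circle:
		return v.area()
	case rect:
		return v.area()
	case tri:
		return v.area()
	case hexa:
		return v.area()
	default:
		return 0
	}
}

// Package-level variables discourage the compiler from devirtualizing
// the switch away entirely.
var (
	first shape = square{2}
	last  shape = hexa{2}
	sink  int
)

func BenchmarkFirstCase(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = classify(first)
	}
}

func BenchmarkLastCase(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = classify(last)
	}
}
```

Comparing the two with go test -bench=. -count 10 and benchstat, as in the transcript above, is one way to see the trade-off at a given case count.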
And perhaps we could put a PGO-informed TODO on the switch code? If the profile is representative, that might be really helpful.
Go version

go version go1.22.0 linux/amd64

Output of go env in your module/workspace:

What did you do?

The benchmark BenchmarkDeepValueSameGoRoutine in package context became 2x slower in Go 1.22. Run the following command:
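Based on the transcript earlier in the thread, the command was presumably run from the context package directory along these lines:

```
go test -v -run=BenchmarkDeepValueSameGoRoutine -bench=BenchmarkDeepValueSameGoRoutine -count 10
```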
What did you see happen?
With Go 1.21.0, I get the following:

With Go 1.22.0, I get this:
What did you expect to see?
No performance degradation.