I've been experimenting with alternate encodings for the PCDATA tables, with the goal of significantly improving decode speed at the cost of a possible slight increase in binary size.
@mknyszek, @dr2chase and I designed a format that's >4x faster to decode than the Go 1.21 tables, while increasing binary size by only 1–2.5%. This issue is to track (eventually) moving to this format in the compiler and runtime.
All of the benchmarks generate a random sample of 1024 (PC, table) pairs to look up and report the average lookup time.
"Decode/go/alt" is the new implementation. "varint-cache-nohit" is the current varint format, with no cache hits. "varint-cache-hit" is similar, but repeats each lookup 8 times to simulate a high cache hit rate. "varint-cache-none" is the varint format, but with no cache.
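For context on what the varint benchmarks are measuring, here is a minimal sketch of how a varint-encoded PC-value table of this shape decodes. It assumes the (zig-zag value delta, pc delta) pair layout; `lookup` and its parameters are illustrative stand-ins, not the runtime's actual code:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// lookup walks a varint-encoded PC-value table: a sequence of
// (value delta, pc delta) pairs, where the value delta is
// zig-zag-encoded and the pc delta is scaled by pcQuantum.
// Decoding is sequential by construction: every pair before the
// target PC must be decoded, which is the cost a per-lookup cache
// tries to amortize.
func lookup(tab []byte, targetPC, pcQuantum uint64) (int32, bool) {
	val, pc := int32(-1), uint64(0)
	for first := true; len(tab) > 0; first = false {
		uv, n := binary.Uvarint(tab)
		if n <= 0 || (uv == 0 && !first) {
			return 0, false // malformed table or end-of-table marker
		}
		tab = tab[n:]
		if uv&1 != 0 { // zig-zag decode the value delta
			val += ^int32(uv >> 1)
		} else {
			val += int32(uv >> 1)
		}
		pcdelta, n := binary.Uvarint(tab)
		if n <= 0 {
			return 0, false
		}
		tab = tab[n:]
		pc += pcdelta * pcQuantum
		if targetPC < pc {
			return val, true // target falls in this entry's PC range
		}
	}
	return 0, false
}

func main() {
	// Hand-encoded toy table: value 5 for PC [0,8), value 7 for PC [8,16).
	tab := []byte{12, 8, 4, 8, 0}
	v, ok := lookup(tab, 10, 1)
	fmt.Println(v, ok) // 7 true
}
```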
Here are the same results for github.com/kubernetes/kubernetes/cmd/kubelet:
I went a little deeper on benchmarking. I wanted to see how the PC being looked up affected speed, so in addition to selecting a "random" PC like before, I added benchmarks for looking up PC offset 0 and PC offset 4096. Indeed, reading down the benchstat table, you can see that my proposed encoding is relatively insensitive to the PC value, while the varint encoding slows down significantly at larger PC values. Even at PC offset 0, where you'd expect the varint encoding to do well, it's still slower than my proposed encoding (compare the pc=0 rows for varint and alt).
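The varint encoding's sensitivity to the PC value follows directly from its sequential decoding: the number of pairs walked grows linearly with the PC offset. A toy experiment makes this concrete. The table below is synthetic (one entry per 16 bytes of PC range, arbitrary values), not a real pclntab, and the helpers are hypothetical:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTable builds a synthetic varint PC-value table with the given
// number of entries, each covering 16 bytes of PC range.
func encodeTable(entries int) []byte {
	var tab []byte
	val := int32(-1)
	for i := 0; i < entries; i++ {
		next := int32(i % 8) // arbitrary values; deltas are never zero
		d := next - val
		val = next
		tab = binary.AppendUvarint(tab, uint64(uint32((d<<1)^(d>>31)))) // zig-zag value delta
		tab = binary.AppendUvarint(tab, 16)                             // pc delta
	}
	return append(tab, 0) // end-of-table marker
}

// pairsWalked counts how many (value, pc) pairs must be decoded
// before targetPC can be resolved.
func pairsWalked(tab []byte, targetPC uint64) int {
	steps := 0
	var pc uint64
	for len(tab) > 0 {
		uv, n := binary.Uvarint(tab)
		if n <= 0 || (uv == 0 && steps > 0) {
			break // malformed table or end marker
		}
		tab = tab[n:]
		pcdelta, n := binary.Uvarint(tab)
		tab = tab[n:]
		pc += pcdelta
		steps++
		if targetPC < pc {
			break // resolved
		}
	}
	return steps
}

func main() {
	tab := encodeTable(512) // covers PC offsets [0, 8192)
	fmt.Println(pairsWalked(tab, 0))    // 1
	fmt.Println(pairsWalked(tab, 4096)) // 257
}
```

So a lookup at offset 4096 does roughly 256x the decode work of a lookup at offset 0, matching the slowdown pattern in the benchstat rows.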
```
perflock go test -test.run ^$ -test.bench Decode -bench-binary ~/sdk/go1.20.6/bin/go -bench-binary /tmp/kubelet -test.count 20 -test.timeout 20m
```