bytes: relatively poor performance in bytes.Compare #62648
Comments
Your compare function is not strictly equivalent to
I get:
This largely has to do with how different the UUIDs are. Of course, it would be nice to have the best of both worlds. Maybe there are some early-exit tricks we can use to improve the obviously-different case.
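One user-level reading of the early-exit idea can be sketched as follows. This is only an illustration: compareEarlyExit is a hypothetical helper, not part of the bytes package and not something proposed in this thread. It settles the obviously-different case on the first byte without a call and falls back to bytes.Compare otherwise.

```go
package main

import (
	"bytes"
	"fmt"
)

// compareEarlyExit is a hypothetical wrapper (an assumption for illustration).
// It decides the result from the first byte when that byte already differs,
// and only calls bytes.Compare when the first bytes are equal.
func compareEarlyExit(a, b *[16]byte) int {
	if a[0] != b[0] {
		if a[0] < b[0] {
			return -1
		}
		return 1
	}
	return bytes.Compare(a[:], b[:])
}

func main() {
	x := [16]byte{0: 1}
	y := [16]byte{0: 2}
	z := [16]byte{0: 1, 15: 9}

	fmt.Println(compareEarlyExit(&x, &y)) // first bytes differ, no call needed
	fmt.Println(compareEarlyExit(&x, &z)) // first bytes match, falls back to bytes.Compare
	fmt.Println(compareEarlyExit(&z, &z)) // identical arrays
}
```

Whether this helps depends on how often the first byte differs in real data, so it should be benchmarked against the actual workload.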
I was also hoping to find a way to cast (without a performance hit) the
I agree that the ratio of hits to misses is influential. Thanks for taking a look.
There is no penalty with
See https://godbolt.org/z/hxbeYE5E9 - both parts are loaded with a single MOVQ and no bounds checks are needed.
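The expression this comment refers to did not survive extraction. Assuming it means reading each half of the 16-byte array as a 64-bit word, a minimal sketch could look like the following. compareWords and cmpUint64 are hypothetical names, and the use of encoding/binary is an assumption. Big-endian decoding is chosen because it preserves lexicographic byte order, so the result agrees with bytes.Compare.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// cmpUint64 returns -1, 0, or +1, following the bytes.Compare convention.
func cmpUint64(x, y uint64) int {
	switch {
	case x < y:
		return -1
	case x > y:
		return 1
	}
	return 0
}

// compareWords is a hypothetical helper: it compares two 16-byte arrays as
// two big-endian 64-bit words, high half first.
func compareWords(a, b *[16]byte) int {
	if c := cmpUint64(binary.BigEndian.Uint64(a[:8]), binary.BigEndian.Uint64(b[:8])); c != 0 {
		return c
	}
	return cmpUint64(binary.BigEndian.Uint64(a[8:]), binary.BigEndian.Uint64(b[8:]))
}

func main() {
	x := [16]byte{7: 1}
	y := [16]byte{8: 0xff}

	// Both calls should agree on the ordering of x and y.
	fmt.Println(compareWords(&x, &y), bytes.Compare(x[:], y[:]))
}
```

The slice bounds are constants here, which is what lets the compiler drop the bounds checks mentioned in the comment.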
This is because the handwritten function is inlined.
goos: darwin
goarch: arm64
pkg: go-playground/pkg
BenchmarkBytesCompare
BenchmarkBytesCompare-8 10990532 106.1 ns/op
BenchmarkArrayCompare
BenchmarkArrayCompare-8 13332518 89.93 ns/op
BenchmarkArrayCompareNoInline
BenchmarkArrayCompareNoInline-8 8318890 138.7 ns/op
BenchmarkBytesCompareAllSame
BenchmarkBytesCompareAllSame-8 9992674 119.0 ns/op
BenchmarkArrayCompareAllSame
BenchmarkArrayCompareAllSame-8 3212157 371.1 ns/op
BenchmarkArrayCompareAllSameNoInline
BenchmarkArrayCompareAllSameNoInline-8 2865070 487.5 ns/op
We use "compare_native" to compare the bytes using architecture-specific optimizations, but assembly routines cannot be inlined. If the UUIDs are different, the overhead of the call is more significant than when they are the same, in which case the loop dominates. I don't think there is a good way to inline assembly instructions as of today.
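The NoInline benchmark names above suggest the handwritten comparison was measured with inlining disabled. The original benchmark code is not shown, so the following is only a sketch of how such a pair might be set up. The id type, both compare functions, and the benchmark bodies are assumptions. The //go:noinline directive keeps the wrapper from being inlined into its caller, which isolates the call overhead.

```go
package main

import (
	"fmt"
	"testing"
)

// id stands in for a 16-byte UUID-like array (hypothetical type).
type id [16]byte

// compareArray is a plain byte-by-byte comparison. The compiler may inline
// it if it fits the inlining budget; go build -gcflags=-m reports the decision.
func compareArray(a, b *id) int {
	for i := range a {
		if a[i] != b[i] {
			if a[i] < b[i] {
				return -1
			}
			return 1
		}
	}
	return 0
}

// compareArrayNoInline has the same behavior but is never inlined into callers.
//
//go:noinline
func compareArrayNoInline(a, b *id) int {
	return compareArray(a, b)
}

var (
	x    = id{0: 1}
	y    = id{0: 2}
	sink int // keeps results live so the loops are not optimized away
)

func benchInline(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = compareArray(&x, &y)
	}
}

func benchNoInline(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = compareArrayNoInline(&x, &y)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("inlinable:", testing.Benchmark(benchInline))
	fmt.Println("noinline: ", testing.Benchmark(benchNoInline))
}
```

Absolute timings vary by machine and Go version, so only the relative gap between the two results is meaningful.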
What version of Go are you using (go version)?
1.21
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/phil/Library/Caches/go-build'
GOENV='/Users/phil/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/phil/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/phil/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.21.0'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/2w/sj4gll256rqfs6r4sgwcn3lh0000gn/T/go-build2454689691=/tmp/go-build -gno-record-gcc-switches -fno-common'
What did you do?
For fixed-length arrays (such as github.com/google/uuid.UUID), bytes.Compare performs poorly relative to a handwritten:
Bench results where there's a
What did you expect to see?
I expected the internal bytes.Compare to perform optimally.
What did you see instead?
Probably due to the conversion to a slice and the inability to unroll the loop, bytes.Compare performed about 3x worse.
It would be ideal to have a function that would compare fixed-length arrays optimally.