What version of Go are you using (go version)?
$ go version
go version devel +7b872b6d95 Tue Jun 9 23:24:08 2020 +0000 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env darwin/amd64
$ go env
GOARCH="amd64"
GOOS="darwin"
What did you do?
I've implemented a faster version of siftDown for when it's expected that the root will end up in one of the lower levels. This is the likely outcome right after a "pop", since we move a former leaf to the root.
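For readers unfamiliar with the idea, here is a minimal sketch of one well-known way to do this, usually called "bottom-up" sift-down. It is an illustration under assumptions, not necessarily the code in the proposed change: the names siftDownBottomUp, heapSort and sortInts are hypothetical, and the offset parameter that sort.go's siftDown takes is dropped for brevity. The sketch first walks down to a leaf along the larger child (one comparison per level instead of two), then climbs back up to find where the old root belongs, and finally rotates it into place.

```go
package main

import (
	"fmt"
	"sort"
)

// siftDownBottomUp restores the max-heap property for the subtree rooted at
// lo within data[0:hi], assuming both child subtrees of lo are already heaps.
// Hypothetical helper for illustration; not the code from the actual change.
func siftDownBottomUp(data sort.Interface, lo, hi int) {
	// Phase 1: follow the larger child down to a leaf without comparing
	// against the root. This costs one Less call per level.
	j := lo
	for {
		child := 2*j + 1
		if child >= hi {
			break
		}
		if child+1 < hi && data.Less(child, child+1) {
			child++
		}
		j = child
	}
	// Phase 2: climb back up until the element at j is not smaller than the
	// root. If the root belongs near the bottom, this loop exits quickly.
	for j > lo && data.Less(j, lo) {
		j = (j - 1) / 2
	}
	// Phase 3: rotate the root into position j using Swap only. Every element
	// on the path between lo and j moves up one level.
	for k := j; k > lo; k = (k - 1) / 2 {
		data.Swap(lo, k)
	}
}

// heapSort sorts data in ascending order using the sift-down variant above.
func heapSort(data sort.Interface) {
	n := data.Len()
	// Build the max-heap.
	for i := (n - 1) / 2; i >= 0; i-- {
		siftDownBottomUp(data, i, n)
	}
	// Pop: move the maximum to the end, then sift the former leaf down.
	for i := n - 1; i >= 0; i-- {
		data.Swap(0, i)
		siftDownBottomUp(data, 0, i)
	}
}

// sortInts is a small convenience wrapper for trying the sketch on ints.
func sortInts(a []int) {
	heapSort(sort.IntSlice(a))
}

func main() {
	a := []int{5, 2, 9, 1, 5, 6, 0, 3}
	sortInts(a)
	fmt.Println(a)
}
```

Note that when many elements compare equal, phase 2 stops at the leaf and phase 3 then swaps along the whole path, whereas the classic sift-down would return at the root right away. That matches the slowdown for equivalent elements described below.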
Benchmarks using an adversarial input (the one in sort_test.go) show that the new code is ~11.8% faster when using a struct with a complex Less, and ~5.5% faster when sorting ints.
My change makes the code slower when many elements are equivalent (!Less(a, b) && !Less(b, a)), but this case is uncommon when the Heapsort part of sort.go is activated.
Notes
The same changes can be applied to container/heap and to runtime, assuming benchmarks support them.
benchstat results
name                          old time/op  new time/op  delta
Adversary/Using_complex_Less  49.5ms ± 0%  43.7ms ± 0%  -11.77%  (p=0.000 n=10+10)
Adversary/Using_Ints          31.3ms ± 0%  29.6ms ± 0%   -5.48%  (p=0.000 n=10+10)