Please see example at https://github.com/bboreham/go-loser/tree/lesser.
I would like the call to Less() to be devirtualized and inlined, because the implementation is trivial.
A version of the code using < instead of a Less method is here; it runs about 40% faster.
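For readers who do not want to open the repository, the two variants being compared can be sketched roughly as follows. This is a minimal illustration, not the code from go-loser: the `Lesser` constraint and the `minLess` and `minOrdered` helpers are hypothetical names invented here, and only `Uint64` and its `Less` method mirror names that appear in the debug output.

```go
package main

import (
	"cmp"
	"fmt"
)

// Lesser is a hypothetical constraint (not taken from go-loser): a type
// that can compare itself to another value of the same type.
type Lesser[T any] interface {
	Less(other T) bool
}

// Uint64 mirrors the type name seen in the pgodebug output. Its Less
// method is trivial, which is why inlining it would be desirable.
type Uint64 uint64

func (a Uint64) Less(b Uint64) bool { return a < b }

// minLess returns the smallest element, comparing through the Less method.
// In generic code this is the call one would like devirtualized and inlined.
// It assumes xs is non-empty.
func minLess[E Lesser[E]](xs []E) E {
	m := xs[0]
	for _, x := range xs[1:] {
		if x.Less(m) {
			m = x
		}
	}
	return m
}

// minOrdered is the same loop written with the built-in < operator.
// It assumes xs is non-empty.
func minOrdered[E cmp.Ordered](xs []E) E {
	m := xs[0]
	for _, x := range xs[1:] {
		if x < m {
			m = x
		}
	}
	return m
}

func main() {
	xs := []Uint64{3, 1, 2}
	fmt.Println(minLess(xs), minOrdered(xs))
}
```

Both functions compute the same result; the difference under discussion is only in how the comparison is compiled.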
Commands I ran in my attempt:
go1.23rc2 test -run xxx -bench '^BenchmarkMerge$' -cpuprofile pgo.pprof
go1.23rc2 test -pgo pgo.pprof -run xxx -bench '^BenchmarkMerge$'
(go version go1.23rc2 linux/amd64)
I tried to gain some insight via -gcflags=-d=pgodebug=2, but I don't really follow what it is telling me.
...
hot-node enabled increased budget=2000 for func=github.com/bboreham/go-loser_test.Uint64.Less
...
./tree.go:142:21: PGO devirtualize considering call (func(go.shape.uint64, go.shape.uint64) bool)(&loser..dict[2])(loser.node.value, loser.winningValue)
./tree.go:142:21: edge github.com/bboreham/go-loser.(*Tree[go.shape.uint64,go.shape.*uint8]).replayGames:5 -> github.com/bboreham/go-loser_test.Uint64.Less (weight 10): method(Uint64) func(Uint64) bool doesn't match func(go.shape.uint64, go.shape.uint64) bool
./tree.go:142:21: call github.com/bboreham/go-loser.(*Tree[go.shape.uint64,go.shape.*uint8]).replayGames:5: no hot callee
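One possible reading of the "doesn't match" line, offered as a guess rather than a statement about compiler internals: the two signatures resemble the two ways Go lets you refer to a method. A method value binds the receiver and takes one argument, while a method expression takes the receiver as an explicit first argument. The sketch below only demonstrates that language-level distinction; the `viaMethodValue` and `viaMethodExpr` helpers are hypothetical names.

```go
package main

import "fmt"

// Uint64 mirrors the type name seen in the pgodebug output.
type Uint64 uint64

func (a Uint64) Less(b Uint64) bool { return a < b }

// viaMethodValue binds the receiver first, giving a one-argument function,
// comparable in shape to "func(Uint64) bool" in the log.
func viaMethodValue(a, b Uint64) bool {
	var f func(Uint64) bool = a.Less
	return f(b)
}

// viaMethodExpr uses a method expression, where the receiver becomes the
// first parameter, comparable in shape to the two-argument function type
// in the log.
func viaMethodExpr(a, b Uint64) bool {
	var f func(Uint64, Uint64) bool = Uint64.Less
	return f(a, b)
}

func main() {
	fmt.Println(viaMethodValue(1, 2), viaMethodExpr(1, 2))
}
```

Both forms call the same method and return the same result; whether this distinction is actually what the PGO devirtualization check is tripping over is an assumption on my part.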
[This is something I was chatting with @prattmic about at GopherCon 2023; I just got round to writing it up]