Description
When a method returns a struct by value and the caller only accesses a single field, the compiler copies the entire struct onto the stack even though only one field is needed. If the struct is backed by a pointer dereference, the compiler should be able to load just the accessed field directly.
This matters in practice for unique.Handle[T].Value() when T is a moderately sized struct and Value().someField is read in a hot loop.
Reproducer:
package main

type Handle[T any] struct {
	value *T
}

func (h Handle[T]) Value() T {
	return *h.value
}

type big struct {
	typ   int8
	index int64
	str   string
	pkgID string
}

type S struct {
	h Handle[big]
}

func main() { f(nil) }

var f = find

func find(ss []S) S {
	for _, s := range ss {
		if s.typ() == 2 {
			return s
		}
	}
	return S{}
}

func (s S) typ() int8 {
	if s == (S{}) {
		return 0
	}
	return s.h.Value().typ
}
With go tool objdump, the loop body of find copies the entire 48-byte big struct onto the stack (six MOVUPS instructions: three 16-byte loads and three 16-byte stores) just to read its first byte:
main.go:40 LEAQ 0(SP), SI
main.go:40 MOVUPS 0(DX), X14
main.go:40 MOVUPS X14, 0(SI)
main.go:40 MOVUPS 0x10(DX), X14
main.go:40 MOVUPS X14, 0x10(SI)
main.go:40 MOVUPS 0x20(DX), X14
main.go:40 MOVUPS X14, 0x20(SI)
main.go:40 MOVZX 0(SP), SI
Changing s.h.Value().typ to (*s.h.value).typ (an explicit pointer dereference) produces dramatically better code — a single MOVZX 0(DX), SI with no stack frame at all.
Both expressions have identical semantics when the pointer is non-nil, which the s == (S{}) zero-value check above guarantees on this path, so the compiler should be able to apply this optimization.
Tested with go version go1.26.0 linux/amd64