cmd/compile: optimize large structs #24416

@randall77

The compiler currently compiles large structs conservatively. If a struct type has more than 4 fields (or meets a few other conditions), we treat that type as unSSAable. All operations on variables of that type go to the stack, as if their address were taken.
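
To make the field-count rule concrete, here is a minimal sketch of it as a predicate. It models only the "more than 4 fields" condition stated above; the other conditions are not modeled. `fieldCountAllowsSSA` and `maxSSAFields` are hypothetical names chosen for illustration, not compiler APIs.

```go
package main

import (
	"fmt"
	"reflect"
)

// maxSSAFields mirrors the limit described in this issue.
// Assumption: the "few other conditions" are deliberately left out.
const maxSSAFields = 4

type T struct{ a, b, c, d int }
type U struct{ a, b, c, d, e int }

// fieldCountAllowsSSA is a hypothetical helper. It reports whether v is a
// struct with at most maxSSAFields fields. It is only meaningful for struct
// values and returns false for everything else.
func fieldCountAllowsSSA(v interface{}) bool {
	t := reflect.TypeOf(v)
	return t != nil && t.Kind() == reflect.Struct && t.NumField() <= maxSSAFields
}

func main() {
	fmt.Println(fieldCountAllowsSSA(T{})) // true: 4 fields
	fmt.Println(fieldCountAllowsSSA(U{})) // false: 5 fields
}
```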

This is suboptimal in various ways. For example:

type T struct {
	a, b, c, d int
}

func f(x *T) {
	t := T{}
	*x = t
}

type U struct {
	a, b, c, d, e int
}

func g(x *U) {
	u := U{}
	*x = u
}

f is compiled optimally, to:

	XORPS	X0, X0
	MOVQ	"".x+8(SP), AX
	MOVUPS	X0, (AX)
	MOVUPS	X0, 16(AX)
	RET

g is quite a bit worse:

	MOVQ	BP, 40(SP)
	LEAQ	40(SP), BP
	MOVQ	$0, "".u(SP)
	XORPS	X0, X0
	MOVUPS	X0, "".u+8(SP)
	MOVUPS	X0, "".u+24(SP)
	MOVQ	"".u(SP), AX
	MOVQ	"".x+56(SP), CX
	MOVQ	AX, (CX)
	LEAQ	8(CX), DI
	LEAQ	"".u+8(SP), SI
	DUFFCOPY	$868
	MOVQ	40(SP), BP
	ADDQ	$48, SP
	RET

We zero a temporary variable on the stack, then copy it to the destination.
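
For anyone who wants to reproduce the comparison, a self-contained program along these lines should work. The file name `large.go` is an assumption, and the `//go:noinline` directives are added here only so that both functions survive as separate symbols. The exact instructions printed will depend on the compiler version and target architecture.

```go
// Dump the generated assembly with:
//
//	go build -gcflags=-S large.go
package main

import "fmt"

type T struct{ a, b, c, d int }
type U struct{ a, b, c, d, e int }

// f zeroes a 4-field struct through a local temporary.
//
//go:noinline
func f(x *T) {
	t := T{}
	*x = t
}

// g does the same for a 5-field struct.
//
//go:noinline
func g(x *U) {
	u := U{}
	*x = u
}

func main() {
	x := T{1, 2, 3, 4}
	y := U{1, 2, 3, 4, 5}
	f(&x)
	g(&y)
	// Both destinations are zeroed; only the generated code differs.
	fmt.Println(x == T{}, y == U{})
}
```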

We should process large structs through SSA as well. This will require a fair amount of work in the SSA backend to introduce struct builders, selectors of arbitrary width, stack allocation of large types, maybe heap allocation if they are really huge, etc.
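
As a rough illustration of the intended result (not of how the SSA backend would implement it), the sketch below writes g by hand in decomposed form: each field is an independent scalar value stored directly to the destination, with no struct temporary on the stack. `gFieldwise` is a hypothetical name used only for this example.

```go
package main

import "fmt"

type U struct{ a, b, c, d, e int }

// g is the function from this issue.
func g(x *U) {
	u := U{}
	*x = u
}

// gFieldwise is a hand-decomposed equivalent of g. The five fields are
// tracked as separate scalars and stored straight into *x.
func gFieldwise(x *U) {
	var a, b, c, d, e int // five independent zero values
	x.a, x.b, x.c, x.d, x.e = a, b, c, d, e
}

func main() {
	p := U{1, 2, 3, 4, 5}
	q := U{1, 2, 3, 4, 5}
	g(&p)
	gFieldwise(&q)
	fmt.Println(p == q) // true: both destinations end up zeroed
}
```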

Arrays of size > 1 are in a similar state, but they are somewhat harder to handle because non-constant indexes complicate things further.
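
To see why non-constant indexes are awkward, consider what happens once the elements are split into separate scalar values: an index known only at run time has to be turned into an explicit selection among them. The sketch below shows this by hand for a 2-element array. `pickScalars` is a hypothetical name, and this is only an illustration of the problem, not a proposed compiler design.

```go
package main

import "fmt"

// pick indexes an array with a run-time index. With the array in memory
// this is a bounds check plus an address computation.
func pick(a [2]int, i int) int {
	return a[i]
}

// pickScalars models the same operation after the array has been split
// into independent scalars: the index must select among the values.
func pickScalars(a0, a1 int, i int) int {
	switch i {
	case 0:
		return a0
	case 1:
		return a1
	}
	panic("index out of range")
}

func main() {
	a := [2]int{10, 20}
	for i := range a {
		fmt.Println(pick(a, i), pickScalars(a[0], a[1], i))
	}
}
```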

Metadata

Labels: NeedsFix (the path to resolution is known, but the work has not been done), Performance, compiler/runtime (issues related to the Go compiler and/or runtime)