cmd/compile: inefficient CALL setup when more than 32bytes of args #23377

Open
ALTree opened this Issue Jan 8, 2018 · 3 comments

@ALTree
Member

ALTree commented Jan 8, 2018

$ gotip version
go version devel +a62071a209 Sat Jan 6 04:52:00 2018 +0000 linux/amd64
type T struct {
	s1, s2 string
}

//go:noinline
func foo(t T) { _ = t }

func bar() {
	var t T
	foo(t)
}

generates

0x0020 00032 (test.go:14)	MOVUPS	X0, (SP)
0x0024 00036 (test.go:14)	MOVUPS	X0, 16(SP)
0x0029 00041 (test.go:14)	CALL	"".foo(SB)

but when T has a third string field

type T struct {
	s1, s2, s3 string    // one more string
}

the compiler generates

0x001d 00029 (test.go:13)	XORPS	X0, X0
0x0020 00032 (test.go:13)	MOVUPS	X0, "".t+48(SP)
0x0025 00037 (test.go:13)	MOVUPS	X0, "".t+64(SP)
0x002a 00042 (test.go:13)	MOVUPS	X0, "".t+80(SP)
0x002f 00047 (test.go:13)	MOVQ	SP, DI
0x0032 00050 (test.go:14)	LEAQ	"".t+48(SP), SI
0x0037 00055 (test.go:14)	DUFFCOPY	$854
0x004a 00074 (test.go:14)	CALL	"".foo(SB)

The stack frame is bigger: first we MOVUPS a bunch of zeros to 48(SP), 64(SP) and 80(SP), then we call DUFFCOPY to move them again to (SP). This seems wasteful. Even if we cross the multiple-MOVs/DUFF threshold, it should be possible to just DUFFZERO at (SP), zeroing the argument area in place, which is essentially what the first snippet does.

This also happens when there's no zeroing going on. For example, for struct { a, b, c, d int64 }, when initialized as t = {1, 2, 3, 4}, the values are moved directly to (SP), but for struct { a, b, c, d, e int64 }, which is bigger than 32 bytes, they aren't. There are 5 moves to slots higher up in the stack frame, and then a DUFFCOPY call moves them to (SP).
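
A self-contained file for reproducing the non-zeroing case might look like the sketch below. The names T4, T5, use4, use5, call4 and call5 are made up for illustration and are not from the original report; the callees return a sum instead of discarding the argument so that the program has something to print. The generated code depends on the compiler version, so the instructions quoted above may not appear with a different toolchain.

```go
package main

import "fmt"

// T4 is four words (32 bytes on amd64): the case where the values
// were reported to be stored directly into the argument area at (SP).
type T4 struct{ a, b, c, d int64 }

// T5 is five words (40 bytes on amd64): the case where the values
// were reported to be stored elsewhere and then copied with DUFFCOPY.
type T5 struct{ a, b, c, d, e int64 }

// noinline keeps the calls (and their argument setup) in the output.

//go:noinline
func use4(t T4) int64 { return t.a + t.b + t.c + t.d }

//go:noinline
func use5(t T5) int64 { return t.a + t.b + t.c + t.d + t.e }

func call4() int64 {
	t := T4{1, 2, 3, 4}
	return use4(t)
}

func call5() int64 {
	t := T5{1, 2, 3, 4, 5}
	return use5(t)
}

func main() {
	fmt.Println(call4(), call5())
}
```

Building with `go build -gcflags=-S` prints the assembly; comparing the argument setup before the CALL in call4 and call5 shows whether the extra copy is present.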

@randall77

Contributor

randall77 commented Jan 8, 2018

This is a special case of the more general problem that large structs (> 4 words) aren't handled efficiently.
It is on my radar, and I've even attempted a CL. But I don't like my solution yet.

@ALTree

Member

ALTree commented Jan 8, 2018

@randall77 thanks. I'll let you decide whether it's worth keeping this open to track the issue, or whether we can close it since the general problem is already known.

@randall77

Contributor

randall77 commented Jan 8, 2018

Let's leave it open for now. It will be good as an example to double-check when the general issue gets fixed.

@randall77 randall77 added this to the Unplanned milestone Jan 8, 2018
