Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: superfluous MOV instructions #32969

Open
dsnet opened this issue Jul 8, 2019 · 1 comment

Comments

Projects
None yet
2 participants
@dsnet
Copy link
Member

commented Jul 8, 2019

Using go1.12 on linux/amd64

Consider the following benchmark:

var srcString = "1234"
var srcBytes = []byte(srcString)
var dst uint64

// BenchmarkIdeal is the ideal performance parsing a string.
func BenchmarkIdeal(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		dst, _ = strconv.ParseUint(srcString, 0, 64)
	}
}

// BenchmarkCopy is the performance after copying the input bytes as a string.
// Performance loss is expected.
func BenchmarkCopy(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		dst, _ = strconv.ParseUint(string(srcBytes), 0, 64)
	}
}

// BenchmarkCast is the performance when using unsafe to cast bytes as a string.
func BenchmarkCast(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		dst, _ = strconv.ParseUint(UnsafeString(srcBytes), 0, 64)
	}
}

type stringHeader struct {
	Data unsafe.Pointer
	Len  int
}
type sliceHeader struct {
	Data unsafe.Pointer
	Len  int
	Cap  int
}

func UnsafeString(b []byte) (s string) {
	bp := (*sliceHeader)(unsafe.Pointer(&b))
	sp := (*stringHeader)(unsafe.Pointer(&s))
	sp.Data = bp.Data
	sp.Len = bp.Len
	return s
}

This currently reports the following on my system:

BenchmarkIdeal-8   	100000000	        14.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkCopy-8    	50000000	        35.1 ns/op	       4 B/op	       1 allocs/op
BenchmarkCast-8    	100000000	        15.7 ns/op	       0 B/op	       0 allocs/op

I expect BenchmarkIdeal and BenchmarkCast to be the same, but there is a ~1ns loss of performance.

The assembly of the inner loop between BenchmarkIdeal and BenchmarkCast is:

-	0x0033 00051	MOVQ	"".srcString(SB), AX
-	0x003a 00058	MOVQ	"".srcString+8(SB), CX
+	0x003a 00058	MOVQ	"".srcBytes+16(SB), AX
+	0x0041 00065	MOVQ	"".srcBytes(SB), CX
+	0x0048 00072	MOVQ	"".srcBytes+8(SB), DX
+	0x004f 00079	MOVQ	CX, "".b+80(SP)
+	0x0054 00084	MOVQ	DX, "".b+88(SP)
+	0x0059 00089	MOVQ	AX, "".b+96(SP)
+	0x005e 00094	XORPS	X0, X0
+	0x0061 00097	MOVUPS	X0, "".s+64(SP)
+	0x0066 00102	XCHGL	AX, AX
+	0x0067 00103	MOVQ	"".b+80(SP), AX
+	0x006c 00108	MOVQ	AX, "".s+64(SP)
+	0x0071 00113	MOVQ	"".b+88(SP), AX
+	0x0076 00118	MOVQ	AX, "".s+72(SP)
+	0x007b 00123	MOVQ	"".s+64(SP), CX
	0x0080 00128	MOVQ	CX, (SP)
	0x0084 00132	MOVQ	AX, 8(SP)
	0x0089 00137	MOVQ	$0, 16(SP)
	0x0092 00146	MOVQ	$64, 24(SP)
	0x009b 00155	CALL	strconv.ParseUint(SB)
	0x00a0 00160	MOVQ	32(SP), AX
	0x00a5 00165	MOVQ	AX, "".dst(SB)

It seems that although the call to UnsafeString is inlined, the compiler still emits a number of MOV instructions that seem superfluous. Ideally, the compiler would emit assembly that is functionally identical to the BenchmarkIdeal case.

@dsnet dsnet added the Performance label Jul 8, 2019

@bcmills bcmills added this to the Unplanned milestone Jul 8, 2019

@bcmills

This comment has been minimized.

Copy link
Member

commented Jul 8, 2019

@randall77, @dr2chase: is this the same underlying issue as #32624?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.