package p

func f(x int) (uint32, uint32) {
	var a, b uint32
	for {
		a++
		if x == 0 {
			break
		}
		x--
		b += 2
	}
	return a, b
}
This compiles as:
"".f STEXT nosplit size=33 args=0x10 locals=0x0
0x0000 00000 (x.go:3) TEXT "".f(SB), NOSPLIT, $0-16
0x0000 00000 (x.go:3) FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
0x0000 00000 (x.go:3) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:3) MOVQ "".x+8(SP), AX
0x0005 00005 (x.go:3) XORL CX, CX
0x0007 00007 (x.go:3) MOVL CX, DX
0x0009 00009 (x.go:5) JMP 17
0x000b 00011 (x.go:10) DECQ AX
0x000e 00014 (x.go:11) ADDL $2, DX
0x0011 00017 (x.go:6) INCL CX
0x0013 00019 (x.go:7) TESTQ AX, AX
0x0016 00022 (x.go:7) JNE 11
0x0018 00024 (x.go:13) MOVL CX, "".~r1+16(SP)
0x001c 00028 (x.go:13) MOVL DX, "".~r2+20(SP)
0x0020 00032 (x.go:13) RET
This issue is about instruction 0x0007, MOVL CX, DX. I think we should prefer XORL DX, DX. The reasoning is that it is shorter (2 bytes instead of 4) and avoids false dependencies between registers. Rematerializing is only preferable when it is cheaper than a register copy, which is a special but common case; zeroing is one example.
I believe that the modification to regalloc should occur in processDest, but that's as far as I got.
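To make the proposal concrete, here is a toy sketch of the decision, not the actual regalloc code. The types and functions (value, rematCheaperThanCopy, placeInReg) are hypothetical and exist only for illustration, and the cost rule is an assumption that treats zeroing as the only case where rematerialization beats a copy.

```go
package main

import "fmt"

// op is a hypothetical, simplified stand-in for an SSA opcode.
type op int

const (
	opConstZero op = iota // integer zero; can be materialized with XORL r, r
	opConst               // other integer constant
	opAdd                 // arbitrary computation; not rematerialized in this toy
)

// value is a toy model of a value that already lives in a register.
type value struct {
	op  op
	reg string // register currently holding the value
}

// rematCheaperThanCopy reports whether recomputing v in a fresh register is
// assumed to be cheaper than a register-to-register move.
// Assumption: only zeroing qualifies in this toy model.
func rematCheaperThanCopy(v value) bool {
	return v.op == opConstZero
}

// placeInReg returns the instruction this toy allocator would emit to get
// v's value into dst.
func placeInReg(v value, dst string) string {
	if rematCheaperThanCopy(v) {
		// Rematerialize: no dependency on the source register.
		return fmt.Sprintf("XORL %s, %s", dst, dst)
	}
	// Fall back to a plain register copy.
	return fmt.Sprintf("MOVL %s, %s", v.reg, dst)
}

func main() {
	fmt.Println(placeInReg(value{op: opConstZero, reg: "CX"}, "DX"))
	fmt.Println(placeInReg(value{op: opAdd, reg: "CX"}, "DX"))
}
```

In this model the zero in CX is re-created in DX rather than copied, while any other value still gets a plain move.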
cc @cherrymui @randall77