Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: bad code generation for ARM (bounds test fails) #41780

Closed
dr2chase opened this issue Oct 4, 2020 · 6 comments
Closed

cmd/compile: bad code generation for ARM (bounds test fails) #41780

dr2chase opened this issue Oct 4, 2020 · 6 comments

Comments

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Oct 4, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15.2 darwin/amd64

and also

$ go version
go version go1.15 linux/arm64

and also

$ go version
go version devel +869c02ce1f Sat Oct 3 17:02:58 2020 +0000 linux/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/drchase/Library/Caches/go-build"
GOENV="/Users/drchase/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/drchase/work/gocode/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/drchase/work/gocode"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/drchase/work/go-1.15"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/drchase/work/go-1.15/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/gr/vvb66dqx6jl6lh8wckfd5p9w0095tn/T/go-build888553357=/tmp/go-build -gno-record-gcc-switches -fno-common"

and also

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/drchase/.cache/go-build"
GOENV="/home/drchase/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/drchase/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/drchase/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/drchase/work/go-1.15"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/drchase/work/go-1.15/pkg/tool/linux_arm64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build271442527=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Compile this program https://play.golang.org/p/TfpedZeakHY for GOARCH=arm GOOS=linux

GOOS=linux GOARCH=arm go run arm.go

Code also appears below:

package main

type decimal struct {
	d  [8]byte // digits, big-endian representation
	dp int     // decimal point
}

var powtab = []int{1, 3, 6, 9, 13, 16, 19, 23, 26}

//go:noinline
func foo(d *decimal) int {
	exp := int(d.d[1])
	if d.dp < 0 || d.dp == 0 && d.d[0] < '5' {
		var n int
		if -d.dp >= len(powtab) {
			n = 27
		} else {
			n = powtab[-d.dp]
		}
		exp += n
	}
	return exp
}

func main() {
	var d decimal
	d.d[0] = '1'
	println("foo(&d)=", foo(&d))
}

What did you expect to see?

The playground output: foo(&d)= 1

What did you see instead?

On an ARM:

GOARCH=arm go run arm.go
panic: runtime error: index out of range [0] with length 9

goroutine 1 [running]:
main.foo(0x202c7ac, 0x0)
	/home/drchase/work/tmp/arm.go:18 +0x78
main.main()
	/home/drchase/work/tmp/arm.go:28 +0x30
exit status 2

The bug is very delicate, it depends on something going wrong in arm lowering/optimization to machine code, and it is in that dyslexia-inducing bit where comparisons are reversed, inverted, etc. The signature of the bug (seen in ssa.html) is that the comparison -d.dp >= len(powtab) gets compiled into a CMN instruction both at the comparison and at the bounds check; if this happens, something goes wrong. The pattern firing depends on the SSA variable numbering of the inputs to the originally generated CMP.

From here, a complete dump of the debugging work, just collecting, not analyzing, that can happen later.

Screen Shot 2020-10-03 at 11 37 20 PM

Ultimately, the code looks like this:
Screen Shot 2020-10-03 at 11 41 56 PM

According to gdb, this is the code that is generated:

=> 0x6c170 <main.foo>:	ldr	r1, [r10, #8]
   0x6c174 <main.foo+4>:	cmp	sp, r1
   0x6c178 <main.foo+8>:	bls	0x6c1ec <main.foo+124>
   0x6c17c <main.foo+12>:	str	lr, [sp, #-12]!
   0x6c180 <main.foo+16>:	ldr	r2, [sp, #16]
   0x6c184 <main.foo+20>:	ldrb	r3, [r2, #1]
   0x6c188 <main.foo+24>:	ldr	r4, [r2, #8]
   0x6c18c <main.foo+28>:	cmp	r4, #0
   0x6c190 <main.foo+32>:	bge	0x6c1cc <main.foo+92>
   0x6c194 <main.foo+36>:	ldr	r11, [pc, #96]	; 0x6c1fc <main.foo+140>
   0x6c198 <main.foo+40>:	ldr	r1, [r11]
   0x6c19c <main.foo+44>:	ldr	r11, [pc, #92]	; 0x6c200 <main.foo+144>
   0x6c1a0 <main.foo+48>:	ldr	r2, [r11]
   0x6c1a4 <main.foo+52>:	cmn	r1, r4
   0x6c1a8 <main.foo+56>:	bgt	0x6c1bc <main.foo+76>
   0x6c1ac <main.foo+60>:	mov	r0, #27
   0x6c1b0 <main.foo+64>:	add	r0, r3, r0
   0x6c1b4 <main.foo+68>:	str	r0, [sp, #20]
   0x6c1b8 <main.foo+72>:	ldr	pc, [sp], #12
   0x6c1bc <main.foo+76>:	rsb	r0, r4, #0
   0x6c1c0 <main.foo+80>:	bls	0x6c1e4 <main.foo+116>
   0x6c1c4 <main.foo+84>:	ldr	r0, [r2, r0, lsl #2]
   0x6c1c8 <main.foo+88>:	b	0x6c1b0 <main.foo+64>
   0x6c1cc <main.foo+92>:	bne	0x6c1dc <main.foo+108>
   0x6c1d0 <main.foo+96>:	ldrb	r2, [r2]
   0x6c1d4 <main.foo+100>:	cmp	r2, #53	; 0x35
   0x6c1d8 <main.foo+104>:	bcc	0x6c194 <main.foo+36>
   0x6c1dc <main.foo+108>:	mov	r0, r3
   0x6c1e0 <main.foo+112>:	b	0x6c1b4 <main.foo+68>
   0x6c1e4 <main.foo+116>:	bl	0x69d50 <runtime.panicIndex>
   0x6c1e8 <main.foo+120>:	andeq	r0, r0, r0
   0x6c1ec <main.foo+124>:	mov	r3, lr
   0x6c1f0 <main.foo+128>:	bl	0x680fc <runtime.morestack_noctxt>
   0x6c1f4 <main.foo+132>:	b	0x6c170 <main.foo>
   0x6c1f8 <main.foo+136>:	b	0x6c1f8 <main.foo+136>
   0x6c1fc <main.foo+140>:	andeq	r1, sp, r4, asr r0
   0x6c200 <main.foo+144>:	andeq	r1, sp, r0, asr r0
   0x6c204 <main.main>:	ldr	r1, [r10, #8]

and this is the trace where it goes wrong:

2: /x $cpsr = 0x80000010
4: /x $r0 = 0x0
5: /x $r1 = 0x9
6: /x $r2 = 0xd0100
7: /x $r3 = 0x0
8: /x $r4 = 0x0
9: x/3i $pc
=> 0x6c1a4 <main.foo+52>:	cmn	r1, r4
   0x6c1a8 <main.foo+56>:	bgt	0x6c1bc <main.foo+76>
   0x6c1ac <main.foo+60>:	mov	r0, #27
(gdb) 
0x0006c1a8	15	in /Users/drchase/work/tmp/arm.go
2: /x $cpsr = 0x10
4: /x $r0 = 0x0
5: /x $r1 = 0x9
6: /x $r2 = 0xd0100
7: /x $r3 = 0x0
8: /x $r4 = 0x0
9: x/3i $pc
=> 0x6c1a8 <main.foo+56>:	bgt	0x6c1bc <main.foo+76>
   0x6c1ac <main.foo+60>:	mov	r0, #27
   0x6c1b0 <main.foo+64>:	add	r0, r3, r0
(gdb) 
0x0006c1bc	22	in /Users/drchase/work/tmp/arm.go
2: /x $cpsr = 0x10
4: /x $r0 = 0x0
5: /x $r1 = 0x9
6: /x $r2 = 0xd0100
7: /x $r3 = 0x0
8: /x $r4 = 0x0
9: x/3i $pc
=> 0x6c1bc <main.foo+76>:	rsb	r0, r4, #0
   0x6c1c0 <main.foo+80>:	bls	0x6c1e4 <main.foo+116>
   0x6c1c4 <main.foo+84>:	ldr	r0, [r2, r0, lsl #2]
(gdb) 
18	in /Users/drchase/work/tmp/arm.go
2: /x $cpsr = 0x10
4: /x $r0 = 0x0
5: /x $r1 = 0x9
6: /x $r2 = 0xd0100
7: /x $r3 = 0x0
8: /x $r4 = 0x0
9: x/3i $pc
=> 0x6c1c0 <main.foo+80>:	bls	0x6c1e4 <main.foo+116>
   0x6c1c4 <main.foo+84>:	ldr	r0, [r2, r0, lsl #2]
   0x6c1c8 <main.foo+88>:	b	0x6c1b0 <main.foo+64>
(gdb) 
0x0006c1e4	18	in /Users/drchase/work/tmp/arm.go
2: /x $cpsr = 0x10
4: /x $r0 = 0x0
5: /x $r1 = 0x9
6: /x $r2 = 0xd0100
7: /x $r3 = 0x0
8: /x $r4 = 0x0
9: x/3i $pc
=> 0x6c1e4 <main.foo+116>:	bl	0x69d50 <runtime.panicIndex>
   0x6c1e8 <main.foo+120>:	andeq	r0, r0, r0
   0x6c1ec <main.foo+124>:	mov	r3, lr
@dr2chase
Copy link
Contributor Author

@dr2chase dr2chase commented Oct 5, 2020

I wonder if the (CMP x -y) => (CMN x y) is valid when the condition codes are used for unsigned comparison.
Nothing else in the rewrites looks the least bit suspicious.

Loading

@dr2chase
Copy link
Contributor Author

@dr2chase dr2chase commented Oct 5, 2020

Confirmed, the CMP to CMN transformation is not carry-bit-preserving. The failing rewrite:

b11: (UGT (CMP x (RSBconst [0] y))) else --> panic
          (CMP x (RSBconst [0] y)) => (CMN x y)
b11: (UGT (CMN x y)) else --> panic

evaluated:

BEFORE:
b11: (UGT (CMP "9" (RSBconst [0] "0"))) else --> panic
b11: (UGT (CMP "9" "0")) else --> panic
UGT = if C&~Z then, else
C = "9" - "0" does not borrow (i.e. "0" U< "9") = true
Z = false (9 != 0)
UGT = true and not false = true ==> "then"

AFTER:
b11: (UGT (CMN "9" "0")) else --> panic
UGT = if C&~Z then, else
C = "9" + "0" unsigned-overflows (i.e. "0" + "9" exceeds max unsigned) = false
Z = false (9 != 0)
UGT = false and not false = false ==> "else" == panic

In context, the full chain of rewrites for both comparisons (that are the same comparison):

b7:  (If (Leq32 x (Neg32 y))) else --> b11;  b11: (If (IsInBounds  (Neg32 y) x)) else --> panic

  (Leq32 x y) => (LessEqual (CMP x y))
  (Neg(32|16|8) x) => (RSBconst [0] x)
  (IsInBounds idx len) => (LessThanU (CMP idx len))

b7: (If (LessEqual (CMP x (RSBconst [0] y)))) else  --> b11;  b11: (If (LessThanU (CMP (RSBconst [0] y) x))) else --> panic

 (If (LessThanU cc) yes no) => (ULT cc yes no)
 (If (LessEqual cc) yes no) => (LE cc yes no)

b7: (LE (CMP x (RSBconst [0] y))) else  --> b11;  b11: (ULT (CMP (RSBconst [0] y) x)) else --> panic

  (CMP x y) && x.ID > y.ID => (InvertFlags (CMP y x))

b7: (LE (CMP x (RSBconst [0] y))) else  --> b11;  b11: (ULT (InvertFlags (CMP x (RSBconst [0] y)))) else --> panic

   (ULT (InvertFlags cmp) yes no) => (UGT cmp yes no)

b7: (LE (CMP x (RSBconst [0] y))) else  --> b11;  b11: (UGT (CMP x (RSBconst [0] y))) else --> panic

  (CMP x (RSBconst [0] y)) => (CMN x y)

b7: (LE (CMN x y)) else  --> b11;  b11: (UGT (CMN x y)) else --> panic

   Because of block ordering rules, these assemble to

b7: (BGT (CMN x y)) then  --> b11;  b11: (BLS (CMN x y)) then --> panic

  Translating from GO asm to ARM asm (no surprises)

b7: (bgt (cmn x y)) then  --> b11;  b11: (bls (cmn x y)) then --> panic

Loading

@randall77
Copy link
Contributor

@randall77 randall77 commented Oct 5, 2020

Loading

@randall77
Copy link
Contributor

@randall77 randall77 commented Oct 5, 2020

Actually, that's the arm64 CL. The arm one was https://go-review.googlesource.com/c/go/+/236637/

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 5, 2020

Change https://golang.org/cl/259450 mentions this issue: cmd/compile: avoid applying ARM CMP->CMN rewrite in unsigned context

Loading

@dr2chase
Copy link
Contributor Author

@dr2chase dr2chase commented Oct 5, 2020

@randall77 It's similar. I suppose I should figure out if the CMP <-> CMN rewrites are valid for the no-ov comparisons.

Loading

@ALTree ALTree changed the title cmd/compile: Bad code generation for ARM (bounds test fails). cmd/compile: bad code generation for ARM (bounds test fails) Oct 5, 2020
@gopherbot gopherbot closed this in 694025e Oct 6, 2020
@golang golang locked and limited conversation to collaborators Oct 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants