cmd/compile: invalid pointer found on stack when compiled with -race #63657

fischerman opened this issue Oct 21, 2023 · 11 comments

compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. RaceDetector


fischerman commented Oct 21, 2023

What version of Go are you using (go version)?

$ go version
go version go1.21.1 linux/amd64

Does this issue reproduce with the latest release?

Yes, as of this writing "go1.21.3".

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1662630679=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Run go test -race . in this go module.

I couldn't reproduce it without dependencies, but the code is quite small. I'm only "using" Ginkgo and Gomega. From another dependency I'm just using a type, which is only ever set to nil. I've added some comments.

The error only occurs in Go 1.21. Go 1.20 works fine. Also the -race flag is required.

When I remove dead code the panic doesn't occur.

What did you expect to see?

Ginkgo test results.

Running Suite: Stackit Suite - /work
Random Seed: 1697889786

Will run 1 of 1 specs
• [FAILED] [0.001 seconds]
f [It] no 'invalid pointer found on stack' please

  [FAILED] Unexpected error:
      <*errors.errorString | 0xc0000640a0>: 
      not even related to the call to f
          s: "not even related to the call to f",
  In [It] at: /work/suite_test.go:19 @ 10/21/23 12:03:06.601

Summarizing 1 Failure:
  [FAIL] f [It] no 'invalid pointer found on stack' please

Ran 1 of 1 Specs in 0.002 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 0 Skipped
--- FAIL: TestStackit (0.00s)
FAIL     0.015s

What did you see instead?

fatal error: invalid pointer found on stack
Running Suite: Stackit Suite - /work
Random Seed: 1697889305

Will run 1 of 1 specs
runtime: bad pointer in frame at 0xc0000bdee0: 0x10
fatal error: invalid pointer found on stack

runtime stack:
runtime.throw({0xa16d05?, 0xd35080?})
        /usr/local/go/src/runtime/panic.go:1077 +0x5c fp=0x7faaae6e78b8 sp=0x7faaae6e7888 pc=0x47019c
runtime.adjustpointers(0x7faaae6e7b30?, 0x7faaae6e7978, 0x498605?, {0x7faaae6e7b30?, 0x0?})
        /usr/local/go/src/runtime/stack.go:627 +0x1ad fp=0x7faaae6e7918 sp=0x7faaae6e78b8 pc=0x48b24d
runtime.adjustframe(0x7faaae6e7b30, 0x7faaae6e7a10)
        /usr/local/go/src/runtime/stack.go:684 +0xdb fp=0x7faaae6e79a8 sp=0x7faaae6e7918 pc=0x48b37b
runtime.copystack(0xc0001884e0, 0x800000002?)
        /usr/local/go/src/runtime/stack.go:935 +0x2c5 fp=0x7faaae6e7ca0 sp=0x7faaae6e79a8 pc=0x48bb25
        /usr/local/go/src/runtime/stack.go:1116 +0x47f fp=0x7faaae6e7e50 sp=0x7faaae6e7ca0 pc=0x48c0df
traceback: unexpected SPWRITE function runtime.morestack
        /usr/local/go/src/runtime/asm_amd64.s:593 +0x8f fp=0x7faaae6e7e58 sp=0x7faaae6e7e50 pc=0x4a5fef

goroutine 37 [copystack]:
fmt.(*pp).handleMethods(0xc0001b61a0, 0x73)
        /usr/local/go/src/fmt/print.go:621 +0x6f0 fp=0xc0000bd810 sp=0xc0000bd808 pc=0x541f30
fmt.(*pp).printArg(0xc0001b61a0, {0x9fe980?, 0x9936a0}, 0x73)
        /usr/local/go/src/fmt/print.go:756 +0xccf fp=0xc0000bd8f0 sp=0xc0000bd810 pc=0x542e8f
fmt.(*pp).doPrintf(0xc0001b61a0, {0xa0bda2, 0x9}, {0xc0000bdb68?, 0x2, 0x2})
        /usr/local/go/src/fmt/print.go:1077 +0x590 fp=0xc0000bda38 sp=0xc0000bd8f0 pc=0x547910
fmt.Sprintf({0xa0bda2, 0x9}, {0xc000185b68, 0x2, 0x2})
        /usr/local/go/src/fmt/print.go:239 +0x5d fp=0xc0000bda90 sp=0xc0000bda38 pc=0x53da7d{0x9936a0?, 0xc00019c090?, 0xc00019c090?})
        /go/pkg/mod/ +0x545 fp=0xc0000bdbc8 sp=0xc0000bda90 pc=0x939c05{0x9936a0, 0xc00019c090}, 0xc00019c090?)
        /go/pkg/mod/ +0x252 fp=0xc0000bdd10 sp=0xc0000bdbc8 pc=0x9392b2*HaveOccurredMatcher).NegatedFailureMessage(0x7faaf6b0c228?, {0x9936a0, 0xc00019c090})
        /go/pkg/mod/ +0x3a fp=0xc0000bdd78 sp=0xc0000bdd10 pc=0x9537fa*Assertion).match(0xc0001d2040, {0xaedf78, 0xdc1b80}, 0x0, {0x0, 0x0, 0x0})
        /go/pkg/mod/ +0x1d6 fp=0xc0000bde48 sp=0xc0000bdd78 pc=0x93d3f6*Assertion).NotTo(0xc0001d2040, {0xaedf78, 0xdc1b80}, {0x0, 0x0, 0x0})
        /go/pkg/mod/ +0x11e fp=0xc0000bdea8 sp=0xc0000bde48 pc=0x93d01e
        /work/suite_test.go:19 +0xc6 fp=0xc0000bdf00 sp=0xc0000bdea8 pc=0x954446{0x0, 0x0})
        /go/pkg/mod/ +0x2f fp=0xc0000bdf20 sp=0xc0000bdf00 pc=0x9168cf*Suite).runNode.func3()
        /go/pkg/mod/ +0x106 fp=0xc0000bdfe0 sp=0xc0000bdf20 pc=0x931046
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000bdfe8 sp=0xc0000bdfe0 pc=0x4a7e81
created by*Suite).runNode in goroutine 19
        /go/pkg/mod/ +0x1345

goroutine 1 [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x18?, 0xc6?, 0x18?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0002316a8 sp=0xc000231688 pc=0x47308e
runtime.chanrecv(0xc00025e380, 0xc00023178f, 0x1)
        /usr/local/go/src/runtime/chan.go:583 +0x385 fp=0xc000231720 sp=0xc0002316a8 pc=0x43e4a5
runtime.chanrecv1(0xa03500?, 0x981b80?)
        /usr/local/go/src/runtime/chan.go:442 +0x12 fp=0xc000231748 sp=0xc000231720 pc=0x43e112
testing.(*T).Run(0xc00029a000, {0xa0c57b, 0xb}, 0xa47cc0)
        /usr/local/go/src/testing/testing.go:1649 +0x856 fp=0xc000231868 sp=0xc000231748 pc=0x586f16
        /usr/local/go/src/testing/testing.go:2054 +0x85 fp=0xc0002318c0 sp=0xc000231868 pc=0x58aa45
testing.tRunner(0xc00029a000, 0xc000231b08)
        /usr/local/go/src/testing/testing.go:1595 +0x239 fp=0xc0002319d8 sp=0xc0002318c0 pc=0x585699
testing.runTests(0xc0000a99a0?, {0xd6b840, 0x1, 0x1}, {0x1c?, 0x4a9539?, 0xd92340?})
        /usr/local/go/src/testing/testing.go:2052 +0x897 fp=0xc000231b38 sp=0xc0002319d8 pc=0x58a8b7
        /usr/local/go/src/testing/testing.go:1925 +0xb58 fp=0xc000231eb8 sp=0xc000231b38 pc=0x5880d8
        _testmain.go:47 +0x2be fp=0xc000231f40 sp=0xc000231eb8 pc=0x95489e
        /usr/local/go/src/runtime/proc.go:267 +0x2bb fp=0xc000231fe0 sp=0xc000231f40 pc=0x472c1b
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000231fe8 sp=0xc000231fe0 pc=0x4a7e81

goroutine 2 [force gc (idle)]:
runtime.gopark(0xd2d210?, 0xd930e0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00004e7a8 sp=0xc00004e788 pc=0x47308e
        /usr/local/go/src/runtime/proc.go:322 +0xb3 fp=0xc00004e7e0 sp=0xc00004e7a8 pc=0x472ef3
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00004e7e8 sp=0xc00004e7e0 pc=0x4a7e81
created by runtime.init.6 in goroutine 1
        /usr/local/go/src/runtime/proc.go:310 +0x1a

goroutine 3 [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00005ef78 sp=0xc00005ef58 pc=0x47308e
        /usr/local/go/src/runtime/mgcsweep.go:280 +0x94 fp=0xc00005efc8 sp=0xc00005ef78 pc=0x45d234
        /usr/local/go/src/runtime/mgc.go:200 +0x25 fp=0xc00005efe0 sp=0xc00005efc8 pc=0x4523e5
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00005efe8 sp=0xc00005efe0 pc=0x4a7e81
created by runtime.gcenable in goroutine 1
        /usr/local/go/src/runtime/mgc.go:200 +0x66

goroutine 4 [GC scavenge wait]:
runtime.gopark(0xc00002a070?, 0xae7950?, 0x1?, 0x0?, 0xc0000071e0?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc000064f70 sp=0xc000064f50 pc=0x47308e
        /usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000064fa0 sp=0xc000064f70 pc=0x45aae9
        /usr/local/go/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc000064fc8 sp=0xc000064fa0 pc=0x45b05c
        /usr/local/go/src/runtime/mgc.go:201 +0x25 fp=0xc000064fe0 sp=0xc000064fc8 pc=0x452385
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000064fe8 sp=0xc000064fe0 pc=0x4a7e81
created by runtime.gcenable in goroutine 1
        /usr/local/go/src/runtime/mgc.go:201 +0xa5

goroutine 18 [finalizer wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc000184e28 sp=0xc000184e08 pc=0x47308e
        /usr/local/go/src/runtime/mfinal.go:193 +0x13b fp=0xc000184fe0 sp=0xc000184e28 pc=0x45145b
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000184fe8 sp=0xc000184fe0 pc=0x4a7e81
created by runtime.createfing in goroutine 1
        /usr/local/go/src/runtime/mfinal.go:163 +0x3d

goroutine 19 [select]:
runtime.gopark(0xc0001db7d0?, 0x5?, 0xe5?, 0x9e?, 0xc0001db2c6?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0001dad88 sp=0xc0001dad68 pc=0x47308e
runtime.selectgo(0xc0001db7d0, 0xc0001db2bc, 0xd92340?, 0x0, 0x946261?, 0x1)
        /usr/local/go/src/runtime/select.go:327 +0x84b fp=0xc0001daed8 sp=0xc0001dad88 pc=0x484a0b*Suite).runNode(_, {0x2, 0x4, {0xa1d7b7, 0x2a}, 0xc0000a1d70, {{0xb71b31, 0x13}, 0x11, {0x0, ...}, ...}, ...}, ...)
        /go/pkg/mod/ +0x182f fp=0xc0001dee08 sp=0xc0001daed8 pc=0x92f0af*group).attemptSpec(0xc0001e2bf8, 0x1, {{0xc000198240?, 0xc0001d2000?, 0x1?}, 0x0?})
        /go/pkg/mod/ +0x1125 fp=0xc0001e0e78 sp=0xc0001dee08 pc=0x90ba45*group).run(0xc0001e2bf8, {0xc00019a060, 0x1, 0x1})
        /go/pkg/mod/ +0x1228 fp=0xc0001e2860 sp=0xc0001e0e78 pc=0x90f428*Suite).runSpecs(0xc000262a80, {0xa0d5b1, 0xd}, {0xdc1b80, 0x0, 0x0}, {0xc00001403b, 0x5}, 0x0, {0xc00019a040, ...})
        /go/pkg/mod/ +0x1167 fp=0xc0001e3638 sp=0xc0001e2860 pc=0x927587*Suite).Run(_, {_, _}, {_, _, _}, {_, _}, _, {0xaf0b10, ...}, ...)
        /go/pkg/mod/ +0x5f8 fp=0xc0001e3800 sp=0xc0001e3638 pc=0x921c98{0xaeb540, 0xc00029a1a0}, {0xa0d5b1, 0xd}, {0x0, 0x0, 0x0})
        /go/pkg/mod/ +0xe6b fp=0xc0001e3e48 sp=0xc0001e3800 pc=0x936b8b
        /work/suite_test.go:13 +0x4e fp=0xc0001e3e98 sp=0xc0001e3e48 pc=0x9542ae
testing.tRunner(0xc00029a1a0, 0xa47cc0)
        /usr/local/go/src/testing/testing.go:1595 +0x239 fp=0xc0001e3fb0 sp=0xc0001e3e98 pc=0x585699
        /usr/local/go/src/testing/testing.go:1648 +0x45 fp=0xc0001e3fe0 sp=0xc0001e3fb0 pc=0x587185
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0001e3fe8 sp=0xc0001e3fe0 pc=0x4a7e81
created by testing.(*T).Run in goroutine 1
        /usr/local/go/src/testing/testing.go:1648 +0x82b

goroutine 20 [select, locked to thread]:
runtime.gopark(0xc000063fa8?, 0x2?, 0x0?, 0x0?, 0xc000063fa4?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc000063e08 sp=0xc000063de8 pc=0x47308e
runtime.selectgo(0xc000063fa8, 0xc000063fa0, 0x0?, 0x0, 0x2?, 0x1)
        /usr/local/go/src/runtime/select.go:327 +0x84b fp=0xc000063f58 sp=0xc000063e08 pc=0x484a0b
        /usr/local/go/src/runtime/signal_unix.go:1014 +0x19f fp=0xc000063fe0 sp=0xc000063f58 pc=0x49ee1f
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000063fe8 sp=0xc000063fe0 pc=0x4a7e81
created by runtime.ensureSigM in goroutine 19
        /usr/local/go/src/runtime/signal_unix.go:997 +0xc8

goroutine 34 [syscall]:
runtime.notetsleepg(0x4aad51?, 0x4a7e81?)
        /usr/local/go/src/runtime/lock_futex.go:236 +0x29 fp=0xc00004efa0 sp=0xc00004ef68 pc=0x443ee9
        /usr/local/go/src/runtime/sigqueue.go:152 +0x29 fp=0xc00004efc0 sp=0xc00004efa0 pc=0x4a4409
        /usr/local/go/src/os/signal/signal_unix.go:23 +0x1d fp=0xc00004efe0 sp=0xc00004efc0 pc=0x5cb19d
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00004efe8 sp=0xc00004efe0 pc=0x4a7e81
created by os/signal.Notify.func1.1 in goroutine 19
        /usr/local/go/src/os/signal/signal.go:151 +0x47

goroutine 35 [select]:
runtime.gopark(0xc00005ff78?, 0x3?, 0x0?, 0x0?, 0xc00005ff22?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc00005fd88 sp=0xc00005fd68 pc=0x47308e
runtime.selectgo(0xc00005ff78, 0xc00005ff1c, 0xc00005ff28?, 0x0, 0x3?, 0x1)
        /usr/local/go/src/runtime/select.go:327 +0x84b fp=0xc00005fed8 sp=0xc00005fd88 pc=0x484a0b*InterruptHandler).registerForInterrupts.func2(0x0)
        /go/pkg/mod/ +0x125 fp=0xc00005ffb8 sp=0xc00005fed8 pc=0x8fcd85*InterruptHandler).registerForInterrupts.func3()
        /go/pkg/mod/ +0x42 fp=0xc00005ffe0 sp=0xc00005ffb8 pc=0x8fcc22
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00005ffe8 sp=0xc00005ffe0 pc=0x4a7e81
created by*InterruptHandler).registerForInterrupts in goroutine 19
        /go/pkg/mod/ +0x2bd

goroutine 36 [select]:
runtime.gopark(0xc000065fb0?, 0x2?, 0xff?, 0xff?, 0xc000065f7c?)
        /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc000065df0 sp=0xc000065dd0 pc=0x47308e
runtime.selectgo(0xc000065fb0, 0xc000065f78, 0x0?, 0x0, 0x0?, 0x1)
        /usr/local/go/src/runtime/select.go:327 +0x84b fp=0xc000065f40 sp=0xc000065df0 pc=0x484a0b
        /go/pkg/mod/ +0xc7 fp=0xc000065fe0 sp=0xc000065f40 pc=0x91d067
        /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000065fe8 sp=0xc000065fe0 pc=0x4a7e81
created by in goroutine 19
        /go/pkg/mod/ +0x189
FAIL     0.025s
@mauri870 mauri870 added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 21, 2023
mauri870 commented Oct 21, 2023

This seems to affect only 1.21.x; I was unable to reproduce it in 1.20 or at tip. Reproduced on both darwin/arm64 and linux/amd64.

This message in the stacktrace caught my attention: traceback: unexpected SPWRITE function runtime.morestack

@mauri870 mauri870 added RaceDetector compiler/runtime Issues related to the Go compiler and/or runtime. labels Oct 21, 2023
@mauri870 mauri870 changed the title Race detection causes invalid stack pointer runtime: invalid pointer found on stack when compiled with -race Oct 21, 2023
@mauri870 mauri870 added this to the Go1.21.4 milestone Oct 21, 2023
cc @golang/compiler

@mauri870 CL fixes the "unexpected SPWRITE" message. But that is just a message, unrelated to the original bad pointer bug.

@mknyszek mknyszek modified the milestones: Go1.21.4, Backlog Oct 23, 2023
That sounded like a red herring, thanks for clarifying it.

cuonglm commented Oct 24, 2023

The program starts failing since, then "fixed" after

Kindly cc @randall77 and @mdempsky to decide what we should do.

This looks like a bad reordering of a nil pointer check and subsequent pointer arithmetic.

  9542a0:       e8 5b fe ff ff          call   954100 <>
  9542a5:       48 89 44 24 30          mov    %rax,0x30(%rsp)

f returns nil, result is written to 0x30(SP).

  9542aa:       48 8d 05 ef 4b 05 00    lea    0x54bef(%rip),%rax        # 9a8ea0 <type:*+0x53ea0>
  9542b1:       e8 0a 14 af ff          call   4456c0 <runtime.newobject>
  9542b6:       48 89 44 24 40          mov    %rax,0x40(%rsp)
  9542bb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  9542c0:       e8 fb 52 b5 ff          call   4a95c0 <runtime.racewrite>
  9542c5:       48 8b 4c 24 40          mov    0x40(%rsp),%rcx
  9542ca:       48 c7 41 08 21 00 00    movq   $0x21,0x8(%rcx)
  9542d1:       00 

Start a write barrier to initialize the new object allocated above.
(In non-race mode I think this barrier is not needed, as it is writing a constant string pointer to known-zeroed memory. Alas, I think in race mode the zeroed-ness of the allocation is not detectable.)

  9542d2:       83 3d 67 2e 47 00 00    cmpl   $0x0,0x472e67(%rip)        # dc7140 <runtime.writeBarrier>
  9542d9:       74 0d                   je     9542e8 <>
  9542db:       48 8b 11                mov    (%rcx),%rdx
  9542de:       66 90                   xchg   %ax,%ax
  9542e0:       e8 1b 3b b5 ff          call   4a7e00 <runtime.gcWriteBarrier1>
  9542e5:       49 89 13                mov    %rdx,(%r11)

Write barrier is done here, now to do the actual write. But first there are some other instructions!
It's calculating &g.Listeners for use by a subsequent raceread call. I have no idea why these instructions appear here - they should not be before the write associated with the barrier. And in any case, they should be after the nil check below. The bad pointer is in 0x38(SP) and that's where the stack copier finds it and barfs.

  9542e8:       48 8b 54 24 30          mov    0x30(%rsp),%rdx
  9542ed:       48 8d 5a 10             lea    0x10(%rdx),%rbx
  9542f1:       48 89 5c 24 38          mov    %rbx,0x38(%rsp)

These 2 instructions are the actual write.

  9542f6:       48 8d 35 f8 43 0c 00    lea    0xc43f8(%rip),%rsi        # a186f5 <go:string.*+0xf39d>
  9542fd:       48 89 31                mov    %rsi,(%rcx)

Some uninteresting stuff follows.

  954300:       48 8d 05 21 71 19 00    lea    0x197121(%rip),%rax        # aeb428 <go:itab.*errors.errorString,error+0x8>
  954307:       e8 54 52 b5 ff          call   4a9560 <runtime.raceread>
  95430c:       48 8b 05 15 71 19 00    mov    0x197115(%rip),%rax        # aeb428 <go:itab.*errors.errorString,error+0x8>
  954313:       48 8b 5c 24 40          mov    0x40(%rsp),%rbx
  954318:       31 c9                   xor    %ecx,%ecx
  95431a:       31 ff                   xor    %edi,%edi
  95431c:       48 89 fe                mov    %rdi,%rsi
  95431f:       90                      nop
  954320:       e8 7b fa ff ff          call   953da0 <>
  954325:       48 8b 48 20             mov    0x20(%rax),%rcx
  954329:       48 89 d8                mov    %rbx,%rax
  95432c:       48 8d 1d 05 9c 19 00    lea    0x199c05(%rip),%rbx        # aedf38 <go:itab.*,>
  954333:       31 ff                   xor    %edi,%edi
  954335:       31 f6                   xor    %esi,%esi
  954337:       49 89 f0                mov    %rsi,%r8
  95433a:       48 89 ca                mov    %rcx,%rdx
  95433d:       48 8d 0d fc 27 47 00    lea    0x4727fc(%rip),%rcx        # dc6b40 <runtime.zerobase>
  954344:       ff d2                   call   *%rdx

The nil pointer check of the result of f has made it all the way down here.

  954346:       48 8b 4c 24 30          mov    0x30(%rsp),%rcx
  95434b:       84 01                   test   %al,(%rcx)

Then we actually read g.Listeners. We only compute &g.Listeners to pass to raceread, another reason this only happens in -race mode.

  95434d:       48 8b 44 24 38          mov    0x38(%rsp),%rax
  954352:       e8 09 52 b5 ff          call   4a9560 <runtime.raceread>
  954357:       48 8b 4c 24 30          mov    0x30(%rsp),%rcx
  95435c:       48 8b 59 10             mov    0x10(%rcx),%rbx
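To make the 0x10 in the crash message concrete, here is a minimal sketch of the arithmetic described above. The `gateway` type, its padding field, and the helper names are hypothetical stand-ins (the real type lives in the dependency); only the field offset of 0x10 is taken from the disassembly. The `minLegalPointer` value of 4096 and the shape of the check reflect my understanding of the runtime's stack copier and should be read as an assumption, not as a copy of its code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// gateway is a hypothetical stand-in for the struct returned by f.
// The 16-byte pad places Listeners at offset 0x10, as in the disassembly.
type gateway struct {
	pad       [16]byte
	Listeners []string
}

// minLegalPointer is assumed to match the runtime's threshold below which
// a non-zero value in a pointer slot is treated as junk.
const minLegalPointer = 4096

// looksLikeBadPointer models the stack copier's sanity check: a small,
// non-zero value in a pointer-typed slot cannot be a real address.
func looksLikeBadPointer(p uintptr) bool {
	return 0 < p && p < minLegalPointer
}

// fieldAddrOnNil returns what "nil base + field offset" evaluates to,
// as a plain integer, without ever forming an invalid pointer.
func fieldAddrOnNil() uintptr {
	return unsafe.Offsetof(gateway{}.Listeners)
}

// countListeners shows the behavior the language requires: with a nil g,
// the field access itself panics, which recover turns into an error.
func countListeners(g *gateway) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return len(g.Listeners), nil
}

func main() {
	addr := fieldAddrOnNil()
	fmt.Printf("nil base + offset = %#x, bad pointer: %v\n", addr, looksLikeBadPointer(addr))
	_, err := countListeners(nil)
	fmt.Println(err)
}
```

The point of the sketch: the value 0x10 is harmless as long as the nil check runs before the address is computed. Once the address computation is hoisted above the check and spilled to a pointer-typed stack slot, any stack copy in between can trip over it.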

@randall77 randall77 self-assigned this Oct 25, 2023
Reminds me of #42673, which is what CL 270940 was supposed to fix, not cause. Somehow the nil check just isn't getting the priority it needs to come before the address calculation &g.Listeners.

Here's a reproducer. At least, you can see the problem in assembly. (It will need a wrapper to make it into a test.)

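package main

type T struct {
	a, b int
}

func f() {
	x := p()
	gb = gi != 0
	q()     // inferred from CALL main.q at tmp2.go:13 in the listing
	g(&x.b) // inferred from LEAQ 8(AX) and CALL main.g at tmp2.go:14
}

func g(p *int)

func q()

func p() *T

var gi int
var gb bool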

When compiled (no -race required), we get this:

	0x000e 00014 (tmp2.go:11)	CALL	main.p(SB)
	0x0013 00019 (tmp2.go:11)	MOVQ	AX, main.x+8(SP)
	0x0018 00024 (tmp2.go:14)	LEAQ	8(AX), CX
	0x001c 00028 (tmp2.go:14)	MOVQ	CX, main..autotmp_2+16(SP)
	0x0021 00033 (tmp2.go:12)	CMPQ, $0
	0x0029 00041 (tmp2.go:12)	SETNE
	0x0030 00048 (tmp2.go:13)	CALL	main.q(SB)
	0x0035 00053 (tmp2.go:14)	MOVQ	main.x+8(SP), AX
	0x003a 00058 (tmp2.go:14)	TESTB	AL, (AX)
	0x003c 00060 (tmp2.go:14)	MOVQ	main..autotmp_2+16(SP), AX
	0x0041 00065 (tmp2.go:14)	CALL	main.g(SB)

Note the LEAQ happens before the call to q and its result is spilled to the stack. The nil check doesn't happen until just before the call to g.
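For anyone who wants to run something, here is one possible wrapper, sketched under assumptions: the bodies of p, q, and g are made up (p always returns nil, q and g only set flags), and the two calls at the end of f are inferred from the assembly above. It does not trigger the stack-copy crash by itself, since q never grows the stack; it only pins down the ordering the compiler has to preserve, namely that q runs, then the nil check panics, and g is never called.

```go
package main

import "fmt"

type T struct {
	a, b int
}

var (
	gi      int
	gb      bool
	qCalled bool // set by q so the wrapper can observe ordering
	gCalled bool // set by g; must stay false when x is nil
)

//go:noinline
func p() *T { return nil } // hypothetical body: always returns nil

//go:noinline
func q() { qCalled = true } // hypothetical body

//go:noinline
func g(p *int) { gCalled = true } // hypothetical body

//go:noinline
func f() {
	x := p()
	gb = gi != 0
	q()
	g(&x.b) // &x.b on a nil x must panic here, after q has run
}

// runF calls f and reports whether it panicked.
func runF() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println(runF(), qCalled, gCalled)
}
```

Because the panic must be observed after q's side effects, the nil check cannot simply be moved up to the call to p; the address computation has to be held back behind the check instead.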

The trick with the gb = gi != 0 is to introduce a low-priority instruction to the scheduler. Flags-generating instructions are low-priority (we want them to issue as late as possible to minimize the result's lifetime). This causes a priority inversion, as the nil check is dependent on that low-priority instruction, so it gets delayed as well. The LEAQ is standard priority, so it gets to go first.
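The priority inversion can be illustrated with a toy list scheduler. This is not the cmd/compile scheduler: the `op` type, the numeric priorities, and the dependency edges are all assumptions chosen to mirror the description above (the nil check depends on the flags-generating compare, the compare is low priority, the LEAQ is standard priority). The `leaAfterNilCheck` switch adds an explicit edge from the nil check to the address computation, which is one way to express the ordering named in the title of the fix.

```go
package main

import "fmt"

// op is a toy instruction. prio is a hypothetical score: lower issues earlier.
type op struct {
	name string
	prio int
	deps []string
}

// ready reports whether every dependency of o has already been emitted.
func ready(o op, done map[string]bool) bool {
	for _, d := range o.deps {
		if !done[d] {
			return false
		}
	}
	return true
}

// schedule repeatedly emits the ready op with the lowest prio,
// breaking ties by position in ops.
func schedule(ops []op) []string {
	done := map[string]bool{}
	var order []string
	for len(order) < len(ops) {
		best := -1
		for i, o := range ops {
			if done[o.name] || !ready(o, done) {
				continue
			}
			if best < 0 || o.prio < ops[best].prio {
				best = i
			}
		}
		if best < 0 {
			panic("dependency cycle")
		}
		done[ops[best].name] = true
		order = append(order, ops[best].name)
	}
	return order
}

// buildOps models the reproducer. With leaAfterNilCheck set, the address
// computation gains a dependency on the nil check.
func buildOps(leaAfterNilCheck bool) []op {
	leaDeps := []string{"call p"}
	if leaAfterNilCheck {
		leaDeps = append(leaDeps, "nilcheck x")
	}
	return []op{
		{"call p", 0, nil},
		{"nilcheck x", 1, []string{"call p", "cmp gi"}},
		{"lea &x.b", 2, leaDeps},
		{"cmp gi", 3, []string{"call p"}}, // flags op: lowest priority
		{"call g", 2, []string{"lea &x.b", "nilcheck x"}},
	}
}

func main() {
	fmt.Println(schedule(buildOps(false)))
	fmt.Println(schedule(buildOps(true)))
}
```

In the first schedule the nil check has the best score of any pending op but is stuck behind the low-priority compare, so the address computation is emitted ahead of it. With the extra edge, the address computation waits and the order becomes compare, nil check, address computation.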

@randall77 randall77 changed the title runtime: invalid pointer found on stack when compiled with -race cmd/compile: invalid pointer found on stack when compiled with -race Oct 25, 2023
@gopherbot Please open a backport issue for 1.21.

Backport issue(s) opened: #63743 (for 1.21).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to

Change mentions this issue: cmd/compile: ensure pointer arithmetic happens after the nil check

Change mentions this issue: cmd/compile: handle constant pointer offsets in dead store elimination

gopherbot pushed a commit that referenced this issue Oct 31, 2023
Update #63657
Update #45573

Change-Id: I163c6038c13d974dc0ca9f02144472bc05331826
LUCI-TryBot-Result: Go LUCI <>
Reviewed-by: David Chase <>
Reviewed-by: Keith Randall <>
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 1, 2023
@dmitshur dmitshur modified the milestones: Backlog, Go1.22 Nov 1, 2023