Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc/cgo/test: 'signal arrived during external code execution' with updated windows/386 builder #53540

thanm opened this issue Jun 24, 2022 · 6 comments


Copy link

@thanm thanm commented Jun 24, 2022

What version of Go are you using (go version)?

main branch tip
go version go1.19-pre4 cl/455575533 +12f49fe0ed

Does this issue reproduce with the latest release?


What operating system and processor architecture are you using (go env)?

windows/386, specifically the new builder with update windows image and compilers, for example windows-386-2012-newcc.

What did you do?

ran the misc/cgo/test as part of all.bat

What did you expect to see?

clean run

What did you see instead?

Representative slowbot failure:

The test fails with this failure mode:

test run output with stack traces

ok misc/cgo/test 12.649s
Exception 0xc0000005 0x8 0x424448b 0x424448b
signal arrived during external code execution

runtime.cgocall(0x403170, 0x128a4f70)
C:/workdir/go/src/runtime/cgocall.go:158 +0x58 fp=0x128a4f5c sp=0x128a4f44 pc=0x40d9f8
_cgo_gotypes.go:2043 +0x2f fp=0x128a4f70 sp=0x128a4f5c pc=0x52134f
C:/workdir/go/misc/cgo/test/test.go:1254 +0x1e fp=0x128a4f94 sp=0x128a4f70 pc=0x542b0e
C:/workdir/go/misc/cgo/test/cgo_test.go:15 +0x21 fp=0x128a4f9c sp=0x128a4f94 pc=0x51ceb1
testing.tRunner(0x129083c0, 0x584428)
C:/workdir/go/src/testing/testing.go:1446 +0x113 fp=0x128a4fe4 sp=0x128a4f9c pc=0x4e0693
C:/workdir/go/src/testing/testing.go:1493 +0x2e fp=0x128a4ff0 sp=0x128a4fe4 pc=0x4e148e
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x128a4ff4 sp=0x128a4ff0 pc=0x473291
created by testing.(*T).Run
C:/workdir/go/src/testing/testing.go:1493 +0x374

goroutine 1 [chan receive]:
runtime.gopark(0x5859cc, 0x12902270, 0xe, 0x17, 0x2)
C:/workdir/go/src/runtime/proc.go:363 +0xff fp=0x1292bcd4 sp=0x1292bcc0 pc=0x446bdf
runtime.chanrecv(0x12902240, 0x1292bd53, 0x1)
C:/workdir/go/src/runtime/chan.go:583 +0x3f2 fp=0x1292bd1c sp=0x1292bcd4 pc=0x410282
runtime.chanrecv1(0x12902240, 0x1292bd53)
C:/workdir/go/src/runtime/chan.go:442 +0x1c fp=0x1292bd30 sp=0x1292bd1c pc=0x40fe8c
testing.(*T).Run(0x128840f0, {0x576e27, 0x8}, 0x584428)
C:/workdir/go/src/testing/testing.go:1494 +0x393 fp=0x1292bd7c sp=0x1292bd30 pc=0x4e1413
C:/workdir/go/src/testing/testing.go:1846 +0x5e fp=0x1292bda0 sp=0x1292bd7c pc=0x4e31ae
testing.tRunner(0x128840f0, 0x1292be38)
C:/workdir/go/src/testing/testing.go:1446 +0x113 fp=0x1292bde8 sp=0x1292bda0 pc=0x4e0693
testing.runTests(0x12898000, {0x6af760, 0x59, 0x59}, {0xc0a55481db90a338, 0x8bb304cc19, 0x6b2260})
C:/workdir/go/src/testing/testing.go:1844 +0x3a6 fp=0x1292be4c sp=0x1292bde8 pc=0x4e3096
C:/workdir/go/src/testing/testing.go:1726 +0x680 fp=0x1292bf88 sp=0x1292be4c pc=0x4e1c30
_testmain.go:229 +0x185 fp=0x1292bfc4 sp=0x1292bf88 pc=0x549785
C:/workdir/go/src/runtime/proc.go:250 +0x231 fp=0x1292bff0 sp=0x1292bfc4 pc=0x446821
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x1292bff4 sp=0x1292bff0 pc=0x473291

goroutine 2 [force gc (idle)]:
runtime.gopark(0x585aa4, 0x6b2020, 0x11, 0x14, 0x1)
C:/workdir/go/src/runtime/proc.go:363 +0xff fp=0x1282ffdc sp=0x1282ffc8 pc=0x446bdf
C:/workdir/go/src/runtime/proc.go:302 +0xc3 fp=0x1282fff0 sp=0x1282ffdc pc=0x446a63
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x1282fff4 sp=0x1282fff0 pc=0x473291
created by runtime.init.5
C:/workdir/go/src/runtime/proc.go:290 +0x23

goroutine 3 [GC sweep wait]:
runtime.gopark(0x585aa4, 0x6b21c0, 0xc, 0x14, 0x1)
C:/workdir/go/src/runtime/proc.go:363 +0xff fp=0x12830fd4 sp=0x12830fc0 pc=0x446bdf
C:/workdir/go/src/runtime/mgcsweep.go:278 +0x96 fp=0x12830fe8 sp=0x12830fd4 pc=0x42e916
C:/workdir/go/src/runtime/mgc.go:178 +0x27 fp=0x12830ff0 sp=0x12830fe8 pc=0x420577
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x12830ff4 sp=0x12830ff0 pc=0x473291
created by runtime.gcenable
C:/workdir/go/src/runtime/mgc.go:178 +0x7c

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x585aa4, 0x6b2320, 0xd, 0x14, 0x2)
C:/workdir/go/src/runtime/proc.go:363 +0xff fp=0x12831fb8 sp=0x12831fa4 pc=0x446bdf
C:/workdir/go/src/runtime/mgcscavenge.go:389 +0x5f fp=0x12831fcc sp=0x12831fb8 pc=0x42c41f
C:/workdir/go/src/runtime/mgcscavenge.go:617 +0x4b fp=0x12831fe8 sp=0x12831fcc pc=0x42caab
C:/workdir/go/src/runtime/mgc.go:179 +0x27 fp=0x12831ff0 sp=0x12831fe8 pc=0x420537
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x12831ff4 sp=0x12831ff0 pc=0x473291
created by runtime.gcenable
C:/workdir/go/src/runtime/mgc.go:179 +0xc1

goroutine 18 [finalizer wait]:
runtime.gopark(0x585aa4, 0x6d90bc, 0x10, 0x14, 0x1)
C:/workdir/go/src/runtime/proc.go:363 +0xff fp=0x1282ef90 sp=0x1282ef7c pc=0x446bdf
C:/workdir/go/src/runtime/mfinal.go:180 +0xfb fp=0x1282eff0 sp=0x1282ef90 pc=0x41f7cb
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x1282eff4 sp=0x1282eff0 pc=0x473291
created by runtime.createfing
C:/workdir/go/src/runtime/mfinal.go:157 +0x54

goroutine 19 [syscall]:
runtime.cgocall(0x4034c0, 0x1282afe4)
C:/workdir/go/src/runtime/cgocall.go:158 +0x58 fp=0x1282afd0 sp=0x1282afb8 pc=0x40d9f8
_cgo_gotypes.go:2373 +0x37 fp=0x1282afe4 sp=0x1282afd0 pc=0x5217a7
C:/workdir/go/misc/cgo/test/testx.go:359 +0x28 fp=0x1282aff0 sp=0x1282afe4 pc=0x548988
C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x1282aff4 sp=0x1282aff0 pc=0x473291
created by misc/cgo/test.lockOSThreadCallback
C:/workdir/go/misc/cgo/test/testx.go:359 +0x87
eax 0x18feb8
ebx 0x128a4f70
ecx 0x6b2854
edx 0x128a4f40
edi 0xc0
esi 0x5b502a
ebp 0x6b2820
esp 0x18fea0
eip 0x424448b
eflags 0x10206
cs 0x23
fs 0x53
gs 0x2b
FAIL misc/cgo/test 0.142s
2022/06/23 21:16:43 Failed: exit status 1
ok misc/cgo/test 12.454s
ok misc/cgo/test 12.651s
ok misc/cgo/test 5.689s
skipped due to earlier error
go tool dist: FAILED

I am not sure what the root cause is here, needs to be debugged.

@thanm thanm added the NeedsInvestigation label Jun 24, 2022
@thanm thanm added this to the Go1.20 milestone Jun 24, 2022
@thanm thanm self-assigned this Jun 24, 2022
Copy link
Contributor Author

@thanm thanm commented Jun 29, 2022

Cherry helped me with some debugging and was able to reduce the test case down to this:

package main

// #include <stdio.h>
// int x = 123;
// void F() { printf("hello %d\n", x); }
import "C"

func main() {

By comparing external linking (works) with internal linking (crashes) she found that the C function printf generated from the C fragment above contains an indirect call to the data symbol "__imp____acrt_iob_func". Here's the code:

$ llvm-objdump-13 -t -l -d -r xxx/tmpdir/go-build2751358927/b001/_x002.o
00000020 <_printf>:
; _printf():
; C:/godep/gcc32/include/stdio.h:369
      20: 50                           	pushl	%eax
      21: 8d 44 24 0c                  	leal	12(%esp), %eax
; C:/godep/gcc32/include/stdio.h:371
      25: 89 04 24                     	movl	%eax, (%esp)
; C:/godep/gcc32/include/stdio.h:372
      28: 6a 01                        	pushl	$1
      2a: ff 15 00 00 00 00            	calll	*0
			0000002c:  IMAGE_REL_I386_DIR32	__imp____acrt_iob_func
      30: 83 c4 04                     	addl	$4, %esp
      33: ff 34 24                     	pushl	(%esp)
      36: 68 00 00 00 00               	pushl	$0
			00000037:  IMAGE_REL_I386_DIR32	.rdata
      3b: 50                           	pushl	%eax
      3c: e8 00 00 00 00               	calll	0x41 <_printf+0x21>
			0000003d:  IMAGE_REL_I386_REL32	___mingw_vfprintf
      41: 83 c4 0c                     	addl	$12, %esp
; C:/godep/gcc32/include/stdio.h:374
      44: 58                           	popl	%eax
      45: c3                           	retl
      46: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%eax,%eax)

Note the call at _printf+0x2a: it loads the data symbol and then branches to the loaded value.

However this data symbol is being renamed to "__acrt_iob_func" (a text symbol) as a result of CL 382837. Prior to CL 382837 the host object loader would always rename "__imp_XXX" to "XXX" except for the specific case of "__imp____acrt_iob_func"; after 382837 we always do the renaming except for when there is a local def in the same PE file.

Because of the bad rename, the code loads up data from the function itself and then tries to branch to it (crash).

I tried restoring the special case for "__imp____acrt_iob_func" that was present before CL 382837. With that change in place, there is no longer a crash at that spot, but a new crash with a very similar failure mode:

Exception 0xc0000005 0x8 0x4d8208 0x4d8208
signal arrived during external code execution

runtime.cgocall(0x401130, 0x1243bfc0)
	C:/workdir/go/src/runtime/cgocall.go:158 +0x58 fp=0x1243bfac sp=0x1243bf94 pc=0x4098b8
	_cgo_gotypes.go:41 +0x2f fp=0x1243bfc0 sp=0x1243bfac pc=0x46a05f
	C:/workdir/xxx/cgohimom.go:9 +0x17 fp=0x1243bfc4 sp=0x1243bfc0 pc=0x46a087
	C:/workdir/go/src/runtime/proc.go:250 +0x231 fp=0x1243bff0 sp=0x1243bfc4 pc=0x43b461
	C:/workdir/go/src/runtime/asm_386.s:1326 +0x1 fp=0x1243bff4 sp=0x1243bff0 pc=0x45f021

I inspected the resulting executable. The call to "0x4d8208" is happening from the mingw function ___mingw_pformat. Here's a partial disassembly of the offending function from the linked binary:

004015e0 <libmingwex(.text)>:
; __mingw_pformat():
  4015e0: 55                           	pushl	%ebp
  4015e1: 89 e5                        	movl	%esp, %ebp
  4015e3: 53                           	pushl	%ebx
  4015e4: 57                           	pushl	%edi
  4015e5: 56                           	pushl	%esi
  4015e6: 83 e4 f8                     	andl	$-8, %esp
  4015e9: 81 ec b8 00 00 00            	subl	$184, %esp
  4015ef: 8b 45 18                     	movl	24(%ebp), %eax
  4015f2: 89 44 24 10                  	movl	%eax, 16(%esp)
  4015f6: 8b 5d 14                     	movl	20(%ebp), %ebx
  4015f9: 8b 75 10                     	movl	16(%ebp), %esi
  4015fc: 8b 7d 0c                     	movl	12(%ebp), %edi
  4015ff: e8 04 6c 0d 00               	calll	0x4d8208
  401604: 8b 00                        	movl	(%eax), %eax
  401606: 89 44 24 7c                  	movl	%eax, 124(%esp)

I tracked down the object file that this is coming from in libmingwex.a, and dumped out the function there with relocations:

00000000 <___mingw_pformat>:
; ___mingw_pformat():
; /build/mingw-w64/mingw-w64-crt/build-i686/../stdio/mingw_pformat.c:2413
       0: 55                           	pushl	%ebp
       1: 89 e5                        	movl	%esp, %ebp
       3: 53                           	pushl	%ebx
       4: 57                           	pushl	%edi
       5: 56                           	pushl	%esi
       6: 83 e4 f8                     	andl	$-8, %esp
       9: 81 ec b8 00 00 00            	subl	$184, %esp
       f: 8b 45 18                     	movl	24(%ebp), %eax
      12: 89 44 24 10                  	movl	%eax, 16(%esp)
      16: 8b 5d 14                     	movl	20(%ebp), %ebx
      19: 8b 75 10                     	movl	16(%ebp), %esi
      1c: 8b 7d 0c                     	movl	12(%ebp), %edi
; /build/mingw-w64/mingw-w64-crt/build-i686/../stdio/mingw_pformat.c:2415
      1f: e8 00 00 00 00               	calll	0x24 <___mingw_pformat+0x24>
			00000020:  IMAGE_REL_I386_REL32	__errno
      24: 8b 00                        	movl	(%eax), %eax
      26: 89 44 24 7c                  	movl	%eax, 124(%esp)
; /build/mingw-w64/mingw-w64-crt/build-i686/../stdio/mingw_pformat.c:2418

I don't have a good idea as to what went wrong here. The relocation is to "__errno", but it looks as though we already have a SDYNIMPORT symbol for "_errno" set up... and I don't see any other bogus _errno definitions. The address in question (0x4d8208) is somewhere in the .data section apparently:

Idx Name               Size     VMA      Type
  0 .text              00069190 00401000 TEXT
  1 .rdata             0006c7f4 0046b000 DATA
  2 .data              00002a00 004d8000 DATA
  3 .zdebug_abbrev     00000127 00507000 DATA

As to why the relocation targeting _errno is being resolved to some place in the data section, I am really not sure (scratching my head over that one).

Copy link
Contributor Author

@thanm thanm commented Jun 29, 2022

I looked at what the external linker is doing for the __errno reference in ___mingw_pformat. It appears to be creating a thunk and then calling the thunk:

// the thunk:
0046a1c8 <__errno>:
; __errno():
  46a1c8: ff 25 c0 5c 4d 00            	jmpl	*5070016

// the code in __mingw_pformat:
  4644f6: 8b 5d 14                     	movl	20(%ebp), %ebx
  4644f9: 8b 75 10                     	movl	16(%ebp), %esi
  4644fc: 8b 7d 0c                     	movl	12(%ebp), %edi
; /build/mingw-w64/mingw-w64-crt/build-i686/../stdio/mingw_pformat.c:2415
  4644ff: e8 c4 5c 00 00               	calll	0x46a1c8 <__errno>
  464504: 8b 00                        	movl	(%eax), %eax
  464506: 89 44 24 7c                  	movl	%eax, 124(%esp)

As far as I know there isn't any code in the Go linker that creates these sorts of import thunks for x86. Hmm.

Copy link

@cherrymui cherrymui commented Jun 29, 2022

It looks similar to generating PLT stubs for ELF, which the Go linker can do. Maybe we could do something similar?

Copy link
Contributor Author

@thanm thanm commented Jul 7, 2022

It looks similar to generating PLT stubs for ELF, which the Go linker can do. Maybe we could do something similar?

Agree. I am looking over the corresponding code in lld, e.g. to see if I can use that as a model as well.

Copy link

@gopherbot gopherbot commented Jul 14, 2022

Change mentions this issue: cmd/link: create import thunks when reading PE host objects

Copy link

@gopherbot gopherbot commented Jul 14, 2022

Change mentions this issue: cmd/link: revise DLL import symbol handling

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet

No branches or pull requests

3 participants