Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/cgo: bad debug_frame entry for crosscall2 #21569

Open
aarzilli opened this issue Aug 23, 2017 · 14 comments

Comments

Projects
None yet
5 participants
@aarzilli
Copy link
Contributor

commented Aug 23, 2017

go version go1.9rc1 linux/amd64
go env:
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/a/n/go/"
GORACE=""
GOROOT="/usr/local/go19rc1"
GOTOOLDIR="/usr/local/go19rc1/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build839946921=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

The debug_frame entry for crosscall2 is wrong in any cgo program.

This is crosscall2:

  asm_amd64.s:12	0x457cc0		4883ec58		SUBQ $0x58, SP		
  asm_amd64.s:16	0x457cc4		48895c2418		MOVQ BX, 0x18(SP)	
  asm_amd64.s:17	0x457cc9		48896c2420		MOVQ BP, 0x20(SP)	
  asm_amd64.s:18	0x457cce		4c89642428		MOVQ R12, 0x28(SP)	
  asm_amd64.s:19	0x457cd3		4c896c2430		MOVQ R13, 0x30(SP)	
  asm_amd64.s:20	0x457cd8		4c89742438		MOVQ R14, 0x38(SP)	
  asm_amd64.s:21	0x457cdd		4c897c2440		MOVQ R15, 0x40(SP)	
  asm_amd64.s:57	0x457ce2		48893424		MOVQ SI, 0(SP)		
  asm_amd64.s:58	0x457ce6		4889542408		MOVQ DX, 0x8(SP)	
  asm_amd64.s:59	0x457ceb		48894c2410		MOVQ CX, 0x10(SP)	
  asm_amd64.s:61	0x457cf0		ffd7			CALL DI			
  asm_amd64.s:64	0x457cf2		488b5c2418		MOVQ 0x18(SP), BX	
  asm_amd64.s:65	0x457cf7		488b6c2420		MOVQ 0x20(SP), BP	
  asm_amd64.s:66	0x457cfc		4c8b642428		MOVQ 0x28(SP), R12	
  asm_amd64.s:67	0x457d01		4c8b6c2430		MOVQ 0x30(SP), R13	
  asm_amd64.s:68	0x457d06		4c8b742438		MOVQ 0x38(SP), R14	
  asm_amd64.s:69	0x457d0b		4c8b7c2440		MOVQ 0x40(SP), R15	
  asm_amd64.s:72	0x457d10		4883c458		ADDQ $0x58, SP		
  asm_amd64.s:76	0x457d14		c3			RET			

The debug_info entry:

 <1><169bb>: Abbrev Number: 2 (DW_TAG_subprogram)
    <169bc>   DW_AT_name        : crosscall2
    <169c7>   DW_AT_low_pc      : 0x457cc0
    <169cf>   DW_AT_high_pc     : 0x457d15
    <169d7>   DW_AT_frame_base  : 1 byte block: 9c      (DW_OP_call_frame_cfa)
    <169d9>   DW_AT_external    : 1
 <2><169da>: Abbrev Number: 0

the debug_frame CIE:

00000000 0000000000000010 ffffffff CIE
  Version:               3
  Augmentation:          ""
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 16

  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset_extended: r16 (rip) at cfa-8
  DW_CFA_nop

and the debug_frame entry for crosscall2:

0000b4d4 000000000000001c 00000000 FDE cie=00000000 pc=0000000000457cc0..0000000000457d15
  DW_CFA_def_cfa_offset_sf: 8
  DW_CFA_advance_loc1: 84 to 0000000000457d14
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop

the cfa offset is 8, it should be 0x58 + 0x8 after the first instruction.

@ianlancetaylor ianlancetaylor changed the title cgo: bad debug_frame entry for crosscall2 runtime/cgo: bad debug_frame entry for crosscall2 Aug 23, 2017

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Aug 23, 2017

@heschik heschik added the Debugging label Aug 28, 2017

@heschik

This comment has been minimized.

Copy link
Contributor

commented Oct 12, 2017

Brief investigation:

.debug_frame is generated from FuncInfo.Pcsp, which is populated in linkpcln using pctospadj, which reads Spadj from Progs, which is populated by the CPU-specific assembler, mostly in preprocess. For X86, it understands explicit push/pop instructions, but not the SUBQ ..., SP that crosscall2 does. I don't know enough assembly to fill a full list of every instruction that could be used to manipulate the stack pointer, but it seems straightforward to fix this specific case for x86 at least.

@aarzilli

This comment has been minimized.

Copy link
Contributor Author

commented Oct 13, 2017

I see that crosscall2 is defined in src/runtime/cgo/asm_*.s, why do all of them do the SUB manually instead of declaring a frame size in the TEXT header? I can see why the one for amd64 would do that (because it has different frame sizes on windows and linux) but why do all the others do it too?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2017

There is probably no special reason. It's probably worth trying setting up the frame in the TEXT pseudo-op.

@heschik

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2017

More notes from the clueless (me):

A naive attempt at this fails because the assembler doesn't like unbalanced stack operations, and there are a few functions in the runtime that aren't balanced, like runtime.rt0_go which doesn't bother cleaning up because it never returns. These are fixable, sort of, though it's sort of weird to cater to the subset of instructions I've bothered implementing. We could also just disable the error.

Either way, it points to a bigger question: how realistic is it to expect the assembler to figure this out for an arbitrary function? For example, asmcgocall is switching to the system stack before making a cgo call. I presume the assembler can't automatically generate the right information for something that weird.

It seems that GNU as has directives for emitting call frame information. Without implementing something like that, and using it everywhere, it seems unlikely we'll ever emit perfect .debug_frame information for assembly functions.

@aarzilli: Can you explain a little more why you need this? If all you're trying to do is unwind the stack, aren't the frame pointers supposed to be enough?

@aarzilli

This comment has been minimized.

Copy link
Contributor Author

commented Oct 13, 2017

Frame pointers would be enough if there was no FDE for crosscall2.

Without implementing something like that, and using it everywhere, it seems unlikely we'll ever emit perfect .debug_frame information for assembly functions.

True, but crosscall2 isn't weird like asmcgocall.

@heschik

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2017

True, but crosscall2 isn't weird like asmcgocall.

Sure. But it may show up in a stack trace Delve cares about, e.g. a program that goes Go -> C -> Go, which isn't unheard of. So I think it pays to worry about it.

Dropping the FDEs for assembly functions feels a lot safer and more future-proof than trying to massage debugger-relevant functions to fit into the subset that the assembler understands. Let me see how that would work.

@aarzilli

This comment has been minimized.

Copy link
Contributor Author

commented Oct 13, 2017

That seems like a bit of a overreaction. Most assembly functions are called on goroutine stacks and therefore need to have correct unwind information, otherwise the garbage collector wouldn't work. Even asmcgocall and cgocallback_gofunc are correct at their safe points as long as you limit the unwind to the goroutine stack.
crosscall2 is getting away with being wrong because it lives on the cgo stack only.

@gopherbot

This comment has been minimized.

Copy link

commented Oct 16, 2017

Change https://golang.org/cl/70937 mentions this issue: cgo: fix FDE of crosscall2 on amd64 and 386

gopherbot pushed a commit that referenced this issue Oct 16, 2017

runtime/cgo: declare crosscall2 frame using TEXT for amd64 and 386
Use TEXT pseudo-instruction to adjust SP instead of a SUB instruction
so that the assembler knows how to fill in the pcsp table and the frame
description entry correctly.

Updates #21569

Change-Id: I436c840b2af99bbb3042ecd38a7d7c1ab4d7372a
Reviewed-on: https://go-review.googlesource.com/70937
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@heschik

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2017

Fixed? Or do we want to try to get the other architectures done too?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 17, 2017

The other architectures matters too, yes.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Nov 29, 2017

The x86 fix does not apply to the other architectures, because the assembly in crosscall2 needs to behave like a C function, and using the standard TEXT-induced prologue makes the code behave like a Go function. Those happen to be close enough to the same on x86, but they are not on the other architectures (they differ in where the link register is saved, for example).

I am not convinced this bug is worth fixing on the other architectures. Why does crosscall2's debug info need to be correct at all?

@rsc

This comment has been minimized.

Copy link
Contributor

commented Nov 29, 2017

If there is anything left for the other architectures, it will need to happen in a future cycle. (One option would be to add a NODEBUG bit that disables generation of debug info for a function.)

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 29, 2017

@aarzilli

This comment has been minimized.

Copy link
Contributor Author

commented Nov 29, 2017

Why does crosscall2's debug info need to be correct at all?

I was trying to get delve to print stacktraces "correctly" when the stack contains cgo calls (here "correctly" means "matching the programmer's model").

Having a bad debug_frame entry for crosscall2 was a problem because I couldn't walk the crosscall2 frame and get to the runtime.asmcgocall frame.

I don't think other architectures are a priority since delve doesn't support them and other debuggers aren't in a position to take advantage of it anyway.

@aarzilli

This comment has been minimized.

Copy link
Contributor Author

commented Nov 29, 2017

As far as I am concerned it can go under an 'unscheduled' tag instead of 1.11.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.