runtime: stack growth during execution of "next" confuses gdb #25505

Closed
dr2chase opened this issue May 22, 2018 · 5 comments

@dr2chase
Contributor

commented May 22, 2018

This is not necessarily a "Go" bug, but this needs to be recorded.

If, in gdb, you attempt a "next" across a call to a function f, and execution of f causes stack growth, then gdb does not recognize the return from f as the actual return from f. The return no longer matches the call stack gdb recorded, which is the same mismatch it would see with recursion or with multiple threads executing the same function. gdb will then appear to "run ahead" to the next breakpoint or the end of the program, whichever comes first.

This is the root cause of #25497.

This is a problem for Go versions between 1.5 and 1.11, at least, and gdb versions between 7.9 and 8.1, at least.

@bcmills changed the title from 'Stack growth during execution of "next" confuses gdb.' to 'runtime: stack growth during execution of "next" confuses gdb' May 23, 2018

@bcmills

Member

commented May 23, 2018

I'm gonna call this a “runtime” issue on the theory that there might be something the runtime can do to tie the call and return together, but feel free to change the prefix if that turns out not to be the case.

CC: @heschik @aclements

@bcmills added this to the Unplanned milestone May 23, 2018

@dr2chase

Contributor Author

commented May 23, 2018

@bcmills I'm not sure that I want to imply that any of us is required to act on this, I just wanted to have a thing to point to for "that problem". Ideally gdb would learn that goroutines are identified by their "g" "registers" and use that. I.e., the runtime already ties the call and return together with the g register.

@bcmills

Member

commented May 23, 2018

> I'm not sure that I want to imply that any of us is required to act on this, I just wanted to have a thing to point to for "that problem".

Agreed. That's why I milestoned it “Unplanned”. 🙂

@aclements

Member

commented May 23, 2018

> I'm gonna call this a “runtime” issue on the theory that there might be something the runtime can do to tie the call and return together

I don't think there is. GDB's "next" creates a temporary internal breakpoint that's predicated on both the stack pointer and the thread [1]. This seems pretty fundamentally incompatible with both stack copying (though it would work with segmented stacks, which cause other problems for debugging) and N:M scheduling. The only potential wedge I see is that there's special handling for setjmp/longjmp, though I'm not sure what it does. The other possibility, of course, is to nag GDB to implement more complete support for Go, but the "next" breakpoint mechanism seems really deeply embedded.

[1] This starts at next_command. Look for skip_subroutines and STEP_OVER_ALL. The crux is the function insert_step_resume_breakpoint_at_caller, which indirectly sets a non-null frame_id and thread in the breakpoint object. bpstat_check_breakpoint_conditions checks these when gdb hits a breakpoint.
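The condition described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the types `frameID` and `stepResumeBreakpoint` and the method `shouldStop` are hypothetical names invented for illustration, and the addresses are made up. It is not gdb's actual data layout or code, only the idea of a breakpoint that fires when the pc, the frame identity, and the thread all match.

```go
package main

import "fmt"

// frameID loosely models how a debugger might identify a frame:
// by a stack address together with the function's code address.
// (Hypothetical type, not gdb's real structure.)
type frameID struct {
	stackAddr uint64
	codeAddr  uint64
}

// stepResumeBreakpoint models the temporary breakpoint placed at the
// caller's return address when stepping over a call.
type stepResumeBreakpoint struct {
	pc     uint64
	frame  frameID
	thread int
}

// shouldStop reports whether a hit at pc, observed in the given frame
// and thread, satisfies the breakpoint's conditions.
func (b stepResumeBreakpoint) shouldStop(pc uint64, frame frameID, thread int) bool {
	return pc == b.pc && frame == b.frame && thread == b.thread
}

func main() {
	caller := frameID{stackAddr: 0xc000040f00, codeAddr: 0x401000}
	bp := stepResumeBreakpoint{pc: 0x401020, frame: caller, thread: 1}

	// Ordinary return: same frame, same thread, so the debugger stops.
	fmt.Println(bp.shouldStop(0x401020, caller, 1)) // true

	// Stack copying: the caller's frame now lives at a new address.
	moved := caller
	moved.stackAddr = 0xc000080f00
	fmt.Println(bp.shouldStop(0x401020, moved, 1)) // false

	// N:M scheduling: the goroutine resumes on a different OS thread.
	fmt.Println(bp.shouldStop(0x401020, caller, 2)) // false
}
```

In this model, either a moved stack or a changed thread is enough to make the return look like an unrelated hit, which matches the "run ahead" behavior reported at the top of the issue.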

@ianlancetaylor

Contributor

commented May 24, 2018

Like other commenters, I can't think of any way to fix this other than changing gdb. That makes me think that we should close this. It will still exist as a place to point people.

@aclements closed this May 24, 2018

@golang locked and limited conversation to collaborators May 24, 2019
