runtime: fpTraceback fails when trying to take a stack trace of a goroutine in a syscall #66889

Closed
mknyszek opened this issue Apr 18, 2024 · 4 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.

Comments

@mknyszek
Contributor

After https://go.dev/cl/567076 (for #65634) landed, the runtime began trying to take the frame-pointer-based stack trace of a goroutine in a syscall. Turns out, the runtime had never tried to do this before, and it's totally broken.

The main issue is that traceStack tries to use gp.sched.bp but that's not even set when entering a syscall. Worse still, gp.sched in general could get clobbered by calling systemstack. It can then get updated again when switching back to the Go stack (since the syscall path wants to undo the clobbering).

For now, I'm just going to disable getting a stack trace for goroutines blocked in a syscall for the entirety of a generation. To resolve this, what we really need is a gp.syscallbp, akin to gp.syscallpc and gp.syscallsp. The latter two are already used by the regular stack unwinder if available, and they're actually stable, as opposed to gp.sched, which gets mutated. Implementing that will be a bigger change, so I'm putting it on the back burner.
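
To make the failure mode concrete, here is a toy model of it, offered as a sketch rather than the runtime's actual code. The struct fields mirror the names mentioned above (gp.sched, gp.syscallpc, gp.syscallsp, gp.syscallbp), but the memory map, the addresses, enterSyscall, and clobberSched are all hypothetical, and fpTraceback here is a simplified stand-in that assumes the saved caller frame pointer lives at fp and the return address at fp+8.

```go
package main

import "fmt"

// Toy model only: these types mimic the shape of the fields discussed in
// this issue, but none of this is the real runtime code.
type gobuf struct {
	pc, sp, bp uintptr
}

type g struct {
	sched gobuf // scratch save area; may be overwritten while in a syscall

	// Snapshot taken once at syscall entry and left alone until exit.
	syscallpc, syscallsp, syscallbp uintptr
}

// memory is a fake address space: memory[fp] holds the caller's frame
// pointer and memory[fp+8] holds the return address (an assumed layout).
type memory map[uintptr]uintptr

// fpTraceback walks the frame-pointer chain starting at fp and collects
// return addresses until it reaches a zero frame pointer.
func fpTraceback(mem memory, fp uintptr) []uintptr {
	var pcs []uintptr
	for fp != 0 {
		pcs = append(pcs, mem[fp+8])
		fp = mem[fp]
	}
	return pcs
}

// enterSyscall (hypothetical) records a stable snapshot of the registers.
func enterSyscall(gp *g, pc, sp, bp uintptr) {
	gp.syscallpc, gp.syscallsp, gp.syscallbp = pc, sp, bp
}

// clobberSched (hypothetical) stands in for a switch to the system stack
// that reuses gp.sched for its own bookkeeping.
func clobberSched(gp *g) {
	gp.sched = gobuf{pc: 0xdead, sp: 0xdead, bp: 0xdead}
}

func main() {
	mem := memory{
		0x1000: 0x2000, 0x1008: 0xA1, // innermost frame
		0x2000: 0x3000, 0x2008: 0xB2,
		0x3000: 0, 0x3008: 0xC3, // outermost frame
	}
	gp := &g{}
	enterSyscall(gp, 0xA0, 0x0ff0, 0x1000)
	clobberSched(gp)

	// The snapshot still points at the real frame chain; sched.bp does not.
	fmt.Printf("from syscallbp: %#x\n", fpTraceback(mem, gp.syscallbp))
	fmt.Printf("from sched.bp:  %#x\n", fpTraceback(mem, gp.sched.bp))
}
```

The point of the sketch is only that a field written once at syscall entry stays usable as a traceback starting point, whereas a shared scratch area does not.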

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 18, 2024
@nsrip-dd
Contributor

nsrip-dd commented Apr 18, 2024

Aha! I wondered about this when looking into #66734.

Adding syscallbp makes sense to me. Perhaps we don't need to completely disable getting a stack trace, though? Could we fall back to the regular, non-frame-pointer stack trace method if the goroutine has a Gsyscall status? For example, this check could gain an || readgstatus(gp)&_Gsyscall != 0 condition so that such goroutines take the slow path.
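
A minimal sketch of that fallback idea, using stand-in constants and hypothetical helper names (inSyscall, pickUnwinder) rather than the runtime's own. It assumes the status values are consecutive small integers with a separate scan bit, so it masks off the scan bit and compares for equality instead of testing with a bitwise AND.

```go
package main

import "fmt"

// Hypothetical constants modelled on goroutine statuses; the exact values
// are an assumption made for this sketch.
const (
	gIdle uint32 = iota
	gRunnable
	gRunning
	gSyscall
	gWaiting
	gScan uint32 = 0x1000 // flag bit that may be OR'd into any status
)

// inSyscall reports whether status denotes a goroutine in a syscall,
// ignoring the scan bit. A plain status&gSyscall != 0 test would also
// match gRunnable and gRunning under this numbering.
func inSyscall(status uint32) bool {
	return status&^gScan == gSyscall
}

type unwinder int

const (
	framePointer unwinder = iota // fast path
	slowPath                     // regular, non-frame-pointer unwinder
)

// pickUnwinder chooses the slow path for goroutines in a syscall.
func pickUnwinder(status uint32) unwinder {
	if inSyscall(status) {
		return slowPath
	}
	return framePointer
}

func main() {
	for _, s := range []uint32{gRunnable, gRunning, gSyscall, gSyscall | gScan} {
		fmt.Printf("status %#x: slow path = %v\n", s, pickUnwinder(s) == slowPath)
	}
}
```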

@mknyszek
Contributor Author

Yeah, I just went ahead and implemented syscallbp. It wasn't hard.

@mknyszek mknyszek self-assigned this Apr 18, 2024
@gopherbot
Contributor

Change https://go.dev/cl/580255 mentions this issue: runtime: track frame pointer while in syscall

@phuslu

phuslu commented Apr 26, 2024

This PR altered the goid offset, potentially disrupting compatibility with several Go applications that depend on it. For those interested in the latest goid() support, I have updated my goid library here: https://github.com/phuslu/goid.
