Skip to content

runtime: cgo callback on extra M treated as external code after nested cgo callback returns #72870

@prattmic

Description

@prattmic

When a C-created thread running outside of Go receives a SIGPROF, we record the sample as coming from runtime._ExternalCode (https://cs.opensource.google/go/go/+/master:src/runtime/signal_unix.go;l=443;drc=944df9a7516021f0405cd8adb1e6894ae9872cb5).

As of https://go.dev/cl/495855, these threads will have an M if they've ever run Go before. In this case, we use mp.isExtraInC to track when this thread is running completely outside of Go.

Unfortunately, we have a bug in our handling of this flag that can result in running Go code with isExtraInC set. This happens like so:

  1. C-created thread calls into Go function 1 (via cgocallback).
  2. Go function 1 calls into C function 1 (via cgocall).
  3. C function 1 calls into Go function 2 (via cgocallback).
  4. Go function 2 returns back to C function 1 (returning via the remainder of cgocallback).
  5. C function 1 returns back to Go function 1 (returning via the remainder of cgocall).
  6. Go function 1 is now running with mp.isExtraInC == true.

The problem occurs in step 4. On return, cgocallback unconditionally sets mp.isExtraInC, but it should actually only set the flag if there are no more Go frames higher up the stack.

I believe all of the effects of the incorrect flag are:

  1. SIGPROF will record a sample with frames [pc, runtime._ExternalCode] rather than the actual stack trace. Normally this is just some data loss in profiles, but if the PC corresponds to multiple inlined frames, this can result in a panic: runtime error: slice bounds out of range crash in runtime/pprof.(*profileBuilder).appendLocsForStack because runtime/pprof assumes that each PC always corresponds to a constant number of inlined frames. I believe this is the issue described in runtime/pprof: theoretical appendLocsForStack panic with SIGPROF between cgocallback and exitsyscall #70529 (comment).

  2. Async preemption signals to this thread will be ignored.

  3. I'm not certain (need to test), but I think that synchronous signals will be forwarded to the C handler or completely ignored. e.g., SIGILL might just be ignored and thus be triggered in a loop?

cc @mknyszek @cherrymui @nsrip-dd @golang/runtime

Metadata

Metadata

Assignees

Labels

CriticalA critical problem that affects the availability or correctness of production systems built using GoNeedsFixThe path to resolution is known, but the work has not been done.compiler/runtimeIssues related to the Go compiler and/or runtime.

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions