runtime: -msan / -asan stack corruption with CPU profiling and SetCgoTraceback context callback #71395

Open
prattmic opened this issue Jan 22, 2025 · 4 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.

@prattmic
Member

msancall and asancall are used to call into the MSAN and ASAN C runtimes, respectively.

These wrappers need to handle stack switching, similar to asmcgocall.

If the caller is already running on g0, they just perform the call; otherwise they switch SP to g0.sched.sp and then make the call. This is normally fine, but in a signal context we are on gsignal (not g0!), while the code the signal interrupted may have been running on g0. Because the wrappers see that we are not on g0 and switch to g0.sched.sp anyway, the MSAN/ASAN call will scribble all over the stack that the interrupted code is using.
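
To make the overlap concrete, here is a toy Go model of the stack choice described above. It is only a sketch: the `stack` and `thread` types, the addresses, and both `switchTarget…` functions are hypothetical stand-ins, not the runtime's assembly. The "signal-aware" variant is one possible way to avoid the overlap, shown for contrast; it is not necessarily how the runtime is or will be fixed.

```go
package main

import "fmt"

// stack is a toy stand-in for a runtime stack (hypothetical type).
type stack struct {
	name    string
	schedSP uintptr // models g.sched.sp: the SP saved when this stack was last left
}

// thread models the two fields of an M that matter here (hypothetical type).
type thread struct {
	g0      *stack
	gsignal *stack
}

// switchTargetBuggy models the behavior described in the report:
// stay on the current stack only when already on g0, else use g0.sched.sp.
func switchTargetBuggy(m *thread, cur *stack, curSP uintptr) uintptr {
	if cur == m.g0 {
		return curSP
	}
	return m.g0.schedSP
}

// switchTargetSignalAware is an assumed alternative that also treats
// gsignal as a stack the call may run on directly.
func switchTargetSignalAware(m *thread, cur *stack, curSP uintptr) uintptr {
	if cur == m.g0 || cur == m.gsignal {
		return curSP
	}
	return m.g0.schedSP
}

// overlaps reports whether a call that starts at callSP and uses frame
// bytes (stacks grow down) writes into the live range [liveLo, liveHi).
func overlaps(callSP, frame, liveLo, liveHi uintptr) bool {
	return callSP-frame < liveHi && callSP > liveLo
}

func main() {
	m := &thread{
		g0:      &stack{name: "g0", schedSP: 0x7000},
		gsignal: &stack{name: "gsignal"},
	}
	const (
		interruptedSP = 0x6f00 // how deep the g0 code was when SIGPROF arrived
		handlerSP     = 0x9000 // SP inside the signal handler, on gsignal
		frame         = 0x80   // stack space the sanitizer call will use
	)

	buggy := switchTargetBuggy(m, m.gsignal, handlerSP)
	aware := switchTargetSignalAware(m, m.gsignal, handlerSP)

	fmt.Println("buggy clobbers g0:", overlaps(buggy, frame, interruptedSP, m.g0.schedSP))
	fmt.Println("signal-aware clobbers g0:", overlaps(aware, frame, interruptedSP, m.g0.schedSP))
}
```

In this model the buggy choice starts the call at g0.sched.sp, directly above frames the interrupted g0 code still owns, so the call's frame lands inside the live range; staying on gsignal does not touch g0 at all.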

As far as I know, MSAN/ASAN calls are possible from signal context in only one case:

  • runtime.cgoContextPCs contains msanwrite/asanwrite calls.
  • runtime.cgoContextPCs is reachable from the SIGPROF signal handler: runtime.sigprof -> runtime.tracebackPCs -> runtime.(*unwinder).cgoCallers -> runtime.cgoContextPCs.
  • This is only reachable if the application has registered cgo traceback via runtime.SetCgoTraceback. Note that both the traceback and context handlers must be registered. The latter is required because runtime.cgoContextPCs only calls the traceback function if gp.cgoCtxt is active, which requires a context handler.
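
The conditions above can be summarized as a small predicate. This is a simplified model under stated assumptions, not runtime code: the `config` type and its field names are hypothetical, and it treats a registered context handler as sufficient for gp.cgoCtxt to be active, whereas the text above only says it is required.

```go
package main

import "fmt"

// config captures the conditions listed above. All names are illustrative.
type config struct {
	sanitizer    bool // built with -msan or -asan
	cpuProfiling bool // SIGPROF is being delivered
	tracebackFn  bool // traceback handler registered via runtime.SetCgoTraceback
	contextFn    bool // context handler registered via runtime.SetCgoTraceback
}

// sanitizerCallFromSignal reports whether, in this model, the SIGPROF
// handler can reach the msanwrite/asanwrite calls in cgoContextPCs.
// Simplification: a registered context handler stands in for
// "gp.cgoCtxt is active".
func sanitizerCallFromSignal(c config) bool {
	return c.sanitizer && c.cpuProfiling && c.tracebackFn && c.contextFn
}

func main() {
	cases := []struct {
		name string
		cfg  config
	}{
		{"sanitizer + profiling + both handlers", config{true, true, true, true}},
		{"traceback handler only", config{true, true, true, false}},
		{"no CPU profiling", config{true, false, true, true}},
		{"no sanitizer", config{false, true, true, true}},
	}
	for _, tc := range cases {
		fmt.Printf("%-40s affected=%v\n", tc.name, sanitizerCallFromSignal(tc.cfg))
	}
}
```

Only the first case is affected in this model, which matches the point that registering the traceback handler without the context handler does not reach the sanitizer calls.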

https://go.dev/cl/643875 contains a reproducer. The allocator runs portions on the system stack, so with MSAN/ASAN plus profiling, we see crashes due to stack corruption in the allocator.

$ GOFLAGS=-msan CC=clang go test -run CgoTracebackContextProfile -v runtime
=== RUN   TestCgoTracebackContextProfile
=== PAUSE TestCgoTracebackContextProfile
=== CONT  TestCgoTracebackContextProfile
    crash_test.go:172: running /usr/local/google/home/mpratt/src/go/bin/go build -o /tmp/go-build4253652554/testprogcgo.exe
    crash_test.go:194: built testprogcgo in 1.417734407s
    crash_cgo_test.go:292: /tmp/go-build4253652554/testprogcgo.exe TracebackContextProfile: exit status 2
    crash_cgo_test.go:295: expected "OK\n" got SIGSEGV: segmentation violation
        PC=0x50d8e2 m=7 sigcode=1 addr=0x1b
        
        goroutine 0 gp=0xc0003021c0 m=7 mp=0xc000300008 [idle]:
        runtime.callers.func1()
                /usr/local/google/home/mpratt/src/go/src/runtime/traceback.go:1100 +0xc2 fp=0x7f6637ffed40 sp=0x7f6637ffec78 pc=0x50d8e2
        msancall()
                /usr/local/google/home/mpratt/src/go/src/runtime/msan_amd64.s:87 +0x2d fp=0x7f6637ffed50 sp=0x7f6637ffed40 pc=0x525c2d
        
        goroutine 24 gp=0xc000103180 m=7 mp=0xc000300008 [running, locked to thread]:
        runtime.systemstack_switch()
                /usr/local/google/home/mpratt/src/go/src/runtime/asm_amd64.s:479 +0x8 fp=0xc00051abb0 sp=0xc00051aba0 pc=0x522728
        runtime.callers(0x7f6684100788?, {0xc00030e000?, 0x219cd20?, 0x7f6684e18470?})
                /usr/local/google/home/mpratt/src/go/src/runtime/traceback.go:1097 +0x92 fp=0xc00051ac18 sp=0xc00051abb0 pc=0x5215f2
        runtime.mProf_Malloc(0xc000300008, 0xc000330880, 0x80)
                /usr/local/google/home/mpratt/src/go/src/runtime/mprof.go:447 +0x74 fp=0xc00051ac98 sp=0xc00051ac18 pc=0x4db374
        runtime.profilealloc(0xc000300008?, 0xc000330880?, 0x80?)
                /usr/local/google/home/mpratt/src/go/src/runtime/malloc.go:1802 +0x9b fp=0xc00051acc8 sp=0xc00051ac98 pc=0x4be47b
        runtime.mallocgcSmallNoscan(0xc000330800?, 0x80?, 0x0?)
                /usr/local/google/home/mpratt/src/go/src/runtime/malloc.go:1327 +0x23c fp=0xc00051ad20 sp=0xc00051acc8 pc=0x4bd61c
        runtime.mallocgc(0x80, 0x688f80, 0x1)
                /usr/local/google/home/mpratt/src/go/src/runtime/malloc.go:1055 +0xb9 fp=0xc00051ad58 sp=0xc00051ad20 pc=0x51b4f9
        runtime.makeslice(0x0?, 0xc000103180?, 0x4b3c45?)
                /usr/local/google/home/mpratt/src/go/src/runtime/slice.go:116 +0x49 fp=0xc00051ad80 sp=0xc00051ad58 pc=0x51f449
        main.TracebackContextProfileGoFunction(...)
                /usr/local/google/home/mpratt/src/go/src/runtime/testdata/testprogcgo/tracebackctxt.go:176
        _cgoexp_b32fe38f1ae6_TracebackContextProfileGoFunction(0x0?)
                _cgo_gotypes.go:868 +0x27 fp=0xc00051adb0 sp=0xc00051ad80 pc=0x658227
        runtime.cgocallbackg1(0x658200, 0x7f6637ffedd0, 0x1)
                /usr/local/google/home/mpratt/src/go/src/runtime/cgocall.go:444 +0x28b fp=0xc00051ae68 sp=0xc00051adb0 pc=0x4b3b8b
        runtime.cgocallbackg(0x658200, 0x7f6637ffedd0, 0x1)
                /usr/local/google/home/mpratt/src/go/src/runtime/cgocall.go:350 +0x133 fp=0xc00051aed0 sp=0xc00051ae68 pc=0x4b3833
        runtime.cgocallbackg(0x658200, 0x7f6637ffedd0, 0x1)
                <autogenerated>:1 +0x29 fp=0xc00051aef8 sp=0xc00051aed0 pc=0x526cc9
        runtime.cgocallback(0xc00051af58, 0x51a8f5, 0x662270)
                /usr/local/google/home/mpratt/src/go/src/runtime/asm_amd64.s:1084 +0xcc fp=0xc00051af20 sp=0xc00051aef8 pc=0x5244ec
        cFunction
                tracebackctxt.go:65792 pc=0x100
        cFunction
                tracebackctxt.go:256 pc=0x100
        runtime.systemstack_switch()
                /usr/local/google/home/mpratt/src/go/src/runtime/asm_amd64.s:479 +0x8 fp=0xc00051af30 sp=0xc00051af20 pc=0x522728
        runtime.cgocall(0x662270, 0xc00051af90)
                /usr/local/google/home/mpratt/src/go/src/runtime/cgocall.go:185 +0x75 fp=0xc00051af68 sp=0xc00051af30 pc=0x51a8f5
        main._Cfunc_TracebackContextProfileCallGo()
                _cgo_gotypes.go:267 +0x3a fp=0xc00051af90 sp=0xc00051af68 pc=0x6478fa
        main.TracebackContextProfile.func1()
                /usr/local/google/home/mpratt/src/go/src/runtime/testdata/testprogcgo/tracebackctxt.go:161 +0x7e fp=0xc00051afe0 sp=0xc00051af90 pc=0x6574be
        runtime.goexit({})
                /usr/local/google/home/mpratt/src/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00051afe8 sp=0xc00051afe0 pc=0x524741
        created by main.TracebackContextProfile in goroutine 1
                /usr/local/google/home/mpratt/src/go/src/runtime/testdata/testprogcgo/tracebackctxt.go:158 +0x10e
...

I haven't tested older versions, but this code hasn't changed in a while, so I suspect that 1.22 and 1.23 are also affected.

@prattmic prattmic added compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. labels Jan 22, 2025
@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Jan 22, 2025
@gopherbot
Contributor

Change https://go.dev/cl/643875 mentions this issue: runtime: MSAN/ASAN + SIGPROF regression test

@gopherbot
Contributor

Change https://go.dev/cl/643897 mentions this issue: runtime: pass through -asan/-msan/-race to testprog tests

@gopherbot
Contributor

Change https://go.dev/cl/643918 mentions this issue: main.star: add linux-arm64 ASAN/MSAN builders

gopherbot pushed a commit to golang/build that referenced this issue Jan 23, 2025
linux-arm64 is a first class port with no coverage of ASAN or MSAN
modes. We already have a clang15 builder, so adding ASAN and MSAN should
be trivial (fingers crossed).

For golang/go#71395.
For golang/go#70054.

Change-Id: I6a6a636c7a41147b1b22933db946ca838d3696f4
Reviewed-on: https://go-review.googlesource.com/c/build/+/643918
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
@mknyszek mknyszek added this to the Backlog milestone Jan 29, 2025
@mknyszek mknyszek moved this from Todo to In Progress in Go Compiler / Runtime Jan 29, 2025
@mknyszek mknyszek modified the milestones: Backlog, Go1.25 Jan 29, 2025
Projects
Status: In Progress