runtime: SIGSEGV during runtime init, in gcenable #38639

Open
wadey opened this issue Apr 24, 2020 · 8 comments

@wadey wadey commented Apr 24, 2020

What version of Go are you using (go version)?

go1.14.2

Does this issue reproduce with the latest release?

It appears to be a rare crash, hard to reproduce.

What operating system and processor architecture are you using (go env)?

linux amd64

What did you do?

I just started a Go process and got this SIGSEGV. The second stack in the dump (goroutine 1) is still in gcenable, so this looks like it happened during runtime init.

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x6c0 pc=0x4191cc]

runtime stack:
runtime.throw(0xb11cd0, 0x2a)
        runtime/panic.go:1116 +0x72
runtime.sigpanic()
        runtime/signal_unix.go:679 +0x46a
runtime.(*mcache).prepareForSweep(0x0)
        runtime/mcache.go:179 +0x2c
runtime.acquirep(0xc000030800)
        runtime/proc.go:4278 +0x3d
runtime.stoplockedm()
        runtime/proc.go:1979 +0xca
runtime.schedule()
        runtime/proc.go:2454 +0x4a6
runtime.park_m(0xc000000180)
        runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
        runtime/asm_amd64.s:318 +0x5b

goroutine 1 [runnable, locked to thread]:
runtime.gopark(0xb24840, 0xc00007a058, 0x170e, 0x2)
        runtime/proc.go:304 +0xe0
runtime.chanrecv(0xc00007a000, 0x0, 0xc000000101, 0x410101)
        runtime/chan.go:525 +0x2e7
runtime.chanrecv1(0xc00007a000, 0x0)
        runtime/chan.go:407 +0x2b
runtime.gcenable()
        runtime/mgc.go:217 +0xac
runtime.main()
        runtime/proc.go:166 +0x115
runtime.goexit()
        runtime/asm_amd64.s:1373 +0x1

@randall77 randall77 commented Apr 24, 2020

@mknyszek

@mknyszek mknyszek commented Apr 24, 2020

Hm. This implies the mcache is nil. I'm not sure what that means (especially so early in the init process; all the Ps and mcaches should've been initialized already!), but it can't be good. I'll investigate.
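
The trace shows prepareForSweep called with a 0x0 receiver and a fault at addr=0x6c0, which is consistent with reading a field at that offset through a nil pointer. A minimal sketch of that failure mode in ordinary Go follows. The `cache` type, its padding, and `callPrepare` are hypothetical stand-ins for illustration, not the runtime's actual mcache layout.

```go
package main

import "fmt"

// cache is a hypothetical stand-in for the runtime's mcache. The padding
// places flushGen at offset 0x6c0 purely to mirror the fault address in the
// report; the real struct layout is an assumption here.
type cache struct {
	_        [0x6c0]byte
	flushGen uint32
}

// prepare reads a field through its receiver. Calling it on a nil *cache is
// legal; the fault happens at the field load inside the method.
func (c *cache) prepare() uint32 {
	return c.flushGen
}

// callPrepare converts the nil-dereference panic into an error so that the
// behavior can be observed without crashing the program.
func callPrepare(c *cache) (gen uint32, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return c.prepare(), nil
}

func main() {
	_, err := callPrepare(nil)
	fmt.Println(err)

	gen, err := callPrepare(&cache{})
	fmt.Println(gen, err)
}
```

In user code this fault surfaces as a recoverable panic. Inside the runtime the same signal is not recoverable, which is why the report shows "fatal error: unexpected signal during runtime execution" instead.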

@mknyszek mknyszek self-assigned this Apr 24, 2020
@mknyszek mknyszek added this to the Go1.15 milestone Apr 24, 2020

@ianlancetaylor ianlancetaylor commented Apr 24, 2020

A stab in the dark: does your code call runtime.GOMAXPROCS from an init function?
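
For readers unfamiliar with the pattern being asked about, it would look roughly like the sketch below. The value 2 and the `currentProcs` helper are arbitrary choices for illustration. Changing GOMAXPROCS resizes the set of Ps, which is presumably why it was a candidate worth ruling out.

```go
package main

import (
	"fmt"
	"runtime"
)

func init() {
	// Hypothetical example: change the number of Ps during package
	// initialization, before main.main runs.
	runtime.GOMAXPROCS(2)
}

// currentProcs reports the current setting; an argument below 1 queries
// GOMAXPROCS without changing it.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println(currentProcs())
}
```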


@wadey wadey commented Apr 25, 2020

It does not set GOMAXPROCS. Here is the main package (tag v1.2.0): https://github.com/slackhq/nebula/blob/v1.2.0/cmd/nebula/main.go


@mknyszek mknyszek commented May 14, 2020

There isn't much to go on here, but I had one hypothesis: in Go 1.14.2, when procresize is first called, it sets the 0th p's mcache to the bootstrap mcache, which mallocinit just places on whatever m is doing the initializing (that m later gets the 0th p in procresize). I was thinking that maybe the 0th p has its mcache set to nil as the result of g.m.mcache becoming nil between mallocinit and gcenable. Looking through all the code that gets executed in between, I don't see how this could happen. There are (potential) heap allocations all through there and they don't fail.

With that being said, things might be different given that in Go 1.15, ms no longer have an mcache pointer, only ps do, so the bootstrap mcache is more explicit (the malloc code just uses mcache0 when none exists and it's bootstrapping). The fact that the mcache is assigned as getg().m.mcache in p.init is a bit fishy to me.
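
The Go 1.14.2 hypothesis above can be pictured with a toy model. Everything in it (the `m`, `p`, and `mcache` types and the bodies of `mallocinit` and `procresize`) is a simplified, hypothetical stand-in that borrows the runtime's names only for readability. It is not runtime code, and the clobbering step it simulates was never observed.

```go
package main

import "fmt"

// Simplified stand-ins for runtime structures (assumptions, not real layouts).
type mcache struct{ id string }

type m struct{ mcache *mcache }

type p struct {
	id     int
	mcache *mcache
}

// mallocinit models placing the bootstrap mcache on the initializing m.
func mallocinit(cur *m) {
	cur.mcache = &mcache{id: "bootstrap"}
}

// procresize models the first resize: P 0 inherits whatever mcache the
// current m holds, and every other P gets a fresh one.
func procresize(cur *m, n int) []*p {
	ps := make([]*p, n)
	for i := range ps {
		ps[i] = &p{id: i}
		if i == 0 {
			ps[i].mcache = cur.mcache
		} else {
			ps[i].mcache = &mcache{id: fmt.Sprintf("p%d", i)}
		}
	}
	return ps
}

func main() {
	// Normal path: the bootstrap mcache survives until the resize.
	m0 := &m{}
	mallocinit(m0)
	fmt.Println("p0 has mcache:", procresize(m0, 2)[0].mcache != nil)

	// Hypothesized path: something clears m.mcache between the two calls.
	m1 := &m{}
	mallocinit(m1)
	m1.mcache = nil
	fmt.Println("p0 has mcache:", procresize(m1, 2)[0].mcache != nil)
}
```

In the second case P 0 ends up with a nil mcache while the other Ps are fine, which is the state the crash would require. As the comment notes, nothing found between mallocinit and gcenable appears able to cause it.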


@mknyszek mknyszek commented May 28, 2020

I may be getting a core dump for an application which crashed with this issue, so I might know more soon. 🤞


@mknyszek mknyszek commented May 29, 2020

False alarm: it turns out the core dump was for a binary built with an old Go version, which died for a different reason.


@ianlancetaylor ianlancetaylor commented Jun 26, 2020

We don't know what is happening here. Rolling milestone forward to 1.16.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.15, Go1.16 Jun 26, 2020