runtime: SIGSEGV in garbage collector sweeping #64173

Closed
joepadmiraal opened this issue Nov 15, 2023 · 3 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@joepadmiraal

What version of Go are you using (go version)?

$ go version
go version go1.21.0 linux/amd64

Does this issue reproduce with the latest release?

Not sure, it happened once on production.
We did not yet encounter it during our test runs.

What operating system and processor architecture are you using (go env)?

The executable is built on a system with these details:

go env Output
$ go env
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/supervisor/.cache/go-build'
GOENV='/home/supervisor/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/supervisor/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/supervisor/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go-1.21'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go-1.21/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.0'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build559569511=/tmp/go-build -gno-record-gcc-switches'

What did you do?

We have an executable that runs as a daemon on Debian 12 machines.
These machines are deployed in the office networks of our customers.
Usually they run for a couple of hours and are shut down afterwards.

We noticed a crash of our service on one of the systems and investigated it.
It seems like a crash in the garbage collector:

Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: SIGSEGV: segmentation violation
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: PC=0x42abb2 m=11 sigcode=128
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: goroutine 0 [idle]:
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.(*sweepLocked).sweep(0x0?, 0x0)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/mgcsweep.go:552 +0x132 fp=0x7f816bffeb70 sp=0x7f816bffea68 pc=0x42abb2
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.(*mcentral).uncacheSpan(0x0?, 0x0?)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/mcentral.go:228 +0x98 fp=0x7f816bffeb98 sp=0x7f816bffeb70 pc=0x41bff8
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.(*mcache).releaseAll(0x7f81d7bed108)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/mcache.go:291 +0x13c fp=0x7f816bffec00 sp=0x7f816bffeb98 pc=0x41ba5c
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.(*mcache).prepareForSweep(0x7f81d7bed108)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/mcache.go:328 +0x35 fp=0x7f816bffec28 sp=0x7f816bffec00 pc=0x41bb55
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.acquirep(0xc000030000)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:5334 +0x26 fp=0x7f816bffec68 sp=0x7f816bffec28 pc=0x44a666
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.stopm()
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:2537 +0xb5 fp=0x7f816bffec98 sp=0x7f816bffec68 pc=0x443db5
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.findRunnable()
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:3229 +0xb9c fp=0x7f816bffeda8 sp=0x7f816bffec98 pc=0x4456bc
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.schedule()
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:3582 +0xb1 fp=0x7f816bffede0 sp=0x7f816bffeda8 pc=0x4464b1
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.park_m(0xc00007c000?)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:3745 +0x11f fp=0x7f816bffee28 sp=0x7f816bffede0 pc=0x4469bf
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.mcall()
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/asm_amd64.s:458 +0x4e fp=0x7f816bffee40 sp=0x7f816bffee28 pc=0x470f0e
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: goroutine 1 [chan receive, 14 minutes]:
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: runtime.gopark(0xc000218268?, 0xc0003b1e20?, 0x0?, 0x0?, 0xc0002ba860?)
Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]:         /usr/lib/go-1.21/src/runtime/proc.go:398 +0xce fp=0xc000599df0 sp=0xc000599dd0 pc=0x43ffee
...

The full stack traces can be seen in this log file:
stack.log
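
As a reading aid for the crash header above, here is a minimal Go sketch that pulls the `PC`, `m`, and `sigcode` fields out of such a log line. The names `crashHeader`, `parseCrashHeader`, and `describeSigCode` are hypothetical helpers, not part of the Go runtime. The mapping assumes Linux `si_code` values, where 128 is `SI_KERNEL`; on x86-64 that code often accompanies a general protection fault rather than an ordinary page fault, though that is only a hint and not a diagnosis.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// crashHeader holds the fields of the "PC=... m=... sigcode=..." line
// that the Go runtime prints at the top of a fatal signal report.
type crashHeader struct {
	PC      uint64 // program counter at the time of the fault
	M       int    // ID of the OS thread (M) that received the signal
	SigCode int    // si_code value reported by the kernel
}

var headerRE = regexp.MustCompile(`PC=(0x[0-9a-fA-F]+) m=(\d+) sigcode=(-?\d+)`)

// parseCrashHeader extracts the header fields from one log line.
// The second result is false when the line is not a crash header.
func parseCrashHeader(line string) (crashHeader, bool) {
	m := headerRE.FindStringSubmatch(line)
	if m == nil {
		return crashHeader{}, false
	}
	pc, err := strconv.ParseUint(m[1], 0, 64) // base 0 accepts the 0x prefix
	if err != nil {
		return crashHeader{}, false
	}
	mid, err := strconv.Atoi(m[2])
	if err != nil {
		return crashHeader{}, false
	}
	code, err := strconv.Atoi(m[3])
	if err != nil {
		return crashHeader{}, false
	}
	return crashHeader{PC: pc, M: mid, SigCode: code}, true
}

// describeSigCode names a few Linux si_code values relevant to SIGSEGV.
func describeSigCode(code int) string {
	switch code {
	case 1:
		return "SEGV_MAPERR (address not mapped)"
	case 2:
		return "SEGV_ACCERR (invalid permissions for mapped address)"
	case 128:
		return "SI_KERNEL (signal raised by the kernel itself)"
	default:
		return "other si_code"
	}
}

func main() {
	line := "Nov 10 14:06:44 fakemachine exmg-encoder-daemon[642]: PC=0x42abb2 m=11 sigcode=128"
	if h, ok := parseCrashHeader(line); ok {
		fmt.Printf("pc=%#x m=%d sigcode=%d: %s\n", h.PC, h.M, h.SigCode, describeSigCode(h.SigCode))
	}
}
```

Running the same parser over stack.log would show whether every report in the file carries the same `sigcode` and faulting `PC`.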

What did you expect to see?

I'd expect either an error which indicates that our code did something wrong, or no crash at all.

I also found this issue, which points at exactly the same line of code (in an older version of golang). But unfortunately that was never resolved.

The problem is that it's very hard to reproduce.
We have multiple test machines here, running exactly the same software on exactly the same hardware, but have not seen the issue there.

To me this looks like an issue in the garbage collector code of golang, but I'm not an expert on that topic so I could be completely wrong. Any pointers on how I can further investigate this would be appreciated.
Thanks

@mengzhuo mengzhuo changed the title SIGSEGV in garbage collector sweeping runtime: SIGSEGV in garbage collector sweeping Nov 15, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Nov 15, 2023
@mauri870
Member

Could you try go 1.21.4? Also, having a reproducible example would help with investigating the issue.

@mauri870 mauri870 added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 15, 2023
@joepadmiraal
Author

Could you try go 1.21.4? Also, having a reproducible example would help with investigating the issue.

I'll update to go 1.21.4.
However, I don't see any related fixes in the change logs so I don't think that will help.
Regarding the reproducible example, like I said in the ticket:
The problem is that it's very hard to reproduce.
We have multiple test machines here, running exactly the same software on exactly the same hardware, but have not seen the issue there.

So it's not reproducible.
I know it sucks, but it is what it is.

What would help me is if somebody with knowledge about the garbage collector could give me a hint on what can cause this specific SIGSEGV. With that I can maybe find a bug in our code that triggers it.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Dec 8, 2023
@gopherbot
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 8, 2024