runtime: handle hitting the top of the address space in the allocator more gracefully #35954
Comments
Thanks for reporting this. If I've understood the report you linked, the issue is that an OOM caused by a goroutine leak is crashing the runtime with a SIGSEGV instead of failing with a more graceful message. The crash starts like this:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x565db1a3]

runtime stack:
runtime.throw(0x5672f88b, 0x2a)
    /usr/lib/golang/src/runtime/panic.go:774 +0x70
runtime.sigpanic()
    /usr/lib/golang/src/runtime/signal_unix.go:378 +0x40a
runtime.(*mheap).setSpans(0x568e89c0, 0xff000000, 0x800, 0xe12c2118)
    /usr/lib/golang/src/runtime/mheap.go:1158 +0x43
runtime.(*mheap).growAddSpan(0x568e89c0, 0xff000000, 0x1000000)
    /usr/lib/golang/src/runtime/mheap.go:1311 +0xcd
runtime.(*mheap).grow(0x568e89c0, 0x800, 0xffffffff)
    /usr/lib/golang/src/runtime/mheap.go:1296 +0x6b
runtime.(*mheap).allocSpanLocked(0x568e89c0, 0x800, 0x568f9368, 0x0)
    /usr/lib/golang/src/runtime/mheap.go:1170 +0x24e
runtime.(*mheap).alloc_m(0x568e89c0, 0x800, 0x568e0101, 0x3bd57)
    /usr/lib/golang/src/runtime/mheap.go:1022 +0xed
runtime.(*mheap).alloc.func1()
    /usr/lib/golang/src/runtime/mheap.go:1093 +0x3f
runtime.(*mheap).alloc(0x568e89c0, 0x800, 0x56010101, 0x61c8e700)
    /usr/lib/golang/src/runtime/mheap.go:1092 +0x72
runtime.largeAlloc(0x1000000, 0x57b00101, 0x5)
    /usr/lib/golang/src/runtime/malloc.go:1138 +0x81
runtime.mallocgc.func1()
    /usr/lib/golang/src/runtime/malloc.go:1033 +0x3a
runtime.systemstack(0x2d673500)
    /usr/lib/golang/src/runtime/asm_386.s:399 +0x62
runtime.mstart()
    /usr/lib/golang/src/runtime/proc.go:1146

goroutine 22760 [running]:
runtime.systemstack_switch()
    /usr/lib/golang/src/runtime/asm_386.s:360 fp=0x59d98e18 sp=0x59d98e14 pc=0x5660fcc0
runtime.mallocgc(0x1000000, 0x567b7e20, 0x1, 0x565f9f25)
    /usr/lib/golang/src/runtime/malloc.go:1032 +0x6ff fp=0x59d98e6c sp=0x59d98e18 pc=0x565c264f
runtime.makeslice(0x567b7e20, 0x0, 0x1000000, 0x565f9ff5)
    /usr/lib/golang/src/runtime/slice.go:49 +0x50 fp=0x59d98e80 sp=0x59d98e6c pc=0x565f9f70
runtime.makeslice64(0x567b7e20, 0x0, 0x0, 0x1000000, 0x0, 0xf7cc036c)
    /usr/lib/golang/src/runtime/slice.go:63 +0x50 fp=0x59d98e94 sp=0x59d98e80 pc=0x565fa040
github.com/klauspost/compress/zstd.(*Decoder).DecodeAll(0x577cc050, 0x6a8043c0, 0x3b, 0x40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /builddir/build/BUILD/compress-1.9.3/_build/src/github.com/klauspost/compress/zstd/decoder.go:318 +0x346 fp=0x59d98f10 sp=0x59d98e94 pc=0x566f1ac6
github.com/klauspost/compress/zstd.TestEncoderRegression.func1.1.1(0x41abc320)
    /builddir/build/BUILD/compress-1.9.3/_build/src/github.com/klauspost/compress/zstd/encoder_test.go:224 +0x46a fp=0x59d98f9c sp=0x59d98f10 pc=0x5671fe2a
testing.tRunner(0x41abc320, 0x60a9a370)
    /usr/lib/golang/src/testing/testing.go:909 +0xae fp=0x59d98fe8 sp=0x59d98f9c pc=0x5667749e
runtime.goexit()
    /usr/lib/golang/src/runtime/asm_386.s:1325 +0x1 fp=0x59d98fec sp=0x59d98fe8 pc=0x56611721
created by testing.(*T).Run
    /usr/lib/golang/src/testing/testing.go:960 +0x2d3

goroutine 1 [chan receive]:
testing.(*T).Run(0x57a3a320, 0x56729ce1, 0x15, 0x567e43b8, 0x301)
    /usr/lib/golang/src/testing/testing.go:961 +0x2f2
testing.runTests.func1(0x56cdc000)
    /usr/lib/golang/src/testing/testing.go:1202 +0x5b
testing.tRunner(0x56cdc000, 0x56c666dc)
    /usr/lib/golang/src/testing/testing.go:909 +0xae
testing.runTests(0x56c0e080, 0x568d0d40, 0x27, 0x27, 0x0)
    /usr/lib/golang/src/testing/testing.go:1200 +0x28b
testing.(*M).Run(0x56c8e180, 0x0)
    /usr/lib/golang/src/testing/testing.go:1117 +0x16e
main.main()
    _testmain.go:152 +0x14f

goroutine 7254 [chan receive]:
testing.(*T).Run(0x57a3a3c0, 0x567268a8, 0x7, 0x579c2700, 0x0)
    /usr/lib/golang/src/testing/testing.go:961 +0x2f2
github.com/klauspost/compress/zstd.TestEncoderRegression(0x57a3a320)
    /builddir/build/BUILD/compress-1.9.3/_build/src/github.com/klauspost/compress/zstd/encoder_test.go:170 +0x23e
testing.tRunner(0x57a3a320, 0x567e43b8)
    /usr/lib/golang/src/testing/testing.go:909 +0xae
created by testing.(*T).Run
    /usr/lib/golang/src/testing/testing.go:960 +0x2d3

and then goes on with a list of the thousands of (leaked) goroutines.
I can reproduce on 1.13.4, with
We get a different error with tip:
Slightly better, but still not great. I can reproduce with this:
Run with
This seems to be a regression from 1.12 to 1.13. My gut feeling is we should just fix this at tip.
Tentatively milestoning as 1.14, but not as a release blocker.
Actually, this is somewhat nondeterministic even with my small repro. Larger sizes tend to make it fail more often, but the demarcation size is not as clear-cut as I indicated previously.
From my side, I applied the upstream-recommended workaround and I now run the test suite with
@randall77, what commit was that traceback from? (mpagealloc.go:409 is a comment at current tip.)
@aclements, tip was bf3ee57
AFAICT this is the result of the runtime not handling allocations at the top of the address space well. We're running on 32-bit platforms and the arguments to both
I think we need a bunch of overflow checking in more than one place in the runtime to fix this properly. I'm not sure why
Ping @mknyszek
Ah thanks, nearly forgot about this. Time to play overflow whack-a-mole!
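To make the overflow concrete, here is a minimal sketch that is not runtime code. It takes the two values 0xff000000 and 0x1000000 that appear as arguments in the `growAddSpan` frame of the trace and adds them in 32 bits. Reading them as a base address and a size is an assumption, and `endOf32` is a hypothetical helper that uses `uint32` to stand in for `uintptr` on a 32-bit platform.

```go
package main

import (
	"fmt"
	"math/bits"
)

// endOf32 is a hypothetical helper: it computes base+size in 32 bits and
// reports whether the sum wrapped past the top of the address space.
func endOf32(base, size uint32) (end uint32, wrapped bool) {
	sum, carry := bits.Add32(base, size, 0)
	return sum, carry != 0
}

func main() {
	// Values taken from the growAddSpan frame; their meaning is assumed.
	end, wrapped := endOf32(0xff000000, 0x1000000)
	fmt.Printf("end=%#x wrapped=%v\n", end, wrapped) // end=0x0 wrapped=true
}
```

The mathematical sum is 0x100000000, which does not fit in 32 bits, so the computed end address is 0 and any later comparison against it is meaningless.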
Change https://golang.org/cl/230719 mentions this issue:
I've confirmed that the latest version of https://golang.org/cl/230719 properly crashes @randall77's program above via an "out of memory" error from the OS allocator, as opposed to a random segfault.
Change https://golang.org/cl/231341 mentions this issue:
Change https://golang.org/cl/231344 mentions this issue:
Change https://golang.org/cl/231339 mentions this issue:
Change https://golang.org/cl/231345 mentions this issue:
Change https://golang.org/cl/231338 mentions this issue:
Change https://golang.org/cl/231342 mentions this issue:
Change https://golang.org/cl/231343 mentions this issue:
Change https://golang.org/cl/231337 mentions this issue:
Change https://golang.org/cl/231346 mentions this issue:
Change https://golang.org/cl/231340 mentions this issue:
Currently when checking if we can grow the heap into the current arena, we do an addition which may overflow. This is particularly likely on 32-bit systems.

Avoid this situation by explicitly checking for overflow, and adding in some comments about when overflow is possible, when it isn't, and why.

For #35954.

Change-Id: I2d4ecbb1ccbd43da55979cc721f0cd8d1757add2
Reviewed-on: https://go-review.googlesource.com/c/go/+/231337
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
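The kind of check this commit message describes can be sketched as follows. This is an illustration only, not the code from the CL: `fitsNaive`, `fitsChecked`, and the arena bounds are hypothetical, and `uint32` again models `uintptr` on a 32-bit system. The naive form adds before comparing, so the sum can wrap; the checked form subtracts from the limit instead, which cannot overflow once `cur <= end` is established.

```go
package main

import "fmt"

// fitsNaive reports whether n bytes starting at cur stay at or below end.
// The addition cur+n can wrap around, making a too-large request look fine.
func fitsNaive(cur, n, end uint32) bool {
	return cur+n <= end
}

// fitsChecked asks the same question without any addition: once cur <= end
// holds, end-cur cannot underflow, so the comparison is always meaningful.
func fitsChecked(cur, n, end uint32) bool {
	return cur <= end && n <= end-cur
}

func main() {
	// Hypothetical arena near the top of a 32-bit address space.
	const cur, end, n = 0xff000000, 0xffc00000, 0x1000000
	fmt.Println(fitsNaive(cur, n, end))   // true: cur+n wrapped to 0
	fmt.Println(fitsChecked(cur, n, end)) // false: only 0xc00000 bytes remain
}
```

Rearranging the comparison is one way to avoid the overflow; detecting the wrap explicitly after the addition is another, and which one the runtime uses at each site is not something this sketch claims to show.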
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
Run the test suite for github.com/klauspost/compress 1.9.3.
What did you expect to see?
All tests pass.
What did you see instead?
zstd test fails with SIGSEGV.
I reported it upstream and they suggested reporting it here as well as it shouldn't segfault.