runtime: "fatal error: unexpected signal" 0xC0000005 on Windows for a small program with a large allocation #37470

Closed
ulikunitz opened this issue Feb 26, 2020 · 25 comments

Comments

@ulikunitz
Contributor

@ulikunitz ulikunitz commented Feb 26, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14 windows/amd64

Does this issue reproduce with the latest release?

The tests were run with Go 1.14 on a fully patched Windows 10 Home Version 1909 (Build 18363.657).

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Uli\AppData\Local\go-build
set GOENV=C:\Users\Uli\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\Uli\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\Uli\src\lz\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\Uli\AppData\Local\Temp\go-build893962726=/tmp/go-build -gno-record-gcc-switches
GOROOT/bin/go version: go version go1.14 windows/amd64
GOROOT/bin/go tool compile -V: compile version go1.14

What did you do?

I'm developing a package that creates Lempel-Ziv sequences for byte streams. In my usual development environment, Ubuntu 18.04, I sometimes observe crashes of the test runs (fatal error: bad g->status in ready) with go1.13.8 and go1.14 on multiple kernels, including 4.15.0 and 5.3.18. Neither should be affected by the AVX register corruption.

To exclude Linux as a factor I tested the package on Windows and got a fatal error on every run. I was able to reduce it to a small program. The program runs without any errors on Linux. Whether the Windows issue is related to the Linux problems I cannot tell. I'm aware that initializing a structure with a huge array this way is not a good idea, but that is what I wrote initially and what appears to trigger the stack extension that runs into an invalid address access.

I started the program with go run.

> go run main.go
package main

import "fmt"

type Sequencer struct {
        htable [1 << 17]uint32
        buf    []byte
}

func (s *Sequencer) Init(windowSize int) *Sequencer {
        if !(0 <= windowSize) {
                panic(fmt.Errorf("windowSize out of range [%d,%d]", 0, 0))
        }
        *s = Sequencer{
                buf: []byte{0xff},
        }

        return s
}

func main() {
        var s Sequencer
        s.Init(0)
}

https://play.golang.org/p/VRavJw-WPie

What did you expect to see?

No output and no fatal error.

What did you see instead?

fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x1 addr=0xc000134000 pc=0x4143a9]

runtime stack:
runtime.throw(0x4d5ec2, 0x2a)
        c:/go/src/runtime/panic.go:1112 +0x79
runtime.sigpanic()
        c:/go/src/runtime/signal_windows.go:240 +0x25a
runtime.runGCProg(0x4a294c, 0x0, 0xc000132000, 0x1, 0x579680)
        c:/go/src/runtime/mbitmap.go:1901 +0xd9
runtime.materializeGCProg(0x80008, 0x4a2948, 0x7bfc20)
        c:/go/src/runtime/mbitmap.go:1925 +0x93
runtime.adjustframe(0x7bfb30, 0x7bfc20, 0x579680)
        c:/go/src/runtime/stack.go:696 +0x272
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc00004a000, 0x0, 0x0, 0x7fffffff, 0x4d7720, 0x7bfc20, 0x0, ...)
        c:/go/src/runtime/traceback.go:334 +0x111c
runtime.copystack(0xc00004a000, 0x200000)
        c:/go/src/runtime/stack.go:888 +0x298
runtime.newstack()
        c:/go/src/runtime/stack.go:1043 +0x219
runtime.morestack()
        c:/go/src/runtime/asm_amd64.s:449 +0x97

goroutine 1 [copystack]:
main.(*Sequencer).Init(0xc0004dff60, 0x0, 0x0)
        C:/Users/Uli/src/lz/main.go:10 +0x1af fp=0xc0004dff48 sp=0xc0004dff40 pc=0x49f93f
main.main()
        C:/Users/Uli/src/lz/main.go:23 +0x76 fp=0xc00055ff88 sp=0xc0004dff48 pc=0x49f9c6
runtime.main()
        c:/go/src/runtime/proc.go:203 +0x212 fp=0xc00055ffe0 sp=0xc00055ff88 pc=0x434952
runtime.goexit()
        c:/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc00055ffe8 sp=0xc00055ffe0 pc=0x45cd61
exit status 2
@bcmills

Member

@bcmills bcmills commented Feb 26, 2020

The [1 << 17]uint32 field is 512 KiB (524,288 bytes). If that ends up being allocated on the stack, I could imagine that it is triggering some bug in the interaction between the runtime and the OS and causing the stack growth to appear to be a wild memory fault.
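
The size and layout in question can be checked with a minimal sketch like the one below. It reuses the Sequencer type from the report and relies only on unsafe.Sizeof and unsafe.Offsetof, which are compile-time constants, so no large value is actually allocated. The helper names htableSize and bufOffset are illustrative and do not come from the report.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Sequencer mirrors the type from the report.
type Sequencer struct {
	htable [1 << 17]uint32
	buf    []byte
}

// htableSize returns the size in bytes of the htable field.
func htableSize() uintptr { return unsafe.Sizeof(Sequencer{}.htable) }

// bufOffset returns the byte offset of the buf field within Sequencer.
func bufOffset() uintptr { return unsafe.Offsetof(Sequencer{}.buf) }

func main() {
	fmt.Println("htable bytes:", htableSize())
	fmt.Println("htable KiB:  ", htableSize()/1024)
	fmt.Println("buf offset:  ", bufOffset())
	// The total struct size depends on the platform's slice header size.
	fmt.Println("struct bytes:", unsafe.Sizeof(Sequencer{}))
}
```

The first pointer-containing field (buf) starts only after 524288 bytes of pointer-free data, which is what makes the GC bitmap for this type so long.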

CC @randall77 @aclements @mknyszek

@bcmills bcmills changed the title from "Windows: Fatal error unexpected signal 0xC0000005 for small program that runs on Linux" to "runtime: "fatal error: unexpected signal" 0xC0000005 on Windows for a small program with a large allocation" Feb 26, 2020
@bcmills bcmills added this to the Backlog milestone Feb 26, 2020
@aclements

Member

@aclements aclements commented Feb 26, 2020

The crash isn't caused by the stack growth itself; it happens when we try to unroll a temporary GC program for scanning the large stack-allocated object. I'm not sure what's happening exactly, but that's just a bug. That should never crash.

@aclements aclements modified the milestones: Backlog, Go1.15 Feb 26, 2020
@aclements

Member

@aclements aclements commented Feb 26, 2020

@ulikunitz, do you know if this is reproducible with Go 1.13?

@aclements

Member

@aclements aclements commented Feb 26, 2020

Found it.

In this case, the GC bitmap for Sequencer will be 65536 zeros followed by 3 ones, or exactly 8 KiB of zero bytes, followed by a 0x7 byte. t.ptrdata is 524312 (the bytes of Sequencer up to and including the last pointer). The calculation for the scratch buffer size in materializeGCProg is wrong: (ptrdata/(8*sys.PtrSize)+pageSize-1)/pageSize. It needs to allocate 8193 bytes, but the ptrdata/(8*sys.PtrSize) rounds down and it allocates only 8192 bytes. That just happens to land on a page boundary, and I guess the next page happens to be unmapped, so runGCProg faults when it tries to write the last byte of the GC bitmap.
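
The rounding problem described above can be reproduced outside the runtime with plain integer arithmetic. This is only a sketch: ptrSize and pageSize are stand-ins assumed to be 8 and 8192 (matching the 8192-byte allocation mentioned above), and buggyPages, fixedPages and divRoundUp are illustrative names, not necessarily the code that landed in the fix.

```go
package main

import "fmt"

const (
	ptrSize  = 8    // assumed: bytes per pointer on amd64
	pageSize = 8192 // assumed: runtime page size in bytes
)

// buggyPages mirrors the expression quoted above: the inner division
// truncates before the result is rounded up to whole pages.
func buggyPages(ptrdata uintptr) uintptr {
	return (ptrdata/(8*ptrSize) + pageSize - 1) / pageSize
}

// divRoundUp returns ceil(n / d) for d > 0.
func divRoundUp(n, d uintptr) uintptr { return (n + d - 1) / d }

// fixedPages rounds up twice: first to whole bitmap bytes
// (one bit per pointer-sized word), then to whole pages.
func fixedPages(ptrdata uintptr) uintptr {
	bitmapBytes := divRoundUp(ptrdata, 8*ptrSize)
	return divRoundUp(bitmapBytes, pageSize)
}

func main() {
	const ptrdata = 524312 // value quoted above for Sequencer
	fmt.Println("bitmap bytes needed:", divRoundUp(ptrdata, 8*ptrSize))
	fmt.Println("buggy pages:", buggyPages(ptrdata))
	fmt.Println("fixed pages:", fixedPages(ptrdata))
}
```

With these assumptions the buggy expression yields one page (8192 bytes) while 8193 bytes are needed, so the last bitmap byte is written one byte past the end of the buffer.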

@aclements

Member

@aclements aclements commented Feb 26, 2020

I'm not sure how this could lead to bad g->status in ready specifically, but I bet the same bug is affecting you on Linux, and the out-of-bounds write is showing up as memory corruption instead of a segfault.

@gopherbot


@gopherbot gopherbot commented Feb 26, 2020

Change https://golang.org/cl/221197 mentions this issue: runtime: fix rounding in materializeGCProg

@aclements

Member

@aclements aclements commented Feb 26, 2020

@gopherbot, please open a backport to Go 1.14 and Go 1.15.

@gopherbot


@gopherbot gopherbot commented Feb 26, 2020

Backport issue(s) opened: #37480 (for 1.14), #37481 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@aclements

Member

@aclements aclements commented Feb 26, 2020

Oops. @gopherbot, please open a backport to Go 1.13.

@networkimprov


@networkimprov networkimprov commented Feb 26, 2020

Also needs 1.12 backport? There's a queue of fixes for the final 1.12.x release:
https://github.com/golang/go/milestone/135

gopherbot can't backport more than once per issue.

cc @dmitshur

@bcmills

Member

@bcmills bcmills commented Feb 26, 2020

@networkimprov, now that 1.14 has been released I would not expect any further 1.12.x releases.

@ulikunitz

Contributor Author

@ulikunitz ulikunitz commented Feb 26, 2020

I checked that 1.13 and 1.13.2 work, and the first broken version appears to be 1.13.3. So 1.12 is probably not affected.

@ulikunitz

Contributor Author

@ulikunitz ulikunitz commented Feb 26, 2020

No, go1.12.9 is also broken. It probably shares a CL with the go1.13 release branch.

@dmitshur

Member

@dmitshur dmitshur commented Feb 26, 2020

Oops. @gopherbot, please open a backport to Go 1.13.

Manually opened #37483 for Go 1.13.

@ulikunitz

Contributor Author

@ulikunitz ulikunitz commented Feb 26, 2020

Here is the report of what I tested under Windows.

go1.12     crash
go1.12.9   crash
go1.13     no crash
go1.13.2   no crash
go1.13.3   crash
go1.13.4   crash
go1.13.8   crash
go1.14     crash
@aclements

Member

@aclements aclements commented Feb 26, 2020

Yes, this bug was introduced in https://go-review.googlesource.com/c/134155 as part of adding support for stack objects, which was released in Go 1.12. Though, as @bcmills pointed out, we're probably not going to do any more 1.12 releases.

The crash is very sensitive to the behavior of the memory allocator, so the fact that it didn't reproduce on 1.13 or 1.13.2 doesn't indicate much. The problem has been there since that CL was committed.

@ulikunitz

Contributor Author

@ulikunitz ulikunitz commented Feb 26, 2020

Many thanks for the explanation and the fix. Meanwhile I tested https://golang.org/cl/221197 on Linux and Windows and can happily report that I was not able to reproduce any of the issues with the package on either platform.

@dmitshur

Member

@dmitshur dmitshur commented Feb 27, 2020

@aclements You've asked for this issue to be backported to Go 1.14 and 1.13. This seems like a serious issue. Do you think there is a workaround that can be used?

@aclements

Member

@aclements aclements commented Feb 28, 2020

@dmitshur, technically a workaround could be to pad all types of size [N*524288+1, N*524288+63] for any integer N so they're not that size any more. That's really awful (and it can apply to code you depend on but don't control), and you have to know that you've encountered this issue to even think about doing something like that. The bug is most likely to just show up as memory corruption, which you might not even notice.
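
To make the size range above concrete, here is a small sketch of a check for it. The constant 524288 is taken as 8 bits per byte times an assumed 8-byte pointer size times an assumed 8192-byte page size. The function name inAffectedRange is hypothetical, and the check only restates the range given above; it is not a recommended workaround.

```go
package main

import "fmt"

const (
	ptrSize  = 8    // assumed: bytes per pointer on amd64
	pageSize = 8192 // assumed: runtime page size in bytes

	// period is the number of data bytes described by one page of bitmap.
	period = 8 * ptrSize * pageSize // 524288
)

// inAffectedRange reports whether size lies in
// [N*524288+1, N*524288+63] for some integer N.
func inAffectedRange(size uintptr) bool {
	r := size % period
	return r >= 1 && r <= 63
}

func main() {
	for _, size := range []uintptr{524288, 524312, 524352, 2*524288 + 8} {
		fmt.Printf("size %d: affected=%v\n", size, inAffectedRange(size))
	}
}
```

An exact multiple of 524288 is not in the range, because the truncating division loses nothing in that case.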

So, practically speaking, I'd say there isn't really a workaround.

@ulikunitz

Contributor Author

@ulikunitz ulikunitz commented Mar 20, 2020

Hi, apparently the CL https://golang.org/cl/221197 didn't make it into 1.14.1. #37480 has been moved to milestone go1.14.2, I guess because the CL review has not been completed.

Is there anything I can do to move the CL forward?

@dmitshur

Member

@dmitshur dmitshur commented Mar 20, 2020

@ulikunitz Yes, the review needs to be completed. The CL will then get submitted, backported, and be a part of the next minor release.

I've left a ping for @aclements on the CL in case the notification fell through.

@aclements

Member

@aclements aclements commented Mar 20, 2020

Oof, sorry this missed 1.14.1. Things have been crazy. I've updated the CL.

@dmitshur

Member

@dmitshur dmitshur commented Mar 20, 2020

No problem Austin, it'll make its way into the next minor release. Thank you!

@gopherbot gopherbot closed this in ab5a40c Mar 20, 2020
@gopherbot


@gopherbot gopherbot commented Mar 20, 2020

Change https://golang.org/cl/224417 mentions this issue: [release-branch.go1.14] runtime: fix rounding in materializeGCProg

@gopherbot


@gopherbot gopherbot commented Mar 20, 2020

Change https://golang.org/cl/224418 mentions this issue: [release-branch.go1.13] runtime: fix rounding in materializeGCProg
