Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: "fatal error: PowerRegisterSuspendResumeNotification failure" when running in Windows docker containers #35447

Open
KnicKnic opened this issue Nov 8, 2019 · 7 comments

Comments

@KnicKnic
Copy link

@KnicKnic KnicKnic commented Nov 8, 2019

What version of Go are you using (go version)?

go version go1.13.3 windows/amd64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

go env Output
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\nmaliwa\AppData\Local\go-build
set GOENV=C:\Users\nmaliwa\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\nmaliwa\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\nmaliwa\AppData\Local\Temp\go-build924229414=/tmp/go-build -gno-record-gcc-switches

What did you do?

package main
import "fmt"
func main() {
    fmt.Println("hello world")
}

Ran it inside a windows docker container
docker run -it -v C:\test:c:\data mcr.microsoft.com/windows/nanoserver:1809

What did you expect to see?

hello world

What did you see instead?

c:\data\test.exe
fatal error: PowerRegisterSuspendResumeNotification failure

runtime stack:
runtime.throw(0x4d916b, 0x2e)
        c:/go/src/runtime/panic.go:774 +0x79 fp=0x82fde0 sp=0x82fdb0 pc=0x42cea9
runtime.monitorSuspendResume()
        c:/go/src/runtime/os_windows.go:294 +0x1a9 fp=0x82fe80 sp=0x82fde0 pc=0x4298c9
runtime.goenvs()
        c:/go/src/runtime/os_windows.go:531 +0x1ba fp=0x82fed8 sp=0x82fe80 pc=0x42a24a
runtime.schedinit()
        c:/go/src/runtime/proc.go:554 +0xa9 fp=0x82ff30 sp=0x82fed8 pc=0x42f899
runtime.rt0_go(0x82ff60, 0x7ff9d77ca27f, 0x82ff60, 0x0, 0x7ff9d77ca27f, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        c:/go/src/runtime/asm_amd64.s:214 +0x13d fp=0x82ff38 sp=0x82ff30 pc=0x452dfd

This is related to change https://go-review.googlesource.com/c/go/+/191957/ from issue #31528

The problem is that the PowerRegisterSuspendResumeNotification returns error 2 - ERROR_FILE_NOT_FOUND

This also happens with the latest ltsc container images mcr.microsoft.com/windows/servercore:ltsc2019

I found someone else experiencing the same issue https://social.msdn.microsoft.com/Forums/en-US/f13a5e6c-e57d-4790-88db-ea9757ca3846/running-golang-program-in-azure-app-service-console-give-fatal-error?forum=windowsazurewebsitespreview

It would be nice if when calling PowerRegisterSuspendResumeNotification we don't throw on error 2.

@zx2c4

@KnicKnic KnicKnic changed the title Cannot run golang programs in windows docker containers Cannot run go programs in windows docker containers Nov 8, 2019
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Nov 8, 2019

@bcmills bcmills changed the title Cannot run go programs in windows docker containers runtime: "fatal error: PowerRegisterSuspendResumeNotification failure" when running in Windows docker containers Nov 8, 2019
@bcmills bcmills added this to the Unplanned milestone Nov 8, 2019
@alexbrainman

This comment has been minimized.

Copy link
Member

@alexbrainman alexbrainman commented Nov 9, 2019

@KnicKnic

Thank you for report.

We should at least add Windows error number to the panic message. Lucky we have you, who worked out what the error number is.

The problem is that the PowerRegisterSuspendResumeNotification returns error 2 - ERROR_FILE_NOT_FOUND

This error looks strange. Maybe PowerRegisterSuspendResumeNotification is broken when running in docker container. Unfortunetly I don't have docker to verify the error.

@jstarks should PowerRegisterSuspendResumeNotification work in container? Thank you.

Alex

@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Nov 9, 2019

We can look into whether this could be supported in a future version of Windows containers. I think it would be reasonable to ignore failure in this case.

The bigger question to me is whether the change that introduced this regression was the right one at all. The Windows kernel team changed timer behavior in Windows 8 to stop advancing relative timeouts on wake. Otherwise when you open your laptop lid, every timer in the system goes off all at once and you get a bunch of unpredictable errors. Software is generally written to assume that local processes will make forward progress over reasonable time periods, and if they don't then something is wrong. When the machine is asleep, this assumption is violated. By making relative timers behave like threads, so that they both run together or they both don't, the illusion is maintained. You can claim these programs are buggy, but they obviously exist. Watchdog timers are well-known constructs.

This was a conscious design decision in Windows, and so it's disappointing to see the Go runtime second guess this several years later in a bug fix.

As far as behavior on Linux, there is clearly no consensus in issue #24595, which discusses this same problem. And indeed you can see that the CLOCK_MONOTONIC/CLOCK_BOOTTIME convergence was reverted in the kernel exactly because of the reason we stopped advancing time in Windows: random code has random failures due to timeouts. See https://lkml.org/lkml/2018/4/26/929 for a brief justification.

But despite the lack of consensus on Linux, for some reason the change in Windows behavior was merged already.

I think the change that introduced this should be reverted and the original proposal to use QueryUnbiasedInterruptTime (to match the behavior of WaitForSingleObject) should be adopted.

@mpx @networkimprov @ianlancetaylor seemed interested in this topic from the other issue.

@networkimprov

This comment has been minimized.

Copy link

@networkimprov networkimprov commented Nov 9, 2019

Filed #35482 re reverting the Windows change.

@alexbrainman

This comment has been minimized.

Copy link
Member

@alexbrainman alexbrainman commented Nov 10, 2019

We can look into whether this could be supported in a future version of Windows containers.

Sounds good.

I think it would be reasonable to ignore failure in this case.

Do you suggest we always ignore PowerRegisterSuspendResumeNotification failure? Or you suggest to ignore it on particular Windows version? How do I decide when to ignore PowerRegisterSuspendResumeNotification failure?

The bigger question to me is whether the change that introduced this regression was the right one at all. The Windows kernel team changed timer behavior in Windows 8 to stop advancing relative timeouts on wake. Otherwise when you open your laptop lid, every timer in the system goes off all at once and you get a bunch of unpredictable errors.

Sounds reasonable to me. But you could also see problems with your design decision too - see, for example #31528. It could be confusing when your timer takes longer after you have laptop lid closed.

I will let Ian and @zx2c4 fight that fight with you.

Thank you for your comments.

Alex

@northtyphoon

This comment has been minimized.

Copy link

@northtyphoon northtyphoon commented Nov 11, 2019

What golang version can I use to workaround the issue now?

@networkimprov

This comment has been minimized.

Copy link

@networkimprov networkimprov commented Nov 11, 2019

Go 1.12.* and 1.13.2 don't have the code which causes the error.

Also, patching your local c:/go/src/runtime/os_windows.go to ignore the error would be straightforward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.