windows/service: implement graceful shutdown when run as windows service #73292

Merged
merged 3 commits into from Feb 20, 2019
1 change: 1 addition & 0 deletions pkg/windows/service/BUILD
@@ -11,6 +11,7 @@ go_library(
     importpath = "k8s.io/kubernetes/pkg/windows/service",
     deps = select({
         "@io_bazel_rules_go//go/platform:windows": [
+            "//staging/src/k8s.io/apiserver/pkg/server:go_default_library",
             "//vendor/golang.org/x/sys/windows:go_default_library",
             "//vendor/golang.org/x/sys/windows/svc:go_default_library",
             "//vendor/k8s.io/klog:go_default_library",
30 changes: 27 additions & 3 deletions pkg/windows/service/service.go
@@ -20,7 +20,9 @@ package service

 import (
 	"os"
+	"time"

+	"k8s.io/apiserver/pkg/server"
 	"k8s.io/klog"

 	"golang.org/x/sys/windows"
@@ -80,9 +82,31 @@ Loop:
 			case svc.Interrogate:
 				s <- c.CurrentStatus
 			case svc.Stop, svc.Shutdown:
-				s <- svc.Status{State: svc.Stopped}
-				// TODO: Stop the kubelet gracefully instead of killing the process
-				os.Exit(0)
+				klog.Infof("Service stopping")
+				// We need to translate this request into a signal that can be handled by the signal handler
+				// handling shutdowns normally (currently apiserver/pkg/server/signal.go).
+				// If we do not do this, our main threads won't be notified of the upcoming shutdown.
+				// Since Windows services do not use any console, we cannot simply generate a CTRL_BREAK_EVENT
+				// but need a dedicated notification mechanism.
+				graceful := server.RequestShutdown()
+
+				// Free up the control handler and let us terminate as gracefully as possible.
+				// If that takes too long, the service controller will kill the remaining threads.
+				// As per https://docs.microsoft.com/en-us/windows/desktop/services/service-control-handler-function
+				s <- svc.Status{State: svc.StopPending}
+
+				// If we cannot exit gracefully, we really only can exit our process, so at least the
+				// service manager will think that we gracefully exited. At the time of writing this comment this is
+				// needed for applications that do not use signals (e.g. kube-proxy)
+				if !graceful {
+					go func() {
+						// Ensure the SCM was notified (The operation above (send to s) was received and communicated to the
+						// service control manager - so it doesn't look like the service crashes)
+						time.Sleep(1 * time.Second)
Contributor

Is there any way to explicitly flush or get an ack from the SCM, rather than just waiting for an arbitrary duration?

Contributor Author

Not without effort that you really don't want to spend here (you'd have to poll and make Windows API calls, i.e. syscalls, manually).

Also, 1 second is more than enough time: we basically only need to make sure that the function that spawns this goroutine has exited, since that is when the relevant syscall is performed, and that should only take a few hundred nanoseconds.
Also keep in mind that this is the edge case (for programs that do not support graceful shutdowns).

Contributor

The SCM doesn't have a way to ACK; it works in the other direction. This status is actually the kubelet's ack to the service control manager that the stop request was received and is being processed. If the service control manager doesn't get this pending state, it will assume the process is hung and forcefully kill it. If the process is still around after the wait (30 seconds, or more if we give it a hint when passing the stop pending status), the service control manager will poll this status again.

+						os.Exit(0)
+					}()
+				}
+				break Loop
 			}
 		}
 	}
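The stop branch above and the review thread inside it describe an ordering: request a graceful shutdown, report the pending state, and only fall back to a delayed exit when nothing accepted the request. The following is a minimal, platform-neutral sketch of that ordering and is not the code from this PR. The names `state`, `status`, `handleStop`, and `runFallback` are hypothetical stand-ins (the real code uses `svc.Status` and `os.Exit`), and the exit function is injected so the sketch can run without terminating the process.

```go
package main

import (
	"fmt"
	"time"
)

// state and status are hypothetical stand-ins for svc.State and svc.Status;
// they exist only so this sketch runs on any platform.
type state string

const stopPending state = "StopPending"

type status struct{ State state }

// handleStop models the Stop/Shutdown branch: ask for a graceful shutdown,
// report the pending state, and fall back to a delayed exit only when no
// shutdown handler accepted the request.
func handleStop(requestShutdown func() bool, s chan<- status, grace time.Duration, exit func(int)) {
	graceful := requestShutdown()

	// Acknowledge the stop request before doing anything slow.
	s <- status{State: stopPending}

	if !graceful {
		go func() {
			// Leave time for the pending status to be delivered.
			time.Sleep(grace)
			exit(0)
		}()
	}
}

// runFallback drives handleStop once and reports what was observed:
// the reported state, the exit code, and whether exit was called at all.
func runFallback(graceful bool) (reported state, exitCode int, exited bool) {
	s := make(chan status, 1)
	codes := make(chan int, 1)
	handleStop(
		func() bool { return graceful },
		s,
		10*time.Millisecond, // shortened grace period, assumption for the demo
		func(c int) { codes <- c },
	)
	reported = (<-s).State
	select {
	case exitCode = <-codes:
		exited = true
	case <-time.After(200 * time.Millisecond):
		// No exit observed: the graceful path leaves termination to the main threads.
	}
	return reported, exitCode, exited
}

func main() {
	fmt.Println(runFallback(false)) // no handler: delayed exit is triggered
	fmt.Println(runFallback(true))  // handler notified: no forced exit
}
```

In both calls the pending state is reported first. The forced exit happens only in the first call, which corresponds to programs that never installed a signal handler.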
24 changes: 20 additions & 4 deletions staging/src/k8s.io/apiserver/pkg/server/signal.go
@@ -22,22 +22,38 @@ import (
 )

 var onlyOneSignalHandler = make(chan struct{})
+var shutdownHandler chan os.Signal

 // SetupSignalHandler registered for SIGTERM and SIGINT. A stop channel is returned
 // which is closed on one of these signals. If a second signal is caught, the program
 // is terminated with exit code 1.
 func SetupSignalHandler() <-chan struct{} {
 	close(onlyOneSignalHandler) // panics when called twice

+	shutdownHandler = make(chan os.Signal, 2)
+
 	stop := make(chan struct{})
-	c := make(chan os.Signal, 2)
-	signal.Notify(c, shutdownSignals...)
+	signal.Notify(shutdownHandler, shutdownSignals...)
 	go func() {
-		<-c
+		<-shutdownHandler
 		close(stop)
-		<-c
+		<-shutdownHandler
 		os.Exit(1) // second signal. Exit directly.
 	}()

 	return stop
 }
+
+// RequestShutdown emulates a received event that is considered as shutdown signal (SIGTERM/SIGINT)
+// This returns whether a handler was notified
+func RequestShutdown() bool {
+	if shutdownHandler != nil {
+		select {
+		case shutdownHandler <- shutdownSignals[0]:
+			return true
+		default:
+		}
+	}
+
+	return false
+}
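The `RequestShutdown` function relies on a non-blocking send into a buffered channel, guarded by a nil check for the case where no handler was ever set up. Below is a small standalone sketch of that pattern, under stated assumptions: `notifier`, `setup`, `request`, `demo`, and `bufferFull` are hypothetical names, `struct{}` replaces `os.Signal`, and the second-signal `os.Exit(1)` step is left out so the example stays runnable.

```go
package main

import "fmt"

// notifier is a hypothetical stand-in for the package-level shutdownHandler
// channel; struct{} replaces os.Signal to keep the sketch platform neutral.
type notifier struct {
	ch chan struct{}
}

// setup creates the buffered channel and returns a stop channel that is
// closed when the first shutdown request arrives.
func (n *notifier) setup() <-chan struct{} {
	n.ch = make(chan struct{}, 2)
	stop := make(chan struct{})
	go func() {
		<-n.ch
		close(stop)
	}()
	return stop
}

// request performs a non-blocking send and reports whether it was accepted.
func (n *notifier) request() bool {
	if n.ch == nil {
		return false // no handler was ever installed
	}
	select {
	case n.ch <- struct{}{}:
		return true
	default:
		return false // buffer full, nothing was delivered
	}
}

// demo returns the result of a request before setup, the result after setup,
// and whether the stop channel was closed.
func demo() (before, after, stopped bool) {
	n := &notifier{}
	before = n.request()
	stop := n.setup()
	after = n.request()
	<-stop // blocks until the goroutine in setup closes the channel
	return before, after, true
}

// bufferFull reports whether a third request is rejected when nobody drains
// a channel with capacity 2.
func bufferFull() bool {
	n := &notifier{ch: make(chan struct{}, 2)}
	n.request()
	n.request()
	return !n.request()
}

func main() {
	fmt.Println(demo())
	fmt.Println(bufferFull())
}
```

One consequence of this design is that a `false` result does not distinguish between "no handler installed" and "buffer full". In the service code above, both cases lead to the delayed-exit fallback.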