Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add support for startup probes #2669

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion designs/component-config.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# ComponentConfig Controller Runtime Support
Author: @christopherhein

Last Updated on: 03/02/2020
Last Updated on: 31/01/2024

## Table of Contents

Expand Down Expand Up @@ -91,6 +91,7 @@ type ManagerConfiguration interface {

GetReadinessEndpointName() string
GetLivenessEndpointName() string
GetStartupEndpointName() string

GetPort() int
GetHost() string
Expand Down Expand Up @@ -161,6 +162,7 @@ type ControllerManagerConfigurationHealth struct {

ReadinessEndpointName string `json:"readinessEndpointName,omitempty"`
LivenessEndpointName string `json:"livenessEndpointName,omitempty"`
StartupEndpointName string `json:"startupEndpointName,omitempty"`
}
```

Expand Down
3 changes: 3 additions & 0 deletions designs/move-cluster-specific-code-out-of-manager.md
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,9 @@ type Manager interface {
// AddReadyzCheck allows you to add Readyz checker
AddReadyzCheck(name string, check healthz.Checker) error

// AddStartzCheck allows you to add Startz checker
AddStartzCheck(name string, check healthz.Checker) error

// Start starts all registered Controllers and blocks until the Stop channel is closed.
// Returns an error if there is an error starting any controller.
// If LeaderElection is used, the binary must be exited immediately after this returns,
Expand Down
4 changes: 4 additions & 0 deletions pkg/config/v1alpha1/types.go
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,10 @@ type ControllerHealth struct {
// LivenessEndpointName, defaults to "healthz"
// +optional
LivenessEndpointName string `json:"livenessEndpointName,omitempty"`

// StartupEndpointName, defaults to "startz"
// +optional
StartupEndpointName string `json:"startupEndpointName,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is where the other endpoint names live but this is deprecated. Do we want to keep it here with the risk of it being moved and/or removed for configuration?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware of this file being deprecated. Where should it be instead? I didn't find another place where the endpoint name is defined. I suppose that the risk for StartupEndpointName is the same than for the others, is it correct?

Copy link
Member

@troy0820 troy0820 Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// ControllerHealth defines the health configs.
//
// Deprecated: The component config package has been deprecated and will be removed in a future release. Users should migrate to their own config implementation, please share feedback in #895.

There is this PR that removes this file #2648 because most of the config is deprecated. Adding this functionality there doesn't seem like it will stay long as the plan would be to get rid of this configuration.

Now that doesn't mean it doesn't have a place defined in this repo, but I would probably place it as an option and not within the deprecated type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thank you. However, from my point of view, isn't it better to remove this config from my change when #2648 is merged? Because right now it's still in progress

After merging your change, I can rebase my change to bring your changes and then I can remove my code from the deprecated files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yours may get merged first, so with that if the maintainers lgtm/approve, I can work on finding a spot for this to move it out of the deprecated struct.

}

// ControllerWebhook defines the webhook server for the controller.
Expand Down
4 changes: 2 additions & 2 deletions pkg/healthz/doc.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,8 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

// Package healthz contains helpers from supporting liveness and readiness endpoints.
// (often referred to as healthz and readyz, respectively).
// Package healthz contains helpers from supporting liveness, readiness and startup endpoints.
// (often referred to as healthz, readyz and startz, respectively).
//
// This package draws heavily from the apiserver's healthz package
// ( https://github.com/kubernetes/apiserver/tree/master/pkg/server/healthz )
Expand Down
29 changes: 29 additions & 0 deletions pkg/manager/internal.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ const (

defaultReadinessEndpoint = "/readyz"
defaultLivenessEndpoint = "/healthz"
defaultStartupEndpoint = "/startz"
)

var _ Runnable = &controllerManager{}
Expand Down Expand Up @@ -94,12 +95,18 @@ type controllerManager struct {
// Liveness probe endpoint name
livenessEndpointName string

// Startup probe endpoint name
startupEndpointName string

// Readyz probe handler
readyzHandler *healthz.Handler

// Healthz probe handler
healthzHandler *healthz.Handler

// Startz probe handler
startzHandler *healthz.Handler

// pprofListener is used to serve pprof
pprofListener net.Listener

Expand Down Expand Up @@ -213,6 +220,23 @@ func (cm *controllerManager) AddReadyzCheck(name string, check healthz.Checker)
return nil
}

// AddStartzCheck allows you to add Startz checker.
func (cm *controllerManager) AddStartzCheck(name string, check healthz.Checker) error {
cm.Lock()
defer cm.Unlock()

if cm.started {
return fmt.Errorf("unable to add new checker because healthz endpoint has already been created")
}

if cm.startzHandler == nil {
cm.startzHandler = &healthz.Handler{Checks: map[string]healthz.Checker{}}
}

cm.startzHandler.Checks[name] = check
return nil
}

func (cm *controllerManager) GetHTTPClient() *http.Client {
return cm.cluster.GetHTTPClient()
}
Expand Down Expand Up @@ -283,6 +307,11 @@ func (cm *controllerManager) addHealthProbeServer() error {
// Append '/' suffix to handle subpaths
mux.Handle(cm.livenessEndpointName+"/", http.StripPrefix(cm.livenessEndpointName, cm.healthzHandler))
}
if cm.startzHandler != nil {
mux.Handle(cm.startupEndpointName, http.StripPrefix(cm.startupEndpointName, cm.startzHandler))
// Append '/' suffix to handle subpaths
mux.Handle(cm.startupEndpointName+"/", http.StripPrefix(cm.startupEndpointName, cm.startzHandler))
}

return cm.add(&server{
Kind: "health probe",
Expand Down
1 change: 1 addition & 0 deletions pkg/manager/internal/integration/manager_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -158,6 +158,7 @@ var _ = Describe("manger.Manager Start", func() {
// Configure health probes.
Expect(mgr.AddReadyzCheck("webhook", mgr.GetWebhookServer().StartedChecker())).To(Succeed())
Expect(mgr.AddHealthzCheck("webhook", mgr.GetWebhookServer().StartedChecker())).To(Succeed())
Expect(mgr.AddStartzCheck("webhook", mgr.GetWebhookServer().StartedChecker())).To(Succeed())

// Set up Driver reconciler (using v2).
driverReconciler := &DriverReconciler{
Expand Down
15 changes: 15 additions & 0 deletions pkg/manager/manager.go
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,9 @@ type Manager interface {
// AddReadyzCheck allows you to add Readyz checker
AddReadyzCheck(name string, check healthz.Checker) error

// AddStartzCheck allows you to add Startz checker
AddStartzCheck(name string, check healthz.Checker) error

// Start starts all registered Controllers and blocks until the context is cancelled.
// Returns an error if there is an error starting any controller.
//
Expand Down Expand Up @@ -228,6 +231,9 @@ type Options struct {
// Liveness probe endpoint name, defaults to "healthz"
LivenessEndpointName string

// Startup probe endpoint name, defaults to "healthz"
StartupEndpointName string

// PprofBindAddress is the TCP address that the controller should bind to
// for serving pprof.
// It can be set to "" or "0" to disable the pprof serving.
Expand Down Expand Up @@ -430,6 +436,7 @@ func New(config *rest.Config, options Options) (Manager, error) {
healthProbeListener: healthProbeListener,
readinessEndpointName: options.ReadinessEndpointName,
livenessEndpointName: options.LivenessEndpointName,
startupEndpointName: options.StartupEndpointName,
pprofListener: pprofListener,
gracefulShutdownTimeout: *options.GracefulShutdownTimeout,
internalProceduresStop: make(chan struct{}),
Expand Down Expand Up @@ -480,6 +487,10 @@ func (o Options) AndFrom(loader config.ControllerManagerConfiguration) (Options,
o.LivenessEndpointName = newObj.Health.LivenessEndpointName
}

if o.StartupEndpointName == "" && newObj.Health.StartupEndpointName != "" {
o.StartupEndpointName = newObj.Health.StartupEndpointName
}

if o.WebhookServer == nil {
port := 0
if newObj.Webhook.Port != nil {
Expand Down Expand Up @@ -640,6 +651,10 @@ func setOptionsDefaults(options Options) Options {
options.LivenessEndpointName = defaultLivenessEndpoint
}

if options.StartupEndpointName == "" {
options.StartupEndpointName = defaultStartupEndpoint
}

if options.newHealthProbeListener == nil {
options.newHealthProbeListener = defaultHealthProbeListener
}
Expand Down
68 changes: 65 additions & 3 deletions pkg/manager/manager_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,7 @@ var _ = Describe("manger.Manager", func() {
HealthProbeBindAddress: "6060",
ReadinessEndpointName: "/readyz",
LivenessEndpointName: "/livez",
StartupEndpointName: "/startz",
},
Webhook: v1alpha1.ControllerWebhook{
Port: &port,
Expand All @@ -170,6 +171,7 @@ var _ = Describe("manger.Manager", func() {
Expect(m.HealthProbeBindAddress).To(Equal("6060"))
Expect(m.ReadinessEndpointName).To(Equal("/readyz"))
Expect(m.LivenessEndpointName).To(Equal("/livez"))
Expect(m.StartupEndpointName).To(Equal("/startz"))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.Port).To(Equal(port))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.Host).To(Equal("localhost"))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.CertDir).To(Equal("/certs"))
Expand Down Expand Up @@ -201,6 +203,7 @@ var _ = Describe("manger.Manager", func() {
HealthProbeBindAddress: "6060",
ReadinessEndpointName: "/readyz",
LivenessEndpointName: "/livez",
StartupEndpointName: "/startz",
},
Webhook: v1alpha1.ControllerWebhook{
Port: &port,
Expand Down Expand Up @@ -229,6 +232,7 @@ var _ = Describe("manger.Manager", func() {
HealthProbeBindAddress: "5000",
ReadinessEndpointName: "/readiness",
LivenessEndpointName: "/liveness",
StartupEndpointName: "/startup",
WebhookServer: webhook.NewServer(webhook.Options{
Port: 8080,
Host: "example.com",
Expand All @@ -251,6 +255,7 @@ var _ = Describe("manger.Manager", func() {
Expect(m.HealthProbeBindAddress).To(Equal("5000"))
Expect(m.ReadinessEndpointName).To(Equal("/readiness"))
Expect(m.LivenessEndpointName).To(Equal("/liveness"))
Expect(m.StartupEndpointName).To(Equal("/startup"))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.Port).To(Equal(8080))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.Host).To(Equal("example.com"))
Expect(m.WebhookServer.(*webhook.DefaultServer).Options.CertDir).To(Equal("/pki"))
Expand Down Expand Up @@ -1399,6 +1404,11 @@ var _ = Describe("manger.Manager", func() {
})

Context("should start serving health probes", func() {

const (
namedCheck = "check"
)

var listener net.Listener
var opts Options

Expand Down Expand Up @@ -1452,7 +1462,6 @@ var _ = Describe("manger.Manager", func() {
Expect(err).NotTo(HaveOccurred())

res := fmt.Errorf("not ready yet")
namedCheck := "check"
err = m.AddReadyzCheck(namedCheck, func(_ *http.Request) error { return res })
Expect(err).NotTo(HaveOccurred())

Expand Down Expand Up @@ -1507,7 +1516,6 @@ var _ = Describe("manger.Manager", func() {
Expect(err).NotTo(HaveOccurred())

res := fmt.Errorf("not alive")
namedCheck := "check"
err = m.AddHealthzCheck(namedCheck, func(_ *http.Request) error { return res })
Expect(err).NotTo(HaveOccurred())

Expand Down Expand Up @@ -1547,14 +1555,68 @@ var _ = Describe("manger.Manager", func() {
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusOK))

// Check readiness path for individual check
// Check liveness path for individual check
livenessEndpoint = fmt.Sprint("http://", listener.Addr().String(), path.Join(defaultLivenessEndpoint, namedCheck))
res = nil
resp, err = http.Get(livenessEndpoint)
Expect(err).NotTo(HaveOccurred())
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusOK))
})

It("should serve startup endpoint", func() {
opts.HealthProbeBindAddress = ":0"
m, err := New(cfg, opts)
Expect(err).NotTo(HaveOccurred())

res := fmt.Errorf("not alive")
err = m.AddStartzCheck(namedCheck, func(_ *http.Request) error { return res })
Expect(err).NotTo(HaveOccurred())

ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go func() {
defer GinkgoRecover()
Expect(m.Start(ctx)).NotTo(HaveOccurred())
}()
<-m.Elected()

startupEndpoint := fmt.Sprint("http://", listener.Addr().String(), defaultStartupEndpoint)

// Controller is not ready
resp, err := http.Get(startupEndpoint)
Expect(err).NotTo(HaveOccurred())
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusInternalServerError))

// Controller is ready
res = nil
resp, err = http.Get(startupEndpoint)
Expect(err).NotTo(HaveOccurred())
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusOK))

// Check startup path without trailing slash without redirect
startupEndpoint = fmt.Sprint("http://", listener.Addr().String(), defaultStartupEndpoint)
res = nil
httpClient := http.Client{
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse // Do not follow redirect
},
}
resp, err = httpClient.Get(startupEndpoint)
Expect(err).NotTo(HaveOccurred())
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusOK))

// Check startup path for individual check
startupEndpoint = fmt.Sprint("http://", listener.Addr().String(), path.Join(defaultStartupEndpoint, namedCheck))
res = nil
resp, err = http.Get(startupEndpoint)
Expect(err).NotTo(HaveOccurred())
defer resp.Body.Close()
Expect(resp.StatusCode).To(Equal(http.StatusOK))
})
})

Context("should start serving pprof", func() {
Expand Down