Add health and readiness probes to controller manager#4459
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds health and readiness probe support for the controller manager, wiring controller-runtime’s health endpoints into the manager binary and enabling Kubernetes probe configuration in the shipped manifests/charts.
Changes:
- Add
--health-probe-bind-addressflag and configure controller-runtimeHealthProbeBindAddresson the manager. - Register
/healthzand/readyzendpoints via controller-runtime health checks. - Add
livenessProbeandreadinessProbeto the kustomize manager manifest and Helm controller deployments.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| main.go | Adds controller-runtime health/ready endpoints and CLI flag to configure probe bind address. |
| config/manager/manager.yaml | Adds Kubernetes liveness/readiness probes to the manager container. |
| charts/gha-runner-scale-set-controller/templates/deployment.yaml | Adds liveness/readiness probes to the scale-set controller deployment. |
| charts/gha-runner-scale-set-controller-experimental/templates/_controller_template.tpl | Adds liveness/readiness probes to the experimental controller container template. |
| charts/actions-runner-controller/templates/deployment.yaml | Adds liveness/readiness probes to the actions-runner-controller deployment. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| livenessProbe: | ||
| httpGet: | ||
| path: /healthz | ||
| port: 8081 | ||
| initialDelaySeconds: 15 | ||
| periodSeconds: 20 | ||
| readinessProbe: | ||
| httpGet: | ||
| path: /readyz | ||
| port: 8081 |
There was a problem hiding this comment.
These probes hard-code port 8081, but the manager now supports --health-probe-bind-address and this chart allows users to append arbitrary extraArgs. If a user sets a different bind address/port via extraArgs, the probes will start failing and repeatedly restart the pod. Suggest adding a chart value for the health probe bind address/port, wiring it into both the container args and the probe port field (or at least deriving the probe port from the configured bind address).
| livenessProbe: | |
| httpGet: | |
| path: /healthz | |
| port: 8081 | |
| initialDelaySeconds: 15 | |
| periodSeconds: 20 | |
| readinessProbe: | |
| httpGet: | |
| path: /readyz | |
| port: 8081 | |
| {{- $healthProbeBindAddress := ":8081" -}} | |
| {{- with .Values.controller.manager.extraArgs }} | |
| {{- if kindIs "slice" . }} | |
| {{- range . }} | |
| {{- if and (kindIs "string" .) (hasPrefix "--health-probe-bind-address=" .) }} | |
| {{- $healthProbeBindAddress = trimPrefix "--health-probe-bind-address=" . -}} | |
| {{- end }} | |
| {{- end }} | |
| {{- end }} | |
| {{- end }} | |
| {{- $healthProbePort := 8081 -}} | |
| {{- $healthProbeBindParts := splitList ":" $healthProbeBindAddress -}} | |
| {{- if gt (len $healthProbeBindParts) 1 }} | |
| {{- $healthProbePort = (last $healthProbeBindParts | int) -}} | |
| {{- end }} | |
| livenessProbe: | |
| httpGet: | |
| path: /healthz | |
| port: {{ $healthProbePort }} | |
| initialDelaySeconds: 15 | |
| periodSeconds: 20 | |
| readinessProbe: | |
| httpGet: | |
| path: /readyz | |
| port: {{ $healthProbePort }} |
There was a problem hiding this comment.
Since (technically) we can use extraArgs in experimental chart, this feeedback mighy be reasonable, but I think it would introduce unnecessary complexity and a lack of consistency with other charts.
But, if this change is necessary, I will create a fixup
There was a problem hiding this comment.
I think this is reasonable, since we aim to use current experimental charts as a main chart at some point. Just I would change one thing, if health check is not defined, we don't expose it. I'd prefer this feature being off by default (in values.yaml), and only enabling it so people are not surprised when they upgrade, the port shows up all of the sudden. Please correct me if I'm wrong
There was a problem hiding this comment.
Thank you for your review!
Since probes are enabled by default in many k8s custom controllers, I don’t think it’s unreasonable to have them enabled by default... 🤔
However, from a backward compatibility perspective, it would make sense to disable them by default (as they might be blocked by network policies or conflict with other ports or another reasons).
So, I think it’s a good idea to start with an opt-in approach, as you suggested 👍
nikola-jokic
left a comment
There was a problem hiding this comment.
The PR looks great, I just have one suggestion
| livenessProbe: | ||
| httpGet: | ||
| path: /healthz | ||
| port: 8081 | ||
| initialDelaySeconds: 15 | ||
| periodSeconds: 20 | ||
| readinessProbe: | ||
| httpGet: | ||
| path: /readyz | ||
| port: 8081 |
There was a problem hiding this comment.
I think this is reasonable, since we aim to use current experimental charts as a main chart at some point. Just I would change one thing, if health check is not defined, we don't expose it. I'd prefer this feature being off by default (in values.yaml), and only enabling it so people are not surprised when they upgrade, the port shows up all of the sudden. Please correct me if I'm wrong
0748b0a to
a4be860
Compare
What
Add health and readiness probes to the controller manager
HealthProbeBindAddress(:8081) on the controller-runtime manager/healthz(liveness) and/readyz(readiness) checks viahealthz.Ping--health-probe-bind-addressCLI flag to allow customizationlivenessProbeandreadinessProbeto all Helm chart deployment templates andconfig/manager/manager.yamlWhy
Controller manager currently has no health or readiness probes configured. This means: