For Ansible/Helm-based operators, add Liveness and Readiness probe #4326

Merged
merged 3 commits into from
Dec 18, 2020
91 changes: 91 additions & 0 deletions changelog/fragments/liveness_readiness_probe_for_operator.yaml
@@ -0,0 +1,91 @@
# entries is a list of entries to include in
# release notes and/or the migration guide
entries:
- description: >
For Helm-based operators, added liveness and readiness probes by default using [`healthz.Ping`](https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/healthz#CheckHandler).

# kind is one of:
# - addition
# - change
# - deprecation
# - removal
# - bugfix
kind: "addition"

# Is this a breaking change?
breaking: false

# Migration can be defined to automatically add a section to
# the migration guide. This is required for breaking changes.
migration:
header: (Optional) For Helm-based operators, add Liveness and Readiness probe
body: >
New projects built with the tool will have the probes configured by default. The `/healthz` and
`/readyz` endpoints are now available in the provided base image.

To use them in a pre-existing project, update the Dockerfile to use the latest
release base image, then add the following to the `manager` container in
`config/default/manager/manager.yaml`:
Review comment (Contributor):
The path to the manager resource seems to be wrong. Should be config/manager/manager.yaml instead.


```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8081
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /readyz
    port: 8081
  initialDelaySeconds: 5
  periodSeconds: 10
```
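As a rough reading of the timing fields in the snippet above, the following Go sketch estimates how long a permanently failing probe takes to trip. It assumes the Kubernetes default `failureThreshold` of 3 (the snippet does not set it) and ignores probe timeouts and kubelet jitter; the `probe` type and `earliestFailureSeconds` helper are hypothetical names used only for illustration.

```go
package main

import "fmt"

// probe mirrors the timing fields used in the manifest snippet.
// failureThreshold is not set there; 3 is the Kubernetes default.
type probe struct {
	initialDelaySeconds int
	periodSeconds       int
	failureThreshold    int
}

// earliestFailureSeconds returns the approximate time, in seconds after
// container start, at which the probe has failed failureThreshold times in
// a row, assuming every attempt fails and ignoring timeouts and jitter.
func earliestFailureSeconds(p probe) int {
	return p.initialDelaySeconds + (p.failureThreshold-1)*p.periodSeconds
}

func main() {
	liveness := probe{initialDelaySeconds: 15, periodSeconds: 20, failureThreshold: 3}
	readiness := probe{initialDelaySeconds: 5, periodSeconds: 10, failureThreshold: 3}

	fmt.Println("liveness trips after ~", earliestFailureSeconds(liveness), "s")
	fmt.Println("readiness trips after ~", earliestFailureSeconds(readiness), "s")
}
```

Under these assumptions the liveness probe would trigger a restart after roughly 55 seconds of continuous failure, and the readiness probe would mark the pod unready after roughly 25 seconds.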
- description: >
For Ansible-based operators, added liveness and readiness probes by default using [`healthz.Ping`](https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/healthz#CheckHandler).

# kind is one of:
# - addition
# - change
# - deprecation
# - removal
# - bugfix
kind: "addition"

# Is this a breaking change?
breaking: false

# Migration can be defined to automatically add a section to
# the migration guide. This is required for breaking changes.
migration:
header: (Optional) For Ansible-based operators, add Liveness and Readiness probe
body: >
New projects built with the tool will have the probes configured by default. The `/healthz` and
`/readyz` endpoints are now available in the provided base image.

To use them in a pre-existing project, update the Dockerfile to use the latest
release base image, then add the following to the `manager` container in
`config/default/manager/manager.yaml`:

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 6789
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /readyz
    port: 6789
  initialDelaySeconds: 5
  periodSeconds: 10
```
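To make concrete what an `httpGet` probe like the ones above checks, here is a minimal Go sketch of the client side. It relies on the documented Kubernetes rule that any status code from 200 up to (but not including) 400 counts as success. The `probeOK` helper and the `newFakeOperator` test server are hypothetical stand-ins, not part of the operator or the kubelet.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probeOK reports whether an HTTP GET to url would count as a successful
// httpGet probe: any status code in [200, 400) is treated as success.
func probeOK(url string) bool {
	resp, err := http.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode >= 200 && resp.StatusCode < 400
}

// newFakeOperator is a hypothetical stand-in for the operator's probe
// server: it answers 200 on /healthz and /readyz and 404 elsewhere.
func newFakeOperator() *httptest.Server {
	mux := http.NewServeMux()
	ok := func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusOK) }
	mux.HandleFunc("/healthz", ok)
	mux.HandleFunc("/readyz", ok)
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeOperator()
	defer srv.Close()

	for _, path := range []string{"/healthz", "/readyz", "/missing"} {
		fmt.Println(path, probeOK(srv.URL+path))
	}
}
```

Against a running operator, the same check would target the address set by `--health-probe-bind-address`, which this PR defaults to `:6789` for Ansible-based operators.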
- description: >
For Ansible-based operators, the `/ping` endpoint is deprecated. Use `/healthz` instead.
kind: "deprecation"
breaking: false

- description: >
For Ansible/Helm-based operators, added a new flag, `--health-probe-bind-address`, to set the health probe address.
kind: "addition"
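The behaviour of the new flag can be sketched in a few lines of Go. This is an illustration only: the PR registers the flag with `pflag`, whereas the sketch uses the standard library `flag` package so it runs without dependencies, and `parseProbeAddr` is a hypothetical helper. The defaults passed in (`:6789` for Ansible, `:8081` for Helm) are the ones shown in this PR; `:9440` is an arbitrary override.

```go
package main

import (
	"flag"
	"fmt"
)

// parseProbeAddr parses args with a flag set that defines only
// --health-probe-bind-address, using def as the default value.
func parseProbeAddr(def string, args []string) (string, error) {
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	var addr string
	fs.StringVar(&addr, "health-probe-bind-address", def,
		"The address the probe endpoint binds to.")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return addr, nil
}

func main() {
	// No flag given: the Ansible default from this PR applies.
	ansible, _ := parseProbeAddr(":6789", nil)
	fmt.Println(ansible)

	// Explicit override of the Helm default.
	helm, _ := parseProbeAddr(":8081", []string{"--health-probe-bind-address=:9440"})
	fmt.Println(helm)
}
```

If the bind address is overridden this way, the `port` in the probe definitions of the manager manifest has to be changed to match.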
9 changes: 9 additions & 0 deletions internal/ansible/flags/flag.go
@@ -32,6 +32,7 @@ type Flags struct {
AnsibleRolesPath string
AnsibleCollectionsPath string
MetricsAddress string
ProbeAddr string
LeaderElectionID string
LeaderElectionNamespace string
AnsibleArgs string
@@ -82,6 +83,14 @@ func (f *Flags) AddTo(flagSet *pflag.FlagSet) {
":8080",
"The address the metric endpoint binds to",
)
// todo: for Go/Helm the port used is: 8081
// update it to keep the project aligned to the other
// types for 2.0
flagSet.StringVar(&f.ProbeAddr,
"health-probe-bind-address",
":6789",
"The address the probe endpoint binds to.",
)
flagSet.BoolVar(&f.EnableLeaderElection,
"enable-leader-election",
false,
18 changes: 12 additions & 6 deletions internal/cmd/ansible-operator/run/cmd.go
@@ -45,11 +45,7 @@ import (
sdkVersion "github.com/operator-framework/operator-sdk/internal/version"
)

var (
metricsHost = "0.0.0.0"
log = logf.Log.WithName("cmd")
healthProbePort int32 = 6789
)
var log = logf.Log.WithName("cmd")

func printVersion() {
log.Info("Version",
@@ -105,8 +101,8 @@ func run(cmd *cobra.Command, f *flags.Flags) {
// Set default manager options
// TODO: probably should expose the host & port as an environment variables
options := manager.Options{
HealthProbeBindAddress: fmt.Sprintf("%s:%d", metricsHost, healthProbePort),
MetricsBindAddress: f.MetricsAddress,
HealthProbeBindAddress: f.ProbeAddr,
LeaderElection: f.EnableLeaderElection,
LeaderElectionID: f.LeaderElectionID,
LeaderElectionResourceLock: resourcelock.ConfigMapsResourceLock,
@@ -148,6 +144,15 @@ func run(cmd *cobra.Command, f *flags.Flags) {
os.Exit(1)
}

if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
log.Error(err, "Unable to set up health check")
os.Exit(1)
}
if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
log.Error(err, "Unable to set up ready check")
os.Exit(1)
}
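For readers unfamiliar with `healthz.Ping`, the following standalone Go sketch approximates what the two registrations above amount to: a checker that always returns `nil`, exposed on both paths. It uses only the standard library. The `checker` type mirrors the shape of controller-runtime's `healthz.Checker`, while `handle`, `newProbeMux`, and `statusOf` are hypothetical helpers, and the 500 status on failure is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// checker mirrors the shape of controller-runtime's healthz.Checker.
type checker func(req *http.Request) error

// ping always succeeds, like healthz.Ping.
var ping checker = func(_ *http.Request) error { return nil }

// handle adapts a checker to an http.Handler: 200 on a nil error,
// 500 otherwise (the failure status is an assumption of this sketch).
func handle(c checker) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if err := c(r); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, "ok")
	})
}

// newProbeMux registers the same checker on both probe paths.
func newProbeMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/healthz", handle(ping))
	mux.Handle("/readyz", handle(ping))
	return mux
}

// statusOf serves one GET request in memory and returns the status code.
func statusOf(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newProbeMux()
	fmt.Println(statusOf(mux, "/healthz"), statusOf(mux, "/readyz"))
}
```

Because both paths share the same always-succeeding checker, they only confirm that the probe server is up and answering; they say nothing about the state of the controllers.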

cMap := controllermap.NewControllerMap()
watches, err := watches.Load(f.WatchesFile, f.MaxConcurrentReconciles, f.AnsibleVerbosity)
if err != nil {
@@ -183,6 +188,7 @@ func run(cmd *cobra.Command, f *flags.Flags) {
}, w.Blacklist)
}

// TODO: remove when the major version is bumped
Review comment (Member):

It might be a good idea to file an issue for this and give it a "backwards incompatible" label, or put it in the 2.0 milestone. Otherwise, we will probably forget this when we bump to 2.0.0.

err = mgr.AddHealthzCheck("ping", healthz.Ping)
if err != nil {
log.Error(err, "Failed to add Healthz check.")
11 changes: 11 additions & 0 deletions internal/cmd/helm-operator/run/cmd.go
@@ -26,6 +26,7 @@ import (
"k8s.io/client-go/tools/leaderelection/resourcelock"
"sigs.k8s.io/controller-runtime/pkg/cache"
"sigs.k8s.io/controller-runtime/pkg/client/config"
"sigs.k8s.io/controller-runtime/pkg/healthz"
logf "sigs.k8s.io/controller-runtime/pkg/log"
zapf "sigs.k8s.io/controller-runtime/pkg/log/zap"
"sigs.k8s.io/controller-runtime/pkg/manager"
@@ -96,6 +97,7 @@ func run(cmd *cobra.Command, f *flags.Flags) {
// Set default manager options
options := manager.Options{
MetricsBindAddress: f.MetricsAddress,
HealthProbeBindAddress: f.ProbeAddr,
LeaderElection: f.EnableLeaderElection,
LeaderElectionID: f.LeaderElectionID,
LeaderElectionResourceLock: resourcelock.ConfigMapsResourceLock,
@@ -130,6 +132,15 @@ func run(cmd *cobra.Command, f *flags.Flags) {
os.Exit(1)
}

if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
log.Error(err, "Unable to set up health check")
os.Exit(1)
}
if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
log.Error(err, "Unable to set up ready check")
os.Exit(1)
}

ws, err := watches.Load(f.WatchesFile)
if err != nil {
log.Error(err, "Failed to create new manager factories.")
6 changes: 6 additions & 0 deletions internal/helm/flags/flag.go
@@ -30,6 +30,7 @@ type Flags struct {
LeaderElectionID string
LeaderElectionNamespace string
MaxConcurrentReconciles int
ProbeAddr string
}

// AddTo - Add the helm operator flags to the the flagset
@@ -49,6 +50,11 @@ func (f *Flags) AddTo(flagSet *pflag.FlagSet) {
":8080",
"The address the metric endpoint binds to",
)
flagSet.StringVar(&f.ProbeAddr,
"health-probe-bind-address",
":8081",
"The address the probe endpoint binds to.",
)
flagSet.BoolVar(&f.EnableLeaderElection,
"enable-leader-election",
false,
@@ -78,5 +78,17 @@ spec:
- name: ANSIBLE_GATHERING
value: explicit
image: {{ .Image }}
livenessProbe:
httpGet:
path: /readyz
Review comment (Contributor):
livenessProbe using /readyz endpoint

port: 6789
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /healthz
Review comment (Contributor):
readinessProbe using /healthz endpoint

port: 6789
initialDelaySeconds: 5
periodSeconds: 10
terminationGracePeriodSeconds: 10
`
@@ -75,6 +75,18 @@ spec:
- "--enable-leader-election"
- "--leader-election-id={{ .ProjectName }}"
name: manager
livenessProbe:
httpGet:
path: /readyz
port: 8081
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /healthz
port: 8081
initialDelaySeconds: 5
periodSeconds: 10
resources:
limits:
cpu: 100m
@@ -124,7 +124,19 @@ spec:
- name: ANSIBLE_GATHERING
value: explicit
image: quay.io/example/memcached-operator:v0.0.1
livenessProbe:
httpGet:
path: /readyz
port: 6789
initialDelaySeconds: 15
periodSeconds: 20
name: manager
readinessProbe:
httpGet:
path: /healthz
port: 6789
initialDelaySeconds: 5
periodSeconds: 10
Review comment (Contributor) on lines +127 to +139:
livenessProbe using /readyz endpoint and readinessProbe using /healthz endpoint. Should be the other way around I guess.

resources: {}
terminationGracePeriodSeconds: 10
permissions:
12 changes: 12 additions & 0 deletions testdata/ansible/memcached-operator/config/manager/manager.yaml
@@ -31,4 +31,16 @@ spec:
- name: ANSIBLE_GATHERING
value: explicit
image: controller:latest
livenessProbe:
httpGet:
path: /readyz
port: 6789
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /healthz
port: 6789
initialDelaySeconds: 5
periodSeconds: 10
Review comment (Contributor) on lines +34 to +45:
livenessProbe using /readyz endpoint and readinessProbe using /healthz endpoint. Should be the other way around I guess.

terminationGracePeriodSeconds: 10
@@ -209,7 +209,19 @@ spec:
- --enable-leader-election
- --leader-election-id=memcached-operator
image: quay.io/example/memcached-operator:v0.0.1
livenessProbe:
httpGet:
path: /readyz
port: 8081
initialDelaySeconds: 15
periodSeconds: 20
name: manager
readinessProbe:
httpGet:
path: /healthz
port: 8081
initialDelaySeconds: 5
periodSeconds: 10
Review comment (Contributor) on lines +212 to +224:
livenessProbe using /readyz endpoint and readinessProbe using /healthz endpoint. Should be the other way around I guess.

resources:
limits:
cpu: 100m
12 changes: 12 additions & 0 deletions testdata/helm/memcached-operator/config/manager/manager.yaml
@@ -28,6 +28,18 @@ spec:
- "--enable-leader-election"
- "--leader-election-id=memcached-operator"
name: manager
livenessProbe:
httpGet:
path: /readyz
port: 8081
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /healthz
port: 8081
initialDelaySeconds: 5
periodSeconds: 10
Review comment (Contributor) on lines +31 to +42:
livenessProbe using /readyz endpoint and readinessProbe using /healthz endpoint. Should be the other way around I guess.

resources:
limits:
cpu: 100m