manifests/0000_31_cluster-baremetal-operator_06_deployment: Enable leader election

The option has been available for years:

  $ git blame main.go | grep enable-leader-election
  dcbe86f (Sandhya Dasu               2020-08-18 21:09:29 -0400  72)    flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,

and without it overlapping operator pods can crash-loop [1]:

  : [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers	0s
  {  event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 26 times
  event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 51 times}

while fighting each other over the same ClusterOperator status:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.16-upgrade-from-stable-4.15-e2e-aws-ovn-upgrade/1737335551998038016/artifacts/e2e-aws-ovn-upgrade/gather-audit-logs/artifacts/audit-logs.tar | tar -xz --strip-components=2
  $ zgrep -h '"resource":"clusteroperators","name":"baremetal"' kube-apiserver/*audit*log.gz | jq -r 'select(.verb == "create" or .verb == "update") | .stageTimestamp + " " + .verb + " " + (.responseStatus.code | tostring) + " " + (.objectRef.subresource) + " " + .user.username + " " + .user.extra["authentication.kubernetes.io/pod-name"][0]' | grep 'T06:08:.*cluster-baremetal-operator' | sort
  2023-12-20T06:08:21.757799Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.778638Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.780378Z update 409 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.790000Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.802780Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
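The 409 in the audit log above is the API server rejecting a write that was based on a stale resourceVersion. As a minimal sketch of that optimistic-concurrency check (the `Store` and `Object` types here are hypothetical stand-ins, not the operator's or the API server's real code):

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict stands in for the HTTP 409 returned for a stale write.
var ErrConflict = errors.New("409 conflict: resourceVersion is stale")

// Object is a hypothetical, stripped-down stand-in for a ClusterOperator.
type Object struct {
	ResourceVersion int
	Status          string
}

// Store mimics the optimistic-concurrency check on updates.
type Store struct {
	current Object
}

// Get returns a copy of the stored object.
func (s *Store) Get() Object { return s.current }

// UpdateStatus accepts the write only if it was based on the latest
// resourceVersion, and bumps the version on success.
func (s *Store) UpdateStatus(obj Object) error {
	if obj.ResourceVersion != s.current.ResourceVersion {
		return ErrConflict
	}
	obj.ResourceVersion++
	s.current = obj
	return nil
}

func main() {
	store := &Store{current: Object{ResourceVersion: 1, Status: "initial"}}

	// Two overlapping pods read the same version before either writes.
	first := store.Get()
	second := store.Get()

	first.Status = "written by first pod"
	fmt.Println("first pod: ", store.UpdateStatus(first))

	// The second write is based on a version that is no longer current.
	second.Status = "written by second pod"
	fmt.Println("second pod:", store.UpdateStatus(second))
}
```

Whichever pod writes second loses, and with two pods reconciling the same object the loser keeps changing, which matches the interleaved 200 and 409 responses above.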

Using a leader lock will avoid this contention, and the system should
be able to coast through brief moments after an outgoing leader leaves
until a replacement leader picks things back up.
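As a rough illustration of why a leader lock removes the contention, here is a toy in-memory lease. The `Lease` type and `WriteStatus` helper are hypothetical, and the sketch leaves out the expiry and renewal that a real Lease-based lock has:

```go
package main

import "fmt"

// Lease is a hypothetical in-memory stand-in for a leader-election lock.
type Lease struct {
	Holder string // empty when nobody holds the lease
}

// TryAcquire gives the lease to candidate if it is free or already theirs.
func (l *Lease) TryAcquire(candidate string) bool {
	if l.Holder == "" || l.Holder == candidate {
		l.Holder = candidate
		return true
	}
	return false
}

// Release frees the lease, but only when called by the current holder.
func (l *Lease) Release(holder string) {
	if l.Holder == holder {
		l.Holder = ""
	}
}

// WriteStatus performs the write only when pod holds the lease.
func WriteStatus(l *Lease, pod string, status *string, value string) bool {
	if l.Holder != pod {
		return false // not the leader: stay idle instead of contending
	}
	*status = value
	return true
}

func main() {
	lease := &Lease{}
	status := ""

	lease.TryAcquire("pod-a")
	fmt.Println("pod-a writes: ", WriteStatus(lease, "pod-a", &status, "from pod-a"))
	fmt.Println("pod-b acquires:", lease.TryAcquire("pod-b"))
	fmt.Println("pod-b writes: ", WriteStatus(lease, "pod-b", &status, "from pod-b"))

	// Handover: once pod-a releases, pod-b's next attempt succeeds.
	lease.Release("pod-a")
	fmt.Println("pod-b acquires:", lease.TryAcquire("pod-b"))
}
```

Between the release and the next successful acquire nobody writes at all, which is the brief leaderless window the component is expected to coast through.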

I'm also setting a Recreate strategy [2], because:

1. Incoming pod is surged by the default RollingUpdate Deployment strategy.
2. Incoming pod attempts to acquire the lease, but the outgoing pod is holding it.
3. Outgoing pod releases the lease and exits.
4. Incoming pod tries again, and this time acquires the lease.

can be slow between steps 3 and 4, while:

1. Outgoing pod releases the lease and exits.
2. Incoming pod is created and scheduled, and acquires the lease.

tends to be faster.  And again, the component should be able to coast
through small durations without a functioning leader.

See openshift/machine-config-operator@7530ded86 (install: Recreate and
delayed default ServiceAccount deletion, 2023-08-29,
openshift/machine-config-operator#3895) for another example of how
Recreate can help that way.

[1]: https://issues.redhat.com/browse/OCPBUGS-25766
[2]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#recreate-deployment
wking committed Dec 20, 2023
1 parent 18c6614 commit 4e4519d
Showing 2 changed files with 9 additions and 1 deletion.
@@ -12,6 +12,8 @@ spec:
   selector:
     matchLabels:
       k8s-app: cluster-baremetal-operator
+  strategy:
+    type: Recreate
   template:
     metadata:
       annotations:
@@ -26,6 +28,8 @@ spec:
         image: registry.ci.openshift.org/openshift:cluster-baremetal-operator
         command:
         - "/usr/bin/cluster-baremetal-operator"
+        args:
+        - --enable-leader-election
         env:
         - name: RELEASE_VERSION
           value: "0.0.1-snapshot"

@@ -15,6 +15,8 @@ spec:
   selector:
     matchLabels:
       k8s-app: cluster-baremetal-operator
+  strategy:
+    type: Recreate
   template:
     metadata:
       annotations:
@@ -26,7 +28,9 @@ spec:
         k8s-app: cluster-baremetal-operator
     spec:
       containers:
-      - command:
+      - args:
+        - --enable-leader-election
+        command:
         - /usr/bin/cluster-baremetal-operator
         env:
         - name: RELEASE_VERSION
