manifests/0000_31_cluster-baremetal-operator_06_deployment: Enable leader election

The option has been available for years:

  $ git blame main.go | grep enable-leader-election
  dcbe86f (Sandhya Dasu 2020-08-18 21:09:29 -0400 72)	flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,

and without it overlapping operator pods can crash-loop [1]:

  : [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers	0s
  {  event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 26 times
  event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 51 times}

while fighting each other over the same ClusterOperator status:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.16-upgrade-from-stable-4.15-e2e-aws-ovn-upgrade/1737335551998038016/artifacts/e2e-aws-ovn-upgrade/gather-audit-logs/artifacts/audit-logs.tar | tar -xz --strip-components=2
  $ zgrep -h '"resource":"clusteroperators","name":"baremetal"' kube-apiserver/*audit*log.gz | jq -r 'select(.verb == "create" or .verb == "update") | .stageTimestamp + " " + .verb + " " + (.responseStatus.code | tostring) + " " + (.objectRef.subresource) + " " + .user.username + " " + .user.extra["authentication.kubernetes.io/pod-name"][0]' | grep 'T06:08:.*cluster-baremetal-operator' | sort
  2023-12-20T06:08:21.757799Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.778638Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.780378Z update 409 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.790000Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.802780Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g

Using a leader lock will avoid this contention, and the system should be able to coast through the brief moments after an outgoing leader leaves until a replacement leader picks things back up.

I'm also setting a Recreate strategy [2], because the default rollout:

1. Incoming pod is surged by the default Deployment strategy.
2. Incoming pod attempts to acquire the Lease, but the outgoing pod is holding it.
3. Outgoing pod releases the lease and exits.
4. Incoming pod tries again, and this time acquires the lease.

can be slow in the 3-to-4 phase, while:

1. Outgoing pod releases the lease and exits.
2. Incoming pod is created, scheduled, and acquires the lease.

tends to be faster. And again, the component should be able to coast through small durations without a functioning leader.
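In manifest terms, the change amounts to roughly the following (a minimal sketch, not the actual manifest contents; the container name and surrounding fields are illustrative):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: cluster-baremetal-operator
    namespace: openshift-machine-api
  spec:
    strategy:
      type: Recreate  # instead of the default RollingUpdate surge
    template:
      spec:
        containers:
        - name: cluster-baremetal-operator  # illustrative container name
          args:
          - --enable-leader-election  # flips the flag's false default from main.go

With Recreate, the Deployment scales the old ReplicaSet to zero before creating the new pod, so the second, faster sequence above is the one that actually happens.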
See openshift/machine-config-operator@7530ded86 (install: Recreate and delayed default ServiceAccount deletion, 2023-08-29, openshift/machine-config-operator#3895) for another example of how Recreate can help that way.

[1]: https://issues.redhat.com/browse/OCPBUGS-25766
[2]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#recreate-deployment
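For reference, the leader lock itself is a coordination.k8s.io Lease object; with leader election on, the holder would look roughly like this (a sketch with illustrative values; the lease name and durations depend on how main.go configures the controller-runtime manager):

  apiVersion: coordination.k8s.io/v1
  kind: Lease
  metadata:
    name: cluster-baremetal-operator  # illustrative leader-election ID
    namespace: openshift-machine-api
  spec:
    holderIdentity: cluster-baremetal-operator-7fbb57959b-s9v9g_b2c3d4  # current leader pod
    leaseDurationSeconds: 15
    renewTime: "2023-12-20T06:08:21.000000Z"

Recreate keeps the handoff short because the outgoing holder exits, and stops renewing, before the incoming pod starts campaigning for the Lease.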