@loganmc10 loganmc10 commented Jul 13, 2023

While the CR is progressing, the API is being modified, so it is frequently unavailable:

2023-07-13T18:51:28Z	INFO	Still waiting for ingress Progressing to be False	{"controller": "clusterrelocation", "controllerGroup": "rhsyseng.github.io", "controllerKind": "ClusterRelocation", "ClusterRelocation": {"name":"cluster"}, "namespace": "", "name": "cluster", "reconcileID": "3dc3ae60-fa85-4365-9d97-ee69ea86f042"}
E0713 18:51:29.095901       1 leaderelection.go:330] error retrieving resource lock openshift-operators/f4de3632.rhsyseng.github.io: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/f4de3632.rhsyseng.github.io": dial tcp 172.30.0.1:443: connect: connection refused
E0713 18:51:31.097243       1 leaderelection.go:330] error retrieving resource lock openshift-operators/f4de3632.rhsyseng.github.io: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/f4de3632.rhsyseng.github.io": dial tcp 172.30.0.1:443: connect: connection refused
E0713 18:51:33.097182       1 leaderelection.go:330] error retrieving resource lock openshift-operators/f4de3632.rhsyseng.github.io: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/f4de3632.rhsyseng.github.io": dial tcp 172.30.0.1:443: connect: connection refused
E0713 18:51:35.096744       1 leaderelection.go:330] error retrieving resource lock openshift-operators/f4de3632.rhsyseng.github.io: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/f4de3632.rhsyseng.github.io": dial tcp 172.30.0.1:443: connect: connection refused
E0713 18:51:37.096857       1 leaderelection.go:330] error retrieving resource lock openshift-operators/f4de3632.rhsyseng.github.io: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/f4de3632.rhsyseng.github.io": dial tcp 172.30.0.1:443: connect: connection refused
2023-07-13T18:51:38Z	INFO	Still waiting for ingress Progressing to be False	{"controller": "clusterrelocation", "controllerGroup": "rhsyseng.github.io", "controllerKind": "ClusterRelocation", "ClusterRelocation": {"name":"cluster"}, "namespace": "", "name": "cluster", "reconcileID": "3dc3ae60-fa85-4365-9d97-ee69ea86f042"}
I0713 18:51:39.095783       1 leaderelection.go:283] failed to renew lease openshift-operators/f4de3632.rhsyseng.github.io: timed out waiting for the condition
2023-07-13T18:51:39Z	ERROR	setup	problem running manager	{"error": "leader election lost"}
main.main
	/workspace/main.go:112
runtime.main
	/usr/local/go/src/runtime/proc.go:250

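Extending the lease duration should reduce these errors. The manager exits with "leader election lost" when it cannot renew its lease in time, so longer lease timings let it ride out short API outages.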

https://sdk.operatorframework.io/docs/building-operators/golang/advanced-topics/#leader-election

@loganmc10 loganmc10 force-pushed the leader branch 2 times, most recently from 65973d7 to d202a39 Compare July 13, 2023 20:47
@loganmc10 loganmc10 changed the title from "use leader-for-life to reduce operator errors" to "Extend lease duration" Jul 13, 2023
@loganmc10 loganmc10 force-pushed the leader branch 3 times, most recently from ebc41cc to 5d3a2b1 Compare July 13, 2023 20:56
@loganmc10 loganmc10 marked this pull request as ready for review July 13, 2023 21:42
@loganmc10 loganmc10 merged commit 55338a5 into main Jul 14, 2023
@loganmc10 loganmc10 deleted the leader branch July 14, 2023 14:42