installer: add controller that watch pending installer pods #550
The proof PR is green: openshift/cluster-kube-apiserver-operator#586
/cherrypick release-4.2
@mfojtik: once the present PR merges, I will cherry-pick it on top of release-4.2 in a new PR and assign it to you.
    if len(pods) == 0 {
    	return conditions, nil
    }
    namespaceEvents, err := c.eventsGetter.Events(c.targetNamespace).List(metav1.ListOptions{})

Wouldn't it be better to use Search(scheme, obj) here? That way you'd get only the events for the pods. You would be trading one call for several, but each call would be about a specific object.
I'm not sure whether obj can be a list of objects, though.
I'm not sure .Search will make it faster or more efficient... this only lists events for a single namespace, and only when there are actual pending pods :-)
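To make the trade-off concrete, here is a minimal sketch of the "list once, filter client-side" variant: one namespace-wide event list narrowed down to the events that involve the pending pods. The `event` struct and `eventsForPods` are hypothetical stand-ins that use only the standard library; they are not the real `corev1.Event` type or anything from this PR.

```go
package main

// event mirrors only the fields of a Kubernetes event that this sketch needs.
// It is a hypothetical stand-in, not the real corev1.Event type.
type event struct {
	InvolvedKind string
	InvolvedName string
	Message      string
}

// eventsForPods keeps only the events whose involved object is a Pod
// with one of the given names.
func eventsForPods(events []event, podNames []string) []event {
	wanted := make(map[string]struct{}, len(podNames))
	for _, name := range podNames {
		wanted[name] = struct{}{}
	}
	var out []event
	for _, ev := range events {
		if ev.InvolvedKind != "Pod" {
			continue
		}
		if _, ok := wanted[ev.InvolvedName]; !ok {
			continue
		}
		out = append(out, ev)
	}
	return out
}
```

This costs a single list call no matter how many pods are pending, at the price of transferring events for unrelated objects in the namespace.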
    if event.InvolvedObject.Kind != "Pod" {
    	continue
    }
    if !strings.Contains(event.Message, "failed to create pod network") {

I wonder whether we could benefit from the more general approach of https://github.com/openshift/cluster-kube-apiserver-operator/pull/571/files#diff-d4d4aa822fed3489ac3aee1560b501b4R94, where the idea was that you can later easily extend the set of regexes to get a specific reason for why the pod is failing, rather than sticking to network failures only in the long run.
yes, a good follow up I think
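For the follow-up, a regex-driven version could look roughly like the sketch below. Only the "failed to create pod network" text comes from this PR; the reason names, the second pattern, and the `classify` helper are hypothetical and are not taken from the linked PR.

```go
package main

import "regexp"

// failurePattern pairs a regular expression with the reason to report
// when an event message matches it.
type failurePattern struct {
	reason string
	re     *regexp.Regexp
}

// knownFailures is the table to extend with more patterns later.
// The reason names and the second entry are hypothetical examples.
var knownFailures = []failurePattern{
	{"NetworkSetupFailed", regexp.MustCompile(`failed to create pod network`)},
	{"ImagePullFailed", regexp.MustCompile(`(?i)failed to pull image`)},
}

// classify returns the reason of the first pattern that matches the message,
// or false when no known failure matches.
func classify(message string) (string, bool) {
	for _, p := range knownFailures {
		if p.re.MatchString(message) {
			return p.reason, true
		}
	}
	return "", false
}
```

Supporting a new failure mode then means appending one entry to the table, with no change to the matching loop.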
this is ok to start, but I would try to write the code so that it works when kubelet events fail to de-dupe and when we have more than one installer pod
/lgtm
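One way to cover both cases is to group the matched events by pod and drop repeated messages before building the condition. This is a rough sketch under those assumptions; `podEvent` and `summarize` are hypothetical names, not code from this PR.

```go
package main

import "sort"

// podEvent is a hypothetical, trimmed-down view of an event that has
// already been matched to an installer pod.
type podEvent struct {
	PodName string
	Message string
}

// summarize groups messages by pod and drops repeated messages, so that
// non-deduplicated kubelet events and several installer pods both yield
// one stable entry per pod and distinct message.
func summarize(events []podEvent) map[string][]string {
	seen := map[string]map[string]bool{}
	out := map[string][]string{}
	for _, ev := range events {
		if seen[ev.PodName] == nil {
			seen[ev.PodName] = map[string]bool{}
		}
		if seen[ev.PodName][ev.Message] {
			continue
		}
		seen[ev.PodName][ev.Message] = true
		out[ev.PodName] = append(out[ev.PodName], ev.Message)
	}
	// Sort each pod's messages so the resulting condition text does not
	// depend on the order in which events were listed.
	for _, msgs := range out {
		sort.Strings(msgs)
	}
	return out
}
```

Sorting keeps the reported message deterministic, which avoids needless status updates when the same events come back in a different order.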
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: deads2k, mfojtik, soltysh
@mfojtik: new pull request created: #555
openshift/library-go#551: Add pkg/crypto:MakeSelfSignedCAConfigForSubject
openshift/library-go#550: installer: add controller that watch pending installer pods
openshift/library-go#546: Emit event when certificate gets updated
installer: add controller that watch pending installer pods
This adds a controller that watches for installer pods that have been in the `Pending` state for longer than 5 minutes. If such pods are found, the controller makes the operator go Degraded and reports the reason and message found for that pod/container state.

This will help improve debugging and triaging of bugs caused by slow networking or a kubelet that prevents rolling updates of static-pod-based operators.
/cc @deads2k
/cc @sttts