fix: delete misconfigured pods in error state.#338
Conversation
| // handleCreatePodRequest Finds relevant AuthProxyWorkload resources and updates the pod | ||
| // with matching resources, returning a non-nil pod when the pod was updated. | ||
| func (a *PodAdmissionWebhook) handleCreatePodRequest(ctx context.Context, p corev1.Pod) (*corev1.Pod, error) { | ||
| l := logf.FromContext(ctx) |
There was a problem hiding this comment.
This change moves code from the handleCreatePodRequest() function into a new function called findMatchingProxies() so that it can be reused between the webhook and the new Pod event listener.
| // listOwners returns the list of this object's owners and its extended owners. | ||
| // Warning: this is a recursive function | ||
| func (a *PodAdmissionWebhook) listOwners(ctx context.Context, object client.Object) ([]workload.Workload, error) { | ||
| func listOwners(ctx context.Context, c client.Client, object client.Object) ([]workload.Workload, error) { |
There was a problem hiding this comment.
This refactors listOwners() from a member of PodAdmissionWebhook to a plain function so that it can be used both in the webhook and the pod event listener.
| } | ||
| return owners, nil | ||
| } | ||
|
|
There was a problem hiding this comment.
This is the new pod event listener implementation.
| setupLog.Error(err, "unable to create workload admission webhook controller") | ||
| return err | ||
| } | ||
| err = registerPodInformer(mgr, u) |
There was a problem hiding this comment.
Register the PodEventHandler when the operator starts up.
|
@hessjcg Is it possible to add tests for this to make sure |
|
Does this fix #337? |
|
@enocom, Yes this fixes #337. |
…configured incorrectly.
… PodEventHandler.
01a864b to
8a33ea6
Compare
|
Updated the title to reflect what we're doing here. |
| } | ||
|
|
||
| // OnAdd is called by the informer when a Pod is added. | ||
| func (h *PodEventHandler) OnAdd(obj interface{}) { |
There was a problem hiding this comment.
Is this and the following methods part of an interface that Kubernetes uses?
There was a problem hiding this comment.
the OnAdd, OnChange, OnDelete methods all implement cache.ResourceEventHandler. I've updated the method comment.
| } | ||
|
|
||
| // OnAdd is called by the informer when a Pod is added. | ||
| func (h *PodEventHandler) OnAdd(obj interface{}) { |
There was a problem hiding this comment.
the OnAdd, OnChange, OnDelete methods all implement cache.ResourceEventHandler. I've updated the method comment.
enocom
left a comment
There was a problem hiding this comment.
LGTM. I'd like @pwschuurman to approve this one just to make sure we have the logic and nuances right here.
enocom
left a comment
There was a problem hiding this comment.
LGTM. I'd like @pwschuurman to approve this one just to make sure we have the logic and nuances right here.
|
/lgtm |
|
@enocom This needs an "Approve" review from you before I can merge it, since @pwschuurman is not a code owner. |
| return nil, fmt.Errorf("there is an AuthProxyWorkloadConfiguration error reconciling this workload %v", wlConfigErr) | ||
| } | ||
|
|
||
| return wl.Pod, nil // updated |
There was a problem hiding this comment.
Is this comment needed here? Seems slightly vague, maybe change to // update pod? if that is what it is intended to say
| updater *workload.Updater | ||
| } | ||
|
|
||
| // newDeletePodController constructs an podDeleteController |
There was a problem hiding this comment.
| // newDeletePodController constructs an podDeleteController | |
| // newDeletePodController constructs a podDeleteController |
| return nil | ||
| } | ||
|
|
||
| // Check if this pod is in error and missing proxy containers |
There was a problem hiding this comment.
add detail here? is in error the proper wording? in error state perhaps?
| // Check if this pod is in error and missing proxy containers | |
| // Check if this pod is in error state and missing proxy containers |
| // Check if this pod is in error and missing proxy containers | ||
| wlConfigErr := r.updater.CheckWorkloadContainers(wl, proxies) | ||
|
|
||
| // If this pod is in error, delete it. Simply logging an error is sufficient. |
There was a problem hiding this comment.
same here should we be saying in error?
The pod cannot be configured to reliably instrument every single pod creation event due to limitations in the K8s API.
Thus, the operator needs to listen for pod create and update events, to check if the pod is configured correctly. If the
pod is not configured correctly, and the pod is in an error or waiting state, the operator will delete the pod so that it can
be replaced. The replacement pod will be correctly configured by the webhook as it is created.
Fixes #337