Skip admission webhook validation for deleting MachineDeployments #2017
The MachineDeployment admission webhook runs full cloud provider validation (including live API calls) on UPDATEs, even when the object is being deleted. This blocks OSM from removing its cleanup finalizer when the provider image is invalid or unresolvable, leaving MachineDeployments stuck in Terminating indefinitely.

Add a DeletionTimestamp early return to mutateMachineDeployments, mirroring the pattern used in KKP's cluster mutator, OSM's MachineDeployment mutation handler, and controller-runtime's CustomDefaulter.

Fixes kubermatic/kubermatic#15802

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>
kron4eg
left a comment
/approve
long should have been done ❤️
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: kron4eg
LGTM label has been added. Git tree hash: fcb7db7075f90190d8f3e52af3643458ddde458c
/retest
/cherrypick release/v1.65
/cherrypick release/v1.64
@buraksekili: new pull request created: #2032
@buraksekili: new pull request created: #2033
What this PR does / why we need it:
The MachineDeployment admission webhook (/machinedeployments) runs full cloud provider validation, including live API calls to resolve images, on every UPDATE, even when the MachineDeployment has a DeletionTimestamp set and is being garbage collected.

When OSM's operating-system-config-controller removes its cleanup finalizer (kubermatic.io/cleanup-operating-system-configs) via workerClient.Update(), the UPDATE is routed through the webhook. If the provider image is invalid or no longer resolves (e.g. an AMI that was deleted from AWS), Validate() returns an error, the webhook denies the UPDATE, and OSM cannot remove its finalizer. The MachineDeployment stays stuck in Terminating indefinitely.

This PR adds a DeletionTimestamp early return to mutateMachineDeployments, placed before DeepCopy() and the full mutation/validation pipeline. Objects under deletion return Allowed: true immediately, skipping the unnecessary deep copy, re-marshaling, and cloud provider validation call.

Which issue(s) this PR fixes:
Fixes kubermatic/kubermatic#15802
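The stuck-in-Terminating failure mode from the linked issue can be modeled with a small toy program. This is only a sketch under simplifying assumptions: `object`, `strictAdmit`, `lenientAdmit`, `removeFinalizer`, and `terminating` are hypothetical names, and the "API server" is just a struct that is overwritten when the admission function accepts the update. Only the finalizer name is taken from the PR description.

```go
package main

import (
	"errors"
	"fmt"
)

// Finalizer name as given in the PR description.
const cleanupFinalizer = "kubermatic.io/cleanup-operating-system-configs"

// object is a hypothetical, heavily simplified MachineDeployment.
type object struct {
	Deleting        bool
	Finalizers      []string
	ImageResolvable bool
}

type admitFunc func(o object) error

// strictAdmit models a webhook that validates every UPDATE.
func strictAdmit(o object) error {
	if !o.ImageResolvable {
		return errors.New("provider image cannot be resolved")
	}
	return nil
}

// lenientAdmit models the same webhook with a deletion early return.
func lenientAdmit(o object) error {
	if o.Deleting {
		return nil
	}
	return strictAdmit(o)
}

// removeFinalizer models a controller's Update call: the change is persisted
// only if the admission function accepts the updated object.
func removeFinalizer(stored *object, name string, admit admitFunc) error {
	updated := *stored
	updated.Finalizers = nil
	for _, f := range stored.Finalizers {
		if f != name {
			updated.Finalizers = append(updated.Finalizers, f)
		}
	}
	if err := admit(updated); err != nil {
		return err // update denied, stored object is unchanged
	}
	*stored = updated
	return nil
}

// terminating reports whether deletion is still blocked by a finalizer.
func terminating(o object) bool {
	return o.Deleting && len(o.Finalizers) > 0
}

func main() {
	md := object{Deleting: true, Finalizers: []string{cleanupFinalizer}}

	err := removeFinalizer(&md, cleanupFinalizer, strictAdmit)
	fmt.Println(err != nil, terminating(md)) // true true: update denied, still stuck

	err = removeFinalizer(&md, cleanupFinalizer, lenientAdmit)
	fmt.Println(err != nil, terminating(md)) // false false: finalizer removed
}
```

In this model the only difference between the two runs is the admission function, which mirrors how the early return unblocks the finalizer removal without changing the controller.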
What type of PR is this?
/kind bug
Special notes for your reviewer:
Does this PR introduce a user-facing change? Then add your Release Note here:
Documentation:
Test issue: