OCPBUGS-86949: Guard HCCO KubeletConfig CM deletion against transient source absence#8672
vsolanki12 wants to merge 1 commit
…bsence

HCCO's reconcileKubeletConfig deletes guest-side ConfigMaps whose source is absent from the HCP namespace. During the immutable-to-mutable migration (OCPBUGS-85778) or any transient API error, the source CM can be briefly absent, causing HCCO to delete the guest copy. NTO then regenerates MachineConfigs without it, triggering an MCO node rollout.

Skip deletion of guest-side CMs that carry the NTOMirroredConfigLabel, since their source is expected to reappear on the next reconcile.

Also refactor deleteImmutableConfigMapIfNeeded to use DeleteIfNeededWithPredicate with a KubeletConfigConfigMapLabel ownership guard, and clear ResourceVersion after the predicate-based delete to avoid stale-resourceVersion errors on the subsequent CreateOrUpdate.

Signed-off-by: Vimal Solanki <vsolanki@redhat.com>
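The deletion guard described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function `shouldDeleteGuestCM` and the label key assigned to `ntoMirroredConfigLabel` are hypothetical placeholders, and the real reconciler works on Kubernetes objects rather than plain maps.

```go
package main

import "fmt"

// ntoMirroredConfigLabel is a placeholder key for illustration only; the real
// value of NTOMirroredConfigLabel is not shown in this PR text.
const ntoMirroredConfigLabel = "example.test/nto-mirrored-config"

// shouldDeleteGuestCM (hypothetical helper) reports whether a guest-side
// ConfigMap may be deleted during reconciliation.
func shouldDeleteGuestCM(guestLabels map[string]string, sourcePresent bool) bool {
	// A guest copy whose source still exists is never a deletion candidate.
	if sourcePresent {
		return false
	}
	// Guard: a mirrored CM is kept even when its source is absent, because
	// the absence may be transient and the source is expected to reappear.
	if _, mirrored := guestLabels[ntoMirroredConfigLabel]; mirrored {
		return false
	}
	return true
}

func main() {
	mirrored := map[string]string{ntoMirroredConfigLabel: "true"}
	fmt.Println(shouldDeleteGuestCM(mirrored, false))            // false: kept by the guard
	fmt.Println(shouldDeleteGuestCM(map[string]string{}, false)) // true: unlabeled orphan
	fmt.Println(shouldDeleteGuestCM(map[string]string{}, true))  // false: source present
}
```

The trade-off of a label-based guard is that a mirrored guest CM whose source is gone for good is also retained by this check, so permanent removal would have to be handled elsewhere.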
@vsolanki12: This pull request references Jira Issue OCPBUGS-86949, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
The bug has been updated to refer to the pull request using the external bug tracker.
Skipping CI for Draft Pull Request.
Walkthrough
This PR modifies KubeletConfig ConfigMap reconciliation in the hosted cluster operator to address mutability and NTO mirroring concerns. The reconciler now forces ConfigMaps to be mutable before update by clearing …
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: vsolanki12
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8672   +/-   ##
=======================================
  Coverage   41.43%   41.44%
=======================================
  Files         756      756
  Lines       93658    93664    +6
=======================================
+ Hits        38807    38816    +9
+ Misses      52128    52125    -3
  Partials     2723     2723
/uncc

@vsolanki12: all tests passed!
I reproduced the issue and tested the fix in my test cluster, as follows:
- Environment:
- Custom CPO image deployed:
- Test scenario: deleted the source KubeletConfig CM from the HCP namespace to simulate transient absence during the immutable-to-mutable migration (OCPBUGS-85778).
- Before fix: the guest-side CM was immediately removed:
- After fix: the guest-side CM was preserved:
- HCCO logs show the guard is active:
/cc @jparrill |
What this PR does / why we need it:
HCCO's reconcileKubeletConfig deletes guest-side ConfigMaps whose source is absent from the HCP namespace. During the immutable-to-mutable migration (OCPBUGS-85778) or any transient API error, the source CM can be briefly absent, causing HCCO to delete the guest copy. NTO then regenerates MachineConfigs without it, triggering an MCO node rollout.

This PR:
- Skips deletion of guest-side CMs that carry the NTOMirroredConfigLabel, since their source is expected to reappear on the next reconcile cycle
- deleteImmutableConfigMapIfNeeded: refactors to use DeleteIfNeededWithPredicate with a KubeletConfigConfigMapLabel ownership check, preventing accidental deletion of unrelated immutable ConfigMaps
- Clears ResourceVersion and Immutable after DeleteIfNeededWithPredicate to avoid stale-resourceVersion errors and immutable leakage on the subsequent CreateOrUpdate

Which issue(s) this PR fixes:
Fixes OCPBUGS-86949
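The predicate-guarded delete and the field reset described in the PR summary can be illustrated with a small in-memory model. Everything here is an assumption made for illustration: `configMap` and `fakeStore` stand in for the Kubernetes types and API, `deleteIfNeededWithPredicate` only mirrors the name of the real helper (its actual signature is not shown in this PR text), and the label key is a placeholder. The one behavior borrowed from the real API server is that a create request carrying a non-empty resourceVersion is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// kubeletConfigLabel is a placeholder key for illustration; the real value of
// KubeletConfigConfigMapLabel is not shown in this PR text.
const kubeletConfigLabel = "example.test/kubelet-config"

// configMap is a simplified stand-in for a Kubernetes ConfigMap.
type configMap struct {
	Name            string
	Labels          map[string]string
	ResourceVersion string
	Immutable       *bool
}

// fakeStore stands in for the guest cluster API.
type fakeStore map[string]configMap

// createOrUpdate rejects creating an object that carries a resourceVersion,
// as the real API server does. It does not model immutability checks on update.
func (s fakeStore) createOrUpdate(cm configMap) error {
	if _, exists := s[cm.Name]; !exists && cm.ResourceVersion != "" {
		return errors.New("resourceVersion should not be set on objects to be created")
	}
	cm.ResourceVersion = "fresh"
	s[cm.Name] = cm
	return nil
}

// deleteIfNeededWithPredicate (hypothetical signature) deletes the named
// object only if it exists and the predicate accepts it.
func deleteIfNeededWithPredicate(s fakeStore, name string, pred func(configMap) bool) bool {
	existing, ok := s[name]
	if !ok || !pred(existing) {
		return false
	}
	delete(s, name)
	return true
}

// ownedAndImmutable is the ownership guard: only immutable ConfigMaps that
// carry the KubeletConfig label qualify for deletion.
func ownedAndImmutable(cm configMap) bool {
	_, owned := cm.Labels[kubeletConfigLabel]
	return owned && cm.Immutable != nil && *cm.Immutable
}

// replaceImmutable deletes an owned immutable copy, then clears the stale
// fields on the in-memory object before recreating it as mutable.
func replaceImmutable(s fakeStore, cm *configMap) error {
	if deleteIfNeededWithPredicate(s, cm.Name, ownedAndImmutable) {
		cm.ResourceVersion = "" // stale after the delete
		cm.Immutable = nil      // avoid carrying immutability into the new object
	}
	return s.createOrUpdate(*cm)
}

func main() {
	immutable := true
	s := fakeStore{"kubelet-cfg": {
		Name:            "kubelet-cfg",
		Labels:          map[string]string{kubeletConfigLabel: "true"},
		ResourceVersion: "7",
		Immutable:       &immutable,
	}}
	cm := s["kubelet-cfg"]
	err := replaceImmutable(s, &cm)
	fmt.Println(err, s["kubelet-cfg"].Immutable == nil)
}
```

Without the reset inside `replaceImmutable`, the recreate in this model would fail on the leftover resourceVersion, which is the stale-resourceVersion error the summary refers to.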