[release-4.22] OCPBUGS-84674: retry transient kubeconfig read failures in GetClientConfig #31096
During upgrade prow jobs the kubeconfig Secret volume is re-mounted by the kubelet, which atomically swaps symlinks and creates a sub-second window where ReadFile() returns ENOENT or reads empty/partial content. With 10-15 parallel Ginkgo goroutines all calling GetClientConfig() on every client creation, this window reliably causes mass test failures.

Replace the single ReadFile attempt with a wait.PollImmediate retry (200ms interval, 10s timeout) that retries on file-not-found, empty file, or clientcmd parse errors. Non-transient I/O errors still fail immediately.

Adds unit tests covering all four cases.

Fixes: OCPBUGS-84504
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@openshift-cherrypick-robot: Jira Issue OCPBUGS-84504 has been cloned as Jira Issue OCPBUGS-84674. Will retitle bug to link to clone.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-84674, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
|
@openshift-cherrypick-robot: all tests passed! |
|
/jira refresh |
|
@dgoodwin: This pull request references Jira Issue OCPBUGS-84674, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dgoodwin, openshift-cherrypick-robot |
This is an automated cherry-pick of #31080
/assign dgoodwin