OCPBUGS-83585: Wait for CRD removal in GenerateCRDInstallTest to fix flaky envtest#8261
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@jparrill: This pull request references Jira Issue OCPBUGS-83585, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughBoth Sequence Diagram(s)sequenceDiagram
participant Test as Test Code
participant K8sClient as k8sClient
participant API as Kubernetes API Server
Test->>K8sClient: envtest.UninstallCRDs(...)
K8sClient->>API: Uninstall CRDs request
API-->>K8sClient: Uninstall accepted (CRDs deleting)
loop per CRD (until deleted)
Test->>K8sClient: Get CRD
K8sClient->>API: GET /apis/apiextensions.k8s.io/v1/customresourcedefinitions/{name}
alt CRD exists
API-->>K8sClient: 200 OK (CRD present)
K8sClient-->>Test: nil error
Note right of Test: Wait 1s then retry
else CRD not found
API-->>K8sClient: 404 Not Found
K8sClient-->>Test: apierrors.IsNotFound -> true
Note right of Test: Proceed to next CRD
end
end
Test-->>Test: All CRDs confirmed removed
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 inconclusive)
✅ Passed checks (9 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jparrill The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/envtest/generator.go`:
- Around line 255-258: The test currently uses
Eventually(...).ShouldNot(Succeed()) which treats any error (timeouts, transient
API errors) as a success; change the Eventually assertion to explicitly check
for NotFound by using the same Get closure but returning whether
apierrors.IsNotFound(err) (or asserting
Expect(apierrors.IsNotFound(err)).To(BeTrue())) so the Eventually only succeeds
when the CRD is actually absent; update the closure that calls
k8sClient.Get(ctx, client.ObjectKeyFromObject(crd), crd) to use
apierrors.IsNotFound on the returned error (importing
k8s.io/apimachinery/pkg/api/errors as apierrors if needed).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: a79c7bcb-90a1-46b7-a7c3-f6f4d2aa2307
📒 Files selected for processing (1)
test/envtest/generator.go
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8261 +/- ##
=======================================
Coverage 35.61% 35.62%
=======================================
Files 767 767
Lines 93333 93330 -3
=======================================
Hits 33245 33245
+ Misses 57399 57396 -3
Partials 2689 2689 🚀 New features to boost your workflow:
|
|
/jira refresh |
|
@jparrill: This pull request references Jira Issue OCPBUGS-83585, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@jparrill: This pull request references Jira Issue OCPBUGS-83585, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
…flaky envtest failures GenerateCRDInstallTest uninstalls all CRDs after validation but does not wait for the API server to fully remove them. When individual per-suite tests start immediately after, they find a stale CRD in a transitional state, causing WaitForCRDs to timeout with "context deadline exceeded". Add a wait-for-removal loop after UninstallCRDs, using explicit apierrors.IsNotFound checks instead of ShouldNot(Succeed()) to avoid false positives from transient API errors. Also fix the same pattern in GenerateTestSuite's AfterEach for consistency. Fixes: https://issues.redhat.com/browse/OCPBUGS-83585 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Juan Manuel Parrilla Madrid <jparrill@redhat.com>
|
@jparrill: This pull request references Jira Issue OCPBUGS-83585, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jparrill: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/lgtm |
|
/verified by UnitTest |
|
@jparrill: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jparrill: Jira Issue Verification Checks: Jira Issue OCPBUGS-83585 Jira Issue OCPBUGS-83585 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
What this PR does / why we need it
GenerateCRDInstallTestuninstalls all CRDs after validation but does not wait for the API server to fully remove them. When individual per-suite tests start immediately after, they find a stale CRD in a transitional/deletion state, causingWaitForCRDsto timeout withcontext deadline exceeded.This adds a wait-for-removal loop after
UninstallCRDsinGenerateCRDInstallTest, matching the pattern already used inGenerateTestSuite'sAfterEach.Root Cause
The execution order in
suite_test.gois:GenerateCRDInstallTest("Default")— installs Default CRDsGenerateCRDInstallTest("TechPreviewNoUpgrade")— installs all TechPreviewNoUpgrade CRDs (includinghcpetcdbackups), uninstalls without waitinghcpetcdbackupsCustomNoUpgrade variant hits the stale CRDThe flakiness depends on how fast the API server finalizes CRD deletion, which varies by K8s version and CI load:
Which issue(s) this PR fixes
Fixes https://issues.redhat.com/browse/OCPBUGS-83585
Checklist
🤖 Generated with Claude Code
Summary by CodeRabbit