OCPBUGS-63048: Kube object asserts do not handle case where object is replaced #6059
dkhater-redhat wants to merge 4 commits into
Conversation
The DoesNotExist assertion methods only checked if an object with a given name returns NotFound error. This failed to detect when an object was deleted and replaced with a new object of the same name but different UID.

This commonly occurs in tests like TestRebuildAnnotationRestartsBuild where:
1. A MachineOSBuild/Job/Pod is deleted
2. A rebuild creates new objects with same names but different UIDs
3. DoesNotExist assertions see the new objects and keep polling

Fix by:
- Capturing the original object's UID before polling
- Checking both NotFound errors AND UID changes
- Treating UID mismatch as deletion (object was replaced)

Updated methods:
- PodDoesNotExist
- JobDoesNotExist
- SecretDoesNotExist
- ConfigMapDoesNotExist
- MachineOSBuildDoesNotExist
- MachineOSConfigDoesNotExist
- MachineConfigDoesNotExist
- ImageDoesNotExist

This fixes the flaky failures in TestRebuildAnnotationRestartsBuild.
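The check described above can be sketched in isolation. This is a minimal illustration using only the standard library, not the code in test/helpers/assertions.go: the `object`, `getter`, `fakeStore`, and `errNotFound` names are hypothetical stand-ins for a Kubernetes object, a typed client `Get` call, and a NotFound API error.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error (hypothetical).
var errNotFound = errors.New("not found")

// object is a hypothetical, minimal stand-in for a Kubernetes object.
type object struct {
	Name string
	UID  string
}

// getter fetches an object by name and returns errNotFound when it is absent.
type getter func(name string) (*object, error)

// isGone reports whether the object originally identified by originalUID no
// longer exists: either the lookup returns errNotFound, or a different object
// (different UID) now holds the same name.
func isGone(get getter, name, originalUID string) (bool, error) {
	obj, err := get(name)
	if errors.Is(err, errNotFound) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return obj.UID != originalUID, nil
}

// fakeStore is an in-memory map used only for this illustration.
type fakeStore map[string]*object

func (s fakeStore) get(name string) (*object, error) {
	obj, ok := s[name]
	if !ok {
		return nil, errNotFound
	}
	return obj, nil
}

func main() {
	store := fakeStore{"build-1": {Name: "build-1", UID: "uid-a"}}
	gone, _ := isGone(store.get, "build-1", "uid-a")
	fmt.Println("original present, gone:", gone)

	// Simulate delete-and-recreate under the same name.
	store["build-1"] = &object{Name: "build-1", UID: "uid-b"}
	gone, _ = isGone(store.get, "build-1", "uid-a")
	fmt.Println("replaced, gone:", gone)
}
```

A name-only check would report the replaced object as still present and keep polling until the timeout; comparing against the UID captured before polling lets the assertion finish as soon as the original object is gone.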
|
@dkhater-redhat: This pull request references Jira Issue OCPBUGS-63048, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Walkthrough
Adds UID-aware deletion checks to various test assertions, introduces a MachineOSBuild deletion finalizer and reconciler handling, and strengthens E2E test logging and polling around MOSB deletion and ImageStreamTag pruning.
Changes
UID-aware deletion + MOSB finalizer + E2E updates
Sequence Diagram(s)
sequenceDiagram
participant Controller as MachineOSBuildController
participant API as KubernetesAPI
participant IST as ImageStreamTag
participant CM as DigestConfigMap
Controller->>API: Observe MachineOSBuild (syncMachineOSBuild)
alt DeletionTimestamp set
Controller->>IST: Delete ImageStreamTag / artifacts
Controller->>CM: Delete digest ConfigMap
Controller->>API: Update MachineOSBuild to remove MachineOSBuildDeletionFinalizer
else Normal flow
Controller->>API: Ensure MachineOSBuildDeletionFinalizer present on object
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🧹 Nitpick comments (1)
test/helpers/assertions.go (1)
849-854: 💤 Low value
Remove dead conditional branch.
Both branches of the `if a.poll` check return the same value `(false, err)`, making the conditional unnecessary.
Suggested simplification
 // If there's another error, return it
 if err != nil {
-	if a.poll {
-		return false, err
-	}
 	return false, err
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/helpers/assertions.go` around lines 849 - 854, The conditional checking a.poll is dead because both branches return the same tuple; simplify by removing the if block and directly returning (false, err) where that code appears (the block referencing a.poll in assertions.go), i.e., replace the if err != nil { if a.poll { return false, err } return false, err } with a single if err != nil { return false, err } to eliminate redundant logic.
📒 Files selected for processing (1)
test/helpers/assertions.go
|
/test e2e-gcp-op-ocl-part1 |
|
/jira refresh |
|
@isabella-janssen: This pull request references Jira Issue OCPBUGS-63048, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
|
Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dkhater-redhat, isabella-janssen The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
TestMissingImageIsRebuilt was timing out waiting for ImageStreamTags to be pruned after deleting a MachineOSBuild. The test expects that when a MOSB is deleted via the API, its associated ImageStreamTag in the internal registry is automatically cleaned up.

Problem: ImageStreamTags are created by the build process (podman push) and cannot have the MOSB as an owner reference (different API group). When a MOSB is deleted externally (via kubectl or test), the controller never sees or handles the deletion, leaving orphaned ImageStreamTags in the registry. The existing deleteMOSBImage logic only runs when the controller itself initiates deletion (e.g., during rebuild), not when users delete MOSBs directly.

Solution: Add a custom finalizer (machine-os-build-deletion) to all MOSBs. When a MOSB is deleted:
1. Controller detects DeletionTimestamp in syncMachineOSBuild
2. Calls handleMachineOSBuildDeletion which:
   - Deletes the ImageStreamTag via deleteMOSBImage
   - Deletes the digest ConfigMap
   - Removes the finalizer
3. Kubernetes proceeds with MOSB deletion

This is the standard Kubernetes pattern for cleanup and ensures registry pruning works regardless of how the MOSB is deleted.

Fixes: TestMissingImageIsRebuilt timeout at image pruning assertion
|
New changes are detected. LGTM label has been removed. |
|
/test e2e-gcp-op-ocl-part1 |
|
/test unit |
The previous fix only added the finalizer to newly created MOSBs, but TestMissingImageIsRebuilt deletes MOSBs that were created earlier in the test run before our controller code executed. Add ensureMachineOSBuildFinalizer() which runs during syncMachineOSBuild to migrate existing MOSBs. This ensures all MOSBs have the deletion finalizer, whether they were created before or after the code change. This handles the migration path and ensures the test will pass even when deleting MOSBs created before the finalizer logic was added.
Added comprehensive logging to diagnose the image pruning issue:
1. Log finalizers on all three MOSBs when they're created and at key points (fetch, start, completion)
2. Log ImageStreamTag state before and after MOSB deletion to see if the finalizer is doing its job
3. Poll the MOSB during deletion to see DeletionTimestamp and finalizer removal progress
4. Log exact timing of when MOSB is fully deleted vs when IST should be pruned

This will help us understand:
- Whether the finalizer is being added to MOSBs
- Whether the finalizer handler is running
- Whether the ImageStreamTag is being deleted by the finalizer
- If there's a timing issue between MOSB deletion and IST pruning

The logs will show the full lifecycle and help identify where the pruning is failing.
|
/test e2e-gcp-op-ocl-part1 |
|
@dkhater-redhat: This pull request references Jira Issue OCPBUGS-63048, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controller/build/reconciler.go`:
- Around line 1024-1034: The finalizer is being removed even when cleanup fails;
change the delete flow in reconciler.go so that you only remove the
MachineOSBuild finalizer when all cleanup steps succeeded: check the return from
deleteMOSBImage(mosb) and from kubeclient.CoreV1().ConfigMaps(...).Delete(...)
(treat IsNotFound as success) and if any non-tolerated error occurs return that
error (or requeue) instead of proceeding to clear the finalizer; apply the same
guard for the other cleanup block around lines 1045-1054 so finalizer removal
happens only after successful image and configmap deletion.
- Around line 1195-1198: The current code only ensures the deletion finalizer
via ensureMachineOSBuildFinalizer during sync (migration), which leaves race
windows where newly-created MachineOSBuilds (created in normal/rebuild create
paths) can be deleted before their first reconcile and skip
handleMachineOSBuildDeletion; to fix, set
constants.MachineOSBuildDeletionFinalizer on every MachineOSBuild object
immediately before any client.Create call that creates a MachineOSBuild (i.e.,
in the normal create and rebuild create paths/functions that construct and call
Create for MachineOSBuild), removing reliance on the migration-only
ensureMachineOSBuildFinalizer for new objects and keeping that helper only for
existing-resource migration.
- Around line 981-986: The code mutates an informer-backed MachineOSBuild (mosb)
in-place when adding finalizers in
ensureMachineOSBuildFinalizer/handleMachineOSBuildDeletion; instead, obtain a
copy before mutating by calling getMachineOSBuildForUpdate(ctx, mosb.Name)
(which performs DeepCopy) or by using mosb = mosb.DeepCopy(), then call
mosb.SetFinalizers(append(..., constants.MachineOSBuildDeletionFinalizer)) and
pass that copy to
b.mcfgclient.MachineconfigurationV1().MachineOSBuilds().Update(...); update
syncMachineOSBuild/ensureMachineOSBuildFinalizer to operate on the copied object
to avoid mutating the informer cache.
In `@test/e2e-ocl-1of2/onclusterlayering_test.go`:
- Around line 271-289: The deletion poll treats NotFound as the only success
condition and can misinterpret a recreated MachineOSBuild with the same name as
the original; modify the wait.PollImmediate closure that calls
cs.GetMcfgclient().MachineconfigurationV1().MachineOSBuilds().Get to also check
the retrieved object's UID against the original thirdMOSB.UID (or reuse the
existing waitForMOSBToBeDeleted helper) so the poll returns success when the
original thirdMOSB's UID is absent/equivalently when a returned MOSB has a
different UID, logging the UID mismatch and exiting early instead of waiting the
full timeout.
📒 Files selected for processing (3)
pkg/controller/build/constants/constants.go
pkg/controller/build/reconciler.go
test/e2e-ocl-1of2/onclusterlayering_test.go
// Add the finalizer
klog.V(4).Infof("Adding deletion finalizer to MachineOSBuild %q", mosb.Name)
mosb.SetFinalizers(append(mosb.GetFinalizers(), constants.MachineOSBuildDeletionFinalizer))

// Update the MOSB
_, err := b.mcfgclient.MachineconfigurationV1().MachineOSBuilds().Update(ctx, mosb, metav1.UpdateOptions{})
Avoid mutating informer-backed MachineOSBuild objects in place.
syncMachineOSBuild receives mosb from the lister (b.machineOSBuildLister.List(labels.Everything())), then ensureMachineOSBuildFinalizer and handleMachineOSBuildDeletion mutate it with SetFinalizers(...) before the API update. This mutates the shared cache entry, creating stale-cache side effects and potential races. Use getMachineOSBuildForUpdate() (which already exists and does a DeepCopy) or call DeepCopy() explicitly before modifying finalizers.
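The hazard can be illustrated with a small stand-alone sketch. It does not use client-go: `build`, `cache`, and `withFinalizer` are hypothetical names, and the hand-written `DeepCopy` merely mimics what generated deep-copy methods do for Kubernetes API types.

```go
package main

import "fmt"

// build is a hypothetical stand-in for a MachineOSBuild.
type build struct {
	Name       string
	Finalizers []string
}

// DeepCopy returns a copy that shares no memory with the receiver.
func (b *build) DeepCopy() *build {
	out := *b
	out.Finalizers = append([]string(nil), b.Finalizers...)
	return &out
}

// cache is a hypothetical stand-in for an informer's shared store: Get hands
// out the stored pointer itself, not a copy.
type cache map[string]*build

func (c cache) Get(name string) *build { return c[name] }

// withFinalizer returns a copy of b with the finalizer appended, leaving the
// object that came from the cache untouched.
func withFinalizer(b *build, finalizer string) *build {
	updated := b.DeepCopy()
	updated.Finalizers = append(updated.Finalizers, finalizer)
	return updated
}

func main() {
	c := cache{"mosb-1": {Name: "mosb-1"}}
	updated := withFinalizer(c.Get("mosb-1"), "machine-os-build-deletion")

	fmt.Println("cached finalizers:", c.Get("mosb-1").Finalizers)
	fmt.Println("copy finalizers:  ", updated.Finalizers)
}
```

Only the copy is sent in the update request; the cached entry changes later, when the informer observes the resulting update event.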
// Delete the ImageStreamTag
if err := b.deleteMOSBImage(ctx, mosb, moscName); err != nil {
	// Log the error but don't block deletion - the image might already be gone
	klog.Warningf("Failed to delete image for MachineOSBuild %q during deletion: %v", mosb.Name, err)
}

// Delete the digest configmap if it exists
err = b.kubeclient.CoreV1().ConfigMaps(ctrlcommon.MCONamespace).Delete(ctx, utils.GetDigestConfigMapName(mosb), metav1.DeleteOptions{})
if err != nil && !k8serrors.IsNotFound(err) {
	klog.Warningf("Failed to delete digest configmap for MachineOSBuild %q during deletion: %v", mosb.Name, err)
}
Don’t remove the finalizer after a cleanup failure.
Any non-tolerated error from deleteMOSBImage or the digest ConfigMap delete means cleanup did not finish, but this code only logs and still clears the finalizer. That can permanently leak artifacts and defeats the ordering guarantee the finalizer is meant to provide.
💡 Suggested change
 // Delete the ImageStreamTag
-if err := b.deleteMOSBImage(ctx, mosb, moscName); err != nil {
-	// Log the error but don't block deletion - the image might already be gone
-	klog.Warningf("Failed to delete image for MachineOSBuild %q during deletion: %v", mosb.Name, err)
-}
+if err := b.deleteMOSBImage(ctx, mosb, moscName); err != nil {
+	return fmt.Errorf("failed to delete image for MachineOSBuild %q during deletion: %w", mosb.Name, err)
+}
 // Delete the digest configmap if it exists
 err = b.kubeclient.CoreV1().ConfigMaps(ctrlcommon.MCONamespace).Delete(ctx, utils.GetDigestConfigMapName(mosb), metav1.DeleteOptions{})
 if err != nil && !k8serrors.IsNotFound(err) {
-	klog.Warningf("Failed to delete digest configmap for MachineOSBuild %q during deletion: %v", mosb.Name, err)
+	return fmt.Errorf("failed to delete digest configmap for MachineOSBuild %q during deletion: %w", mosb.Name, err)
 }
Also applies to: 1045-1054
// Ensure finalizer is present on existing MOSBs (migration for pre-existing builds)
if err := b.ensureMachineOSBuildFinalizer(ctx, mosb); err != nil {
	return fmt.Errorf("could not ensure finalizer on MachineOSBuild %q: %w", mosb.Name, err)
}
Attach the deletion finalizer before every MachineOSBuild create.
Adding it here during sync still leaves the normal/rebuild create paths with a race: a freshly-created MachineOSBuild can be deleted before its first reconcile, handleMachineOSBuildDeletion sees no finalizer, and the image/configmap cleanup is skipped entirely. Set constants.MachineOSBuildDeletionFinalizer on the object before each Create, and keep this helper for migration only.
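The race can be made concrete with a toy model of the API server's delete behavior. Everything here is hypothetical (`fakeAPI`, `build`, `newBuild`) apart from the finalizer name; the one Kubernetes fact the sketch relies on is that an object with finalizers is only marked for deletion, while an object without finalizers is removed immediately.

```go
package main

import "fmt"

// Finalizer name as given in the PR's commit message.
const deletionFinalizer = "machine-os-build-deletion"

// build is a hypothetical stand-in for a MachineOSBuild.
type build struct {
	Name       string
	Finalizers []string
	Deleting   bool // stands in for a non-nil DeletionTimestamp
}

// newBuild constructs a build that already carries the deletion finalizer,
// so no window exists between Create and the first reconcile.
func newBuild(name string) *build {
	return &build{Name: name, Finalizers: []string{deletionFinalizer}}
}

// fakeAPI is a toy model of the API server's storage.
type fakeAPI struct {
	objects map[string]*build
}

func (a *fakeAPI) Create(b *build) { a.objects[b.Name] = b }

// Delete reports whether the object was removed outright. Objects that still
// have finalizers are only marked as deleting and stay in storage.
func (a *fakeAPI) Delete(name string) bool {
	b, ok := a.objects[name]
	if !ok {
		return false
	}
	if len(b.Finalizers) > 0 {
		b.Deleting = true
		return false
	}
	delete(a.objects, name)
	return true
}

func main() {
	api := &fakeAPI{objects: map[string]*build{}}

	// Created without the finalizer and deleted before any reconcile:
	// the object vanishes and cleanup never gets a chance to run.
	api.Create(&build{Name: "bare"})
	fmt.Println("bare removed immediately:", api.Delete("bare"))

	// Created with the finalizer already set: deletion waits for cleanup.
	api.Create(newBuild("guarded"))
	fmt.Println("guarded removed immediately:", api.Delete("guarded"))
	fmt.Println("guarded pending deletion:", api.objects["guarded"].Deleting)
}
```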
// Poll to see the MOSB state during deletion
t.Logf("Waiting for MachineOSBuild %q to be fully deleted...", thirdMOSB.Name)
deletionStart := time.Now()
pollErr := wait.PollImmediate(time.Second*2, time.Minute*5, func() (bool, error) {
	mosb, err := cs.GetMcfgclient().MachineconfigurationV1().MachineOSBuilds().Get(ctx, thirdMOSB.Name, metav1.GetOptions{})
	if k8serrors.IsNotFound(err) {
		t.Logf("MachineOSBuild %q is now fully deleted (took %s)", thirdMOSB.Name, time.Since(deletionStart))
		return true, nil
	}
	if err != nil {
		return false, err
	}
	if mosb.DeletionTimestamp != nil {
		t.Logf("MachineOSBuild %q has DeletionTimestamp=%s, finalizers=%v (waiting for finalizer cleanup...)",
			thirdMOSB.Name, mosb.DeletionTimestamp, mosb.Finalizers)
	}
	return false, nil
})
require.NoError(t, pollErr)
Keep this deletion poll UID-aware as well.
This loop only treats NotFound as success. If the original thirdMOSB is deleted and another MachineOSBuild with the same name appears, the poll waits the full 5 minutes even though the target object is gone—the same failure mode this PR is fixing elsewhere. Compare against thirdMOSB.UID, or reuse waitForMOSBToBeDeleted.
💡 Suggested change
 t.Logf("Waiting for MachineOSBuild %q to be fully deleted...", thirdMOSB.Name)
 deletionStart := time.Now()
+expectedUID := thirdMOSB.UID
 pollErr := wait.PollImmediate(time.Second*2, time.Minute*5, func() (bool, error) {
 	mosb, err := cs.GetMcfgclient().MachineconfigurationV1().MachineOSBuilds().Get(ctx, thirdMOSB.Name, metav1.GetOptions{})
 	if k8serrors.IsNotFound(err) {
 		t.Logf("MachineOSBuild %q is now fully deleted (took %s)", thirdMOSB.Name, time.Since(deletionStart))
 		return true, nil
 	}
 	if err != nil {
 		return false, err
 	}
+	if mosb.UID != expectedUID {
+		t.Logf("MachineOSBuild %q was replaced with UID %s after %s", thirdMOSB.Name, mosb.UID, time.Since(deletionStart))
+		return true, nil
+	}
 	if mosb.DeletionTimestamp != nil {
 		t.Logf("MachineOSBuild %q has DeletionTimestamp=%s, finalizers=%v (waiting for finalizer cleanup...)",
 			thirdMOSB.Name, mosb.DeletionTimestamp, mosb.Finalizers)
 	}
 	return false, nil
 })
|
@dkhater-redhat: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |