Wait for new hotplug attachment pod to be ready #10275
Conversation
Before deleting the old attachment pod, wait for the new attachment pod to be ready, so k8s will not detach the volume from the node: from its perspective there is always a pod using the volume.

Fixed an issue where, when adding or removing a volume, the existing volumes would still have the UID of the old attachment pod in the VMI status, which caused errors in the virt-handler logs about not being able to find the device or image.

Fixed an issue where the cleanup would attempt to remove a volume that was already gone, causing errors in the virt-handler log.

Signed-off-by: Alexander Wels <awels@redhat.com>
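For illustration, the reordered flow is roughly the following. This is a sketch only; the function and parameter names are assumed for the example and are not the actual virt-controller API.

```go
// Sketch of the new ordering: create the new attachment pod, wait until it is
// ready, and only then delete the old one. Helper signatures are hypothetical.
func handleAttachmentPodSwap(createNewPod func() (ready bool, err error), deleteOldPod func() error) error {
	ready, err := createNewPod() // ensure the new attachment pod exists
	if err != nil {
		return err
	}
	if !ready {
		// Not ready yet: requeue and retry later. The old pod keeps the volume
		// attached to the node in the meantime, so k8s never sees it unused.
		return nil
	}
	// The new pod is ready, so deleting the old attachment pod is safe.
	return deleteOldPod()
}
```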
Looks like I have some flaky tests, investigating
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidvossel The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold
Feel free to remove the hold once the flaky test is sorted out.
Signed-off-by: Alexander Wels <awels@redhat.com>
Overall it looks good, I just left a couple small comments/nits.
pkg/virt-controller/watch/vmi.go
Outdated
@@ -1816,15 +1812,29 @@ func (c *VMIController) waitForFirstConsumerTemporaryPods(vmi *virtv1.VirtualMac
 }

 func (c *VMIController) needsHandleHotplug(hotplugVolumes []*virtv1.Volume, hotplugAttachmentPods []*k8sv1.Pod) bool {
 	if len(hotplugAttachmentPods) > 1 {
 		return true
 	}
 	// Determine if the ready volumes have changed compared to the current pod
 	for _, attachmentPod := range hotplugAttachmentPods {
Since we are returning when len(hotplugAttachmentPods) > 1, is this loop still necessary? Since the number of pods in hotplugAttachmentPods at this point would only be either 0 or 1, you might be able to change this to an if/else (to check if the length is 0 or 1), rather than a loop that only runs a single iteration.
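For example, something along these lines could replace the loop. This is a sketch assuming the early return above leaves either 0 or 1 pods; readyVolumesChanged is a hypothetical stand-in for the existing comparison logic.

```go
if len(hotplugAttachmentPods) == 0 {
	// No attachment pod exists yet; hotplug needs handling if there are volumes to attach.
	return len(hotplugVolumes) > 0
}
// Exactly one pod: handle hotplug if its ready volumes differ from the desired set.
return readyVolumesChanged(hotplugAttachmentPods[0], hotplugVolumes) // hypothetical helper
```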
Done
-	} else {
+	} else if err != nil {
 		return err
 	}
 	return nil
nit: this can be kept as is (if err is nil, it will return nil anyway):

} else {
	return err
}
Done
Signed-off-by: Alexander Wels <awels@redhat.com>
/retest-required
/test pull-kubevirt-e2e-k8s-1.26-sig-compute
/test pull-kubevirt-e2e-k8s-1.26-sig-storage
Co-authored-by: Vasiliy Ulyanov <vulyanov@suse.de> Signed-off-by: Alexander Wels <awels@redhat.com>
Agreed. I ran through a bunch of tests and I believe you are right; not sure what we were testing there, but "already exists" cannot happen.
/lgtm
/test pull-kubevirt-e2e-kind-1.27-vgpu
/cherrypick release-1.0
@awels: new pull request created: #10292 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Similar design to PR #6728.
Yes, with one difference: we ensure the new pod is in the running state before deleting the old pod. IIRC that PR didn't do that. #10567 is a follow-up improvement that adds a rate limit on how fast we switch attachment pods.
What this PR does / why we need it:
Before deleting the old attachment pod, wait for the new attachment pod to be ready, so k8s will not detach the volume from the node, since from its perspective there is always a pod using the volume. This should prevent issues with certain CSI drivers that don't respect the device being in use before detaching it from the node. It will, however, break CSI drivers that don't allow multiple pods on the same node to use the volume (which is a bug in those drivers).

Fixed an issue where, when adding or removing a volume, the existing volumes would still have the UID of the old attachment pod in the VMI status, which caused errors in the virt-handler logs about not being able to find the device or image. This can be observed as SyncFailed events on the VMI.

Fixed an issue where the cleanup would attempt to remove a volume that was already gone, causing errors in the virt-handler log. This can be observed as SyncFailed events on the VMI.
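As an illustration of the "ready" check on the new attachment pod, a minimal version could look like the snippet below. This is a sketch using standard corev1 pod conditions, not the actual virt-controller implementation.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
)

// attachmentPodIsReady is a hypothetical helper: a pod counts as ready once it
// is Running and reports the Ready condition as true.
func attachmentPodIsReady(pod *corev1.Pod) bool {
	if pod.Status.Phase != corev1.PodRunning {
		return false
	}
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue {
			return true
		}
	}
	return false
}
```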
Which issue(s) this PR fixes (optional, in fixes #<issue_number> format; will close the issue(s) when the PR gets merged):
Fixes #
Special notes for your reviewer:
Release note: