
Fix hotplug volume I/O error when using an iSCSI-protocol CSI plugin #6728

Conversation

watermelon-brother

What this PR does / why we need it:
Now, virt-controller deletes the old attachment pod only once the new attachment pod is running. This way, deleting the old attachment pod can no longer break the volume's iSCSI connection.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #6564

Special notes for your reviewer:

Release note:
None
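The ordering described in the PR summary can be sketched as a toy reconcile step. This is a minimal illustrative model, not the actual KubeVirt controller code; the AttachmentPod type, phases, and reconcile function are simplified stand-ins:

```go
package main

import "fmt"

// Phase is a simplified stand-in for the Kubernetes pod phase.
type Phase string

const (
	Pending Phase = "Pending"
	Running Phase = "Running"
)

// AttachmentPod is a toy model of a hotplug attachment pod.
type AttachmentPod struct {
	Name  string
	Phase Phase
}

// reconcile returns the names of attachment pods that are safe to
// delete. The old pod is deleted only once the new pod is Running,
// so the iSCSI connection held by the old pod is never dropped
// before the new pod has taken over.
func reconcile(oldPod, newPod *AttachmentPod) []string {
	var toDelete []string
	if oldPod != nil && newPod != nil && newPod.Phase == Running {
		toDelete = append(toDelete, oldPod.Name)
	}
	return toDelete
}

func main() {
	oldPod := &AttachmentPod{Name: "hp-volume-old", Phase: Running}
	newPod := &AttachmentPod{Name: "hp-volume-new", Phase: Pending}

	// New pod not yet Running: the old pod must be kept.
	fmt.Println(reconcile(oldPod, newPod))

	// Once the new pod is Running, the old one can go.
	newPod.Phase = Running
	fmt.Println(reconcile(oldPod, newPod))
}
```

The point of the ordering is simply that there is always at least one pod holding the volume's connection open.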

@kubevirt-bot
Contributor

@watermelon-brother: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 4, 2021
@kubevirt-bot kubevirt-bot added size/M needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 4, 2021
@kubevirt-bot
Contributor

Hi @watermelon-brother. Thanks for your PR.

I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign mazzystr after the PR has been reviewed.
You can assign the PR to them by writing /assign @mazzystr in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot
Contributor

@watermelon-brother: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@maya-r
Contributor

maya-r commented Nov 4, 2021

/ok-to-test

@kubevirt-bot kubevirt-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 4, 2021
@maya-r
Contributor

maya-r commented Nov 4, 2021

fyi @awels

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Nov 4, 2021
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Nov 4, 2021
Now, virt-controller can delete the old attachment pod when the new attachment pod is running.

Signed-off-by: wujixin <599230270@qq.com>
@alicefr
Member

alicefr commented Nov 5, 2021

/retest

2 similar comments
@watermelon-brother
Author

/retest

@watermelon-brother
Author

/retest

@watermelon-brother
Author

watermelon-brother commented Nov 10, 2021

The tide stage of the pipeline is not passing. How can I solve it? @awels

@alicefr
Member

alicefr commented Nov 10, 2021

@watermelon-brother you need approval and an lgtm from reviewers to get this merged.

@awels do we want this solution? Should we try it with another CSI plugin that does not accept multiple pods accessing a RWO volume?

// Create new attachment pod that holds all the ready volumes
if err := c.createAttachmentPod(vmi, virtLauncherPod, readyHotplugVolumes); err != nil {
	return err
}
switch len(currentPod) {
Member


Any reason you used a switch instead of if len(currentPod) == 0 { ... } else { ... }?
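For comparison, the if/else form the reviewer suggests could look like the sketch below. This is purely illustrative: the branch bodies are placeholder strings, not the PR's actual logic, which operates on the real pod list:

```go
package main

import "fmt"

// nextAction picks a branch on the number of current attachment pods.
// With only two meaningful cases (none vs. at least one), an if/else
// reads more directly than a switch.
func nextAction(currentPodCount int) string {
	if currentPodCount == 0 {
		return "create attachment pod"
	}
	return "handle existing attachment pod"
}

func main() {
	fmt.Println(nextAction(0))
	fmt.Println(nextAction(1))
}
```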

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 20, 2022
@awels
Member

awels commented Mar 21, 2022

I think the solution is fine; I am just curious about the switch instead of an if/else, since there are only two options.

@DMajrekar

DMajrekar commented Mar 28, 2022

I've been testing this change with OpenEBS/Mayastor-backed hot plugged volumes, and this PR introduces an undesired side effect.

I've been testing with both block and file backed volumes, and executed the following steps:

  1. Create a VM
  2. Attach a first hot plug device block-test
  3. Partition / Format the device within the VM: sda
  4. Attach a second hot plug device block-test-2
  5. Partition / Format the device within the VM: sdb
  6. unmount the first device, sda from the VM
  7. Hot plug remove the device block-test from the VM (virtctl removevolume volume-attach --volume-name block-test)
  8. Check the mount points within the VM, and notice that sda has been replaced with sdb, leaving sdb in an I/O error state

Looking at the virt-launcher logs for this VM, I'm seeing:

{"component":"virt-launcher","kind":"","level":"info","msg":"Attaching disk block-test, target sda","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:929","timestamp":"2022-03-28T18:12:50.396629Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Attaching disk block-test-2, target sdb","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:929","timestamp":"2022-03-28T18:13:16.131228Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}


{"component":"virt-launcher","kind":"","level":"info","msg":"Detaching disk block-test, target sda","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:908","timestamp":"2022-03-28T18:13:48.450685Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Detaching disk block-test-2, target sdb","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:908","timestamp":"2022-03-28T18:13:48.567696Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Attaching disk block-test-2, target sdb","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:929","timestamp":"2022-03-28T18:13:48.747398Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}

I have, however, been experiencing the same problems as seen in #6564 with OpenEBS/Mayastor, which relies on NVMe over Fabrics rather than iSCSI.

Working through the issue, we have identified that the 2nd and 3rd hot plug pods are missing the volumeDevices section for previously mounted volumes. This causes the Node CSI driver to perform an UNSTAGE of the volume (effectively removing the block device from the underlying compute node).

As a quick hack, I have forced subsequent hot plug pods to maintain the volumeDevices spec with the following patch:

diff --git a/pkg/virt-controller/services/template.go b/pkg/virt-controller/services/template.go
index b1e102efa..892598387 100644
--- a/pkg/virt-controller/services/template.go
+++ b/pkg/virt-controller/services/template.go
@@ -1622,7 +1622,7 @@ func (t *templateService) RenderHotplugAttachmentPodTemplate(volumes []*v1.Volum
                }
                skipMount := false
                if hotplugVolumeStatusMap[volume.Name] == v1.VolumeReady || hotplugVolumeStatusMap[volume.Name] == v1.HotplugVolumeMounted {
-                       skipMount = true
+                       skipMount = false
                }
                pod.Spec.Volumes = append(pod.Spec.Volumes, k8sv1.Volume{
                        Name: volume.Name,

This now means the hot plug path is UNPUBLISHED but not UNSTAGED from the underlying compute node, so I/O can still flow. It also does not show the behaviour I described at the start, where the guest VM sees both devices detached and then a single device reattached.

These are the virt-launcher logs with my patch:

{"component":"virt-launcher","kind":"","level":"info","msg":"Attaching disk block-test, target sda","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:929","timestamp":"2022-03-28T19:49:43.118806Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Attaching disk block-test-2, target sdb","name":"volume-attach-2438-4bcde4","namespace":"cust-ba6bc6b1-808d-default","pos":"manager.go:929","timestamp":"2022-03-28T19:49:59.882321Z","uid":"ba639b80-983a-4304-b491-fdbc27d449bb"}

I will write this up in a new issue tomorrow, but at the moment, I feel this PR is not safe to merge.
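The stage/publish distinction described above can be modeled with a simple reference count: a CSI node driver generally unstages a volume (tearing the block device down on the node) only when no pod on that node still publishes it. The following is a toy model for illustration only, not Mayastor, KubeVirt, or actual CSI driver code; the method names merely echo the NodeStageVolume/NodePublishVolume call names from the CSI spec:

```go
package main

import "fmt"

// node models per-volume CSI state on a single compute node.
type node struct {
	staged     map[string]bool // volume staged (block device attached to node)
	publishers map[string]int  // number of pods currently publishing the volume
}

func newNode() *node {
	return &node{staged: map[string]bool{}, publishers: map[string]int{}}
}

// Publish stages the volume on first use, then publishes it to a pod.
func (n *node) Publish(vol string) {
	if !n.staged[vol] {
		n.staged[vol] = true // analogous to NodeStageVolume
	}
	n.publishers[vol]++ // analogous to NodePublishVolume
}

// Unpublish removes one pod's binding; the volume is unstaged only
// when no other pod on the node still references it.
func (n *node) Unpublish(vol string) {
	n.publishers[vol]--
	if n.publishers[vol] == 0 {
		n.staged[vol] = false // analogous to NodeUnstageVolume
	}
}

func main() {
	n := newNode()
	n.Publish("block-test") // old attachment pod references the volume
	n.Publish("block-test") // new attachment pod keeps a reference
	n.Unpublish("block-test")
	// The device stays staged because another pod still publishes it,
	// so guest I/O keeps flowing.
	fmt.Println(n.staged["block-test"])
}
```

In this framing, the reported bug corresponds to the new attachment pod never taking its reference (the missing volumeDevices entry), so the old pod's unpublish drops the count to zero and triggers an unstage.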

@kubevirt-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kubevirt-bot kubevirt-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 27, 2022
@kubevirt-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@kubevirt-bot
Contributor

@kubevirt-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
