
Disk hot-add after vm.restore causes I/O errors in guest #6265

Closed
praveen-pk opened this issue Mar 4, 2024 · 20 comments · Fixed by #6326
Labels
bug Something isn't working

Comments

@praveen-pk
Contributor

Describe the bug

Hot-adding a disk after vm.restore causes I/O errors in the guest.

To Reproduce
Steps to reproduce the behaviour:

cloud-hypervisor --version
cloud-hypervisor v38.0-73-gd245e624

export API_SOCKET="/tmp/ch-socket"
export SNAPSHOT_DIR="${HOME}/ch-snapshot"

# Start the VM
/usr/bin/cloud-hypervisor --api-socket ${API_SOCKET} --kernel hypervisor-fw \
--disk path=focal-server-cloudimg-amd64.raw \
--cpus boot=2 --memory size=1024M --net tap=,mac=,ip=,mask= --serial tty --console off --seccomp log

# Pause and snapshot
rm -rf ${SNAPSHOT_DIR}/*
ch-remote --api-socket=${API_SOCKET} pause
ch-remote --api-socket=${API_SOCKET} snapshot file://${SNAPSHOT_DIR}

# clean up/kill the remnant ch process

/usr/bin/cloud-hypervisor \
    --api-socket ${API_SOCKET} \
    --restore source_url=file://${SNAPSHOT_DIR}

# resume the VM
ch-remote --api-socket=${API_SOCKET} resume


# Guest works just fine at this point

testuser@guest:~$ ls /dev/vd*
/dev/vda  /dev/vda1  /dev/vda14  /dev/vda15
testuser@guest:~$ uname -r
5.4.0-172-generic


# Create a test file
dd if=/dev/zero of=/tmp/test.img bs=1M count=10

# Hot add the file to the VM
$ ch-remote --api-socket=${API_SOCKET} add-disk path=/tmp/test.img 
{"id":"_disk2","bdf":"0000:00:04.0"}

Once the new disk is hot-added, I see the following failure messages in the guest's dmesg:

[  192.747265] pci 0000:00:04.0: [1af4:1042] type 00 class 0x018000
[  192.747619] pci 0000:00:04.0: reg 0x10: [mem 0xe7f00000-0xe7f7ffff]
[  192.752451] pci 0000:00:04.0: BAR 0: assigned [mem 0xc0000000-0xc007ffff]
[  192.752725] virtio-pci 0000:00:04.0: enabling device (0000 -> 0002)
[  192.755663] virtio_blk virtio3: [vdb] 20480 512-byte logical blocks (10.5 MB/10.0 MiB)
[  192.782113] blk_update_request: I/O error, dev vda, sector 3545088 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.784561] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 443137)
[  192.784566] Buffer I/O error on device vda1, logical block 414720
[  192.786015] blk_update_request: I/O error, dev vda, sector 464840 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.788175] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58106)
[  192.788176] Buffer I/O error on device vda1, logical block 29689
[  192.789402] blk_update_request: I/O error, dev vda, sector 465048 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.791434] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58132)
[  192.791435] Buffer I/O error on device vda1, logical block 29715
[  192.792628] blk_update_request: I/O error, dev vda, sector 465280 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.798882] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58161)
[  192.798883] Buffer I/O error on device vda1, logical block 29744
[  192.800088] blk_update_request: I/O error, dev vda, sector 465440 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.802139] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58181)
[  192.802141] Buffer I/O error on device vda1, logical block 29764
[  192.803346] blk_update_request: I/O error, dev vda, sector 465528 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.805393] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58192)
[  192.805395] Buffer I/O error on device vda1, logical block 29775
[  192.806760] blk_update_request: I/O error, dev vda, sector 465592 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.808775] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58200)
[  192.808776] Buffer I/O error on device vda1, logical block 29783
[  192.809984] blk_update_request: I/O error, dev vda, sector 465672 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.813041] EXT4-fs warning (device vda1): ext4_end_bio:311: I/O error 10 writing to inode 72950 (offset 0 size 0 starting block 58210)
[  192.813043] Buffer I/O error on device vda1, logical block 29793
[  192.814363] blk_update_request: I/O error, dev vda, sector 465800 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.816262] Buffer I/O error on device vda1, logical block 29809
[  192.817365] blk_update_request: I/O error, dev vda, sector 465856 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[  192.819365] Buffer I/O error on device vda1, logical block 29816

At that point, the guest is unusable.

Build Flags

cargo build

Guest OS version details:

Ubuntu Focal
$ uname -r
5.4.0-172-generic

Host OS version details:

Fedora 39 
$ uname -r
6.8.0-rc1+

Logs

Output of cloud-hypervisor -v from either standard error or via --log-file:

cloud-hypervisor: 648.738146ms: <vmm> WARN:hypervisor/src/kvm/mod.rs:2122 -- Detected faulty MSR 0x4b564d04 while setting MSRs
cloud-hypervisor: 649.853501ms: <vmm> WARN:hypervisor/src/kvm/mod.rs:2122 -- Detected faulty MSR 0x4b564d04 while setting MSRs
@liuw
Member

liuw commented Mar 6, 2024

What happens if you just hot-add the same disk to the same guest without the steps to save/restore it?

@peng6662001
Contributor

I reproduced this error on Ampere Altra. /dev/vdc comes up if the same disk is added to the same guest without the save/restore steps.

@liuw
Member

liuw commented Mar 14, 2024

I reproduced this error on Ampere Altra. /dev/vdc comes up if the same disk is added to the same guest without the save/restore steps.

I am not sure if this is the same thing though. If you hot-add the same image twice I can see why there will be inconsistency.

I think Praveen created a new (different) image to be added as vdb.

@liuw
Member

liuw commented Mar 15, 2024

I cannot reproduce this. I'm using the same focal image. The guest kernel version is the same. After the new disk is plugged in, it doesn't show up in the guest.

@peng6662001
Contributor

I think Praveen created a new (different) image to be added as vdb.

/dev/vdb is the cloud-init disk on my VM; /dev/vdc is test.img.

My scripts to reproduce the bug:

1_start_vm.sh
#!/bin/bash -x
rm -rf /tmp/ch-socket1

# Start the VM
cloud-hypervisor --api-socket /tmp/ch-socket1 --firmware /root/workloads/CLOUDHV_EFI.fd \
    --disk path=/home/dom/images/ubuntu22.04.raw path=/home/dom/images/cloudinit \
    --cpus boot=2 --memory size=1024M --net tap=,mac=,ip=,mask= --seccomp log
2_remote_vm.sh
#!/bin/bash -x
rm -rf ${PWD}/ch-snapshot
mkdir ${PWD}/ch-snapshot
ch-remote --api-socket=/tmp/ch-socket1 pause
ch-remote --api-socket=/tmp/ch-socket1 snapshot file://${PWD}/ch-snapshot
3_restore_vm.sh
#!/bin/bash -x
rm -rf /tmp/ch-socket2
cloud-hypervisor --api-socket /tmp/ch-socket2 --restore source_url=file://${PWD}/ch-snapshot
4_restore_new_vm.sh
#!/bin/bash -x
ch-remote --api-socket=/tmp/ch-socket2 resume
dd if=/dev/zero of=/tmp/test.img bs=1M count=10
ch-remote --api-socket=/tmp/ch-socket2 add-disk path=/tmp/test.img

@praveen-pk
Contributor Author

What happens if you just hot-add the same disk to the same guest without the steps to save/restore it?

As this case is just a regular hot-add, that works fine without any errors.

@liuw
Member

liuw commented Mar 19, 2024

So this only happens with a recent kernel. My original kernel is 5.4.0-43-generic. That kernel didn't even hot-plug the disk correctly. The recent 5.4.0-173-generic kernel can trigger this issue.

@liuw
Member

liuw commented Mar 19, 2024

Heh, after the disk is hot-added, the old device threads quit. :-/

Before and after:

  Id   Target Id                                             Frame
* 1    Thread 0x7f2b15e03800 (LWP 1062563) "cloud-hyperviso" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=1062564, futex_word=0x7f2b15e02990) at ./nptl/futex-internal.c:57
  2    Thread 0x7f2b15e026c0 (LWP 1062564) "vmm"             0x00007f2b15f0efc6 in epoll_wait (epfd=16, events=0x7f2b100041c0, maxevents=100, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7f2b15c016c0 (LWP 1062565) "http-server"     0x00007f2b15f0efc6 in epoll_wait (epfd=12, events=0x7f2b15bffb60, maxevents=12, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  4    Thread 0x7f2b15a006c0 (LWP 1062566) "vmm_signal_hand" __libc_recv (flags=<optimized out>, len=1, buf=0x7f2b159ffb28, fd=19) at ../sysdeps/unix/sysv/linux/recv.c:28
  5    Thread 0x7f2b157ff6c0 (LWP 1062569) "serial-manager"  0x00007f2b15f0efc6 in epoll_wait (epfd=61, events=0x7f2b157fecb4, maxevents=3, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  6    Thread 0x7f2b155f76c0 (LWP 1062570) "_disk0_q0"       syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  7    Thread 0x7f2b153f36c0 (LWP 1062571) "_net1_ctrl"      syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  8    Thread 0x7f2b151f26c0 (LWP 1062572) "_net1_qp0"       syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  9    Thread 0x7f2b14fee6c0 (LWP 1062574) "__rng"           syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  10   Thread 0x7f2b14de76c0 (LWP 1062575) "vcpu0"           syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  11   Thread 0x7f2b14be36c0 (LWP 1062576) "vcpu1"           syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  Id   Target Id                                             Frame
  1    Thread 0x7f2b15e03800 (LWP 1062563) "cloud-hyperviso" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=1062564, futex_word=0x7f2b15e02990) at ./nptl/futex-internal.c:57
  2    Thread 0x7f2b15e026c0 (LWP 1062564) "vmm"             0x00007f2b15f0efc6 in epoll_wait (epfd=16, events=0x7f2b100041c0, maxevents=100, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7f2b15c016c0 (LWP 1062565) "http-server"     0x00007f2b15f0efc6 in epoll_wait (epfd=12, events=0x7f2b15bffb60, maxevents=12, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
* 4    Thread 0x7f2b15a006c0 (LWP 1062566) "vmm_signal_hand" __libc_recv (flags=<optimized out>, len=1, buf=0x7f2b159ffb28, fd=19) at ../sysdeps/unix/sysv/linux/recv.c:28
  5    Thread 0x7f2b157ff6c0 (LWP 1062569) "serial-manager"  0x00007f2b15f0efc6 in epoll_wait (epfd=61, events=0x7f2b157fecb4, maxevents=3, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  10   Thread 0x7f2b14de76c0 (LWP 1062575) "vcpu0"           __GI___ioctl (fd=28, request=44672) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  11   Thread 0x7f2b14be36c0 (LWP 1062576) "vcpu1"           __GI___ioctl (fd=29, request=44672) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  12   Thread 0x7f2b149db6c0 (LWP 1062764) "_disk2_q0"       0x00007f2b15f0efc6 in epoll_wait (epfd=141, events=0x7f2aac000ca0, maxevents=100, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30

Apparently they received KILL_EVENTs.

cloud-hypervisor: 98.701300s: <_net1_qp0> INFO:virtio-devices/src/epoll_helper.rs:216 -- KILL_EVENT received, stopping epoll loop
cloud-hypervisor: 98.701304s: <_net1_ctrl> INFO:virtio-devices/src/epoll_helper.rs:216 -- KILL_EVENT received, stopping epoll loop

CC @rbradford @likebreath

@likebreath
Member

@liuw That's very strange. Is the guest still functional?

AFAIK, the KILL_EVENT is mostly used from the drop() function of the virtio devices, e.g. when they are out of scope. This basically means the DeviceManager is getting out-of-scope.

@liuw
Member

liuw commented Mar 20, 2024

@liuw That's very strange. Is the guest still functional?

AFAIK, the KILL_EVENT is mostly used from the drop() function of the virtio devices, e.g. when they are out of scope. This basically means the DeviceManager is getting out-of-scope.

No. The guest is basically dead since its OS disk is gone.

The device manager is still there. It is just that all device threads other than the last hot-added disk thread are killed.
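
For context on the mechanism being discussed: each virtio device owns a worker thread, and tearing the device down signals a kill event that makes the worker's epoll loop exit. Below is a minimal, std-only Rust sketch of that pattern; the real code in virtio-devices/src/epoll_helper.rs uses an EventFd registered with epoll rather than an atomic flag, and the Device type here is purely illustrative.

use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};
use std::thread;
use std::time::Duration;

// Stand-in for a virtio device's worker thread. The real epoll loop
// blocks in epoll_wait() and exits when the kill eventfd fires; an
// AtomicBool keeps this sketch dependency-free.
struct Device {
    kill: Arc<AtomicBool>,
    worker: Option<thread::JoinHandle<()>>,
}

impl Device {
    fn new(name: &'static str) -> Self {
        let kill = Arc::new(AtomicBool::new(false));
        let flag = kill.clone();
        let worker = thread::spawn(move || {
            while !flag.load(Ordering::Acquire) {
                // Stand-in for servicing virtqueue events.
                thread::sleep(Duration::from_millis(10));
            }
            println!("<{name}> KILL_EVENT received, stopping epoll loop");
        });
        Self {
            kill,
            worker: Some(worker),
        }
    }
}

impl Drop for Device {
    // Dropping the device signals the worker and joins it. This is why
    // KILL_EVENT in the logs pointed suspicion at device teardown.
    fn drop(&mut self) {
        self.kill.store(true, Ordering::Release);
        if let Some(w) = self.worker.take() {
            w.join().unwrap();
        }
    }
}

fn main() {
    let _disk = Device::new("_disk0_q0");
    // `_disk` goes out of scope here; drop() fires the kill signal and
    // the worker prints its exit message before main() returns.
}

Note this sketch only shows why the KILL_EVENT log lines suggested devices were being dropped; as established further down, Block's drop() is never actually called in the failing case.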

@liuw liuw added the bug Something isn't working label Mar 20, 2024
@rbradford
Member

So this only happens with a recent kernel. My original kernel is 5.4.0-43-generic. That kernel didn't even hot-plug the disk correctly. The recent 5.4.0-173-generic kernel can trigger this issue.

This is host kernel or guest kernel?

@liuw
Member

liuw commented Mar 20, 2024

This is host kernel or guest kernel?

Guest kernel. It is the kernel in the latest focal image as of yesterday.

@liuw
Member

liuw commented Mar 20, 2024

I experimented a bit more last night.

  1. Pausing and resuming a guest does not cause this issue.
  2. Adding a virtio-nic will cause the same issue to surface.
  3. Serial manager thread stays alive at all times (not using virtio infra). Device manager is never dropped.
  4. If the new device is added before the guest is resumed, other virtio threads will hang around, but the issue is still there.
  5. If the new device is added after the guest is resumed, other virtio threads receive kill events.

The insight right now is that the bug is in the common code for virtio devices.

@liuw
Member

liuw commented Mar 20, 2024

AFAIK, the KILL_EVENT is mostly used from the drop() function of the virtio devices, e.g. when they are out of scope.

I put in a debug! in Block's drop function. It is not called.

@rbradford
Member

I experimented a bit more last night.

1. Pausing and resuming a guest does not cause this issue.

2. Adding a virtio-nic will cause the same issue to surface.

3. Serial manager thread stays alive at all times (not using virtio infra). Device manager is never dropped.

4. If the new device is added before the guest is resumed, other virtio threads will hang around, but the issue is still there.

5. If the new device is added after the guest is resumed, other virtio threads receive kill events.

The insight right now is that the bug is in the common code for virtio devices.

Great observations. Would be interesting to see if it is newly introduced kernel behaviour - does it happen with direct kernel booting and our reference kernel?

@liuw
Member

liuw commented Mar 20, 2024

Great observations. Would be interesting to see if it is newly introduced kernel behaviour - does it happen with direct kernel booting and our reference kernel?

Direct kernel boot with the reference 6.2 kernel has the same issue.

@rbradford
Member

Direct kernel boot with the reference 6.2 kernel has the same issue.

Good to know. And I just checked - hot-adding a device is not something we do after restoring in our tests.

@liuw
Member

liuw commented Mar 21, 2024

A quick test shows that virtio devices are reset after the new device is plugged.

I put a breakpoint at virtio-devices/src/transport/pci_device.rs:VirtioCommon::reset. If the guest is resumed before adding the device, the reset function will be called. If the order is reversed, the reset function will not be called.

In both cases, the guest is hosed.

@rbradford
Member

The reset is misdirection (that's just the kernel trying to recover). I bisected to this commit as the first bad one:

eae804389048d0b9e4e33909c98d8787f0b9cb62 is the first bad commit
commit eae804389048d0b9e4e33909c98d8787f0b9cb62
Author: Sebastien Boeuf <sebastien.boeuf@intel.com>
Date:   Fri Oct 21 17:57:20 2022 +0200

    pci, virtio-devices: Move VirtioPciDevice to the new restore design
    
    The code for restoring a VirtioPciDevice has been updated, including the
    dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration.
    
    It's important to note that both PciConfiguration and MsixConfig still
    have restore() implementations because Vfio and VfioUser devices still
    rely on the old way for restore.
    
    Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>

 pci/src/bus.rs                                    |   1 +
 pci/src/configuration.rs                          | 106 +++++---
 pci/src/lib.rs                                    |   3 +-
 pci/src/msix.rs                                   |  66 ++++-
 pci/src/vfio.rs                                   |   5 +-
 pci/src/vfio_user.rs                              |   1 +
 virtio-devices/src/transport/mod.rs               |   2 +-
 virtio-devices/src/transport/pci_common_config.rs |  35 +--
 virtio-devices/src/transport/pci_device.rs        | 309 ++++++++++++----------
 vmm/src/device_manager.rs                         |   1 +
 10 files changed, 322 insertions(+), 207 deletions(-)

@alex-matei
Contributor

I'm trying to add support for snapshots in kata-containers and I stumbled upon the same issue. The problem is with the PCI configuration capability: its state isn't saved. This means that PciDevice::read_config_register returns incorrect values after restore for existing devices.
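
To make that concrete, here is a deliberately simplified, hypothetical model of the failure. The type names, window length, and config-space layout below are illustrative, not the actual cloud-hypervisor code, but the mechanism is the one described above: a capability window whose saved offset is 0 swallows reads of register 0.

// Hypothetical, simplified model of the bug -- not the actual
// cloud-hypervisor types. `cap_offset` should mark where the virtio
// PCI-cfg capability sits in config space, but it is not part of the
// snapshot, so after restore it comes back as 0.
const CAP_LEN: u32 = 5; // illustrative length of the capability window

struct CfgCapInfo {
    cap_offset: u32,
}

struct RestoredDevice {
    cfg_cap: CfgCapInfo,
    config_space: [u32; 64],
}

impl RestoredDevice {
    fn read_config_register(&self, reg_idx: u32) -> u32 {
        let start = self.cfg_cap.cap_offset;
        // Reads inside the capability window get special handling and,
        // with zeroed state, return 0 instead of the real register.
        if reg_idx >= start && reg_idx < start + CAP_LEN {
            return 0;
        }
        self.config_space[reg_idx as usize]
    }
}

fn main() {
    let mut config_space = [0u32; 64];
    config_space[0] = 0x1042_1af4; // device_id 0x1042, vendor_id 0x1af4

    let dev = RestoredDevice {
        cfg_cap: CfgCapInfo { cap_offset: 0 }, // zeroed after restore
        config_space,
    };

    // Register 0 (vendor/device ID) now falls inside the bogus window,
    // so the guest's PCI rescan reads 0 and treats the device as gone.
    assert_eq!(dev.read_config_register(0), 0);
}

[1af4:1042] is exactly the vendor/device pair the dmesg above shows for the hot-added disk, so a rescan that reads 0 from register 0 of an existing device is indistinguishable from that device having been unplugged.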

github-merge-queue bot pushed a commit that referenced this issue Mar 24, 2024

When restoring a VM, the VirtioPciCfgCapInfo struct is not properly
initialized. All fields are 0, including the offset where the
capability starts. Hence, when you read a PCI configuration register
in the range [0..length(VirtioPciCfgCap)] you get the value 0 instead of
the actual register contents.

Linux rescans the whole PCI bus when adding a new device. It reads the
values vendor_id and device_id for every device. Because these are
stored at offset 0 in PCI configuration space, their value is 0 for
existing devices. As such, Linux considers that the devices have been
unplugged and it removes them from the system.

Fixes: #6265

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
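
The direction of the merged fix (#6326) follows from that diagnosis: make the capability offset part of the state that gets snapshotted and restored. A rough sketch of the idea, with illustrative field and type names rather than the actual structs from the PR:

// Sketch of the idea behind the fix: carry the capability offset in
// the device's snapshotted state so restore can rebuild the capability
// info instead of leaving it zeroed. Names are illustrative, not the
// actual cloud-hypervisor state structs.
#[derive(Clone, Debug)]
struct VirtioPciDeviceState {
    device_activated: bool,
    interrupt_status: u32,
    cap_pci_cfg_offset: u32, // previously absent from the state, hence 0 on restore
}

#[derive(Debug, PartialEq)]
struct CfgCapInfo {
    cap_offset: u32,
}

fn restore_cfg_cap(state: &VirtioPciDeviceState) -> CfgCapInfo {
    // With the offset persisted, the restored capability window matches
    // the one the guest negotiated before the snapshot.
    CfgCapInfo {
        cap_offset: state.cap_pci_cfg_offset,
    }
}

fn main() {
    let state = VirtioPciDeviceState {
        device_activated: true,
        interrupt_status: 0,
        cap_pci_cfg_offset: 0x74, // illustrative offset saved at snapshot time
    };
    assert_eq!(restore_cfg_cap(&state), CfgCapInfo { cap_offset: 0x74 });
    // The other fields stand in for the rest of the saved device state.
    let _ = (state.device_activated, state.interrupt_status);
}

With the window restored at its real offset, a later PCI rescan reads genuine vendor/device IDs and leaves the existing devices alone.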