vmm: Only return from reset driven I/O once event received #5645

rbradford · 2023-08-03T14:27:38Z

The reset system is asynchronous: an I/O event (PIO or MMIO) for
ACPI/i8042/CMOS triggers a write to the reset_evt event handler. The
VMM thread picks up this event in the VMM main loop and then triggers
a shutdown in the CpuManager. However, there is some delay between the
I/O exit and the CPU threads being marked to be killed (through the
CpuManager::vcpus_kill_signalled bool), so it is possible for the guest
vCPU that triggered the exit to be re-entered when KVM_RUN is called
again after the I/O exit is completed.

This is undesirable. In particular, the Linux kernel will attempt to
jump to real mode after a CMOS-based exit; this is unsupported in
nested KVM on AMD on Azure and will trigger an error in KVM_RUN.

Solve this problem by spinning in the device that has triggered the
reset until the vcpus_kill_signalled boolean has been updated
indicating that the VMM thread has received the event and called
CpuManager::shutdown(). In particular if this bool is set then the vCPU
threads will not re-enter the guest.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
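
The handshake described in the commit message can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from the patch: `ResetDevice`, `trigger_reset`, `spawn_vmm_thread` and `simulate_reset` are hypothetical names, an `mpsc` channel stands in for the `reset_evt` event, and a plain store to the flag stands in for `CpuManager::shutdown()`. The use of `thread::yield_now()` inside the wait loop is also an assumption.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a reset-capable device (ACPI/i8042/CMOS).
pub struct ResetDevice {
    /// Stand-in for the reset_evt event (an assumption: a channel, not an EventFd).
    reset_evt: Sender<()>,
    /// Shared flag named after the bool mentioned in the commit message.
    vcpus_kill_signalled: Arc<AtomicBool>,
}

impl ResetDevice {
    pub fn new(reset_evt: Sender<()>, vcpus_kill_signalled: Arc<AtomicBool>) -> Self {
        Self { reset_evt, vcpus_kill_signalled }
    }

    /// Runs on the vCPU thread while it handles the guest's reset I/O exit.
    pub fn trigger_reset(&self) {
        // Notify the VMM thread that a reset was requested.
        self.reset_evt.send(()).expect("VMM thread has gone away");

        // Do not return from the I/O handler until the VMM thread has marked
        // the vCPUs as killed; returning earlier could let the vCPU re-enter
        // the guest.
        while !self.vcpus_kill_signalled.load(Ordering::SeqCst) {
            thread::yield_now();
        }
    }
}

/// Hypothetical stand-in for the VMM main loop handling the reset event.
pub fn spawn_vmm_thread(
    reset_evt: Receiver<()>,
    vcpus_kill_signalled: Arc<AtomicBool>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        if reset_evt.recv().is_ok() {
            // Stand-in for CpuManager::shutdown(): mark the vCPUs as killed.
            vcpus_kill_signalled.store(true, Ordering::SeqCst);
        }
    })
}

/// Runs one simulated reset and reports whether the flag was already set
/// at the moment the device handler returned.
pub fn simulate_reset() -> bool {
    let (tx, rx) = channel();
    let flag = Arc::new(AtomicBool::new(false));
    let vmm = spawn_vmm_thread(rx, Arc::clone(&flag));
    let device = ResetDevice::new(tx, Arc::clone(&flag));

    device.trigger_reset();
    let set_on_return = flag.load(Ordering::SeqCst);

    vmm.join().expect("VMM thread panicked");
    set_on_return
}
```

In this sketch `trigger_reset` cannot return before the flag is set, so a vCPU loop that checks the flag before its next `KVM_RUN` would always see it and exit rather than re-enter the guest.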

weltling

Reboot issues on AMD are fixed by this patch as tested together with #5627.

Thanks

rbradford requested a review from a team as a code owner August 3, 2023 14:27

rbradford force-pushed the 2023-08-03-spin-waiting-for-reset-event-received branch from a0b1cff to 456ff9e on August 3, 2023 14:57

rbradford mentioned this pull request Aug 3, 2023

ci: Add AMD pass #5627

Merged

weltling approved these changes Aug 3, 2023

likebreath approved these changes Aug 4, 2023

likebreath merged commit 06dc708 into cloud-hypervisor:main Aug 4, 2023
21 checks passed

weltling mentioned this pull request Aug 7, 2023

tests: Re-enable cases in AMD pipeline #5660

Closed

rbradford added the bug-fix Bug fix to include in release notes label Aug 7, 2023
