[Backport] virtio-devices: signal activated queue eventfds on resume #144
Merged
Coffeeri merged 5 commits on Apr 14, 2026
This reverts commit fb29aa2.

On-behalf-of: SAP leander.kohler@sap.com
Signed-off-by: Leander Kohler <leander.kohler@cyberus-technology.de>

This reverts commit ff466c2.

On-behalf-of: SAP leander.kohler@sap.com
Signed-off-by: Leander Kohler <leander.kohler@cyberus-technology.de>
A restored virtqueue can already contain pending descriptors when the VM resumes. Before this change, the worker thread was unparked and then waited for a fresh queue eventfd signal. That is normally fine, but not when the queue was already non-empty at snapshot time.

The virtqueue state lives in guest memory and is restored, but the original host-side queue eventfd signal is not persistent snapshot state. If the guest already notified the queue before the snapshot, it may not notify it again after resume. That can leave the worker idle while the guest is still waiting for the pending request to complete. In one observed case, this stalled a virtio-blk flush during early boot after snapshot/restore.

We mitigate this in the shared `VirtioCommon` resume path. `VirtioCommon` retains cloned queue eventfds for activated virtqueues and signals each of them once on resume after unparking the worker threads.

Keep virtio-net on its existing special-case path: it resumes worker threads without signaling queue eventfds so the `driver_awake` workaround remains intact until the guest performs a real notify.

On-behalf-of: SAP leander.kohler@sap.com
Signed-off-by: Leander Kohler <leander.kohler@cyberus-technology.de>
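The lost-notification problem and the resume-time kick described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from this PR: `QueueEvent` is a hypothetical in-process stand-in for a host eventfd (a shared counter), and `CommonSketch`, `activated_queue_evts`, and `pending_signals_after_restore` are illustrative names that do not come from the source.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a host eventfd: `write` adds to a shared
/// counter, `read` drains it. Cloning shares the same counter, similar to
/// duplicating a file descriptor.
#[derive(Clone, Default)]
struct QueueEvent(Arc<AtomicU64>);

impl QueueEvent {
    fn write(&self, value: u64) {
        self.0.fetch_add(value, Ordering::SeqCst);
    }

    /// Returns the pending count and resets it; 0 means nothing to wake on.
    fn read(&self) -> u64 {
        self.0.swap(0, Ordering::SeqCst)
    }
}

/// Hypothetical slice of the shared device state (assumption, not the
/// real `VirtioCommon` layout).
#[derive(Default)]
struct CommonSketch {
    /// Clones of the queue events handed to the workers at activation.
    activated_queue_evts: Vec<QueueEvent>,
}

impl CommonSketch {
    fn activate(&mut self, queue_evts: &[QueueEvent]) {
        self.activated_queue_evts = queue_evts.to_vec();
    }

    fn resume(&self, signal_queues: bool) {
        // A real device would unpark its worker threads at this point.
        if signal_queues {
            // Kick every activated queue once so a worker re-checks a
            // queue that was already non-empty at snapshot time.
            for evt in &self.activated_queue_evts {
                evt.write(1);
            }
        }
    }
}

/// Models snapshot/restore: the guest notified queue 0 before the
/// snapshot, but the host-side signal is not part of the snapshot state.
/// Returns what each worker would read from its queue event after resume.
fn pending_signals_after_restore(signal_on_resume: bool) -> Vec<u64> {
    let worker_evts = vec![QueueEvent::default(), QueueEvent::default()];
    let mut common = CommonSketch::default();
    common.activate(&worker_evts);

    worker_evts[0].write(1); // guest notify before the snapshot
    for evt in &worker_evts {
        evt.read(); // restore starts with fresh, unsignaled events
    }

    common.resume(signal_on_resume);
    worker_evts.iter().map(|evt| evt.read()).collect()
}
```

In this model, `pending_signals_after_restore(false)` yields `[0, 0]`, so the worker for queue 0 would keep waiting even though descriptors are pending, while `pending_signals_after_restore(true)` yields `[1, 1]`. Kicking a queue that turns out to be empty costs the worker one extra check of that queue.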
This will wake up the guest and avoid a livelock situation by ensuring that it will process any pending queues on its side.

Signed-off-by: Rob Bradford <rbradford@meta.com>

Now on the generic restore path the worker thread is notified on the events and also the guest is notified via the interrupt. This avoids the same "livelock" situation that required this "driver_awake" workaround when restoring the net device.

Signed-off-by: Rob Bradford <rbradford@meta.com>
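The two-sided notification described in these commits (kick the worker, then interrupt the guest) might be modeled as follows. This is a hedged sketch: the `GuestInterrupt` trait, `CountingInterrupt`, `resume_generic`, and `resume_counts` are hypothetical names, and firing one interrupt per activated queue is an assumption made for illustration, not something the source specifies.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical hook for notifying the guest driver (assumption; the real
/// interrupt interface is not shown in the source).
trait GuestInterrupt {
    fn trigger_queue(&self, queue_index: u16);
}

/// Test double that only counts how often the guest was interrupted.
#[derive(Default)]
struct CountingInterrupt {
    fired: AtomicUsize,
}

impl GuestInterrupt for CountingInterrupt {
    fn trigger_queue(&self, _queue_index: u16) {
        self.fired.fetch_add(1, Ordering::SeqCst);
    }
}

/// Generic resume path in this sketch: for each activated queue, wake the
/// host-side worker and also interrupt the guest so that both sides
/// re-examine the queue.
fn resume_generic(
    activated_queues: &[u16],
    worker_kicks: &AtomicUsize,
    interrupt: &dyn GuestInterrupt,
) {
    for &queue_index in activated_queues {
        // Stand-in for signaling the queue eventfd of this queue.
        worker_kicks.fetch_add(1, Ordering::SeqCst);
        // Stand-in for raising the used-queue interrupt in the guest.
        interrupt.trigger_queue(queue_index);
    }
}

/// Runs the sketch for `num_queues` activated queues and returns
/// (worker kicks, guest interrupts).
fn resume_counts(num_queues: u16) -> (usize, usize) {
    let worker_kicks = AtomicUsize::new(0);
    let interrupt = CountingInterrupt::default();
    let queues: Vec<u16> = (0..num_queues).collect();
    resume_generic(&queues, &worker_kicks, &interrupt);
    (
        worker_kicks.load(Ordering::SeqCst),
        interrupt.fired.load(Ordering::SeqCst),
    )
}
```

With two activated queues, `resume_counts(2)` returns `(2, 2)`: every queue gets one worker kick and one guest interrupt. Once both sides are woken on resume, neither has to wait for the other to make the first move.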
arctic-alpaca approved these changes on Apr 14, 2026
phip1611 approved these changes on Apr 14, 2026
fantastic!!
@Coffeeri Could you bump chv in libvirt to include this in the nightly runs?
We revert #138 and backport the rework that landed upstream in cloud-hypervisor#8004.