nvme_driver: don't lazily restore interrupts #3220
Conversation
Pull request overview
This PR updates NVMe servicing restore behavior to avoid “lazy” interrupt restoration by eagerly restoring I/O queues in a deterministic interrupt-vector order, working around device/VPci interrupt allocation ordering issues.
Changes:
- Change I/O queue restore strategy to (by default) restore all queues sorted by MSI-X interrupt vector.
- Add a saved-state flag (`allow_lazy_restore`) to optionally fall back to the prior eager + proto-queue (lazy) restore split.
- Persist `allow_lazy_restore: Some(false)` on save to force eager restore for now.
```rust
.filter(|q| {
    q.queue_data.qid == 1
        || !q.queue_data.handler_data.pending_cmds.commands.is_empty()
})
.flat_map(|q| -> Result<IoQueue<D>, anyhow::Error> {
    let qid = q.queue_data.qid;
    let cpu = q.cpu;
```
Iterator::flat_map with a closure returning Result<_, _> will silently drop Err values (because Result implements IntoIterator). That means queue restore failures (e.g. interrupt mapping / IoQueue::restore) can be ignored and the restore may succeed with missing queues. Use a fallible collection pattern (e.g., map + collect::<Result<Vec<_>, _>>()?, or try_fold) so errors propagate.
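A minimal standalone reproduction of the hazard (hypothetical helper names, not the driver's types): because `Result` implements `IntoIterator`, `flat_map` treats `Ok` as one item and `Err` as zero items, so failures vanish; collecting into `Result<Vec<_>, _>` propagates them instead.

```rust
// `flat_map` over a Result-returning closure: Err yields no items, so the
// failure for input 2 is silently dropped.
fn restore_lossy(inputs: &[i32]) -> Vec<i32> {
    inputs
        .iter()
        .flat_map(|&n| -> Result<i32, String> {
            if n == 2 { Err("restore failed".into()) } else { Ok(n * 10) }
        })
        .collect()
}

// Collecting into Result<Vec<_>, _> aborts on the first Err instead.
fn restore_fallible(inputs: &[i32]) -> Result<Vec<i32>, String> {
    inputs
        .iter()
        .map(|&n| {
            if n == 2 { Err("restore failed".to_string()) } else { Ok(n * 10) }
        })
        .collect()
}

fn main() {
    // The lossy version "succeeds" with an element missing.
    assert_eq!(restore_lossy(&[1, 2, 3]), vec![10, 30]);
    // The fallible version surfaces the error to the caller.
    assert!(restore_fallible(&[1, 2, 3]).is_err());
    assert_eq!(restore_fallible(&[1, 3]), Ok(vec![10, 30]));
}
```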
Pre-existing pattern — this PR didn't change the collection strategy. Worth fixing in a follow-up.
Filed #3229 to track this and the other two pre-existing error handling items together.
```rust
worker.io = sorted_io
    .into_iter()
    .flat_map(|q| -> Result<IoQueue<D>, anyhow::Error> {
        let qid = q.queue_data.qid;
        let cpu = q.cpu;
```
Same issue as above: this flat_map over Result will drop restore errors instead of propagating them, potentially leaving the driver partially restored without failing. Switch to a Result-propagating collection (collect::<Result<...>>()? / try_fold) so any queue restore error aborts restore.
Same — pre-existing in the eager path.
```rust
.expect("unable to find restored mem block")
.to_owned();
```
Restore uses expect("unable to find restored mem block"), which will panic if the saved state doesn't match the restored DMA allocations (corrupt/partial state, version skew, etc.). Prefer returning a structured restore error with context instead of panicking.
Suggested change:

```diff
-.expect("unable to find restored mem block")
-.to_owned();
+.cloned()
+.ok_or_else(|| {
+    anyhow::anyhow!(
+        "unable to find restored mem block for {}, qid {}, cpu {}, mem_len {}, base_pfn {}",
+        pci_id,
+        qid,
+        cpu,
+        q.queue_data.mem_len,
+        q.queue_data.base_pfn
+    )
+})?;
```
Pre-existing — the original code had the same expect. Agree the suggestion is better, but out of scope for this PR.
```rust
    issuer: q.queue.issuer().clone(),
    cpu: q.cpu,
};
this.io_issuers.per_cpu[q.cpu as usize].set(issuer).unwrap();
```
q.cpu from saved state is used as an index (per_cpu[q.cpu as usize]) and .set(...).unwrap() assumes it is both in-range and not already set. With the new eager-restore-all-queues path, any mismatch between saved CPUs and cpu_count (or duplicate CPUs) will panic during restore. Validate q.cpu < cpu_count and handle duplicate/previously-set issuers by returning an error instead of panicking.
Suggested change:

```diff
-this.io_issuers.per_cpu[q.cpu as usize].set(issuer).unwrap();
+let issuer_slot = this
+    .io_issuers
+    .per_cpu
+    .get(q.cpu as usize)
+    .ok_or_else(|| {
+        anyhow::anyhow!(
+            "saved queue {} has invalid cpu {} (cpu_count {})",
+            qid,
+            q.cpu,
+            this.io_issuers.per_cpu.len()
+        )
+    })?;
+issuer_slot.set(issuer).map_err(|_| {
+    anyhow::anyhow!(
+        "duplicate io issuer restore for cpu {} on queue {}",
+        q.cpu,
+        qid
+    )
+})?;
```
Pre-existing — identical .unwrap() existed in the original eager restore path. Saved state CPU values always come from our own prior save. Agree defensive validation would be better, but out of scope.
```rust
io,
qsize: worker_state.qsize,
max_io_queues: worker_state.max_io_queues,
allow_lazy_restore: Some(false), // For now, we always restore eagerly to work around device bugs.
```
allow_lazy_restore is always saved as Some(false), and there is no code path that can persist Some(true). That makes the lazy-restore branch effectively dead and likely to rot. Consider either (a) removing the flag/branch for now, or (b) plumbing a real knob (config/feature/env) and adding coverage so the lazy path stays tested.
Suggested change:

```diff
-allow_lazy_restore: Some(false), // For now, we always restore eagerly to work around device bugs.
+// For now, we always restore eagerly to work around device bugs.
+// Don't persist an explicit `false` value for a mode that has no way
+// to be enabled by current save paths; `None` preserves the default.
+allow_lazy_restore: None,
```
Intentional. The lazy path is preserved so the restore side can opt back into it once the host-side IV ordering bug is fixed. Some(false) is explicit — it signals "we chose eager" rather than "this field wasn't set."
```rust
// Eager restore path: restore ALL queues sorted by interrupt
// vector for ordered VPci allocation (MSI-X ordering workaround).
//
// Devnote: Safety of inline new_self_drained(): This loop is fully
// synchronous (no .await). Although IoQueue::restore() spawns
// queue handler tasks, they don't poll until the async runtime
// yields — which happens only after .collect() completes. So all
// new_self_drained() and new_draining() calls finish before any
// handler can fire the drain-complete signal. If this loop is ever
// refactored to be async, the waiters for empty queues must be
// pre-created (as done in the lazy path above).
let mut sorted_io: Vec<_> = saved_state.worker_data.io.iter().collect();
sorted_io.sort_by_key(|q| q.iv);
```
The new restore behavior (restoring all IO queues sorted by interrupt vector to enforce VPci CreateInterruptMessage ordering) is not covered by tests in this crate. Add a unit/integration test that exercises servicing save/restore and asserts the interrupt mapping/allocation order or at least that all queues are restored eagerly (no proto queues) under the default setting.
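A sketch of the core property such a test could assert, using hypothetical stand-in types rather than the crate's saved-state structs: under the default eager setting, every saved queue is restored, in ascending MSI-X interrupt-vector order.

```rust
// Hypothetical stand-in for the saved queue state; only the fields the
// ordering property needs.
#[derive(Clone)]
struct SavedQueue {
    qid: u16,
    iv: u32, // MSI-X interrupt vector
}

// Mirrors the eager path's `sort_by_key(|q| q.iv)` before queue creation:
// returns the qids in the order the queues would be restored.
fn eager_restore_order(mut io: Vec<SavedQueue>) -> Vec<u16> {
    io.sort_by_key(|q| q.iv);
    io.into_iter().map(|q| q.qid).collect()
}

fn main() {
    let saved = vec![
        SavedQueue { qid: 3, iv: 7 },
        SavedQueue { qid: 1, iv: 2 },
        SavedQueue { qid: 2, iv: 5 },
    ];
    let order = eager_restore_order(saved.clone());
    // All queues restored (no proto queues), lowest interrupt vector first.
    assert_eq!(order, vec![1, 2, 3]);
    assert_eq!(order.len(), saved.len());
}
```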
```rust
#[mesh(4)]
pub max_io_queues: u16,
/// Whether to allow lazy restore of IO queues that had no pending commands at the time of save.
#[mesh(5)]
```
nit: Do we need to save this if it is always false? We can just initialize the variable to false every time
You're right, since a missing bool defaults to false, saving a false is overkill. I just wanted to be very explicit in the ternary: we know it's unsafe, we know it's safe, or we don't know (assume unsafe).
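The ternary the author describes can be sketched like this (illustrative read-side helper, not the driver's code; `allow_lazy_restore` is the real field name):

```rust
// Tri-state semantics: Some(true) = save-time check said lazy restore is
// safe; Some(false) = explicitly chose eager; None = field absent (older
// saved state), so assume unsafe.
fn lazy_restore_allowed(allow_lazy_restore: Option<bool>) -> bool {
    // Only an explicit Some(true) opts into the lazy path; None and
    // Some(false) both restore eagerly.
    allow_lazy_restore.unwrap_or(false)
}

fn main() {
    assert!(lazy_restore_allowed(Some(true)));
    assert!(!lazy_restore_allowed(Some(false)));
    // Older saved states with no field behave like an explicit false.
    assert!(!lazy_restore_allowed(None));
}
```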
Works around a device-side issue: some devices cannot handle the case where interrupts are reprogrammed out of IV order. (cherry picked from commit 958b8a4)