Skip to content

openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible (#2603)#2623

Merged
shenw0000 merged 1 commit intomicrosoft:release/1.7.2511from
shenw0000:release/1.7.2511
Jan 8, 2026
Merged

openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible (#2603)#2623
shenw0000 merged 1 commit intomicrosoft:release/1.7.2511from
shenw0000:release/1.7.2511

Conversation

@shenw0000
Copy link
Copy Markdown
Contributor

Previous change in the PCIe type0/type1 config space emulator broken backward compatibility of the save and restore. This change updates the save and restore implementation to be backward compatible.

Tested to verify that the change fixed the problem: ## 1.6->1.7 servicing repros the problem:

  81.885092300s ERROR paravisor_log:  inner_target="vmbus_relay_intercept_device" "Failed to teardown gpadl -- leaking memory." fields={"error": ["channel closed"]} extra={"timestamp": "4.007314600s"}
  81.885501800s ERROR guest_emulation_device:  guest reported vtl0 failed to start error=guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
  81.887201900s  INFO paravisor_log:  inner_target="mana_driver::gdma_driver" "dropping gdma driver" fields={"self.hwc_failure": "false", "self.state_saved": "false"} extra={"timestamp": "4.022594900s"}
  81.981364100s ERROR openvmm_entry:  vtl2 servicing failed error=vtl0 start failed: guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}

1.6 -> fixed version resolves the problem:

 828.988495000s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: syncing command register - intx_disable=false, mmio_enabled=false" fields={} extra={"timestamp": "6.316890400s"}
 829.007081300s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating intx_disable=false" fields={} extra={"timestamp": "6.336884200s"}
 829.018959200s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating mmio_enabled=false" fields={} extra={"timestamp": "6.348810600s"}
 829.035025800s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"activity_id": "5a89a430-0001-0029-0e00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "time_active_ns": 44800, "timestamp": "6.364453700s", "time_taken_ns": 550516500, "op_code": 2}
 829.036223000s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 50700, "time_taken_ns": 551889200, "name": "device_state_change", "activity_id": "5a8b3818-0001-0029-0f00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "op_code": 2, "timestamp": "6.365929800s"}
 829.036739300s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 28200, "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "timestamp": "6.366807000s", "time_taken_ns": 552654100, "op_code": 2, "activity_id": "5a8ceec4-0001-0029-1000-000040000000"}
 829.037609200s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"timestamp": "6.367466400s", "time_active_ns": 31500, "op_code": 2, "activity_id": "5abe34d4-0001-0029-0b00-000040000800", "name": "device_state_change", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "time_taken_ns": 550084300}
 829.038569900s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "activity_id": "5ac36300-0001-0029-0a00-000040000800", "time_taken_ns": 550401100, "timestamp": "6.368122700s", "name": "device_state_change", "time_active_ns": 31900, "op_code": 2}
 [...]
 829.199631600s  INFO guest_emulation_device:  restore vtl2 complete success=true
 829.200333400s  INFO guest_emulation_device:  guest reported vtl0 started successfully
 829.308077400s  INFO openvmm_entry:  vtl2 servicing complete duration=8937

…ate to make it backward compatible (microsoft#2603)

Previous change in the PCIe type0/type1 config space emulator broken
backward compatibility of the save and restore. This change updates the
save and restore implementation to be backward compatible.

Tested to verify that the change fixed the problem:
## 1.6->1.7 servicing repros the problem:
```
  81.885092300s ERROR paravisor_log:  inner_target="vmbus_relay_intercept_device" "Failed to teardown gpadl -- leaking memory." fields={"error": ["channel closed"]} extra={"timestamp": "4.007314600s"}
  81.885501800s ERROR guest_emulation_device:  guest reported vtl0 failed to start error=guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
  81.887201900s  INFO paravisor_log:  inner_target="mana_driver::gdma_driver" "dropping gdma driver" fields={"self.hwc_failure": "false", "self.state_saved": "false"} extra={"timestamp": "4.022594900s"}
  81.981364100s ERROR openvmm_entry:  vtl2 servicing failed error=vtl0 start failed: guest reported VTL0 start error: failed to restore: restore failed: {440bx-host-pci-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pci-isa-bridge: failed to decode protobuf: decoding failed: missing required field, piix4-pm: failed to decode protobuf: decoding failed: missing required field}
```

## 1.6 -> fixed version resolves the problem:
```
 828.988495000s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: syncing command register - intx_disable=false, mmio_enabled=false" fields={} extra={"timestamp": "6.316890400s"}
 829.007081300s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating intx_disable=false" fields={} extra={"timestamp": "6.336884200s"}
 829.018959200s  INFO paravisor_log:  inner_target="pci_core::cfg_space_emu" "ConfigSpaceCommonHeaderEmulator: updating mmio_enabled=false" fields={} extra={"timestamp": "6.348810600s"}
 829.035025800s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"activity_id": "5a89a430-0001-0029-0e00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "time_active_ns": 44800, "timestamp": "6.364453700s", "time_taken_ns": 550516500, "op_code": 2}
 829.036223000s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 50700, "time_taken_ns": 551889200, "name": "device_state_change", "activity_id": "5a8b3818-0001-0029-0f00-000040000000", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "op_code": 2, "timestamp": "6.365929800s"}
 829.036739300s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"time_active_ns": 28200, "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "name": "device_state_change", "timestamp": "6.366807000s", "time_taken_ns": 552654100, "op_code": 2, "activity_id": "5a8ceec4-0001-0029-1000-000040000000"}
 829.037609200s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"timestamp": "6.367466400s", "time_active_ns": 31500, "op_code": 2, "activity_id": "5abe34d4-0001-0029-0b00-000040000800", "name": "device_state_change", "related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "time_taken_ns": 550084300}
 829.038569900s  INFO paravisor_log:  inner_target="state_unit" fields={} extra={"related_activity_id": "5a4363d0-0001-0029-0500-00004000c800", "activity_id": "5ac36300-0001-0029-0a00-000040000800", "time_taken_ns": 550401100, "timestamp": "6.368122700s", "name": "device_state_change", "time_active_ns": 31900, "op_code": 2}
 [...]
 829.199631600s  INFO guest_emulation_device:  restore vtl2 complete success=true
 829.200333400s  INFO guest_emulation_device:  guest reported vtl0 started successfully
 829.308077400s  INFO openvmm_entry:  vtl2 servicing complete duration=8937
```
@shenw0000 shenw0000 requested a review from a team as a code owner January 8, 2026 21:01
Copilot AI review requested due to automatic review settings January 8, 2026 21:01
@github-actions github-actions Bot added the release_1.7.2511 Targets the release/1.7.2511 branch. label Jan 8, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a backward compatibility issue in PCI configuration space save/restore that was introduced by a previous change. The fix unifies the saved state structure for both Type 0 and Type 1 PCI devices into a single SavedState struct, making fields 6-15 (Type 1-specific) optional with default values of 0.

Key changes:

  • Unified Type 0 and Type 1 saved state into a single structure with common fields at mesh indices 1-5 and Type 1-specific fields at indices 6-15
  • Removed nested state structures and generic save/restore implementation, replacing with direct implementations for Type 0 and Type 1
  • Updated tests to separately verify Type 0 and Type 1 save/restore functionality

Comment on lines +1530 to +1536
// Pad base_addresses to 6 elements for common header (Type 1 uses 2 BARs)
let mut full_base_addresses = [0u32; 6];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
full_base_addresses[i] = addr;
}
self.common
.set_base_addresses(&[full_base_addresses[0], full_base_addresses[1]]);
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for extracting base addresses could be simplified. The code creates a 6-element array, populates it, then extracts exactly 2 elements. This could be written more directly as:

let mut type1_base_addresses = [0u32; 2];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
    type1_base_addresses[i] = addr;
}
self.common.set_base_addresses(&type1_base_addresses);

Or even more concisely:

self.common.set_base_addresses(&[
    base_addresses[0],
    base_addresses[1],
]);
Suggested change
// Pad base_addresses to 6 elements for common header (Type 1 uses 2 BARs)
let mut full_base_addresses = [0u32; 6];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
full_base_addresses[i] = addr;
}
self.common
.set_base_addresses(&[full_base_addresses[0], full_base_addresses[1]]);
let mut type1_base_addresses = [0u32; 2];
for (i, &addr) in base_addresses.iter().enumerate().take(2) {
type1_base_addresses[i] = addr;
}
self.common.set_base_addresses(&type1_base_addresses);

Copilot uses AI. Check for mistakes.
Comment on lines +2097 to +2116
// Write to latency timer / interrupt register
emu.write_u32(0x3C, 0x0040_0000).unwrap(); // Set latency_timer

// Save the state
let saved_state = common_emu.save().expect("save should succeed");
let saved_state = emu.save().expect("save should succeed");

// Reset the emulator
common_emu.reset();
let result = common_emu.read_u32(0x04, &mut test_val);
assert_eq!(result, CommonHeaderResult::Handled);
emu.reset();

// Verify state is reset
emu.read_u32(0x04, &mut test_val).unwrap();
assert_eq!(test_val & 0x0007, 0x0000); // Should be reset

// Restore the state
common_emu
.restore(saved_state)
.expect("restore should succeed");
let result = common_emu.read_u32(0x04, &mut test_val);
assert_eq!(result, CommonHeaderResult::Handled);
assert_eq!(test_val & 0x0007, 0x0007); // Should be restored
emu.restore(saved_state).expect("restore should succeed");

// Test Type 1 common header emulator save/restore
let hardware_ids = HardwareIds {
vendor_id: 0x3333,
device_id: 0x4444,
revision_id: 1,
prog_if: ProgrammingInterface::NONE,
sub_class: Subclass::BRIDGE_PCI_TO_PCI,
base_class: ClassCode::BRIDGE,
type0_sub_vendor_id: 0,
type0_sub_system_id: 0,
};
// Verify state is restored
emu.read_u32(0x04, &mut test_val).unwrap();
assert_eq!(test_val & 0x0007, 0x0007); // Should be restored
}
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test writes to the latency_timer register but doesn't verify that it's correctly restored. Consider adding a verification step after line 2115 to read back the latency_timer value and confirm it was restored correctly.

Copilot uses AI. Check for mistakes.
@shenw0000 shenw0000 merged commit b7e842f into microsoft:release/1.7.2511 Jan 8, 2026
82 of 84 checks passed
benhillis pushed a commit to benhillis/openvmm that referenced this pull request Jan 29, 2026
* release/1.7.2511:
  openvmm/pcie: Unify type 0 and type 1 save restore state into SavedState to make it backward compatible (microsoft#2603) (microsoft#2623)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release_1.7.2511 Targets the release/1.7.2511 branch.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants