Skip to content

Commit

Permalink
Fix and re-enable MSI-X support
Browse files Browse the repository at this point in the history
  • Loading branch information
marmarek committed Sep 23, 2023
1 parent 6c9b442 commit ca8a488
Show file tree
Hide file tree
Showing 5 changed files with 320 additions and 71 deletions.
@@ -0,0 +1,86 @@
From 172275923f0f0999187a0bba60066a5672f839dc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?=
<marmarek@invisiblethingslab.com>
Date: Mon, 14 Nov 2022 13:40:02 +0100
Subject: [PATCH 1/3] hw/xen/xen_pt: Save back data only for declared registers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Call pci_default_write_config() in xen_pt_pci_write_config() only for
registers that have matching XenPTRegInfo structure, and do that only after
resolving any custom handlers. This is important for two reasons:
1. XenPTRegInfo has ro_mask which needs to be enforced - Xen-specific
hooks do that on their own (especially xen_pt_*_reg_write()).
2. Not setting value early allows the hooks to see the old value too.

If it would be only about the first point, setting PCIDevice.wmask would
probably be sufficient, but given the second point, restructure those
writes.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
v2:
- rewrite commit message, previous one was very misleading
- fix loop saving register values
---
hw/xen/xen_pt.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 2d33d178ad..3d6763dee7 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -256,6 +256,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
uint32_t find_addr = addr;
XenPTRegInfo *reg = NULL;
bool wp_flag = false;
+ uint32_t emul_mask = 0, write_val;

if (xen_pt_pci_config_access_check(d, addr, len)) {
return;
@@ -311,7 +312,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
}

memory_region_transaction_begin();
- pci_default_write_config(d, addr, val, len);

/* adjust the read and write value to appropriate CFC-CFF window */
read_val <<= (addr & 3) << 3;
@@ -371,6 +371,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
return;
}

+ emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) * 8);
+
/* calculate next address to find */
emul_len -= reg->size;
if (emul_len > 0) {
@@ -397,6 +399,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
/* need to shift back before passing them to xen_host_pci_set_block. */
val >>= (addr & 3) << 3;

+ /* store emulated registers after calling their handlers */
+ write_val = val;
+ for (index = 0; index < len; index += emul_len) {
+ emul_len = 0;
+ while (emul_mask & 0xff) {
+ emul_len++;
+ emul_mask >>= 8;
+ }
+ if (emul_len) {
+ uint32_t mask = ((1 << (emul_len * 8)) - 1);
+ pci_default_write_config(d, addr + index, write_val & mask, emul_len);
+ } else {
+ emul_mask >>= 8;
+ emul_len = 1;
+ }
+ write_val >>= emul_len * 8;
+ }
+
memory_region_transaction_commit();

out:
--
2.41.0

@@ -0,0 +1,191 @@
From 3d1059895e92448d0a53b42b76a4282cd2bd3005 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?=
<marmarek@invisiblethingslab.com>
Date: Thu, 22 Sep 2022 22:39:39 +0200
Subject: [PATCH 2/3] Do not access /dev/mem in MSI-X PCI passthrough on Xen
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The /dev/mem is used for two purposes:
- reading PCI_MSIX_ENTRY_CTRL_MASKBIT
- reading Pending Bit Array (PBA)

The first one was originally done because when Xen did not send all
vector ctrl writes to the device model, so QEMU might have outdated old
register value. If Xen is new enough, this has been changed, so QEMU can
now use its cached value of the register instead.

The Pending Bit Array (PBA) handling is for the case where it lives on
the same page as the MSI-X table itself. Xen has been extended to handle
this case too (as well as other registers that may live on those pages),
so QEMU handling is not necessary anymore.

Removing /dev/mem access is useful to work within stubdomain, and
necessary when dom0 kernel runs in lockdown mode.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v2:
- Make change conditional on new Xen version (tested via
XENFEAT_dm_msix_all_writes)

FIXME: separate adding XENFEAT_dm_msix_all_writes to another patch
---
hw/xen/xen_pt.h | 9 +++++
hw/xen/xen_pt_msi.c | 89 ++++++++++++++++++++++++++-------------------
2 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index b20744f7c7..813fc56927 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -29,6 +29,15 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3);
# define XEN_PT_LOG_CONFIG(d, addr, val, len)
#endif

+/* addition to xen/interface/features.h until updated upstream */
+/*
+ * If set, Xen will passthrough all MSI-X vector ctrl writes to device model,
+ * not only those unmasking an entry. This allows device model to properly keep
+ * track of the MSI-X table without having to read it from the device behind
+ * Xen's backs. This information is relevant only for device models.
+ */
+#define XENFEAT_dm_msix_all_writes 18
+

/* Helper */
#define XEN_PFN(x) ((x) >> XC_PAGE_SHIFT)
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index 09cca4eecb..898db4a4d6 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -460,15 +460,19 @@ static void pci_msix_write(void *opaque, hwaddr addr,
entry->updated = true;
} else if (msix->enabled && entry->updated &&
!(val & PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
- const volatile uint32_t *vec_ctrl;
-
- /*
- * If Xen intercepts the mask bit access, entry->vec_ctrl may not be
- * up-to-date. Read from hardware directly.
- */
- vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE
- + PCI_MSIX_ENTRY_VECTOR_CTRL;
- xen_pt_msix_update_one(s, entry_nr, *vec_ctrl);
+ if (s->msix->phys_iomem_base) {
+ const volatile uint32_t *vec_ctrl;
+
+ /*
+ * If Xen intercepts the mask bit access, entry->vec_ctrl may not be
+ * up-to-date. Read from hardware directly.
+ */
+ vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE
+ + PCI_MSIX_ENTRY_VECTOR_CTRL;
+ xen_pt_msix_update_one(s, entry_nr, *vec_ctrl);
+ } else {
+ xen_pt_msix_update_one(s, entry_nr, entry->latch(VECTOR_CTRL));
+ }
}

set_entry_value(entry, offset, val);
@@ -493,7 +497,11 @@ static uint64_t pci_msix_read(void *opaque, hwaddr addr,
return get_entry_value(&msix->msix_entry[entry_nr], offset);
} else {
/* Pending Bit Array (PBA) */
- return *(uint32_t *)(msix->phys_iomem_base + addr);
+ if (s->msix->phys_iomem_base)
+ return *(uint32_t *)(msix->phys_iomem_base + addr);
+ XEN_PT_LOG(&s->dev, "reading PBA, addr %#lx, offset %#lx\n",
+ addr, addr - msix->total_entries * PCI_MSIX_ENTRY_SIZE);
+ return 0xFFFFFFFF;
}
}

@@ -528,8 +536,8 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
uint32_t table_off = 0;
int i, total_entries, bar_index;
XenHostPCIDevice *hd = &s->real_device;
+ xen_feature_info_t xc_version_info = { 0 };
PCIDevice *d = &s->dev;
- int fd = -1;
XenPTMSIX *msix = NULL;
int rc = 0;

@@ -543,6 +551,9 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
return -1;
}

+ if (xc_version(xen_xc, XENVER_get_features, &xc_version_info) < 0)
+ return -1;
+
rc = xen_host_pci_get_word(hd, base + PCI_MSIX_FLAGS, &control);
if (rc) {
XEN_PT_ERR(d, "Failed to read PCI_MSIX_FLAGS field\n");
@@ -576,33 +587,37 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
msix->table_base = s->real_device.io_regions[bar_index].base_addr;
XEN_PT_LOG(d, "get MSI-X table BAR base 0x%"PRIx64"\n", msix->table_base);

- fd = open("/dev/mem", O_RDWR);
- if (fd == -1) {
- rc = -errno;
- XEN_PT_ERR(d, "Can't open /dev/mem: %s\n", strerror(errno));
- goto error_out;
- }
- XEN_PT_LOG(d, "table_off = 0x%x, total_entries = %d\n",
- table_off, total_entries);
- msix->table_offset_adjust = table_off & 0x0fff;
- msix->phys_iomem_base =
- mmap(NULL,
- total_entries * PCI_MSIX_ENTRY_SIZE + msix->table_offset_adjust,
- PROT_READ,
- MAP_SHARED | MAP_LOCKED,
- fd,
- msix->table_base + table_off - msix->table_offset_adjust);
- close(fd);
- if (msix->phys_iomem_base == MAP_FAILED) {
- rc = -errno;
- XEN_PT_ERR(d, "Can't map physical MSI-X table: %s\n", strerror(errno));
- goto error_out;
- }
- msix->phys_iomem_base = (char *)msix->phys_iomem_base
- + msix->table_offset_adjust;
+ if (!(xc_version_info.submap & (1U << XENFEAT_dm_msix_all_writes))) {
+ int fd = -1;

- XEN_PT_LOG(d, "mapping physical MSI-X table to %p\n",
- msix->phys_iomem_base);
+ fd = open("/dev/mem", O_RDWR);
+ if (fd == -1) {
+ rc = -errno;
+ XEN_PT_ERR(d, "Can't open /dev/mem: %s\n", strerror(errno));
+ goto error_out;
+ }
+ XEN_PT_LOG(d, "table_off = 0x%x, total_entries = %d\n",
+ table_off, total_entries);
+ msix->table_offset_adjust = table_off & 0x0fff;
+ msix->phys_iomem_base =
+ mmap(NULL,
+ total_entries * PCI_MSIX_ENTRY_SIZE + msix->table_offset_adjust,
+ PROT_READ,
+ MAP_SHARED | MAP_LOCKED,
+ fd,
+ msix->table_base + table_off - msix->table_offset_adjust);
+ close(fd);
+ if (msix->phys_iomem_base == MAP_FAILED) {
+ rc = -errno;
+ XEN_PT_ERR(d, "Can't map physical MSI-X table: %s\n", strerror(errno));
+ goto error_out;
+ }
+ msix->phys_iomem_base = (char *)msix->phys_iomem_base
+ + msix->table_offset_adjust;
+
+ XEN_PT_LOG(d, "mapping physical MSI-X table to %p\n",
+ msix->phys_iomem_base);
+ }

memory_region_add_subregion_overlap(&s->bar[bar_index], table_off,
&msix->mmio,
--
2.41.0

40 changes: 40 additions & 0 deletions qemu/patches/0003-Conditionally-disable-MSI-X-cap.patch
@@ -0,0 +1,40 @@
From 45791f82e3e72834cd171d73d81164d798098517 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?=
<marmarek@invisiblethingslab.com>
Date: Fri, 22 Sep 2023 23:19:51 +0200
Subject: [PATCH 3/3] Conditionally disable MSI-X cap
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hide MSI-X from the guest if Xen is too old to support its emulation
with stubdomain (see "Do not access /dev/mem in MSI-X PCI passthrough on
Xen" patch)

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
hw/xen/xen_pt_config_init.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 2b8680b112..20308824a1 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -54,6 +54,14 @@ static int xen_pt_hide_dev_cap(const XenHostPCIDevice *d, uint8_t grp_id)
return 1;
}
break;
+#ifdef CONFIG_STUBDOM
+ case PCI_CAP_ID_MSIX: {
+ xen_feature_info_t xc_version_info = { 0 };
+ if (xc_version(xen_xc, XENVER_get_features, &xc_version_info) < 0)
+ return 1;
+ return !(xc_version_info.submap & (1U << XENFEAT_dm_msix_all_writes));
+ }
+#endif
}
return 0;
}
--
2.41.0

70 changes: 0 additions & 70 deletions qemu/patches/0005-Disable-MSI-X-caps.patch

This file was deleted.

0 comments on commit ca8a488

Please sign in to comment.