Skip to content

Commit

Permalink
IOMMU: make DMA containment of quarantined devices optional
Browse files Browse the repository at this point in the history
Containing still in flight DMA was introduced to work around certain
devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
Passing through (such) devices (on such systems) is inherently insecure
(as guests could easily arrange for IOMMU faults of any kind to occur).
Defaulting to a mode where admins may not even become aware of issues
with devices can be considered undesirable. Therefore convert this mode
of operation to an optional one, not one enabled by default.

This involves resurrecting code commit ea38867 ("x86 / iommu: set
up a scratch page in the quarantine domain") did remove, in a slightly
extended and abstracted fashion. Here, instead of reintroducing a pretty
pointless use of "goto" in domain_context_unmap(), and instead of making
the function (at least temporarily) inconsistent, take the opportunity
and replace the other similarly pointless "goto" as well.

In order to key the re-instated bypasses off of there (not) being a root
page table this further requires moving the allocate_domain_resources()
invocation from reassign_device() to amd_iommu_setup_domain_device() (or
else reassign_device() would allocate a root page table anyway); this is
benign to the second caller of the latter function.

In VT-d's domain_context_unmap(), instead of adding yet another
"goto out" when all that's wanted is a "return", eliminate the "out"
label at the same time.

Take the opportunity and also limit the control to builds supporting
PCI.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
  • Loading branch information
jbeulich committed Jul 7, 2021
1 parent f17a73b commit 980d6ac
Show file tree
Hide file tree
Showing 6 changed files with 127 additions and 41 deletions.
31 changes: 26 additions & 5 deletions docs/misc/xen-command-line.pandoc
Original file line number Diff line number Diff line change
Expand Up @@ -1364,7 +1364,7 @@ detection of systems known to misbehave upon accesses to that port.
> Default: `new` unless directed-EOI is supported

### iommu
= List of [ <bool>, verbose, debug, force, required, quarantine,
= List of [ <bool>, verbose, debug, force, required, quarantine[=scratch-page],
sharept, intremap, intpost, crash-disable,
snoop, qinval, igfx, amd-iommu-perdev-intremap,
dom0-{passthrough,strict} ]
Expand Down Expand Up @@ -1402,11 +1402,32 @@ boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.
will prevent Xen from booting if IOMMUs aren't discovered and enabled
successfully.

* The `quarantine` boolean can be used to control Xen's behavior when
de-assigning devices from guests. If enabled (the default), Xen always
* The `quarantine` option can be used to control Xen's behavior when
de-assigning devices from guests.

When a PCI device is assigned to an untrusted domain, it is possible
for that domain to program the device to DMA to an arbitrary address.
The IOMMU is used to protect the host from malicious DMA by making
sure that the device addresses can only target memory assigned to the
guest. However, when the guest domain is torn down, assigning the
device back to the hardware domain would allow any in-flight DMA to
potentially target critical host data. To avoid this, quarantining
should be enabled. Quarantining can be done in two ways: In its basic
form, all in-flight DMA will simply be forced to encounter IOMMU
faults. Since there are systems where doing so can cause host lockup,
an alternative form is available where writes to memory will be made
fault, but reads will be directed to a scratch page. The implication
here is that such reads will go unnoticed, i.e. an admin may not
become aware of the underlying problem.

Therefore, if this option is set to true (the default), Xen always
quarantines such devices; they must be explicitly assigned back to Dom0
before they can be used there again. If disabled, Xen will only
quarantine devices the toolstack hass arranged for getting quarantined.
before they can be used there again. If set to "scratch-page", still
active DMA reads will additionally be directed to a "scratch" page. If
set to false, Xen will only quarantine devices the toolstack has arranged
for getting quarantined, and only in the "basic" form.

This option is only valid on builds supporting PCI.

* The `sharept` boolean controls whether the IOMMU pagetables are shared
with the CPU-side HAP pagetables, or allocated separately. Sharing
Expand Down
28 changes: 28 additions & 0 deletions xen/drivers/passthrough/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -39,3 +39,31 @@ endif

config IOMMU_FORCE_PT_SHARE
bool

choice
prompt "IOMMU device quarantining default behavior"
depends on HAS_PCI
default IOMMU_QUARANTINE_BASIC
---help---
When a PCI device is assigned to an untrusted domain, it is possible
for that domain to program the device to DMA to an arbitrary address.
The IOMMU is used to protect the host from malicious DMA by making
sure that the device addresses can only target memory assigned to the
guest. However, when the guest domain is torn down, assigning the
device back to the hardware domain would allow any in-flight DMA to
potentially target critical host data. To avoid this, quarantining
should be enabled. Quarantining can be done in two ways: In its basic
form, all in-flight DMA will simply be forced to encounter IOMMU
faults. Since there are systems where doing so can cause host lockup,
an alternative form is available where writes to memory will be made
fault, but reads will be directed to a scratch page. The implication
here is that such reads will go unnoticed, i.e. an admin may not
become aware of the underlying problem.

config IOMMU_QUARANTINE_NONE
bool "none"
config IOMMU_QUARANTINE_BASIC
bool "basic"
config IOMMU_QUARANTINE_SCRATCH_PAGE
bool "scratch page"
endchoice
55 changes: 33 additions & 22 deletions xen/drivers/passthrough/amd/pci_amd_iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@
#include "iommu.h"
#include "../ats.h"

/* dom_io is used as a sentinel for quarantined devices */
#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.amd.root_table)

static bool_t __read_mostly init_done;

static const struct iommu_init_ops _iommu_init_ops;
Expand Down Expand Up @@ -81,19 +84,36 @@ int get_dma_requestor_id(uint16_t seg, uint16_t bdf)
return req_id;
}

static void amd_iommu_setup_domain_device(
static int __must_check allocate_domain_resources(struct domain *d)
{
struct domain_iommu *hd = dom_iommu(d);
int rc;

spin_lock(&hd->arch.mapping_lock);
rc = amd_iommu_alloc_root(d);
spin_unlock(&hd->arch.mapping_lock);

return rc;
}

static int __must_check amd_iommu_setup_domain_device(
struct domain *domain, struct amd_iommu *iommu,
uint8_t devfn, struct pci_dev *pdev)
{
struct amd_iommu_dte *table, *dte;
unsigned long flags;
int req_id, valid = 1;
int req_id, valid = 1, rc;
u8 bus = pdev->bus;
const struct domain_iommu *hd = dom_iommu(domain);
struct domain_iommu *hd = dom_iommu(domain);

BUG_ON( !hd->arch.amd.root_table ||
!hd->arch.amd.paging_mode ||
!iommu->dev_table.buffer );
if ( QUARANTINE_SKIP(domain) )
return 0;

BUG_ON(!hd->arch.amd.paging_mode || !iommu->dev_table.buffer);

rc = allocate_domain_resources(domain);
if ( rc )
return rc;

if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
valid = 0;
Expand Down Expand Up @@ -153,6 +173,8 @@ static void amd_iommu_setup_domain_device(

amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
}

return 0;
}

int __init acpi_ivrs_init(void)
Expand Down Expand Up @@ -224,18 +246,6 @@ int amd_iommu_alloc_root(struct domain *d)
return 0;
}

static int __must_check allocate_domain_resources(struct domain *d)
{
struct domain_iommu *hd = dom_iommu(d);
int rc;

spin_lock(&hd->arch.mapping_lock);
rc = amd_iommu_alloc_root(d);
spin_unlock(&hd->arch.mapping_lock);

return rc;
}

static int amd_iommu_domain_init(struct domain *d)
{
struct domain_iommu *hd = dom_iommu(d);
Expand Down Expand Up @@ -285,6 +295,9 @@ static void amd_iommu_disable_domain_device(const struct domain *domain,
int req_id;
u8 bus = pdev->bus;

if ( QUARANTINE_SKIP(domain) )
return;

BUG_ON ( iommu->dev_table.buffer == NULL );
req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
table = iommu->dev_table.buffer;
Expand Down Expand Up @@ -354,11 +367,10 @@ static int reassign_device(struct domain *source, struct domain *target,
pdev->domain = target;
}

rc = allocate_domain_resources(target);
rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
if ( rc )
return rc;

amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
AMD_IOMMU_DEBUG("Re-assign %pp from dom%d to dom%d\n",
&pdev->sbdf, source->domain_id, target->domain_id);

Expand Down Expand Up @@ -465,8 +477,7 @@ static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
amd_iommu_flush_device(iommu, bdf);
}

amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
return 0;
return amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
}

static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
Expand Down
23 changes: 21 additions & 2 deletions xen/drivers/passthrough/iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,24 @@ bool_t __initdata iommu_enable = 1;
bool_t __read_mostly iommu_enabled;
bool_t __read_mostly force_iommu;
bool_t __read_mostly iommu_verbose;
bool __read_mostly iommu_quarantine = true;
bool_t __read_mostly iommu_crash_disable;

#define IOMMU_quarantine_none 0 /* aka false */
#define IOMMU_quarantine_basic 1 /* aka true */
#define IOMMU_quarantine_scratch_page 2
#ifdef CONFIG_HAS_PCI
uint8_t __read_mostly iommu_quarantine =
# if defined(CONFIG_IOMMU_QUARANTINE_NONE)
IOMMU_quarantine_none;
# elif defined(CONFIG_IOMMU_QUARANTINE_BASIC)
IOMMU_quarantine_basic;
# elif defined(CONFIG_IOMMU_QUARANTINE_SCRATCH_PAGE)
IOMMU_quarantine_scratch_page;
# endif
#else
# define iommu_quarantine IOMMU_quarantine_none
#endif /* CONFIG_HAS_PCI */

static bool __hwdom_initdata iommu_hwdom_none;
bool __hwdom_initdata iommu_hwdom_strict;
bool __read_mostly iommu_hwdom_passthrough;
Expand Down Expand Up @@ -64,8 +79,12 @@ static int __init parse_iommu_param(const char *s)
else if ( (val = parse_boolean("force", s, ss)) >= 0 ||
(val = parse_boolean("required", s, ss)) >= 0 )
force_iommu = val;
#ifdef CONFIG_HAS_PCI
else if ( (val = parse_boolean("quarantine", s, ss)) >= 0 )
iommu_quarantine = val;
else if ( ss == s + 15 && !strncmp(s, "quarantine=scratch-page", 23) )
iommu_quarantine = IOMMU_quarantine_scratch_page;
#endif
#ifdef CONFIG_X86
else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
iommu_igfx = val;
Expand Down Expand Up @@ -432,7 +451,7 @@ static int __init iommu_quarantine_init(void)
dom_io->options |= XEN_DOMCTL_CDF_iommu;

rc = iommu_domain_init(dom_io, 0);
if ( rc )
if ( rc || iommu_quarantine < IOMMU_quarantine_scratch_page )
return rc;

if ( !hd->platform_ops->quarantine_init )
Expand Down
27 changes: 16 additions & 11 deletions xen/drivers/passthrough/vtd/iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@
#include "vtd.h"
#include "../ats.h"

/* dom_io is used as a sentinel for quarantined devices */
#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.vtd.pgd_maddr)

struct mapped_rmrr {
struct list_head list;
u64 base, end;
Expand Down Expand Up @@ -1342,6 +1345,9 @@ int domain_context_mapping_one(
int rc, ret;
bool_t flush_dev_iotlb;

if ( QUARANTINE_SKIP(domain) )
return 0;

ASSERT(pcidevs_locked());
spin_lock(&iommu->lock);
maddr = bus_to_context_maddr(iommu, bus);
Expand Down Expand Up @@ -1584,6 +1590,9 @@ int domain_context_unmap_one(
int iommu_domid, rc, ret;
bool_t flush_dev_iotlb;

if ( QUARANTINE_SKIP(domain) )
return 0;

ASSERT(pcidevs_locked());
spin_lock(&iommu->lock);

Expand Down Expand Up @@ -1658,7 +1667,7 @@ static int domain_context_unmap(struct domain *domain, u8 devfn,
{
struct acpi_drhd_unit *drhd;
struct vtd_iommu *iommu;
int ret = 0;
int ret;
u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
int found = 0;

Expand All @@ -1673,14 +1682,12 @@ static int domain_context_unmap(struct domain *domain, u8 devfn,
if ( iommu_debug )
printk(VTDPREFIX "%pd:Hostbridge: skip %pp unmap\n",
domain, &PCI_SBDF3(seg, bus, devfn));
if ( !is_hardware_domain(domain) )
return -EPERM;
goto out;
return is_hardware_domain(domain) ? 0 : -EPERM;

case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
case DEV_TYPE_LEGACY_PCI_BRIDGE:
goto out;
return 0;

case DEV_TYPE_PCIe_ENDPOINT:
if ( iommu_debug )
Expand Down Expand Up @@ -1734,12 +1741,11 @@ static int domain_context_unmap(struct domain *domain, u8 devfn,
default:
dprintk(XENLOG_ERR VTDPREFIX, "%pd:unknown(%u): %pp\n",
domain, pdev->type, &PCI_SBDF3(seg, bus, devfn));
ret = -EINVAL;
goto out;
return -EINVAL;
}

if ( ret )
goto out;
if ( ret || QUARANTINE_SKIP(domain) )
return ret;

/*
* if no other devices under the same iommu owned by this domain,
Expand All @@ -1764,8 +1770,7 @@ static int domain_context_unmap(struct domain *domain, u8 devfn,
cleanup_domid_map(domain, iommu);
}

out:
return ret;
return 0;
}

static void iommu_clear_root_pgtable(struct domain *d)
Expand Down
4 changes: 3 additions & 1 deletion xen/include/xen/iommu.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,9 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)

#ifdef CONFIG_HAS_PASSTHROUGH
extern bool_t iommu_enable, iommu_enabled;
extern bool force_iommu, iommu_quarantine, iommu_verbose;
extern bool force_iommu, iommu_verbose;
/* Boolean except for the specific purposes of drivers/passthrough/iommu.c. */
extern uint8_t iommu_quarantine;
#else
#define iommu_enabled false
#endif
Expand Down

0 comments on commit 980d6ac

Please sign in to comment.