PCI: Add Intel remapped NVMe device support
Consumer products that are configured by default to run the Intel SATA AHCI
controller in "RAID" or "Intel RST Premium With Intel Optane System
Acceleration" mode are becoming increasingly prevalent.

Under this mode, NVMe devices are remapped into the SATA device and become
hidden from the PCI bus, which means that Linux users cannot access their
storage devices unless they go into the firmware setup menu to switch back
to AHCI mode - assuming such an option is available. Lack of support for
this mode is also causing complications for vendors who distribute Linux.

Add support for the remapped NVMe mode by creating a virtual PCI bus,
where the AHCI and NVMe devices are presented separately, allowing the
ahci and nvme drivers to bind in the normal way.

Unfortunately the NVMe device configuration space is inaccessible under
this scheme, so we provide a fake one, and hope that no DeviceID-based
quirks are needed. The interrupt is shared between the AHCI and NVMe
devices.
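
To illustrate what "providing a fake configuration space" can mean, here is a minimal userspace sketch. It is not the driver's code: the names `fake_cfg`, `fake_cfg_init` and `fake_cfg_read` are hypothetical, and the choice of Intel's vendor ID (0x8086) for the fake device is an assumption. The only facts relied on are the standard PCI header offsets and the NVMe class code 0x010802.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE       256  /* conventional PCI configuration space size */
#define CFG_VENDOR_ID  0x00 /* 16-bit vendor ID */
#define CFG_CLASS_PROG 0x09 /* programming interface */
#define CFG_CLASS_SUB  0x0a /* subclass */
#define CFG_CLASS_BASE 0x0b /* base class */

/* Hypothetical backing store for a device whose real config space is hidden. */
struct fake_cfg {
	uint8_t bytes[CFG_SIZE];
};

/* Fill in just enough of the header for class-based driver matching. */
void fake_cfg_init(struct fake_cfg *cfg, uint16_t vendor)
{
	memset(cfg, 0, sizeof(*cfg));
	cfg->bytes[CFG_VENDOR_ID] = (uint8_t)(vendor & 0xff);
	cfg->bytes[CFG_VENDOR_ID + 1] = (uint8_t)(vendor >> 8);
	/* Class code 0x010802: mass storage, non-volatile memory, NVMe */
	cfg->bytes[CFG_CLASS_BASE] = 0x01;
	cfg->bytes[CFG_CLASS_SUB] = 0x08;
	cfg->bytes[CFG_CLASS_PROG] = 0x02;
}

/* Little-endian read of 1, 2 or 4 bytes; returns 0 on success, -1 on bad args. */
int fake_cfg_read(const struct fake_cfg *cfg, unsigned int where,
		  unsigned int size, uint32_t *val)
{
	uint32_t v = 0;
	unsigned int i;

	if ((size != 1 && size != 2 && size != 4) || where > CFG_SIZE - size)
		return -1;

	for (i = 0; i < size; i++)
		v |= (uint32_t)cfg->bytes[where + i] << (8 * i);

	*val = v;
	return 0;
}

/* Self-check: the class code reads back as NVMe. Returns 0 on success. */
int fake_cfg_demo(void)
{
	struct fake_cfg cfg;
	uint32_t v = 0;

	fake_cfg_init(&cfg, 0x8086); /* assumed vendor ID, for illustration */
	assert(fake_cfg_read(&cfg, 0x08, 4, &v) == 0);
	assert((v >> 8) == 0x010802);
	return 0;
}
```

Because every read is served from a static image, anything that keys on the real DeviceID would see only the placeholder value, which is why the message hopes that no DeviceID-based quirks are needed.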

Allow pci_real_dma_dev() to traverse back to the real DMA device from
the PCI devices created on our virtual bus, in case the iommu driver
will be involved with data transfers here.

The existing ahci driver is modified to not claim devices where remapped
NVMe devices are present, allowing this new driver to step in.

The details of the remapping scheme came from patches previously
posted by Dan Williams and the resulting discussion.

https://phabricator.endlessm.com/T24358
https://phabricator.endlessm.com/T29119

Signed-off-by: Daniel Drake <drake@endlessm.com>
dsd authored and damentz committed Jul 24, 2021
1 parent efe513d commit 28defae
Showing 5 changed files with 488 additions and 16 deletions.
6 changes: 6 additions & 0 deletions arch/x86/include/asm/pci.h
@@ -27,6 +27,7 @@ struct pci_sysdata {
 #if IS_ENABLED(CONFIG_VMD)
 	struct pci_dev *vmd_dev; /* VMD Device if in Intel VMD domain */
 #endif
+	struct pci_dev *nvme_remap_dev; /* AHCI Device if NVME remapped bus */
 };

 extern int pci_routeirq;
@@ -70,6 +71,11 @@ static inline bool is_vmd(struct pci_bus *bus)
 #define is_vmd(bus) false
 #endif /* CONFIG_VMD */

+static inline bool is_nvme_remap(struct pci_bus *bus)
+{
+	return to_pci_sysdata(bus)->nvme_remap_dev != NULL;
+}
+
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
    or architectures with incomplete PCI setup by the loader */
7 changes: 5 additions & 2 deletions arch/x86/pci/common.c
@@ -715,12 +715,15 @@ int pci_ext_cfg_avail(void)
 	return 0;
 }

-#if IS_ENABLED(CONFIG_VMD)
 struct pci_dev *pci_real_dma_dev(struct pci_dev *dev)
 {
+#if IS_ENABLED(CONFIG_VMD)
 	if (is_vmd(dev->bus))
 		return to_pci_sysdata(dev->bus)->vmd_dev;
+#endif
+
+	if (is_nvme_remap(dev->bus))
+		return to_pci_sysdata(dev->bus)->nvme_remap_dev;

 	return dev;
 }
-#endif
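
The lookup order in the patched pci_real_dma_dev() can be tried outside the kernel with a small userspace model. This is only a sketch: every `mock_*` type and function below is a hypothetical stand-in for the kernel structures, reduced to the fields this hunk touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_pci_dev;

/* Stand-in for struct pci_sysdata: only the two back-pointers. */
struct mock_sysdata {
	struct mock_pci_dev *vmd_dev;        /* set on a VMD domain bus */
	struct mock_pci_dev *nvme_remap_dev; /* set on a remapped-NVMe bus */
};

struct mock_pci_bus {
	struct mock_sysdata *sysdata;
};

struct mock_pci_dev {
	struct mock_pci_bus *bus;
};

bool mock_is_vmd(const struct mock_pci_bus *bus)
{
	return bus->sysdata->vmd_dev != NULL;
}

bool mock_is_nvme_remap(const struct mock_pci_bus *bus)
{
	return bus->sysdata->nvme_remap_dev != NULL;
}

/* Same order as the patched function: VMD, then NVMe remap, else the device. */
struct mock_pci_dev *mock_real_dma_dev(struct mock_pci_dev *dev)
{
	if (mock_is_vmd(dev->bus))
		return dev->bus->sysdata->vmd_dev;

	if (mock_is_nvme_remap(dev->bus))
		return dev->bus->sysdata->nvme_remap_dev;

	return dev;
}

/* Self-check: an NVMe child on the virtual bus resolves to the AHCI device. */
int mock_real_dma_dev_demo(void)
{
	struct mock_sysdata host_sd = { NULL, NULL };
	struct mock_pci_bus host_bus = { &host_sd };
	struct mock_pci_dev ahci = { &host_bus };

	struct mock_sysdata remap_sd = { NULL, &ahci };
	struct mock_pci_bus remap_bus = { &remap_sd };
	struct mock_pci_dev nvme = { &remap_bus };

	assert(mock_real_dma_dev(&ahci) == &ahci); /* ordinary bus: unchanged */
	assert(mock_real_dma_dev(&nvme) == &ahci); /* virtual bus: AHCI parent */
	return 0;
}
```

In the model, as in the hunk, a device on an ordinary bus maps to itself, so callers can use the helper unconditionally.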
23 changes: 9 additions & 14 deletions drivers/ata/ahci.c
@@ -1497,7 +1497,7 @@ static irqreturn_t ahci_thunderx_irq_handler(int irq, void *dev_instance)
 }
 #endif

-static void ahci_remap_check(struct pci_dev *pdev, int bar,
+static int ahci_remap_check(struct pci_dev *pdev, int bar,
 		struct ahci_host_priv *hpriv)
 {
 	int i;
@@ -1510,7 +1510,7 @@ static void ahci_remap_check(struct pci_dev *pdev, int bar,
 	    pci_resource_len(pdev, bar) < SZ_512K ||
 	    bar != AHCI_PCI_BAR_STANDARD ||
 	    !(readl(hpriv->mmio + AHCI_VSCAP) & 1))
-		return;
+		return 0;

 	cap = readq(hpriv->mmio + AHCI_REMAP_CAP);
 	for (i = 0; i < AHCI_MAX_REMAP; i++) {
@@ -1525,18 +1525,11 @@ static void ahci_remap_check(struct pci_dev *pdev, int bar,
 	}

 	if (!hpriv->remapped_nvme)
-		return;
-
-	dev_warn(&pdev->dev, "Found %u remapped NVMe devices.\n",
-		 hpriv->remapped_nvme);
-	dev_warn(&pdev->dev,
-		 "Switch your BIOS from RAID to AHCI mode to use them.\n");
+		return 0;

-	/*
-	 * Don't rely on the msi-x capability in the remap case,
-	 * share the legacy interrupt across ahci and remapped devices.
-	 */
-	hpriv->flags |= AHCI_HFLAG_NO_MSI;
+	/* Abort probe, allowing intel-nvme-remap to step in when available */
+	dev_info(&pdev->dev, "Device will be handled by intel-nvme-remap.\n");
+	return -ENODEV;
 }

 static int ahci_get_irq_vector(struct ata_host *host, int port)
@@ -1761,7 +1754,9 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	hpriv->mmio = pcim_iomap_table(pdev)[ahci_pci_bar];

 	/* detect remapped nvme devices */
-	ahci_remap_check(pdev, ahci_pci_bar, hpriv);
+	rc = ahci_remap_check(pdev, ahci_pci_bar, hpriv);
+	if (rc)
+		return rc;

 	sysfs_add_file_to_group(&pdev->dev.kobj,
 				&dev_attr_remapped_nvme.attr,
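
The control flow introduced in ahci.c can be summarised with a short userspace sketch: the check now reports a result, and the probe stops as soon as that result is non-zero. The `mock_*` names are hypothetical, and the register scan that counts remapped devices is deliberately left out; the count is passed in directly.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the one ahci_host_priv field used here. */
struct mock_hpriv {
	unsigned int remapped_nvme; /* number of remapped NVMe devices found */
};

/* New return convention: 0 to carry on, -ENODEV to decline the device. */
int mock_remap_check(const struct mock_hpriv *hpriv)
{
	if (!hpriv->remapped_nvme)
		return 0;

	/* Decline so that another driver can bind to the device. */
	return -ENODEV;
}

/* Caller side: any non-zero result ends the probe before normal setup. */
int mock_init_one(const struct mock_hpriv *hpriv, int *claimed)
{
	int rc = mock_remap_check(hpriv);

	if (rc)
		return rc;

	*claimed = 1; /* the usual AHCI initialisation would continue here */
	return 0;
}

/* Self-check covering both outcomes. Returns 0 on success. */
int mock_init_one_demo(void)
{
	struct mock_hpriv plain = { 0 };
	struct mock_hpriv remapped = { 2 };
	int claimed = 0;

	assert(mock_init_one(&plain, &claimed) == 0 && claimed == 1);

	claimed = 0;
	assert(mock_init_one(&remapped, &claimed) == -ENODEV && claimed == 0);
	return 0;
}
```

Returning -ENODEV from a probe routine is the conventional way for a driver to say "not mine", which leaves the device free for intel-nvme-remap to claim.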
6 changes: 6 additions & 0 deletions drivers/pci/controller/Makefile
@@ -1,4 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0
+ifdef CONFIG_X86_64
+ifdef CONFIG_SATA_AHCI
+obj-y += intel-nvme-remap.o
+endif
+endif
+
 obj-$(CONFIG_PCIE_CADENCE) += cadence/
 obj-$(CONFIG_PCI_FTPCI100) += pci-ftpci100.o
 obj-$(CONFIG_PCI_HYPERV) += pci-hyperv.o