
Commit cad253d

l1k authored and Sasha Levin committed
PCI/AER: Clear stale errors on reporting agents upon probe
[ Upstream commit e242d09 ]

Correctable and Uncorrectable Error Status Registers on reporting agents are cleared upon PCI device enumeration in pci_aer_init() to flush past events. They're cleared again when an error is handled by the AER driver.

If an agent reports a new error after pci_aer_init() and before the AER driver has probed on the corresponding Root Port or Root Complex Event Collector, that error is not handled by the AER driver: It clears the Root Error Status Register on probe, but neglects to re-clear the Correctable and Uncorrectable Error Status Registers on reporting agents.

The error will eventually be reported when another error occurs. Which is irritating because to an end user it appears as if the earlier error has just happened.

Amend the AER driver to clear stale errors on reporting agents upon probe.

Skip reporting agents which have not invoked pci_aer_init() yet to avoid using an uninitialized pdev->aer_cap. They're recognizable by the error bits in the Device Control register still being clear.

Reporting agents may execute pci_aer_init() after the AER driver has probed, particularly when devices are hotplugged or removed/rescanned via sysfs. For this reason, it continues to be necessary that pci_aer_init() clears Correctable and Uncorrectable Error Status Registers.

Reported-by: Lucas Van <lucas.van@intel.com> # off-list
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Lucas Van <lucas.van@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/3011c2ed30c11f858e35e29939add754adea7478.1769332702.git.lukas@wunner.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
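The stale-error scenario described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: it assumes the status registers behave as write-1-to-clear (which is why the driver reads a value and writes it back), and every name in it (`w1c_reg`, `model_agent`, `model_root`, `model_report`, `model_probe`) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear status register. */
struct w1c_reg {
	uint32_t bits;
};

/* Hardware side: latch error bits into the register. */
static void w1c_hw_set(struct w1c_reg *r, uint32_t mask)
{
	r->bits |= mask;
}

static uint32_t w1c_read(const struct w1c_reg *r)
{
	return r->bits;
}

/* Software side: every 1 written clears the corresponding bit. */
static void w1c_write(struct w1c_reg *r, uint32_t val)
{
	r->bits &= ~val;
}

/* Read-then-write-back idiom: clears exactly the bits that were set. */
static uint32_t w1c_flush(struct w1c_reg *r)
{
	uint32_t val = w1c_read(r);

	w1c_write(r, val);
	return val;
}

/* Hypothetical reporting agent and Root Port / RCEC. */
struct model_agent {
	bool aer_initialized;		/* stands in for "pci_aer_init() has run" */
	struct w1c_reg cor_status;
};

struct model_root {
	struct w1c_reg root_status;
};

/* An agent logs an error locally and the root records that it was notified. */
static void model_report(struct model_agent *a, struct model_root *r,
			 uint32_t err)
{
	w1c_hw_set(&a->cor_status, err);
	w1c_hw_set(&r->root_status, 0x1);
}

/*
 * Probe of the root. With clear_agents == false only the root's status is
 * flushed (the behavior before this commit); with clear_agents == true,
 * initialized agents are flushed too if the root had recorded an error.
 */
static void model_probe(struct model_root *r, struct model_agent *agents,
			size_t n, bool clear_agents)
{
	uint32_t stale = w1c_flush(&r->root_status);

	if (!clear_agents || !stale)
		return;

	for (size_t i = 0; i < n; i++)
		if (agents[i].aer_initialized)
			w1c_flush(&agents[i].cor_status);
}
```

In this model, an error reported before `model_probe(..., false)` leaves the agent's bits set, so a later, unrelated error finds both sets of bits latched together. With `clear_agents` set to true the earlier bits are flushed at probe time.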
1 parent a175a76 commit cad253d

drivers/pci/pcie/aer.c

Lines changed: 25 additions & 1 deletion
@@ -1603,6 +1603,20 @@ static void aer_disable_irq(struct pci_dev *pdev)
 	pci_write_config_dword(pdev, aer + PCI_ERR_ROOT_COMMAND, reg32);
 }
 
+static int clear_status_iter(struct pci_dev *dev, void *data)
+{
+	u16 devctl;
+
+	/* Skip if pci_enable_pcie_error_reporting() hasn't been called yet */
+	pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &devctl);
+	if (!(devctl & PCI_EXP_AER_FLAGS))
+		return 0;
+
+	pci_aer_clear_status(dev);
+	pcie_clear_device_status(dev);
+	return 0;
+}
+
 /**
  * aer_enable_rootport - enable Root Port's interrupts when receiving messages
  * @rpc: pointer to a Root Port data structure
@@ -1624,9 +1638,19 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
 	pcie_capability_clear_word(pdev, PCI_EXP_RTCTL,
 				   SYSTEM_ERROR_INTR_ON_MESG_MASK);
 
-	/* Clear error status */
+	/* Clear error status of this Root Port or RCEC */
 	pci_read_config_dword(pdev, aer + PCI_ERR_ROOT_STATUS, &reg32);
 	pci_write_config_dword(pdev, aer + PCI_ERR_ROOT_STATUS, reg32);
+
+	/* Clear error status of agents reporting to this Root Port or RCEC */
+	if (reg32 & AER_ERR_STATUS_MASK) {
+		if (pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_EC)
+			pcie_walk_rcec(pdev, clear_status_iter, NULL);
+		else if (pdev->subordinate)
+			pci_walk_bus(pdev->subordinate, clear_status_iter,
+				     NULL);
+	}
+
 	pci_read_config_dword(pdev, aer + PCI_ERR_COR_STATUS, &reg32);
 	pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS, reg32);
 	pci_read_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, &reg32);
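The walk-with-callback pattern used by the patch can be sketched in user space as follows. This is an illustrative model, not the kernel's implementation: `model_dev`, `model_walk_bus`, `model_clear_status_iter` and `MODEL_DEVCTL_ERR_FLAGS` are hypothetical names, and the flag value assumes the four error reporting enable bits occupy the low nibble of Device Control. As with `pci_walk_bus()`, a callback that returns 0 lets the walk continue, which is why the iterator returns 0 even for devices it skips.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed: the four error reporting enable bits in Device Control. */
#define MODEL_DEVCTL_ERR_FLAGS 0x000fu

/* Hypothetical device node in a bus hierarchy. */
struct model_dev {
	uint16_t devctl;		/* modeled Device Control register */
	uint32_t cor_status;		/* modeled Correctable Error Status */
	uint32_t uncor_status;		/* modeled Uncorrectable Error Status */
	struct model_dev *child;	/* first device on the secondary bus */
	struct model_dev *sibling;	/* next device on the same bus */
};

typedef int (*model_cb)(struct model_dev *dev, void *data);

/*
 * Visit every device on a bus and below it, depth first. A nonzero return
 * from the callback stops the walk and is passed back to the caller.
 */
static int model_walk_bus(struct model_dev *first, model_cb cb, void *data)
{
	for (struct model_dev *d = first; d; d = d->sibling) {
		int ret = cb(d, data);

		if (ret)
			return ret;

		ret = model_walk_bus(d->child, cb, data);
		if (ret)
			return ret;
	}
	return 0;
}

/*
 * Clear the modeled status registers, but only on devices whose error
 * reporting bits are set. Devices that are skipped are left untouched.
 * The optional counter in @data records how many devices were cleared.
 */
static int model_clear_status_iter(struct model_dev *dev, void *data)
{
	int *cleared = data;

	if (!(dev->devctl & MODEL_DEVCTL_ERR_FLAGS))
		return 0;

	dev->cor_status = 0;
	dev->uncor_status = 0;
	if (cleared)
		(*cleared)++;
	return 0;
}
```

Returning 0 from the skip path matters: a nonzero value would abort the walk at the first uninitialized device and leave stale status on every device after it.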
