Commit 9011f06

l1k authored and bjorn-helgaas committed
PCI/ERR: Notify drivers on failure to recover
According to Documentation/PCI/pci-error-recovery.rst, the following shall
occur on failure to recover from a PCIe Uncorrectable Error:

  STEP 6: Permanent Failure
  -------------------------
  A "permanent failure" has occurred, and the platform cannot recover
  the device. The platform will call error_detected() with a
  pci_channel_state_t value of pci_channel_io_perm_failure.

  The device driver should, at this point, assume the worst. It should
  cancel all pending I/O, refuse all new I/O, returning -EIO to
  higher layers. The device driver should then clean up all of its
  memory and remove itself from kernel operations, much as it would
  during system shutdown.

Sathya notes that AER does not call error_detected() on failure and thus
deviates from the document (as well as EEH, for which the document was
originally added).

Most drivers do nothing on permanent failure, but the SCSI drivers and a
number of Ethernet drivers do take advantage of the notification to flush
queues and give up resources.

Amend AER to notify such drivers and align with the documentation and EEH.

Link: https://lore.kernel.org/r/f496fc0f-64d7-46a4-8562-dba74e31a956@linux.intel.com/
Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/ec212d4d4f5c65d29349df33acdc9768ff8279d1.1755008151.git.lukas@wunner.de
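
For illustration only, here is a minimal userspace sketch of what a driver's error_detected() callback might do when handed pci_channel_io_perm_failure, following the STEP 6 text quoted in the commit message. Every name prefixed with mock_ or MOCK_ is a hypothetical stand-in rather than a kernel API, and the sketch makes no claim about how any particular SCSI or Ethernet driver implements this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; values are illustrative only. */
typedef enum {
	mock_channel_io_normal,
	mock_channel_io_frozen,
	mock_channel_io_perm_failure,
} mock_channel_state_t;

typedef enum {
	MOCK_ERS_RESULT_NEED_RESET,
	MOCK_ERS_RESULT_DISCONNECT,
} mock_ers_result_t;

/* Hypothetical per-device driver state. */
struct mock_nic {
	bool dead;       /* set once the device is permanently lost */
	int pending_io;  /* number of requests still in flight */
};

/* Driver-side callback: on permanent failure, assume the worst. */
mock_ers_result_t mock_error_detected(struct mock_nic *nic,
				      mock_channel_state_t state)
{
	assert(nic != NULL);

	if (state == mock_channel_io_perm_failure) {
		nic->dead = true;     /* refuse all new I/O from now on */
		nic->pending_io = 0;  /* cancel everything in flight */
		return MOCK_ERS_RESULT_DISCONNECT;
	}
	return MOCK_ERS_RESULT_NEED_RESET;
}

/* Submission path: returns -EIO to higher layers once the device is dead. */
int mock_submit_io(struct mock_nic *nic)
{
	if (nic->dead)
		return -EIO;
	nic->pending_io++;
	return 0;
}
```

Once the callback has run with the permanent-failure state, mock_submit_io() fails with -EIO. That is the "refuse all new I/O" behaviour the documentation asks for.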
1 parent 1cbc5e2 commit 9011f06

1 file changed, +12 -0 lines changed

drivers/pci/pcie/err.c

Lines changed: 12 additions & 0 deletions
@@ -110,7 +110,19 @@ static int report_normal_detected(struct pci_dev *dev, void *data)

 static int report_perm_failure_detected(struct pci_dev *dev, void *data)
 {
+	struct pci_driver *pdrv;
+	const struct pci_error_handlers *err_handler;
+
+	device_lock(&dev->dev);
+	pdrv = dev->driver;
+	if (!pdrv || !pdrv->err_handler || !pdrv->err_handler->error_detected)
+		goto out;
+
+	err_handler = pdrv->err_handler;
+	err_handler->error_detected(dev, pci_channel_io_perm_failure);
+out:
 	pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
+	device_unlock(&dev->dev);
 	return 0;
 }

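The control flow added by this diff can be modelled in userspace to show which devices get notified. This is a rough sketch under stated assumptions: all mock_ and MOCK_ names are hypothetical, the locked flag merely stands in for device_lock()/device_unlock(), and the uevents counter stands in for pci_uevent_ers(). Nothing here is kernel API.

```c
#include <assert.h>
#include <stddef.h>

enum mock_state { MOCK_IO_NORMAL, MOCK_IO_PERM_FAILURE };

struct mock_dev;

struct mock_err_handlers {
	void (*error_detected)(struct mock_dev *dev, enum mock_state state);
};

struct mock_driver {
	const struct mock_err_handlers *err_handler;
};

struct mock_dev {
	struct mock_driver *driver;  /* NULL when no driver is bound */
	int locked;                  /* stand-in for the device lock */
	int notified;                /* counts error_detected() calls */
	int uevents;                 /* counts disconnect uevents */
};

/* Mirrors the control flow of the patched function, not its real API. */
int mock_report_perm_failure(struct mock_dev *dev)
{
	struct mock_driver *drv;

	dev->locked = 1;
	drv = dev->driver;
	if (drv && drv->err_handler && drv->err_handler->error_detected)
		drv->err_handler->error_detected(dev, MOCK_IO_PERM_FAILURE);

	dev->uevents++;  /* emitted whether or not a handler ran */
	dev->locked = 0;
	return 0;
}

/* Example handler: checks it runs under the lock and records the call. */
void mock_handler(struct mock_dev *dev, enum mock_state state)
{
	assert(dev->locked);
	if (state == MOCK_IO_PERM_FAILURE)
		dev->notified++;
}
```

In this model, a device with no bound driver, or whose driver has no error handler, skips the callback but still gets the disconnect uevent. The goto out path in the diff does the same.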