Skip to content

Commit

Permalink
nvme-pci: fix hot removal during error handling
Browse files Browse the repository at this point in the history
A removal waits for the reset_work to complete. If a surprise removal
occurs around the same time as an error triggered controller reset, and
reset work happened to dispatch a command to the removed controller, the
command won't be recovered since the timeout work doesn't do anything
during error recovery. We wouldn't want to wait for timeout handling
anyway, so this patch fixes this by disabling the controller and killing
admin queues prior to syncing with the reset_work.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
  • Loading branch information
Keith Busch authored and Christoph Hellwig committed Oct 17, 2018
1 parent 1216e9e commit cb4bfda
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions drivers/nvme/host/pci.c
Original file line number Diff line number Diff line change
Expand Up @@ -2564,13 +2564,12 @@ static void nvme_remove(struct pci_dev *pdev)
struct nvme_dev *dev = pci_get_drvdata(pdev);

nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);

cancel_work_sync(&dev->ctrl.reset_work);
pci_set_drvdata(pdev, NULL);

if (!pci_device_is_present(pdev)) {
nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
nvme_dev_disable(dev, true);
nvme_dev_remove_admin(dev);
}

flush_work(&dev->ctrl.reset_work);
Expand Down

0 comments on commit cb4bfda

Please sign in to comment.