Skip to content

Commit

Permalink
nvme: fix possible hang when removing a controller during error recovery
Browse files Browse the repository at this point in the history
commit 1b95e81 upstream.

Error recovery can be interrupted by controller removal, then the
controller is left as quiesced, and IO hang can be caused.

Fix the issue by unquiescing controller unconditionally when removing
namespaces.

This way is reasonable and safe given forward progress can be made
when removing namespaces.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Chunguang Xu <brookxu.cn@gmail.com>
Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/
Cc: stable@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  • Loading branch information
Ming Lei authored and gregkh committed Aug 16, 2023
1 parent b6c506a commit 0712a9e
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions drivers/nvme/host/core.c
Expand Up @@ -4728,6 +4728,12 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
*/
nvme_mpath_clear_ctrl_paths(ctrl);

/*
* Unquiesce io queues so any pending IO won't hang, especially
* those submitted from scan work
*/
nvme_unquiesce_io_queues(ctrl);

/* prevent racing with ns scanning */
flush_work(&ctrl->scan_work);

Expand All @@ -4737,10 +4743,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
* removing the namespaces' disks; fail all the queues now to avoid
* potentially having to clean up the failed sync later.
*/
if (ctrl->state == NVME_CTRL_DEAD) {
if (ctrl->state == NVME_CTRL_DEAD)
nvme_mark_namespaces_dead(ctrl);
nvme_unquiesce_io_queues(ctrl);
}

/* this is a no-op when called from the controller reset handler */
nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
Expand Down

0 comments on commit 0712a9e

Please sign in to comment.