scsi/block PM: Always set request queue runtime active in blk_post_runtime_resume()

A deadlock may occur if the device is being removed at the same time as
a runtime PM resume is occurring, as reported in:
https://lore.kernel.org/linux-scsi/47f35811-33c5-9620-45d5-8201e5ec5db3@huawei.com/T/#ma44d489dce5f2a4d7c50d101e35e5f7f7ff3b3b1

The respective stack frames are as follows:
__switch_to+0x164/0x1d4
__schedule+0x8f8/0x1450
schedule+0x7c/0x110
blk_queue_enter+0x290/0x490
blk_mq_alloc_request+0x50/0xb4
blk_get_request+0x38/0x80
__scsi_execute+0x6c/0x1c4
scsi_vpd_inquiry+0x88/0xf0
scsi_get_vpd_buf+0x68/0xb0
scsi_attach_vpd+0x58/0x170
scsi_rescan_device+0x40/0xac
ata_scsi_dev_rescan+0xb4/0x14c
process_one_work+0x29c/0x6fc
worker_thread+0x80/0x470
kthread+0x15c/0x170
ret_from_fork+0x10/0x18

sas_phy_event_worker [libsas]
Call trace:
__switch_to+0x164/0x1d4
__schedule+0x8f8/0x1450
schedule+0x7c/0x110
schedule_preempt_disabled+0x30/0x4c
__mutex_lock+0x308/0x8b0
mutex_lock_nested+0x44/0x70
device_del+0x4c/0x450
__scsi_remove_device+0x11c/0x14c
scsi_remove_target+0x1bc/0x240
sas_rphy_remove+0x90/0x94
sas_rphy_delete+0x24/0x40
sas_destruct_devices+0x64/0xa0 [libsas]
sas_deform_port+0x178/0x1bc [libsas]
sas_phye_loss_of_signal+0x28/0x34 [libsas]
sas_phy_event_worker+0x34/0x50 [libsas]
process_one_work+0x29c/0x6fc
worker_thread+0x80/0x470
kthread+0x15c/0x170
ret_from_fork+0x10/0x18
INFO: task fio:3382 blocked for more than 121

Since commit e27829d ("scsi: serialize ->rescan against ->remove"), the
SCSI rescan code holds the sdev_gendev device lock while executing the
rescan.

However, the device is being removed, which means that the PM resume fails,
which in turn means that the request queue cannot become runtime active.
The rescan therefore blocks in blk_queue_enter() while holding the lock.
But the device removal in __scsi_remove_device() requires the same lock,
which leads to deadlock.
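
The wait-for cycle described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions: struct
pm_model, the MODEL_RPM_* values and all three helpers are hypothetical
stand-ins, not kernel types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue's runtime PM status. */
enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

/* Hypothetical model of the two resources involved in the report. */
struct pm_model {
	bool device_lock_held;              /* rescan holds the device lock */
	enum model_rpm_status queue_status; /* models q->rpm_status */
};

/* Rescan path: already holds the lock, waits for an active queue. */
static bool rescan_can_progress(const struct pm_model *m)
{
	return m->queue_status == MODEL_RPM_ACTIVE;
}

/* Removal path: needs the device lock that the rescan path holds. */
static bool remove_can_progress(const struct pm_model *m)
{
	return !m->device_lock_held;
}

/* Deadlock in this model: neither path can make progress. */
static bool is_deadlocked(const struct pm_model *m)
{
	assert(m != NULL);
	return !rescan_can_progress(m) && !remove_can_progress(m);
}
```

In this model, a held lock combined with a suspended queue is reported as
deadlocked, while an active queue lets the rescan finish and release the
lock.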

As a solution, always make the request queue active in
blk_post_runtime_resume(). It should not make much difference, since the
device and respective request queue are being removed anyway.
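
The behavioural change can be illustrated with a minimal userspace sketch.
It assumes a simplified queue with no locking; struct model_queue, the
MODEL_RPM_* values and both model_post_resume_*() helpers are hypothetical
names used for illustration only, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue's runtime PM status. */
enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

/* Hypothetical, lock-free model of a request queue. */
struct model_queue {
	int has_dev;                      /* models q->dev being set */
	enum model_rpm_status rpm_status; /* models q->rpm_status */
};

/* Old behaviour: a failed resume leaves the queue suspended. */
static void model_post_resume_old(struct model_queue *q, int err)
{
	assert(q != NULL);
	if (!q->has_dev)
		return;
	q->rpm_status = err ? MODEL_RPM_SUSPENDED : MODEL_RPM_ACTIVE;
}

/* New behaviour: the queue becomes active regardless of the result. */
static void model_post_resume_new(struct model_queue *q)
{
	assert(q != NULL);
	if (!q->has_dev)
		return;
	q->rpm_status = MODEL_RPM_ACTIVE;
}
```

With a non-zero err, the old variant leaves the model queue suspended, so
a waiter would keep waiting; the new variant always ends with the queue
active.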

Signed-off-by: John Garry <john.garry@huawei.com>
John Garry committed Jul 12, 2021
1 parent 5c89ee7 commit 3696ca8
Showing 3 changed files with 6 additions and 15 deletions.
17 changes: 4 additions & 13 deletions block/blk-pm.c
@@ -163,27 +163,18 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
 /**
  * blk_post_runtime_resume - Post runtime resume processing
  * @q: the queue of the device
- * @err: return value of the device's runtime_resume function
  *
  * Description:
- * Update the queue's runtime status according to the return value of the
- * device's runtime_resume function. If the resume was successful, call
- * blk_set_runtime_active() to do the real work of restarting the queue.
+ * Call blk_set_runtime_active() to do the real work of restarting the queue.
  *
  * This function should be called near the end of the device's
  * runtime_resume callback.
  */
-void blk_post_runtime_resume(struct request_queue *q, int err)
+void blk_post_runtime_resume(struct request_queue *q)
 {
 	if (!q->dev)
 		return;
-	if (!err) {
-		blk_set_runtime_active(q);
-	} else {
-		spin_lock_irq(&q->queue_lock);
-		q->rpm_status = RPM_SUSPENDED;
-		spin_unlock_irq(&q->queue_lock);
-	}
+	blk_set_runtime_active(q);
 }
 EXPORT_SYMBOL(blk_post_runtime_resume);

@@ -201,7 +192,7 @@ EXPORT_SYMBOL(blk_post_runtime_resume);
  * runtime PM status and re-enable peeking requests from the queue. It
  * should be called before first request is added to the queue.
  *
- * This function is also called by blk_post_runtime_resume() for successful
+ * This function is also called by blk_post_runtime_resume() for all
  * runtime resumes. It does everything necessary to restart the queue.
  */
 void blk_set_runtime_active(struct request_queue *q)
2 changes: 1 addition & 1 deletion drivers/scsi/scsi_pm.c
@@ -262,7 +262,7 @@ static int sdev_runtime_resume(struct device *dev)
 	blk_pre_runtime_resume(sdev->request_queue);
 	if (pm && pm->runtime_resume)
 		err = pm->runtime_resume(dev);
-	blk_post_runtime_resume(sdev->request_queue, err);
+	blk_post_runtime_resume(sdev->request_queue);
 
 	return err;
 }
2 changes: 1 addition & 1 deletion include/linux/blk-pm.h
@@ -14,7 +14,7 @@ extern void blk_pm_runtime_init(struct request_queue *q, struct device *dev);
 extern int blk_pre_runtime_suspend(struct request_queue *q);
 extern void blk_post_runtime_suspend(struct request_queue *q, int err);
 extern void blk_pre_runtime_resume(struct request_queue *q);
-extern void blk_post_runtime_resume(struct request_queue *q, int err);
+extern void blk_post_runtime_resume(struct request_queue *q);
 extern void blk_set_runtime_active(struct request_queue *q);
 #else
 static inline void blk_pm_runtime_init(struct request_queue *q,
