
Commit a052bfa

Muchun Song authored and axboe committed
block: refactor rq_qos_wait()
When rq_qos_wait() was first introduced, it was easy to understand. But with some bug fixes applied, it is not easy for newcomers to understand the whole logic behind those fixes. In this patch, rq_qos_wait() is refactored and more comments are added for better understanding. There are three points to the improvement:

1) Use waitqueue_active() instead of wq_has_sleeper() to eliminate the unnecessary memory barrier in wq_has_sleeper(), which is supposed to be used on the waker side. In this case we do not need the barrier, so use the cheaper helper to locklessly test for waiters on the queue.

2) Move the acquire_inflight_cb() logic for the first waiter out of the while loop to make the code clearer.

3) Add more comments to explain how the different waiters and the waker synchronize.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250208090416.38642-2-songmuchun@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
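
For reference on point 1): the two helpers differ only by a full memory barrier. A simplified sketch of their definitions, paraphrased from include/linux/wait.h (comments added here; not verbatim kernel source):

static inline int waitqueue_active(struct wait_queue_head *wq_head)
{
	/* Lockless test: is anything queued on the wait list? */
	return !list_empty(&wq_head->head);
}

static inline bool wq_has_sleeper(struct wait_queue_head *wq_head)
{
	/*
	 * Full barrier so the waker's earlier stores (e.g. the wake-up
	 * condition) are ordered before the wait-list load. That ordering
	 * only matters on the waker side; rq_qos_wait() is about to become
	 * a waiter itself, so the plain waitqueue_active() test suffices.
	 */
	smp_mb();
	return waitqueue_active(wq_head);
}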
1 parent 36d03cb commit a052bfa

File tree

1 file changed (+47, -21 lines)

block/blk-rq-qos.c

Lines changed: 47 additions & 21 deletions
@@ -223,6 +223,14 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr,
 	 * Remove explicitly and use default_wake_function().
 	 */
 	default_wake_function(curr, mode, wake_flags, key);
+	/*
+	 * Note that the order of operations is important as finish_wait()
+	 * tests whether @curr is removed without grabbing the lock. This
+	 * should be the last thing to do to make sure we will not have a
+	 * UAF access to @data. And the semantics of memory barrier in it
+	 * also make sure the waiter will see the latest @data->got_token
+	 * once list_empty_careful() in finish_wait() returns true.
+	 */
 	list_del_init_careful(&curr->entry);
 	return 1;
 }
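
The comment added above encodes an ordering rule: the waker must make @data->got_token visible before it removes @curr from the wait list, because finish_wait() uses list_empty_careful() as its only lockless signal that the on-stack wait data may be reused. A hypothetical userspace C11 analogue of that contract (illustration only, not kernel code; the names wait_data, wake_one and try_finish_wait are invented here):

#include <stdatomic.h>
#include <stdbool.h>

struct wait_data {
	bool got_token;		/* written by the waker with a plain store */
	_Atomic bool on_list;	/* stands in for the wait-queue entry */
};

/* Waker side: mirrors the order of operations in rq_qos_wake_function(). */
void wake_one(struct wait_data *data)
{
	data->got_token = true;		/* hand over the token first */
	/* ...the actual wake-up (default_wake_function()) would go here... */
	/*
	 * Must be last: the release store plays the role of
	 * list_del_init_careful(), publishing got_token together with
	 * the "no longer on the list" state.
	 */
	atomic_store_explicit(&data->on_list, false, memory_order_release);
}

/* Waiter side: mirrors the list_empty_careful() test in finish_wait(). */
bool try_finish_wait(struct wait_data *data)
{
	if (atomic_load_explicit(&data->on_list, memory_order_acquire))
		return false;	/* still queued: @data must stay untouched */
	/* The acquire load pairs with the release store above, so
	 * got_token is guaranteed to be up to date here. */
	return data->got_token;
}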
@@ -248,37 +256,55 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 		 cleanup_cb_t *cleanup_cb)
 {
 	struct rq_qos_wait_data data = {
-		.rqw = rqw,
-		.cb = acquire_inflight_cb,
-		.private_data = private_data,
+		.rqw		= rqw,
+		.cb		= acquire_inflight_cb,
+		.private_data	= private_data,
+		.got_token	= false,
 	};
-	bool has_sleeper;
+	bool first_waiter;
 
-	has_sleeper = wq_has_sleeper(&rqw->wait);
-	if (!has_sleeper && acquire_inflight_cb(rqw, private_data))
+	/*
+	 * If there are no waiters in the waiting queue, try to increase the
+	 * inflight counter if we can. Otherwise, prepare for adding ourselves
+	 * to the waiting queue.
+	 */
+	if (!waitqueue_active(&rqw->wait) && acquire_inflight_cb(rqw, private_data))
 		return;
 
 	init_wait_func(&data.wq, rq_qos_wake_function);
-	has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq,
+	first_waiter = prepare_to_wait_exclusive(&rqw->wait, &data.wq,
 						 TASK_UNINTERRUPTIBLE);
+	/*
+	 * Make sure there is at least one inflight process; otherwise, waiters
+	 * will never be woken up. Since there may be no inflight process before
+	 * adding ourselves to the waiting queue above, we need to try to
+	 * increase the inflight counter for ourselves. And it is sufficient to
+	 * guarantee that at least the first waiter to enter the waiting queue
+	 * will re-check the waiting condition before going to sleep, thus
+	 * ensuring forward progress.
+	 */
+	if (!data.got_token && first_waiter && acquire_inflight_cb(rqw, private_data)) {
+		finish_wait(&rqw->wait, &data.wq);
+		/*
+		 * We raced with rq_qos_wake_function() getting a token,
+		 * which means we now have two. Put our local token
+		 * and wake anyone else potentially waiting for one.
+		 *
+		 * Enough memory barrier in list_empty_careful() in
+		 * finish_wait() is paired with list_del_init_careful()
+		 * in rq_qos_wake_function() to make sure we will see
+		 * the latest @data->got_token.
+		 */
+		if (data.got_token)
+			cleanup_cb(rqw, private_data);
+		return;
+	}
+
+	/* we are now relying on the waker to increase our inflight counter. */
 	do {
-		/* The memory barrier in set_current_state saves us here. */
 		if (data.got_token)
 			break;
-		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
-			finish_wait(&rqw->wait, &data.wq);
-
-			/*
-			 * We raced with rq_qos_wake_function() getting a token,
-			 * which means we now have two. Put our local token
-			 * and wake anyone else potentially waiting for one.
-			 */
-			if (data.got_token)
-				cleanup_cb(rqw, private_data);
-			return;
-		}
 		io_schedule();
-		has_sleeper = true;
 		set_current_state(TASK_UNINTERRUPTIBLE);
 	} while (1);
 	finish_wait(&rqw->wait, &data.wq);
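
Putting the two hunks together, rq_qos_wait() after this patch reads roughly as follows. This is reconstructed from the diff above, so indentation is approximate; the acquire_inflight_cb parameter in the signature does not appear in the hunk context and is shown here as assumed from the calls in the body.

void rq_qos_wait(struct rq_wait *rqw, void *private_data,
		 acquire_inflight_cb_t *acquire_inflight_cb,
		 cleanup_cb_t *cleanup_cb)
{
	struct rq_qos_wait_data data = {
		.rqw		= rqw,
		.cb		= acquire_inflight_cb,
		.private_data	= private_data,
		.got_token	= false,
	};
	bool first_waiter;

	/*
	 * If there are no waiters in the waiting queue, try to increase the
	 * inflight counter if we can. Otherwise, prepare for adding ourselves
	 * to the waiting queue.
	 */
	if (!waitqueue_active(&rqw->wait) && acquire_inflight_cb(rqw, private_data))
		return;

	init_wait_func(&data.wq, rq_qos_wake_function);
	first_waiter = prepare_to_wait_exclusive(&rqw->wait, &data.wq,
						 TASK_UNINTERRUPTIBLE);
	/*
	 * Make sure there is at least one inflight process; otherwise, waiters
	 * will never be woken up. Since there may be no inflight process before
	 * adding ourselves to the waiting queue above, we need to try to
	 * increase the inflight counter for ourselves. And it is sufficient to
	 * guarantee that at least the first waiter to enter the waiting queue
	 * will re-check the waiting condition before going to sleep, thus
	 * ensuring forward progress.
	 */
	if (!data.got_token && first_waiter && acquire_inflight_cb(rqw, private_data)) {
		finish_wait(&rqw->wait, &data.wq);
		/*
		 * We raced with rq_qos_wake_function() getting a token,
		 * which means we now have two. Put our local token
		 * and wake anyone else potentially waiting for one.
		 *
		 * Enough memory barrier in list_empty_careful() in
		 * finish_wait() is paired with list_del_init_careful()
		 * in rq_qos_wake_function() to make sure we will see
		 * the latest @data->got_token.
		 */
		if (data.got_token)
			cleanup_cb(rqw, private_data);
		return;
	}

	/* we are now relying on the waker to increase our inflight counter. */
	do {
		if (data.got_token)
			break;
		io_schedule();
		set_current_state(TASK_UNINTERRUPTIBLE);
	} while (1);
	finish_wait(&rqw->wait, &data.wq);
}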
