
Commit 55a1ab5

Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A smaller collection of fixes that should go into -rc1. This contains:

   - A fix from Christoph, fixing a regression with the WRITE_SAME and
     partial completions. Caused a BUG() on ppc.

   - Fixup for __blk_mq_stop_hw_queues(), it should be static. From
     Colin.

   - Removal of dmesg error messages on elevator switching, when
     invoked from sysfs. From me.

   - Fix for blk-stat, using this_cpu_ptr() in a section only protected
     by rcu_read_lock(). This breaks when PREEMPT_RCU is enabled. From
     me.

   - Two fixes for BFQ from Paolo, one fixing a crash and one updating
     the documentation.

   - An error handling lightnvm memory leak, from Rakesh.

   - The previous blk-mq hot unplug lock reversal depends on the CPU
     hotplug rework that isn't in mainline yet. This caused a lockdep
     splat when people unplugged CPUs with blk-mq devices. From
     Wanpeng.

   - A regression fix for DIF/DIX on blk-mq. From Wen"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: handle partial completions for special payload requests
  blk-mq: NVMe 512B/4K+T10 DIF/DIX format returns I/O error on dd with split op
  blk-stat: don't use this_cpu_ptr() in a preemptable section
  elevator: remove redundant warnings on IO scheduler switch
  block, bfq: stress that low_latency must be off to get max throughput
  block, bfq: use pointer entity->sched_data only if set
  nvme: lightnvm: fix memory leak
  blk-mq: make __blk_mq_stop_hw_queues static
  lightnvm: remove unused rq parameter of nvme_nvm_rqtocmd() to kill warning
  block/mq: fix potential deadlock during cpu hotplug
2 parents: 9786e34 + ed6565e

File tree: 8 files changed (+65, -35 lines)

Documentation/block/bfq-iosched.txt

Lines changed: 16 additions & 1 deletion
@@ -11,6 +11,13 @@ controllers), BFQ's main features are:
   groups (switching back to time distribution when needed to keep
   throughput high).
 
+In its default configuration, BFQ privileges latency over
+throughput. So, when needed for achieving a lower latency, BFQ builds
+schedules that may lead to a lower throughput. If your main or only
+goal, for a given device, is to achieve the maximum-possible
+throughput at all times, then do switch off all low-latency heuristics
+for that device, by setting low_latency to 0. Full details in Section 3.
+
 On average CPUs, the current version of BFQ can handle devices
 performing at most ~30K IOPS; at most ~50 KIOPS on faster CPUs. As a
 reference, 30-50 KIOPS correspond to very high bandwidths with

@@ -375,11 +382,19 @@ default, low latency mode is enabled. If enabled, interactive and soft
 real-time applications are privileged and experience a lower latency,
 as explained in more detail in the description of how BFQ works.
 
-DO NOT enable this mode if you need full control on bandwidth
+DISABLE this mode if you need full control on bandwidth
 distribution. In fact, if it is enabled, then BFQ automatically
 increases the bandwidth share of privileged applications, as the main
 means to guarantee a lower latency to them.
 
+In addition, as already highlighted at the beginning of this document,
+DISABLE this mode if your only goal is to achieve a high throughput.
+In fact, privileging the I/O of some application over the rest may
+entail a lower throughput. To achieve the highest-possible throughput
+on a non-rotational device, setting slice_idle to 0 may be needed too
+(at the cost of giving up any strong guarantee on fairness and low
+latency).
+
 timeout_sync
 ------------
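
The tunable the new text refers to lives under each device's iosched sysfs directory. As a quick illustration (not part of the commit; the device name "sda" is an assumption), a minimal user-space C sketch that switches the heuristics off:

/* Sketch: disable BFQ's low-latency heuristics for one device. */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/block/sda/queue/iosched/low_latency", "w");

        if (!f) {
                perror("open low_latency");
                return 1;
        }
        fputs("0", f);  /* 0 = maximum throughput, 1 = low latency (default) */
        return fclose(f) == 0 ? 0 : 1;
}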

block/bfq-iosched.c

Lines changed: 5 additions & 0 deletions
@@ -56,6 +56,11 @@
  * rotational or flash-based devices, and to get the job done quickly
  * for applications consisting in many I/O-bound processes.
  *
+ * NOTE: if the main or only goal, with a given device, is to achieve
+ * the maximum-possible throughput at all times, then do switch off
+ * all low-latency heuristics for that device, by setting low_latency
+ * to 0.
+ *
  * BFQ is described in [1], where also a reference to the initial, more
  * theoretical paper on BFQ can be found. The interested reader can find
  * in the latter paper full details on the main algorithm, as well as

block/bfq-wf2q.c

Lines changed: 11 additions & 2 deletions
@@ -1114,12 +1114,21 @@ static void bfq_activate_requeue_entity(struct bfq_entity *entity,
 bool __bfq_deactivate_entity(struct bfq_entity *entity, bool ins_into_idle_tree)
 {
         struct bfq_sched_data *sd = entity->sched_data;
-        struct bfq_service_tree *st = bfq_entity_service_tree(entity);
-        int is_in_service = entity == sd->in_service_entity;
+        struct bfq_service_tree *st;
+        bool is_in_service;
 
         if (!entity->on_st) /* entity never activated, or already inactive */
                 return false;
 
+        /*
+         * If we get here, then entity is active, which implies that
+         * bfq_group_set_parent has already been invoked for the group
+         * represented by entity. Therefore, the field
+         * entity->sched_data has been set, and we can safely use it.
+         */
+        st = bfq_entity_service_tree(entity);
+        is_in_service = entity == sd->in_service_entity;
+
         if (is_in_service)
                 bfq_calc_finish(entity, entity->service);
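
The crash this hunk fixes came from computing derived state from entity->sched_data before the entity was known to be active. A stand-alone C sketch of the same defer-until-validated pattern (all names hypothetical, not the kernel code):

#include <stdbool.h>
#include <stddef.h>

struct sched_data { const void *in_service; };

struct entity {
        bool on_st;                     /* true once the entity is activated */
        struct sched_data *sched_data;  /* only initialized on activation */
};

static bool deactivate(struct entity *e)
{
        if (!e->on_st)  /* never activated: sched_data may be unset */
                return false;

        /* Safe now: activation implies sched_data was initialized. */
        return e->sched_data->in_service == e;
}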

block/blk-core.c

Lines changed: 12 additions & 12 deletions
@@ -2644,8 +2644,6 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
                 return false;
         }
 
-        WARN_ON_ONCE(req->rq_flags & RQF_SPECIAL_PAYLOAD);
-
         req->__data_len -= total_bytes;
 
         /* update sector only for requests with clear definition of sector */

@@ -2658,17 +2656,19 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
                 req->cmd_flags |= req->bio->bi_opf & REQ_FAILFAST_MASK;
         }
 
-        /*
-         * If total number of sectors is less than the first segment
-         * size, something has gone terribly wrong.
-         */
-        if (blk_rq_bytes(req) < blk_rq_cur_bytes(req)) {
-                blk_dump_rq_flags(req, "request botched");
-                req->__data_len = blk_rq_cur_bytes(req);
-        }
+        if (!(req->rq_flags & RQF_SPECIAL_PAYLOAD)) {
+                /*
+                 * If total number of sectors is less than the first segment
+                 * size, something has gone terribly wrong.
+                 */
+                if (blk_rq_bytes(req) < blk_rq_cur_bytes(req)) {
+                        blk_dump_rq_flags(req, "request botched");
+                        req->__data_len = blk_rq_cur_bytes(req);
+                }
 
-        /* recalculate the number of segments */
-        blk_recalc_rq_segments(req);
+                /* recalculate the number of segments */
+                blk_recalc_rq_segments(req);
+        }
 
         return true;
 }

block/blk-mq.c

Lines changed: 5 additions & 5 deletions
@@ -1236,7 +1236,7 @@ void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
 }
 EXPORT_SYMBOL(blk_mq_stop_hw_queue);
 
-void __blk_mq_stop_hw_queues(struct request_queue *q, bool sync)
+static void __blk_mq_stop_hw_queues(struct request_queue *q, bool sync)
 {
         struct blk_mq_hw_ctx *hctx;
         int i;

@@ -1554,13 +1554,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
         blk_queue_bounce(q, &bio);
 
+        blk_queue_split(q, &bio, q->bio_split);
+
         if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
                 bio_io_error(bio);
                 return BLK_QC_T_NONE;
         }
 
-        blk_queue_split(q, &bio, q->bio_split);
-
         if (!is_flush_fua && !blk_queue_nomerges(q) &&
             blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
                 return BLK_QC_T_NONE;

@@ -2341,15 +2341,15 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 
         blk_mq_init_cpu_queues(q, set->nr_hw_queues);
 
-        mutex_lock(&all_q_mutex);
         get_online_cpus();
+        mutex_lock(&all_q_mutex);
 
         list_add_tail(&q->all_q_node, &all_q_list);
         blk_mq_add_queue_tag_set(set, q);
         blk_mq_map_swqueue(q, cpu_online_mask);
 
-        put_online_cpus();
         mutex_unlock(&all_q_mutex);
+        put_online_cpus();
 
         if (!(set->flags & BLK_MQ_F_NO_SCHED)) {
                 int ret;
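
The hotplug hunk is purely a lock-ordering change: every path must now take the CPU-hotplug lock before all_q_mutex, never the reverse. A stand-alone pthreads sketch of the ABBA deadlock being avoided (userspace analogy, not kernel code; all names hypothetical):

#include <pthread.h>

static pthread_mutex_t cpu_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ get_online_cpus() */
static pthread_mutex_t q_lock   = PTHREAD_MUTEX_INITIALIZER; /* ~ all_q_mutex */

/* Queue-init path: before the fix it took q_lock first, the reverse of
 * the hotplug path below, so each side could end up holding one lock
 * while waiting forever for the other. Fixed order: cpu_lock, then q_lock. */
static void init_queue(void)
{
        pthread_mutex_lock(&cpu_lock);
        pthread_mutex_lock(&q_lock);
        /* ... register queue, map software queues ... */
        pthread_mutex_unlock(&q_lock);
        pthread_mutex_unlock(&cpu_lock);
}

/* Hotplug path: always took cpu_lock first. */
static void hotplug_event(void)
{
        pthread_mutex_lock(&cpu_lock);
        pthread_mutex_lock(&q_lock);
        /* ... remap queues for the changed CPU set ... */
        pthread_mutex_unlock(&q_lock);
        pthread_mutex_unlock(&cpu_lock);
}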

block/blk-stat.c

Lines changed: 10 additions & 7 deletions
@@ -96,13 +96,16 @@ void blk_stat_add(struct request *rq)
 
         rcu_read_lock();
         list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
-                if (blk_stat_is_active(cb)) {
-                        bucket = cb->bucket_fn(rq);
-                        if (bucket < 0)
-                                continue;
-                        stat = &this_cpu_ptr(cb->cpu_stat)[bucket];
-                        __blk_stat_add(stat, value);
-                }
+                if (!blk_stat_is_active(cb))
+                        continue;
+
+                bucket = cb->bucket_fn(rq);
+                if (bucket < 0)
+                        continue;
+
+                stat = &get_cpu_ptr(cb->cpu_stat)[bucket];
+                __blk_stat_add(stat, value);
+                put_cpu_ptr(cb->cpu_stat);
         }
         rcu_read_unlock();
 }
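
The point of the change: with PREEMPT_RCU, rcu_read_lock() no longer disables preemption, so a task using this_cpu_ptr() can be migrated away from "this" CPU mid-update; get_cpu_ptr() disables preemption until the matching put_cpu_ptr(). A hedged kernel-module-style sketch of the idiom (the demo_* names are hypothetical, not this commit's code):

#include <linux/percpu.h>
#include <linux/rcupdate.h>

static DEFINE_PER_CPU(u64, demo_stat);

static void demo_add(u64 value)
{
        u64 *stat;

        rcu_read_lock();
        /*
         * get_cpu_ptr() = preempt_disable() + this_cpu_ptr(): the task
         * cannot migrate between taking the per-cpu pointer and the
         * update below, even when PREEMPT_RCU makes this RCU section
         * preemptible.
         */
        stat = get_cpu_ptr(&demo_stat);
        *stat += value;
        put_cpu_ptr(&demo_stat);
        rcu_read_unlock();
}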

block/elevator.c

Lines changed: 1 addition & 4 deletions
@@ -1062,10 +1062,8 @@ static int __elevator_change(struct request_queue *q, const char *name)
 
         strlcpy(elevator_name, name, sizeof(elevator_name));
         e = elevator_get(strstrip(elevator_name), true);
-        if (!e) {
-                printk(KERN_ERR "elevator: type %s not found\n", elevator_name);
+        if (!e)
                 return -EINVAL;
-        }
 
         if (q->elevator &&
             !strcmp(elevator_name, q->elevator->type->elevator_name)) {

@@ -1105,7 +1103,6 @@ ssize_t elv_iosched_store(struct request_queue *q, const char *name,
         if (!ret)
                 return count;
 
-        printk(KERN_ERR "elevator: switch to %s failed\n", name);
         return ret;
 }
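
The deleted printk()s are redundant because a sysfs writer gets the error code back directly. A user-space sketch (not from the commit; the device path and bogus scheduler name are assumptions) showing where the failure surfaces instead of dmesg:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        int fd = open("/sys/block/sda/queue/scheduler", O_WRONLY);

        if (fd < 0) {
                perror("open scheduler");
                return 1;
        }
        /* An unknown elevator no longer spams dmesg; the writer still
         * sees the -EINVAL as a failed write(2). */
        if (write(fd, "no-such-iosched", 15) < 0)
                fprintf(stderr, "switch failed: %s\n", strerror(errno));
        close(fd);
        return 0;
}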

drivers/nvme/host/lightnvm.c

Lines changed: 5 additions & 4 deletions
@@ -367,7 +367,8 @@ static int nvme_nvm_get_l2p_tbl(struct nvm_dev *nvmdev, u64 slba, u32 nlb,
 
                 if (unlikely(elba > nvmdev->total_secs)) {
                         pr_err("nvm: L2P data from device is out of bounds!\n");
-                        return -EINVAL;
+                        ret = -EINVAL;
+                        goto out;
                 }
 
                 /* Transform physical address to target address space */

@@ -464,8 +465,8 @@ static int nvme_nvm_set_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr *ppas,
         return ret;
 }
 
-static inline void nvme_nvm_rqtocmd(struct request *rq, struct nvm_rq *rqd,
-                                    struct nvme_ns *ns, struct nvme_nvm_command *c)
+static inline void nvme_nvm_rqtocmd(struct nvm_rq *rqd, struct nvme_ns *ns,
+                                    struct nvme_nvm_command *c)
 {
         c->ph_rw.opcode = rqd->opcode;
         c->ph_rw.nsid = cpu_to_le32(ns->ns_id);

@@ -503,7 +504,7 @@ static int nvme_nvm_submit_io(struct nvm_dev *dev, struct nvm_rq *rqd)
         if (!cmd)
                 return -ENOMEM;
 
-        nvme_nvm_rqtocmd(rq, rqd, ns, cmd);
+        nvme_nvm_rqtocmd(rqd, ns, cmd);
 
         rq = nvme_alloc_request(q, (struct nvme_command *)cmd, 0, NVME_QID_ANY);
         if (IS_ERR(rq)) {
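
The leak fix swaps an early return for the kernel's usual goto-based centralized cleanup, so a buffer allocated earlier in the function is always freed. A minimal stand-alone C sketch of the pattern (all names hypothetical):

#include <stdlib.h>

static int process(size_t n)
{
        int ret = 0;
        unsigned char *buf = malloc(4096);

        if (!buf)
                return -1;

        if (n > 4096) {         /* input out of bounds */
                ret = -1;
                goto out;       /* "return -1" here would leak buf */
        }

        /* ... fill and consume buf ... */
out:
        free(buf);              /* single exit path frees unconditionally */
        return ret;
}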
