block: introduce the BFQ-v7r3 I/O sched for 3.14
Add the BFQ-v7r3 I/O scheduler to 3.14.
The general structure is borrowed from CFQ, as is much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated with each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.

    - Slices are given in the service domain: tasks are assigned
      budgets, measured in number of sectors. Once it has obtained the
      disk, however, a task must consume its assigned budget within a
      configurable maximum time (by default, the maximum possible value
      of the budgets is automatically computed to comply with this
      timeout). This allows the desired latency vs "throughput boosting"
      tradeoff to be set.

    - Budgets are scheduled according to a variant of WF2Q+, implemented
      using an augmented rb-tree to take eligibility into account while
      preserving an O(log N) overall complexity.

    - A low-latency tunable is provided; if enabled, both interactive
      and soft real-time applications are guaranteed a very low latency.

    - Latency guarantees are also preserved in the presence of NCQ.

    - With flash-based devices too, a high throughput is achieved
      while still preserving latency guarantees.

    - BFQ features Early Queue Merge (EQM), a sort of fusion of the
      cooperating-queue-merging and the preemption mechanisms present
      in CFQ. EQM is in fact a unified mechanism that tries to get a
      sequential read pattern, and hence a high throughput, with any
      set of processes performing interleaved I/O over a contiguous
      sequence of sectors.

    - BFQ supports full hierarchical scheduling, exporting a cgroups
      interface.  Since each node has a full scheduler, each group can
      be assigned its own weight.

    - If the cgroups interface is not used, only I/O priorities can be
      assigned to processes, with ioprio values mapped to weights
      with the relation weight = IOPRIO_BE_NR - ioprio.

    - ioprio classes are served in strict priority order, i.e., lower
      priority queues are not served as long as there are higher
      priority queues.  Among queues in the same class the bandwidth is
      distributed in proportion to the weight of each queue. A very
      small amount of extra bandwidth is however guaranteed to the
      Idle class, to prevent it from starving.
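
The ioprio-to-weight relation and the proportional sharing described in
the last two items can be illustrated with a minimal sketch. It is not
taken from the patch: ioprio_to_weight() and bandwidth_share() are
hypothetical helper names, IOPRIO_BE_NR is assumed to be 8 (its value in
the kernel's ioprio header), and the share computed is the ideal
long-term one among queues of the same class, ignoring the Idle-class
allowance.

```c
#include <assert.h>

/* Assumed value: the kernel defines 8 best-effort priority levels. */
#define IOPRIO_BE_NR 8

/* Hypothetical helper: weight = IOPRIO_BE_NR - ioprio, for ioprio in 0..7. */
static int ioprio_to_weight(int ioprio)
{
	assert(ioprio >= 0 && ioprio < IOPRIO_BE_NR);
	return IOPRIO_BE_NR - ioprio;
}

/*
 * Hypothetical helper: ideal bandwidth share, in percent, of queue i
 * among n backlogged queues of the same ioprio class.
 */
static double bandwidth_share(const int *ioprios, int n, int i)
{
	int total = 0;

	assert(n > 0 && i >= 0 && i < n);
	for (int k = 0; k < n; k++)
		total += ioprio_to_weight(ioprios[k]);

	return 100.0 * ioprio_to_weight(ioprios[i]) / total;
}
```

For instance, two queues with ioprio 0 and ioprio 4 get weights 8 and 4,
so under this sketch the first would receive about two thirds of the
bandwidth and the second about one third.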

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>

=======================

block, bfq: add Early Queue Merge (EQM) to BFQ-v7r3 for 3.14.0

A set of processes may happen to perform interleaved reads, i.e., requests
whose union would give rise to a sequential read pattern. There are two
typical cases: in the first case, processes read fixed-size chunks of data
at a fixed distance from each other, while in the second case processes may
read variable-size chunks at variable distances. The latter case occurs, for
example, with KVM, which splits the I/O generated by the guest into multiple
chunks, and lets these chunks be served by a pool of cooperating processes,
iteratively assigning the next chunk of I/O to the first available process.
CFQ uses actual queue merging for the first type of processes, whereas it
uses preemption to get a sequential read pattern out of the read requests
performed by the second type of processes. In the end it uses two different
mechanisms to achieve the same goal: boosting the throughput with
interleaved I/O.

This patch introduces Early Queue Merge (EQM), a unified mechanism to get a
sequential read pattern with both types of processes. The main idea is to
check newly arrived requests against the next request of the active queue,
both when a request is actually inserted and when a request is merged. By
doing so, both types of processes can be handled by just merging their
queues. EQM is thus simpler and more compact than the pair of mechanisms
used in CFQ.
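
The check described above can be sketched as follows. This is only a
simplified illustration, not the code of the patch: the types
io_request and io_queue, the helper should_merge_queues() and the
CLOSE_THR threshold are all hypothetical, and the real merge criteria
are not shown in this excerpt. The sketch assumes that "checking against
the next request of the active queue" means testing whether the new
request starts close to it on disk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical closeness threshold, in sectors (an assumption). */
#define CLOSE_THR ((sector_t)(8 * 1024))

/* Hypothetical, minimal stand-ins for a request and a per-process queue. */
struct io_request {
	sector_t pos;            /* first sector of the request */
	unsigned int nr_sectors; /* length in sectors */
};

struct io_queue {
	const struct io_request *next_rq; /* next request to serve, or NULL */
};

static sector_t sector_dist(sector_t a, sector_t b)
{
	return a >= b ? a - b : b - a;
}

/*
 * Return true if the queue owning new_rq is a candidate for merging with
 * the active queue, i.e. if new_rq starts close to the next request the
 * active queue is about to issue. The same test would be run on both the
 * request-insert and the request-merge paths.
 */
static bool should_merge_queues(const struct io_queue *active,
				const struct io_queue *new_q,
				const struct io_request *new_rq)
{
	if (!active || active == new_q || !active->next_rq)
		return false;

	return sector_dist(new_rq->pos, active->next_rq->pos) <= CLOSE_THR;
}
```

Using a single proximity test for both paths is what would let one
mechanism cover fixed-distance and variable-distance interleaved readers
alike, under the assumptions stated above.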

Finally, EQM also preserves the typical low-latency properties of BFQ, by
properly restoring the weight-raising state of a queue when it gets back to
a non-merged state.

Signed-off-by: Mauro Andreolini <mauro.andreolini@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Reviewed-by: Paolo Valente <paolo.valente@unimore.it>
dorimanx committed May 1, 2014
1 parent f02008d commit e344aef
Showing 4 changed files with 836 additions and 331 deletions.
11 changes: 6 additions & 5 deletions block/bfq-cgroup.c
@@ -84,6 +84,7 @@ static inline void bfq_group_init_entity(struct bfqio_cgroup *bgrp,
 	entity->ioprio = entity->new_ioprio;
 	entity->ioprio_class = entity->new_ioprio_class = bgrp->ioprio_class;
 	entity->my_sched_data = &bfqg->sched_data;
+	bfqg->active_entities = 0;
 }

 static inline void bfq_group_set_parent(struct bfq_group *bfqg,
@@ -540,14 +541,14 @@ static void bfq_destroy_group(struct bfqio_cgroup *bgrp, struct bfq_group *bfqg)
 	kfree(bfqg);
 }

-static void bfq_end_raising_async(struct bfq_data *bfqd)
+static void bfq_end_wr_async(struct bfq_data *bfqd)
 {
 	struct hlist_node *tmp;
 	struct bfq_group *bfqg;

 	hlist_for_each_entry_safe(bfqg, tmp, &bfqd->group_list, bfqd_node)
-		bfq_end_raising_async_queues(bfqd, bfqg);
-	bfq_end_raising_async_queues(bfqd, bfqd->root_group);
+		bfq_end_wr_async_queues(bfqd, bfqg);
+	bfq_end_wr_async_queues(bfqd, bfqd->root_group);
 }

 /**
@@ -894,9 +895,9 @@ static inline void bfq_bfqq_move(struct bfq_data *bfqd,
 {
 }

-static void bfq_end_raising_async(struct bfq_data *bfqd)
+static void bfq_end_wr_async(struct bfq_data *bfqd)
 {
-	bfq_end_raising_async_queues(bfqd, bfqd->root_group);
+	bfq_end_wr_async_queues(bfqd, bfqd->root_group);
 }

 static inline void bfq_disconnect_groups(struct bfq_data *bfqd)