Permalink
Browse files

Reduce taskq and context-switch cost of zio pipe

When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736
  • Loading branch information...
ahrens authored and tonyhutter committed Aug 2, 2018
1 parent 60cea5d commit 8e1d0d34332ff2a2edc06ee6259adb358ef987ca
Showing with 139 additions and 117 deletions.
  1. +2 −2 include/sys/zio.h
  2. +137 −115 module/zfs/zio.c
@@ -237,7 +237,7 @@ enum zio_child {
#define ZIO_CHILD_DDT_BIT ZIO_CHILD_BIT(ZIO_CHILD_DDT)
#define ZIO_CHILD_LOGICAL_BIT ZIO_CHILD_BIT(ZIO_CHILD_LOGICAL)
#define ZIO_CHILD_ALL_BITS \
(ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \
(ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \
ZIO_CHILD_DDT_BIT | ZIO_CHILD_LOGICAL_BIT)
enum zio_wait_type {
@@ -375,7 +375,7 @@ typedef struct zio_transform {
struct zio_transform *zt_next;
} zio_transform_t;
typedef int zio_pipe_stage_t(zio_t *zio);
typedef zio_t *zio_pipe_stage_t(zio_t *zio);
/*
* The io_reexecute flags are distinct from io_flags because the child must
Oops, something went wrong.

0 comments on commit 8e1d0d3

Please sign in to comment.