reduce taskq and context-switch cost of zio pipe #7736
Conversation
When doing a read from disk, ZFS creates 3 ZIOs: a zio_null(), the logical zio_read(), and then a physical zio. Currently, each of these results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance impact. By processing all 3 ZIOs in a single taskq entry, we reduce the overhead of taskq locking and context switching. We accomplish this by allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from 96,000 iops to 108,000 iops (with recordsize=8k, on SSDs).

External-issue: DLPX-59292
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Codecov Report
@@            Coverage Diff             @@
##           master    #7736      +/-   ##
==========================================
+ Coverage   78.08%   78.28%   +0.19%
==========================================
  Files         368      368
  Lines      111899   111897       -2
==========================================
+ Hits        87378    87599     +221
+ Misses      24521    24298     -223
==========================================
Continue to review full report at Codecov.
Even ignoring the performance wins from making this change, I think it makes the pipeline code more readable and intuitive. Looks good!
Let's consider backporting this to 0.7.x, given that it fixes a regression (helps get performance back closer to 0.6). See #7834 for details.
@ahrens I had the same thought. Alternately, we should be able to just revert the
Opened PR #8011; it applied relatively cleanly, so I ported the entire patch.
External-issue: DLPX-59292
Motivation and Context
When doing a read from disk, ZFS creates 3 ZIOs: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).
On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIOs in a single taskq entry, we reduce the
overhead of taskq locking and context switching.
Description
We accomplish this by allowing zio_done() to return a "next zio to
execute" to zio_execute(), which then runs the returned zio directly
instead of re-dispatching it to the taskq; see the sketch below.
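As a rough illustration, here is a minimal userspace sketch of the change. This is not the actual OpenZFS code: zio_t, zio_execute(), and zio_done() are the real names, but the bodies are invented for illustration, and toy_dispatch() merely simulates the SPL taskq_dispatch() hand-off that the old model paid for on every stage.

```c
#include <stdio.h>

typedef struct zio zio_t;
struct zio {
	const char *name;
	zio_t *parent;		/* zio made runnable by this zio's completion */
};

/* --- old model: one taskq dispatch per zio ----------------------------- */

static void zio_execute_old(zio_t *zio);

/* Stand-in for taskq_dispatch(tq, zio_execute, parent, TQ_SLEEP). */
static void
toy_dispatch(void (*func)(zio_t *), zio_t *arg)
{
	printf("  dispatch -> %s\n", arg->name);	/* taskq lock + wakeup */
	func(arg);
}

static void
zio_execute_old(zio_t *zio)
{
	printf("  execute  %s\n", zio->name);
	if (zio->parent != NULL)
		toy_dispatch(zio_execute_old, zio->parent);
}

/* --- new model: zio_done() hands back the next zio to run -------------- */

static zio_t *
zio_done(zio_t *zio)
{
	printf("  execute  %s\n", zio->name);
	return (zio->parent);		/* NULL once the chain is drained */
}

static void
zio_execute(zio_t *zio)
{
	while (zio != NULL)		/* whole chain, one taskq entry */
		zio = zio_done(zio);
}

int
main(void)
{
	zio_t root = { "zio_null (root)", NULL };
	zio_t logical = { "zio_read (logical)", &root };
	zio_t physical = { "vdev zio (physical)", &logical };

	printf("old model (one dispatch per zio):\n");
	zio_execute_old(&physical);

	printf("new model (single taskq entry):\n");
	zio_execute(&physical);
	return (0);
}
```

In the old model the chain of 3 ZIOs costs a dispatch per zio; in the new model one dispatch drains the whole parent chain in a loop, which is where the taskq-lock and context-switch savings come from.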
How Has This Been Tested?
This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSDs).
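For reference, the quoted figures work out to (108,000 − 96,000) / 96,000 = 12,000 / 96,000 = 12.5%, consistent with the "~12%" claim.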
Types of changes
Checklist:
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>