Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix L2ARC reads when compressed ARC disabled #10693

Merged
merged 2 commits into from
Aug 14, 2020

Conversation

allanjude
Copy link
Contributor

Signed-off-by: Allan Jude allanjude@freebsd.org
Sponsored-by: The FreeBSD Foundation

Motivation and Context

When reading compressed blocks from the L2ARC, with compressed ARC disabled, arc_hdr_size() returns LSIZE rather than PSIZE, but the actual read is PSIZE. This causes l2arc_read_done() to compare the checksum against the wrong size, resulting in checksum failure.

This manifests as an increase in the kstat l2_cksum_bad and the read being retried from the main pool, making the L2ARC ineffective.

Description

This bug was discovered while creating the tests introduced in #10692

How Has This Been Tested?

With this change, the new test now passed.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

@codecov
Copy link

codecov bot commented Aug 9, 2020

Codecov Report

Merging #10693 into master will decrease coverage by 0.17%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10693      +/-   ##
==========================================
- Coverage   79.92%   79.75%   -0.18%     
==========================================
  Files         394      394              
  Lines      124657   124661       +4     
==========================================
- Hits        99636    99419     -217     
- Misses      25021    25242     +221     
Flag Coverage Δ
#kernel 80.42% <50.00%> (+0.02%) ⬆️
#user 65.45% <0.00%> (-0.66%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
module/zfs/arc.c 89.89% <50.00%> (-0.29%) ⬇️
cmd/zdb/zdb_il.c 30.86% <0.00%> (-24.08%) ⬇️
module/zfs/vdev_indirect.c 73.50% <0.00%> (-11.00%) ⬇️
module/zfs/vdev_rebuild.c 93.26% <0.00%> (-3.92%) ⬇️
module/zcommon/zfs_fletcher.c 75.65% <0.00%> (-2.64%) ⬇️
module/zfs/vdev_raidz.c 89.54% <0.00%> (-2.62%) ⬇️
module/zfs/btree.c 81.61% <0.00%> (-2.00%) ⬇️
module/zfs/lzjb.c 98.14% <0.00%> (-1.86%) ⬇️
cmd/ztest/ztest.c 79.17% <0.00%> (-1.63%) ⬇️
module/zcommon/zfs_uio.c 87.75% <0.00%> (-1.03%) ⬇️
... and 58 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d64c6a2...b24a773. Read the comment docs.

When reading compressed blocks from the L2ARC, with
compressed ARC disabled, arc_hdr_size() returns
LSIZE rather than PSIZE, but the actual read is PSIZE.
This causes l2arc_read_done() to compare the checksum
against the wrong size, resulting in checksum failure.

This manifests as an increase in the kstat l2_cksum_bad
and the read being retried from the main pool, making the
L2ARC ineffective.

Add new L2ARC tests with Compressed ARC enabled/disabled

Blocks are handled differently depending on the state of the
zfs_compressed_arc_enabled tunable.

If a block is compressed on-disk, and compressed_arc is enabled:
- the block is read from disk
- It is NOT decompressed
- It is added to the ARC in its compressed form
- l2arc_write_buffers() may write it to the L2ARC (as is)
- l2arc_read_done() compares the checksum to the BP (compressed)

However, if compressed_arc is disabled:
- the block is read from disk
- It is decompressed
- It is added to the ARC (uncompressed)
- l2arc_write_buffers() will use l2arc_apply_transforms() to
  recompress the block, before writing it to the L2ARC
- l2arc_read_done() compares the checksum to the BP (compressed)
- l2arc_read_done() will use l2arc_untransform() to uncompress it

This test writes out a test file to a pool consisting of one disk
and one cache device, then randomly reads from it. Since the arc_max
in the tests is low, this will feed the L2ARC, and result in reads
from the L2ARC.

We compare the value of the kstat l2_cksum_bad before and after
to determine if any blocks failed to survive the trip through the
L2ARC.

Sponsored-by: The FreeBSD Foundation
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're also going to add these two new tests to the tests/function/compression/Makefile.am. This ensures they're included in the make dist tarball which is what effectively gets built and tested by the CI.

Signed-off-by: Allan Jude <allanjude@freebsd.org>
@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 13, 2020
@behlendorf behlendorf merged commit fc34dfb into openzfs:master Aug 14, 2020
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
When reading compressed blocks from the L2ARC, with
compressed ARC disabled, arc_hdr_size() returns
LSIZE rather than PSIZE, but the actual read is PSIZE.
This causes l2arc_read_done() to compare the checksum
against the wrong size, resulting in checksum failure.

This manifests as an increase in the kstat l2_cksum_bad
and the read being retried from the main pool, making the
L2ARC ineffective.

Add new L2ARC tests with Compressed ARC enabled/disabled

Blocks are handled differently depending on the state of the
zfs_compressed_arc_enabled tunable.

If a block is compressed on-disk, and compressed_arc is enabled:
- the block is read from disk
- It is NOT decompressed
- It is added to the ARC in its compressed form
- l2arc_write_buffers() may write it to the L2ARC (as is)
- l2arc_read_done() compares the checksum to the BP (compressed)

However, if compressed_arc is disabled:
- the block is read from disk
- It is decompressed
- It is added to the ARC (uncompressed)
- l2arc_write_buffers() will use l2arc_apply_transforms() to
  recompress the block, before writing it to the L2ARC
- l2arc_read_done() compares the checksum to the BP (compressed)
- l2arc_read_done() will use l2arc_untransform() to uncompress it

This test writes out a test file to a pool consisting of one disk
and one cache device, then randomly reads from it. Since the arc_max
in the tests is low, this will feed the L2ARC, and result in reads
from the L2ARC.

We compare the value of the kstat l2_cksum_bad before and after
to determine if any blocks failed to survive the trip through the
L2ARC.

Sponsored-by: The FreeBSD Foundation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes openzfs#10693
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
When reading compressed blocks from the L2ARC, with
compressed ARC disabled, arc_hdr_size() returns
LSIZE rather than PSIZE, but the actual read is PSIZE.
This causes l2arc_read_done() to compare the checksum
against the wrong size, resulting in checksum failure.

This manifests as an increase in the kstat l2_cksum_bad
and the read being retried from the main pool, making the
L2ARC ineffective.

Add new L2ARC tests with Compressed ARC enabled/disabled

Blocks are handled differently depending on the state of the
zfs_compressed_arc_enabled tunable.

If a block is compressed on-disk, and compressed_arc is enabled:
- the block is read from disk
- It is NOT decompressed
- It is added to the ARC in its compressed form
- l2arc_write_buffers() may write it to the L2ARC (as is)
- l2arc_read_done() compares the checksum to the BP (compressed)

However, if compressed_arc is disabled:
- the block is read from disk
- It is decompressed
- It is added to the ARC (uncompressed)
- l2arc_write_buffers() will use l2arc_apply_transforms() to
  recompress the block, before writing it to the L2ARC
- l2arc_read_done() compares the checksum to the BP (compressed)
- l2arc_read_done() will use l2arc_untransform() to uncompress it

This test writes out a test file to a pool consisting of one disk
and one cache device, then randomly reads from it. Since the arc_max
in the tests is low, this will feed the L2ARC, and result in reads
from the L2ARC.

We compare the value of the kstat l2_cksum_bad before and after
to determine if any blocks failed to survive the trip through the
L2ARC.

Sponsored-by: The FreeBSD Foundation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes openzfs#10693
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants