Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QAT: Fix uninitialized seed in QAT compression #14632

Merged
merged 1 commit into from
Mar 16, 2023

Conversation

naivekun
Copy link
Contributor

Motivation and Context

#14463

Description

According to QAT example from intel, CpaDcRqResults have to be initialized with checksum=1 for adler32. Otherwise when error CPA_DC_OVERFLOW occurred, the next compression operation will continue on previously part-compressed data, and write invalid checksum data on disk. When ZFS decompress the compressed data, a invalid checksum will occurred and lead to issue #14463.

Reference doc: 330684-012-intel-qat-api-programmers-guide.pdf
Code example from Intel: https://github.com/intel/qatlib/blob/ff155a0778320ce7cff532bfa96a65ad35aa75f7/quickassist/lookaside/access_layer/src/sample_code/functional/dc/stateless_multi_op_checksum_sample/cpa_dc_stateless_multi_op_checksum_sample.c#L319

How Has This Been Tested?

Test platform

Xeon D-2177NT with three QAT device type c62x

Test version

  • ZFS 2.0.3
  • debian 11
  • QAT driver v1.7.L.4.16.0-00017

Test setup

  • service qat start && modprobe zfs && service zfs-zed start
  • zpool create testpool /dev/nvme0n1
  • zfs set compression=gzip testpool
  • stress /testpool with fio

Test result

As shown in #14463, without this patch, data on disk will decompress failed and lead to kernel panic. With CpaDcRqResults initialized with checksum=1, everything works fine.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occured, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occured and lead to openzfs#14463

Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes: openzfs#14463
@behlendorf
Copy link
Contributor

@wli5 @cfzhu could you take a look at this fix.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Mar 15, 2023
@wli5
Copy link
Contributor

wli5 commented Mar 16, 2023

@naivekun thanks for the fix! It looks good to me.

@cfzhu
Copy link
Contributor

cfzhu commented Mar 16, 2023

hi @naivekun, thank you very much for finding and fixing this problem! Weigang and I discussed this issue, We think the patch is very good. Thank you again for your patch.

Copy link
Contributor

@mcmilk mcmilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - @naivekun - thanks for figuring out the problem and also the fix for it

@wli5 @cfzhu - thanks for the testings

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 16, 2023
@behlendorf behlendorf merged commit 60cfd3b into openzfs:master Mar 16, 2023
@behlendorf
Copy link
Contributor

Merged. Thanks everybody, I'll open a PR to get this included in the 2.1.10 release.

behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Mar 16, 2023
CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occurred, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occurred and lead to openzfs#14463

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Weigang Li <weigang.li@intel.com>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes openzfs#14632
Closes openzfs#14463
behlendorf pushed a commit that referenced this pull request Mar 17, 2023
CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occurred, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occurred and lead to #14463

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Weigang Li <weigang.li@intel.com>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes #14632
Closes #14463
andrewc12 pushed a commit to andrewc12/openzfs that referenced this pull request Mar 17, 2023
CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occurred, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occurred and lead to openzfs#14463

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Weigang Li <weigang.li@intel.com>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes openzfs#14632
Closes openzfs#14463
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023
CpaDcRqResults have to be initialized with checksum=1 for adler32.
Otherwise when error CPA_DC_OVERFLOW occurred, the next compress
operation will continue on previously part-compressed data, and write
invalid checksum data. When zfs decompress the compressed data, a
invalid checksum will occurred and lead to openzfs#14463

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Weigang Li <weigang.li@intel.com>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: naivekun <naivekun0817@gmail.com>
Closes openzfs#14632
Closes openzfs#14463
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants