Support for Zstandard decompression in Parquet reader #10847

Merged
merged 8 commits into from May 17, 2022

@vuule vuule commented May 13, 2022

Adds ZSTD compression type to the nvcomp adapter. The zstd header is conditionally included so the code works without nvcomp 2.3.
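A minimal sketch of how such a conditional include might look, assuming a compiler that supports `__has_include` (standard since C++17). The header path `<nvcomp/zstd.h>`, the macro `NVCOMP_HAS_ZSTD`, and the names `compression_type`, `is_supported`, and `require_supported` are illustrative assumptions, not the code merged in this PR:

```cpp
#include <stdexcept>

// Assumption: the ZSTD API ships in <nvcomp/zstd.h> only in newer nvcomp releases.
// Detect the header at compile time and record the result in an upper-case macro.
#if defined(__has_include)
#if __has_include(<nvcomp/zstd.h>)
#include <nvcomp/zstd.h>
#define NVCOMP_HAS_ZSTD 1
#endif
#endif

#ifndef NVCOMP_HAS_ZSTD
#define NVCOMP_HAS_ZSTD 0
#endif

// Hypothetical subset of the compression types an adapter could expose.
enum class compression_type { SNAPPY, ZSTD };

// Reports whether this build can handle the given compression type.
inline bool is_supported(compression_type type)
{
  switch (type) {
    case compression_type::SNAPPY: return true;
    case compression_type::ZSTD: return NVCOMP_HAS_ZSTD != 0;
  }
  return false;
}

// Throws when a caller requests a type that was compiled out.
inline void require_supported(compression_type type)
{
  if (!is_supported(type)) {
    throw std::runtime_error("Compression type is not supported by this build");
  }
}
```

With this layout, a build against an older nvcomp still compiles, and a request for ZSTD fails at run time with an exception instead of breaking the build.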

Test changes:

  • nvcomp 2.3 is still not used in CI, so there are almost no test changes.
  • Modifies a Python test that assumes Zstandard is not supported, so that it also passes if reading succeeds.

@vuule vuule self-assigned this May 13, 2022
@vuule vuule added this to PR-WIP in v22.06 Release via automation May 13, 2022
@github-actions github-actions bot added cuDF (Python) Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels May 13, 2022
@vuule vuule added feature request New feature or request non-breaking Non-breaking change labels May 13, 2022
@github-actions github-actions bot removed the cuDF (Python) Affects Python cuDF API. label May 16, 2022

codecov bot commented May 16, 2022

Codecov Report

Merging #10847 (dc223a2) into branch-22.06 (4ad1e51) will increase coverage by 0.02%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.06   #10847      +/-   ##
================================================
+ Coverage         86.29%   86.32%   +0.02%     
================================================
  Files               144      144              
  Lines             22656    22656              
================================================
+ Hits              19552    19558       +6     
+ Misses             3104     3098       -6     
Impacted Files                              Coverage Δ
python/cudf/cudf/core/dataframe.py          93.78% <0.00%> (+0.04%) ⬆️
python/cudf/cudf/core/column/string.py      88.78% <0.00%> (+0.12%) ⬆️
python/cudf/cudf/core/groupby/groupby.py    91.79% <0.00%> (+0.22%) ⬆️
python/cudf/cudf/core/tools/datetimes.py    84.49% <0.00%> (+0.30%) ⬆️
python/cudf/cudf/core/column/lists.py       91.70% <0.00%> (+0.97%) ⬆️

@github-actions github-actions bot added the cuDF (Python) Affects Python cuDF API. label May 16, 2022
@vuule vuule marked this pull request as ready for review May 16, 2022 16:34
@vuule vuule requested review from a team as code owners May 16, 2022 16:34

@hyperbolic2346 hyperbolic2346 left a comment

Nit at best, but I prefer caps for #define and it seems that RMM and CUDF follow this as well. Other than that this looks great. Thank you.

Three review threads on cpp/src/io/comp/nvcomp_adapter.cpp (outdated, resolved)
v22.06 Release automation moved this from PR-WIP to PR-Needs review May 16, 2022
vuule and others added 2 commits May 16, 2022 12:18
Co-authored-by: Mike Wilson <hyperbolic2346@users.noreply.github.com>
@vuule vuule requested a review from hyperbolic2346 May 16, 2022 19:29

@hyperbolic2346 hyperbolic2346 left a comment

🚀

v22.06 Release automation moved this from PR-Needs review to PR-Reviewer approved May 16, 2022
@vuule vuule added the 4 - Needs Review Waiting for reviewer to review or respond label May 17, 2022

vuule commented May 17, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 369d005 into rapidsai:branch-22.06 May 17, 2022
v22.06 Release automation moved this from PR-Reviewer approved to Done May 17, 2022
@vuule vuule deleted the nvcomp-zstd-pq branch May 18, 2022 17:22
Labels
4 - Needs Review Waiting for reviewer to review or respond cuDF (Python) Affects Python cuDF API. feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change

5 participants