[C++][Parquet] Parquet writer should control set of columns to enable page index #34949
Comments
wgtmac added a commit to wgtmac/arrow that referenced this issue on Apr 19, 2023
wjones127 pushed a commit that referenced this issue on Apr 19, 2023
### Rationale for this change

Currently the Parquet writer only supports enabling the page index for all columns. It would be good to enable or disable it at the column level, because the index may not be useful for some columns yet still has a cost to create.

### What changes are included in this PR?

Similar to `WriterProperties::Builder::enable_dictionary/disable_dictionary`, this patch adds `WriterProperties::Builder::enable_write_page_index/disable_write_page_index` and keeps them backward compatible for enabling/disabling the page index on all columns.

### Are these changes tested?

Added `ParquetPageIndexRoundTripTest::EnablePerColumn` to cover the new settings.

### Are there any user-facing changes?

Yes, users now have more flexibility to enable/disable the page index.

* Closes: #34949

Authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Will Jones <willjones127@gmail.com>
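The "global default plus per-column override" behaviour described in the commit message can be pictured with a small standalone model. This is only a sketch, not Arrow's implementation: `PageIndexSettings`, its methods, the `event_time` and `payload` column names, and the assumption that the page index is off by default are all hypothetical. It also assumes that a per-column setting takes precedence over the global one regardless of call order.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the page-index part of a writer-properties builder.
// It models only "global default + per-column override"; it is not Arrow code.
class PageIndexSettings {
 public:
  // Set the default applied to every column that has no override.
  PageIndexSettings& enable_all() { default_enabled_ = true; return *this; }
  PageIndexSettings& disable_all() { default_enabled_ = false; return *this; }

  // Override the setting for one column, identified by its dotted path.
  PageIndexSettings& enable(const std::string& path) {
    overrides_[path] = true;
    return *this;
  }
  PageIndexSettings& disable(const std::string& path) {
    overrides_[path] = false;
    return *this;
  }

  // A per-column override wins; otherwise fall back to the global default.
  bool enabled(const std::string& path) const {
    auto it = overrides_.find(path);
    return it != overrides_.end() ? it->second : default_enabled_;
  }

 private:
  bool default_enabled_ = false;  // assumed default: page index off
  std::unordered_map<std::string, bool> overrides_;
};

// Build settings that index only the column used for filtering.
inline PageIndexSettings make_example_settings() {
  PageIndexSettings settings;
  settings.disable_all().enable("event_time");
  // Sanity check: a column without an override follows the global default.
  assert(!settings.enabled("payload"));
  return settings;
}
```

With these settings, `enabled("event_time")` is true while every other column follows the disabled default. Calling only `enable_all()` or `disable_all()` reproduces the earlier all-or-nothing behaviour, which is the backward compatibility the patch describes.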
liujiacheng777 pushed a commit to LoongArch-Python/arrow that referenced this issue on May 11, 2023
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue on May 15, 2023
rtpsw pushed a commit to rtpsw/arrow that referenced this issue on May 16, 2023
Describe the enhancement requested
Once #34054 is merged, it would be nice to provide a writer option that selects the set of columns for which the page index is enabled.
Component(s)
C++, Parquet