[C++][Parquet] Parquet writer should control set of columns to enable page index #34949

Closed
wgtmac opened this issue Apr 7, 2023 · 0 comments · Fixed by #35230

wgtmac commented Apr 7, 2023

Describe the enhancement requested

Once #34054 is merged, it would be nice to provide a writer option that selects the set of columns for which the page index is enabled.

Component(s)

C++, Parquet

wgtmac added a commit to wgtmac/arrow that referenced this issue Apr 19, 2023
wjones127 pushed a commit that referenced this issue Apr 19, 2023
### Rationale for this change

Currently the Parquet writer only supports enabling the page index for all columns at once. It would be good to enable or disable it at the column level, since the index may not be useful for some columns yet there is still a cost to creating it.

### What changes are included in this PR?

Similar to `WriterProperties::Builder::enable_dictionary/disable_dictionary`, this patch adds per-column `WriterProperties::Builder::enable_write_page_index/disable_write_page_index` and keeps them backward compatible, so the page index can still be enabled or disabled for all columns.
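
To illustrate the "default plus per-column override" behavior described above, here is a minimal, self-contained sketch. It does not use Arrow: `PageIndexSettingsBuilder`, `IndexAllButPayload`, and the column names `id` and `payload` are hypothetical, and the "off by default" choice is an assumption made for the sketch. Only the method names are borrowed from the patch description; the real signatures and internals may differ.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a writer-properties builder; NOT Arrow's class.
class PageIndexSettingsBuilder {
 public:
  // No-argument forms change the default that applies to every column.
  PageIndexSettingsBuilder& enable_write_page_index() {
    default_enabled_ = true;
    return *this;
  }
  PageIndexSettingsBuilder& disable_write_page_index() {
    default_enabled_ = false;
    return *this;
  }

  // Per-column forms record an override for a single column path.
  PageIndexSettingsBuilder& enable_write_page_index(const std::string& path) {
    overrides_[path] = true;
    return *this;
  }
  PageIndexSettingsBuilder& disable_write_page_index(const std::string& path) {
    overrides_[path] = false;
    return *this;
  }

  // A column uses its own override if one exists, otherwise the default.
  bool page_index_enabled(const std::string& path) const {
    auto it = overrides_.find(path);
    return it != overrides_.end() ? it->second : default_enabled_;
  }

 private:
  bool default_enabled_ = false;  // assumption for this sketch only
  std::unordered_map<std::string, bool> overrides_;
};

// Example configuration: index every column except the hypothetical "payload".
inline PageIndexSettingsBuilder IndexAllButPayload() {
  PageIndexSettingsBuilder builder;
  builder.enable_write_page_index().disable_write_page_index("payload");
  // Sanity check: a column without an override follows the default.
  assert(builder.page_index_enabled("id"));
  return builder;
}
```

In this model the no-argument calls keep their "all columns" meaning, which is one way the backward compatibility mentioned above could be preserved, while a per-column call only affects the named column.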

### Are these changes tested?

Added `ParquetPageIndexRoundTripTest::EnablePerColumn` to cover the new settings.

### Are there any user-facing changes?

Yes, users now have more flexibility in enabling or disabling the page index.
* Closes: #34949

Authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Will Jones <willjones127@gmail.com>
@wjones127 wjones127 added this to the 13.0.0 milestone Apr 19, 2023
liujiacheng777 pushed a commit to LoongArch-Python/arrow that referenced this issue May 11, 2023
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this issue May 15, 2023
rtpsw pushed a commit to rtpsw/arrow that referenced this issue May 16, 2023