Commit
PARQUET-1780: [C++] Set ColumnMetadata.encoding_stats field
This fixes PARQUET-1780: the ColumnMetadata.encoding_stats field is empty in the parquet-cpp implementation. This leads to metadata mismatches between two Parquet files generated by C++ and Scala (parquet-mr).

encoding_stats is a vector of **PageEncodingStats**. PageEncodingStats has three attributes:

- page_type: type of the page (data or dict)
- encoding: encoding of the page
- count: number of pages of this type with this encoding

The first two can be extracted from information that is already available. For count, I had to add some attributes to existing classes.

Modifications:

For the class **SerializedPageWriter**, added the following two attributes:

    int32_t num_dict_pages_;
    std::pair<int32_t, int32_t> num_data_pages_;
    (first: number of un-encoded pages, second: number of encoded pages)

Closes #6370 from omega-gamage/PARQUET-1780 and squashes the following commits:

086af4e <Wes McKinney> Code review comments
a9c684b <Omega Gamage> Match the implementation with impala implementation
eae56fa <Wes McKinney> Simplify PageEncodingStats
54ac1eb <Omega Gamage> commit 9eecaaf
    Author: Omega Gamage <omega@bigstream.co>
    Date: Tue Feb 18 14:23:08 2020 +0530

Lead-authored-by: Omega Gamage <omega@bigstream.co>
Co-authored-by: Wes McKinney <wesm+git@apache.org>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
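To illustrate the idea of counting pages per (page_type, encoding) pair, here is a minimal sketch. It is not the code from this commit: the enums, the `PageEncodingStats` struct layout, and the `EncodingStatsCollector` class are hypothetical stand-ins, and a `std::map` is used in place of the counters described in the message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the Parquet enums; not the parquet-cpp types.
enum class PageType { DATA_PAGE, DICTIONARY_PAGE };
enum class Encoding { PLAIN, PLAIN_DICTIONARY, RLE_DICTIONARY };

// Mirrors the three attributes listed in the commit message.
struct PageEncodingStats {
  PageType page_type;
  Encoding encoding;
  int32_t count;
};

// Hypothetical helper: keeps one counter per (page_type, encoding) pair
// and turns the counters into a vector when the column chunk is closed.
class EncodingStatsCollector {
 public:
  // Call once for every page that is written.
  void RecordPage(PageType page_type, Encoding encoding) {
    ++counts_[std::make_pair(page_type, encoding)];
  }

  // Emits one entry per distinct pair, ordered by (page_type, encoding).
  std::vector<PageEncodingStats> Finish() const {
    std::vector<PageEncodingStats> stats;
    stats.reserve(counts_.size());
    for (const auto& entry : counts_) {
      assert(entry.second > 0);  // a pair only exists once a page was recorded
      stats.push_back({entry.first.first, entry.first.second, entry.second});
    }
    return stats;
  }

 private:
  std::map<std::pair<PageType, Encoding>, int32_t> counts_;
};
```

With this shape, a writer would call `RecordPage` each time it emits a dictionary or data page and copy the result of `Finish()` into the column metadata when the chunk is closed.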
1 parent 21c4d4b, commit b4acb0b
Showing 6 changed files with 116 additions and 17 deletions.