Incorrect (but ignored) metadata written after ColumnChunk #1946
Comments
I will try to fix this point.
@tustvold
I created a ticket in Parquet-MR for the Java version.
@liukun4515 and @tustvold, I tried to summarize the issue in the title of this ticket (which is included in the CHANGELOG), but I am not sure I totally understand -- can you please verify my title change is correct?
The TLDR is we wrote the wrong thing after the column chunk, but no implementations actually read this data as it is already present in the footer, and so we never noticed |
Thank you -- added to the original description |
TLDR: we wrote the wrong thing after the column chunk, but no implementations actually read this data, as it is already present in the footer, and so we never noticed.
Describe the bug
While working on #1935, I went through the write path of the Rust implementation.
I found that the Parquet writer writes the wrong data to the file: it writes a ColumnChunk. The logic is here:
arrow-rs/parquet/src/file/writer.rs
Line 441 in 9f7b600
From my understanding of the Parquet format, based on https://github.com/apache/parquet-format/blob/54e53e5d7794d383529dd30746378f19a12afd58/src/main/thrift/parquet.thrift#L789 and https://github.com/apache/parquet-format#file-format, the data written after the column data should be ColumnMetaData, not ColumnChunk.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Additional context
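To illustrate the difference described above, here is a minimal sketch. It is not the arrow-rs writer: the two structs are simplified, hypothetical stand-ins for the ColumnChunk and ColumnMetaData definitions in parquet.thrift (most fields omitted), the byte encoding is a toy format rather than the Thrift compact protocol, and the helper names (encode_column_metadata, encode_column_chunk, write_column, read_pages) are made up for this example. It only shows that the bytes written after the pages differ between the two cases, while a reader that relies on offsets from the footer gets the same page data either way.

```rust
// Simplified, hypothetical stand-ins for two structs from parquet.thrift.
// Only a few fields are kept, and the byte encoding below is a toy format,
// not the Thrift compact protocol that Parquet actually uses.

#[derive(Debug, Clone, PartialEq)]
struct ColumnMetaData {
    num_values: i64,
    total_compressed_size: i64,
    data_page_offset: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct ColumnChunk {
    file_offset: i64,
    meta_data: ColumnMetaData,
}

/// Toy encoding of the inner struct: three little-endian i64 values.
fn encode_column_metadata(md: &ColumnMetaData) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&md.num_values.to_le_bytes());
    out.extend_from_slice(&md.total_compressed_size.to_le_bytes());
    out.extend_from_slice(&md.data_page_offset.to_le_bytes());
    out
}

/// Toy encoding of the outer struct: file_offset, then the nested metadata.
fn encode_column_chunk(cc: &ColumnChunk) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&cc.file_offset.to_le_bytes());
    out.extend_from_slice(&encode_column_metadata(&cc.meta_data));
    out
}

/// Append the page bytes followed by a trailer; return the offset of the pages.
fn write_column(file: &mut Vec<u8>, pages: &[u8], trailer: &[u8]) -> usize {
    let offset = file.len();
    file.extend_from_slice(pages);
    file.extend_from_slice(trailer);
    offset
}

/// Read pages the way a footer-driven reader would: by offset and size only.
fn read_pages(file: &[u8], offset: usize, size: usize) -> &[u8] {
    &file[offset..offset + size]
}

fn main() {
    let pages = [1u8, 2, 3, 4];
    let md = ColumnMetaData {
        num_values: 4,
        total_compressed_size: pages.len() as i64,
        data_page_offset: 0,
    };
    let cc = ColumnChunk { file_offset: 0, meta_data: md.clone() };

    // What this issue reports: the outer ColumnChunk written after the pages.
    let mut reported = Vec::new();
    let off_a = write_column(&mut reported, &pages, &encode_column_chunk(&cc));

    // What the format description expects: only ColumnMetaData.
    let mut expected = Vec::new();
    let off_b = write_column(&mut expected, &pages, &encode_column_metadata(&md));

    // The bytes after the pages differ between the two files...
    assert_ne!(reported, expected);
    // ...but a reader that follows footer offsets sees identical pages.
    assert_eq!(
        read_pages(&reported, off_a, pages.len()),
        read_pages(&expected, off_b, pages.len())
    );
    println!("trailers differ, pages identical");
}
```

This matches the TLDR: since the same information is already in the footer, a reader that never looks at the bytes after the pages would not notice which struct was written there.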