[C++][Parquet] Fix backwards compatibility for ParquetV2 data pages written prior to 3.0.0 per ARROW-10353 #20322

Closed
asfimport opened this issue Jul 18, 2022 · 4 comments

asfimport commented Jul 18, 2022

As described in https://lists.apache.org/thread/xkrhgfpk9sr1mj74d4chz3r5yp3szt6c, commit ef0feb2 caused some files written prior to 3.0.0 to be unreadable. Given that the patch was small, this will hopefully not be too difficult to fix.

Reporter: Will Jones / @wjones127
Assignee: Will Jones / @wjones127

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-17100. Please see the migration documentation for further details.

Antoine Pitrou / @pitrou:
That changeset is ARROW-10353, which fixes bugs in both the read and write paths for V2 data pages. On the write side, Parquet C++ used to always set is_compressed = false in the data page, regardless of compression. On the read side, Parquet C++ used to always decompress, regardless of the is_compressed flag.
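
To make the compatibility problem concrete, here is a minimal sketch of a read-side decision that would accept both old and new files. It is not the actual Parquet C++ code: `WriterVersion`, `WrittenBeforeFix`, and `ShouldDecompressV2Page` are hypothetical names, and the sketch assumes the file is already known to have been written by Parquet C++ and that the column chunk's codec is known.

```cpp
#include <tuple>

// Hypothetical, simplified writer version (not the real Parquet C++ type).
struct WriterVersion {
  int major;
  int minor;
  int patch;
};

// Assumption: the write-side fix shipped in 3.0.0, so any earlier
// Parquet C++ writer may have set is_compressed = false on compressed pages.
bool WrittenBeforeFix(const WriterVersion& v) {
  return std::tie(v.major, v.minor, v.patch) < std::make_tuple(3, 0, 0);
}

// Decide whether the body of a V2 data page should be decompressed.
bool ShouldDecompressV2Page(bool column_has_codec, bool is_compressed_flag,
                            const WriterVersion& writer) {
  if (!column_has_codec) {
    return false;  // No codec on the column chunk: nothing to decompress.
  }
  if (WrittenBeforeFix(writer)) {
    return true;  // Old writer: the flag is unreliable, so ignore it.
  }
  return is_compressed_flag;  // Fixed writer: trust the flag.
}
```

With this rule, a compressed page from a 2.x writer is still decompressed even though its flag says false, while a page from a 3.0.0 or later writer is decompressed only when its flag is set.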

Antoine Pitrou / @pitrou:
Note that this issue has existed since 3.0.0, so the blocker classification seems a bit exaggerated to me.

Raúl Cumplido / @raulcd:
This is identified as one of the tickets required to create the first 9.0.0 RC. Should this block the release?

David Li / @lidavidm:
Issue resolved by pull request #13665.
