parquet: Field Ids are not read from a Parquet file without serialized arrow schema #4877

Closed
Samrose-Ahmed opened this issue Sep 29, 2023 · 1 comment · Fixed by #4878
Labels: bug, parquet (Changes to the parquet crate)

Comments

@Samrose-Ahmed
Contributor

Samrose-Ahmed commented Sep 29, 2023

Describe the bug

If a Parquet file that does not have the serialized Arrow schema in its metadata (e.g. one written by parquet-mr) is read with the parquet crate and written back to Parquet, the new file does not contain the Parquet field ids.

To Reproduce

  • Start with a Parquet file that has field ids and no serialized Arrow schema in its metadata.
  • Read it with the parquet crate and write it back to a new Parquet file.
  • The new Parquet file does not have field ids.
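
The steps above can be pictured with a toy model of the round trip. This is only a sketch and does not use the parquet crate: `ParquetField`, `ArrowField`, `parquet_to_arrow`, `arrow_to_parquet`, and the `copy_id` flag are hypothetical names for illustration. It also assumes that field ids travel through Arrow field metadata under a key such as `PARQUET:field_id`, which this issue does not state.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Parquet schema field (not the parquet crate's type).
#[derive(Debug, Clone, PartialEq)]
struct ParquetField {
    name: String,
    field_id: Option<i32>,
}

// Hypothetical stand-in for an Arrow field: a name plus string metadata.
#[derive(Debug, Clone)]
struct ArrowField {
    name: String,
    metadata: HashMap<String, String>,
}

// Assumed metadata key used to carry a Parquet field id on an Arrow field.
const FIELD_ID_KEY: &str = "PARQUET:field_id";

// Read path: derive an Arrow field from the Parquet schema alone, as happens
// when no serialized Arrow schema is available. `copy_id` toggles whether the
// field id is carried over into the Arrow metadata.
fn parquet_to_arrow(field: &ParquetField, copy_id: bool) -> ArrowField {
    let mut metadata = HashMap::new();
    if copy_id {
        if let Some(id) = field.field_id {
            metadata.insert(FIELD_ID_KEY.to_string(), id.to_string());
        }
    }
    ArrowField { name: field.name.clone(), metadata }
}

// Write path: rebuild the Parquet field, taking the id from metadata if present.
fn arrow_to_parquet(field: &ArrowField) -> ParquetField {
    let field_id = field
        .metadata
        .get(FIELD_ID_KEY)
        .and_then(|v| v.parse::<i32>().ok());
    ParquetField { name: field.name.clone(), field_id }
}

// Read then write, mirroring the reproduction steps.
fn round_trip(field: &ParquetField, copy_id: bool) -> ParquetField {
    arrow_to_parquet(&parquet_to_arrow(field, copy_id))
}

fn main() {
    let original = ParquetField { name: "a".to_string(), field_id: Some(1) };
    println!("without id copy: {:?}", round_trip(&original, false));
    println!("with id copy:    {:?}", round_trip(&original, true));
}
```

In this model the id is lost whenever the read path skips copying it, which matches the symptom reported here. Whether the real crate loses the id at that same point is not confirmed by this issue.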

Expected behavior

The new Parquet file should have the same field ids as the original file.
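
One way to state this expectation as a check is to compare per-field ids before and after the rewrite. The helper below is a hypothetical sketch that uses only the standard library. It assumes the `(name, field id)` pairs have already been extracted from each file's schema by some other means, which is not shown here.

```rust
use std::collections::HashMap;

/// Returns the names of fields whose id in `rewritten` differs from the id in
/// `original` (including fields missing from `rewritten`), in `original` order.
fn mismatched_field_ids(
    original: &[(&str, Option<i32>)],
    rewritten: &[(&str, Option<i32>)],
) -> Vec<String> {
    // Index the rewritten schema by field name for lookup.
    let rewritten_ids: HashMap<&str, Option<i32>> = rewritten.iter().copied().collect();
    original
        .iter()
        .filter(|(name, id)| rewritten_ids.get(name) != Some(id))
        .map(|(name, _)| name.to_string())
        .collect()
}

fn main() {
    // Illustrative values only, not taken from a real file.
    let original = [("id", Some(1)), ("name", Some(2))];
    let rewritten = [("id", None), ("name", Some(2))];
    println!("{:?}", mismatched_field_ids(&original, &rewritten));
}
```

An empty result means every field kept its id through the round trip.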

Additional context

Related: #4702, #3548

@tustvold
Contributor

label_issue.py automatically added labels {'parquet'} from #4878

@tustvold tustvold added the parquet Changes to the parquet crate label Oct 18, 2023