[FEAT] View per-column metadata #84

Closed
ngbrown opened this issue Aug 7, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

ngbrown commented Aug 7, 2023

Describe the feature you'd like to be added to Parquet Viewer

I would like the metadata viewer to show the custom key_value_metadata added to each column of the schema. PyArrow's API seems to allow this to be added at the schema level, while Parquet.Net's API adds it per row group, which is more in line with the actual file structure.

Share why this feature would be a good addition to the utility

I want to validate that I'm building Parquet files correctly with the data I expect. I would like to use metadata for per-column information like units and description.

@ngbrown ngbrown added the enhancement New feature or request label Aug 7, 2023
@mukunku mukunku mentioned this issue Aug 12, 2023
Owner

mukunku commented Aug 12, 2023

I added some column metadata to the row groups in https://github.com/mukunku/ParquetViewer/releases/tag/v2.7.2 . Can you check it out and see if that's good enough for your needs?

Author

ngbrown commented Aug 14, 2023

@mukunku I see the column metadata (KeyValueMetadata) in each row group. I like that the byte sizes of each column are now available to get an idea of how well each column is de-duplicating and compressing.

This extra information also greatly increases the line count of the metadata window (2,440 -> 78,498), so I copied the JSON into Visual Studio Code to make use of its collapsing and search. I may have too many row groups... Anyway, doing more with the UI will have diminishing returns because other editors will do a better job. Maybe a copy button would be the extent of any change I would suggest to the metadata viewer window?

Thank you for your tool. I'm now able to use this feature for what I needed.

Owner

mukunku commented Aug 18, 2023

Thank you for the feedback. I reverted the extra info I had added to the Thrift metadata being shown, and added a "Copy Raw Metadata" button as you suggested. This new feature is available in v2.7.2.1.

Really appreciate the ideas to help make the app better. Closing out this ticket for now but feel free to reopen if required.

@mukunku mukunku closed this as completed Aug 18, 2023