This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Allow key value metadata to be set after writing rows #399

Merged
merged 1 commit into from
Nov 7, 2022

Contributor

@tschaub tschaub commented Nov 3, 2022

This adds a SetKeyValueMetadata method to the writer to allow key value file metadata to be updated after the writer is created. This allows for writing metadata that might be derived from the rows without having to buffer all the rows before creating the writer.

Fixes #397.
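
For readers following along, here is a minimal sketch of why setting metadata after the rows can work without buffering. Parquet stores file-level metadata in the footer, so it only has to be final by the time the writer is closed. The `toyWriter` type, its text output format, and the `writeWithMaxLength` helper are hypothetical stand-ins written for illustration; this is not the library's implementation, and only the `SetKeyValueMetadata(key, value string)` method name is taken from this pull request.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sort"
)

// toyWriter is a hypothetical stand-in for a parquet writer: rows are
// streamed to the output immediately, while key/value metadata is held in
// memory and only emitted in a footer when Close is called.
type toyWriter struct {
	out      io.Writer
	metadata map[string]string
}

func newToyWriter(out io.Writer) *toyWriter {
	return &toyWriter{out: out, metadata: make(map[string]string)}
}

// Write streams a single row to the output; nothing is buffered.
func (w *toyWriter) Write(row string) error {
	_, err := fmt.Fprintf(w.out, "row: %s\n", row)
	return err
}

// SetKeyValueMetadata records a footer entry, replacing any previous value
// stored under the same key.
func (w *toyWriter) SetKeyValueMetadata(key, value string) {
	w.metadata[key] = value
}

// Close writes the footer, with keys sorted so the output is deterministic.
func (w *toyWriter) Close() error {
	keys := make([]string, 0, len(w.metadata))
	for k := range w.metadata {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		if _, err := fmt.Fprintf(w.out, "meta: %s=%s\n", k, w.metadata[k]); err != nil {
			return err
		}
	}
	return nil
}

// writeWithMaxLength streams the rows, then stores a value derived from them
// (the length of the longest row) as metadata just before closing.
func writeWithMaxLength(rows []string) (string, error) {
	var buf bytes.Buffer
	w := newToyWriter(&buf)
	longest := 0
	for _, r := range rows {
		if err := w.Write(r); err != nil {
			return "", err
		}
		if len(r) > longest {
			longest = len(r)
		}
	}
	w.SetKeyValueMetadata("max_row_length", fmt.Sprint(longest))
	if err := w.Close(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := writeWithMaxLength([]string{"a", "bcd", "ef"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

In this sketch the derived value is computed in the same pass that writes the rows, which is the use case described above: no second pass over the data and no row buffer.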

@tschaub tschaub marked this pull request as ready for review November 4, 2022 14:32
Contributor

@achille-roussel achille-roussel left a comment

Should we add these methods to the GenericWriter type as well?

writer.go Outdated
// keys. This may create incompatibilities with other parquet libraries, or may
// cause some key/value pairs to be lost when opening parquet files written with
// repeated keys. We can revisit this decision if it ever becomes a blocker.
func (w *Writer) KeyValueMetadata(key, value string) {
Contributor

How about renaming to Set* to better communicate that the previous value will be overwritten?

We could also imagine having an Add* method for cases where the application needs to associate multiple values with a key.

Suggested change
- func (w *Writer) KeyValueMetadata(key, value string) {
+ func (w *Writer) SetKeyValueMetadata(key, value string) {

Contributor Author

Updated in the latest commit.

@achille-roussel achille-roussel self-assigned this Nov 7, 2022
Contributor Author

tschaub commented Nov 7, 2022

> Should we add these methods to the GenericWriter type as well?

Latest commit adds the method to GenericWriter as well.

Contributor

@achille-roussel achille-roussel left a comment

Thanks for your contributions!

Development

Successfully merging this pull request may close these issues.

Write key value metadata after rows
2 participants