GH-38887: [C++][Parquet] Move EstimatedBufferedValueBytes from TypedColumnWriter to ColumnWriter #39055

Merged
merged 2 commits into apache:main from put-estimated-buffered-value-bytes
Dec 6, 2023

Conversation

mapleFU
Member

@mapleFU mapleFU commented Dec 3, 2023

Rationale for this change

Move EstimatedBufferedValueBytes from TypedColumnWriter to ColumnWriter.

What changes are included in this PR?

Moves EstimatedBufferedValueBytes from TypedColumnWriter to ColumnWriter.

Are these changes tested?

No, this is only an interface change.

Are there any user-facing changes?

Yes, the interface changed.

@mapleFU
Member Author

mapleFU commented Dec 3, 2023

@pitrou @wgtmac
The interface name can be changed. I can also write a description like the one mentioned here: #33897 (comment)

Also cc @Hattonuri

@@ -175,6 +175,9 @@ class PARQUET_EXPORT ColumnWriter {
/// total_bytes_written().
virtual int64_t total_compressed_bytes_written() const = 0;

/// \brief Estimated size of the values that are not written to a page yet.
virtual int64_t EstimatedBufferedValueBytes() const = 0;
Member

It is unrelated, but the naming looks weird after relocation...

Member

+1
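
For readers following the diff above, here is a minimal sketch of what declaring the estimate on the untyped base class makes possible: code that only holds `ColumnWriter` pointers can query the buffered size without downcasting to a typed writer. The classes below are stand-ins written for illustration and are not the Parquet headers. `BufferingWriter`, `WriteValue`, and `TotalBufferedValueBytes` are hypothetical names, and the real `TypedColumnWriter` is templated on a Parquet physical type rather than a plain C++ value type.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for parquet::ColumnWriter; only the method from this PR is modeled.
class ColumnWriter {
 public:
  virtual ~ColumnWriter() = default;
  // Declared on the untyped base class, as in the diff above.
  virtual int64_t EstimatedBufferedValueBytes() const = 0;
};

// Stand-in for TypedColumnWriter, simplified to a plain C++ value type.
template <typename T>
class TypedColumnWriter : public ColumnWriter {
 public:
  virtual void WriteValue(const T& value) = 0;  // hypothetical method
};

// Hypothetical concrete writer that keeps unflushed values in memory.
template <typename T>
class BufferingWriter : public TypedColumnWriter<T> {
 public:
  void WriteValue(const T& value) override { buffered_.push_back(value); }

  int64_t EstimatedBufferedValueBytes() const override {
    return static_cast<int64_t>(buffered_.size() * sizeof(T));
  }

 private:
  std::vector<T> buffered_;
};

// Sums the estimate across columns of different types without downcasting.
inline int64_t TotalBufferedValueBytes(
    const std::vector<std::unique_ptr<ColumnWriter>>& writers) {
  int64_t total = 0;
  for (const auto& writer : writers) {
    total += writer->EstimatedBufferedValueBytes();
  }
  return total;
}

}  // namespace sketch
```

With the method declared only on the typed class, a helper like `TotalBufferedValueBytes` would need to know each column's concrete type before it could ask for the estimate.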

@github-actions github-actions bot added the awaiting committer review label and removed the awaiting review label Dec 4, 2023
Member

@wgtmac wgtmac left a comment

+1 on my side. But this introduces a minor breaking change.

@mapleFU
Member Author

mapleFU commented Dec 5, 2023

> But this introduces a minor breaking change.

Although TypedColumnWriter is not marked PARQUET_EXPORT, some users might rely on it. We can keep this change and let users fix the resulting compile errors, or add a wrapper for it.

@wgtmac
Member

wgtmac commented Dec 5, 2023

IMO, this is really a minor breaking change and it is very easy for users to fix. So I don't think we need to add a wrapper or something.
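
The thread does not spell out which user code stops compiling. One plausible case, assumed here purely for illustration, is a user-defined class that derives from `ColumnWriter` directly: once the base class gains a new pure virtual method, that class stays abstract until it adds an override. The sketch below uses stand-in classes rather than the Parquet headers, and `LegacyUserWriter` and `FixedUserWriter` are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Stand-in for parquet::ColumnWriter after the change; both methods appear
// in the diff context, everything else is omitted.
class ColumnWriter {
 public:
  virtual ~ColumnWriter() = default;
  virtual int64_t total_compressed_bytes_written() const = 0;
  virtual int64_t EstimatedBufferedValueBytes() const = 0;
};

// Hypothetical user class written against the old interface: it overrides
// what used to be required but knows nothing about the new method.
class LegacyUserWriter : public ColumnWriter {
 public:
  int64_t total_compressed_bytes_written() const override { return 0; }
};

// The fix under this assumption is a one-method override.
class FixedUserWriter : public LegacyUserWriter {
 public:
  int64_t EstimatedBufferedValueBytes() const override { return 0; }
};

// The legacy class can no longer be instantiated; the fixed one can.
static_assert(std::is_abstract<LegacyUserWriter>::value,
              "missing override leaves the class abstract");
static_assert(!std::is_abstract<FixedUserWriter>::value,
              "adding the override makes the class concrete again");

}  // namespace sketch
```

If this is the case the reviewers had in mind, it matches the comment above that the fix is easy: the compiler names the missing method, and the change is local to the user's subclass.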

@mapleFU mapleFU merged commit 1cc1f4c into apache:main Dec 6, 2023
30 of 32 checks passed
@mapleFU mapleFU removed the awaiting committer review label Dec 6, 2023
@mapleFU mapleFU deleted the put-estimated-buffered-value-bytes branch December 6, 2023 03:15
@mapleFU
Member Author

mapleFU commented Dec 6, 2023

Merged, cc @Hattonuri

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 1cc1f4c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@amoeba amoeba added the Breaking Change Includes a breaking change to the API label Jan 13, 2024
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…TypedColumnWriter to ColumnWriter (apache#39055)

* Closes: apache#38887

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: mwish <maplewish117@gmail.com>
Labels
Breaking Change Includes a breaking change to the API Component: C++ Component: Parquet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] [Parquet] Move EstimatedBufferedValueBytes from TypedColumnWriter to ColumnWriter
4 participants