Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PARQUET-248: Add ParquetWriter.Builder. #199

Closed

Conversation

rdblue
Copy link
Contributor

@rdblue rdblue commented May 19, 2015

This refactors the builder recently added to parquet-avro so that it can
be used by all object models. The Builder class is abstract and
implementations should extend it.

This changes the API slightly from AvroParquetWriter, renaming
withBlockSize to withRowGroupSize. The Avro builder has not been
released so this isn't a breaking change.

@rdblue
Copy link
Contributor Author

rdblue commented May 19, 2015

@SinghAsDev, can you do a first round of review?

@SinghAsDev
Copy link
Contributor

+1, looks good to me.

@rdblue
Copy link
Contributor Author

rdblue commented May 24, 2015

@isnotinvain could you take a look? I think this is a good thing to get in now so we have time to get the other object models using builders.

@peltekster
Copy link

@rdblue, This was already done in #157
I have no idea why it was ignored. Anyway, are you interested in having control on the encoding used per column?

@rdblue
Copy link
Contributor Author

rdblue commented May 30, 2015

Hi @peltekster, sorry about that. I had no idea it had been done as part of another issue. Since this is a separate issue, it should be a bit easier to merge this one and use it to simplify the other commit. Would you mind reviewing this then?

@rdblue rdblue force-pushed the PARQUET-248-add-parquet-writer-builder branch from a226b7b to 63aa891 Compare June 23, 2015 00:33
@rdblue
Copy link
Contributor Author

rdblue commented Jun 23, 2015

@isnotinvain, now that #211 is in, I've updated this and added the new options from master. I think it's about ready to go in.

* @return this builder for method chaining.
*/
public SELF withRowGroupSize(int rowGroupSize) {
this.blockSize = rowGroupSize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also rename this.blockSize to be this.rowGroupSize ?

@isnotinvain
Copy link
Contributor

+1

This refactors the builder recently added to parquet-avro so that it can
be used by all object models. The Builder class is abstract and
implementations should extend it.

This changes the API slightly from AvroParquetWriter, renaming
withBlockSize to withRowGroupSize. The Avro builder has not been
released so this isn't a breaking change.
This adds the options that have been added to master since this pull
request was added.
@rdblue rdblue force-pushed the PARQUET-248-add-parquet-writer-builder branch from 63aa891 to a1a25ee Compare June 24, 2015 23:24
@asfgit asfgit closed this in cb04562 Jun 25, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants