[Python] Writing Parquet with pyarrow.dataset still defaults to Parquet version 1.0 #36537

Closed
jorisvandenbossche opened this issue Jul 7, 2023 · 0 comments · Fixed by #36538

Comments

@jorisvandenbossche
Member

We just bumped the default Parquet version we write from 2.4 to 2.6 (#35746), and a year ago it was bumped from 1.0 to 2.4 (#13280). In both cases, however, we forgot to update that default in `ParquetFileWriteOptions`, which is used when writing with `pyarrow.dataset.write_dataset`.

@jorisvandenbossche jorisvandenbossche added this to the 13.0.0 milestone Jul 7, 2023
jorisvandenbossche added a commit to jorisvandenbossche/arrow that referenced this issue Jul 7, 2023
jorisvandenbossche added a commit to danepitkin/arrow that referenced this issue Jul 7, 2023
jorisvandenbossche added a commit that referenced this issue Jul 7, 2023
…ion of 2.6 (#36538)

### Rationale for this change

When bumping the default Parquet write version from 1.0 to 2.4 and then to 2.6, we forgot to also bump that default in the `pyarrow.dataset` Parquet writer (`ParquetFileWriteOptions`).

This PR bumps that directly from 1.0 to 2.6 to follow the default of the pyarrow.parquet module.

### Are these changes tested?

Yes

### Are there any user-facing changes?

A different default version can result in different types in the written Parquet file.

* Closes: #36537

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>