[C++] fallback to storage type when writing ExtensionType to Parquet #22579

Closed
asfimport opened this issue Aug 9, 2019 · 1 comment

Comments

@asfimport

Writing a table that contains an ExtensionType array to a parquet file is not yet implemented. It currently raises "ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: extension<arrow.py_extension_type>" (for a PyExtensionType in this case).

I think minimal support can consist of writing the storage type / array.
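To illustrate what "writing the storage type" could mean at the schema-conversion step, here is a toy sketch. It does not use the real Arrow classes: `PrimitiveType`, `ExtensionType`, `PARQUET_PHYSICAL` and `to_parquet_type` are all hypothetical stand-ins, and the actual C++ change may be structured quite differently.

```python
from dataclasses import dataclass


# Toy stand-ins for Arrow types; these are NOT the real Arrow classes.
@dataclass(frozen=True)
class PrimitiveType:
    name: str  # e.g. "int64", "binary"


@dataclass(frozen=True)
class ExtensionType:
    extension_name: str
    storage_type: PrimitiveType


# Hypothetical mapping from primitive types to Parquet physical types.
PARQUET_PHYSICAL = {"int64": "INT64", "binary": "BYTE_ARRAY", "double": "DOUBLE"}


def to_parquet_type(arrow_type, fallback_to_storage=True):
    """Convert a toy Arrow type to a Parquet physical type name."""
    if isinstance(arrow_type, ExtensionType):
        if not fallback_to_storage:
            # Mirrors the behaviour reported in this issue.
            raise NotImplementedError(
                "Unhandled type for Arrow to Parquet schema conversion: "
                f"extension<{arrow_type.extension_name}>"
            )
        # Minimal support: ignore the extension wrapper, write the storage type.
        arrow_type = arrow_type.storage_type
    return PARQUET_PHYSICAL[arrow_type.name]


uuid_type = ExtensionType("example.uuid", PrimitiveType("binary"))
print(to_parquet_type(uuid_type))
```

With the fallback enabled, the extension column is written exactly as a plain column of its storage type would be; the extension information itself is lost unless it is saved separately.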

We also might want to save the extension name and metadata in the parquet FileMetadata.

Later on, this could potentially be used to restore the extension type when reading. This is related to other issues that need to save the Arrow schema (categorical: ARROW-5480, time zones: ARROW-5888). In this case, however, we probably want to store the serialised type in addition to the schema (which only has the extension type's name).
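A rough sketch of the round trip described above, again as a toy model rather than Arrow's actual API. The key names `extension_name` and `extension_metadata`, the registry, and all function names are assumptions made up for illustration; the point is only that the reader needs both the name (to find the type) and the serialised type (to rebuild its parameters), and should fall back to the storage type when the name is unknown.

```python
import json

# Hypothetical registry: extension name -> function rebuilding the type.
EXTENSION_REGISTRY = {}


def register_extension(name, deserialize):
    EXTENSION_REGISTRY[name] = deserialize


def write_field_metadata(extension_name, serialized):
    """Key-value entries a writer might store next to the storage column."""
    return {
        "extension_name": extension_name,
        "extension_metadata": serialized.hex(),  # serialised type, as text
    }


def restore_type(storage_type, kv):
    """Rebuild the extension type if it is registered, else keep storage."""
    factory = EXTENSION_REGISTRY.get(kv.get("extension_name"))
    if factory is None:
        return storage_type  # unknown or absent extension: plain storage type
    return factory(storage_type, bytes.fromhex(kv["extension_metadata"]))


# Toy parametrised extension type: a period type with a frequency.
register_extension(
    "example.period",
    lambda storage, blob: ("example.period", storage, json.loads(blob)),
)

kv = write_field_metadata("example.period", json.dumps({"freq": "D"}).encode())
restored = restore_type("int64", kv)
unregistered = restore_type(
    "int64", {"extension_name": "not.registered", "extension_metadata": ""}
)
print(restored, unregistered)
```

Storing only the name would not be enough for a parametrised type such as the toy period type here, since the frequency lives in the serialised form.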

Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Joris Van den Bossche / @jorisvandenbossche

PRs and other links:

Note: This issue was originally created as ARROW-6187. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request 5436
#5436
