Parquet: Support repetition level >1 and multi-column fields #871
Comments
Likely the first step is some kind of "flattening", but this is contrary to the intent of the Dremel design, so maybe we can think of a better solution. |
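To make the "flattening" idea concrete, here is a minimal pure-Python sketch. It is not Deephaven's or Parquet's implementation; the function name `flatten_nested` and the sample data are hypothetical. It walks an arbitrarily nested list and emits one row per leaf value, tagged with the index path that located it:

```python
from typing import Any, Iterator


def flatten_nested(
    column: list[Any], path: tuple[int, ...] = ()
) -> Iterator[tuple[tuple[int, ...], Any]]:
    """Yield (index_path, leaf_value) pairs for an arbitrarily nested list."""
    for i, item in enumerate(column):
        if isinstance(item, list):
            # Descend one repetition level, remembering where we came from.
            yield from flatten_nested(item, path + (i,))
        else:
            yield path + (i,), item


# Hypothetical column with two levels of repetition (a list of lists per row).
nested_column = [[[1, 2], [3]], [[4]]]
flat_rows = list(flatten_nested(nested_column))
```

Note that empty inner lists produce no rows at all in this sketch, so the nesting structure cannot be fully reconstructed from the output. That loss is one way such flattening runs against the Dremel design, which records this structure through repetition and definition levels.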
I'll be improving our error messages with a PR shortly. New messages:
We'll see:
For:
We'll see:
|
It might be nice to be able to specify which columns you care about for your Table - in which case, the user can choose to not include the nested columns. There's a mechanism right now to provide column instructions:
from deephaven.parquet import read, ColumnInstruction
t = read(
    path="/snappy.parquet",
    col_instructions=[
        ColumnInstruction(column_name="date", parquet_column_name="date")
    ],
)
but this currently throws the error:
|
A user has hit this w/ the parquet viewer, see devinrsmith/deephaven-parquet-viewer#9 |
Additionally, adds explicit entry points for single, flat-partitioned, and kv-partitioned reads. Fixes deephaven#4746 Partial workaround for deephaven#871
Currently, we regard nested repetition and multi-column fields as uncommon and hard to map into a columnar data table like Deephaven's.
This feature request is intended to capture views to the contrary.
Linked to #294, although intended for a later effort.