[C++][Parquet] Unable to read Parquet files with list inside struct #17612
Description
Is PyArrow currently unable to read in Parquet files with a vector as a column? For example, the schema of such a file is below:
<pyarrow._parquet.ParquetSchema object at 0x7f2d42493c88>
mbc: FLOAT
deltae: FLOAT
labels: FLOAT
features.type: INT32 INT_8
features.size: INT32
features.indices.list.element: INT32
features.values.list.element: DOUBLE
Using either pq.read_table() or pq.ParquetDataset('/path/to/parquet').read() yields the following error: ArrowNotImplementedError: Currently only nesting with Lists is supported.
From the error, I assume this may be implemented in a future release?
Environment: Ubuntu
Reporter: Jovann Kung
Assignee: Micah Kornfield / @emkornfield
Related issues:
- [C++] Rebase https://github.com/apache/parquet-cpp/pull/462# onto arrow repo (is related to)
- [Python] Fail to write nested data to Parquet via BigQuery API (is related to)
Note: This issue was originally created as ARROW-1599. Please see the migration documentation for further details.