[C++][Parquet] Unable to read Parquet files with list inside struct #17612

@asfimport

Is PyArrow currently unable to read Parquet files that have a vector column, i.e. a struct column containing list fields? For example, the schema of such a file is below:

<pyarrow._parquet.ParquetSchema object at 0x7f2d42493c88>
mbc: FLOAT
deltae: FLOAT
labels: FLOAT
features.type: INT32 INT_8
features.size: INT32
features.indices.list.element: INT32
features.values.list.element: DOUBLE

Using either pq.read_table() or pq.ParquetDataset('/path/to/parquet').read() yields the following error: ArrowNotImplementedError: Currently only nesting with Lists is supported.

From the error, I assume this may be implemented in a future release?

Environment: Ubuntu
Reporter: Jovann Kung
Assignee: Micah Kornfield / @emkornfield

Note: This issue was originally created as ARROW-1599. Please see the migration documentation for further details.
