[Python] Error when loading all the files in a directory #17825

@asfimport

Description

I can read a single Parquet file, but when I try to read all the Parquet files in a folder, I get an error.

>>> import pyarrow.parquet as pq
>>> data = pq.ParquetDataset('./aaa/part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86')
>>> data = pq.ParquetDataset('./aaa/')
Ignoring path: ./aaa//part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 638, in __init__
    self.validate_schemas()
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 647, in validate_schemas
    self.schema = self.pieces[0].get_metadata(open_file).schema
IndexError: list index out of range
>>> 
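
Judging from the "Ignoring path" message and the IndexError on self.pieces[0], the directory scan appears to have skipped the only file, so there were no pieces left to validate. A possible workaround, sketched here under assumptions, is to collect the file paths explicitly instead of relying on the directory scan. The helper name list_data_files is hypothetical, and passing its result to pq.ParquetDataset assumes the installed pyarrow version accepts a list of paths.

```python
import os


def list_data_files(directory):
    """Return sorted paths of the regular files directly inside `directory`.

    Names starting with '_' or '.' (for example _SUCCESS markers or .crc
    checksum files) are skipped, as are subdirectories.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.startswith(("_", ".")):
            continue
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths


# Hypothetical usage (assumes pyarrow is installed and that ParquetDataset
# accepts a list of paths in the installed version):
#   import pyarrow.parquet as pq
#   data = pq.ParquetDataset(list_data_files('./aaa/'))
```

This only sidesteps the directory scan; it does not explain why the file was ignored in the first place.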

Environment: Python 2.7.11 (default, Jan 22 2016, 08:29:18) + pyarrow 0.7.1
Reporter: DB Tsai
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as ARROW-1830. Please see the migration documentation for further details.
