Uwe Korn / @xhochy:
Do you still know which version the file was written with? We had a small range of commits between 0.7 and 0.8 that produced files that were later rejected by 0.8, but those were never part of a release.
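If the writing version is no longer known, the Parquet footer may help. A minimal sketch, assuming the file still carries a `created_by` string in the usual `<application> version <version>` form; the helper names `parse_created_by` and `read_created_by` are hypothetical, and files written through pyarrow may report the underlying parquet-cpp version rather than the pyarrow version:

```python
import re


def parse_created_by(created_by):
    """Split a Parquet 'created_by' string into (application, version).

    Returns None when the string does not follow the
    '<application> version <version>' convention.
    """
    match = re.match(r"(.+?) version (\S+)", created_by or "")
    if match is None:
        return None
    return match.group(1), match.group(2)


def read_created_by(path):
    """Read the 'created_by' field from a Parquet file footer.

    pyarrow is imported lazily so the parser above stays usable without it.
    """
    import pyarrow.parquet as pq

    return pq.ParquetFile(path).metadata.created_by
```

Something like `parse_created_by(read_created_by('bug.parq'))` would then show which writer produced the file, provided the footer has not been rewritten since.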
Pyarrow 0.8 and 0.9 raise an AssertionError for one of the datasets I have (created using an older version of pyarrow). Repro steps:
In [1]: from pyarrow.parquet import ParquetDataset
In [2]: d = ParquetDataset(['bug.parq'])
In [3]: t = d.read()
In [4]: t.to_pandas()
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-4-d17c9e2818f1> in <module>()
----> 1 t.to_pandas()
table.pxi in pyarrow.lib.Table.to_pandas()
~/envs/cli3/lib/python3.6/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, memory_pool, nthreads, categories)
529 # There must be the same number of field names and physical names
530 # (fields in the arrow Table)
--> 531 assert len(logical_index_names) == len(index_columns_set)
532
533 # It can never be the case in a released version of pyarrow that
AssertionError:
Here's the file: https://www.dropbox.com/s/oja3khjsc5tycfh/bug.parq
(I was not able to attach it here due to a "missing token", whatever that means.)
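The failing assertion appears to compare index information recorded in the table's `pandas` schema metadata against the columns actually present. A rough diagnostic sketch, assuming the metadata is stored as JSON under the `b'pandas'` key with an `index_columns` list of column names; the helper names are hypothetical and this is not how pyarrow itself performs the check:

```python
import json


def parse_pandas_metadata(raw):
    """Decode the JSON stored under the b'pandas' schema metadata key."""
    return json.loads(raw.decode("utf-8"))


def missing_index_columns(pandas_meta, table_columns):
    """List index columns named in the metadata but absent from the table."""
    # Only plain string entries are considered; other descriptor kinds
    # are skipped in this sketch.
    declared = [c for c in pandas_meta.get("index_columns", [])
                if isinstance(c, str)]
    present = set(table_columns)
    return [c for c in declared if c not in present]


def pandas_metadata_from_table(table):
    """Return the decoded pandas metadata of a pyarrow Table, or None."""
    metadata = table.schema.metadata or {}
    raw = metadata.get(b"pandas")
    return None if raw is None else parse_pandas_metadata(raw)
```

With the table from the repro, `missing_index_columns(pandas_metadata_from_table(t), t.schema.names)` would list any index columns the metadata names but the table lacks, which could help narrow down what the old writer stored.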
Reporter: Dima Ryazanov / @dimaryaz
Assignee: Wes McKinney / @wesm
Note: This issue was originally created as ARROW-2592. Please see the migration documentation for further details.