The documentation for pyarrow.parquet.read_table states:
columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.
It is not clear what the expected result should be when columns is an empty list. In pyarrow 3.0 this read all columns (as long as use_legacy_dataset=False). In pyarrow 4.0 it reads no columns. I think the 4.0 behavior (reading no columns) is correct, since None can already be used to select all columns, but we should clarify that in the docs.
Joris Van den Bossche / @jorisvandenbossche:
Yes, I think an empty list should mean that no columns are read (which is the current behaviour?). Note that the table still has the correct length (num_rows), even though it has no columns. It would be good to clarify this in the docs.
Reporter: Weston Pace / @westonpace
Assignee: Sasha Krassovsky / @save-buffer
PRs and other links:
Note: This issue was originally created as ARROW-13436. Please see the migration documentation for further details.