[Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns #29103

Closed
asfimport opened this issue Jul 22, 2021 · 2 comments

Comments

@asfimport
Collaborator

The documentation for pyarrow.parquet.read_table states:

  • columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.

It is not clear what the expected result is when columns is an empty list. In pyarrow 3.0, this read all columns (as long as use_legacy_dataset=False). In pyarrow 4.0, it reads no columns. I think reading no columns is the correct behavior (since None can be used to select all columns), but we should clarify that in the docs.

Reporter: Weston Pace / @westonpace
Assignee: Sasha Krassovsky / @save-buffer

Note: This issue was originally created as ARROW-13436. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Joris Van den Bossche / @jorisvandenbossche:
Yes, I think an empty list should mean that no columns are read in (which is the current behaviour?). Note that the table still has the correct length (num_rows), even though it has no columns. It would be good to clarify this in the docs.

@asfimport
Collaborator Author

Joris Van den Bossche / @jorisvandenbossche:
Issue resolved by pull request 11451
#11451

@asfimport asfimport added this to the 6.0.0 milestone Jan 11, 2023