When loading datasets back, you can quickly provide just the names of the columns by which the data was partitioned, using field_names. This is convenient because it is easier and quicker than providing the whole schema, which can still be autodetected from the loaded data.

On the other hand, we don't support this when saving data. If you provide field_names instead of the schema, you get a crash:

    pyarrow/dataset.py in _ensure_write_partitioning(scheme)
        684     if not isinstance(scheme, Partitioning):
        685         # TODO support passing field names, and get types from schema
    --> 686         raise ValueError("partitioning needs to be actual Partitioning object")
        687     return scheme
        688

It would be convenient to allow passing only field_names when saving as well, since the schema can be detected automatically from the table being saved.
Reporter: Alessandro Molina / @amol-
Assignee: Alessandro Molina / @amol-
PRs and other links:
Note: This issue was originally created as ARROW-13755. Please see the migration documentation for further details.