Directory-based partitioning is a feature of Arrow, but could we support filename-based partitioning?
For example, I have a series of CSV files all named something like foo_month_year.csv, and it would be nice to read them in and have the month and year parts of the filenames appear as fields I can filter on.
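To make the request concrete, here is a rough sketch in plain Python of the behaviour being asked for. It is not Arrow's API: `FILENAME_PATTERN`, `fields_from_filename` and `select_paths` are hypothetical names, and it assumes the month is written as one or two digits and the year as four digits (e.g. `foo_03_2021.csv`), which the issue does not specify.

```python
import re
from pathlib import PurePosixPath

# Assumed naming scheme: <prefix>_<month>_<year>.csv with a numeric month and year.
FILENAME_PATTERN = re.compile(
    r"^(?P<prefix>.+)_(?P<month>\d{1,2})_(?P<year>\d{4})\.csv$"
)


def fields_from_filename(path):
    """Parse month and year from the file name; return None if it does not match."""
    match = FILENAME_PATTERN.match(PurePosixPath(path).name)
    if match is None:
        return None
    return {"month": int(match["month"]), "year": int(match["year"])}


def select_paths(paths, **wanted):
    """Keep only the paths whose filename-derived fields equal the wanted values."""
    selected = []
    for path in paths:
        fields = fields_from_filename(path)
        if fields is None:
            continue  # Not part of the series, so skip it.
        if all(fields.get(key) == value for key, value in wanted.items()):
            selected.append(path)
    return selected


if __name__ == "__main__":
    paths = ["data/foo_03_2021.csv", "data/foo_11_2020.csv", "data/notes.txt"]
    print(select_paths(paths, year=2021))
```

In a dataset API, the idea would be for these parsed values to show up as columns that a filter can be pushed down onto, so that files whose names do not match the filter are never opened.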
David Li / @lidavidm:
The current APIs trim the filenames before they're handed to partitioning, but assuming we can change that, we could add or update the partitioning schemes to allow for this without too much trouble, I think. (If the filenames weren't trimmed, this could already be done, at least in C++, via a FunctionPartitioning.)
Reporter: Nicola Crane / @thisisnic
Assignee: Sanjiban Sengupta / @sanjibansg
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-14612. Please see the migration documentation for further details.