The current implementation throws an error at reader.cc:286 when skip_rows > header. However, in some workloads skip_rows is used not only to skip the header but also simply to skip the first n rows. In that case the block-size constraint gets in the way considerably. I think this constraint could be removed without reducing performance.
Weston Pace / @westonpace:
This behavior could be useful for ARROW-12598. Also, in a recent discussion, n3world (no Jira I can find) pointed out that skip_rows is probably not the best tool for this. This sort of "paging" would require skipping data rows, so it would be nice if "skip header rows" (a constant parameter based on the tool generating the data) were distinct from "skip data rows" (a per-query parameter based on paging needs).
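To make the distinction concrete, here is a minimal sketch of the two kinds of skipping, written against Python's standard `csv` module rather than Arrow. The helper `read_csv_page` and its parameters `skip_header_rows`, `skip_data_rows`, and `max_rows` are hypothetical names chosen for illustration; they are not Arrow's API and say nothing about how the C++ reader handles blocks.

```python
import csv
import io
from itertools import islice


def read_csv_page(text, skip_header_rows=0, skip_data_rows=0, max_rows=None):
    """Hypothetical helper (not Arrow's API): return (column_names, rows)."""
    lines = io.StringIO(text)

    # Fixed preamble written by the producing tool: dropped before the
    # column names are read. This value depends on the data source only.
    for _ in range(skip_header_rows):
        next(lines, None)

    reader = csv.reader(lines)
    column_names = next(reader, [])

    # Per-query offset: data rows dropped after the column names are known.
    # This value changes from one page request to the next.
    stop = None if max_rows is None else skip_data_rows + max_rows
    rows = list(islice(reader, skip_data_rows, stop))
    return column_names, rows


sample = "# exported by some tool\n# version 1\nid,name\n1,a\n2,b\n3,c\n4,d\n"
names, page = read_csv_page(sample, skip_header_rows=2, skip_data_rows=2, max_rows=2)
```

With a single skip_rows parameter, skipping the first two data rows would also discard the line holding the column names; keeping the two counts separate lets the column names survive while the data offset varies per query.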
Reporter: Ravil Bikbulatov
Assignee: Nate Clark / @n3world
Related issues:
Note: This issue was originally created as ARROW-8527. Please see the migration documentation for further details.