Currently, when you make a Table from ChunkedArrays, the num_rows argument is optional: if it is left at its default, the Table infers the row count from the ChunkedArrays and uses that to initialize itself. A user who wants a subset instead can supply this argument.
RecordBatch, when made from Arrays, always requires the number of rows to be supplied, which leads users to pass something like arr->length() when they just want all of their data.
Could RecordBatch's Array-based Make() method be changed to match the behavior of Table's ChunkedArray-based Make() method, if only for the sake of consistency?
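To make the request concrete, here is a minimal sketch of the semantics being asked for. It deliberately uses small stand-in types in a `sketch` namespace rather than Arrow's real classes, so `Column`, `Batch`, and `MakeBatch` are all hypothetical names. The only behavior taken from the description above is that an omitted row count is inferred from the data; the `-1` sentinel, the choice to read the first column's length, and treating an empty column list as zero rows are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in types for illustration only; these are NOT Arrow's classes.
namespace sketch {

struct Column {
  std::vector<int> values;
  int64_t length() const { return static_cast<int64_t>(values.size()); }
};

struct Batch {
  int64_t num_rows;
  std::vector<Column> columns;
};

// Hypothetical factory. num_rows defaults to -1, meaning "infer from the
// columns", so callers who want all of their data can omit it.
// An explicit value is stored as given; this sketch does not validate it
// against the column lengths and does not slice the columns.
inline Batch MakeBatch(std::vector<Column> columns, int64_t num_rows = -1) {
  if (num_rows < 0) {
    // Assumption: take the first column's length, or 0 if there are none.
    num_rows = columns.empty() ? 0 : columns.front().length();
  }
  return Batch{num_rows, std::move(columns)};
}

}  // namespace sketch
```

With a default like this, a call site that today has to spell out `arr->length()` could pass only the columns, while callers who want a specific row count could still supply it explicitly.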
Reporter: Kae Suarez / @ksuarez1423
Note: This issue was originally created as ARROW-17443. Please see the migration documentation for further details.