[C++] Add max_rows parameter to csv ReadOptions

I'm trying to read only the first 1,000 rows of a huge CSV with PyArrow.

I don't see a way to do this with Arrow. It looks like it should be easy to implement by adding a `max_rows` parameter to `pyarrow.csv.ReadOptions`.

After reading the first 1,000 rows, it should be possible to load the next 1,000 (or any other chunk) by combining the new `max_rows` with `skip_rows` (e.g. `pyarrow.csv.read_csv(path, read_options=pyarrow.csv.ReadOptions(skip_rows=1_000, max_rows=1_000))` would read rows 1,000 to 2,000).

Thanks!

**Reporter**: [Marc Garcia](https://issues.apache.org/jira/browse/ARROW-10419)

<sub>**Note**: *This issue was originally created as [ARROW-10419](https://issues.apache.org/jira/browse/ARROW-10419). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>

[C++] Add max_rows parameter to csv ReadOptions #26399
