Handle impossible conversions in csv.ConvertOptions

[`pyarrow.csv.ParseOptions`](https://arrow.apache.org/docs/python/generated/pyarrow.csv.ParseOptions.html#pyarrow.csv.ParseOptions) allows skipping invalid rows by means of the `invalid_row_handler`.

In [`pyarrow.csv.ConvertOptions`](https://arrow.apache.org/docs/python/generated/pyarrow.csv.ConvertOptions.html#pyarrow.csv.ConvertOptions), one can supply a schema to get the correct types in the resulting table.
I have a data source that almost always follows a specific schema, but its data isn't validated beforehand. In practice, a field that holds a valid int16 99.9% of the time can have an out-of-range value in a few rows.

I'd like to handle those cases similarly to the `invalid_row_handler`, perhaps by allowing failing conversions to be set to NULL, or by supplying a handler that applies a more specific operation.
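The requested semantics can be sketched in plain Python using only the standard library. This is an illustration of the idea, not pyarrow's API: `to_int16`, `null_on_failure`, `read_with_handler`, and the `on_invalid` parameter are all hypothetical names invented for this example.

```python
import csv
import io

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


def to_int16(text):
    """Convert text to an int within int16 range, raising ValueError otherwise."""
    value = int(text)
    if not INT16_MIN <= value <= INT16_MAX:
        raise ValueError(f"{value} is out of range for int16")
    return value


def null_on_failure(column, text, error):
    """Hypothetical handler: replace any failed conversion with None (NULL)."""
    return None


def read_with_handler(source, converters, on_invalid=null_on_failure):
    """Read CSV rows, routing every failed conversion through on_invalid."""
    rows = []
    for record in csv.DictReader(source):
        row = {}
        for column, text in record.items():
            convert = converters.get(column, str)
            try:
                row[column] = convert(text)
            except ValueError as error:
                # The handler decides what the cell becomes.
                row[column] = on_invalid(column, text, error)
        rows.append(row)
    return rows


data = "id,value\n1,100\n2,40000\n3,abc\n4,-5\n"
rows = read_with_handler(io.StringIO(data), {"id": int, "value": to_int16})
print([row["value"] for row in rows])  # [100, None, None, -5]
```

Passing a different `on_invalid` callable (for example, one that clamps to the int16 bounds or re-raises the error) would cover the "more specific operation" case.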

**Reporter**: [Tim Loderhose](https://issues.apache.org/jira/browse/ARROW-16834)

<sub>**Note**: *This issue was originally created as [ARROW-16834](https://issues.apache.org/jira/browse/ARROW-16834). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>

Handle impossible conversions in csv.ConvertOptions #32163