Make bounds configurable in csv ReaderBuilder #1341

Merged
nevi-me merged 1 commit into apache:master from csv_builder_bounds
Feb 24, 2022

gsserge (Contributor) commented Feb 19, 2022

Which issue does this PR close?

Closes #1327.

Rationale for this change

The csv ReaderBuilder has a bounds field, but users currently have no way to set it.

What changes are included in this PR?

This PR adds a new public method ReaderBuilder::with_bounds(), and changes ReaderBuilder::build() to correctly pass the configured bounds to Reader::from_csv_reader().

Are there any user-facing changes?

Users can now configure bounds when using ReaderBuilder.
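
To illustrate the change described above, here is a minimal sketch of a builder that stores optional bounds and forwards them when building a reader. The types below are simplified stand-ins written for this example, not the arrow crate's actual Reader or ReaderBuilder; in particular, from_lines is a hypothetical substitute for Reader::from_csv_reader(), and treating bounds as a half-open row range [start, end) is an assumption.

```rust
// Simplified stand-in types for illustration; NOT the arrow crate's real API.
type Bounds = Option<(usize, usize)>;

struct Reader {
    rows: Vec<String>,
}

impl Reader {
    // Hypothetical stand-in for Reader::from_csv_reader().
    // Assumption: bounds select rows in the half-open range [start, end).
    fn from_lines(input: &str, bounds: Bounds) -> Self {
        let (start, end) = bounds.unwrap_or((0, usize::MAX));
        let rows = input
            .lines()
            .enumerate()
            .filter(|(i, _)| *i >= start && *i < end)
            .map(|(_, line)| line.to_string())
            .collect();
        Reader { rows }
    }
}

#[derive(Default)]
struct ReaderBuilder {
    bounds: Bounds,
}

impl ReaderBuilder {
    fn new() -> Self {
        Self::default()
    }

    // The new public setter: the field existed before, but could not be set.
    fn with_bounds(mut self, bounds: Bounds) -> Self {
        self.bounds = bounds;
        self
    }

    // build() forwards the configured bounds instead of ignoring them.
    fn build(self, input: &str) -> Reader {
        Reader::from_lines(input, self.bounds)
    }
}

fn main() {
    let data = "a,1\nb,2\nc,3\nd,4";
    let reader = ReaderBuilder::new().with_bounds(Some((1, 3))).build(data);
    // Only the rows at index 1 and 2 are kept.
    println!("{:?}", reader.rows);
}
```

Leaving the bounds unset in this sketch reads every row, which matches the behaviour a user would expect from a builder whose bounds default to None.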

github-actions bot added the "arrow" label (Changes to the arrow crate) on Feb 19, 2022
codecov-commenter commented Feb 19, 2022

Codecov Report

Merging #1341 (b896496) into master (ecba7dc) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1341      +/-   ##
==========================================
- Coverage   83.04%   83.03%   -0.01%     
==========================================
  Files         181      181              
  Lines       52937    52948      +11     
==========================================
+ Hits        43960    43968       +8     
- Misses       8977     8980       +3     
Impacted Files                          Coverage Δ
arrow/src/csv/reader.rs                 88.25% <100.00%> (+0.13%) ⬆️
parquet_derive/src/parquet_field.rs     65.98% <0.00%> (-0.46%) ⬇️
arrow/src/array/transform/mod.rs        84.39% <0.00%> (-0.14%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

gsserge (Contributor, Author) commented Feb 19, 2022

It might be better from the API perspective for the with_bounds() method to have explicit start and end parameters instead of wrapping a tuple in Some: .with_bounds(0, 2) instead of .with_bounds(Some((0, 2))).
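
The two call-site shapes being compared can be sketched side by side. This uses a hypothetical stand-in builder, not the arrow crate's type, and the name with_bounds_opt exists only so that both variants can live in one example.

```rust
// Hypothetical stand-in builder; method names are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct ReaderBuilder {
    bounds: Option<(usize, usize)>,
}

impl ReaderBuilder {
    // Shape A: the caller wraps the tuple in Some.
    fn with_bounds_opt(mut self, bounds: Option<(usize, usize)>) -> Self {
        self.bounds = bounds;
        self
    }

    // Shape B: explicit start and end; the wrapping happens inside.
    fn with_bounds(mut self, start: usize, end: usize) -> Self {
        self.bounds = Some((start, end));
        self
    }
}

fn main() {
    let a = ReaderBuilder::default().with_bounds_opt(Some((0, 2)));
    let b = ReaderBuilder::default().with_bounds(0, 2);
    // Both calls leave the builder in the same state.
    assert_eq!(a, b);
}
```

Shape B reads more cleanly at the call site, at the cost of offering no way to pass None through the same method.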

nevi-me (Contributor) commented Feb 24, 2022

> It might be better from the API perspective for the with_bounds() method to have explicit start and end parameters instead of wrapping a tuple in Some: .with_bounds(0, 2) instead of .with_bounds(Some((0, 2))).

I think either approach is fine. We would often pass an Option in case one wants to reset an optional value. In practice though, I don't know whether a ReaderBuilder can be reused, as it could be cheap to create a new one with different configs.

I'll merge this one for now, and then if you'd like to update to exclude the Option, we can do that before the next major release to avoid breaking changes.
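
The reset use case mentioned above can be sketched as follows, assuming a cloneable builder. The struct is a hypothetical stand-in rather than the arrow crate's ReaderBuilder, and the batch_size field is included only to show that other settings survive the reset.

```rust
// Hypothetical stand-in builder; assumes the builder is Clone so it can be reused.
#[derive(Clone, Debug, Default)]
struct ReaderBuilder {
    bounds: Option<(usize, usize)>,
    batch_size: usize,
}

impl ReaderBuilder {
    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = batch_size;
        self
    }

    // Taking an Option lets a caller clear previously configured bounds.
    fn with_bounds(mut self, bounds: Option<(usize, usize)>) -> Self {
        self.bounds = bounds;
        self
    }
}

fn main() {
    let base = ReaderBuilder::default()
        .with_batch_size(512)
        .with_bounds(Some((0, 2)));

    // Reuse the configured builder, but read without bounds this time.
    let unbounded = base.clone().with_bounds(None);

    // Alternative: skip the reset and configure a fresh builder instead.
    let fresh = ReaderBuilder::default().with_batch_size(512);

    println!("{:?}\n{:?}\n{:?}", base, unbounded, fresh);
}
```

If builders are cheap to construct and rarely reused, the fresh-builder route makes the Option parameter unnecessary, which is the trade-off raised in the comment.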

nevi-me merged commit bae3087 into apache:master on Feb 24, 2022

alamb (Contributor) left a comment

Nice ❤️

gsserge deleted the csv_builder_bounds branch on February 28, 2022 19:46
Successfully merging this pull request may close these issues.

Make bounds configurable via builder when reading CSV