Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove delimiter from csv Writer #1342

Merged
merged 1 commit into from Feb 24, 2022
Merged

Remove delimiter from csv Writer #1342

merged 1 commit into from Feb 24, 2022

Conversation

gsserge
Copy link
Contributor

@gsserge gsserge commented Feb 19, 2022

Which issue does this PR close?

Closes #1328.

I think that I wrote the issue above in a bit of a rush, the part about changes to Writer::new(). The doc comment for Writer::new() clearly states that a new CsvWriter is initialized with default options, with , being the default delimiter.

A custom delimiter can be configured using already existing WriterBuilder::with_delimiter. So, in my view, the scope of the issue is to simply remove unused code without any changes to the API.

Rationale for this change

There is no use for the delimiter field in csv Writer.

What changes are included in this PR?

The unused delimiter field is removed.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Feb 19, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #1342 (333672c) into master (ecba7dc) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1342   +/-   ##
=======================================
  Coverage   83.04%   83.04%           
=======================================
  Files         181      181           
  Lines       52937    52937           
=======================================
  Hits        43960    43960           
  Misses       8977     8977           
Impacted Files Coverage Δ
arrow/src/csv/writer.rs 72.13% <ø> (ø)
arrow/src/datatypes/datatype.rs 66.40% <0.00%> (-0.40%) ⬇️
parquet_derive/src/parquet_field.rs 66.21% <0.00%> (-0.23%) ⬇️
arrow/src/array/transform/mod.rs 84.65% <0.00%> (+0.13%) ⬆️
arrow/src/datatypes/field.rs 54.10% <0.00%> (+0.30%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ecba7dc...333672c. Read the comment docs.

@Dandandan
Copy link
Contributor

I think the delimiter is not used in the builder (when converting to Writer), so I think the original issue description is correct?

@gsserge
Copy link
Contributor Author

gsserge commented Feb 19, 2022

    /// Create a new `Writer`
    pub fn build<W: Write>(self, writer: W) -> Writer<W> {
        let delimiter = self.delimiter.unwrap_or(b','); // looks like it's used
        let mut builder = csv_crate::WriterBuilder::new();
        let writer = builder.delimiter(delimiter).from_writer(writer);
        ...

@Dandandan
Copy link
Contributor

    /// Create a new `Writer`
    pub fn build<W: Write>(self, writer: W) -> Writer<W> {
        let delimiter = self.delimiter.unwrap_or(b','); // looks like it's used
        let mut builder = csv_crate::WriterBuilder::new();
        let writer = builder.delimiter(delimiter).from_writer(writer);
        ...

Yes it's used in the builder, but looks to me it's not passed to the Writer eventually (and so not used when writing the CSV).

@gsserge
Copy link
Contributor Author

gsserge commented Feb 19, 2022

Hm, but what about this line

let writer = builder.delimiter(delimiter).from_writer(writer);

If I understand correctly, builder here is from the csv crate, and it gets the delimiter configuration.

@Dandandan
Copy link
Contributor

Hm, but what about this line

let writer = builder.delimiter(delimiter).from_writer(writer);

If I understand correctly, builder here is from the csv crate, and it gets the delimiter configuration.

Ah yeah, you are right 👍 missed that this was the Writer from the CSV crate.

Maybe we could add a test for writing a csv with another separator?

@gsserge
Copy link
Contributor Author

gsserge commented Feb 19, 2022

There's already a test for custom options, including the delimiter: https://github.com/apache/arrow-rs/blob/master/arrow/src/csv/writer.rs#L637

@Dandandan
Copy link
Contributor

There's already a test for custom options, including the delimiter: https://github.com/apache/arrow-rs/blob/master/arrow/src/csv/writer.rs#L637

Perfect, thanks!

@nevi-me nevi-me merged commit a2e629d into apache:master Feb 24, 2022
@gsserge gsserge deleted the csv_writer_delimiter branch February 28, 2022 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix CSV Writer::new to accept delimiter and make WriterBuilder::build use it
4 participants