[R] Add function write_csv_dataset() #36247

Closed
thisisnic opened this issue Jun 22, 2023 · 4 comments · Fixed by #36436

Comments

@thisisnic
Member

thisisnic commented Jun 22, 2023

Describe the enhancement requested

Add a write_csv_dataset() function to mirror our open_csv_dataset() function. I don't think we currently support the functionality needed for write_delim_dataset() and write_tsv_dataset(), but I'm not entirely sure that's true, so it could be worth investigating.
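
For context, here is a minimal sketch of the existing route that the requested function would shorten. It assumes the arrow R package is installed and uses the built-in `mtcars` data frame purely for illustration; the `out_dir` and `written_files` names are hypothetical. It only shows `write_dataset()` with `format = "csv"`, not how the new wrapper is implemented.

```r
library(arrow)

# Temporary directory that will hold the dataset files.
out_dir <- tempfile("csv_dataset_")

# General-purpose writer with the CSV format selected explicitly;
# a write_csv_dataset() wrapper would make this the default.
write_dataset(mtcars, out_dir, format = "csv")

# List the files that were written.
written_files <- list.files(out_dir, recursive = TRUE)

# Read them back with the function the new writer is meant to mirror.
ds <- open_csv_dataset(out_dir)
```

A dedicated wrapper would mainly save callers from passing `format = "csv"` themselves, which matches how `open_csv_dataset()` relates to `open_dataset()`.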

Component(s)

R

@dgreiss
Contributor

dgreiss commented Jul 1, 2023

I can take this one if no one else is working on it.

@thisisnic
Member Author

Fantastic, thanks @dgreiss!

@dgreiss
Contributor

dgreiss commented Jul 2, 2023

I drafted PR #36436 for write_csv_dataset(), and I also investigated whether write_tsv_dataset() and write_delim_dataset() could be implemented. The options that can be passed to the CSV writer are listed here: https://arrow.apache.org/docs/cpp/api/formats.html#csv-writer .

There is a delimiter option, documented as "Field delimiter", but I'm not entirely sure what that means in this context. If it is the delimiter between values, then it should be relatively straightforward to expose it as an option to the CSV writer.

Of the options listed for the CSV writer class, the package currently accepts these:

  • include_header
  • batch_size
  • null_string

It may be worthwhile to expose the quoting_style option as well.

@dgreiss
Contributor

dgreiss commented Jul 4, 2023

I've updated the PR to include these other wrappers.

thisisnic added a commit to dgreiss/arrow that referenced this issue Aug 15, 2023
thisisnic added a commit that referenced this issue Aug 25, 2023
### Rationale for this change

Create a convenience wrapper around `write_dataset` for CSV files.

### What changes are included in this PR?

Adds a `write_csv_dataset()` function.

### Are these changes tested?

Yes, a few tests were added.

### Are there any user-facing changes?

Yes, a new function has been added. If this looks good I can add more to the docs as well.
* Closes: #36247

Lead-authored-by: David Greiss <david.dgreiss@gmail.com>
Co-authored-by: Nic Crane <thisisnic@gmail.com>
Co-authored-by: David Greiss <dgreiss@users.noreply.github.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
@thisisnic thisisnic added this to the 14.0.0 milestone Aug 25, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023