Commit
GH-33526: [R] Implement new function open_dataset_csv with signature more closely matching read_csv_arrow (#33614)

This PR implements a wrapper around `open_dataset()` specifically for value-delimited files. It takes the parameters from `open_dataset()` and appends the parameters of `read_csv_arrow()` which are compatible with `open_dataset()`. This should make it easier for users to switch between the two, e.g.:

``` r
library(arrow)
library(dplyr)

# Set up directory for examples
tf <- tempfile()
dir.create(tf)
on.exit(unlink(tf))
df <- data.frame(x = c("1", "2", "NULL"))

file_path <- file.path(tf, "file1.txt")
write.table(df, file_path, sep = ",", row.names = FALSE)

read_csv_arrow(file_path, na = c("", "NA", "NULL"), col_names = "y", skip = 1)
#> # A tibble: 3 × 1
#>       y
#>   <int>
#> 1     1
#> 2     2
#> 3    NA

open_csv_dataset(file_path, na = c("", "NA", "NULL"), col_names = "y", skip = 1) %>% collect()
#> # A tibble: 3 × 1
#>       y
#>   <int>
#> 1     1
#> 2     2
#> 3    NA
```

This PR also hooks up the "na" (readr-style) parameter to "null_values" (i.e. the CSVConvertOptions parameter).

In the process of making this PR, I also refactored `CsvFileFormat$create()`. Unfortunately, many changes needed to be made at once, which has considerably increased the size and complexity of this PR.

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
Showing 10 changed files with 648 additions and 91 deletions.