[R] Improve error message when providing a mix of readr and Arrow options #33420

Open
Tracked by #33520
asfimport opened this issue Nov 3, 2022 · 1 comment

@asfimport

I was trying to solve a user issue today and tried to run the following code:

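library(arrow)
library(dplyr) # provides tibble() and %>%
library(readr) # provides write_tsv()

df <- tibble(x = c("a", "b", "", "d"))
write_tsv(df, "data.tsv")
open_dataset("data.tsv", format = "tsv", skip_rows = 1, schema = schema(x = string()), skip_empty_rows = TRUE) %>%
  collect()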

which gives me the error

Error: Use either Arrow parse options or readr parse options, not both

which is unhelpful, as it gives me no context about which options it is referring to or what the possible options are.

Also, why can't we have a mix of both? This is a totally valid use case. I think both a code update and a more informative error message are needed here.

Reporter: Nicola Crane / @thisisnic

Note: This issue was originally created as ARROW-18236. Please see the migration documentation for further details.

@thisisnic
Member

thisisnic commented Feb 2, 2023

Note: an improved error message has already been implemented for the CSV reading options, so this would need something similar.

Here's the code which improves the messages for the CSV reading:

arrow/r/R/dataset-format.R

Lines 440 to 454 in b413ac4

  if (any(is_readr_opt)) {
    # Catch cases when the user specifies a mix of Arrow C++ options and
    # readr-style options
    if (!all(is_readr_opt)) {
      abort(c(
        "Additional CSV reading options must be Arrow-style or readr-style, but not both.",
        i = sprintf("Arrow options used: %s.", oxford_paste(opt_names[is_arrow_opt])),
        i = sprintf("readr options used: %s.", oxford_paste(opt_names[is_readr_opt]))
      ))
    }
    do.call(readr_to_csv_read_options, opts) # all options have readr-style names
  } else {
    do.call(CsvReadOptions$create, opts) # all options have Arrow C++ names
  }
}
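
A similar check for the parse options might look roughly like the sketch below. It is plain base R so that it runs without arrow installed. `check_parse_opt_styles()` and `paste_names()` are hypothetical stand-ins for the package's own code (the latter plays the role of `oxford_paste()`), `stop()` is used in place of `abort()`, and the two option-name vectors are only an illustrative subset, not the lists the package actually uses.

```r
# Illustrative subsets of option names (assumption: the real lists would be
# derived from the package's own option constructors).
arrow_parse_opts <- c("delimiter", "quote_char", "ignore_empty_lines")
readr_parse_opts <- c("delim", "quote", "skip_empty_rows")

# Hypothetical helper: join names as "a", "a and b", or "a, b, and c".
paste_names <- function(x) {
  n <- length(x)
  if (n <= 1) return(x)
  if (n == 2) return(paste(x, collapse = " and "))
  paste0(paste(x[-n], collapse = ", "), ", and ", x[n])
}

# Hypothetical check: error out, naming the offending options, when both
# Arrow-style and readr-style parse options are supplied together.
check_parse_opt_styles <- function(...) {
  opt_names <- names(list(...))
  is_arrow_opt <- opt_names %in% arrow_parse_opts
  is_readr_opt <- opt_names %in% readr_parse_opts
  if (any(is_arrow_opt) && any(is_readr_opt)) {
    stop(
      "Additional parse options must be Arrow-style or readr-style, but not both.\n",
      sprintf("Arrow options used: %s.\n", paste_names(opt_names[is_arrow_opt])),
      sprintf("readr options used: %s.", paste_names(opt_names[is_readr_opt])),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Example: mixing styles raises an error that lists both groups of options.
# check_parse_opt_styles(delimiter = "\t", skip_empty_rows = TRUE)
```

With a message like this, the user can see which of their arguments fall into each style and rename one group accordingly.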
