[R] open_csv_dataset() error if schema supplied and col_names left as TRUE (the default) #34092
Comments
thisisnic changed the title from "[R] bad default in open_csv_dataset()" to "[R] open_csv_dataset() error if schema supplied and col_names left as TRUE (the default)" on Feb 16, 2023
assignUser pushed a commit that referenced this issue on Feb 23, 2023
…es left as TRUE (the default) (#34217)

Before this PR:

``` r
library(arrow)

tf <- tempfile()
df <- tibble::tibble(x = 1, b = 2)
write_csv_arrow(df, tf)
open_csv_dataset(tf, schema = schema(x = int64(), y = int64()), skip = 1)
#> Error in `check_schema()`:
#> ! Values in `column_names` must match `schema` field names
#> ✖ `x` and `y` not present in `column_names`
#> Backtrace:
#>     ▆
#>  1. └─arrow (local) `<fn>`(...)
#>  2.   └─arrow::open_dataset(...)
#>  3.     └─DatasetFactory$create(...)
#>  4.       └─FileFormat$create(match.arg(format), ...)
#>  5.         └─CsvFileFormat$create(schema = schema, ...)
#>  6.           └─arrow:::check_schema(options[["schema"]], options[["read_options"]]$column_names)
#>  7.             └─rlang::abort(...)
```

After this PR:

``` r
library(arrow)

tf <- tempfile()
df <- tibble::tibble(x = 1, b = 2)
write_csv_arrow(df, tf)
open_csv_dataset(tf, schema = schema(x = int64(), y = int64()), skip = 1)
#> FileSystemDataset with 1 csv file
#> x: int64
#> y: int64
```

* Closes: #34092

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
fatemehp pushed a commit to fatemehp/arrow that referenced this issue on Feb 24, 2023: …ol_names left as TRUE (the default) (apache#34217)
thisisnic added a commit to thisisnic/arrow that referenced this issue on Mar 1, 2023: …ol_names left as TRUE (the default) (apache#34217)
Describe the bug, including details regarding any error messages, version, and platform.
I was putting together an example to diagnose a user error, and got an unexpected error message:
This may be due to the fact that the `col_names` param has a default value of `TRUE`.
Component(s)
R