[R] Read CSV with comma as decimal mark #29184
Comments
Dewey Dunnington / @paleolimbot:
library(arrow, warn.conflicts = FALSE)
tf <- tempfile()
write("col1;col2\n1,23;val1\n4,56;val2\n", tf)
# how it's done elsewhere
read.csv2(tf)
#> col1 col2
#> 1 1.23 val1
#> 2 4.56 val2
readr::read_csv2(tf, show_col_types = FALSE)
#> ℹ Using "','" as decimal and "'.'" as grouping mark. Use `read_delim()` for more control.
#> # A tibble: 2 × 2
#> col1 col2
#> <dbl> <chr>
#> 1 1.23 val1
#> 2 4.56 val2
readr::read_delim(
  tf,
  delim = ";",
  locale = readr::locale(decimal_mark = ","),
  show_col_types = FALSE
)
#> # A tibble: 2 × 2
#> col1 col2
#> <dbl> <chr>
#> 1 1.23 val1
#> 2 4.56 val2
# possible syntax in arrow::read_csv_arrow()
read_csv_arrow(
  tf,
  parse_options = CsvParseOptions$create(delimiter = ";"),
  convert_options = CsvConvertOptions$create(decimal_point = ",")
)
#> Error in CsvConvertOptions$create(decimal_point = ","): unused argument (decimal_point = ",")
read_csv2_arrow(tf)
#> Error in read_csv2_arrow(tf): could not find function "read_csv2_arrow"
Where the CsvConvertOptions are defined:
Lines 526 to 559 in 670af33
Lines 79 to 149 in 670af33
arrow/cpp/src/arrow/csv/options.h Lines 107 to 108 in 670af33
@paleolimbot Your instructions here are 🔥
thisisnic added a commit that referenced this issue Oct 9, 2023
### Rationale for this change
Allow customisable decimal points when reading data
### What changes are included in this PR?
Expose the C++ option in R
### Are these changes tested?
Aye
### Are there any user-facing changes?
Indeed
* Closes: #29184
Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
Followup to ARROW-13421. There is a new ConvertOption, that part is easy. There may be some subtleties in emulating the readr way of supporting this since it uses a broader locale() object, but maybe we just add read_csv2_arrow (matching readr::read_csv2 and base::read.csv2) and that's enough.
Reporter: Neal Richardson / @nealrichardson
Note: This issue was originally created as ARROW-13531. Please see the migration documentation for further details.
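The read_csv2_arrow idea described above could be sketched roughly as follows. This is only an illustration, not the implementation that was merged: the helper name read_csv2_sketch is hypothetical, and the code assumes an arrow build in which CsvConvertOptions$create() accepts decimal_point, as the referenced commit describes. On older builds the call fails with the "unused argument" error shown in the thread.

```r
library(arrow, warn.conflicts = FALSE)

# Hypothetical helper for illustration only (not arrow's own function):
# read a ";"-delimited file whose numbers use "," as the decimal mark.
# Assumes CsvConvertOptions$create() accepts `decimal_point`.
read_csv2_sketch <- function(file, ...) {
  read_delim_arrow(
    file,
    delim = ";",
    convert_options = CsvConvertOptions$create(decimal_point = ","),
    ...
  )
}

# Same sample data as in the comment above
tf <- tempfile()
writeLines(c("col1;col2", "1,23;val1", "4,56;val2"), tf)

df <- read_csv2_sketch(tf)
str(df)
```

Passing a ready-made convert_options object keeps the sketch close to the syntax proposed in the thread; a fuller version would probably also need to map readr-style arguments such as na onto the convert options.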