read_delim delimiters at end of line not treated as delimiters #1328
Comments
The issue is that the header line does not match the rest of the file: the first line has only one delimiter, but every other line has two. For example, if the first line were "Aldur;Fjöldi;", matching the other lines:
library(readr)
text <- "Aldur;Fjöldi;
0-5 ára;1.287;
6-12 ára;1.438;
13-16 ára;730;
17-24 ára;1.409;
25-34 ára;3.891;
35-66 ára;6.561;
67 ára og eldri;1.683;
Samtals;16.999;"
is_locale <- locale("is", decimal_mark = ",", grouping_mark = ".")
read_delim(text, delim = ";", locale = is_locale)
#> New names:
#> * `` -> ...3
#> Rows: 8 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ";"
#> chr (1): Aldur
#> lgl (1): ...3
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 8 × 3
#> Aldur Fjöldi ...3
#> <chr> <dbl> <lgl>
#> 1 0-5 ára 1287 NA
#> 2 6-12 ára 1438 NA
#> 3 13-16 ára 730 NA
#> 4 17-24 ára 1409 NA
#> 5 25-34 ára 3891 NA
#> 6 35-66 ára 6561 NA
#> 7 67 ára og eldri 1683 NA
#> 8 Samtals 16999 NA
A workaround would be to skip the first line and give the column names explicitly:
text <- "Aldur;Fjöldi
0-5 ára;1.287;
6-12 ára;1.438;
13-16 ára;730;
17-24 ára;1.409;
25-34 ára;3.891;
35-66 ára;6.561;
67 ára og eldri;1.683;
Samtals;16.999;"
read_delim(text, delim = ";", skip = 1, col_names = strsplit(readr::read_lines(text, n_max = 1), ";")[[1]], locale = is_locale)
#> Rows: 8 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ";"
#> chr (1): Aldur
#> lgl (1): X3
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 8 × 3
#> Aldur Fjöldi X3
#> <chr> <dbl> <lgl>
#> 1 0-5 ára 1287 NA
#> 2 6-12 ára 1438 NA
#> 3 13-16 ára 730 NA
#> 4 17-24 ára 1409 NA
#> 5 25-34 ára 3891 NA
#> 6 35-66 ára 6561 NA
#> 7 67 ára og eldri 1683 NA
#> 8 Samtals 16999 NA
Created on 2021-11-18 by the reprex package (v2.0.1)
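Another option, sketched here as an assumption rather than something suggested in this thread, is to strip the trailing delimiter from every line before parsing, so the header and the data rows agree on two fields. The sample below is an abbreviated copy of the data above, and `lines`, `cleaned` and `result` are illustrative names only.

```r
library(readr)

# Abbreviated copy of the data from this thread: the header has no trailing
# ";" but every data line does.
text <- "Aldur;Fjöldi
0-5 ára;1.287;
6-12 ára;1.438;
Samtals;16.999;"

# Split into lines and drop a single trailing ";" from each one (base R).
lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
cleaned <- sub(";$", "", lines)

is_locale <- locale("is", decimal_mark = ",", grouping_mark = ".")

# I() marks the string as literal data rather than a file path.
result <- read_delim(
  I(paste(cleaned, collapse = "\n")),
  delim = ";",
  locale = is_locale,
  show_col_types = FALSE
)
result
```

Note that this also removes the final ";" from any line whose last field is genuinely empty, so it only suits files where the trailing delimiter is known to be spurious.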
Thank you for taking the time to clear this up, Jim (and again, thanks for a fantastic package).
I wanted to point out the following change between versions 1.4.0 and 2.0.0 and ask if this is a feature or a bug.
A delimiter at the end of a line, in this case ";", used to be treated as a delimiter even if the header line in the text document had no delimiter at its end. This has changed with the new parsing engine, and I didn't see it mentioned in the release notes. It is definitely not a big issue, especially thanks to the brilliant with_edition() function, but I wanted to point it out because some services that hand off data delimited with ";" (erroneously?) put a delimiter at the end of each data line.
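For anyone landing here, a minimal sketch of the with_edition() route mentioned above: it runs a single call with the first-edition (pre-2.0.0) parser. The sample is an abbreviated, assumed stand-in for the original file, and `legacy` is just an illustrative name. The legacy parser may emit parsing warnings about the extra field on each data line.

```r
library(readr)

# Abbreviated stand-in for the original file: no trailing ";" in the header,
# one trailing ";" on every data line.
text <- "Aldur;Fjöldi
0-5 ára;1.287;
6-12 ára;1.438;
Samtals;16.999;"

is_locale <- locale("is", decimal_mark = ",", grouping_mark = ".")

# with_edition(1, ...) evaluates the call using the first-edition parser;
# I() marks the string as literal data rather than a file path.
legacy <- with_edition(
  1,
  read_delim(I(text), delim = ";", locale = is_locale)
)
legacy
```

local_edition(1) is the scoped alternative when several calls in one function or script need the legacy behaviour.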
Anyway, thanks for all your work on this incredible package!
Created on 2021-11-18 by the reprex package (v2.0.0)