Offer backwards compatibility to < 1.2.0 in name repair #546
This update has broken so much code, how could this be overlooked?
If you want the old style of name repair in the presence of the current CRAN versions of both readxl and tibble, I suggest that you define a function like this:
library(readxl)
legacy_repair <- function(nms, prefix = "X", sep = "__") {
  if (length(nms) == 0) return(character())
  blank <- nms == ""
  nms[!blank] <- make.unique(nms[!blank], sep = sep)
  new_nms <- setdiff(paste(prefix, seq_along(nms), sep = sep), nms)
  nms[blank] <- new_nms[seq_len(sum(blank))]
  nms
}
legacy_repair(rep_len("", 3))
#> [1] "X__1" "X__2" "X__3"
legacy_repair(c("x", "", "x"))
#> [1] "x" "X__1" "x__1"
readxl_test_sheet <- "~/rrr/readxl/tests/testthat/sheets/names-need-repair-xlsx.xlsx"
read_excel(readxl_test_sheet)
#> New names:
#> * `a b` -> `a b..1`
#> * `a b` -> `a b..2`
#> * `` -> `..3`
#> # A tibble: 2 x 4
#> `a b..1` `a b..2` ..3 `c%&$`
#> <dbl> <chr> <chr> <dbl>
#> 1 1 a one 1.1
#> 2 2 b two 2.2
read_excel(readxl_test_sheet, .name_repair = legacy_repair)
#> # A tibble: 2 x 4
#> `a b` `a b__1` X__1 `c%&$`
#> <dbl> <chr> <chr> <dbl>
#> 1 1 a one 1.1
#> 2 2 b two 2.2
Or you could decline name repair and use tibble::repair_names() yourself:
library(readxl)
readxl_test_sheet <- "~/rrr/readxl/tests/testthat/sheets/names-need-repair-xlsx.xlsx"
x <- read_excel(readxl_test_sheet, .name_repair = "minimal")
x
#> # A tibble: 2 x 4
#> `a b` `a b` `` `c%&$`
#> <dbl> <chr> <chr> <dbl>
#> 1 1 a one 1.1
#> 2 2 b two 2.2
tibble::repair_names(
  tibble::as_tibble(x, validate = FALSE), prefix = "X", sep = "__"
)
#> # A tibble: 2 x 4
#> `a b` `a b__1` X__1 `c%&$`
#> <dbl> <chr> <chr> <dbl>
#> 1 1 a one 1.1
#> 2 2 b two 2.2
Or maybe stay on an earlier version of readxl or tibble until the time is right to take control of variable names?
This should not affect sheet reads where the names are well-formed to begin with. We really want to reduce the inconsistencies across tidyverse packages re: how column names are repaired. I know this requires some adjustments for people, but this was a considered move. It was also pre-announced (see comment I'm about to write in response to @raredd).
This is not an oversight. readxl has been downloaded almost 600K times since its December 2018 release, from RStudio CRAN mirrors alone. This is the very first issue opened about name repair, about 6 weeks later. We go through complete revdep checks for readxl and for tibble and we search & analyze non-package code on GitHub before we make such changes. Changes to column name repair were pre-announced when v1.1.0 was released in April of 2018. There are costs to changing and to not changing and we think making this change is a net improvement to the ecosystem.
Thank you very much @jennybc -- that solution works for the sheets I had had trouble with. (Apologies for the lack of reprex and for seeming ungrateful -- I'm truly spoiled by this package.)
No worries @HughParsonage, I know this is a disruption and the average level of pain is not indicative of what any specific user experiences.
@jennybc There are already four built-in name repairs, why not simply add a 5th that was the standard for however long since readxl was released?
Addresses tidyverse/readxl#546 Addresses tidyverse/tidyr#641 Closes r-lib#359
Since the introduction of .name_repair changes the output of read_excel, and since it is quite difficult to reproduce the same effect, it would be nice to offer a package option, say getOption("readxl.old_repair_names"), to provide backwards compatibility. Otherwise, the only alternative appears to be a hack using assignInNamespace. That is, instead of that hack, just have options("readxl.old_repair_names" = TRUE) and then modify set_readxl_names to honor it.
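For anyone who wants option-style behavior without patching the package, here is a minimal user-side sketch. It assumes the option name readxl.old_repair_names from the request above; readxl itself does not consult this option, and pick_name_repair is a hypothetical helper, not part of readxl or tibble. The legacy_repair function is the one given in the workaround in this thread.

```r
# Legacy-style repair, copied from the workaround posted in this thread.
legacy_repair <- function(nms, prefix = "X", sep = "__") {
  if (length(nms) == 0) return(character())
  blank <- nms == ""
  nms[!blank] <- make.unique(nms[!blank], sep = sep)
  new_nms <- setdiff(paste(prefix, seq_along(nms), sep = sep), nms)
  nms[blank] <- new_nms[seq_len(sum(blank))]
  nms
}

# Hypothetical helper: returns a value suitable for .name_repair.
# The option name is an assumption; readxl does not read it on its own.
pick_name_repair <- function() {
  if (isTRUE(getOption("readxl.old_repair_names", default = FALSE))) {
    legacy_repair
  } else {
    "unique"  # the documented default of read_excel(.name_repair = )
  }
}

options(readxl.old_repair_names = TRUE)
repair <- pick_name_repair()
repair(c("x", "", "x"))
#> [1] "x"    "X__1" "x__1"

# Intended use (requires readxl, so left as a comment):
# readxl::read_excel(path, .name_repair = pick_name_repair())
```

This keeps the switch in user code, so it only affects calls that opt in, and unsetting the option falls back to the current default repair.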