Make ...j the standard disambiguating suffix in name repair #566
|
Can we instead recommend to use
|
|
I'm motivated by the case of empty names and a phenomenon I think will be fairly prevalent: the nonsyntactic names are short-lived or beside the point, but they prevent someone from doing something that seems totally unrelated. Like the dplyr issue above where, surprisingly, nonsyntactic names prevent |
|
I agree it feels icky to make another change but we're still in early days of the tibble 2.x series and I suspect it's a net positive, long-term. |
|
I still believe we can educate users to default to dplyr is being fixed and now also works for names like |
|
tidyverse/dplyr#4094 seems difficult to fix for names like How about a renaming strategy that sits between
What would quotable repair do to backticks?
library(tibble)
library(magrittr)
data <- tibble(`\`` = 1)
data %>% names()
#> [1] "`"
as_tibble(data, .name_repair = "unique")
#> # A tibble: 1 x 1
#> `\``
#> <dbl>
#> 1 1
as_tibble(data, .name_repair = "universal")
#> New names:
#> * `\`` -> .
#> # A tibble: 1 x 1
#> .
#> <dbl>
#> 1 1
Created on 2019-02-15 by the reprex package (v0.2.1.9000) |
|
I still feel like there's time to just make three dots the standard suffix. I really think we should. |
|
@jennybc: Why three dots? Why not just one? |
|
Well, |
|
Just to get it all written out, the reason to reconsider the "disambiguating numeric suffix" is that, in cases where the name was missing, it becomes the entire name. And even though we don't promise to make names syntactic when |
|
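To make the "suffix becomes the entire name" point concrete, here is a minimal base R sketch. `repair_with_suffix()` is a hypothetical helper written for illustration only; it is not tibble's implementation, and it assumes the input contains no `NA` names. It appends a separator plus the column position to every empty or duplicated name.

```r
# Hypothetical helper, NOT tibble's implementation: append `sep` plus the
# column position to every name that is empty or duplicated.
# Assumes `nms` is a character vector without NA values.
repair_with_suffix <- function(nms, sep) {
  needs_suffix <- nms == "" | duplicated(nms) | duplicated(nms, fromLast = TRUE)
  nms[needs_suffix] <- paste0(nms[needs_suffix], sep, which(needs_suffix))
  nms
}

nms <- c("a", "", "", "q")

# With two dots, an empty name turns into "..2", "..3", ...
two_dots <- repair_with_suffix(nms, sep = "..")
two_dots
#> [1] "a"   "..2" "..3" "q"

# With three dots, the same positions become "...2", "...3", ...
three_dots <- repair_with_suffix(nms, sep = "...")
three_dots
#> [1] "a"    "...2" "...3" "q"
```

When the original name is empty, whatever the separator and number produce is the whole name, so the choice of separator decides whether the result can be used as an ordinary variable name.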
Recording a bit more motivation ... We are happy for our automatically repaired names to be ugly, because that helps create awareness that the user is working with names created by someone else. But in many cases, part of the reason people are oblivious -- and can safely remain so -- is that they were never going to use those weird variable names anyway. But the mere presence of non-syntactic names can cause friction in unexpected places, such as |
|
I hear you, and I agree now that changing the default suffix is the best way to handle that problem in the short-medium term. The problem with I'll implement the changes and see how many revdep failures we have. If we can't fix all downstream failures in the very short term, I'd rather add a compatibility mode. |
Can you spell this out more? It's so easy to overlook weird stuff and edge cases in this naming domain, that I want to make sure I'm holding the important facts in my head. |
|
It's a number. That's why
.1 <- 5
#> Error in 0.1 <- 5: invalid (do_set) left-hand side to assignment
a <- .1
a
#> [1] 0.1
Created on 2019-02-18 by the reprex package (v0.2.1.9000) |
|
OK so you mean it's a syntactic expression but not a syntactic name. |
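The distinction between a syntactic expression and a syntactic name can be checked with base R's parser. The following is a small sketch using only `parse()`, `is.numeric()` and `is.name()`; the variable names `expr_one_dot` and `expr_three_dots` are made up for this example.

```r
# `.1` is parsed as the numeric constant 0.1, so it is an expression,
# but not something that can appear on the left of an assignment.
expr_one_dot <- parse(text = ".1")[[1]]
is.numeric(expr_one_dot)
#> [1] TRUE
is.name(expr_one_dot)
#> [1] FALSE

# `...1` is parsed as a symbol, so it behaves like any other variable name.
expr_three_dots <- parse(text = "...1")[[1]]
is.name(expr_three_dots)
#> [1] TRUE

...1 <- 5
...1 + 1
#> [1] 6
```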
|
Yes. I'm already sold on the idea that we should use three dots. What do you think about converting |
Well, this is already addressed for
tibble:::unique_names("")
#> New names:
#> * `` -> `..1`
#> [1] "..1"
Created on 2019-02-18 by the reprex package (v0.2.1.9000) |
|
We got this "for free" with the motivating principle for |
|
Nice, I wasn't aware of that. If we also expand |
|
Or perhaps we always append a number to |
|
My feelings about But I'm not sure that it's worth special casing it for |
But I do like the idea of having some operational principle whereby each level of repair has this sort of practical significance. |
|
So how about always converting |
|
If it were up to me, I wouldn't do it. Or I would treat it like But I can let it go if you feel strongly. |
|
Coming here from community.rstudio.com, I assume the new behaviour means that instead of
x <- tibble::as_tibble(list(1), .name_repair = "unique")
#> New names:
#> * `` -> `..1`
dplyr::rename(x, a = `..1`)
#> Error in .f(.x[[i]], ...): ..1 used in an incorrect context, no ... to look in
there will be this?
x <- tibble::as_tibble(list(1), .name_repair = "unique")
#> New names:
#> * `` -> `...1`
dplyr::rename(x, a = `...1`)
#> # A tibble: 1 x 1
#> a
#> <dbl>
#> 1 1
(BTW, this originated for me as a problem with readxl). |
|
@dpprdan Yeah, exactly. Yes, readxl is also fueling my interest in both name repair and this specific issue thread, as Excel seems to be the R world's main supplier of horrifically-named tibbles. |
|
@krlmlr Does this summarize our current status?
|
|
Agreed. FWIW, users shouldn't rely on these auto-generated names by default. One way to rename is by position:
library(tidyverse)
data <-
list(a = 1, 2, 3, q = 4) %>%
as_tibble(.name_repair = "minimal")
data
#> # A tibble: 1 x 4
#> a `` `` q
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3 4
data %>%
as_tibble(.name_repair = "unique") %>%
rename(b = 2, c = 3)
#> New names:
#> * `` -> `..2`
#> * `` -> `..3`
#> # A tibble: 1 x 4
#> a b c q
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3 4
Created on 2019-02-22 by the reprex package (v0.2.1.9000) |
Agreed. In the medium term, I'd like to add an article in readxl on this. And it will definitely contain the advice that if your downstream code has hard-wired variable names, you need to be intentional at import time. Many names won't need repair but if they do and you refer to them by name later, you should choose
I think our main goal with this change to the disambiguating suffix is to prevent some mysterious problems, like we've seen with the scoped dplyr verbs. In that example, the user is not making any explicit name use, but it turns out it matters because of an implementation detail in the scoped verbs. |
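As a rough illustration of being intentional at import time, one could flag names that would need attention before any downstream code refers to them. This is a sketch in base R only; `non_syntactic()` is a hypothetical helper, and it leans on `make.names()`, which applies base R's rules rather than tibble's repair logic.

```r
# Hypothetical helper: return the names that base R's make.names() would
# change, as a rough proxy for "this name is not syntactic".
non_syntactic <- function(nms) {
  nms[nms != make.names(nms)]
}

flagged <- non_syntactic(c("a", "", "my col", "...1"))
flagged
#> [1] ""       "my col"
```

Under this check a three-dot name such as `...1` passes, while empty names and names with spaces are flagged.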
- Three dots are used even for `"unique"` name repair (#566).
|
I think we should go ahead and use `...j` as the standard suffix when `.name_repair = "unique"`. These names don't have to be syntactic but it's so easy to make them so (they are already ugly and have lots of dots) and it will reduce the downstream burden of dealing with non-syntactic names (example: tidyverse/dplyr#4094).
This would be good to do ASAP I think in a tiny release. What do you think @krlmlr?