repair_names() feature requests #217
I've just started running readxl's output through
But I noticed that
x <- list(1:3, var2 = letters[1:3], c(1, 2, 3) + 0.1, var2 = letters[26:24])
names(x) <- NA
tibble::repair_names(tibble::as_data_frame(x, validate = FALSE))
#> # A tibble: 3 × 4
#>      V1  var2    V2 var21
#>   <int> <chr> <dbl> <chr>
#> 1     1     a   1.1     z
#> 2     2     b   2.1     y
#> 3     3     c   3.1     x

readr::read_csv(",var2,,var2\n1,'a',1.1,'z'\n2,'b',2.1,'y'\n3,'c',3.1,'x'")
#> Warning: Missing column names filled in: 'X1' , 'X3'
#> Warning: Duplicated column names deduplicated: 'var2' => 'var2_1'
#> # A tibble: 3 × 4
#>      X1  var2    X3 var2_1
#>   <int> <chr> <dbl>  <chr>
#> 1     1   'a'   1.1    'z'
#> 2     2   'b'   2.1    'y'
#> 3     3   'c'   3.1    'x'
Feature requests, some inspired by readr:
The first one is for a better interactive experience. The others are aimed at programmatic work with tibbles that may have been subjected to name repair.
Yes, it certainly helps.
It feels like the tidyverse should have a preferred remedy for nonexistent and duplicated column names. And tibble is currently the logical place to implement that. So I'd still like to reach a consensus across packages.
What do you think of exporting
Perhaps my suggestion about leading underscores is not so bright. It creates non-syntactic names. I still think the convention should be something easily detectable via regex, i.e. unlikely to be present in original column names.
x <- list(1:3, var2 = letters[1:3], c(1, 2, 3) + 0.1, var2 = letters[26:24])
names(x) <- NA
tibble::repair_names(tibble::as_data_frame(x, validate = FALSE), prefix = "__X", sep = "__")
#> # A tibble: 3 × 4
#>   `__X__1`  var2 `__X__2` var2__1
#>      <int> <chr>    <dbl>   <chr>
#> 1        1     a      1.1       z
#> 2        2     b      2.1       y
#> 3        3     c      3.1       x
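To make the "easily detectable via regex" point concrete, here is a minimal sketch of how downstream code might spot repaired names. It assumes the `"__X"` prefix and `"__"` separator from the example above, and that neither contains regex metacharacters. `classify_repair()` is a hypothetical helper for illustration, not something tibble exports.

```r
# Hypothetical helper (not part of tibble): label each column name by how it
# appears to have been repaired, assuming the prefix/sep convention above.
classify_repair <- function(nms, prefix = "__X", sep = "__") {
  # Any repaired name ends in the separator followed by digits.
  has_suffix <- grepl(paste0(sep, "[0-9]+$"), nms)
  # Filled-in names additionally start with the prefix plus separator.
  filled <- has_suffix & startsWith(nms, paste0(prefix, sep))
  ifelse(filled, "filled", ifelse(has_suffix, "deduplicated", "original"))
}

classify_repair(c("__X__1", "var2", "__X__2", "var2__1"))
#> [1] "filled"       "original"     "filled"       "deduplicated"
```

The check is only as reliable as the convention: an original column that happens to end in `__<digits>` would be misclassified, which is why the marker should be unlikely to occur in real data.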
Moving towards a consensus on what should happen in the current example:
Or we could decide that whenever we incorporate a number, it refers to the absolute column position.
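As a rough illustration of the absolute-position idea, the sketch below repairs names in base R. The helper name `repair_by_position()`, the `"X"` prefix, the `"_"` separator, and the choice to leave the first occurrence of a duplicated name untouched are all assumptions made for this example. None of them is tibble's or readr's actual behaviour.

```r
# Hypothetical helper (not part of tibble): every number that ends up in a
# repaired name is the absolute position of that column.
repair_by_position <- function(nms, prefix = "X", sep = "_") {
  pos <- seq_along(nms)
  # Treat both NA and "" as missing names.
  missing_name <- is.na(nms) | nms == ""
  # duplicated() flags the second and later occurrences only;
  # missing names are handled separately.
  dup <- duplicated(nms) & !missing_name
  out <- nms
  out[missing_name] <- paste0(prefix, pos[missing_name])
  out[dup] <- paste0(nms[dup], sep, pos[dup])
  out
}

repair_by_position(c("", "var2", NA, "var2"))
#> [1] "X1"     "var2"   "X3"     "var2_4"
```

With this rule the suffix on a deduplicated name says where the column sits, not how many duplicates came before it. The sketch does not check whether a repaired name collides with an existing one.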
Even more consistent? Drop the
Now every name that needs repair simply gets
I particularly like @jennybc's idea of simply appending the absolute position, without using
I think we should:
I'm not sure what we should do with respect to backward compatibility. I suspect few packages depend on this behaviour; it's more likely to affect user code. I think as long as there's a clear way to get back to the old behaviour, it shouldn't cause much hassle (especially since we're now describing what's happening).