Renaming with select() creates duplicate variables with a grouped dataframe #5841

Closed
woodtho opened this issue Apr 7, 2021 · 0 comments · Fixed by #5842

woodtho commented Apr 7, 2021

Using select() to rename a variable to the name of the existing grouping variable produces a data frame with two variables of the same name. A message reports that the grouping variable was added, but no error about duplicated variable names is raised until the variable is referenced.

I'm not sure whether this is the expected behavior, but it seems odd to me, especially since using mutate() to create a variable with the same name as the existing grouping variable successfully replaces the original grouping variable.

library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.0.4

(df <- tribble(
  ~prov,            ~value, ~prov_fr,
  "Ontario",         10,    "Ontario",
  "Quebec",          20,    "Québec",
  "Rest of Canada",  30,    "Reste du Canada"
) %>%
    group_by(prov) %>% 
    select(prov = prov_fr, value))
#> Adding missing grouping variables: `prov`
#> # A tibble: 3 x 3
#> # Groups:   prov [3]
#>   prov           prov            value
#>   <chr>          <chr>           <dbl>
#> 1 Ontario        Ontario            10
#> 2 Quebec         Québec             20
#> 3 Rest of Canada Reste du Canada    30

df %>% select(prov)
#> Error: Names must be unique.
#> x These names are duplicated:
#>   * "prov" at locations 1 and 2.


tribble(
  ~prov,            ~value, ~prov_fr,
  "Ontario",         10,    "Ontario",
  "Quebec",          20,    "Québec",
  "Rest of Canada",  30,    "Reste du Canada"
) %>%
  group_by(prov) %>% 
  mutate(prov = "a")
#> # A tibble: 3 x 3
#> # Groups:   prov [1]
#>   prov  value prov_fr        
#>   <chr> <dbl> <chr>          
#> 1 a        10 Ontario        
#> 2 a        20 Québec         
#> 3 a        30 Reste du Canada

Created on 2021-04-07 by the reprex package (v0.3.0)
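
A possible workaround, sketched here as an assumption rather than as the intended fix: dropping the grouping before the rename means select() has no grouping variable to add back, so only one `prov` column remains. The name `fixed` is illustrative, and regrouping at the end is only needed if the grouping is still wanted (it then groups by the renamed column).

```r
library(dplyr)

df <- tibble::tribble(
  ~prov,            ~value, ~prov_fr,
  "Ontario",         10,    "Ontario",
  "Quebec",          20,    "Québec",
  "Rest of Canada",  30,    "Reste du Canada"
) %>%
  group_by(prov)

# Remove the grouping first so select() does not re-add the old `prov`,
# then rename prov_fr to prov and regroup on the renamed column.
fixed <- df %>%
  ungroup() %>%
  select(prov = prov_fr, value) %>%
  group_by(prov)

names(fixed)
```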

romainfrancois self-assigned this Apr 8, 2021
romainfrancois added the bug (an unexpected problem or unintended behavior) label Apr 8, 2021
romainfrancois added this to the 1.0.6 milestone Apr 8, 2021
romainfrancois added a commit that referenced this issue Apr 19, 2021
* select() not creating duplicate variable on grouped data frames.

closes #5841

* explain

* Apply suggestions from code review

Co-authored-by: Hadley Wickham <h.wickham@gmail.com>

* merge tests, and snapshot

Co-authored-by: Hadley Wickham <h.wickham@gmail.com>