Column silently dropped when its name conflicts with value col name in gather.data.frame() #347

Closed
jarodmeng opened this issue Aug 21, 2017 · 2 comments
Labels
bug an unexpected problem or unintended behavior pivoting ♻️ pivot rectangular data to different "shapes"

Comments

@jarodmeng

gather.data.frame() computes which columns to gather using this snippet.

  quos <- quos(...)
  if (is_empty(quos)) {
    gather_vars <- setdiff(names(data), c(key_var, value_var))
  } else {
    gather_vars <- unname(tidyselect::vars_select(names(data), !!! quos))
  }

When the user doesn't provide a ... argument, the columns to gather are computed as every column that isn't named key or value. This works as long as no existing column has the same name as value. If one does, that column is silently left out of the gathering and kept as-is, so the result ends up with two columns of the same name.
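To make the selection step concrete, here is a minimal base R sketch of just the `setdiff()` call from the empty-`...` branch. The column names come from the reprex; using "Y" as the value name is the assumption that triggers the clash. It only mirrors the name computation, not the rest of gather().

```r
# Column names of the data frame from the reprex.
data_names <- c("X", "Y", "Z")

# Key and value names as the user would pass them; "Y" clashes on purpose.
key_var <- "name"
value_var <- "Y"

# Same expression as the empty-`...` branch: everything that is not
# named like the key or the value gets gathered.
gather_vars <- setdiff(data_names, c(key_var, value_var))
gather_vars
#> [1] "X" "Z"

# The existing column that was excluded only because of its name.
clash <- intersect(data_names, c(key_var, value_var))
clash
#> [1] "Y"
```

Y never makes it into gather_vars, which is why it is carried through untouched next to the new value column.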

I've produced a simple reprex below to illustrate.

packageVersion("tidyr")
#> [1] '0.7.0'

library(tidyr)
library(tibble)

XYZ <- data.frame(
  X = rnorm(2, 0, 1),
  Y = rnorm(2, 0, 2),
  Z = rnorm(2, 0, 4)
)

gather(XYZ, name, value)
#>   name      value
#> 1    X  0.1051179
#> 2    X  1.4617993
#> 3    Y -0.9121167
#> 4    Y  2.5378557
#> 5    Z -2.1496032
#> 6    Z -2.4815070

gather(XYZ, name, Y)
#>            Y name          Y
#> 1 -0.9121167    X  0.1051179
#> 2  2.5378557    X  1.4617993
#> 3 -0.9121167    Z -2.1496032
#> 4  2.5378557    Z -2.4815070
@hadley hadley added bug an unexpected problem or unintended behavior pivoting ♻️ pivot rectangular data to different "shapes" labels Nov 15, 2017
@hadley
Member

hadley commented Nov 16, 2017

Slightly more minimal reprex, along with behaviour for tibbles.

library(tidyr)

XYZ <- data.frame(
  X = 1,
  Y = 1,
  Z = 2
)

XYZ %>% gather(key = "name", value = "Y")
#>   Y name Y
#> 1 1    X 1
#> 2 1    Z 2
XYZ %>% tibble::as.tibble() %>% gather(key = "name", value = "Y")
#> Error: Column `Y` must have a unique name

Created on 2017-11-16 by the reprex package (0.1.1.9000).

@hadley
Member

hadley commented Nov 16, 2017

Basically the same issue as #255, but for gather(), which uses a slightly different code path.
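One possible way to surface the problem instead of letting it pass silently is to check for a name clash before the columns to gather are computed. The sketch below is only an illustration in base R: `check_gather_names()` is a hypothetical helper, not a tidyr function and not necessarily how the issue was fixed. It assumes the check is meant for the case where no ... argument is given.

```r
# Hypothetical helper, not part of tidyr: fail early if the requested
# key or value name already exists among the data's column names.
check_gather_names <- function(data_names, key_var, value_var) {
  clash <- intersect(c(key_var, value_var), data_names)
  if (length(clash) > 0) {
    stop(
      "Column name(s) already present in data: ",
      paste0("`", clash, "`", collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# No clash: returns invisibly.
ok <- check_gather_names(c("X", "Y", "Z"), "name", "value")

# Clash on `Y`: capture the error message instead of stopping the script.
msg <- tryCatch(
  check_gather_names(c("X", "Y", "Z"), "name", "Y"),
  error = function(e) conditionMessage(e)
)
msg
#> [1] "Column name(s) already present in data: `Y`"
```

With a check like this, data frames would fail loudly in the same situation where tibbles already do, rather than returning two columns named Y.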
