Column silently dropped when its name conflicts with value col name in gather.data.frame() #347

Closed
jarodmeng opened this Issue Aug 21, 2017 · 2 comments

jarodmeng commented Aug 21, 2017

gather.data.frame() works out which columns to gather with the following snippet:

  quos <- quos(...)
  if (is_empty(quos)) {
    gather_vars <- setdiff(names(data), c(key_var, value_var))
  } else {
    gather_vars <- unname(tidyselect::vars_select(names(data), !!! quos))
  }

When the user supplies no ... argument, the columns to gather default to every column whose name is neither key nor value. This works as long as no existing column has the same name as value. If one does, that column is silently excluded from the gathering, and the output ends up with both the original column and a new value column of the same name.
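The default branch can be reproduced in isolation with base R only. This is a minimal sketch, not tidyr's code: `data_names` is a stand-in for `names(data)`, and the key and value names are the ones used in the reprex that follows.

```r
# Stand-in for names(data); the other two names mirror the snippet above.
data_names <- c("X", "Y", "Z")
key_var    <- "name"
value_var  <- "Y"   # collides with an existing column

# Same expression as the is_empty(quos) branch.
gather_vars <- setdiff(data_names, c(key_var, value_var))
gather_vars
#> [1] "X" "Z"
```

`Y` never makes it into `gather_vars`, so it is left in the data untouched while a second column named `Y` is created for the gathered values.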

I've produced a simple reprex below to illustrate.

packageVersion("tidyr")
#> [1] '0.7.0'

library(tidyr)
library(tibble)

XYZ <- data.frame(
  X = rnorm(2, 0, 1),
  Y = rnorm(2, 0, 2),
  Z = rnorm(2, 0, 4)
)

gather(XYZ, name, value)
#>   name      value
#> 1    X  0.1051179
#> 2    X  1.4617993
#> 3    Y -0.9121167
#> 4    Y  2.5378557
#> 5    Z -2.1496032
#> 6    Z -2.4815070

gather(XYZ, name, Y)
#>            Y name          Y
#> 1 -0.9121167    X  0.1051179
#> 2  2.5378557    X  1.4617993
#> 3 -0.9121167    Z -2.1496032
#> 4  2.5378557    Z -2.4815070

hadley commented Nov 16, 2017

Slightly more minimal reprex, along with behaviour for tibbles.

library(tidyr)

XYZ <- data.frame(
  X = 1,
  Y = 1,
  Z = 2
)

XYZ %>% gather(key = "name", value = "Y")
#>   Y name Y
#> 1 1    X 1
#> 2 1    Z 2
XYZ %>% tibble::as.tibble() %>% gather(key = "name", value = "Y")
#> Error: Column `Y` must have a unique name

Created on 2017-11-16 by the reprex package (0.1.1.9000).

hadley commented Nov 16, 2017

Basically the same issue as #255, but for gather(), which uses a slightly different code path.
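One way such a collision could be caught early is to compare the new column names against the columns that are not being gathered. The sketch below is only an illustration under that assumption: `check_gather_names()` is a hypothetical helper, not part of tidyr, and it is not necessarily how the issue was resolved.

```r
# Hypothetical helper (not tidyr API): fail loudly when the new key/value
# names would duplicate a column that stays in the output.
check_gather_names <- function(data_names, gather_vars, key_var, value_var) {
  # Columns that are not gathered remain in the result as-is.
  remaining <- setdiff(data_names, gather_vars)
  clash <- intersect(c(key_var, value_var), remaining)
  if (length(clash) > 0) {
    stop("New column name(s) already present in data: ",
         paste(clash, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# No clash: every column is gathered, new names are unused.
check_gather_names(c("X", "Y", "Z"), c("X", "Y", "Z"), "name", "value")

# Clash: Y is left ungathered and is also requested as the value name.
res <- tryCatch(
  check_gather_names(c("X", "Y", "Z"), c("X", "Z"), "name", "Y"),
  error = function(e) conditionMessage(e)
)
res
#> [1] "New column name(s) already present in data: Y"
```

Checking against the remaining columns rather than all columns keeps calls valid where the clashing column is itself gathered and therefore disappears from the output.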