gather() should possibly generate warnings or errors when key and value columns overlap with column in input #496

Closed
JohnMount opened this issue Sep 21, 2018 · 3 comments

Comments

@JohnMount

commented Sep 21, 2018

It seems like some of the examples below should generate warnings or errors.

df <- data.frame(
  x = 1:3, 
  y = 4:6, 
  choice = c("x", "y", "x"), 
  stringsAsFactors = FALSE)
  
tidyr::gather(df, "k", "v")


    ##        k v
    ## 1      x 1
    ## 2      x 2
    ## 3      x 3
    ## 4      y 4
    ## 5      y 5
    ## 6      y 6
    ## 7 choice x
    ## 8 choice y
    ## 9 choice x


tidyr::gather(df, "k", "x")


    ##        k x
    ## 1      y 4
    ## 2      y 5
    ## 3      y 6
    ## 4 choice x
    ## 5 choice y
    ## 6 choice x


tidyr::gather(df, "x", "v")


    ##        x v
    ## 1      y 4
    ## 2      y 5
    ## 3      y 6
    ## 4 choice x
    ## 5 choice y
    ## 6 choice x


tidyr::gather(df, "x", "y")


    ##        x y
    ## 1 choice x
    ## 2 choice y
    ## 3 choice x


tidyr::gather(df, "x", "x")


    ##        x x
    ## 1      y 4
    ## 2      y 5
    ## 3      y 6
    ## 4 choice x
    ## 5 choice y
    ## 6 choice x
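
The kind of check being asked for could be sketched as follows. This is only an illustration in base R: `check_gather_names()` is a hypothetical helper, not part of tidyr, and whether a clash should be a warning or an error is an assumption made here for the sake of the example.

```r
# Hypothetical helper, not part of tidyr: report clashes between the
# requested key/value names and the columns already present in `data`.
check_gather_names <- function(data, key, value) {
  # Assumption: identical key and value names are treated as an error.
  if (identical(key, value)) {
    stop("`key` and `value` must have different names: ", key, call. = FALSE)
  }
  # Assumption: a name that already exists in the input only warns.
  clashes <- intersect(c(key, value), names(data))
  if (length(clashes) > 0) {
    warning(
      "Output column name(s) already present in the input: ",
      paste(clashes, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(clashes)
}

df <- data.frame(
  x = 1:3,
  y = 4:6,
  choice = c("x", "y", "x"),
  stringsAsFactors = FALSE
)

# No clash: returns an empty character vector invisibly.
check_gather_names(df, "k", "v")

# Clash with the existing column `x`: capture the warning text for display.
msg <- tryCatch(
  check_gather_names(df, "k", "x"),
  warning = function(w) conditionMessage(w)
)
print(msg)
```

A real implementation inside `gather()` would also have to decide whether a clash still matters when the clashing column is itself one of the gathered columns, which is the case in every example above.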
@JohnMount JohnMount changed the title gather() should generate warnings or errors when key and value columns overlap with column in input gather() should possibly generate warnings or errors when key and value columns overlap with column in input Sep 21, 2018
@hadley

Member

commented Jan 4, 2019

More minimal reprex:

``` r
library(tidyr)
df <- tibble::tibble(
  x = 1:2, 
  y = 4:5, 
  choice = c("x", "y"), 
)
  
gather(df, "k", "v")
#> # A tibble: 6 x 2
#>   k      v    
#>   <chr>  <chr>
#> 1 x      1    
#> 2 x      2    
#> 3 y      4    
#> 4 y      5    
#> 5 choice x    
#> 6 choice y
gather(df, "k", "x")
#> # A tibble: 4 x 2
#>   k      x    
#>   <chr>  <chr>
#> 1 y      4    
#> 2 y      5    
#> 3 choice x    
#> 4 choice y
gather(df, "x", "v")
#> # A tibble: 4 x 2
#>   x      v    
#>   <chr>  <chr>
#> 1 y      4    
#> 2 y      5    
#> 3 choice x    
#> 4 choice y
gather(df, "x", "y")
#> # A tibble: 2 x 2
#>   x      y    
#>   <chr>  <chr>
#> 1 choice x    
#> 2 choice y
gather(df, "x", "x")
#> Error: Column name `x` must not be duplicated.
#> Use .name_repair to specify repair.
```

Created on 2019-01-04 by the reprex package (v0.2.1)

@hadley

Member

commented Mar 3, 2019

I need to resolve this problem before I can check the other results:

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

df <- tibble::tibble(
  x = c(1, 2), 
  y = c(10, 20), 
  z = c(100, 200), 
)

# Values are incorrect
df %>% pivot(pivot_spec_long(df, everything()))
#> # A tibble: 6 x 2
#>   variable value
#>   <chr>    <dbl>
#> 1 x            1
#> 2 y            2
#> 3 z           10
#> 4 x           20
#> 5 y          100
#> 6 z          200

Created on 2019-03-03 by the reprex package (v0.2.1.9000)
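
For comparison, here is a base R sketch of the pairing one would presumably expect for this input. It assumes the `variable` column above is in the intended order (names cycling within each input row) and that only the `value` column is wrong; `expected` is just an illustrative name.

```r
df <- data.frame(
  x = c(1, 2),
  y = c(10, 20),
  z = c(100, 200)
)

# Walk the data row by row: for each input row, emit one
# (variable, value) pair per column.
expected <- data.frame(
  variable = rep(names(df), times = nrow(df)),
  # t() puts one input row in each matrix column, so flattening the
  # transposed matrix yields the values in row-by-row order.
  value = as.vector(t(as.matrix(df))),
  stringsAsFactors = FALSE
)

print(expected)
```

In this layout `x` is paired with 1 and 2, `y` with 10 and 20, and `z` with 100 and 200, whereas the output above pairs row-ordered names with column-ordered values.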

@hadley hadley added this to the v1.0.0 milestone Mar 3, 2019
@hadley

Member

commented Mar 3, 2019

These now all appear to return reasonable results:

library(tidyr)
df <- tibble::tibble(
  x = 1:2, 
  y = 4:5, 
  choice = 6:7, 
)
  
# gather(df, "k", "v")
pivot(df, pivot_spec_long(df, 1:3, "v", "k"))
#> # A tibble: 6 x 2
#>   k          v
#>   <chr>  <int>
#> 1 x          1
#> 2 y          2
#> 3 choice     4
#> 4 x          5
#> 5 y          6
#> 6 choice     7

# gather(df, "k", "x")
pivot(df, pivot_spec_long(df, 1:3, "x", "k"))
#> # A tibble: 6 x 2
#>   k          x
#>   <chr>  <int>
#> 1 x          1
#> 2 y          2
#> 3 choice     4
#> 4 x          5
#> 5 y          6
#> 6 choice     7

# gather(df, "x", "v")
pivot(df, pivot_spec_long(df, 1:3, "v", "x"))
#> # A tibble: 6 x 2
#>   x          v
#>   <chr>  <int>
#> 1 x          1
#> 2 y          2
#> 3 choice     4
#> 4 x          5
#> 5 y          6
#> 6 choice     7

# gather(df, "x", "y")
pivot(df, pivot_spec_long(df, 1:3, "y", "x"))
#> # A tibble: 6 x 2
#>   x          y
#>   <chr>  <int>
#> 1 x          1
#> 2 y          2
#> 3 choice     4
#> 4 x          5
#> 5 y          6
#> 6 choice     7

# gather(df, "x", "x")
pivot(df, pivot_spec_long(df, 1:3, "x", "x"))
#> New names:
#> x -> x..1
#> x -> x..2
#> # A tibble: 6 x 2
#>   x..1    x..2
#>   <chr>  <int>
#> 1 x          1
#> 2 y          2
#> 3 choice     4
#> 4 x          5
#> 5 y          6
#> 6 choice     7

Created on 2019-03-03 by the reprex package (v0.2.1.9000)
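
The `pivot()` and `pivot_spec_long()` calls above come from a development version of tidyr. As far as I know, the long-format function that shipped in tidyr 1.0.0 is `pivot_longer()`, so a rough equivalent of the first call might look like the sketch below. It assumes tidyr 1.0.0 or later is installed, and the variable name `long` is only illustrative.

```r
library(tidyr)

df <- tibble::tibble(
  x = 1:2,
  y = 4:5,
  choice = 6:7
)

# Roughly gather(df, "k", "v"): pivot every column into name/value pairs.
long <- pivot_longer(
  df,
  cols = everything(),
  names_to = "k",
  values_to = "v"
)

print(long)
```

The other cases in this issue can be tried the same way by changing `names_to` and `values_to`.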

@hadley hadley closed this Mar 3, 2019
hadley added a commit that referenced this issue Mar 3, 2019