spread when id column has names #525

Closed
wangyuchen opened this issue Dec 28, 2018 · 2 comments
Labels: reprex (needs a minimal reproducible example)

Comments

@wangyuchen

tidyr/R/id.R

Line 41 in cbdd14e

if (!is_null(attr(x, "n")) && !drop) return(x)

I think you'd want to add exact = TRUE here so that only an attribute named exactly "n" matches.

Otherwise, when drop = FALSE, attr() falls back to partial matching, so any attribute whose name starts with "n" can satisfy the check. I happened to have a named vector in my data frame, and it was passed into id_var(). The check matched its "names" attribute, so the variable was returned as-is, without the "n" attribute being added.
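The partial matching described above can be reproduced in base R without tidyr. This is only a minimal sketch: the two helper functions are hypothetical stand-ins for the condition on the quoted line (they are not tidyr's code), and base is.null() is used in place of is_null().

```r
# Named numeric vector, like the id column in the report
x <- c(x = 1, y = 2, z = 3)

# Default lookup: no attribute is called "n", so attr() falls back to a
# partial match and returns the "names" attribute instead
attr(x, "n")
#> [1] "x" "y" "z"

# Exact lookup: only an attribute called exactly "n" would match
attr(x, "n", exact = TRUE)
#> NULL

# Hypothetical helpers mirroring the early-return condition
returns_early_loose <- function(x, drop) !is.null(attr(x, "n")) && !drop
returns_early_exact <- function(x, drop) !is.null(attr(x, "n", exact = TRUE)) && !drop

returns_early_loose(x, drop = FALSE)  # TRUE: names are mistaken for "n"
returns_early_exact(x, drop = FALSE)  # FALSE: the vector would be processed
```

With the exact lookup, a named vector no longer looks like an already-processed id, which is the behaviour the suggested change is after.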

@wangyuchen changed the title from "spread when id columns has names" to "spread when id column has names" on Dec 28, 2018
@hadley added the "reprex" label (needs a minimal reproducible example) on Jan 4, 2019
@hadley
Member

hadley commented Jan 4, 2019

Can you please provide a minimal reprex (reproducible example)? That will help us create a unit test to make sure we fix the bug.

@wangyuchen
Author

Yes, of course. Please see if this works.

library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.4.4
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.4.4
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# named vector
id_col <- c(x = 1, y = 2, z = 3)

# the key factor has levels ("d", "e") that do not appear in the data,
# so spread() needs drop = FALSE to keep them as columns
spread_df <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3,
             id = id_col)  # data.frame() drops the names attribute (names become row names)
spread_df
#>   key out id
#> x   a   1  1
#> y   b   2  2
#> z   c   3  3

# both work fine
spread_df %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df %>%  
  spread(key, out, drop = FALSE) 
#>   id  a  b  c  d  e
#> 1  1  1 NA NA NA NA
#> 2  2 NA  2 NA NA NA
#> 3  3 NA NA  3 NA NA


spread_df2 <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3) %>% 
  mutate(id = id_col)  # mutate() preserves the names attribute

spread_df2 %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df2 %>%  
  spread(key, out, drop = FALSE) 
#> Error: Result 1 is not a length 1 atomic vector

Created on 2019-01-04 by the reprex package (v0.2.1)
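As a possible stopgap on affected versions, the names could be stripped from the id column before spreading, for example with mutate(id = unname(id_col)). That should leave the column in the same state as in spread_df, which worked in the reprex above, though this is an untested assumption rather than a confirmed fix. The base R sketch below only shows that unname() removes the attribute that triggers the partial match:

```r
# Named vector from the reprex
id_col <- c(x = 1, y = 2, z = 3)

# Same key/out columns as in the reprex, built without the id column
df <- data.frame(key = factor(1:3, 1:5, letters[1:5]),
                 out = 1:3)

# Strip the names before storing the id column
df$id <- unname(id_col)

# With no "names" attribute there is nothing for "n" to partially match
attr(df$id, "n")
#> NULL
```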

Ryo-N7 added a commit to Ryo-N7/tidyr that referenced this issue Jan 19, 2019
romainfrancois added a commit to romainfrancois/tidyr that referenced this issue Feb 5, 2019
@hadley hadley closed this as completed in 0b27690 Feb 5, 2019