spread when id column has names #525

Closed
wangyuchen opened this issue Dec 28, 2018 · 2 comments

@wangyuchen wangyuchen commented Dec 28, 2018

tidyr/R/id.R

Line 41 in cbdd14e

if (!is_null(attr(x, "n")) && !drop) return(x)

I think you'd want to add exact = TRUE here to match attribute "n" exactly.

Otherwise, when drop = FALSE, this condition can be triggered by accident: attr() partially matches attribute names by default, so "n" can match another attribute whose name starts with "n". In my case a column in my data frame was a named vector. When it was passed into id_var(), "n" matched the "names" attribute and the variable was returned as-is, without the "n" attribute being added.
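
The partial matching described above can be reproduced in base R alone, without tidyr. This is a minimal sketch using the same named vector as the report; it only illustrates how `attr()` behaves with and without `exact = TRUE` and is not the tidyr code itself.

```r
# A named numeric vector: its only attribute is "names".
id_col <- c(x = 1, y = 2, z = 3)

# Default lookup (exact = FALSE): "n" partially matches "names".
partial <- attr(id_col, "n")

# Exact lookup: only an attribute literally called "n" would match.
exact <- attr(id_col, "n", exact = TRUE)

print(is.null(partial))  # FALSE: the names were returned
print(is.null(exact))    # TRUE: there is no attribute called "n"
```

With the default lookup, a check such as `!is_null(attr(x, "n"))` therefore passes for any named vector, even though no "n" attribute was ever set.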

@wangyuchen wangyuchen changed the title from "spread when id columns has names" to "spread when id column has names" Dec 28, 2018
@hadley hadley added the reprex label Jan 4, 2019

@hadley hadley (Member) commented Jan 4, 2019

Can you please provide a minimal reprex (reproducible example)? That will help us create a unit test to make sure we fix the bug.

@wangyuchen wangyuchen (Author) commented Jan 4, 2019

Yes of course. Please see if this works.

library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.4.4
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.4.4
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# named vector
id_col <- c(x = 1, y = 2, z = 3)

# the key factor has levels (d, e) that do not appear in the data,
# so spread() needs drop = FALSE to keep them as columns
spread_df <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3,
             id = id_col)  # name attribute will be dropped
spread_df
#>   key out id
#> x   a   1  1
#> y   b   2  2
#> z   c   3  3

# both work fine
spread_df %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df %>%  
  spread(key, out, drop = FALSE) 
#>   id  a  b  c  d  e
#> 1  1  1 NA NA NA NA
#> 2  2 NA  2 NA NA NA
#> 3  3 NA NA  3 NA NA


spread_df2 <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3) %>% 
  mutate(id = id_col)  # the name attribute will be preserved in mutate

spread_df2 %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df2 %>%  
  spread(key, out, drop = FALSE) 
#> Error: Result 1 is not a length 1 atomic vector

Created on 2019-01-04 by the reprex package (v0.2.1)
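
If the failure really comes only from the "names" attribute on the id column, as the report suggests, then stripping the names before calling `spread()` may serve as a stopgap. The sketch below is base R only and shows just the stripping step. Whether it avoids the error in a given tidyr version is an assumption that has not been checked here.

```r
# Named vector, as in the reprex above.
id_col <- c(x = 1, y = 2, z = 3)

# unname() keeps the values but removes the "names" attribute.
id_plain <- unname(id_col)

# With no attributes left, the partial lookup for "n" finds nothing.
print(is.null(attr(id_plain, "n")))
print(is.null(attributes(id_plain)))
```

In a pipeline this would correspond to something like `mutate(id = unname(id))` before `spread()`, again only as an untested suggestion.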

Ryo-N7 added a commit to Ryo-N7/tidyr that referenced this issue Jan 19, 2019
romainfrancois added a commit to romainfrancois/tidyr that referenced this issue Feb 5, 2019
@hadley hadley closed this in 0b27690 Feb 5, 2019