spread when id column has names #525

wangyuchen · 2018-12-28T22:40:19Z

Line 41 in cbdd14e

if (!is_null(attr(x, "n")) && !drop) return(x)

I think you'd want to add exact = TRUE here to match attribute "n" exactly.

Otherwise, when drop = FALSE, this check can give a false positive, because attr() partially matches attribute names by default. In my case I had a named vector in my data frame, and it was passed into id_var(). The "n" matched its "names" attribute, so the variable was returned as-is without the "n" attribute being added.
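To illustrate the partial match, here is a minimal base-R sketch. The helper looks_processed() is a hypothetical stand-in for the check quoted above, not tidyr's actual code; it only shows how the result changes with exact = TRUE.

```r
# A named vector, like the id column in the failing case.
id_col <- c(x = 1, y = 2, z = 3)

# Default (exact = FALSE): "n" partially matches the only attribute, "names".
partial <- attr(id_col, "n")

# exact = TRUE: only an attribute literally called "n" would match.
strict <- attr(id_col, "n", exact = TRUE)

# Hypothetical stand-in for the early-return condition (assumption, not tidyr code).
looks_processed <- function(x, exact) {
  !is.null(attr(x, "n", exact = exact))
}

print(partial)
print(strict)
print(looks_processed(id_col, exact = FALSE))
print(looks_processed(id_col, exact = TRUE))
```

With the default matching, the named vector is mistaken for one that already carries an "n" attribute; with exact = TRUE it is not.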


hadley · 2019-01-04T14:15:12Z

Can you please provide a minimal reprex (reproducible example)? That will help us create a unit test to make sure we fix the bug.

wangyuchen · 2019-01-04T22:15:13Z

Yes of course. Please see if this works.

library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.4.4
library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.4.4
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# named vector
id_col <- c(x = 1, y = 2, z = 3)

# the key factor has levels (d, e) that do not appear in the data,
# so drop = FALSE is needed to keep them when spreading
spread_df <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3,
             id = id_col)  # name attribute will be dropped
spread_df
#>   key out id
#> x   a   1  1
#> y   b   2  2
#> z   c   3  3

# both work fine
spread_df %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df %>%  
  spread(key, out, drop = FALSE) 
#>   id  a  b  c  d  e
#> 1  1  1 NA NA NA NA
#> 2  2 NA  2 NA NA NA
#> 3  3 NA NA  3 NA NA


spread_df2 <- 
  data.frame(key = factor(1:3, 1:5, letters[1:5]),
             out = 1:3) %>% 
  mutate(id = id_col)  # the name attribute will be preserved in mutate

spread_df2 %>%  
  spread(key, out, drop = TRUE) 
#>   id  a  b  c
#> 1  1  1 NA NA
#> 2  2 NA  2 NA
#> 3  3 NA NA  3

spread_df2 %>%  
  spread(key, out, drop = FALSE) 
#> Error: Result 1 is not a length 1 atomic vector

Created on 2019-01-04 by the reprex package (v0.2.1)


wangyuchen changed the title ~~spread when id columns has names~~ spread when id column has names Dec 28, 2018

hadley added the "reprex" (needs a minimal reproducible example) label Jan 4, 2019

Ryo-N7 added a commit to Ryo-N7/tidyr that referenced this issue Jan 19, 2019

fixes tidyverse#525

163918e

romainfrancois added a commit to romainfrancois/tidyr that referenced this issue Feb 5, 2019

spread() works fine when the id variable has names.

08e124c

closes tidyverse#525

hadley closed this as completed in 0b27690 Feb 5, 2019
