
# summarise() does not correctly coerce factors with different levels #1678

Closed
opened this issue Mar 1, 2016 · 2 comments

### helix123 commented Mar 1, 2016

This is in response to #1556, using the latest dev version. I expected the output for group b-1 (2nd row) to be all 0's on the summarised variables, but the factor columns (`any1.0`, `any2.0`, `any3.0`) show 1.

```
library(dplyr)
df <- data.frame(grp  = c("a","a","a","b","b","b"),
                 grp2 = c("2","2","2","2","1","1"),
                 fac  = c("1","1","1","1","0","0"))

dplyr::summarise(dplyr::group_by(df, grp, grp2),
                 mean   = mean(as.numeric(fac)-1),
                 sum1   = sum(as.numeric(fac)-1),
                 sum2   = sum(fac == 1),
                 any1.0 = factor(ifelse(any(fac == 1), 1, 0)),
                 any1.1 = ifelse(any(fac == 1), 1, 0),
                 any2.0 = factor(if(any(fac == 1)) 1 else 0),
                 any2.1 = if(any(fac == 1)) 1 else 0,
                 any3.0 = factor(if(any(fac == factor(1, levels = c(0,1)))) 1 else 0),
                 any3.1 = if(any(fac == factor(1, levels = c(0,1)))) 1 else 0)

Source: local data frame [3 x 11]
Groups: grp [?]

     grp   grp2  mean  sum1  sum2 any1.0 any1.1 any2.0 any2.1 any3.0 any3.1
  (fctr) (fctr) (dbl) (dbl) (int) (fctr)  (dbl) (fctr)  (dbl) (fctr)  (dbl)
1      a      2     1     3     3      1      1      1      1      1      1
2      b      1     0     0     0      1      0      1      0      1      0
3      b      2     1     1     1      1      1      1      1      1      1
```

```
Session info -----------------------------------------------------------
 setting  value
 version  R version 3.2.3 (2015-12-10)
 system   x86_64, mingw32
 ui       RStudio (0.99.875)
 language (EN)
 collate  German_Germany.1252
 tz       Europe/Berlin
 date     2016-03-01

Packages ---------------------------------------------------------------
 package      * version    date       source
 assertthat     0.1        2013-12-06 CRAN (R 3.2.2)
 curl           0.9.6      2016-02-17 CRAN (R 3.2.3)
 DBI            0.3.1      2014-09-24 CRAN (R 3.2.2)
 devtools       1.10.0     2016-01-23 CRAN (R 3.2.3)
 digest         0.6.9      2016-01-08 CRAN (R 3.2.3)
 dplyr        * 0.4.3.9000 2016-03-01 Github (hadley/dplyr@7d4e0ba)
 git2r          0.13.1     2015-12-10 CRAN (R 3.2.3)
 httr           1.1.0      2016-01-28 CRAN (R 3.2.3)
 knitr          1.12.3     2016-01-22 CRAN (R 3.2.3)
 lazyeval       0.1.10     2015-01-02 CRAN (R 3.2.2)
 magrittr       1.5        2014-11-22 CRAN (R 3.2.2)
 memoise        1.0.0      2016-01-29 CRAN (R 3.2.3)
 nycflights13 * 0.1        2014-07-22 CRAN (R 3.2.3)
 R6             2.1.2      2016-01-26 CRAN (R 3.2.3)
 Rcpp           0.12.3     2016-01-10 CRAN (R 3.2.3)
 rstudioapi     0.5        2016-01-24 CRAN (R 3.2.3)
 withr          1.0.1      2016-02-04 CRAN (R 3.2.3)
```

### 1va commented Mar 3, 2016

You could specify the levels of the if/else results to prevent wrong or unexpected concatenation of factors with different levels:

`factor(ifelse(any(fac == 1), 1, 0), levels = c(0, 1))`

But I agree that the expected behaviour would be for the levels to be handled automatically, similar to `unlist(list(factor(0), factor(1)))`.

Compare `c(factor(0), factor(1))` and `c(factor(0, levels = c(0, 1)), factor(1, levels = c(0, 1)))`.

Title changed from "summarise with factors" to "summarise() does not correctly coerce factors with different levels" on Mar 8, 2016.
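The comparison suggested above can be made concrete in base R by looking at the integer codes behind each factor. This is only a sketch: the variable names (`f0`, `g0`, `codes_separate`, and so on) are illustrative and not from the thread, and it inspects the codes with `as.integer()` rather than calling `c()` on factors, since the result of `c()` on factors depends on the R version.

```r
# Two single-level factors: each stores its only value as integer code 1.
f0 <- factor(0)
f1 <- factor(1)
codes_separate <- c(as.integer(f0), as.integer(f1))  # 1 1

# With the levels fixed up front, the codes tell the two values apart.
g0 <- factor(0, levels = c(0, 1))
g1 <- factor(1, levels = c(0, 1))
codes_shared <- c(as.integer(g0), as.integer(g1))    # 1 2

# unlist() on a list of factors takes the union of the level sets.
combined <- unlist(list(f0, f1))
levels(combined)        # "0" "1"
as.character(combined)  # "0" "1"
```

With separate level sets the codes are indistinguishable, which is why fixing `levels = c(0, 1)` in each group works as a workaround.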

### hadley commented Mar 8, 2016

Minimal reprex:

```
data_frame(x = 1:2) %>%
  group_by(x) %>%
  summarise(
    y = if(x == 1) "a" else "b",
    z = factor(y)
  )
```
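The symptom in the reprex is consistent with per-group integer codes being stacked under the first group's levels. That mechanism is an assumption, not something confirmed in this thread. The base R sketch below imitates it with a hypothetical `per_group` list standing in for the two group results, then contrasts it with combining by value.

```r
# Stand-in for the per-group results of z = factor(y) in the reprex
# (hypothetical; dplyr's internals are not shown in the thread).
per_group <- list(factor("a"), factor("b"))

# Assumed failure mode: stack the raw codes and reuse the first group's levels.
codes <- vapply(per_group, as.integer, integer(1))   # 1 1
naive <- factor(levels(per_group[[1]])[codes])       # both rows become "a"

# Combining by value instead keeps both values and both levels.
by_value <- factor(vapply(per_group, as.character, character(1)))
levels(by_value)  # "a" "b"
```

Combining on the character values, or on a shared level set, is the behaviour the issue title asks for.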