summarize dropping attributes of columns #1237

@renkun-ken

I'm working on a new package, formattable, that adds formatting to vectors and data frames for friendlier printing. It uses attributes to store metadata such as formatting rules.

I tested it with data.table, and everything works fine in the latest dev version now that Rdatatable/data.table#1160 is fixed. Some earlier versions of dplyr did not seem to support atomic vectors with custom classes (e.g. a formattable numeric) as table columns. The current version accepts such columns, but summarize does not seem to preserve their attributes:

> library(dplyr)
> library(formattable)
> df <- data.frame(id = 1:10, ret = percent(rnorm(10, 0.1, 0.1)))
> df
   id    ret
1   1 22.82%
2   2  6.74%
3   3 16.15%
4   4 -5.82%
5   5  2.27%
6   6 12.12%
7   7 -5.73%
8   8 16.96%
9   9 -2.90%
10 10  9.13%
> df %>% summarize(ret = mean(ret))
         ret
1 0.07174314

By contrast, filter, arrange and group_by do not drop the attributes of ret:

> df %>% filter(ret >= mean(ret))
  id    ret
1  1 22.82%
2  3 16.15%
3  6 12.12%
4  8 16.96%
5 10  9.13%
> df %>% arrange(ret)
   id    ret
1   4 -5.82%
2   7 -5.73%
3   9 -2.90%
4   5  2.27%
5   2  6.74%
6  10  9.13%
7   6 12.12%
8   3 16.15%
9   8 16.96%
10  1 22.82%
> df %>% group_by(group = id %% 3)
Source: local data frame [10 x 3]
Groups: group

   id    ret group
1   1 22.82%     1
2   2  6.74%     2
3   3 16.15%     0
4   4 -5.82%     1
5   5  2.27%     2
6   6 12.12%     0
7   7 -5.73%     1
8   8 16.96%     2
9   9 -2.90%     0
10 10  9.13%     1
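
A minimal sketch of how the drop can be checked directly, rather than inferred from the printed output, along with a possible stopgap. It assumes only the calls already used above plus base R's class() and attributes(); the seed and the names out and fixed are introduced here purely for illustration. Re-applying percent() after summarizing is a workaround, not a fix for the underlying behaviour.

```r
library(dplyr)
library(formattable)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
df <- data.frame(id = 1:10, ret = percent(rnorm(10, 0.1, 0.1)))

# Class and attributes of the column before summarizing
class(df$ret)
attributes(df$ret)

out <- df %>% summarize(ret = mean(ret))

# If summarize dropped the attributes, this is a plain numeric
class(out$ret)
attributes(out$ret)

# Stopgap (illustrative): re-apply the formatter to the summarized column
fixed <- out
fixed$ret <- percent(fixed$ret)
class(fixed$ret)
```

Comparing class() before and after shows whether the metadata survived, independent of how the data frame happens to print.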

My session info:

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5     formattable_0.0.16.1 dplyr_0.4.2         

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6     digest_0.6.8    assertthat_0.1  mime_0.3        chron_2.3-47   
 [6] R6_2.0.1        xtable_1.7-4    DBI_0.3.1       magrittr_1.5    lazyeval_0.1.10
[11] tools_3.2.1     htmlwidgets_0.5 markdown_0.7.7  shiny_0.12.1    httpuv_1.3.2   
[16] parallel_3.2.1  htmltools_0.2.6 knitr_1.10.5   
