Description
After updating to the most recent version of the package (tibble 1.4.1), I noticed that (a) the new console output is great, and (b) printing is substantially slower for tibbles with ~50 or more columns. Besides being slower overall, printing hangs between the tabular data preview and the list of columns that were left out of it.
This reprex uses the gapminder data to make a tibble with 1 row and 711 columns. I exaggerated the number of columns to make the slowdown reproducible on machines with better specs than my middling i5 with 8 GB of RAM.
load packages
library(gapminder)
library(tidyverse)
#-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
# v ggplot2 2.2.1 v purrr 0.2.4
# v tibble 1.4.1 v dplyr 0.7.4
# v tidyr 0.7.2 v stringr 1.2.0
# v readr 1.1.1 v forcats 0.2.0
make the test data
tst_tibble <- gapminder %>%
  # change the year filter to add or subtract columns
  # from the final tibble
  filter(year < 1975) %>%
  unite(loc_yr, continent, country, year) %>%
  select(loc_yr, lifeExp) %>%
  spread(loc_yr, lifeExp)
tst_df <- as.data.frame(tst_tibble)
simple timing
system.time(print(tst_tibble))
# user system elapsed
# 8.24 0.00 8.28
system.time(print(tst_df))
# user system elapsed
# 0.55 0.00 0.56
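A single `system.time()` call is noisy, so a slightly more robust comparison repeats the measurement and takes the median. The sketch below is only an illustration: `time_print()` is a hypothetical helper, not part of tibble, and the wide data frame is synthetic (711 columns, the count reported above) so the block runs without gapminder. Output is captured rather than shown, which may skip some terminal styling, so absolute numbers can differ from an interactive console.

```r
# Hypothetical helper (not part of tibble): median elapsed seconds of
# print(x) over `times` runs, with the printed output captured and discarded.
time_print <- function(x, times = 5) {
  elapsed <- vapply(seq_len(times), function(i) {
    system.time(utils::capture.output(print(x)))[["elapsed"]]
  }, numeric(1))
  stats::median(elapsed)
}

# Synthetic stand-in for the test data: 1 row, 711 columns.
wide_df <- as.data.frame(matrix(stats::runif(711), nrow = 1))
df_time <- time_print(wide_df)

# Only time the tibble version if tibble is installed.
if (requireNamespace("tibble", quietly = TRUE)) {
  tbl_time <- time_print(tibble::as_tibble(wide_df))
  print(c(data.frame = df_time, tibble = tbl_time))
}
```

With the reprex objects loaded, the same helper can be called as `time_print(tst_tibble)` and `time_print(tst_df)`.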
conclusions
Obviously tibble does more work to print its output than a data.frame does, but the ~15x jump in time seems much larger than it was in previous versions, and larger than it should be given the output that actually appears on screen. Unfortunately I don't have time right now to downgrade tibble or to test the timing more rigorously, but I'll check later and update.
My only hypothesis is that tibble applies its print formatting to all the columns, including the hidden ones, before it truncates the output and sends it to the console, but I don't know enough to tell whether that's the case.
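One rough way to probe this hypothesis is to time `print()` on progressively wider subsets of the same data. The visible part of the output is about the same once the columns no longer fit the console, so a print time that keeps growing with the column count would be consistent with the hidden columns being processed too. This is a sketch under assumptions, not a diagnosis: `time_print_cols()` is a hypothetical helper, the data is synthetic, and a single timing per width is noisy.

```r
# Hypothetical helper: elapsed seconds to print the first n columns of x.
time_print_cols <- function(x, n) {
  sub <- x[, seq_len(n), drop = FALSE]
  system.time(utils::capture.output(print(sub)))[["elapsed"]]
}

# Synthetic 1-row, 711-column data; converted to a tibble if tibble is installed.
wide <- as.data.frame(matrix(stats::runif(711), nrow = 1))
if (requireNamespace("tibble", quietly = TRUE)) {
  wide <- tibble::as_tibble(wide)
}

# Time printing at several widths and collect the results in a table.
n_cols <- c(10, 50, 100, 200, 400, 711)
timings <- data.frame(
  n_cols = n_cols,
  elapsed = vapply(n_cols, function(n) time_print_cols(wide, n), numeric(1))
)
timings
```

If `elapsed` flattens out once the columns stop fitting on screen, the hypothesis looks less likely; if it keeps climbing roughly in proportion to `n_cols`, that points towards per-column work on columns that are never displayed.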
session info
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.2.0 stringr_1.2.0 dplyr_0.7.4 purrr_0.2.4 readr_1.1.1 tidyr_0.7.2
[7] tibble_1.4.1 ggplot2_2.2.1 tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 cellranger_1.1.0 pillar_1.0.1 compiler_3.4.3 plyr_1.8.4
[6] bindr_0.1 tools_3.4.3 lubridate_1.7.1 jsonlite_1.5 nlme_3.1-131
[11] gtable_0.2.0 lattice_0.20-35 pkgconfig_2.0.1 rlang_0.1.6 psych_1.7.8
[16] cli_1.0.0 rstudioapi_0.7 yaml_2.1.16 parallel_3.4.3 haven_1.1.0
[21] bindrcpp_0.2 xml2_1.1.1 httr_1.3.1 hms_0.4.0 grid_3.4.3
[26] glue_1.2.0 R6_2.2.2 readxl_1.0.0 foreign_0.8-69 modelr_0.1.1
[31] reshape2_1.4.3 magrittr_1.5 scales_0.5.0 rvest_0.3.2 assertthat_0.2.0
[36] mnormt_1.5-5 colorspace_1.3-2 stringi_1.1.6 lazyeval_0.2.1 munsell_0.4.3
[41] broom_0.4.3 crayon_1.3.4