new_tibble() is slow for lots of repeated use #471
Comments
Thanks. One obvious way to shave off more than half of the slowdown is to unclass the argument. The rest of the time is spent on checks which are currently mandatory:
# from #501
pkgload::load_all("~/git/R/tibble")
#> Loading tibble
x <- as_tibble(setNames(map(as.list(letters), "[", c(1, 1, 1)), LETTERS))
bench::mark(
  tibble = new_tibble(x, nrow = 3, subclass = "my_class"),
  tibble_list = new_tibble(unclass(x), nrow = 3, subclass = "my_class"),
  new_valid_tibble = new_valid_tibble(unclass(x), .nrow = 3, .subclass = "my_class"),
  set_tibble_class = set_tibble_class(unclass(x), .nrow = 3, .subclass = "my_class"),
  base = structure(x, row.names = .set_row_names(3), class = c("my_class", "tbl_df", "tbl", "data.frame"))
)
#> # A tibble: 5 x 10
#>   expression      min     mean   median      max `itr/sec` mem_alloc  n_gc
#>   <chr>      <bch:tm> <bch:tm> <bch:tm> <bch:tm>     <dbl> <bch:byt> <dbl>
#> 1 tibble      551.8µs 757.31µs    640µs   2.14ms     1320.     2.3KB    12
#> 2 tibble_li… 123.66µs 155.37µs 139.65µs 482.87µs     6436.    1.25KB    14
#> 3 new_valid…  85.92µs 106.42µs  96.58µs 367.57µs     9397.    1.25KB    14
#> 4 set_tibbl…   6.36µs   8.71µs   7.54µs  83.84µs   114829.      768B     3
#> 5 base         5.03µs    5.9µs   5.64µs  31.84µs   169607.      512B     3
#> # … with 2 more variables: n_itr <int>, total_time <bch:tm>
Created on 2018-10-13 by the reprex package (v0.2.1.9000)
@hadley: Do we want to support tibble constructors with fewer checks (perhaps as arguments to |
I think we should refactor What is |
Made my day. |
So the constructor doesn't even check correctness of names? Just copy attributes and set row names and class? Currently, |
I think the general rule is that constructor should check types/lengths and nothing else, in order to be as performant as possible. If we can check names very quickly, I think it's ok to include, but generally the constructor is a developer facing function that should only be used when you know that the values are correct. |
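To make that split concrete, here is a rough base-R sketch of a cheap constructor paired with a separate, slower validator. The names `new_my_class()` and `validate_my_class()` are hypothetical and are not part of tibble; the constructor body mirrors the `base` row of the benchmark above, and the name checks are only an assumption about what a validator might cover.

```r
# Hypothetical low-level constructor: checks types and lengths of its
# arguments only, then sets the attributes directly.
new_my_class <- function(x, nrow) {
  stopifnot(is.list(x), is.numeric(nrow), length(nrow) == 1L)
  structure(
    unclass(x),
    row.names = .set_row_names(nrow),
    class = c("my_class", "tbl_df", "tbl", "data.frame")
  )
}

# Hypothetical validator: the more expensive checks live here, so callers
# who already know their input is correct can skip them.
validate_my_class <- function(x) {
  nms <- names(x)
  if (is.null(nms) || anyNA(nms) || any(nms == "") || anyDuplicated(nms) > 0L) {
    stop("Columns must have unique, non-empty names.", call. = FALSE)
  }
  n <- .row_names_info(x, 2L)  # number of rows recorded in row.names
  if (any(lengths(unclass(x)) != n)) {
    stop("All columns must have length ", n, ".", call. = FALSE)
  }
  x
}

cols <- list(a = 1:3, b = c("x", "y", "z"))
obj <- validate_my_class(new_my_class(cols, nrow = 3))
class(obj)
```

A package that builds many objects in a loop could call only the constructor there and run the validator once at its user-facing boundary.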
new_tibble() makes it easy to create a new subclass, but its speed may deter downstream packages from using it when it is called many times.
Created on 2018-08-25 by the reprex package (v0.2.0).