When binding many one-row tibbles, vec_c() is 20% to 40% faster than vec_rbind(). I would have expected vec_rbind() to be faster, since row-binding seems to be its main purpose.
library(vctrs)

row_list1 <- vec_rep(vec_chop(mtcars), 1e3)
row_list10 <- vec_rep(vec_chop(mtcars), 10e3)
ptype <- vec_ptype(row_list1[[1]])

bench::mark(
  vec_c1 = vec_c(!!!row_list1, .ptype = ptype),
  vec_rbind1 = vec_rbind(!!!row_list1, .ptype = ptype),
  check = TRUE,
  iterations = 3
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 vec_c1        151ms    217ms      4.70    8.47MB     6.26
#> 2 vec_rbind1    161ms    208ms      4.74    7.49MB     7.90

bench::mark(
  vec_c10 = vec_c(!!!row_list10, .ptype = ptype),
  vec_rbind10 = vec_rbind(!!!row_list10, .ptype = ptype),
  check = TRUE,
  iterations = 3
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 vec_c10        1.81s    2.04s     0.507    87.7MB     1.01
#> 2 vec_rbind10    2.65s    2.72s     0.364    71.8MB     1.34
For my own use case, I have many one-row tibbles, and I would like to call vec_rbind() internally in a package (cf. wlandau/crew#123). The package ensures that all names are already consistent and correct, so I do not need any name checking or name repair. On my machine, the fastest supported name repair option accounts for 50-60% of the execution time. It would be great to be able to disable name processing completely and cut out that overhead.
packageVersion("data.table")
#> [1] ‘1.14.8’
packageVersion("vctrs")
#> [1] ‘0.6.3’

result <- crew:::monad_tibble(crew::crew_eval(12))
list <- replicate(1e6, result, simplify = FALSE)

system.time(data.table::rbindlist(list, use.names = FALSE))
#>    user  system elapsed
#>   0.924   0.014   0.940

system.time(vctrs::vec_rbind(list, .name_repair = "universal_quiet"))
#>    user  system elapsed
#>   1.338   0.061   1.400
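Until name processing can be switched off, one possible workaround is to combine the rows column by column and assemble the result with new_data_frame(), which performs no name checking or repair. The sketch below is only an illustration: bind_rows_trusted() is a hypothetical helper, not part of vctrs, and it assumes every input has exactly one row with identical column names, order, and types. Whether it is actually faster than vec_rbind() would need to be benchmarked for the data at hand.

```r
library(vctrs)

# Hypothetical helper (not part of vctrs): bind one-row data frames column by
# column. Assumes all inputs share the same column names, order, and types,
# and that each input has exactly one row.
bind_rows_trusted <- function(rows) {
  nms <- names(rows[[1]])
  cols <- lapply(nms, function(nm) {
    # Pull this column out of every row and combine the values.
    vec_c(!!!lapply(rows, `[[`, nm))
  })
  names(cols) <- nms
  # new_data_frame() builds the data frame without checking or repairing names.
  new_data_frame(cols, n = length(rows))
}

rows <- vec_chop(mtcars)  # list of one-row data frames
out <- bind_rows_trusted(rows)
```

Because the helper trusts its inputs, a mismatch in column names or order between rows would not be detected, so it only suits callers that already guarantee consistency.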
Created on 2021-10-09 by the reprex package (v2.0.1)