slice() could be made faster when the arguments in ... are expensive to compute by evaluating ... only once per group. In the example below where the argument to slice takes one second to compute and there are two groups, the code takes ~4 seconds to run instead of ~2.
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.1.2
library(data.table, warn.conflicts = FALSE)
dt <- data.table(id = c(1, 2))
return_1 <- function() {Sys.sleep(1); 1}
library(tictoc)
tic()
dt %>%
lazy_dt() %>%
group_by(id) %>%
slice(return_1())
#> Source: local data table [2 x 1]
#> Groups: id
#> Call: `_DT1`[`_DT1`[, .I[return_1()[between(return_1(), -.N, .N)]],
#> by = .(id)]$V1]
#>
#> id
#> <dbl>
#> 1 1
#> 2 2
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
toc()
#> 4.074 sec elapsed
tic()
dt[dt[,
{
slice_ind <- return_1()
.I[slice_ind[between(slice_ind, -.N, .N)]]
},
by = .(id)
]$V1]
#> id
#> 1: 1
#> 2: 2
toc()
#> 2.014 sec elapsed
Created on 2022-07-19 by the reprex package (v2.0.1)
slice()could be made faster when the arguments in...are expensive to compute by evaluating...only once per group. In the example below where the argument to slice takes one second to compute and there are two groups, the code takes ~4 seconds to run instead of ~2.Created on 2022-07-19 by the reprex package (v2.0.1)