map* functions are slow when .f is a string #820
I reported basically the same issue (#749) some time ago (although I didn't compare to base).

library(purrr)
x <- lapply(1:10000, function(name) list(a = "A"))
pluck_impl <- purrr:::pluck_impl
pluck2 <- function(.x, index, .default = NULL) {
  .Call(
    pluck_impl,
    x = .x,
    index = index,
    missing = .default,
    strict = FALSE
  )
}
microbenchmark::microbenchmark(
  purrr = map_chr(x, 'a'),
  purrr2 = map_chr(x, pluck2, index = list('a')),
  base = vapply(x, `[[`, 'a', FUN.VALUE = "")
)
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> purrr 23.668865 25.666429 27.828165 27.157205 29.682467 36.531673 100
#> purrr2 9.491911 10.316131 11.255159 11.192060 12.000602 14.780147 100
#> base 2.400959 2.751333 3.118299 3.052053 3.271311 5.760484 100

Created on 2021-04-07 by the reprex package (v2.0.0)

I think the root cause is that

library(purrr)
x <- lapply(1:10000, function(.) list(a = "A"))
bench::mark(
  pluck(x, 500, "a"),
  x[[500]][["a"]],
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 pluck(x, 500, "a") 2.88µs 3.21µs 300832. 7.77KB 30.1
#> 2 x[[500]][["a"]] 250ns 293ns 3106710. 0B 0

Created on 2022-08-27 by the reprex package (v2.0.1)

I'm not sure how much we can do here given that

OTOH most of the time we call

With changes in #899:

library(purrr)
x <- lapply(1:1000, function(.) list(a = "A"))
bench::mark(
  map_chr(x, 'a'),
  map_chr(x, `[[`, "a"),
  vapply(x, `[[`, 'a', FUN.VALUE = "")
)
#> # A tibble: 3 × 6
#> expression min median `itr/sec` mem_alloc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt>
#> 1 map_chr(x, "a") 1.11ms 1.16ms 838. 60.16KB
#> 2 map_chr(x, `[[`, "a") 553.25µs 593.62µs 1643. 194.49KB
#> 3 vapply(x, `[[`, "a", FUN.VALUE = "") 219.29µs 230.46µs 4266. 7.86KB
#> # … with 1 more variable: `gc/sec` <dbl>

Created on 2022-08-28 by the reprex package (v2.0.1)

So still not amazing, but twice as fast is nothing to sneeze at. I'm a little surprised that

For example, map_chr() is 10x slower than vapply().

A bit slower would be OK, but this is way slower than the base alternative. For my particular use case, I was surprised to find that a single map_chr call was the most expensive part of my code (taking about 78% of the time); after switching to vapply, it was much faster.
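
For anyone hitting the same bottleneck, here is a minimal sketch of the kind of switch described above, using base R only. The data is made up to mirror the benchmarks in this thread, and `extract_chr()` is a hypothetical helper (not part of purrr or base R) that roughly approximates what `map_chr(x, "a", .default = NA_character_)` does for a single name.

```r
# Made-up data mirroring the benchmarks in this thread.
x <- lapply(1:1000, function(i) list(a = "A"))

# Base R stand-in for map_chr(x, "a"): pull element "a" out of every item
# and require each result to be a length-one character vector.
res <- vapply(x, `[[`, FUN.VALUE = character(1), "a")

# Hypothetical helper: substitute a default when the element is missing,
# instead of letting vapply() fail on a NULL result.
extract_chr <- function(x, name, default = NA_character_) {
  vapply(x, function(el) {
    val <- el[[name]]
    if (is.null(val)) default else val
  }, FUN.VALUE = character(1))
}

# The third item has no "a", so the default is used for it.
y <- c(x[1:2], list(list(b = "B")))
extract_chr(y, "a")
```

The plain `vapply()` call is the fastest option in the benchmarks above, but it only covers extraction by a single name; the nested indexing and default handling that a string `.f` gives you in purrr have to be written by hand, as in the helper.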