speed up collector() helper #657

Merged
merged 3 commits into from
Mar 31, 2023

simonpcouch
Contributor

This helper takes up nearly all of the time in collect_predictions(). tune will see some speedup here, but stacks::add_candidates() benefits especially.

With the main dev version:

library(tidymodels)

res <- 
  fit_resamples(
    linear_reg(), 
    mpg ~ ., 
    bootstraps(mtcars),
    control = control_grid(save_pred = TRUE)
  )

bench::mark(
  predictions = tune:::collector(res, ".predictions"),
  metrics = tune:::collector(res, ".metrics"),
  check = FALSE
)
#> # A tibble: 2 × 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 predictions   3.12ms    3.4ms      283.   621.4KB     6.24
#> 2 metrics       3.02ms   3.35ms      298.    36.5KB     6.22

With this PR:

#> # A tibble: 2 × 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 predictions    433µs    445µs     2236.    75.7KB     6.18
#> 2 metrics        328µs    337µs     2960.    12.8KB     8.26
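
For a quick read of the two tables, the ratios can be worked out directly. This is only arithmetic on the medians and allocations reported above (with 3.4ms and 3.35ms converted to microseconds); the data frame and column names are made up for illustration.

```r
# Median timings (microseconds) and allocations (KB) copied from the
# two bench::mark() tables above.
before <- data.frame(
  expression = c("predictions", "metrics"),
  median_us = c(3400, 3350),
  mem_kb = c(621.4, 36.5)
)
after <- data.frame(
  expression = c("predictions", "metrics"),
  median_us = c(445, 337),
  mem_kb = c(75.7, 12.8)
)

# Ratio of old to new: values above 1 mean the PR is faster or leaner.
speedup <- data.frame(
  expression = before$expression,
  time_ratio = before$median_us / after$median_us,
  mem_ratio = before$mem_kb / after$mem_kb
)
speedup
```

On these numbers the median time drops by roughly 7.6x for predictions and 9.9x for metrics, and memory allocation by roughly 8.2x and 2.9x.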

R/collect.R Outdated

res <-
vctrs::vec_cbind(
vctrs::list_unchop(x[[coll_col]]),
Contributor Author

As used in tune, coll_col is always either .metrics or .predictions, i.e. it always has length one.

R/collect.R Outdated
res <-
vctrs::vec_cbind(
vctrs::list_unchop(x[[coll_col]]),
vctrs::vec_rep_each(x[, id_cols], times = vctrs::list_sizes(x[[coll_col]]))
Contributor Author

Because of the earlier select() step, the only other columns are the id columns.
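
To make the pattern in the diff concrete, here is a minimal sketch on a toy tibble. The data, the single `id` column, and the object names are assumptions for illustration and not tune's actual result structure. Only the `vec_cbind()` / `list_unchop()` / `vec_rep_each()` / `list_sizes()` combination comes from the diff.

```r
library(vctrs)
library(tibble)

# Toy stand-in for a tuning result: one id column plus a list-column of
# per-resample data frames (hypothetical data, not produced by tune).
x <- tibble(
  id = c("Bootstrap1", "Bootstrap2"),
  .predictions = list(
    tibble(.pred = c(21.0, 22.5, 19.8), .row = 1:3),
    tibble(.pred = c(18.2, 24.1), .row = 4:5)
  )
)

coll_col <- ".predictions"  # a single column name, as noted in the review
id_cols <- "id"

res <- vec_cbind(
  # Row-bind the nested data frames into one data frame.
  list_unchop(x[[coll_col]]),
  # Repeat each row of ids once per row of its nested data frame,
  # so the ids line up with the unchopped rows.
  vec_rep_each(x[, id_cols], times = list_sizes(x[[coll_col]]))
)

res
```

Here `list_sizes()` gives 3 and 2, so the first id is repeated three times and the second twice, yielding five rows with columns `.pred`, `.row`, and `id`. The rows should match what `tidyr::unnest(x, .predictions)` returns, apart from column order.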

Co-authored-by: Davis Vaughan <davis@rstudio.com>
@github-actions

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2023