Option to disable result caching in bench::mark() #58

Closed
MarcusKlik opened this issue Sep 10, 2019 · 4 comments
Labels
feature a feature request or enhancement

Comments

@MarcusKlik

Thanks for providing an excellent package; bench's extended set of measurements, compared to other benchmarking packages, is extremely useful.

I have a feature request for bench::mark(). When using it to time a number of expressions that each require a significant amount of RAM, it would be convenient to have an option to disable the caching of results. Caching a potentially large number of large objects quickly eats up the available memory, which limits benchmarking of e.g. large vectors:

nr_of_ints <- 1e8

res <- bench::mark(
  integer(nr_of_ints),
  integer(nr_of_ints),
  max_iterations = 10)

# 2 times 400 MB
object.size(res)
#> 800014016 bytes
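
The "2 times 400 MB" figure can be checked with a rough calculation. This is only a sketch: it assumes 4 bytes per R integer and one stored result per benchmarked expression, and it ignores the small per-object overhead; `bytes_per_int`, `n_expressions` and `expected_bytes` are names made up for illustration.

```r
# Rough estimate of the memory held by the cached results above.
nr_of_ints <- 1e8
bytes_per_int <- 4      # assumption: R integers are stored as 32-bit values
n_expressions <- 2      # assumption: one cached result per expression

expected_bytes <- nr_of_ints * bytes_per_int * n_expressions
expected_bytes / 1e6    # size in MB
#> [1] 800
```

The remainder of the reported 800014016 bytes is overhead from the result table itself.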

I realize that there is a workaround: wrapping each expression in a custom function that returns a small result:

nr_of_ints <- 1e8

fx <- function(nr_of_ints) {
  integer(nr_of_ints)
  TRUE
}

res <- bench::mark(
  fx(nr_of_ints),
  fx(nr_of_ints),
  max_iterations = 10)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.

# small
object.size(res)
#> 99464 bytes

With this workaround, the chance of introducing additional garbage collections during benchmarking increases (and these are also measured). So that seems like a less elegant solution :-)

Would it be an idea to skip caching the results when check = FALSE?

thanks and all the best!

@jimhester
Member

The garbage collections are happening at the function boundary; you can avoid this by not creating a function.

res <- bench::mark(
  a = { integer(nr_of_ints); NULL },
  b = { integer(nr_of_ints); NULL },
  max_iterations = 10)

But not keeping the results when check = FALSE might also be OK.

@jimhester jimhester added the feature a feature request or enhancement label Sep 10, 2019
@MarcusKlik
Author

MarcusKlik commented Sep 11, 2019

Hi @jimhester,

great, thanks for the workaround!

The option to disable result caching would be convenient but from your earlier comment I understand that the focus of bench::mark() is on smaller datasets and for those such an option is not really relevant, so I guess there is something to be said for both approaches...

@jimhester
Member

As of fce1e23 setting check = FALSE also disables the storage of the results.
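
A minimal sketch of how this might be used, assuming an installed bench version that includes that commit. The vector length `n` is a smaller, made-up value chosen to keep the example light, and the labels `a` and `b` are arbitrary.

```r
# Benchmark two allocations without keeping their results.
n <- 1e6  # hypothetical size; the original report used 1e8

res <- bench::mark(
  a = integer(n),
  b = integer(n),
  check = FALSE,        # skips the equality check and, per the comment above, result storage
  max_iterations = 10
)

# If the results are not stored, this should be well below 2 * 4 * n bytes.
object.size(res)
```

On versions of bench that predate the change, the same call would presumably still keep both result vectors in `res`.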

@MarcusKlik
Author

great, thanks for adding the feature!
