Using rlang with memoise #79
Bumping this, |
We figured out a nice hack for some use scenarios, just in case anyone lands here. We used
Example is below:
We ignore
The |
I don't think there's a general solution to this problem. memoise takes the (evaluated) input values, combines them, and hashes them; the resulting value is used as a key. This isn't going to play well with nonstandard evaluation, where you want to use the unevaluated expression as the key.

To illustrate, here's a normal function, which prints a string representation of the input. Each time it executes, it also prints "Running f()":

library(rlang)
library(memoise)
f <- function(x) {
  message("Running f()")
  paste("The input value was:", x)
}
a <- 10
f(a + 1)
#> Running f()
#> [1] "The input value was: 11"
a <- 20
f(a + 1)
#> Running f()
#> [1] "The input value was: 21"

Now let's memoize it. When there's a cache hit, it will return the same value, but it will not print out "Running f()".

fm <- memoise(f)
a <- 10
fm(a + 1)
#> Running f()
#> [1] "The input value was: 11"
fm(a + 1) # Will have a cache hit, so won't print "Running f()".
#> [1] "The input value was: 11"
a <- 20
fm(a + 1)
#> Running f()
#> [1] "The input value was: 21"

When it sees the input value of

Now, here's a function with non-standard evaluation. It uses enexpr() to capture the unevaluated input expression:

g <- function(expr) {
  message("Running g()")
  expr_string <- deparse(enexpr(expr))
  paste("The input expression was:", expr_string)
}
a <- 10
g(a + 1)
#> Running g()
#> [1] "The input expression was: a + 1"
a <- 20
g(a + 1)
#> Running g()
#> [1] "The input expression was: a + 1"

The only thing that matters for the result is the unevaluated input expression

gm <- memoise(g)
a <- 10
gm(a + 1)
#> Running g()
#> [1] "The input expression was: a + 1"
gm(a + 1) # Will have a cache hit.
#> [1] "The input expression was: a + 1"
a <- 20
gm(a + 1)
#> Running g()
#> [1] "The input expression was: a + 1"

Notice that this time around, when the value of
Suppose
But what about for

## Note: This chunk is not what actually happens with fm().
## It shows what would happen if fm() used the unevaluated expression for the key.
a <- 10
fm(a+1)
#> Running f()
#> [1] "The input value was: 11"
a <- 20
fm(a+1)
#> [1] "The input value was: 11"

I think that what you want is for
However,
In summary

h <- function(expr) {
  message("Running h()")
  paste("The input expression was:", deparse(expr))
}
a <- 10
h(expr(a + 1))
#> Running h()
#> [1] "The input expression was: `a + 1`"
hm <- memoise(h)
a <- 10
hm(expr(a + 1))
#> Running h()
#> [1] "The input expression was: `a + 1`"
a <- 20
hm(expr(a + 1)) # Cache hit
#> [1] "The input expression was: `a + 1`"
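
The difference between keying on the evaluated value and keying on the unevaluated expression can be sketched in base R, without memoise or rlang. This is only an illustration under stated assumptions: `make_memo` is a hypothetical helper, the cache is a plain environment, and the key is built with `deparse()` rather than a hash, so it is not how memoise itself is implemented.

```r
# Hypothetical helper (not part of memoise): a one-argument memoiser whose
# cache key is either the evaluated value or the unevaluated expression text.
make_memo <- function(f, by = c("value", "expression")) {
  by <- match.arg(by)
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- if (by == "expression") {
      # substitute() returns the argument's expression without evaluating it.
      paste(deparse(substitute(x)), collapse = " ")
    } else {
      # deparse(x) forces the argument, so the key reflects its value.
      paste(deparse(x), collapse = " ")
    }
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

f <- function(x) paste("The input value was:", x)
by_value <- make_memo(f, "value")
by_expr  <- make_memo(f, "expression")

a <- 10
by_value(a + 1)
by_expr(a + 1)

a <- 20
fresh <- by_value(a + 1)  # key changes from "11" to "21": cache miss
stale <- by_expr(a + 1)   # key is still "a + 1": cache hit, old result
fresh
#> [1] "The input value was: 21"
stale
#> [1] "The input value was: 11"
```

For a function like f(), which depends on the value of its argument, the expression-keyed version returns a stale result once `a` changes. For a function like g(), which only looks at the expression, that same behaviour is the desired one.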
Hi @wch,
Another thing I find puzzling here, why did
But in chunk 5, changing the input value in global env and passing it to
If I'm getting the gist of the last chunk, I think it's that unevaluated expressions passed in will not change the hash even if the global environment inputs to that expression are changed. However, I was noticing that if we use
Unfortunately, with the size of our app and the number of repetitious long db queries, NSE is a must. We've nested the |
Sorry, that was some bad copying and pasting and editing. It should have been
Sorry I wasn't clear about that -- chunk 5 shows what would happen if it used the unevaluated expression for the key. I've added a comment to the top of that chunk to make it clearer. |
@wch Ah, thank you for clarifying that! |
Here's a wrapper function for memoise():

library(memoise)
library(rlang)
# A version of memoise(). Some parameters can be designated as `expr_vars`. For
# these params, the unevaluated expression will be used for caching, instead of
# the value (which is what is normally used for caching).
memoise2 <- function(f, ..., expr_vars = character(0)) {
  f_wrapper <- function(args) {
    eval_tidy(expr(f(!!!args)))
  }
  f_wrapper_m <- memoise(f_wrapper, ...)

  function(...) {
    # Capture args as (unevaluated) quosures
    dot_args <- dots_definitions(...)$dots

    # For each arg, if it's in our set of `expr_vars`, extract the expression
    # (and discard the environment); if it's not in that set, then evaluate the
    # quosure.
    dot_args <- mapply(
      dot_args,
      names(dot_args),
      FUN = function(quo, name) {
        if (name %in% expr_vars) {
          get_expr(quo)
        } else {
          eval_tidy(quo)
        }
      },
      SIMPLIFY = FALSE
    )

    # Print out the captured items
    # str(dot_args)

    # Call the memoized wrapper function.
    f_wrapper_m(dot_args)
  }
}

f <- function(x, y) {
  message("Running f()")
  paste0("Captured x: ", deparse(enexpr(x)), ". Evaluated y: ", y)
}
fm <- memoise2(f, expr_vars = c("x"))
a <- 10
b <- 10
f(x=a+1, b+2)
#> Running f()
#> [1] "Captured x: a + 1. Evaluated y: 12"
# Run the memoized version twice. Second time results in a hit.
fm(x=a+1, b+2)
#> Running f()
#> [1] "Captured x: a + 1. Evaluated y: 12"
fm(x=a+1, b+2)
#> [1] "Captured x: a + 1. Evaluated y: 12"
# Changing `b` causes a cache miss, because it results in a different value for y,
# and y is a normal arg, where the value is used for caching.
b <- 20
fm(x=a+1, b+2)
#> Running f()
#> [1] "Captured x: a + 1. Evaluated y: 22"
# Changing `a` does NOT cause a cache miss, because when we called memoise2, we
# designated `x` as an arg for which the unevaluated expression should be used for
# caching, instead of the value.
a <- 20
fm(x=a+1, b+2)
#> [1] "Captured x: a + 1. Evaluated y: 22"

Note that there are some limitations:
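
One consequence of keying on the expression alone, with its environment discarded as the comment in memoise2() describes, is that the same expression text written in two different scopes maps to the same cache entry. A minimal base-R sketch of that effect, assuming a hypothetical helper `memo_by_expr` (not part of memoise or rlang) that keys on the deparsed argument expression:

```r
# Hypothetical expression-keyed memoiser: the key is the expression text only,
# so nothing about the calling environment takes part in the lookup.
memo_by_expr <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- paste(deparse(substitute(x)), collapse = " ")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

double_it <- memo_by_expr(function(x) x * 2)

# Both callers pass the expression `n`, but `n` means something different
# in each function.
caller_one <- function() { n <- 1;   double_it(n) }
caller_two <- function() { n <- 100; double_it(n) }

first  <- caller_one()  # miss: computes 1 * 2
second <- caller_two()  # hit on key "n": reuses the result from caller_one()
first
#> [1] 2
second
#> [1] 2
```

This is harmless when the wrapped function only inspects the expression, but it gives wrong answers as soon as the function also depends on what the expression evaluates to.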
Hi @wch, I don't think I explained my question too well honestly, but we were able to solve it. In our case, we have two standard inputs to our memoised function
To overcome the limitation with NSE causing cache misses (or incorrect cache matches) we use the
We made it such that
You'll also see code that ensures this doesn't happen in
I know this is all highly specific and probably not entirely clear, but I hope it's useful for folks landing on this thread. Happy to answer questions if need be. Provided that we observe the syntactical conventions that make it work, do you see any exceptions or pitfalls that might cause memoisation to start malfunctioning?
This *appears* correct. I tested it by inspection with the SUGG and CRAM data on a set seed in CV. I think I *could* set up a test for this, but it'd be a lot of work, a time-consuming test, and amount to repeatedly re-doing said test-by-inspection. As it is, this memoization looks like it cuts the run time of CV down to a quarter of what it was.
refs:
https://memoise.r-lib.org/reference/memoise.html#details
https://rdrr.io/r/base/ns-hooks.html
r-lib/profvis#134
r-lib/memoise#79 (comment)
The below example fails because memoise can't cache with arguments that are expressions.
This seems like it should be fine? Line 19 of memoise seems to be the culprit.
SessionInfo()