-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache invalidation is too strict? #58
Comments
I'd be happy to contribute a PR for this, if you think it's an issue worth addressing. I think the difference stems from the If caches are shared across multiple machines with S3/Drive/Dropbox, it seems useful to allow for differing filenames. (I was wrong in my previous comment; the cross-platform aspect doesn't really matter.) |
`body(some_function)` returns the function's `srcref`, which will differ across computers. Using `as.character(body(some_function))` hashes the contents of the function, but not the source file. See also: `digest::sha1` https://cran.r-project.org/web/packages/digest/vignettes/sha1.html Fixes #58
I wonder if |
As far as I can tell, both work. Do you prefer one way over the other? |
It's interesting that timings are opposite of what I expected: bench::mark(
digest::digest(as.character(body(tibble::new_tibble))),
digest::digest(as.list(body(tibble::new_tibble))),
check = FALSE
)
#> # A tibble: 2 x 10
#> expression min mean median max `itr/sec` mem_alloc n_gc n_itr
#> <chr> <bch:t> <bch> <bch:t> <bch:tm> <dbl> <bch:byt> <dbl> <int>
#> 1 digest::d… 86.9µs 107µs 95.4µs 532.67µs 9343. 5.93MB 6 4428
#> 2 digest::d… 340.3µs 441µs 399.9µs 1.03ms 2269. 85.74KB 2 1103
#> # ... with 1 more variable: total_time <bch:tm> Created on 2018-11-07 by the reprex package (v0.2.1) As far as I can tell, the |
I noticed that
memoise
gives different results when a function issource
-d vs run line by line, or when a package is installed on one platform vs another. (As Jim Hester mentioned in #55, this happens because the function itself is included in the hash, and I guess the underlying bytecode can differ.)Below is a quick example that demonstrates the non-caching behavior when the program is run line by line or sourced.
One way to address this, if you think it's a problem that should be addressed, would be to use or borrow from
digest::sha1
.sha1
is a generic that dispatches to methods that try to look at meaningful differences (see vignette). Note that in the example below, the digests differ, but the sha1s are the same.Do you think this is worth addressing? It seems like a non-issue for in-memory caches, but could be a pain for on-disk, potentially cross-platform caches.
The text was updated successfully, but these errors were encountered: