This repository has been archived by the owner on Apr 21, 2020. It is now read-only.
forked from jhrcook/mustashe

A system for stashing and loading the results of long running computations.

mustashe


The goal of ‘mustashe’ is to save time on long-running computations by storing the resulting object after the first run and reloading it afterwards. The next time the computation is run, the stashed object is loaded instead of the code being evaluated. ‘mustashe’ is well suited to storing intermediate objects in an analysis.

Installation

You can install the released version of ‘mustashe’ from CRAN with:

install.packages("mustashe")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("jhrcook/mustashe")

Loading ‘mustashe’

The ‘mustashe’ package is loaded like any other, using the library() function.

library(mustashe)

Basic example

Below is a simple example of how to use the stash() function from ‘mustashe’.

Let’s say that, as part of an analysis, we are running a long simulation to generate random data rnd_vals. This is mocked below using the Sys.sleep() function. We can time this process using the ‘tictoc’ package.

tictoc::tic("random simulation")
stash("rnd_vals", {
    Sys.sleep(3)
    rnd_vals <- rnorm(1e5)
})
#> Stashing object.
tictoc::toc()
#> random simulation: 3.382 sec elapsed

Now, if we come back tomorrow and continue working on the same analysis, the code is not evaluated the second time this process is run, because the code passed to stash() has not changed. Instead, the stashed object rnd_vals is loaded.

tictoc::tic("random simulation")
stash("rnd_vals", {
    Sys.sleep(3)
    rnd_vals <- rnorm(1e5)
})
#> Loading stashed object.
tictoc::toc()
#> random simulation: 0.053 sec elapsed
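
The "evaluate once, reload afterwards" behaviour shown above can be sketched in base R. This is only an illustration of the idea, not how ‘mustashe’ is implemented: the helper name simple_stash(), its dir and envir arguments, and the choice to compare the deparsed code text are all assumptions made for this example.

```r
# Hypothetical helper, NOT part of 'mustashe': it only illustrates the
# "evaluate once, reload afterwards" idea using base R.
simple_stash <- function(name, expr, dir = tempdir(), envir = parent.frame()) {
    force(envir)
    code <- deparse(substitute(expr))        # text of the code, not yet evaluated
    path <- file.path(dir, paste0(name, ".rds"))

    if (file.exists(path)) {
        cached <- readRDS(path)
        if (identical(cached$code, code)) {  # same code as last time: reuse value
            message("Loading stashed object.")
            assign(name, cached$value, envir = envir)
            return(invisible(FALSE))         # FALSE: the code was not re-run
        }
    }

    message("Stashing object.")
    value <- eval(substitute(expr), envir)   # run the code in the caller's environment
    saveRDS(list(code = code, value = value), path)
    assign(name, value, envir = envir)
    invisible(TRUE)                          # TRUE: the code was evaluated
}

# Usage: the second call finds matching code on disk and skips evaluation.
stash_dir <- tempfile("stash")
dir.create(stash_dir)
ran_first  <- simple_stash("rnd_vals", { rnorm(10) }, dir = stash_dir)
ran_second <- simple_stash("rnd_vals", { rnorm(10) }, dir = stash_dir)
```

Storing the code text next to the value is what lets the second call decide whether the saved result is still valid. Editing the expression changes the deparsed text, so the sketch evaluates the code again.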

Dependencies

A common problem with storing intermediates is that they have dependencies that can change. If a dependency changes, then we want the stashed value to be updated. This is accomplished by passing the names of the dependencies to the depends_on argument.

For instance, let’s say we are calculating some value foo using x. (For the following example, I will use a print statement to indicate when the code is evaluated.)

x <- 100

stash("foo", depends_on = "x", {
    print("Calculating `foo` using `x`.")
    foo <- x + 1
})
#> Stashing object.
#> [1] "Calculating `foo` using `x`."

foo
#> [1] 101

Now if x is not changed, then the code for foo does not get re-evaluated.

x <- 100

stash("foo", depends_on = "x", {
    print("Calculating `foo` using `x`.")
    foo <- x + 1
})
#> Loading stashed object.

foo
#> [1] 101

But if x does change, then foo gets re-evaluated.

x <- 200

stash("foo", depends_on = "x", {
    print("Calculating `foo` using `x`.")
    foo <- x + 1
})
#> Updating stash.
#> [1] "Calculating `foo` using `x`."

foo
#> [1] 201
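
One way to picture the dependency check is as a comparison between the dependency values recorded at stash time and their current values. The sketch below uses only base R and is an assumption about the general technique, not the ‘mustashe’ implementation: deps_changed() and the snapshot list are hypothetical names introduced for illustration.

```r
# Hypothetical helper, NOT part of 'mustashe': reports whether any named
# dependency differs from the snapshot recorded when the value was stashed.
deps_changed <- function(snapshot, depends_on, envir = parent.frame()) {
    force(envir)
    current <- mget(depends_on, envir = envir)  # named list of current values
    !identical(snapshot, current)
}

x <- 100
snapshot <- mget("x", envir = environment())  # record dependency values
unchanged <- deps_changed(snapshot, "x")      # FALSE: `x` is still 100

x <- 200
changed <- deps_changed(snapshot, "x")        # TRUE: `x` no longer matches
```

A stored value would be reused only when such a check returns FALSE. Otherwise the code is evaluated again and a fresh snapshot is recorded.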

Attribution

The inspiration for this package came from the cache() feature in the ‘ProjectTemplate’ package. While the functionality and implementation are a bit different, this package would have been far more difficult to write without referencing the source code from ‘ProjectTemplate’.
