Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatically exposing functions from Stan to R #119

Open
wds15 opened this issue Jan 22, 2024 · 0 comments
Open

automatically exposing functions from Stan to R #119

wds15 opened this issue Jan 22, 2024 · 0 comments

Comments

@wds15
Copy link

wds15 commented Jan 22, 2024

As I rely by now on a ton of Stan functions and I do regularly expose these to R... which involves quite often that I have to compile the C++ code in every new R session. The sourceCpp thing from Rcpp can be made to always use the same cache directory, but also that is an approach which has flaws (concurrent access to this from parallel running processes). Here is now another approach which could possibly be include in rstantools: The idea is to wrap the functions to be exposed into a Stan model. I noticed that rstantools does already a lot to expose these functions via Rcpp attributes to R, but not all pitfalls are handled... which is why I hacked up the function below, which may serve as a template for a new feature in rstantools:

#' Creates an R package under the location of path which has mereley
#' the provided Stan functions as source code. The code gets compiled
#' and loaded via devtools load_all. The user gets returned back an
#' environment containing the functions callable from R. Note that the
#' functions do not get attached to the global environment. By default
#' overwrite=TRUE such that whenever the stan functions change a new
#' version of the package is created from scratch.
load_stan_functions <- function(path, ..., overwrite=TRUE) {
    stan_functions_str <- paste(c(...), collapse="\n")
    if(stan_functions_str == "") {
        message("No Stan functions found.")
        return(invisible(new.env()))
    }
    stan_functions_code <- paste0(c("functions {\n", stan_functions_str, "\n}\n"))
    stan_file <- cmdstanr::write_stan_file(stan_functions_code)
    pkg_stan_file <- file.path(path, "inst", "stan", basename(stan_file))
    if(file.exists(pkg_stan_file) & tools::md5sum(pkg_stan_file) == tools::md5sum(stan_file)) {
        message("Stan function file already exists! Nothing to do!")
    } else {
        ## package exists, but content is different => clean it if this is requested; or error
        if(overwrite) {
            unlink(path, recursive=TRUE, force=TRUE)
        } else {
            stop("Stan functions have changed and not permitted to overwrite.")
        }
    }
    if(!dir.exists(path)) {
        old_opts <- options(usethis.allow_nested_project=TRUE, usethis.quiet=TRUE)
        suppressMessages(rstantools::rstan_create_package(path, rstudio=FALSE, stan_files=stan_file, license=FALSE, open=FALSE))
        options(old_opts)
        rpkg <- basename(path)
        ns <- file.path(path, "NAMESPACE")
        cat("# Generated by roxygen2: do not edit by hand\n", file=ns)
        cat("import(Rcpp)\n", file=ns, append=TRUE)
        cat("import(methods)\n", file=ns, append=TRUE)
        cat(paste0("useDynLib(", rpkg, ", .registration = TRUE)\n"), file=ns, append=TRUE)
        pkgbuild::compile_dll(path, compile_attributes=TRUE, quiet=TRUE, debug=FALSE)        
        ## rewrite RcppExports so that Stan functions have working
        ## default arguments for the output stream, rng and lp
        ## arguments...would be great to have this done again whenever
        ## RcppExports changes, but not really needed.
        rcpp_exposed_functions <- new.env()
        rcpp_exports_file <- file.path(path, "R", "RcppExports.R")
        source(rcpp_exports_file, local=rcpp_exposed_functions)
        compiled_functions <- ls(rcpp_exposed_functions)
        ## WARNING: rng functions will by default always create a new rng
        ## object each time the function is called. It would be better
        ## to setup once an rng upon package load and then by default
        ## point there. Also note that each created rng uses the
        ## default seed of 0!
        for (x in compiled_functions) {
            FUN <- get(x, envir = rcpp_exposed_functions)
            args <- formals(FUN)
            if ("pstream__" %in% names(args)) 
                args$pstream__ <- quote(rstan::get_stream())
            if ("lp__" %in% names(args)) 
                args$lp__ <- 0
            if ("base_rng__" %in% names(args)) {
                message("Function ", x, " uses a random number generator.\nBy default an rng instance is created for each function invocation with seed 0.\nIt is recommended to use the base_rng__ argument explicitly, see ?rstan::get_rng for details.")
                args$base_rng__ <- quote(rstan::get_rng())
            }
            formals(FUN) <- args
            assign(x, FUN, envir = rcpp_exposed_functions)
        }
        rcpp_exports_comment <- c(grep("^#", readLines(rcpp_exports_file), value=TRUE), "")
        rcpp_exports <- file(rcpp_exports_file, open="wt")
        writeLines(rcpp_exports_comment, con=rcpp_exports)
        for(x in compiled_functions) {
            cat(x, "<- ", file=rcpp_exports)
            dput(get(x, envir = rcpp_exposed_functions), file=rcpp_exports)
        }
        close(rcpp_exports)
    }
    message("Loading Stan package for functions from path ", path)
    invisible(devtools::load_all(path, attach=TRUE, quiet=TRUE)$env)
}

Having this defined one can easily write any custom functions as a simple package to disk. Compilation then happens only once and once. Every subsequent load will use the compiled binaries and return instantly an environment with the exposed Stan functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant