stantargets: reproducible Stan pipelines at scale #430

Closed
@wlandau

Submitting Author: Will Landau (@wlandau)
Repository: https://github.com/wlandau/stantargets
Version submitted: 0.0.0.9000
Editor: @melvidoni
Reviewers: @sakrejda @mattwarkentin

Due date for @sakrejda: 2021-03-31

Due date for @mattwarkentin: 2021-03-31
Archive: TBD
Version accepted: TBD


  • Paste the full DESCRIPTION file inside a code block below:
Package: stantargets
Title: Targets for Stan Workflows
Description: Bayesian data analysis usually incurs long runtimes
  and cumbersome custom code. A specialized pipeline toolkit for
  Bayesians, the 'stantargets' R package leverages
  'targets' and 'cmdstanr' to ease these burdens.
  'stantargets' makes it super easy to set up useful scalable
  Stan pipelines that automatically parallelize the computation
  and skip expensive steps when the results are already up to date.
  Minimal custom code is required, and there is no need to manually
  configure branching, so usage is much easier than 'targets' alone.
  'stantargets' can access all of 'cmdstanr''s major algorithms
  (MCMC, variational Bayes, and optimization) and it supports
  both single-fit workflows and multi-rep simulation studies.
  For the statistical methodology, please refer to 'Stan' documentation
  (Stan Development Team 2020) <https://mc-stan.org/>.
Version: 0.0.0.9000
License: MIT + file LICENSE
URL: https://wlandau.github.io/stantargets/, https://github.com/wlandau/stantargets
BugReports: https://github.com/wlandau/stantargets/issues
Authors@R: c(
  person(
    given = c("William", "Michael"),
    family = "Landau",
    role = c("aut", "cre"),
    email = "will.landau@gmail.com",
    comment = c(ORCID = "0000-0003-1878-3253")
  ),
  person(
    family = "Eli Lilly and Company",
    role = "cph"
  ))
Depends:
  R (>= 3.5.0)
Imports:
  cmdstanr (>= 0.2.0),
  digest (>= 0.6.21),
  fst (>= 0.9.4),
  posterior (>= 0.1.2),
  purrr (>= 0.3.4),
  qs (>= 0.14.1),
  rlang (>= 0.4.8),
  stats,
  targets (>= 0.0.1),
  tarchetypes (>= 0.0.1),
  tibble (>= 3.0.4),
  tools
Suggests:
  dplyr (>= 1.0.2),
  fs (>= 1.5.0),
  knitr (>= 1.28),
  R.utils (>= 2.10.1),
  rmarkdown (>= 2.1),
  testthat (>= 3.0.0),
  visNetwork (>= 2.0.9),
  withr (>= 2.1.2)
Remotes:
  stan-dev/cmdstanr,
  stan-dev/posterior
SystemRequirements: CmdStan >= 2.25.0
Encoding: UTF-8
Language: en-US
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

stantargets is very similar to jagstargets (#425). stantargets leverages the existing workflow automation capabilities of targets to orchestrate computation and skip up-to-date tasks in Bayesian data analysis pipelines. stantargets reduces the burden of user-side custom code that targets would otherwise require, which helps free statisticians to focus more on the models and less on the software engineering.
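To illustrate the kind of pipeline described above, here is a minimal sketch of a `_targets.R` file, assuming the `tar_stan_mcmc()` interface as documented in the package README. The Stan file path `model.stan` and the helper `simulate_data()` are hypothetical and not part of the submission; the Stan model is assumed to declare data named `n`, `x`, and `y`.

```r
# _targets.R: a minimal sketch, not taken from the submission itself.
library(targets)
library(stantargets)

# Hypothetical helper: simulate a small dataset as a named list for Stan.
simulate_data <- function(n = 10L) {
  list(n = n, x = seq_len(n), y = rnorm(n))
}

list(
  # One call expands into several targets (model file, data, MCMC fit,
  # draws, summaries), so no branching has to be configured by hand.
  tar_stan_mcmc(
    name = example,
    stan_files = "model.stan", # hypothetical path to a Stan model
    data = simulate_data()
  )
)
```

Running `targets::tar_make()` would then build the pipeline. On later runs, targets whose code, data, and Stan file are unchanged should be skipped rather than recomputed.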

  • Who is the target audience and what are scientific applications of this package?

stantargets is for Bayesian statisticians who develop and run Stan models. Example workflows range from individual analyses of clinical data to large-scale simulation-based calibration studies for validation.

targets already provides the same kind of workflow automation, but it requires more custom code to set up a workflow. stantargets uses specialized domain knowledge to make this process easier. The rstan and cmdstanr packages interface with Stan but do not provide the same kind of workflow automation. In light of the recent preprint by Gelman et al. (2020), I believe the Stan Development Team would be very interested in this kind of workflow automation.

N/A

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

N/A

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

If JOSS is still an option, I would like to publish there. I have prepared a manuscript at https://github.com/wlandau/stantargets/blob/main/inst/paper.md.

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct
