dynamite: Bayesian Modeling and Causal Inference for Multivariate Longitudinal Data #554

santikka · 2022-08-18T13:33:02Z

Date accepted: 2022-12-13
Submitting Author Name: Santtu Tikka
Submitting Author Github Handle: @santikka
Other Package Authors Github handles: @helske
Repository: https://github.com/santikka/dynamite
Version submitted: 0.0.1
Submission type: Stats
Badge grade: gold
Editor: @noamross
Reviewers: @nicholasjclark, @LucyMcGowan

Due date for @nicholasjclark: 2022-09-27

Due date for @LucyMcGowan: 2022-10-20
Archive: TBD
Version accepted: TBD
Language: en

Paste the full DESCRIPTION file inside a code block below:

Package: dynamite
Title: Bayesian Modeling and Causal Inference for Multivariate
    Longitudinal Data
Version: 0.0.1
Authors@R: c(
    person("Santtu", "Tikka", , "santtuth@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0003-4039-4342")),
    person("Jouni", "Helske", , "jouni.helske@iki.fi", role = "aut",
           comment = c(ORCID = "0000-0001-7130-793X"))
  )
Description: Easy-to-use and efficient interface for 
  Bayesian inference of complex panel (time series) data. The package supports 
  joint modeling of multiple measurements per individual, time-varying and
  time-invariant effects, and a wide range of discrete and 
  continuous distributions. Estimation of the models is carried out via 'Stan'.
License: GPL (>= 3)
URL: https://github.com/santikka/dynamite
BugReports: https://github.com/santikka/dynamite/issues
Depends: 
    R (>= 4.1.0)
Imports: 
    bayesplot,
    checkmate,
    cli,
    data.table (>= 1.14.3),
    dplyr,
    glue,
    ggplot2,
    MASS,
    posterior,
    rlang,
    rstan (>= 2.26.11),
    stats,
    tidyr,
    utils
Suggests: 
    covr,
    knitr,
    plm,
    rmarkdown,
    testthat (>= 3.0.0)
VignetteBuilder: 
    knitr
Config/testthat/edition: 3
Encoding: UTF-8
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd",
    "srr::srr_stats_roclet"))
RoxygenNote: 7.2.1
LazyData: true
LazyDataCompression: xz

Pre-submission Inquiry

A pre-submission inquiry has been approved in issue #552

General Information

Who is the target audience and what are scientific applications of this package?

The package is mainly intended for applied researchers working with complex panel data.
Panel data is common in many scientific fields, especially in sociology and econometrics. For example, analysing individual-level life-course data is valuable for assessing the effects of policy reforms and other interventions.

Paste your responses to our General Standard G1.1 here, describing whether your software is:

The dynamite R package provides an easy-to-use interface for Bayesian inference of complex panel (time series) data comprising multiple measurements per individual, measured over time for multiple individuals. The main features distinguishing the package and the underlying methodology from many other approaches are:

Support for both time-invariant and time-varying effects modeled via B-splines.
Joint modeling of multiple measurements per individual (multiple channels) based directly on the assumed data generating process.
Support for non-Gaussian observations: Currently Gaussian, Categorical, Poisson, Bernoulli, Binomial, Negative Binomial, Gamma, Exponential, and Beta distributions are available and these can be mixed arbitrarily in multichannel models.
Allows evaluating realistic long-term counterfactual predictions which take into account the dynamic structure of the model by posterior predictive distribution simulation.
Transparent quantification of parameter and predictive uncertainty due to a fully Bayesian approach.
User-friendly and efficient R interface with state-of-the-art estimation via Stan.
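
As a rough illustration of the first feature, a time-varying coefficient can be written as a B-spline basis evaluated at each time point, multiplied by a vector of spline coefficients. The sketch below uses `splines::bs()` from R's recommended packages. The names `n_time`, `basis`, `omega`, and `delta` are illustrative assumptions and do not reproduce dynamite's internal code.

```r
library(splines)

set.seed(1)
n_time <- 20                      # number of time points (assumed)
time <- seq_len(n_time)

# B-spline basis with 6 basis functions, evaluated at each time point
basis <- bs(time, df = 6, intercept = TRUE)

# Hypothetical spline coefficients; in a Bayesian model these would be
# parameters with a prior rather than fixed random draws
omega <- rnorm(ncol(basis))

# Time-varying regression coefficient: one value per time point
delta <- as.vector(basis %*% omega)
```

With only six coefficients, `delta` varies smoothly over the 20 time points, which is the general idea behind using splines for time-varying effects.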

There are several R packages on CRAN focusing on panel data analysis, including but not limited to:

plm for linear panel data models.
fixest for fixed effects and different distributions of response variables (based on stats::family).
panelr for "within-between" models which combine fixed effect and random effect models.
lavaan for general structural equation modelling, and thus can be used to estimate various panel data models such as cross-lagged panel models with fixed or random intercepts.

However, to the best of our knowledge, there are no other R packages (or software in general) that support all features of dynamite simultaneously. Thus, "The first implementation of a novel algorithm" seems most applicable.

(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?

Not applicable.

Badging

What grade of badge are you aiming for? (bronze, silver, gold)

Gold.

If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold)

We designed dynamite from the ground up with the standards in mind and thus we think the package fulfills all four aspects. We have been able to comply with 129 standards with only 42 N/A standards across the "Bayesian and Monte Carlo" and "Regression and Supervised Learning" categories. The package is very general and capable of supporting a wide range of data and model structures as demonstrated by the examples and the package tests. In our opinion, the internal structure of the package is also well motivated and compartmentalized. We have also tried to carefully select the package dependencies and keep them to a minimum.

Technical checks

Confirm each of the following by checking the box.

I have read the rOpenSci packaging guide.
I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
I/we have read the Statistical Software Peer Review Guide for Authors.

I/we have run autotest checks on the package, and ensured no tests fail.
autotest reports failures for the following tests:

    type       test_name        fn_name          parameter parameter_type   operation                      content         
   <chr>      <chr>            <chr>            <chr>     <chr>            <chr>                          <chr>            
 1 diagnostic single_char_case get_code         time      single character upper-case character parameter is case dependent
 2 diagnostic single_char_case get_code         group     single character upper-case character parameter is case dependent
 3 diagnostic single_char_case get_data         time      single character upper-case character parameter is case dependent
 4 diagnostic single_char_case get_data         group     single character upper-case character parameter is case dependent
 5 diagnostic single_char_case get_priors       time      single character upper-case character parameter is case dependent
 6 diagnostic single_char_case get_priors       group     single character upper-case character parameter is case dependent
 7 message    NA               mcmc_diagnostics NA        NA               normal function call           0 of 400 iterati…
 8 message    NA               mcmc_diagnostics NA        NA               normal function call           0 of 400 iterati…
 9 message    NA               mcmc_diagnostics NA        NA               normal function call           E-BFMI indicated…
10 message    NA               mcmc_diagnostics NA        NA               NA                             0 of 400 iterati…
11 message    NA               mcmc_diagnostics NA        NA               NA                             0 of 400 iterati…
12 message    NA               mcmc_diagnostics NA        NA               NA                             E-BFMI indicated…

The first six of these report that the time and group parameters of several functions are case dependent. This is by design, as both of these parameters correspond to column names of the input data. While not entirely sure, we believe the remaining failures occur because the function mcmc_diagnostics produces messages while not being a print or a summary method, but this is also by design, as the function outputs diagnostic messages related to the MCMC runs. The output of autotest is not very clear on this matter, as all columns are just filled with NA values.
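
The case dependence follows from how R matches column names. A minimal sketch, assuming a toy data frame `panel` and a hypothetical helper `has_columns()` (neither is part of dynamite):

```r
panel <- data.frame(
  id = rep(1:2, each = 3),
  time = rep(1:3, times = 2),
  y = c(0.1, 0.4, 0.3, 1.2, 0.9, 1.1)
)

# Hypothetical helper: check that the requested columns exist in the data.
# Column name matching in R is case sensitive.
has_columns <- function(data, group, time) {
  all(c(group, time) %in% names(data))
}

lower_ok <- has_columns(panel, group = "id", time = "time")
upper_ok <- has_columns(panel, group = "ID", time = "TIME")
```

Here `lower_ok` is `TRUE` and `upper_ok` is `FALSE`, so an upper-cased argument cannot refer to the same column.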

The srr_stats_pre_submit() function confirms this package may be submitted.
The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.

This package:

does not violate the Terms of Service of any service it interacts with.
has a CRAN and OSI accepted license.
contains a README with instructions for installing the development version.

Publication options

Do you intend for this package to go on CRAN?
Do you intend for this package to go on Bioconductor?

Code of conduct

I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

ropensci-review-bot · 2022-08-18T13:33:03Z

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot · 2022-08-18T13:33:06Z

🚀

Editor check started

👋

ropensci-review-bot · 2022-08-18T13:51:42Z

Oops, something went wrong with our automatic package checks. Our developers have been notified and package checks will appear here as soon as we've resolved the issue. Sorry for any inconvenience.

mpadge · 2022-08-23T13:09:18Z

@ropensci-review-bot check package

ropensci-review-bot · 2022-08-23T13:09:20Z

Thanks, about to send the query.

ropensci-review-bot · 2022-08-23T13:09:23Z

🚀

Editor check started

👋

ropensci-review-bot · 2022-08-23T13:10:00Z

Checks for dynamite (v0.0.1)

git hash: a8d932ca

✔️ Package name is available
✔️ has a 'codemeta.json' file.
✔️ has a 'contributing' file.
✖️ The following functions have no documented return values: [coef.dynamitefit, dynamite, fitted.dynamitefit, formula.dynamitefit, plot_nus, +.dynamiteformula, predict.dynamitefit, print.dynamitefit, print.dynamiteformula, summary.dynamitefit]
✔️ uses 'roxygen2'.
✔️ 'DESCRIPTION' has a URL field.
✔️ 'DESCRIPTION' has a BugReports field.
✔️ Package has at least one HTML vignette
✔️ All functions have examples.
✖️ Function names are duplicated in other packages
✔️ Package has continuous integration checks.
✔️ Package coverage is 97.8%.
✔️ R CMD check found no errors.
✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3)

1. rOpenSci Statistical Standards (`srr` package)

This package is in the following category:

Bayesian and Monte Carlo
Regression and Supervised Learning

✔️ All applicable standards [v0.1.0] have been documented in this package (129 complied with; 42 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.

2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type	package	ncalls
internal	base	1102
internal	dynamite	418
internal	graphics	21
internal	methods	3
imports	utils	81
imports	stats	59
imports	dplyr	29
imports	rlang	16
imports	checkmate	11
imports	glue	11
imports	cli	6
imports	ggplot2	4
imports	posterior	4
imports	tidyr	4
imports	rstan	3
imports	data.table	2
imports	bayesplot	1
imports	MASS	NA
suggests	covr	NA
suggests	knitr	NA
suggests	plm	NA
suggests	rmarkdown	NA
suggests	testthat	NA
linking_to	NA	NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

c (108), list (96), length (69), paste0 (66), args (53), for (47), attr (44), data.frame (39), as.list (32), match.call (32), rep (31), seq_len (25), unique (25), do.call (22), character (19), which (18), vapply (17), is.na (16), names (16), logical (15), seq_along (15), drop (13), by (12), deparse1 (11), is.null (10), mean (10), nzchar (10), all (9), apply (9), debug (9), lapply (9), parent.frame (9), as.integer (7), integer (7), rank (7), seq.int (7), sort (6), unlist (6), array (5), assign (5), colnames (5), dim (5), log (5), message (5), mode (5), nrow (5), try (5), vector (5), as.numeric (4), call (4), diff (4), eval (4), gsub (4), as.data.frame (3), cbind (3), I (3), identical (3), setdiff (3), sub (3), sum (3), any (2), aperm (2), expand.grid (2), levels (2), max (2), ncol (2), seq (2), structure (2), suppressWarnings (2), t (2), beta (1), class (1), det (1), diag (1), duplicated (1), get (1), gl (1), gregexec (1), ifelse (1), intersect (1), is.factor (1), is.finite (1), match (1), min (1), new.env (1), numeric (1), parse (1), paste (1), prod (1), range (1), regmatches (1), replace (1), replicate (1), row.names (1), sample (1), sample.int (1), signif (1), strsplit (1), substitute (1), typeof (1), union (1), warning (1), which.max (1), which.min (1), with (1)

dynamite

ifelse_ (84), paste_rows (41), get_responses (17), data_lines_default (10), get_predictors (10), onlyif (10), model_lines_default (9), warning_ (9), prepare_channel_default (8), formula_rhs (6), get_quoted (5), as.data.frame.dynamitefit (4), get_families (4), has_past (4), coef.dynamitefit (3), evaluate_specials (3), get_formulas (3), assign_deterministic (2), complete_lags (2), create_blocks (2), cs (2), default_priors (2), default_priors_categorical (2), deterministic_response (2), extract_lags (2), extract_nonlags (2), find_lags (2), formula_lhs (2), formula_past (2), formula_terms (2), full_model.matrix (2), full_model.matrix_predict (2), get_originals (2), get_terms (2), indenter_ (2), join_dynamiteformulas (2), lag_ (2), parse_global_lags (2), parse_lags (2), parse_new_lags (2), parse_singleton_lags (2), prepare_eval_envs (2), prepare_lagged_response (2), stop_ (2), which_deterministic (2), which_stochastic (2), abort_factor (1), abort_negative (1), abort_nonunit (1), add_dynamiteformula (1), as_data_frame_alpha (1), as_data_frame_beta (1), as_data_frame_corr_nu (1), as_data_frame_default (1), as_data_frame_delta (1), as_data_frame_lambda (1), as_data_frame_nu (1), as_data_frame_omega (1), as_data_frame_omega_alpha (1), as_data_frame_phi (1), as_data_frame_sigma (1), as_data_frame_sigma_nu (1), as_data_frame_tau (1), as_data_frame_tau_alpha (1), as_draws_df.dynamitefit (1), as_draws.dynamitefit (1), assign_initial_values (1), assign_lags (1), assign_lags_init (1), aux (1), check_ndraws (1), check_newdata (1), check_priors (1), clear_nonfixed (1), confint.dynamitefit (1), create_blocks.default (1), create_data (1), create_functions (1), create_generated_quantities (1), create_model (1), create_parameters (1), create_transformed_data (1), create_transformed_parameters (1), data_lines_bernoulli (1), data_lines_beta (1), data_lines_binomial (1), data_lines_categorical (1), data_lines_exponential (1), data_lines_gamma (1), data_lines_gaussian (1), 
data_lines_negbin (1), data_lines_poisson (1), drop_terms (1), drop_unused (1), dynamite (1), dynamitechannel (1), dynamitefamily (1), dynamiteformula (1), dynamiteformula_ (1), evaluate_deterministic (1), fill_time (1), fill_time_predict (1), fitted.dynamitefit (1), formula_specials (1), formula.dynamitefit (1), generate_random_intercept (1), generate_sim_call (1), get_code (1), get_code.dynamitefit (1), get_code.dynamiteformula (1), get_data (1), get_data.dynamitefit (1), get_data.dynamiteformula (1), get_priors (1), get_priors.dynamitefit (1), get_priors.dynamiteformula (1), get_special_term_indices (1), impute_newdata (1), increment_formula (1), initialize_deterministic (1), is_supported (1), is.dynamitefamily (1), is.dynamitefit (1), is.dynamiteformula (1), lags (1), lines_wrap (1), locf (1), mcmc_diagnostics (1), mcmc_diagnostics.dynamitefit (1), message_ (1), model_lines_bernoulli (1), model_lines_beta (1), model_lines_binomial (1), model_lines_categorical (1), model_lines_exponential (1), model_lines_gamma (1), model_lines_gaussian (1), model_lines_negbin (1), model_lines_poisson (1), ndraws.dynamitefit (1), nobs.dynamitefit (1), parameters_lines_bernoulli (1), parameters_lines_beta (1), parameters_lines_binomial (1), parameters_lines_categorical (1), parameters_lines_default (1), parameters_lines_exponential (1), parameters_lines_gamma (1), parameters_lines_gaussian (1), parameters_lines_negbin (1), parameters_lines_poisson (1), parse_data (1), parse_newdata (1), parse_past (1), parse_present_lags (1), plot_betas (1), plot_deltas (1), plot_nus (1), plot.dynamitefit (1), predict_dynamitefit (1), predict.dynamitefit (1), prepare_channel_bernoulli (1), prepare_channel_beta (1), prepare_channel_binomial (1), prepare_channel_categorical (1), prepare_channel_exponential (1), prepare_channel_gamma (1), prepare_channel_gaussian (1), prepare_channel_negbin (1), prepare_channel_poisson (1), prepare_common_priors (1), prepare_prior (1), prepare_splines (1), 
prepare_stan_input (1), values (1), verify_lag (1)

utils

data (79), capture.output (1), combn (1)

stats

formula (21), var (7), df (5), sd (4), D (3), model.matrix.lm (3), na.action (3), na.pass (3), offset (3), complete.cases (2), setNames (2), terms (2), sigma (1)

dplyr

bind_rows (12), filter (7), mutate (3), summarise (3), left_join (2), matches (1), n (1)

graphics

mtext (10), title (9), pairs (2)

rlang

caller_env (16)

checkmate

test_character (3), test_flag (3), test_string (3), test_int (2)

glue

glue (11)

cli

cli_abort (2), qty (2), cli_inform (1), cli_warn (1)

ggplot2

labs (3), position_dodge (1)

posterior

summarise_draws (2), as_draws (1), ndraws (1)

tidyr

expand_grid (2), full_seq (1), unnest (1)

methods

is (2), new (1)

rstan

extract (2), check_hmc_diagnostics (1)

data.table

setDT (1), setkeyv (1)

bayesplot

mcmc_combo (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

code in R (100% in 30 files) and
2 authors
1 vignette
6 internal data files
14 imported packages
36 exported functions (median 7 lines of code)
404 non-exported functions in R (median 9 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

loc = "Lines of Code"
fn = "function"
exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure	value	percentile	noteworthy
files_R	30	89.3
files_vignettes	2	85.7
files_tests	11	91.7
loc_R	5956	96.5	TRUE
loc_vignettes	877	89.0
loc_tests	2413	95.3	TRUE
num_vignettes	1	64.8
data_size_total	2661885	98.5	TRUE
data_size_median	349085	96.0	TRUE
n_fns_r	440	96.6	TRUE
n_fns_r_exported	36	82.0
n_fns_r_not_exported	404	97.8	TRUE
n_fns_per_file_r	8	83.4
num_params_per_fn	2	11.9
loc_per_fn_r	9	24.3
loc_per_fn_r_exp	7	13.5
loc_per_fn_r_not_exp	9	27.1
rel_whitespace_R	4	76.1
rel_whitespace_vignettes	13	68.1
rel_whitespace_tests	9	86.4
doclines_per_fn_exp	37	45.3
doclines_per_fn_not_exp	0	0.0	TRUE
fn_call_network_size	696	96.9	TRUE

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

4. `goodpractice` and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

GitHub Workflow Results

id	name	conclusion	sha	run_number	date
2903919221	R-CMD-check	success	a8d932	295	2022-08-22
2903919220	test-coverage	success	a8d932	295	2022-08-22

3b. `goodpractice` results

`R CMD check` with rcmdcheck

R CMD check generated the following note:

checking installed package size ... NOTE
installed size is 11.1Mb
sub-directories of 1Mb or more:
data 7.2Mb
doc 1.0Mb
R 2.5Mb

R CMD check generated the following check_fail:

rcmdcheck_reasonable_installed_size

Test coverage with covr

Package coverage: 97.82

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 6 potential issues:

message	number of times
Avoid library() and require() calls in packages	5
unexpected symbol	1

5. Other Checks

Details of other checks (click to open)

✖️ The following 10 function names are duplicated in other packages:

- aux from seewave
- get_code from norgeo, rmonad, xpose
- get_data from canvasXpress.data, cbsodataR, cimir, completejourney, CVXR, danstat, deckgl, ecb, finnishgrid, ggPMX, ggvis, hydroscoper, insight, jtools, mapbayr, metacoder, missCompare, optimall, qrmtools, r4googleads, radiant.data, radous, rbedrock, rchallenge, rsimsum, SWIM, swissparl, tidyLPA, tidySEM, trending, tsmp, ugatsdb, xpose
- get_priors from CausalQueries, insight
- lags from smooth, tis, TTR
- mcmc_diagnostics from bpr, rater, rnmamod
- obs from metacoder, observer
- plot_deltas from spruce
- random from CoOL, decisionSupport, distributions3, gam, gamlss, ggdmc, lidR, messydates, simr, sodium
- splines from rpatrec

Package Versions

package	version
pkgstats	0.1.1.20
pkgcheck	0.1.0.9
srr	0.0.1.178

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

mpadge · 2022-08-23T13:15:51Z

@santikka Thank you for your help debugging and improving our check system. Sorry the checks took so long to appear here.

@adamhsparks Please ignore the failing checks. The duplicated names check will soon be downgraded from a fail to a note. The other check is brand new. @santikka Feel free to modify your repo to describe return values, but do so at your own leisure, and consider all checks as passing for now. Thanks for all the help!

santikka · 2022-08-23T14:02:17Z

@mpadge No worries, glad to help! I will add the missing return values soon.

santikka · 2022-08-24T07:20:04Z

@mpadge About the return values: the system does not seem to recognize function aliases. For example, in the above, obs and aux are aliases of dynamiteformula and thus implicitly have the same return value. If I explicitly add the return tag to these aliases as well, the resulting documentation looks rather silly with regard to the output value:

Value
A dynamiteformula object.

A dynamiteformula object.

A dynamiteformula object.
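
For context, a minimal roxygen2 sketch of the alias pattern described above. The functions `toyformula()` and `toy_obs()` are hypothetical stand-ins, not dynamite's source. With a shared `@rdname`, only the main function carries `@return`, so the rendered help page has a single Value entry. Whether an automated check then treats the alias as documented is a separate question.

```r
#' Build a formula object (hypothetical stand-in for the main constructor)
#'
#' @param formula A model formula.
#' @return A `toyformula` object.
#' @rdname toyformula
#' @export
toyformula <- function(formula) {
  structure(list(formula = formula), class = "toyformula")
}

# The alias shares the help page above and adds no second @return tag
#' @rdname toyformula
#' @export
toy_obs <- toyformula

example_formula <- toy_obs(y ~ x)
```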

adamhsparks · 2022-08-25T13:30:21Z

@ropensci-review-bot assign @noamross as editor

ropensci-review-bot · 2022-08-25T13:30:23Z

Assigned! @noamross is now the editor

noamross · 2022-09-06T22:56:30Z

@ropensci-review-bot seeking reviewers

ropensci-review-bot · 2022-09-06T22:56:32Z

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/554_status.svg)](https://github.com/ropensci/software-review/issues/554)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

noamross · 2022-09-06T22:57:35Z

@ropensci-review-bot assign @nicholasjclark as reviewer

ropensci-review-bot · 2022-09-06T22:57:37Z

@nicholasjclark added to the reviewers list. Review due date is 2022-09-27. Thanks @nicholasjclark for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot · 2022-09-06T22:57:39Z

@nicholasjclark: If you haven't done so, please fill this form for us to update our reviewers records.

nicholasjclark · 2022-09-21T00:44:18Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you have (had) with the package authors.
☒ As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

☒ A statement of need: clearly stating problems the software is designed to solve and its target audience in README
☒ Installation instructions: for the development version of package and any non-standard dependencies in README
☒ Vignette(s): demonstrating major functionality that runs successfully locally
☒ Function Documentation: for all exported functions
☒ Examples: (that run successfully locally) for all exported functions
☒ Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

☒ Installation: Installation succeeds as documented.
☒ Functionality: Any functional claims of the software have been confirmed.
☒ Performance: Any performance claims of the software have been confirmed.
☐ Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
☒ Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 5

☒ Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (“rev” role) in the package DESCRIPTION file.

Review Comments

This is a very nicely documented package that provides a rich set of functions all tailored towards a single purpose: estimating complex panel models and interrogating the resulting models. The authors make good arguments as to why the package is needed, how it extends existing software and who the target audience is. I have not used these models personally, so I found it a little difficult to understand how I would go about implementing them for my own purposes. On this note, I do feel that the planned accompanying papers will be essential for introducing unfamiliar users to the package. Some extra detail in the vignette and readme could help to provide a bit more context about what the parameters mean and how the models can be interpreted, but overall this does not detract from the soundness of the package itself.

The only issue I find in general testing is that I receive a single failure during unit testing:

Error (test-recovery.R:22:3): parameters for the linear regression are recovered as with lm Error in do.call("dynamite", list(dformula = x, data = data, group = group, time = time, debug = list(no_compile = TRUE), ...)): argument “group” is missing, with no default.

All examples pass on my machine, and they are appropriate for demonstrating the functionality of the package. I do however have a few general notes:

There appears to be a typo in the vignette on line 383 due to a missing closing parenthesis: change `mutate(invest = inv, firm = factor(firm)` to `mutate(invest = inv, firm = factor(firm))`

On line 398 in the vignette I receive the following error: unused argument (random_intercept = TRUE)

On line 514 in the vignette, the text states that the output is a two-component list, but in my run I get a dataframe

Some guidelines for contributing could be listed explicitly if the authors would welcome community contributions

The package has quite a few dependencies that make installation a lengthy process. Could any of this be streamlined? Also, the reliance on the newest version of R could be a deterrent for some users. If dplyr and magrittr are used then why is the native pipe necessary?

I see no reason why the authors shouldn’t allow for cmdstanr (in addition to rstan) to be a backend option for sampling. This would allow a broader range of users to work with the package, as Cmdstan is usually ahead of rstan in terms of features (not to mention much better performance on most systems)

The options for updating priors and for generating Stan code and data objects are incredibly useful and will allow users to modify the files to suit their analyses. However some additional annotations of the Stan code would be helpful to give users better familiarity with the programs, as would a worked example to illustrate how the code could be modified and run outside of dynamite. I also find annotations to be a bit lacking throughout most of the functions. Again this doesn’t harm performance or tidiness of the package but it makes it challenging for users (such as myself) who might want to adapt some of the code for their own workflows and packages. For example, what is the purpose of each chain in the long pipe used in the as_draws_df.dynamitefit function? A few simple annotations would clarify what calculations / manipulations are taking place

It is not clear how users can simulate from the prior to help inform model development. This would be a useful addition to vignettes / examples as it is an essential component of a modern Bayesian workflow

mcmc_diagnostics function is very useful for a quick overview of estimator performance

Very impressive and thorough set of unit tests

Each function appears to have a standalone usage that is well documented

Links to the code used for creation of example datasets are excellent tools for demonstrating how users can implement simulations in their own workflows
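
To make the note on prior simulation concrete, here is a minimal base R sketch of a prior predictive simulation for a simple lag-1 Gaussian model. It does not use dynamite's API. The priors, the model form, and all names (`n_draws`, `alpha`, `beta`, `sigma`, `prior_summary`) are assumptions chosen only for illustration.

```r
set.seed(42)

n_draws <- 200   # number of prior draws
n_time  <- 10    # time points per simulated series

# Draw parameters from illustrative (assumed) priors
alpha <- rnorm(n_draws, mean = 0, sd = 1)
beta  <- rnorm(n_draws, mean = 0, sd = 0.5)
sigma <- rexp(n_draws, rate = 1)

# Simulate one series per prior draw from y_t = alpha + beta * y_{t-1} + noise
y <- matrix(0, nrow = n_draws, ncol = n_time)
y[, 1] <- rnorm(n_draws, mean = alpha, sd = sigma)
for (t in 2:n_time) {
  y[, t] <- rnorm(n_draws, mean = alpha + beta * y[, t - 1], sd = sigma)
}

# Summarise the range of outcomes the priors imply at each time point
prior_summary <- apply(y, 2, quantile, probs = c(0.05, 0.5, 0.95))
```

If the intervals in `prior_summary` cover values that are implausible for the application, the priors are likely too wide, which is the kind of feedback a prior simulation example in the vignette could provide.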

noamross · 2022-09-21T14:52:49Z

Thank you for your review, @nicholasjclark! I note you used the non-statistical reviewer template. Would you be able to use the statistical one at https://stats-devguide.ropensci.org/pkgreview.html#pkgrev-template? My apologies, it looks like the one linked above is dead so you probably searched out and found that one.

I am still seeking a second reviewer, @santikka. I suggest limiting updates until you have inputs from both.

ropensci-review-bot · 2022-09-29T13:41:51Z

@LucyMcGowan added to the reviewers list. Review due date is 2022-10-20. Thanks @LucyMcGowan for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot · 2022-09-29T13:41:53Z

@LucyMcGowan: If you haven't done so, please fill this form for us to update our reviewers records.

ropensci-review-bot · 2022-10-18T13:41:57Z

📆 @LucyMcGowan you have 2 days left before the due date for your review (2022-10-20).

santikka · 2022-11-21T13:45:20Z

Hi @noamross! Would it be possible to get a status update regarding the review process? It has now been over one month since the second review deadline.

LucyMcGowan · 2022-11-21T13:46:54Z

I'm so sorry I'm so behind! Hoping to wrap this up soon.

LucyMcGowan · 2022-12-06T16:11:43Z

Package Review

As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Compliance with Standards

This package complies with a sufficient number of standards for a gold badge
This grade of badge is the same as what the authors wanted to achieve

The authors identify 42 standards that they deem "Non-applicable". I agree with these designations; the majority describe methods that are out of scope for the project or not appropriate given the methods.

The authors complied with the remaining 129 standards. In particular, the documentation and testing in this package are very thorough.

For packages aiming for silver or gold badges:

This package extends beyond minimal compliance with standards in the following ways: (please describe)

General Review

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Community guidelines including contribution guidelines in the README or CONTRIBUTING
The documentation is sufficient to enable general use of the package beyond one specific use case

I did find one TODO left in the lfactor documentation (#TODO definition and constraint on lambdas) that maybe was meant to be updated?

Algorithms

The algorithms are encoded well. The language is appropriate, and all tests passed on my machine.

Testing

This package has extensive testing and all tests passed on my machine.

The examples and vignette, however, are not currently compiling. For example, when I run the first example of the dynamite function, I get the following error:

library(dynamite)
fit <- dynamite(
       dformula = obs(y ~ -1 + varying(~x), family = "gaussian") +
         lags(type = "varying") +
         splines(df = 20), gaussian_example, "id", "time",
       chains = 1,
       refresh = 0
     )
#> Error in `[.data.table`(data, idx, cl, env = list(cl = cl)): unused argument (env = list(cl = cl))

Created on 2022-12-06 by the reprex package (v2.0.1)

EDIT: Updating to the development version of the data.table package fixed this. However, when I tried to run data.table::update.dev.pkg() as instructed in the README, I got the following message: Error: 'update.dev.pkg' is not an exported object from 'namespace:data.table'.

Visualisation (where appropriate)

Do visualisations aid the primary purposes of statistical interpretation of results?
I am able to see the visualizations in the README, and they look good. Unfortunately, I am not able to compile the vignette due to the error above. EDIT: I was able to compile the vignette after updating to the development version of data.table; the visualizations there are also good.
Are there any aspects of visualisations which could risk statistical misinterpretation?
No

Package Design

Is the package well designed for its intended purpose?
Yes
In relation to External Design: Do exported functions and the relationships between them enable general usage of the package?
Yes
In relation to External Design: Do exported functions best serve inter-operability with other packages?
Yes
In relation to Internal Design: Are algorithms implemented appropriately in terms of aspects such as efficiency, flexibility, generality, and accuracy?
Yes
In relation to Internal Design: Could ranges of admissible input structures, or form(s) of output structures, be expanded to enhance inter-operability with other packages?
The output seems to work well with common functions, like those from the tidyverse.

Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 4

Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

santikka · 2022-12-07T08:33:51Z

@LucyMcGowan Thank you for the review!

Regarding the error, dynamite currently requires the development version of the data.table package (which can be installed via data.table::update.dev.pkg()). There is some confusion about the version numbers, since the required version 1.14.3 of data.table is already on CRAN, but the particular feature used by dynamite was not included in the CRAN release (see Rdatatable/data.table#5538). We will update the README accordingly to avoid confusion.
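The version mismatch described above can be checked directly rather than inferred from the version number. The following is a minimal sketch, assuming the missing feature is the `env` argument of data.table's `[` method, as the error message in the review suggests; `has_env_arg()` is a hypothetical helper name, not part of dynamite or data.table.

```r
# Hypothetical helper: TRUE if function `f` has a formal argument named `env`.
has_env_arg <- function(f) {
  "env" %in% names(formals(f))
}

# Only inspect data.table if it is installed; nothing is installed or updated.
if (requireNamespace("data.table", quietly = TRUE)) {
  # Look up the `[` method for data.table objects inside the package namespace.
  subset_method <- get("[.data.table", envir = asNamespace("data.table"))
  message(
    "data.table ", as.character(utils::packageVersion("data.table")),
    if (has_env_arg(subset_method)) " has" else " lacks",
    " the `env` argument of `[.data.table`"
  )
}
```

If the message reports that the argument is lacking, the installed data.table build would be expected to reproduce the "unused argument" error shown in the review.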

nicholasjclark · 2022-12-08T00:40:38Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you may have (had) with the package authors (or otherwise remove this statement)
☒ As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Compliance with Standards

☒ This package complies with a sufficient number of standards for a (bronze/silver/gold) badge
☒ This grade of badge is the same as what the authors wanted to achieve

The following standards currently deemed non-applicable (through tags of @srrstatsNA) could potentially be applied to future versions of this software: (Please specify)

Please also comment on any standards which you consider either particularly well, or insufficiently, documented.

For packages aiming for silver or gold badges:

☒ This package extends beyond minimal compliance with standards in the following ways: (please describe)

The authors have been able to comply with 129 standards, with only 42 N/A standards, across the “Bayesian and Monte Carlo” and “Regression and Supervised Learning” categories, and the function design is very well compartmentalised in my view.

General Review

Documentation

The package includes all the following forms of documentation:

☒ A statement of need clearly stating problems the software is designed to solve and its target audience in README
☒ Installation instructions: for the development version of package and any non-standard dependencies in README
☒ Community guidelines including contribution guidelines in the README or CONTRIBUTING
☒ The documentation is sufficient to enable general use of the package beyond one specific use case

The following sections of this template include questions intended to be used as guides to provide general, descriptive responses. Please remove this, and any subsequent lines that are not relevant or necessary for your final review.

Algorithms

How well are algorithms encoded?
Is the choice of computer language appropriate for that algorithm, and/or envisioned use of package?
Are aspects of algorithmic scaling sufficiently documented and tested?
Are there any aspects of algorithmic implementation which could be improved?

As the maintainers have now allowed for cmdstanr to be used as a backend, any efficiency updates can generally be incorporated automatically as long as users keep their CmdStan installation and cmdstanr package up to date. Perhaps some instructions to users in the README could help to clarify this.
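As an illustration of what such instructions might cover, here is a minimal sketch that reports which backend versions are installed. The helper name `backend_versions()` is hypothetical; it relies only on `utils::packageVersion()` and `cmdstanr::cmdstan_version()`, and it deliberately does not install or update anything.

```r
# Hypothetical helper: report installed backend versions, using NA for
# anything that is not available on this machine.
backend_versions <- function() {
  pkg_version <- function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }
  cmdstan <- NA_character_
  if (requireNamespace("cmdstanr", quietly = TRUE)) {
    # Ask cmdstanr for the CmdStan version; any failure leaves the NA in place.
    found <- tryCatch(
      cmdstanr::cmdstan_version(error_on_NA = FALSE),
      error = function(e) NULL
    )
    if (!is.null(found)) cmdstan <- as.character(found)
  }
  c(
    rstan = pkg_version("rstan"),
    cmdstanr = pkg_version("cmdstanr"),
    cmdstan = cmdstan
  )
}

print(backend_versions())
```

Users who see an outdated CmdStan version could then update it through cmdstanr before fitting models with the cmdstanr backend.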

Testing

Regardless of actual coverage of tests, are there any fundamental software operations which are not sufficiently expressed in tests?
Is there a need for extended tests, or if extended tests exists, have they been implemented in an appropriate way, and are they appropriately documented?

Very impressive and thorough set of unit tests

Visualisation (where appropriate)

Do visualisations aid the primary purposes of statistical interpretation of results?
Are there any aspects of visualisations which could risk statistical misinterpretation?

Visualisations make use of ggplot objects, which are very familiar to most potential users. There is no risk of misinterpretation in my view.

Package Design

Is the package well designed for its intended purpose?
In relation to External Design: Do exported functions and the relationships between them enable general usage of the package?
In relation to External Design: Do exported functions best serve inter-operability with other packages?
In relation to Internal Design: Are algorithms implemented appropriately in terms of aspects such as efficiency, flexibility, generality, and accuracy?
In relation to Internal Design: Could ranges of admissible input structures, or form(s) of output structures, be expanded to enhance inter-operability with other packages?

Function designs and levels of documentation are excellent, and the returned structures will integrate very well with popular post-processing packages designed to work with stanfit objects. The algorithms make use of Stan, which is the most efficient implementation of MCMC in the R programming environment. I see no issues with the way the package is designed.

☒ Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 5

☒ Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (“rev” role) in the package DESCRIPTION file.

LucyMcGowan · 2022-12-08T18:46:32Z

Thank you! I updated to the dev version of data.table and was able to compile everything. I did try to follow the instructions in the README by running data.table::update.dev.pkg() but got the following message: Error: 'update.dev.pkg' is not an exported object from 'namespace:data.table'. I have updated my review above. Thank you!

santikka · 2022-12-08T19:41:42Z

@LucyMcGowan It seems that at some point the function for installing the development version of data.table was renamed to update_dev_pkg, apologies! The README has been corrected accordingly.
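Because the updater has gone by two names, a script can look up whichever one the installed data.table exports instead of hard-coding either. This is a minimal sketch; `pick_updater()` is a hypothetical helper, and the actual update call is left commented out so that nothing is installed by running the example.

```r
# Hypothetical helper: given a package's exported names, return the name of
# the development-version updater, preferring the newer spelling.
pick_updater <- function(exports) {
  candidates <- c("update_dev_pkg", "update.dev.pkg")
  hit <- candidates[candidates %in% exports]
  if (length(hit) == 0L) NA_character_ else hit[[1L]]
}

if (requireNamespace("data.table", quietly = TRUE)) {
  updater <- pick_updater(getNamespaceExports("data.table"))
  message("Updater exported by the installed data.table: ", updater)
  # To actually update, one could call the function by name, for example:
  # getExportedValue("data.table", updater)()
}
```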

santikka · 2022-12-09T08:31:39Z

Based on the reviews, we have taken some additional steps to make the package installation more straightforward and the requirements less strict:

The package no longer depends on dplyr or tidyr, and the internal package code that previously used these packages is now written entirely using data.table instead. dplyr and tidyr are only used in examples and tests, and are now included as 'Suggests' instead.
The package no longer depends on R version 4.1.0. The native pipe |> is no longer used internally, and is used only in the examples conditionally on the installed R version of the user.
The categorical distribution is now also supported for older rstan and cmdstanr versions.
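One way to use the native pipe only when the running R version supports it is sketched below. This is an illustration, not necessarily how dynamite's examples are written: the piped code is kept as a string so that parsers older than R 4.1.0 never see `|>`, and `supports_native_pipe()` and `run_example()` are hypothetical helper names.

```r
# TRUE when the given R version understands the native pipe (added in R 4.1.0).
supports_native_pipe <- function(version = getRversion()) {
  version >= "4.1.0"
}

# Hypothetical helper: evaluate `piped_code` (a string) on new enough R,
# otherwise call `fallback`, an equivalent function written without the pipe.
run_example <- function(piped_code, fallback) {
  if (supports_native_pipe()) {
    eval(parse(text = piped_code))
  } else {
    fallback()
  }
}

total <- run_example(
  piped_code = "c(1, 2, 3) |> sum()",
  fallback = function() sum(c(1, 2, 3))
)
print(total)
```

Both branches compute the same sum, so the result does not depend on the installed R version.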

noamross · 2022-12-09T19:43:56Z

@ropensci-review-bot submit review #554 (comment) time 5

noamross · 2022-12-09T19:44:26Z

@ropensci-review-bot submit review #554 (comment) time 4

noamross · 2022-12-09T19:56:24Z

Thank you all for bearing with us and following up! @LucyMcGowan and @nicholasjclark, you seem to have indicated in your follow-ups, but to be unambiguous, do the updates resolve any outstanding issues for acceptance at gold level?

LucyMcGowan · 2022-12-09T20:01:19Z

Yes, I approve acceptance at the gold level. Thanks @noamross!

nicholasjclark · 2022-12-13T03:38:34Z

> Thank you all for bearing with us and following up! @LucyMcGowan and @nicholasjclark, you seem to have indicated in your follow-ups, but to be unambiguous, do the updates resolve any outstanding issues for acceptance at gold level?

Yes, I approve acceptance as well. Thanks very much.

noamross · 2022-12-13T22:39:07Z

@ropensci-review-bot approve dynamite

ropensci-review-bot · 2022-12-13T22:39:11Z

Approved! Thanks @santikka for submitting and @nicholasjclark, @LucyMcGowan for your reviews! 😁

To-dos:

Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You will need to enable two-factor authentication for your GitHub account.
This invitation will expire after one week. If it happens write a comment @ropensci-review-bot invite me to ropensci/<package-name> which will re-send an invitation.
After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
Fix all links to the GitHub repo to point to the repo under the ropensci organization.
Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
- deactivate the automatic deployment you might have set up
- remove styling tweaks from your pkgdown config but keep that config file
- replace the whole current pkgdown website with a redirecting page
- replace your package docs URL with https://docs.ropensci.org/package_name
- In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
Fix any links in badges for CI and coverage to point to the new repository URL.
Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).
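As a sketch of what this could look like, the snippet below builds an author list with `utils::person()`, using the author names from the submitted DESCRIPTION and placeholder reviewer names. The reviewer entries are assumptions for illustration only; real names and any comments should be added with the reviewers' consent.

```r
# Author entries follow the submitted DESCRIPTION; reviewer entries use
# placeholder names (hypothetical) and the "rev" role.
authors <- c(
  person("Santtu", "Tikka", role = c("aut", "cre")),
  person("Jouni", "Helske", role = "aut"),
  person("First", "Reviewer", role = "rev"),
  person("Second", "Reviewer", role = "rev")
)

# Collect every role across all entries and count the reviewers.
all_roles <- unlist(authors$role)
n_reviewers <- sum(all_roles == "rev")
print(n_reviewers)
```

The same `person()` calls would go inside the Authors@R field of the DESCRIPTION file.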

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practices and tips. This chapter starts the third section, which covers guidance for after onboarding (with advice on releases, package marketing, and GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

santikka · 2022-12-14T05:55:00Z

@ropensci-review-bot finalize transfer of dynamite

ropensci-review-bot · 2022-12-14T05:55:03Z

Transfer completed.
The dynamite team is now owner of the repository and the author has been invited to the team

mpadge mentioned this issue Aug 18, 2022

Add "Remotes" field to DESC ropensci/dynamite#46

Closed

adamhsparks added the 0/editorial-team-prep label Aug 18, 2022

mpadge mentioned this issue Aug 23, 2022

August news ropensci/roweb3#375

Merged

mpadge mentioned this issue Aug 24, 2022

New Check: All .Rd files have \value tags ropensci-review-tools/pkgcheck#157

Closed

ropensci-review-bot assigned noamross Aug 25, 2022

ropensci-review-bot added 1/editor-checks and removed 0/editorial-team-prep labels Aug 25, 2022

ropensci-review-bot added 2/seeking-reviewer(s) and removed 1/editor-checks labels Sep 6, 2022

mpadge added the stats label Sep 20, 2022

helske mentioned this issue Sep 21, 2022

get_* functions do not work without group argument ropensci/dynamite#48

Closed

ropensci-review-bot added 3/reviewer(s)-assigned and removed 2/seeking-reviewer(s) labels Sep 29, 2022

noamross added 4/review(s)-in-awaiting-changes 5/awaiting-reviewer(s)-response and removed 3/reviewer(s)-assigned labels Dec 9, 2022

noamross removed the 4/review(s)-in-awaiting-changes label Dec 9, 2022

ropensci-review-bot added 6/approved-gold-v0.2 Statistical software grade 6/approved and removed 5/awaiting-reviewer(s)-response labels Dec 13, 2022

ropensci-review-bot closed this as completed Dec 13, 2022
