-
Notifications
You must be signed in to change notification settings - Fork 0
Part IV Function Reference 7
← Part IV — Exhaustive Function Reference (6/7) · gdpar Wiki Home · Part V — Stan Templates (1/3) →
Purpose
Pre-flight diagnostic that produces a z-like statistic measuring whether divergent MCMC transitions are disproportionately concentrated in regions of draw-space where the log prior scale (log_sigma) and the raw NCP normalization (raw_norm) co-vary. A large magnitude suggests that the divergences are structurally linked to the parametrization geometry, informing the CP/NCP decision.
Arguments
| Argument | Type | Meaning |
|---|---|---|
log_sigma |
numeric vector | Per-draw log of the prior scale parameter (e.g. log(sigma_a) or log(sigma_W)). |
raw_norm |
numeric vector | Per-draw raw NCP normalization quantity (e.g. a column of W_raw or a_raw). |
divergent |
logical or integer vector | Per-draw divergence indicator; used as a logical subset index into the per-draw arrays. |
Mathematics
Let
Define the per-draw interaction score:
Restrict to divergent draws and compute:
where
The returned statistic is:
Returns
A numeric scalar (the statistic NA_real_ if any of the following hold: length mismatch among the three inputs; sd(log_sigma) or sd(raw_norm) is NA or n_div < 1L); sd(g) is NA or
Notes
- No S3 dispatch; a plain closure.
- The
divergentargument is used directly as a logical index (g[divergent]), so it must be coercible to logical. Integer0/1vectors work because R recycles them as logical. - The sign convention
$g = -z_\sigma z_r$ means that draws where both standardized quantities are large in the same direction contribute negatively; the centering by$\bar{g}$ removes the baseline association.
Purpose
Generates resampling indices for a chain-aware block bootstrap. Contiguous blocks of block_size draws are resampled within each chain (never across chains), producing an integer vector of length n_draws that can be used to subset a draws matrix. This mitigates MCMC autocorrelation that would cause a naive i.i.d. bootstrap to underestimate standard errors.
Arguments
| Argument | Type | Meaning |
|---|---|---|
n_draws |
integer (coerced) | Total number of draws across all chains. |
n_chains |
integer (coerced) | Number of chains. |
block_size |
integer (coerced, default 10L) |
Length of each contiguous block. Must be in |
Mathematics
Let
- Define the chain's global start offset:
$\text{start}_c = (c-1)\,D + 1$ . - Sample
$B$ block-start positions uniformly with replacement from$\{1, 2, \dots, D - L + 1\}$ . - For the
$b$ -th sampled start$s_b$ , the block contributes indices$\{\text{start}_c + s_b - 1, \dots, \text{start}_c + s_b + L - 2\}$ . - Concatenate all
$B \times L$ indices and truncate to the first$D$ entries.
The final output is the concatenation across all chains, yielding exactly
Returns
An integer vector of length n_draws containing 1-based indices into the original draws matrix.
Notes
- Raises an error of class
"gdpar_input_error"(viagdpar_abort) ifn_draws %% n_chains != 0L. - Raises an error of class
"gdpar_input_error"ifblock_size < 1Lorblock_size > draws_per_chain. The error payload includesdata = list(block_size = ..., draws_per_chain = ...). - Assumes chain-first row ordering: chain 1 iterations 1…D, then chain 2 iterations 1…D, etc.
- Because
$B = \lceil D/L \rceil$ , the raw concatenated block indices may exceed$D$ ; the truncationchain_idx[seq_len(draws_per_chain)]handles this, meaning the last block may be partially represented. - Uses
sample.intwithreplace = TRUE, so the same block start can appear multiple times.
preflight_info_ratio_t_from_fit(fit, component, n_coords, n_chains, theta_anchor = NULL, dim_W = NULL, d = NULL, tau_cp = 5, tau_ncp = 2)
Purpose
Extracts the effective coefficients and per-draw reference scales from a cmdstanr pre-flight fit (Path B′) and delegates to preflight_info_ratio_t to compute the info-ratio z-statistics. Handles two components: "a" (scalar/vector coefficients) and "W" (basis-function coefficients requiring reference-point differencing).
Arguments
| Argument | Type | Meaning |
|---|---|---|
fit |
cmdstanr fit object | The NCP pre-flight model fit. |
component |
character | Either "a" or "W". |
n_coords |
integer | Number of free coordinates component == "a". Ignored for "W". |
n_chains |
integer | Number of chains in the underlying sampler. |
theta_anchor |
numeric scalar | Anchor value component == "W". |
dim_W |
integer | Basis dimension component == "W". |
d |
integer | Output dimension of component == "W". |
tau_cp |
numeric scalar (default 5) | CP threshold, forwarded to preflight_info_ratio_t. |
tau_ncp |
numeric scalar (default 2) | NCP threshold, forwarded to preflight_info_ratio_t. |
Mathematics
Component "a":
for
Component "W":
Let
The effective coefficient at coordinate
The per-draw reference scale (conditional prior SD of
These are then passed to preflight_info_ratio_t.
Returns
A list with elements t_cp and t_ncp (the z-statistics from preflight_info_ratio_t). Returns list(t_cp = NA_real_, t_ncp = NA_real_) on any of the following failure conditions:
-
component == "a":draws_coefordraws_sigmaisNULLor has zero columns; anysigma_vecisNAor$\le 0$ . -
component == "W":theta_anchor,dim_W, ordisNULL; any ofdraws_theta,draws_sigma,draws_WisNULLor has zero columns; anytheta_vecorsigma_vecisNA; anysigma_vec$\le 0$ ;ncol(draws_W) != dim_W * d; anybasis_normis non-finite or$\le 0$ . -
componentis neither"a"nor"W": falls through to the finallist(t_cp = NA_real_, t_ncp = NA_real_).
Notes
- All
fit$draws()calls are wrapped intryCatch; extraction errors yieldNULL, which triggers the NA-list return. - For
"W", theW_rawdraws matrix is reshaped into a 3-D array of dimensionsc(n_draws, dim_W, d)viaarray(). The column ordering of the draws matrix is assumed to bedim_W * dwith the first index (basis$k$ ) varying fastest, consistent with Stan's column-major flattening ofW_raw[k, j]. - The
vapplyforbasis_diffcomputestheta_vec^k - theta_anchor^kfor each$k$ ; if the result is not a matrix (e.g.,dim_W == 1), it is explicitly reshaped ton_drawsrows. -
tau_cpandtau_ncpare forwarded unchanged topreflight_info_ratio_t.
preflight_info_ratio_t(effective_coef, reference_scale_per_draw, n_chains, tau_cp = 5, tau_ncp = 2, alpha = 0.025, n_boot = 200L, block_size = 10L)
Purpose
Core engine of the Path B′ info-ratio filter. Computes the mean log information ratio across coordinates on the full data, estimates its standard error via a chain-aware block bootstrap, and returns two z-statistics testing against qnorm(1 - alpha) to decide parametrization.
Arguments
| Argument | Type | Meaning |
|---|---|---|
effective_coef |
numeric matrix (n_draws × n_coords) |
Per-draw effective coefficients. Coerced to matrix if not already. |
reference_scale_per_draw |
numeric vector (length n_draws) |
Per-draw reference scale; must be strictly positive. |
n_chains |
integer | Number of chains; n_draws must be divisible by this. |
tau_cp |
numeric scalar (default 5) | CP information-ratio threshold. |
tau_ncp |
numeric scalar (default 2) | NCP information-ratio threshold. |
alpha |
numeric scalar (default 0.025) | Retained for documentation only; not used in computation. The caller is responsible for comparing the returned z-statistics to qnorm(1 - alpha). |
n_boot |
integer (default 200L) |
Number of bootstrap replicates. Must be |
block_size |
integer (default 10L) |
Block length for the chain-aware bootstrap. Clamped to min(block_size, n_draws %/% n_chains). |
Mathematics
Full-data statistic. For each coordinate
The point estimate is the cross-coordinate mean:
Bootstrap standard error. For each replicate
- Draw indices
$\mathcal{I}_b$ viablock_bootstrap_indices(n_draws, n_chains, block_size). - Compute
$\ell_{j,b} = \log(\bar{r}_b / \mathrm{sd}(\text{eff}_{b,j}))$ where$\bar{r}_b$ and$\mathrm{sd}(\text{eff}_{b,j})$ are computed on the resampled rows. -
$m_b = \mathrm{mean}_j(\ell_{j,b})$ .
Replicates with non-finite
Z-statistics:
Returns
A list list(t_cp = ..., t_ncp = ...). Returns list(t_cp = NA_real_, t_ncp = NA_real_) under any of the following conditions:
-
effective_coefhas fewer than 1 row or 1 column after coercion. -
length(reference_scale_per_draw) != n_draws. - Any element of
effective_coefisNA. - Any element of
reference_scale_per_drawisNAor$\le 0$ . - Any coordinate's full-data
sdisNAor$\le 0$ . -
full_mean_refis non-finite or$\le 0$ . -
m(the mean log info ratio) is non-finite. -
n_chains < 1Lorn_draws %% n_chains != 0L. -
block_sizeafter clamping is< 1L. -
n_boot < 2L. - Fewer than 2 finite bootstrap means remain after filtering.
-
se(sd of bootstrap means) is non-finite or$\le 0$ .
Notes
-
block_sizeis silently clamped:block_size <- min(as.integer(block_size), n_draws %/% n_chains). If this yields a value< 1L, the function returns NA. - Within the bootstrap loop, individual replicates that produce
NAor non-positivesd_bor non-finite/non-positivemean_ref_bare recorded asNA_real_and subsequently filtered out byboot_means[is.finite(boot_means)]. - The
alphaparameter has no effect on the returned values; it exists for API symmetry and documentation. The caller must explicitly comparet_cpandt_ncptoqnorm(1 - alpha). - The names
t_cpandt_ncpare kept for symmetry with the Path B implementation; conceptually they are z-statistics under the asymptotic regime of largen_boot. - No side effects; no S3 dispatch.
resolve_parametrization(parametrization, parametrization_a, parametrization_W, prior, stan_data, amm, preflight_seed, verbose)
Purpose
Central dispatcher that reconciles three layers of parametrization specification — a single global parametrization switch, optional per-parameter overrides (parametrization_a, parametrization_W), and an automatic preflight sampler — into two concrete Boolean flags (cp_a, cp_W) plus a metadata record explaining how each flag was obtained. It is the bridge between user-facing parametrization arguments and the downstream Stan model configuration, deciding whether the centered (cp) or non-centered (ncp) parametrization is used for the a and W parameter blocks.
Arguments
-
parametrization—character(1). Global switch; recognized values are"ncp"(non-centered),"cp"(centered), and"auto"(defer to preflight). Any other value causesswitch()to returnNULL, which propagates as an edge case (see Notes). -
parametrization_a—character(1)orNULL. Per-parameter override for theablock. When non-NULL, equality to"cp"is tested; any other string is treated as non-cp. -
parametrization_W—character(1)orNULL. Per-parameter override for theWblock, interpreted analogously toparametrization_a. -
prior— passed through topreflight_parametrization()when preflight is required. Type/meaning determined by that function. -
stan_data— passed through topreflight_parametrization()when preflight is required. -
amm— passed through topreflight_parametrization()when preflight is required. -
preflight_seed— passed through as thepreflight_seedargument ofpreflight_parametrization()when preflight is required. -
verbose— passed through as theverboseargument ofpreflight_parametrization()when preflight is required.
Mathematics
The function implements a three-tier precedence resolution. Define the user-level indicator
and analogously
Preflight is invoked iff at least one block is unresolved at both tiers:
The final flag for each block follows the precedence user > preflight > global:
with
again symmetric in
Returns
A named list with three elements:
-
cp_a—logical(1). Final centered-parametrization flag for theablock, coerced viaas.logical(). -
cp_W—logical(1). Final centered-parametrization flag for theWblock, coerced viaas.logical(). -
meta—list. Diagnostics/audit record. Two mutually exclusive shapes:-
Preflight path (
needs_pf_a || needs_pf_W):metaispf$metareturned bypreflight_parametrization(), withmeta$decision_reason_aoverwritten to"user_explicit_a"whencp_a_useris notNA, andmeta$decision_reason_Woverwritten to"user_explicit_W"whencp_W_useris notNA. The non-overwritten reason retains whateverpreflight_parametrization()set. -
No-preflight path:
metais a freshly constructed list with:used_preflight = FALSEn_divergent = NA_integer_div_pct = NA_real_ebfmi_min = NA_real_t_attribution_a = NA_real_t_attribution_W = NA_real_t_info_cp_a = NA_real_t_info_ncp_a = NA_real_t_info_cp_W = NA_real_t_info_ncp_W = NA_real_-
decision_reason_a∈{"user_explicit_a", "user_global"} -
decision_reason_W∈{"user_explicit_W", "user_global"}
-
Preflight path (
Notes
- The global flag is shared:
cp_W_global <- cp_a_global, soparametrization = "cp"/"ncp"forces both blocks identically unless a per-parameter override intervenes. -
parametrization_a == "cp"(and theWanalogue) is a strict string equality test; supplying e.g."CP","centered", or any non-"cp"value yieldsFALSE, i.e. is silently interpreted as non-centered. No validation or warning is emitted. - If
parametrizationis not one of"ncp","cp","auto",switch()returnsNULL, socp_a_globalandcp_W_globalbecomeNULL. Subsequentis.na(cp_a_global)then returns a zero-lengthlogical, which alters the scalar&&short-circuit behavior inneeds_pf_a/needs_pf_W; this is an unvalidated input path and the resulting control flow is not explicitly guarded. - When preflight runs but only one block was
"auto"-unresolved (the other being user-explicit), preflight is still executed (because the condition is||), andpf$cp_a/pf$cp_Wmay both be consulted; however, the user-explicit block's flag is taken from the user value, and only itsdecision_reason_*is overwritten — the underlying preflight diagnostic fields inmetareflect the full preflight run regardless. -
as.logical()is applied to the finalcp_a/cp_W. For the no-preflight path these are already logical scalars; for the preflight path they are whateverpreflight_parametrization()returned (expected logical), coerced to logical. - No errors are raised directly by this function; any error propagates from
preflight_parametrization(). - No S3 dispatch, no I/O side effects beyond those internal to
preflight_parametrization()(e.g. sampling, console output whenverbose). - The function is not exported (no
@exportvisible); it is an internal resolver invoked during model setup.
Purpose
Returns the canonical default prior expression (in Stan syntax) for a given parameter kind used in the multi-parametric extension of the gdpar package. This internal function is used by generate_stan_code*() when a param_spec carries a prior_canonical_kind that has not been overridden by the user via the priors_by_kind argument of gdpar_prior.
Arguments
-
kind
Type: Character scalar.
Meaning: Identifies the parameter kind. Must be one of the known canonical kinds:"mu","log_sigma","log_phi","logit_p","log_shape","log_nu","logit_pi", or"power_p".
Mathematics
No formula; a simple switch mapping.
Returns
A character scalar containing the Stan-syntax prior expression for the given kind. If an unrecognized kind is provided, the function calls gdpar_abort with an internal error.
Notes
- This function is internal (not exported) and is used during Stan code generation.
- The canonical priors are weakly informative defaults calibrated for covariates standardized to unit variance and outcomes on the natural scale of the family.
- An internal error is raised if
kindis not one of the recognized kinds.
Purpose
Returns the variance of the canonical default prior for a given parameter kind. This internal function is used in the post-fit information ratio diagnostic (D-B1) introduced in sub-phase 8.3.4 of Block 8. The contraction metric compares each K-individual slot's posterior variance against the canonical prior variance returned here.
Arguments
-
kind
Type: Character scalar.
Meaning: Identifies the parameter kind. Must be one of the known canonical kinds.
Mathematics
For each kind, the variance is derived from the canonical prior returned by .gdpar_canonical_prior_for_kind:
- For
mu,logit_p,logit_pi: prior isnormal(0, 2.5), variance =$2.5^2 = 6.25$ . - For
log_sigma,log_phi,log_shape: prior isnormal(0, 1), variance =$1$ . - For
log_nu: prior isnormal(log(10), 1), variance =$1$ (variance of normal depends only on scale). - For
power_p: prior isuniform(1.01, 1.99), variance =$(1.99 - 1.01)^2 / 12$ .
Returns
A numeric scalar with the canonical prior variance. If kind is unrecognized, returns NA_real_.
Notes
- This function is internal (not exported).
- The variance for
power_pis computed as$(b-a)^2/12$ for a uniform distribution on$[a, b]$ .
Purpose
Returns the character vector of all parameter kinds for which a canonical prior is registered. This internal function is used by the validator of priors_by_kind in gdpar_prior.
Arguments
None.
Mathematics
None.
Returns
A character vector containing the eight known canonical kinds: "mu", "log_sigma", "log_phi", "logit_p", "log_shape", "log_nu", "logit_pi", "power_p".
Notes
- This function is internal (not exported) and is used for input validation.
gdpar_prior(theta_ref, sigma_theta_ref, sigma_a, sigma_b, sigma_W, sigma_y, phi, priors_by_kind) -- the real signature.
Purpose
Builds a prior specification object of class gdpar_prior that is consumed by gdpar when path = "bayes". It allows users to specify priors for the population-level and hierarchical parameters of the AMM (Asymmetric Multinomial Model) hierarchical Bayesian model. Defaults are weakly informative on the linear-predictor scale.
Arguments
-
theta_ref
Type: Character scalar (default"normal(0, 2.5)").
Meaning: Prior on the population referencetheta_ref(or its hyperparametermu_theta_refwhen grouping is active) on the linear-predictor scale, in Stan syntax. -
sigma_theta_ref
Type: Character scalar (default"student_t(3, 0, 1)").
Meaning: Prior on the hierarchical scale oftheta_ref[g]across groups (positive-truncated in Stan). Used only when grouping is active. -
sigma_a
Type: Character scalar (default"student_t(3, 0, 1)").
Meaning: Prior on the hierarchical scale of the additive component coefficients (positive-truncated in Stan). -
sigma_b
Type: Character scalar (default"student_t(3, 0, 1)").
Meaning: Prior on the hierarchical scale of the multiplicative contribution to the linear predictor (positive-truncated in Stan). Internally, the model samplesc_b = theta_ref * b_coef; this prior applies toc_b. -
sigma_W
Type: Character scalar (default"student_t(3, 0, 1)").
Meaning: Prior on the hierarchical scale of the modulating component coefficients (positive-truncated in Stan). -
sigma_y
Type: Character scalar (default"student_t(3, 0, 2.5)").
Meaning: Prior on the residual standard deviation for Gaussian families (positive-truncated in Stan). -
phi
Type: Character scalar (default"gamma(2, 0.1)").
Meaning: Prior on the negative-binomial dispersion parameterphi(Stan parametrization: variance =$\mu + \mu^2 / \phi$ ). -
priors_by_kind
Type: Optional named list of character scalars (defaultNULL).
Meaning: Overrides for canonical priors by parameter kind (used when K > 1 in the multi-parametric extension). Each element must be a non-empty character scalar in Stan syntax. Recognized kinds are those from.gdpar_known_canonical_kinds().
Mathematics
No explicit formulas; the function constructs a list of prior strings.
Returns
An object of class gdpar_prior (also inheriting from list) containing the seven legacy character snippets and the priors_by_kind override list.
Notes
- The function validates that each of the seven legacy arguments is a non-empty character scalar. If not, it calls
gdpar_abortwith agdpar_input_error. - If
priors_by_kindisNULL, it is set to an empty list. If provided, it must be a named list with no empty names; each name must be a recognized canonical kind (as returned by.gdpar_known_canonical_kinds). Each element must be a non-empty character scalar. Otherwise,gdpar_abortis called. - The class
gdpar_prioris used by downstream functions (e.g.,gdpar, code generators) to inject prior specifications into the Stan model. - The defaults are calibrated for standardized covariates and outcomes on the natural scale of the family. Scale parameters are declared positive in Stan; the prior strings are interpreted accordingly.
- The object is intended for use with
gdpar(path = "bayes").
Purpose
Returns the effective prior expression for a given parameter kind by first checking for a user-supplied override in prior$priors_by_kind and, if absent, falling back to the canonical default from .gdpar_canonical_prior_for_kind. This internal function is used by the code generator and tests of the canonical-kind override mechanism.
Arguments
-
prior
Type: Agdpar_priorobject.
Meaning: The prior specification object (created bygdpar_prior). -
kind
Type: Character scalar.
Meaning: The parameter kind to resolve.
Mathematics
None; a lookup operation.
Returns
A character scalar containing the Stan-syntax prior expression for the given kind. If the kind is not found in prior$priors_by_kind, the canonical default is returned.
Notes
- This function is internal (not exported).
- It assumes
kindis one of the known canonical kinds; otherwise, the canonical default lookup may fail with an internal error (via.gdpar_canonical_prior_for_kind). - Used during Stan code generation to insert the appropriate prior for each parameter kind in the model.
Purpose Retrieves the specific prior specification for a given component kind from a gdpar_prior object. It first checks for an override in the object's priors_by_kind list. If no override exists, it falls back to the canonical (default) prior for that kind defined elsewhere in the package.
Arguments
-
prior: An object of classgdpar_prior(or a list-like object containing an element namedpriors_by_kind). -
kind: A character string identifying the model component (e.g., parameter group) for which to retrieve the prior.
Returns
The prior specification (the exact structure depends on the canonical/override implementation) for the requested kind. If an override exists in prior$priors_by_kind[[kind]], it is returned; otherwise, the result of .gdpar_canonical_prior_for_kind(kind) is returned.
Notes
- This is an internal function (note the leading
.). It is a utility used to standardize prior lookups, allowing user-specified overrides via thepriors_by_kindslot. - The fallback function
.gdpar_canonical_prior_for_kind()is not defined in this section and must be provided by the package infrastructure.
Purpose An S3 print method for objects of class gdpar_prior. It formats and prints a structured summary of the prior object to the console.
Arguments
-
x: An object of classgdpar_prior. -
...: Additional arguments passed to the generic; ignored in this method.
Returns
Invisibly returns the input object x.
Notes
- This is an exported function (due to the
@exporttag), providing the standardprint()interface forgdpar_priorobjects. - The output lists the values of the following top-level prior parameters, if they exist in
x:theta_ref,sigma_theta_ref,sigma_a,sigma_b,sigma_W,sigma_y, andphi. - If
xcontains a non-null, non-emptypriors_by_kindlist, each element (key-value pair) in that list is printed indented under a "priors_by_kind" header. The keys are the component names (thekindargument in other functions), and the values are their corresponding prior specifications. - The function uses
cat()for direct console output andformat(..., width = ...)for alignment.
Purpose
Classifies a fitted gdpar_fit object into one of three dispatch "paths" so that downstream residual helpers can select the correct data-extraction and array-shaping logic. The three paths mirror the three Stan data-assembly routines in gdpar.R (.assemble_stan_data, .assemble_stan_data_multi, .assemble_stan_data_K) and the three Stan templates.
Arguments
| Argument | Type | Meaning |
|---|---|---|
object |
list (gdpar_fit) |
Fitted model object; must contain scalar integer elements K and p. |
Returns
A length-1 character string:
-
"K_individual"— whenobject[["K"]]is non-NULLand strictly greater than1L. -
"multivariate"— whenKisNULLor≤ 1andobject[["p"]]is non-NULLand strictly greater than1L. -
"scalar"— otherwise (the fall-through default).
Notes
K is checked first and takes precedence over p. If either element is NULL or not greater than 1L, the corresponding branch is skipped. No validation of object structure is performed beyond the is.null / > 1L tests; a malformed object silently falls through to "scalar".
Purpose
Returns the set of stan_id integer codes that identify integer-valued (count or binary) response families. Used by .gdpar_get_y_obs() to decide whether to read y_int or y_real from stan_data.
Arguments
None.
Returns
An integer vector of length 7: c(2L, 3L, 4L, 10L, 11L, 12L, 13L). Per the roxygen documentation these correspond to Poisson, negative-binomial-2, Bernoulli, and the ZIP / ZINB / hurdle-Poisson / hurdle-NB mixture families.
Notes
The mapping from stan_id to family name is not validated inside this helper; callers are expected to use the returned vector for %in% membership tests only.
Purpose
Extracts the observed response vector (or matrix) from a gdpar_fit object's stan_data component, choosing between y_real and y_int according to the family's stan_id.
Arguments
| Argument | Type | Meaning |
|---|---|---|
object |
list (gdpar_fit) |
Fitted model object containing $stan_data, $family, and (for path classification) $K / $p. |
Returns
-
Multivariate path (
p > 1): a numeric matrix of shape$n \times p$ , taken fromstan_data$y_realif it is a non-empty matrix, otherwise fromstan_data$y_int. -
Scalar / K-individual paths: a numeric vector of length
$n$ . Iffamily$stan_idis in.gdpar_integer_family_stan_ids(),as.numeric(sd$y_int)is returned; otherwiseas.numeric(sd$y_real).
Notes
- For the multivariate path,
y_realis preferred overy_int; if neither is a non-NULLmatrix with at least one row, the function callsgdpar_abort()withclass = "gdpar_internal_error"and message"Internal error: multivariate fit lacks both y_real and y_int in stan_data.". - For scalar / K-individual paths, if
object$family$stan_idisNULL,gdpar_abort()is called withclass = "gdpar_internal_error"and message"Internal error: family lacks stan_id; cannot determine y_real vs y_int.". - The
stan_idextraction uses an identicalif/elsebranch for both"K_individual"and"scalar"paths (object$family$stan_idin both cases), so the code is functionally uniform across those two paths despite the explicit branch.
Purpose
Extracts posterior draws of the Stan parameter y_pred from the CmdStanFit / stanfit object stored in object$fit, reshaping the flat draws matrix into the path-appropriate array layout.
Arguments
| Argument | Type | Meaning |
|---|---|---|
object |
list (gdpar_fit) |
Fitted model object; object$fit must support $draws(variables = "y_pred", format = "draws_matrix"). |
Mathematics
For the multivariate path, variable names follow the pattern y_pred[i,k] with
and
For the scalar / K-individual paths, variable names follow y_pred[i] with
Returns
-
Multivariate path: numeric array of shape
$S \times n \times p$ . -
Scalar / K-individual paths: numeric matrix of shape
$S \times n$ .
In both cases NA_real_ and then filled column-by-column.
Notes
- Calls
require_suggested("posterior", "extract posterior y_pred draws")before extraction; if the posterior package is unavailable this will error. - Variable-name parsing uses
regexecwith patterns^y_pred\[(\d+),(\d+)\]$(multivariate) or^y_pred\[(\d+)\]$(scalar / K-individual). If any parsed index isNA,gdpar_abort()is called withclass = "gdpar_internal_error"and a message indicating which path failed. - The dimensions
$n$ and$p$ (multivariate) or$n$ (scalar) are inferred asmax(idx_i)/max(idx_k)from the parsed variable names, not fromobject$stan_data. -
unclass(draws)is used to strip thedraws_matrixclass before column indexing.
Purpose
Returns a single family-name string suitable for dispatching family-specific residual computations (deviance, quantile). For the multivariate path the family is per-coordinate; for the other paths it is the single location family.
Arguments
| Argument | Type | Meaning |
|---|---|---|
object |
list (gdpar_fit) |
Fitted model object. |
coord |
NULL or integer |
Coordinate index selecting a family from object$family$families for the multivariate path. Ignored for scalar / K-individual paths. |
Returns
A length-1 character string:
-
Multivariate path:
object$family$families[[coord]]$name. -
Scalar / K-individual paths:
object$family$name.
Notes
If the path is "multivariate" and coord is NULL, gdpar_abort() is called with class = "gdpar_internal_error" and message "Internal error: multivariate path requires explicit coord index.". No bounds checking on coord is performed; an out-of-range index will propagate a standard subscript out of bounds error from [[.
Purpose
Tests whether a family name corresponds to a discrete (count or binary) distribution, used to switch on uniform jittering in Dunn–Smyth randomized quantile residuals.
Arguments
| Argument | Type | Meaning |
|---|---|---|
name |
character | Family name string (as returned by .gdpar_family_name_for_residuals()). |
Returns
TRUE if name is one of "poisson", "neg_binomial_2", "bernoulli", "zip", "zinb", "hurdle_poisson", "hurdle_neg_binomial_2"; otherwise FALSE.
Notes
Vectorised via %in%, so a character vector of length > 1 returns a logical vector of the same length, although callers typically pass a scalar.
Purpose
Computes a single family-specific deviance residual contribution for each
Arguments
| Argument | Type | Meaning |
|---|---|---|
y |
numeric vector | Observed responses. |
mu |
numeric vector | Predicted means (same length as y). |
family_name |
character (length 1) | Family name string. |
dispersion |
NULL or numeric |
Dispersion / scale parameter (1 when NULL. |
Mathematics
Let .Machine$double.epsfamily_name:
Gaussian (
Poisson / ZIP / hurdle-Poisson (with
Negative-binomial-2 / ZINB / hurdle-NB2 (
Bernoulli (
Beta (
Gamma (
Lognormal (loc-scale):
Student-
Tweedie and unrecognised families (fallback):
Returns
A numeric vector of the same length as y containing the signed deviance residuals.
Notes
- The NB2 deviance is described in the roxygen as "approximated by Poisson deviance with shape inflation factor, conservative"; the implemented formula is the standard NB2 deviance with
$\phi$ as the reciprocal dispersion / shape parameter. - Mixtures and hurdle families that are not explicitly listed (e.g.
zip,zinb,hurdle_poisson,hurdle_neg_binomial_2) are handled by the Poisson or NB2 branches respectively, not by the fallback. - The fallback branch (Tweedie and any unrecognised
family_name) uses a Pearson-like surrogate$r_i = \mathrm{sign}(y_i-\mu_i)\sqrt{2|y_i-\mu_i|}$ , matching the roxygen description. -
epsis computed once as.Machine$double.eps^(1/3). - All vectorised operations rely on
ifelse/pmax/pmin, so inputs of length > 1 are handled element-wise.
Purpose
Computes Bayesian randomized quantile residuals (Dunn & Smyth, 1996) by averaging the conditional CDF over posterior draws of
Arguments
| Argument | Type | Meaning |
|---|---|---|
y_obs |
numeric vector | Observed responses, length |
y_pred_mat |
numeric matrix | Posterior predictive draws of shape |
family_name |
character (length 1) | Family name, used to determine discreteness via .gdpar_family_is_discrete(). |
randomize_seed |
NULL or integer |
If non-NULL, the global RNG state is saved, set.seed(randomize_seed) is called, and the original state is restored on exit. |
Mathematics
For each observation
Discrete families (Dunn–Smyth randomization): draw
where
Continuous families:
Then .Machine$double.eps
Returns
A numeric vector of length
Notes
-
Seed handling: when
randomize_seedis non-NULL, the function checks for.Random.seedinglobalenv(). If it exists, it is captured intoold_seedand restored viaon.exit(assign(".Random.seed", old_seed, envir = globalenv()), add = TRUE). Thenset.seed(randomize_seed)is called. If.Random.seeddoes not exist inglobalenv(), no restoration is registered (theon.exitis inside theif (exists(...))block), butset.seedis still called. - For discrete families, exactly one uniform variate
$v_i$ is drawn per observation viastats::runif(1L), so the randomization is per-observation independent. - The loop over
seq_len(n)is not vectorised for the CDF computation; each observation'syp <- y_pred_mat[, i]column is extracted and compared toy_obs[i]. - Clamping with
epspreventsqnormfrom returning±Infwhen$u_i$ is exactly 0 or 1. - No input validation is performed on the dimensions of
y_pred_mat; ifnrow(y_pred_mat)orncol(y_pred_mat)is inconsistent withlength(y_obs), indexing will produceNAor recycling behaviour.
Purpose
Detects whether the DHARMa package is installed and available, implementing the "Suggests with detect-and-use" pattern (decision E1.A of the 8.3.9 scoping).
Arguments
None.
Returns
TRUE if requireNamespace("DHARMa", quietly = TRUE) succeeds; FALSE otherwise.
Notes
The call is wrapped in requireNamespace with quietly = TRUE, so no warning or message is emitted on failure. The function is a thin predicate intended for use in if (.gdpar_has_dharma()) { ... } guards.
Purpose
Internal helper that constructs a DHARMa simulation object from a fitted gdpar_fit model by forwarding the posterior predictive draws of y_pred and the observed response y to DHARMa::createDHARMa(). Used by the exported wrapper gdpar_dharma_object().
Arguments
-
object: An object of classgdpar_fit. The fitted model from which observed responses and posterior predictive draws are extracted. -
coord:NULLor an integer scalar. For multivariate fits (p > 1), selects the coordinate (column) for which the DHARMa object is built; required (non-NULL) in that case. Ignored for scalar / K-individual paths.
Mathematics
For each observation coord of a multivariate fit). The fitted predicted response is the posterior mean per observation:
These are passed to DHARMa::createDHARMa() with simulatedResponse = t(Y_pred) (draws as columns), observedResponse = y_obs, fittedPredictedResponse = \hat{\mu}, and integerResponse set according to whether the family is discrete.
Returns
A DHARMa::DHARMa simulation object (return value of DHARMa::createDHARMa()).
Notes
- Raises an error of class
gdpar_input_errorif DHARMa is not installed (via.gdpar_has_dharma()), directing the user toresiduals(., type = "quantile")as a fallback. - Raises an error of class
gdpar_input_errorif the fit path is"multivariate"andcoordisNULL. - For multivariate fits,
y_predis indexed asy_pred[, , coord](an$S \times n$ matrix) andy_obsasy_obs[, coord]; for other paths,y_predandy_obsare used directly. - Discreteness of the family is determined by
.gdpar_family_is_discrete(.gdpar_family_name_for_residuals(object, coord = coord)), which controls theintegerResponseflag passed to DHARMa. -
fitted_predis computed viacolMeans(y_pred_mat), i.e. the posterior mean per observation across draws.
residuals.gdpar_fit(object, type = c("quantile", "response", "pearson", "deviance"), coord = NULL, randomize_seed = NULL, ...)
Purpose
Exported S3 method implementing stats::residuals() for objects of class gdpar_fit. Computes posterior-predictive residuals of four types — response, Pearson, deviance, and Bayesian randomized quantile (Dunn-Smyth 1996) — from the y_pred draws emitted by the Stan templates.
Arguments
-
object: An object of classgdpar_fit(validated viaassert_inherits()). -
type: Character scalar matched againstc("quantile", "response", "pearson", "deviance")viamatch.arg(). Default is"quantile". -
coord:NULLor an integer scalar in$[1, p]$ . Only used for multivariate fits. WhenNULL(default), residuals are returned as an$n \times p$ matrix; when specified, a length-$n$ vector for that coordinate is returned. -
randomize_seed:NULLor an integer scalar. When provided, sets the RNG seed used by the randomized quantile residual under discrete families. WhenNULL, the global RNG state is used. -
...: Unused; present for S3 generic compatibility.
Mathematics
Let
-
Response:
$r_i = y_i - \hat{\mu}_i$ . -
Pearson:
$r_i = (y_i - \hat{\mu}_i) / \max(\hat{\sigma}_i, \varepsilon)$ , where$\varepsilon =$ .Machine$double.eps^(1/3). -
Deviance: family-specific, delegated to
.gdpar_deviance_residual(y_obs, mu_hat, family_name); for mixtures (ZIP/ZINB) and Hurdle families a Pearson-like surrogate is used. -
Quantile (Bayesian randomized quantile, Dunn and Smyth 1996): averages the nonparametric ECDF of
y_preddraws at$y_i$ across posterior draws, with uniform jitter on the equality mass when the family is discrete. Under correct specification,$r_i \overset{\text{approx}}{\sim} \mathcal{N}(0,1)$ .
Returns
- Scalar / K-individual paths: a numeric vector of length
$n$ . - Multivariate path with
coord = NULL: a numeric matrix of size$n \times p$ with column namespaste0("dim_", seq_len(p)). - Multivariate path with
coord = k: a numeric vector of length$n$ for coordinate$k$ .
Notes
- Dispatches on the S3 class
gdpar_fitregistered forstats::residuals. - The fit path is determined by
.gdpar_fit_path_class(object). - For multivariate fits with non-
NULLcoord, the function validatescoordis an integer in$[1, p]$ (wherep <- object[["p"]]); otherwise raises an error of classgdpar_input_errorwith a formatted message. - For multivariate fits with
coord = NULL, residuals are computed coordinate-by-coordinate in afor (k in seq_len(p))loop, with the family name retrieved per coordinate via.gdpar_family_name_for_residuals(object, coord = k). - Actual residual computation is delegated to
.gdpar_residuals_dispatch(). -
y_obsandy_predare obtained from.gdpar_get_y_obs(object)and.gdpar_get_y_pred_draws(object)respectively.
Purpose
Internal type-dispatcher that computes residuals for a single (y_obs, y_pred_mat) pair according to type. Called by residuals.gdpar_fit() once the appropriate coordinate slice has been extracted.
Arguments
-
y_obs: Numeric vector of length$n$ . Observed responses for the coordinate being processed. -
y_pred_mat: Numeric matrix of dimensions$S \times n$ . Posterior predictive draws (draws in rows, observations in columns). -
type: Character scalar; one of"response","pearson","deviance","quantile". Nomatch.arg()is performed here (the caller has already matched). -
family_name: Character scalar naming the response family, used for deviance and quantile residuals. -
randomize_seed:NULLor integer scalar; passed through to.gdpar_quantile_residuals_bayesian().
Mathematics
-
"response": returns$y - \text{colMeans}(Y_{\text{pred}})$ , i.e.$r_i = y_i - \hat{\mu}_i$ . -
"pearson": with$\hat{\mu}_i = \text{colMeans}(Y_{\text{pred}})_i$ and$\hat{\sigma}_i = \text{sd}(Y_{\text{pred}, \cdot, i})$ (sample standard deviation across draws), returns
-
"deviance": returns.gdpar_deviance_residual(y_obs, mu_hat, family_name). - otherwise (
"quantile"): returns.gdpar_quantile_residuals_bayesian(y_obs, y_pred_mat, family_name, randomize_seed).
Returns
A numeric vector of length
Notes
- The function uses early
return()statements; the final branch (quantile) is the fall-through default whentypeis not"response","pearson", or"deviance". -
sd_hatis computed viaapply(y_pred_mat, 2L, stats::sd)(per-column sample SD across draws). - The floor
eps <- .Machine$double.eps^(1 / 3)prevents division by zero when a coordinate has zero posterior predictive variance. - No input validation is performed; the caller is responsible for argument checking.
Purpose
Exported function that extracts the posterior predictive draws y_pred emitted by the Stan templates. Independent of the rstantools generic; exported under the name gdpar_posterior_predict() to avoid a hard dependency on rstantools. When rstantools or brms is loaded, posterior_predict() routes to this method via the S3 method registered in NAMESPACE.
Arguments
-
object: An object of classgdpar_fit(validated viaassert_inherits()). -
ndraws:NULL(default) or an integer scalar. When given, subsamples the firstndrawsposterior draws (rows of the matrix / first slices of the array). -
...: Unused; present for S3 generic compatibility.
Mathematics
Let ndraws = NULL, all ndraws = m is supplied, the first
Returns
- Scalar / K-individual paths: a numeric matrix of dimensions
$S \times n$ (or$\min(\text{ndraws}, S) \times n$ whenndrawsis given). - Multivariate path: a numeric array of dimensions
$S \times n \times p$ (or$\min(\text{ndraws}, S) \times n \times p$ ).
In both cases drop = FALSE is used so dimensions are preserved.
Notes
- Raises an error of class
gdpar_input_errorifndrawsis not a positive integer scalar (after coercion viaas.integer()): conditions arelength(ndraws) != 1L,is.na(ndraws), orndraws < 1L. - The underlying draws are obtained from
.gdpar_get_y_pred_draws(object). - Subsampling is performed via
seq_len(min(ndraws, n_total))wheren_totalisnrow(out)for matrices ordim(out)[1L]for arrays. - When
ndrawsexceeds the available number of draws, all draws are returned silently (no error).
Purpose
Exported wrapper that builds and returns a DHARMa::DHARMa simulation object from a fitted gdpar_fit model, enabling DHARMa diagnostic routines such as testResiduals(), testZeroInflation(), and plotResiduals().
Arguments
-
object: An object of classgdpar_fit(validated viaassert_inherits()). -
coord:NULLor an integer scalar between$1$ and$p$ . Required for multivariate fits; ignored for scalar / K-individual paths.
Returns
A DHARMa simulation object (the return value of .gdpar_build_dharma_object()).
Notes
- Thin wrapper: validates
objectinherits fromgdpar_fitand then delegates entirely to.gdpar_build_dharma_object(object, coord = coord). - Requires the suggested package DHARMa; if not installed,
.gdpar_build_dharma_object()raises an error of classgdpar_input_errorpointing toresiduals(., type = "quantile")as the built-in fallback. - For multivariate fits,
coordmust be supplied (non-NULL); otherwise.gdpar_build_dharma_object()raises an error of classgdpar_input_error.
pp_check.gdpar_fit(object, type = c("dens_overlay", "hist", "ecdf_overlay", "stat", "intervals"), coord = NULL, ndraws = 50L, ...)
Purpose
Exported S3 method implementing the bayesplot::pp_check() generic for objects of class gdpar_fit. Forwards y_pred draws and the observed response y to one of the bayesplot::ppc_* family of functions selected by type, producing a posterior predictive check plot.
Arguments
-
object: An object of classgdpar_fit(validated viaassert_inherits()). -
type: Character scalar matched againstc("dens_overlay", "hist", "ecdf_overlay", "stat", "intervals")viamatch.arg(). Default is"dens_overlay". -
coord:NULLor an integer scalar in$[1, p]$ . Required for multivariate fits; selects the coordinate to plot. -
ndraws: Integer scalar; subsamples the firstndrawsposterior draws before plotting. Defaults to50L. -
...: Additional arguments forwarded to the selected bayesplot function.
Mathematics
Let
The selected bayesplot function is then invoked as fn(y = y_obs_vec, yrep = Y_rep_sub, ...).
Returns
A ggplot object produced by the selected bayesplot ppc_* function.
Notes
- Registered via
@exportS3Method bayesplot::pp_checkso that the method is available whenbayesplotis loaded. - Calls
require_suggested("bayesplot", "produce a posterior predictive check plot"); if bayesplot is not installed, an error is raised (the exact class is determined byrequire_suggested()). - The fit path is determined by
.gdpar_fit_path_class(object). - For multivariate fits:
- If
coordisNULL, raises an error of classgdpar_input_errorinstructing the user to passcoord = k. - Coerces
coordto integer and validates it lies in$[1, p]$ wherep <- object[["p"]]; otherwise raises an error of classgdpar_input_errorwith a formatted message. - Extracts
y_obs_vec <- as.numeric(y_obs[, coord])andyrep_mat <- y_pred[, , coord].
- If
- For scalar / K-individual paths:
y_obs_vec <- as.numeric(y_obs)andyrep_mat <- y_pred. -
ndrawsis coerced viaas.integer(); raises an error of classgdpar_input_erroriflength(ndraws) != 1L,is.na(ndraws), orndraws < 1L. - Subsampling:
keep <- seq_len(min(ndraws, nrow(yrep_mat)));yrep_mat <- yrep_mat[keep, , drop = FALSE]. - The bayesplot function is selected via
switch(type, ...):-
"dens_overlay"→bayesplot::ppc_dens_overlay -
"hist"→bayesplot::ppc_hist -
"ecdf_overlay"→bayesplot::ppc_ecdf_overlay -
"stat"→bayesplot::ppc_stat -
"intervals"→bayesplot::ppc_intervals
-
- The selected function is invoked with named arguments
y = y_obs_vec,yrep = yrep_mat, and forwarded....
Purpose Counts the number of individual-scope (per-observation or per-group) parameters
Arguments
| Argument | Type | Meaning |
|---|---|---|
family |
gdpar_family or gdpar_family_multi
|
A family object whose param_specs field (if present) lists the parameter slots and their scopes. |
Mathematics
Population-level auxiliary parameters (e.g.
Returns Integer scalar
Notes
- For a
gdpar_family_multi, the function extracts the first family fromfamily$families[[1L]]and inspects itsparam_specs. - Backward compatibility: when
param_specsisNULL(families predating the Block 8.0 sub-phase refactor), the function returns1Lunconditionally. - The clamping
max(K, 1L)ensures that even if noparam_specsentry has an individual scope, the returned value is at least 1.
Purpose Guards against the unsupported combination of a multi-parametric family (
Arguments
| Argument | Type | Meaning |
|---|---|---|
family |
gdpar_family or gdpar_family_multi
|
The family object; forwarded to .gdpar_n_params_individual() to obtain |
p |
Integer scalar | The AMM dimension (number of coordinate functions). |
Returns invisible(NULL) when the guard passes (i.e. the combination is allowed). Aborts with a gdpar_unsupported_feature_error otherwise, carrying a data list with fields K and p.
Notes
- The guard calls
.gdpar_n_params_individual(family)internally to determine$K$ . - The error message directs the user to choose either
$K > 1$ with$p = 1$ , or$K = 1$ with$p > 1$ . - Side effect on failure: calls
gdpar_abort()which halts execution.
Purpose Prevents routing a family to legacy amm_distrib_K.stan).
Arguments
| Argument | Type | Meaning |
|---|---|---|
family |
gdpar_family or gdpar_family_multi
|
The family object. For a gdpar_family_multi, the first element of family$families is inspected. |
Minimum stan_id
stan_id |
Family name | Minimum |
Parameter slots |
|---|---|---|---|
| 5 | Beta | 2 | |
| 6 | Gamma | 2 | |
| 7 | lognormal_loc_scale | 2 | |
| 8 | Student-t | 3 | |
| 9 | Tweedie | 3 | |
| 10 | ZIP | 2 | |
| 11 | ZINB | 3 | |
| 12 | hurdle_poisson | 2 | |
| 13 | hurdle_neg_binomial_2 | 3 |
Families with stan_id 1–4 (Normal, Bernoulli, Poisson, Neg-Binomial-2 in stan_id that is NULL, NA, or not in the above table also bypass the guard.
Returns invisible(NULL) when the guard passes. Aborts with a gdpar_unsupported_feature_error on failure.
Notes
- The guard computes
$K$ via.gdpar_n_params_individual(family). If$K \geq \texttt{min\_K}$ , it passes. - On failure, the error message names the family, the required
$K$ , the received$K$ , lists the auxiliary slot names, and gives an example of how to elevate them viagdpar_bf(). - Auxiliary slot names are read from
base$param_specswhen available; otherwise the fallbackc("location", "dispersion")is used. The first slot is treated as the location; all remaining are auxiliaries.
Purpose Canonical dispatcher for inst/stan/_canonical_pieces/, injects helper fragments when requested, and performs placeholder substitution to produce the final Stan model source. Replaces duplicated inline template-read + substitute logic that previously lived in generate_stan_code() and generate_stan_code_multi().
Arguments
| Argument | Type | Meaning |
|---|---|---|
spec |
Named list | Canonical codegen specification. Required fields depend on spec$p_class. See table below. |
spec fields by p_class
p_class value |
Template file | Required fields |
|---|---|---|
"p1" |
amm_canonical_p1.stan |
prior, cp_a, cp_W, mle
|
"pmulti" |
amm_canonical_pmulti.stan (or spec$template_name) |
prior, cp_a, cp_W, cp_a_per_k
|
"distrib_K" |
amm_canonical_distrib_K.stan (or spec$template_name) |
prior, cp_a, cp_W, cp_a_per_K, family
|
"pmulti_KxP" |
amm_canonical_pmulti_KxP.stan |
prior, cp_a, cp_W, family
|
Algorithm / Substitution logic
-
Validation. Verify
spec$p_class$\in \{$ "p1","pmulti","distrib_K","pmulti_KxP"$\}$ ; abort withgdpar_internal_errorotherwise. -
Template resolution. Select the Stan template file name. For names starting with
"amm_canonical_", search ininst/stan/_canonical_pieces/; otherwise search ininst/stan/. Bothsystem.file()(installed package) and a directinst/fallback are tried. -
Helpers injection (Sub-sub-fase 9.3.a colateral, Sesion B9.4). If the template source contains the literal placeholder
// {{CANONICAL_HELPERS}}, readamm_canonical_helpers.stanfrom_canonical_pieces/and substitute it into the placeholder viagsub(..., fixed = TRUE). This injects the canonical definitions ofbspline_basis_evalandapply_W_basis_diffwithout usingcmdstan#includesemantics. -
Placeholder substitution. Build a named list
replacementsdepending onp_class:-
"p1":$\texttt{A\_SCALE} = \begin{cases} \texttt{""} & \text{if } \texttt{cp\_a} = \texttt{TRUE} \\ \texttt{"\ *\ sigma\_a[1]"} & \text{otherwise} \end{cases}\ \texttt{A\_PRIOR} = \begin{cases} \texttt{"normal(0, sigma\_a[1])"} & \text{if } \texttt{cp\_a} \\ \texttt{"normal(0, 1)"} & \text{otherwise} \end{cases}$ Analogous logic forW_SCALEandW_PRIORbased oncp_W. Whenspec$mleisTRUE, bothcp_aandcp_Ware forced toTRUE. -
"distrib_K": Validatescp_a_per_K(logical vector, noNA, length$\geq 1$ ; defaults toas.logical(spec$cp_a)[1L]). Callsgenerate_a_blocks_K(cp_a_per_K)to producedata_decl,tp_block,model_blockstrings. Calls.gdpar_build_theta_ref_prior_block(family)for the per-family anchor block. Replaces{{THETA_REF_PRIOR_BLOCK}},{{DATA_CP_A_PER_K_DECL}},{{TP_A_BLOCK}},{{MODEL_A_BLOCK}}, plus the standard prior and CP/W placeholders. Supports EB-side templates viaspec$template_nameoverride. -
"pmulti_KxP": Calls.gdpar_build_theta_ref_prior_block_KxP(family)for the K×P anchor block. Uses inline NCP-only a/b blocks (no per-slot CP substitutions in this sub-phase). Replaces{{THETA_REF_PRIOR_BLOCK}}plus the standard prior and CP/W placeholders. -
"pmulti": Validatescp_a_per_k(same validation asdistrib_K). Callsgenerate_a_blocks_multi(cp_a_per_k)fordata_decl,tp_block,model_block. Replaces those blocks plus the standard prior and CP/W placeholders.
All branches replace the following common placeholders:
{{PRIOR_THETA_REF}},{{PRIOR_SIGMA_THETA_REF}},{{PRIOR_SIGMA_A}},{{PRIOR_SIGMA_B}},{{PRIOR_SIGMA_W}},{{PRIOR_SIGMA_Y}},{{PRIOR_PHI}},{{W_SCALE}},{{W_PRIOR}}. -
-
Leftover check. After substitution, if any
{{…}}pattern remains in the source, abort withgdpar_internal_errorreporting the unsubstituted placeholder. -
MLE strip. For
p_class == "p1"withspec$mle == TRUE, call.strip_priors_block(src)to remove the entire priors block.
Returns Character scalar: the fully substituted Stan model source string, ready for cmdstanr::cmdstan_model(write_stan_file(...)).
Notes
- Caching is delegated to
cmdstanr, which hashes the source string. CP and NCP variants compile and cache independently. - Bit-exactness: the canonical pieces are frozen byte-identical copies of the legacy
amm_main.stanandamm_distrib_multi.stan; the dispatcher preserves the same substituted strings as the legacy paths for identical inputs. - The
distrib_Kbranch'stemplate_nameoverride supports EB-side templates (amm_eb_marginal_K.stan,amm_eb_conditional_K.stan) per D34 of Sub-fase 8.6.C. - The
pmulti_KxPbranch uses.gdpar_build_theta_ref_prior_block_KxP()(distinct from thedistrib_Kbranch's.gdpar_build_theta_ref_prior_block()), reflecting the K×P architecture from Sub-bloque 9.3.d under decision I.iv + J.iv.A. -
spec$prioris expected to be an object (likelygdpar_prior) with fields:theta_ref,sigma_theta_ref,sigma_a,sigma_b,sigma_W,sigma_y,phi(each a character string of Stan code for the corresponding prior distribution). - Side effects on validation failure: calls
gdpar_abort()with informative messages.
Purpose
Exported entry point for generating the Stan model source code string for a General Dynamic Parameter (GDPAR) model via reference anchoring. Constructs a parameter-configuration list describing the "Path 1" (univariate) canonical template and delegates all template loading, placeholder substitution and emission to the internal .gdpar_emit_canonical_stan helper.
Arguments
| Argument | Type | Meaning |
|---|---|---|
prior |
object of class gdpar_prior
|
Prior specification for all model parameters. Passed through to the Stan code generator. |
cp_a |
logical scalar (default FALSE) |
If TRUE, the additive component FALSE, the non-centered parametrization (NCP) is used. |
cp_W |
logical scalar (default FALSE) |
If TRUE, the modulating component FALSE, NCP is used. |
mle |
logical scalar (default FALSE) |
If TRUE, the prior block is stripped from the emitted Stan source so that the model reduces to maximum-likelihood estimation (used by gdpar_bvm_check). |
Mathematics
No direct formula; this function is a configuration wrapper. The parametrization choice affects how the Stan template rewrites the sampling statements for the latent parameters
Returns
A character scalar containing the complete Stan model source code, ready for compilation by cmdstanr::cmdstan_model.
Notes
- Raises an error (via
assert_inherits) ifpriordoes not inherit from class"gdpar_prior". - The internal helper
.gdpar_emit_canonical_stanreceives a list with keysp_class(hard-coded to"p1"),prior,cp_a,cp_W, andmle. - When
mle = TRUE, the emitted source is further processed by.strip_priors_blockto remove all prior sampling statements.
Purpose
Internal utility that removes the prior block from an already-substituted Stan source string. The block is delimited by the sentinel comment lines // BEGIN PRIORS and // END PRIORS (inclusive). Used when generate_stan_code(mle = TRUE) is requested so that the Stan model contains only the likelihood and deterministic transformations.
Arguments
| Argument | Type | Meaning |
|---|---|---|
src |
character scalar | Fully substituted Stan source code (single string with embedded newlines). |
Mathematics No formula; this is a string-processing operation: $$\text{out} = \text{lines}[1:(\texttt{begin_idx}-1)] ;||; \text{lines}[(\texttt{end_idx}+1):\text{end}]$$
Returns
A character scalar equal to src with every line from the // BEGIN PRIORS marker through the // END PRIORS marker (inclusive) removed.
Notes
- Raises a fatal error with class
"gdpar_internal_error"(viagdpar_abort) if:- Exactly one
// BEGIN PRIORSmarker is not found. - Exactly one
// END PRIORSmarker is not found. - The
BEGINmarker appears at or after theENDmarker.
- Exactly one
- The markers are matched via
grep("//\\s*BEGIN PRIORS", ...)andgrep("//\\s*END PRIORS", ...)(flexible whitespace). - This is a template invariant violation guard: the sentinel comments must exist in every canonical Stan template.
Purpose
Internal helper that writes a Stan model source string to a temporary .stan file and returns the file path. The file name incorporates a deterministic hash of the source content so that identical source strings reuse the same file within a session, enabling cmdstanr's internal compilation cache.
Arguments
| Argument | Type | Meaning |
|---|---|---|
src |
character scalar | Complete Stan model source code to be written. |
Mathematics
The hash is computed as follows. The source string is converted to its raw byte representation, then formatted as numeric strings, concatenated, and truncated to 16 characters. Any character not in [A-Za-z0-9] is replaced with "0":
Returns
A character scalar with the absolute path to the written .stan file.
Notes
- Complies with CRAN policy by writing only to
tempdir(). - The hash is a best-effort cache key; collisions are astronomically unlikely for 16 alphanumeric characters but not formally guaranteed.
- Uses
writeLines(src, path)which appends a trailing newline ifsrcdoes not already end with one.
Purpose
Internal function that assembles the named list of data passed to Stan's data block via cmdstanr::sample(data = ...). This is the univariate (single-coordinate, .assemble_stan_data_multi when the model is multivariate (
Arguments
| Argument | Type | Meaning |
|---|---|---|
design |
list (from build_amm_design) |
Contains design matrices Z_a (Z_b (X ( |
family |
object of class gdpar_family
|
Specifies the response distribution (Gaussian, Poisson, negative binomial, Bernoulli, etc.) and its Stan integer ID. |
amm |
object of class amm_spec
|
The AMM (additive modulating model) specification; carries a, b, W sub-specs and the dimension count p. |
y |
numeric or integer vector of length |
The observed outcome vector. |
theta_anchor |
numeric scalar | The reference (anchor) value |
group_id |
NULL or vector of length |
Optional grouping variable for multilevel/hierarchical extensions. If NULL, a single group is assumed. |
Mathematics No explicit formula; the function constructs the data objects that enter the Stan log-likelihood. Key dimensions exposed:
-
$n$ : number of observations (length(y)) -
$J_a$ : columns of$Z_a$ (additive fixed-effect basis) -
$J_b$ : columns of$Z_b$ (additive random-effect basis) -
$d$ : columns of$X$ (shared modulating covariates) -
$\text{dim\_W}$ : dimensionality of the$W$ basis (0 if unused)
Family-specific outcome encoding: $$\text{y_real} = \begin{cases} y & \text{if family} = \texttt{"gaussian"} \ \mathbf{0} & \text{otherwise} \end{cases}, \qquad \text{y_int} = \begin{cases} y & \text{if family} \in {\texttt{poisson}, \texttt{neg_binomial_2}, \texttt{bernoulli}} \ \mathbf{0} & \text{otherwise} \end{cases}$$
Returns A named list with the following Stan data fields (among others):
| Field | Type | Description |
|---|---|---|
n |
integer | Number of observations. |
family_id |
integer | Stan-side family identifier. |
use_a, use_b, use_W
|
0 or 1 | Flags indicating which AMM components are active. |
J_a, J_b
|
integer | Column counts of |
dim_W |
integer | Dimension of |
d |
integer | Column count of |
Z_a, Z_b, X
|
matrix | Design matrices (empty |
y_real, y_int
|
numeric/integer vector | Outcome in real and integer form. |
theta_anchor |
double | Anchor value. |
use_dispersion_y, use_dispersion_phi
|
0 or 1 | Flags for Gaussian dispersion |
W_type_id, W_n_knots_full, W_knots_full, W_degree
|
various | Modulating basis spline metadata. |
use_groups, J_groups, group_id
|
integer / integer vector | Grouping fields. |
Notes
- Calls
.gdpar_guard_multiparam_multivariate(family, p_value)and.gdpar_guard_K_below_family_min(family)as input validation guards. - If
familyinherits from"gdpar_family_multi"oramm$p > 1, dispatches to.assemble_stan_data_multiand returns immediately. - Design matrices with zero columns are replaced with
matrix(0, n, 0)to satisfy Stan's requirement for non-null matrix data. -
group_idis resolved via.resolve_group_id.
Purpose
Internal helper that validates and coerces the optional group_id argument into the integer-level representation required by the Stan template. Handles the backward-compatible single-group case (when group_id_arg is NULL) and the multilevel case.
Arguments
| Argument | Type | Meaning |
|---|---|---|
group_id_arg |
NULL or a vector of length |
Grouping variable. When NULL, a single group is assumed. |
n |
positive integer | Expected length of the group identifier vector (must match the number of observations). |
Mathematics
When group_id_arg is not NULL, it is coerced to a factor. The integer codes returned by as.integer(as.factor(group_id_arg)) assign contiguous levels
Returns A named list with three elements:
| Field | Type | Description |
|---|---|---|
use_groups |
integer 0 or 1 |
0 when group_id_arg is NULL; 1 otherwise. |
J_groups |
integer |
Number of distinct groups. 1 in the single-group regime. |
group_id |
integer vector of length |
Per-observation group index (1-based, contiguous). |
Notes
- Raises an error with class
"gdpar_input_error"iflength(group_id_arg) != n. - Raises an error with class
"gdpar_input_error"if any element ofgroup_id_argisNA. - In the
NULLcase,group_idisrep(1L, n)andJ_groupsis1L.
Purpose
Internal function implementing the multivariate (assemble_stan_data. Converts ragged per-coordinate design matrices, the multivariate family object, the outcome matrix, and per-coordinate anchor values into the data list expected by the amm_distrib_multi.stan Stan template. Since Stan does not natively support ragged arrays, per-coordinate design matrices
Arguments
| Argument | Type | Meaning |
|---|---|---|
design |
list (from .build_amm_design_multi) |
Contains Z_a_list and Z_b_list (lists of X matrix. |
family |
object of class gdpar_family_multi
|
Multivariate family with per-coordinate marginal families and a common Stan ID. Must have $p matching amm$p. |
amm |
object of class amm_spec with amm$p > 1
|
The multivariate AMM specification. |
y |
|
Outcome matrix. Column |
theta_anchor |
numeric scalar or vector of length |
Anchor value(s). Scalar is broadcast to length |
group_id |
NULL or vector of length |
Optional grouping variable. |
Mathematics Key operations:
-
Column padding (ragged-to-rectangular): $$Z_{a}^{[k]}{\text{padded}} = \bigl[,Z{a}^{[k]} ;\big|; \mathbf{0}{n \times (J{a,\max} - J_{a}^{[k]})},\bigr]$$ Analogously for
$Z_b$ . The Stan template iterates only over$1 : J_{a}^{[k]}$ , so padded zeros never enter the likelihood. -
Modulating basis dimension:
$$\text{dim_W} = \frac{\dim(W)}{1}, \qquad W_{\text{per_k_dim}} = \frac{\text{dim_W}}{p}$$ The total$W$ dimension must be a positive multiple of$p$ for separable bases. -
Anchor broadcasting: $$\theta^{\star}_k = \begin{cases} \theta^{\star} & \text{if } \texttt{length}(\theta^{\star}) = 1 \ \theta^{\star}_k & \text{if } \texttt{length}(\theta^{\star}) = p \end{cases}$$
-
Outcome encoding (per-coordinate): $$\text{y_real}{ik} = \begin{cases} y{ik} & \text{if family is Gaussian} \ 0 & \text{otherwise} \end{cases}, \qquad \text{y_int}{ik} = \begin{cases} y{ik} & \text{if family} \in {\texttt{poisson}, \texttt{neg_binomial_2}, \texttt{bernoulli}} \ 0 & \text{otherwise} \end{cases}$$
Returns
A named list with the Stan data fields expected by amm_distrib_multi.stan:
| Field | Type | Description |
|---|---|---|
n |
integer | Number of observations. |
p |
integer | Number of coordinates. |
family_id |
integer | Stan-side family ID. |
use_a, use_b, use_W
|
0 or 1 | Activity flags for AMM components. |
J_a_max, J_b_max
|
integer | Maximum column count across coordinates for |
J_a_per_k, J_b_per_k
|
integer vector of length |
Per-coordinate column counts. |
Z_a, Z_b
|
list of |
Padded design matrices. |
dim_W |
integer | Total |
d |
integer | Column count of |
W_per_k_dim |
integer | Per-coordinate |
X |
|
Shared modulating covariate matrix. |
y_real |
|
Real-valued outcome encoding. |
y_int |
|
Integer outcome encoding. |
theta_anchor |
double vector of length |
Per-coordinate anchor values. |
use_dispersion_y, use_dispersion_phi
|
0 or 1 | Dispersion flags. |
W_type_id, W_n_knots_full, W_knots_full, W_degree
|
various | Spline metadata for |
use_groups, J_groups, group_id
|
integer / integer vector | Grouping fields. |
Notes
- Raises
"gdpar_internal_error"ifamm$pisNULLor not positive. - Raises
"gdpar_input_error"iffamilydoes not inherit from"gdpar_family_multi"whenamm$p > 1. - Raises
"gdpar_input_error"iffamily$pdoes not equalamm$p. - Raises
"gdpar_input_error"ifyis not a matrix/array, or ifncol(y) != p. - Raises
"gdpar_internal_error"ifdim_Wis not a positive multiple of$p$ (required for separable bases). - Raises
"gdpar_input_error"iftheta_anchoris neither length 1 nor length$p$ . - Calls
.gdpar_resolve_W_stan_datato obtain the modulating basis metadata (W_type_id, knots, degree). - Grouping is resolved via
.resolve_group_id(same as the univariate path).
generate_stan_code_multi(prior, cp_a = FALSE, cp_W = FALSE, cp_a_per_k = NULL, template_name = "amm_distrib_multi.stan")
Purpose: Translates a user-facing request for Stan code for a multi-coordinate AMM (Additive Mixed Model) into the canonical template machinery. It maps legacy template names to their canonical equivalents and delegates code assembly to .gdpar_emit_canonical_stan.
Arguments:
-
prior: An object of class"gdpar_prior"specifying the prior configuration for the model. -
cp_a: A logical scalar (defaultFALSE). IfTRUE, indicates that theacomponent (random effects) should use a centered parameterization (CP). WhenTRUE, it overrides per-coordinate settings. -
cp_W: A logical scalar (defaultFALSE). IfTRUE, indicates that the spatial/spline componentWshould use a centered parameterization. -
cp_a_per_k: An optional logical vector of lengthp(the multivariate dimension). Each element indicates whether the corresponding coordinate'sacomponent should use CP (TRUE) or non-centered parameterization (NCP,FALSE). IfNULL(default), the choice is governed by the scalarcp_aflag. -
template_name: A character string (default"amm_distrib_multi.stan") identifying the desired Stan template. The function maps legacy names to canonical pieces ininst/stan/_canonical_pieces/.
Returns: The result of calling .gdpar_emit_canonical_stan with a list containing the translated template name and all arguments. This is typically a list of character strings representing the Stan code blocks.
Notes:
- The function asserts that
priorinherits from the"gdpar_prior"class. - The template name translation is handled by an internal
switchstatement. The default"amm_distrib_multi.stan"is mapped to"amm_canonical_pmulti.stan". - The function does not generate the code directly; it prepares the input for the internal canonical emitter.
Purpose: Generates the Stan code blocks (data_decl, tp_block, model_block) for the a component (random effects) in a multi-coordinate model. It dispatches on the pattern of the cp_a_per_k vector to produce code for three cases: uniform NCP, uniform CP, or mixed CP/NCP across coordinates.
Arguments:
-
cp_a_per_k: A logical vector of lengthp(the multivariate dimension). Each element indicates the parameterization for that coordinate.
Mathematics: The function generates Stan code for the transformation of raw random effects a_raw into the scaled coefficients a_coef. For each coordinate k:
-
Uniform NCP:
a_coef[k][j] = a_raw[...] * sigma_a[k]with a priora_raw ~ normal(0, 1). -
Uniform CP:
a_coef[k][j] = a_raw[...]with a priorsegment(a_raw, ...) ~ normal(0, sigma_a[k]). -
Mixed: A runtime binary flag
cp_a_per_k_data[k]selects between the two transformations and priors.
Returns: A named list with three character scalar elements:
-
data_decl: Stan code for any required data block declarations (empty for uniform cases, contains an integer array declaration for the mixed case). -
tp_block: Stan code for thetransformed parametersblock that computesa_coef. -
model_block: Stan code for themodelblock that specifies the prior ona_raw.
Notes:
- The
tp_blockalways includes a guardif (use_a == 1)to skip if theacomponent is not used. - In the mixed case, the caller must inject
stan_data$cp_a_per_k_dataas an integer vector into the Stan data. - The generated code uses
a_raw_offset(a per-coordinate index vector) to segment the flat-packeda_rawvector.
.assemble_stan_data_K(design_K, family, amm_list_canonical, y, theta_anchor_K, group_id = NULL, family_id_k_vector = NULL)
Purpose: Assembles the data list for the K-individual (multiple families) Stan template (amm_distrib_K.stan). This is the counterpart of assemble_stan_data for the regime where K > 1 (multiple outcome distributions) and p = 1 (univariate outcomes per distribution). It consumes design matrices, family specifications, and anchors, and produces the structured list expected by the Stan template.
Arguments:
-
design_K: A list returned by.build_amm_design_K(), containing per-slot design matrices (Z_a_k_list,Z_b_k_list,X) and slot metadata. -
family: Agdpar_familyobject. Itsparam_specsmust have been promoted toscope = "per_observation"for the K-individual slots. Must not be agdpar_family_multiobject. -
amm_list_canonical: A named list ofamm_specobjects of lengthK. Used to extract per-slot flags (use_a,use_b,W). -
y: Numeric or integer vector of outcomes (lengthn). -
theta_anchor_K: Numeric vector of lengthKwith the per-slot anchor values. -
group_id: Optional integer vector of lengthnfor group indices.NULLindicates a single group. -
family_id_k_vector: Optional integer vector of lengthKspecifying per-slot family IDs (Stan distribution codes). IfNULL, all slots use the same family as thefamilyargument.
Mathematics:
- The function constructs a data list for a distributional regression model with canonical link functions per outcome slot.
- Each slot
khas its own family (identified byfamily_id_k[k]), a design matrix for the location parameter (Z_a_k), and optional design matrices for scale (Z_b_k) and spline (W) components. - The anchor
theta_anchor_K[k]is the offset for the linear predictor of slotk. - The likelihood is family-specific, with families identified by integer codes (e.g., 1=Gaussian, 3=neg_binomial_2, 5=Beta, etc.).
Returns: A named list with the following elements (among others) expected by amm_distrib_K.stan:
-
n,K: Data dimensions. -
family_id_k: Integer vector of per-slot family IDs. -
inv_link_id_per_slot: Integer vector of per-slot inverse-link function IDs. -
use_a_k,use_b_k: Integer vectors indicating presence ofa/bcomponents per slot. -
use_W,dim_W: Global flags for the spline component. -
J_a_per_k,J_b_per_k,J_a_max,J_b_max: Column counts for design matrices. -
Z_a_k,Z_b_k: Lists of padded design matrices (padded to max column count). -
X: Population-level design matrix. -
y_real,y_int: Outcome vectors (real or integer, depending on family). -
theta_anchor_K: Numeric vector of per-slot anchors. -
use_dispersion_y_k,use_dispersion_phi_k: Integer vectors (currently all zeros) for dispersion flags. -
W_type_id,W_n_knots_full,W_knots_full,W_degree: Spline component parameters. -
use_groups,J_groups,group_id: Grouping information.
Notes:
-
Input Validation: The function performs extensive checks:
-
design_Kmust be a list with elementZ_a_k_list. -
Kmust be ≥ 2. -
amm_list_canonicallength must equalK. -
familymust be agdpar_family(notgdpar_family_multi) with a non-NAstan_id. -
stan_idmust be one of the built-in codes (1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13). -
ymust be a vector (not a multi-column matrix) for theK > 1path. - Outcome range is checked per family (e.g., Beta requires y ∈ (0,1), count families require non-negative integers).
-
theta_anchor_Kmust have lengthK. - If
family_id_k_vectoris provided, it must be a numeric vector of lengthKwith no NAs, and its first element must matchfamily$stan_id.
-
-
Internal Error Handling: Raises
gdpar_abortwith specific error classes (gdpar_internal_error,gdpar_unsupported_feature_error,gdpar_input_error) for invalid inputs. -
Design Matrix Padding: Design matrices
Z_a_kandZ_b_kare padded with zeros to have a common maximum column count (J_a_max,J_b_max) across slots. -
Grouping: Uses
.resolve_group_idto standardize grouping information. -
Inverse Links: Computes per-slot inverse-link IDs via
.gdpar_compute_inv_link_id_per_slot. - Side Effects: None; returns the data list.
Purpose Generates the three Stan code fragments (data declaration, transformed-parameters block, and model block) that govern how the raw a coefficients are rescaled into a_coef_k for the K-individual template. The function inspects whether every component of cp_a_per_K is TRUE (all centered), FALSE (all non-centered), or mixed, and emits the corresponding Stan source. This controls the centered vs. non-centered parametrization of the group-level a coefficients in the generated Stan program.
Arguments
| Argument | Type | Meaning |
|---|---|---|
cp_a_per_K |
Logical vector of length ≥ 1 | Per-K indicator: TRUE means the a coefficients for that K-slot use the centered parametrization (no sigma scaling in the transformed parameters block), FALSE means non-centered (raw standard-normal variates are multiplied by sigma_a_k in transformed parameters). |
Mathematics
The function selects among three code-generation strategies for mapping raw parameters a_raw to a_coef_k[k][j]:
-
All non-centered (
all_ncp): everyFALSE. The transformed-parameters assignment is
and the model block applies a unit-normal prior:
-
All centered (
all_cp): everyTRUE. The transformed-parameters assignment is simply
and the model block applies the group-level prior per segment:
-
Mixed: neither all
TRUEnor allFALSE. A per-K binary flagcp_a_per_K_data[k]is read from data. In the transformed-parameters block a localscale_kis computed:
and a_coef_k[k][j] = a_raw[...] * scale_k. The model block branches per K: if centered, the segment prior is
Returns A named list with three character scalar elements:
| Element | Content |
|---|---|
data_decl |
Stan data block fragment declaring cp_a_per_K_data (empty string when all_ncp or all_cp; non-empty array declaration for mixed). |
tp_block |
Stan transformed parameters block fragment assigning a_coef_k[k][j] (conditionally guarded by any_use_a == 1 and use_a_k[k] == 1 && J_a_per_k[k] > 0). |
model_block |
Stan model block fragment placing the prior on a_raw. |
Notes
- The
all_ncpbranch setsdata_declandmodel_blockfor a globala_raw ~ normal(0,1)(no per-K loop in the model block). - The
all_cpbranch loops over K in the model block, applyingsegment(...)withsigma_a_kper K. - In the mixed branch the data declaration
cp_a_per_K_datais an array of K integers bounded[0,1]; the caller must supply this in the Stan data list. - All three branches guard the
tp_blockwithif (any_use_a == 1)so the block is a no-op when noacoefficients are present. - Side effects: none; the function is pure code-generation.
Purpose (Internal, not exported.) Builds the Stan code block that assigns the anchor priors to theta_ref_k for the single-parametric K-individual distributional-regression template. For most families the block is a vectorized canonical prior; for Tweedie (stan_id == 9) the third distributional slot (the power parameter uniform(1.01, 1.99) prior. The returned block still contains {{PRIOR_THETA_REF}} and {{PRIOR_SIGMA_THETA_REF}} placeholders, which the caller (generate_stan_code_K) substitutes on a subsequent pass.
Arguments
| Argument | Type | Meaning |
|---|---|---|
family |
gdpar_family object or NULL
|
The distributional family. When NULL, or when family$stan_id is NA, the canonical (default) vectorized block is returned. When stan_id == 9 (Tweedie), a Tweedie-specific block is emitted. |
Mathematics
The default block implements a hierarchical anchor-prior structure:
-
Grouped (
use_groups == 1): hyper-priors on$\mu_{\theta_{\text{ref}}}$ and$\sigma_{\theta_{\text{ref}}}$ , then for each group$g$ :
-
Ungrouped (
use_groups != 1): a direct prior on$\theta_{\text{ref},k}[1]$ .
For Tweedie, the block is split: slots {{PRIOR_THETA_REF}}, while slot
Returns A character scalar containing a Stan code snippet (with {{PRIOR_THETA_REF}} / {{PRIOR_SIGMA_THETA_REF}} placeholders still present).
Notes
- The function is
@noRd/@keywords internal; it is never exported. - It relies on the caller ordering the
THETA_REF_PRIOR_BLOCKsubstitution before thePRIOR_THETA_REFsubstitution so that the inner placeholders survive the first pass. - When
familyisNULLorstan_idisNA, the default (non-Tweedie) block is always returned. - Side effects: none.
Purpose (Internal, not exported.) Parallel to .gdpar_build_theta_ref_prior_block but emits the anchor-prior block for the multi-parametric multivariate (K×P) Full-Bayes Path C template amm_canonical_pmulti_KxP.stan. The anchor container differs: theta_ref_kp[g, k] is an array[J_groups, K] vector[p] (per-slot, per-coord), and the hierarchical hyper-prior is per-slot mu_theta_ref_kp[1, k] and sigma_theta_ref_kp[1, k]. For Tweedie (stan_id == 9), slot uniform(1.01, 1.99) on every coordinate
Arguments
| Argument | Type | Meaning |
|---|---|---|
family |
gdpar_family object or NULL
|
Same semantics as .gdpar_build_theta_ref_prior_block: NULL or NA stan_id yields the default vectorized block; stan_id == 9 yields the Tweedie-specific block. |
Mathematics
The default block iterates over
-
Grouped: for each
$k$ :
then for each group
-
Ungrouped: for each
$k$ :
For Tweedie, the block adds a conditional on
(and analogously theta_ref_kp[1, 3][jj] in the ungrouped branch).
Returns A character scalar with Stan code containing {{PRIOR_THETA_REF}} and {{PRIOR_SIGMA_THETA_REF}} placeholders.
Notes
- The function does not need to know
Kat code-generation time because it emits afor (k in 1:K)loop in the Stan source;Kis data. - The Tweedie branch uses
if (k <= 2)rather than slicing[1:2]because the container is a vector of lengthp, and the distinction is by slot (the outerkindex), not by inner vector elements. Slots 1 and 2 are the location and scale parameters; slot 3 is the power parameter. - Same caller-ordering requirement as
.gdpar_build_theta_ref_prior_block: theTHETA_REF_PRIOR_BLOCKsubstitution must precedePRIOR_THETA_REF. - Side effects: none.
Purpose (Internal, not exported.) Thin wrapper around .gdpar_emit_canonical_stan for the multi-parametric multivariate Full-Bayes regime (K ≥ 2 slots crossed with p ≥ 2 coords). It dispatches to the canonical template-dispatcher by supplying the piece class "pmulti_KxP". The wrapper exists so that bootstrap scripts (data-raw/bootstrap_fb_goldens_KxP.R) and Tier-1 compare-path tests can emit and compile the Path C template deterministically, while the user-facing guard .gdpar_guard_multiparam_multivariate still aborts gdpar() when K > 1 and p > 1.
Arguments
| Argument | Type | Meaning |
|---|---|---|
prior |
gdpar_prior object |
The prior specification; validated by assert_inherits(prior, "gdpar_prior", "prior"). |
cp_W |
Logical scalar (default FALSE) |
If TRUE, the modulating component W is sampled in the centered parametrization; if FALSE, non-centered. |
family |
gdpar_family object or NULL (default NULL) |
When supplied, drives the family-specific anchor-prior block via .gdpar_build_theta_ref_prior_block_KxP(). When NULL, the canonical vectorized block is used. |
Returns A character scalar containing the complete Stan model source code (ready for cmdstanr::cmdstan_model()), produced by .gdpar_emit_canonical_stan after all placeholder substitutions.
Notes
- The first line performs argument validation:
assert_inherits(prior, "gdpar_prior", "prior")raises an error ifprioris not of the correct class. - The function delegates entirely to
.gdpar_emit_canonical_stan, passing a configuration list with keysp_class = "pmulti_KxP",prior,cp_W, andfamily. No template-reading or substitution logic is performed locally. - The function is not exported; it is consumed only by internal bootstrap and test machinery.
- Side effects: none beyond the validation assertion.
.gdpar_fb_KxP_fit(y_matrix, family, amm_list, data, formula_rhs, theta_anchor_kp = NULL, iter_warmup = 200L, iter_sampling = 200L, chains = 2L, seed = 91001L, refresh = 0L, ...)
Purpose
Internal workhorse that orchestrates a full Bayesian (FB) full-rank amm_list, generates and compiles the Stan model via the canonical-piece code-generation pipeline, constructs a deterministic per-chain initialisation function (reusing the EB-side canonical helper), invokes cmdstanr::cmdstan_model$sample(), and returns a list bundling the fitted object together with metadata needed for downstream post-processing (e.g. family_name, dimensions, MCMC settings, cmdstan version).
Arguments
| Argument | Type | Meaning |
|---|---|---|
y_matrix |
numeric matrix | Response matrix with |
family |
gdpar_family object |
Distributional family for the likelihood (e.g. Gaussian, beta, gamma, student-t, Tweedie). Must inherit from class "gdpar_family". |
amm_list |
named list of amm_spec objects |
Length-$p element (defaulting to |
data |
data.frame |
Observational data whose row count determines |
formula_rhs |
formula (right-hand side) | Right-hand-side formula used inside .build_amm_design_KxP() to construct the design matrices. |
theta_anchor_kp |
numeric matrix or NULL
|
NULL, defaults to a zero matrix |
iter_warmup |
integer scalar | Number of warmup (adaptation) iterations per chain. Default 200L. |
iter_sampling |
integer scalar | Number of post-warmup sampling iterations per chain. Default 200L. |
chains |
integer scalar | Number of MCMC chains. Default 2L. |
seed |
integer scalar | PRNG seed passed to cmdstanr. Default 91001L. |
refresh |
integer scalar | Console refresh interval for Stan progress output. Default 0L (silent). |
... |
Additional arguments forwarded to cmdstanr::cmdstan_model$sample() (e.g. adapt_delta, max_treedepth). |
Algorithm
-
Package check – Guard that
cmdstanris available; abort with class"gdpar_unsupported_feature_error"if not. -
Input validation – Assert
familyinherits"gdpar_family"; assertamm_listis a list of length$\ge 2$ ; extract$K = |\texttt{amm\_list}|$ and$p = \texttt{amm\_list}[[1]]\$p$ (default$1$ ), then assert$p \ge 2$ . -
W-basis materialisation – For every slot
$k \in \{1,\dots,K\}$ , if anWbasis object exists but carriesNULLorNAdimension, callmaterialize_W_basis(W, p = p). -
Design construction – Call
.build_amm_design_KxP(amm_list, data, formula_rhs)to producedesign_KxP. -
Anchor default – If
theta_anchor_kpisNULL, set it tomatrix(0, K, p). -
Response validation – Assert
y_matrixis a numeric matrix withnrow(data)rows andpcolumns. -
Stan data assembly – Call
.assemble_stan_data_KxP(design_KxP, family, amm_list, y_matrix, theta_anchor_kp, path = "FB"). -
Prior & code generation – Create a default prior via
gdpar_prior(); generate the Stan source viagenerate_stan_code_KxP_FB(prior, cp_W = FALSE, family = family); write it to a temporary file and compile withcmdstanr::cmdstan_model. -
Initialisation function – Define
fb_init_fn(chain_id)that calls.gdpar_eb_make_random_init_KxP(stan_data, seed_offset = chain_id, base_seed = seed). This produces chain-deterministic starting values with:-
$\theta_{\text{ref},kp} \sim$ small jitter, -
$\sigma_{a,k} = 0.1 + \text{jitter}$ , -
$\sigma_{b,k} = 0.1 + \text{jitter}$ , -
$a_{\text{raw}} \sim$ small jitter, -
$c_{b,kp,\text{raw}} \sim$ small jitter. Parameters not consumed by the FB extension (e.g.sigma_W,W_rawwhenuse_W = 0;sigma_y_pop_k,phi_pop_kwhen alluse_dispersion_*_kare zero) have size zero by construction, so missing init entries are harmless.
-
-
Sampling – Call
model$sample(...)with the assembled data and all MCMC control arguments. -
Return – Bundle the
cmdstanrfit object,stan_data, family identifiers, dimensional constants ($K,\ p,\ n$ ), MCMC settings, seed, and the cmdstan version string.
Returns A named list with elements:
| Element | Type | Description |
|---|---|---|
fit |
CmdStanMCMC |
The fitted MCMC object from cmdstanr. |
stan_data |
named list | The complete Stan data list passed to the model. |
family_name |
character | Human-readable family name (e.g. "gaussian"). |
family_stan_id |
integer | Numeric family identifier used inside Stan. |
K |
integer | Number of individuals/slots. |
p |
integer | Number of outcome dimensions per slot. |
n |
integer | Number of observations (rows of data). |
seed |
integer | PRNG seed used. |
iter_warmup |
integer | Warmup iterations per chain. |
iter_sampling |
integer | Post-warmup iterations per chain. |
chains |
integer | Number of MCMC chains. |
cmdstan_version |
character (or value from helper) | Version string of the running CmdStan installation. |
Notes
- The function is not exported (leading dot in name).
- Several
gdpar_abort()calls raise classed errors:"gdpar_unsupported_feature_error"(missingcmdstanr),"gdpar_input_error"(invalidamm_listlength,$p < 2$ , mismatchedy_matrixdimensions). - The W-basis materialisation loop is a defensive side-effect that mutates
amm_listin-place for any slot whoseWobject has unmaterialised dimension. - The init function deliberately reuses the EB-side canonical helper
.gdpar_eb_make_random_init_KxP()to avoid inv-logit / exp blow-ups near link-support boundaries for distributional families (beta, gamma, student-t, etc.), which would occur withcmdstanr's defaultinit = 2. - Extra arguments (
...) are forwarded verbatim tocmdstanr::cmdstan_model$sample().
generate_stan_code_K(prior, cp_a = FALSE, cp_W = FALSE, cp_a_per_K = NULL, family = NULL, template_name = "amm_distrib_K.stan")
Purpose
Reads a inst/stan/ directory and substitutes prior, parametrisation, and generate_stan_code_multi(). Starting from sub-phase 8.3.5b the function also accepts an optional gdpar_family object to emit a family-specific anchor-prior block (.gdpar_emit_canonical_stan().
Arguments
| Argument | Type | Meaning |
|---|---|---|
prior |
gdpar_prior object |
Prior specification governing the substitution of prior-related placeholders (e.g. hyperprior scale expressions). |
cp_a |
logical scalar | Centring flag for the cp_a_per_K is NULL, this value is broadcast as a one-element pseudo-vector (legacy mode). Default FALSE (non-centred). |
cp_W |
logical scalar | If TRUE, the modulating component FALSE (default), non-centred parametrisation. |
cp_a_per_K |
logical vector of length NULL
|
Per-slot centring control for the NULL (default) falls back to broadcasting cp_a uniformly across all |
family |
gdpar_family object or NULL
|
When supplied, drives the family-specific anchor-prior block emission via .gdpar_build_theta_ref_prior_block(). NULL (default) retains the canonical vectorised block for backward compatibility with existing test call sites. |
template_name |
character scalar | Name of the Stan template file. Default "amm_distrib_K.stan". Internally mapped to a canonical-piece filename (see mapping table below). |
Algorithm / Template Translation
The function performs a name mapping from legacy template names to their canonical-piece equivalents before delegating:
Legacy template_name
|
Canonical-piece name |
|---|---|
"amm_distrib_K.stan" |
"amm_canonical_distrib_K.stan" |
"amm_eb_marginal_K.stan" |
"amm_canonical_eb_marginal_K.stan" |
"amm_eb_conditional_K.stan" |
"amm_canonical_eb_conditional_K.stan" |
| (any other) | used as-is |
After mapping, the function assembles a configuration list:
and passes it to .gdpar_emit_canonical_stan(config) which performs the actual template reading and placeholder substitution.
Mathematics
The W-related placeholders {{W_SCALE}} and {{W_PRIOR}} are identical across the multivariate and {{CANONICAL_HELPERS}} placeholder absorbs the in-line bspline_basis_eval and apply_W_basis_diff helpers following the G.iv pattern. In-line Tweedie function blocks and apply_inv_link_by_id remain
When family is non-NULL, the {{THETA_REF_PRIOR_BLOCK}} placeholder is replaced by the output of .gdpar_build_theta_ref_prior_block(family), which emits the per-family prior on anchor-point parameters family is NULL, the legacy vectorised block is retained.
Returns
A character scalar containing the fully substituted Stan model source code, ready for compilation via cmdstanr::cmdstan_model().
Notes
- The function is not exported (lacks
@exporttag; documented with@keywords internaland@noRd). -
prioris validated viaassert_inherits(prior, "gdpar_prior", "prior"), which aborts if the class does not match. - The template relocation (from
inst/stan/amm_distrib_K.stantoinst/stan/_canonical_pieces/amm_canonical_distrib_K.stan) and the analogous relocation of the EB-side$K$ templates were performed in sessions B9.5 and B9.6 respectively. Bit-exact Stan semantics preservation was verified token-wise during the migration. - The function does not write the Stan source to disk; it only returns the string. The caller is responsible for persisting and compiling.
Purpose
Internal helper that signals a structured error condition. It wraps base::stop with a class hierarchy so that user code can catch specific gdpar errors via tryCatch and optionally inspect additional fields embedded in the condition object.
Arguments
-
message— Character scalar. The human-readable error message. -
class— Character vector (defaultcharacter()). Additional class names prepended to the condition's class vector, ahead of the fixed suffixc("gdpar_error", "error", "condition"). -
data— Named list (defaultlist()). Additional fields copied into the condition object for programmatic access by handlers.
Returns
Never returns; signals an error condition of class c(class, "gdpar_error", "error", "condition") via stop.
Notes
- The condition object is constructed with
structure(..., list(message = message, call = sys.call(-1))). Thecallfield is set tosys.call(-1), i.e., the call of the function that invokedgdpar_abort, so the reported call site reflects the caller rather thangdpar_abortitself. - Each element of
datais copied into the condition object by name via aforloop overnames(data); existing fields (message,call) can be overwritten ifdatacontains entries namedmessageorcall. - Because
stopis invoked, execution halts unless the condition is caught by atryCatchhandler matching one of the condition's classes. - Marked
@keywords internaland@noRd; not exported.
Purpose
Internal helper that signals a structured warning condition. It wraps base::warning with a class hierarchy so that user code can catch specific gdpar warnings via withCallingHandlers.
Arguments
-
message— Character scalar. The human-readable warning message. -
class— Character vector (defaultcharacter()). Additional class names prepended to the condition's class vector, ahead of the fixed suffixc("gdpar_warning", "warning", "condition"). -
data— Named list (defaultlist()). Additional fields copied into the condition object for programmatic access by handlers.
Returns
Invisibly NULL; signals a warning condition of class c(class, "gdpar_warning", "warning", "condition") via warning.
Notes
- The condition object is constructed identically to
gdpar_abortin structure:list(message = message, call = sys.call(-1))with the class vector set toc(class, "gdpar_warning", "warning", "condition"). - Each element of
datais copied into the condition object by name via aforloop overnames(data); existing fields (message,call) can be overwritten ifdatacontains entries namedmessageorcall. - Unlike
gdpar_abort,warningdoes not halt execution by default (unlessoptions(warn = 2)is set, which converts warnings to errors). - The return value is governed by
warning, which returnsNULLinvisibly;gdpar_warnitself does not explicitly return, so the value is whateverwarningreturns. - Marked
@keywords internaland@noRd; not exported.
Purpose
Internal helper that signals a structured informative message condition. It wraps base::message with a class hierarchy so that user code can suppress specific gdpar messages via suppressMessages or handle them via withCallingHandlers.
Arguments
-
message— Character scalar. The human-readable informative message. -
class— Character vector (defaultcharacter()). Additional class names prepended to the condition's class vector, ahead of the fixed suffixc("gdpar_message", "message", "condition"). -
data— Named list (defaultlist()). Additional fields copied into the condition object for programmatic access by handlers.
Returns
Invisibly NULL; signals a message condition of class c(class, "gdpar_message", "message", "condition") via message.
Notes
- The condition object is constructed with
list(message = paste0(message, "\n"), call = sys.call(-1)). A trailing newline ("\n") is appended tomessagebefore embedding it in the condition, so the displayed text ends with a newline. - Each element of
datais copied into the condition object by name via aforloop overnames(data); existing fields (message,call) can be overwritten ifdatacontains entries namedmessageorcall. - The return value is governed by
message, which returnsNULLinvisibly;gdpar_informitself does not explicitly return. - Marked
@keywords internaland@noRd; not exported.
Purpose
Internal helper that checks whether a Suggests-listed package is installed before attempting to use it. This enforces CRAN policy that Suggests packages may only be invoked through conditional loading (e.g., requireNamespace), never assumed to be present.
Arguments
-
pkg— Character scalar. The name of the suggested package to check. -
reason— Character scalar. A human-readable description of what the package is needed for; embedded verbatim into the error message shown to the user.
Mathematics
The check is a simple predicate:
If FALSE, an error is signaled.
Returns
Invisibly TRUE if the package is available. Otherwise, signals a condition of class c("gdpar_missing_dependency_error", "gdpar_error", "error", "condition") via gdpar_abort and does not return.
Notes
- Availability is tested with
requireNamespace(pkg, quietly = TRUE), which returnsTRUEif the package can be loaded without attaching it and without printing status messages. - On failure,
gdpar_abortis called with:-
message:sprintf("Package '%s' is required to %s but is not installed. Please install it.", pkg, reason). -
class:"gdpar_missing_dependency_error". -
data:list(package = pkg, reason = reason), so handlers can programmatically access the offending package name and the reason string viacond$packageandcond$reason.
-
- Because the error is signaled through
gdpar_abort, thecallfield of the condition issys.call(-1)relative togdpar_abort, i.e., the call torequire_suggested. - Marked
@keywords internaland@noRd; not exported.
Purpose Internal helper to enforce that an object belongs to one of the specified classes; used to validate arguments before proceeding with computation.
Arguments
-
x: Object to check. Type:ANY. -
cls: Character vector of acceptable classes. Type:character. -
arg_name: Character scalar with the argument name as it appears in the calling function's signature, used to construct an informative error message. Type:character.
Mathematics Not applicable.
Returns Invisibly returns TRUE on success. Otherwise, it signals a condition of class gdpar_input_error via gdpar_abort.
Notes
- The condition is signaled using
gdpar_abortwith the class"gdpar_input_error". - The
datalist attached to the condition contains the original argument name, the expected classes, and the received class(es). - The check is performed using
inherits(x, cls).
Purpose Internal helper to validate that an argument is a single, finite numeric value within a specified inclusive range, with optional allowance for NULL.
Arguments
-
x: Object to check. Type:ANY. -
arg_name: Character scalar with the argument name. Type:character. -
lower: Numeric scalar specifying the inclusive lower bound. Defaults to-Inf. Type:numeric. -
upper: Numeric scalar specifying the inclusive upper bound. Defaults toInf. Type:numeric. -
allow_null: Logical flag. IfTRUE,NULLis accepted without further checks. Defaults toFALSE. Type:logical.
Mathematics Not applicable.
Returns Invisibly returns TRUE on success. Otherwise, it signals a condition of class gdpar_input_error via gdpar_abort.
Notes
- The function checks for
NULLfirst, followed by numeric type, length, and finiteness. - The range check is inclusive:
$x \in [lower, upper]$ . - The error message for an out-of-range value includes the lower and upper bounds and the received value.
Purpose Internal helper to validate that an argument is a positive integer scalar (i.e., a count) and optionally enforce a maximum value.
Arguments
-
x: Object to check. Type:ANY. -
arg_name: Character scalar with the argument name. Type:character. -
max: Integer upper bound. Defaults toInf, meaning no upper bound. Type:numeric.
Mathematics Not applicable.
Returns Invisibly returns TRUE on success. Otherwise, it signals a condition of class gdpar_input_error via gdpar_abort.
Notes
- A valid value must satisfy:
is.numeric(x),length(x) == 1L,is.finite(x),x >= 1, andxmust equalas.integer(x). - The check for integer equivalence is performed by comparing
xtoas.integer(x). Due to floating-point precision, this may reject non-integer numbers that are very close to integers.
Purpose Internal helper to validate that an argument is a one-sided formula (e.g., ~ x + y).
Arguments
-
x: Object to check. Type:ANY. -
arg_name: Character scalar with the argument name. Type:character. -
allow_null: Logical flag. IfTRUE,NULLis accepted. Defaults toTRUE. Type:logical.
Mathematics Not applicable.
Returns Invisibly returns TRUE on success. Otherwise, it signals a condition of class gdpar_input_error via gdpar_abort.
Notes
- The check for a formula object is performed using
inherits(x, "formula"). - A one-sided formula in R is represented as an object of length 2 (the call to
~and its single right-hand-side argument).
Purpose Internal helper to validate that an argument is a data frame and that it contains specific columns.
Arguments
-
x: Object to check. Type:ANY. -
arg_name: Character scalar with the argument name. Type:character. -
required_vars: Character vector of variable names that must be present as columns inx. Defaults to an empty vector. Type:character.
Mathematics Not applicable.
Returns Invisibly returns TRUE on success. Otherwise, it signals a condition of class gdpar_input_error via gdpar_abort.
Notes
- The function first checks for data frame structure using
is.data.frame(x). - It then computes
missing_vars <- setdiff(required_vars, colnames(x)). - If any required variables are missing, the error message lists them in single quotes.
W_basis(type = c("polynomial", "bspline", "user"), degree = NULL, knots = NULL, df = NULL, boundary_knots = NULL, basis_fn = NULL, dim = NULL, p = NULL)
Purpose
Constructor for the S3 class W_basis. It defines the finite-dimensional functional space from which the modulating component evaluator) that maps a value of theta_ref to a numeric basis vector, together with metadata (type, degree, knots, df, boundary_knots, dim, p, block_indices) consumed by downstream code-generation and pre-flight machinery.
Arguments
| Argument | Type | Meaning |
|---|---|---|
type |
Character scalar | Basis type; resolved via match.arg to one of "polynomial", "bspline", "user". |
degree |
Positive integer or NULL
|
Polynomial degree (type = "polynomial") or B-spline degree (type = "bspline"). Defaults to 1L (polynomial) or 3L (bspline) when NULL. Ignored for "user". |
knots |
Numeric vector or NULL
|
Interior knots passed to splines::bs. Required for "bspline" when df is not supplied. Mutually exclusive with df. |
df |
Positive integer or NULL
|
Degrees of freedom passed to splines::bs. Required for "bspline" when knots is not supplied. Mutually exclusive with knots. |
boundary_knots |
Numeric vector of length 2 or NULL
|
Fixed boundary knots for the B-spline expansion. Ignored when type != "bspline". |
basis_fn |
Function or NULL
|
User-supplied evaluator; required for type = "user", ignored otherwise. Must accept a numeric vector of length p and return a numeric vector of length dim. |
dim |
Positive integer or NULL
|
Declared basis dimension; required for type = "user", computed automatically for the package-provided types. |
p |
Positive integer or NULL
|
Dimension of theta_ref. When non-NULL, the basis is eagerly materialized via materialize_W_basis() before return. |
Mathematics
For type = "polynomial" the evaluator computes, for a numeric vector
Cross-terms between coordinates are excluded; the layout is block-by-coordinate.
For type = "bspline" the evaluator applies splines::bs independently to each scalar coordinate
where boundary_knots is NULL and knots is supplied, the boundary is set to range(c(knots, tk)) per evaluation point; when boundary_knots is supplied it is passed as Boundary.knots.
Returns
A list of class c("W_basis", "list") with components:
-
type— the resolved basis type string. -
degree— integer (polynomial/bspline) orNA_integer_(user). -
knots— the supplied knot vector orNULL. -
df— the supplied df orNULL. -
boundary_knots— the supplied boundary knots orNULL. -
dim—NA_integer_at construction for polynomial/bspline (populated later bymaterialize_W_basis()); the user-declared integer fortype = "user". -
evaluator— closure mappingtheta_refto a numeric vector. -
p—NULLunlesspwas supplied (then set bymaterialize_W_basis()). -
block_indices—NULLunless materialized.
When p is supplied, materialize_W_basis(obj, p) is invoked prior to return, so dim, p, and block_indices are populated.
Notes
-
assert_count(p, "p")is called whenpis non-NULL;pis then coerced to integer. - For
type = "bspline": if bothknotsanddfareNULL, an error of class"gdpar_input_error"is raised. If both are non-NULL, a mutual-exclusivity error is raised. -
boundary_knotsvalidation (when non-NULL): must be numeric, length exactly 2, both finite, andboundary_knots[1] < boundary_knots[2]; otherwise a"gdpar_input_error"is raised withdata = list(received = boundary_knots). Whenknotsis also supplied, the interior knots must satisfymin(knots) > boundary_knots[1]andmax(knots) < boundary_knots[2]; violation raises a"gdpar_input_error"withdatacontainingknots_rangeandboundary_knots. -
require_suggested("splines", "evaluate B-spline bases")is invoked for the bspline branch. - For
type = "user":basis_fnmust passis.function; otherwise a"gdpar_input_error"is raised.dimis validated viaassert_count. Theevaluatorslot is set directly tobasis_fn(no wrapper). - The
degreeargument is validated viaassert_countfor both polynomial and bspline branches (after default resolution). - No S3 methods are defined in this section; the class attribute is set manually as
c("W_basis", "list").
Purpose
Internal finalizer that records the total basis dimension dim, the theta_ref dimension p, and (for separable types) the per-coordinate block_indices on an existing W_basis object. For polynomial and bspline bases the dimension is derived analytically from constructor arguments; for user bases the evaluator is probed at rep(0.5, p) and the declared dim is validated against the actual output length.
Arguments
| Argument | Type | Meaning |
|---|---|---|
wb |
W_basis object |
The basis object to finalize. Validated via assert_inherits(wb, "W_basis", "wb"). |
p |
Positive integer | Length of theta_ref. Validated via assert_count; coerced to integer. |
Mathematics
For type = "polynomial":
For type = "bspline":
with block_indices given by the same formula as the polynomial case.
For type = "user": the evaluator is probed at out must be numeric and satisfy block_indices is set to NULL.
Returns
A modified W_basis object (mutated in place and returned) with dim, p, and block_indices populated. For user bases, block_indices remains NULL.
Notes
- If
wb$pis already non-NULL, non-NA, and differs from the suppliedp, a"gdpar_input_error"is raised withdata = list(wb_p = wb$p, requested_p = p). - For user bases, evaluator errors during probing are caught via
tryCatchand re-raised as"gdpar_input_error"withdata = list(probe = probe). - If the user evaluator returns a non-numeric object, a
"gdpar_input_error"is raised withdata = list(returned_class = class(out)). - If the user evaluator returns a vector whose length differs from
wb$dim, a"gdpar_input_error"is raised (nodatafield). - Marked
@keywords internaland@noRd; not exported.
Purpose
Documented (via roxygen) as an exported function that splits a separable multivariate W_basis (polynomial or bspline with p of univariate W_basis objects, each materialized with p = 1L. For type = "user" the function is documented to return NULL with a warning, since separability of a user-supplied evaluator cannot be inferred automatically.
Arguments
| Argument | Type | Meaning |
|---|---|---|
wb |
W_basis |
A basis object with p populated (either at construction or via materialize_W_basis()). |
Returns
Per the roxygen: a list of length wb$p of W_basis objects (separable types), or NULL with a warning (type = "user").
Notes
- The function body is not present in this section (section 1 of 2); only the roxygen documentation and
@exporttag appear. The implementation is deferred to a subsequent section. The description above reflects the documented contract, not verified source behavior.
Purpose: Decomposes a materialized multivariate W_basis (with p > 1) into a list of p per-coordinate W_basis objects, each with p = 1L. This supports the separable-multivariate Stan data assembly path, where each coordinate k gets its own scalar basis.
Arguments:
-
wb(W_basis): A materialized W basis object. Must inherit from"W_basis"and have a non-null, non-NApslot.
Mathematics: For a separable basis of dimension
Each
Returns: A list of length wb$p whose elements are W_basis objects with p = 1L; or NULL for type == "user" (after a warning); or NULL for unrecognized types (implicit fall-through).
Notes:
- Calls
assert_inherits(wb, "W_basis", "wb")— raises an error ifwbis not aW_basis. - If
wb$pisNULLorNA, aborts viagdpar_abortwith class"gdpar_input_error", instructing the user to either construct withp = ...or callmaterialize_W_basis(wb, p). - For
type == "user": issueswarning(...)(not an abort) and returnsNULL, because separability cannot be inferred automatically for user-supplied bases. - For
type == "polynomial": each child isW_basis(type = "polynomial", degree = wb$degree, p = 1L). - For
type == "bspline": each child carriesdegree,knots,df, andboundary_knotsfrom the parent, withp = 1L. - The loop variable
kis unused inside thelapplybody — it merely indexes position.
Purpose: S3 print method for objects of class "W_basis". Renders a human-readable summary of the basis configuration to stdout.
Arguments:
-
x(W_basis): The basis object to print. -
...(any): Unused; present for S3 generic compatibility.
Returns: Invisibly returns x.
Notes:
- Prints
<W_basis>header followed by indented fields. -
typeis always printed. -
degreeis printed only when!is.na(x$degree)(soNAdegree is suppressed). -
knotsis printed (comma-separated, viaformat()) only when!is.null(x$knots). -
dfis printed only when!is.null(x$df). -
boundary_knotsis printed (comma-separated) only when!is.null(x$boundary_knots). -
dimis printed only when!is.na(x$dim). -
pis printed when!is.null(x$p) && !is.na(x$p), with a contextual label:-
" (univariate)"whenx$p == 1L, -
" (multivariate, separable)"whenx$p != 1Land!is.null(x$block_indices), -
" (multivariate)"otherwise.
-
-
block_indicesis printed when non-NULL. Each elementidxof the list is rendered as:-
as.character(idx)whenlength(idx) == 1L, -
sprintf("%d-%d", min(idx), max(idx))otherwise. Elements are joined by", ".
-
- No side effects beyond console output.
Purpose: Resolves (derives) the interior knots of a B-spline W_basis from df and boundary_knots when knots is not explicitly provided. This ensures a deterministic knot vector for Stan-side Cox–de Boor recursion across HMC steps. Mutates and returns the input W.
Arguments:
-
W(W_basis): A W basis object, expected to havetype,knots,df,degree, andboundary_knotsslots.
Mathematics: Given degree
If
If numeric(0L) (no interior knots).
Returns: The input W_basis object with W$knots populated (mutated in place and returned). Passthrough (unchanged) when W$type != "bspline" or when W$knots is already non-NULL.
Notes:
- Early return (passthrough) if
W$type != "bspline". - Early return (passthrough) if
W$knotsis already set (non-NULL). - If
W$knotsis NULL and eitherW$dforW$boundary_knotsis NULL, aborts viagdpar_abortwith class"gdpar_input_error"anddata = list(df = W$df, boundary_knots = W$boundary_knots). - If
n_int < 0L(i.e.,df < degree), aborts viagdpar_abortwith class"gdpar_input_error"and a formatted message. - If
n_int == 0L, setsW$knots <- numeric(0L)and returns. - The function mutates
Wdirectly (R's copy-on-modify semantics apply to the caller's reference only if the caller reassigns the return value).
Purpose: Assembles the augmented (full) B-spline knot vector consumed by Stan-side Cox–de Boor recursion. This vector has clamped boundary knots repeated
Arguments:
-
W(W_basis): A B-spline W basis object (must havetype,boundary_knots,degree, and optionallyknots/df).
Mathematics: Let
with total length splines::bs(intercept = FALSE)) yields per-coordinate df
Returns: A numeric vector (the augmented knot vector). Returns numeric(0L) if W$type != "bspline".
Notes:
- If
W$type != "bspline", returnsnumeric(0L)immediately. - If
W$boundary_knotsis NULL, aborts viagdpar_abortwith class"gdpar_internal_error". - Internally calls
.gdpar_resolve_bspline_knots(W)to ensure interior knots are populated. - If the resolved
knotsslot is NULL (should not happen after resolution, but defensively handled), usesnumeric(0L). - Does not mutate
W; operates on a local copyW_resolved.
Purpose: Estimates a conservative range for boundary_knots. Applies the canonical-link inverse to the outcome as a proxy.
Arguments:
-
y_vec(numeric): The outcome/response vector. -
family(list/environment): A family object with astan_idfield, used to determine the canonical inverse link.
Mathematics: Let y_vec.
- If
$|y| < 2$ , return$[-1, 1]$ . - Determine
link_idvia.gdpar_canonical_inv_link_id_slot1(family$stan_id):-
link_id == 0(identity / unknown):$z = y$ . -
link_id == 1(logit): clamp$\tilde{y} = \min(\max(y, 10^{-6}),\, 1 - 10^{-6})$ , then$z = \log\!\left(\frac{\tilde{y}}{1 - \tilde{y}}\right)$ . - Otherwise (log):
$z = \log(\max(y, 10^{-6}))$ .
-
- Compute
$m = [\min(z),\, \max(z)]$ and$s_z = \text{sd}(z)$ . - Fallbacks for
$s_z$ : if not finite or zero,$s_z = |m_2 - m_1| / 4$ ; if still not finite or zero,$s_z = 1$ . - Return
$[m_1 - s_z,\; m_2 + s_z]$ .
Returns: A length-2 numeric vector c(lower, upper) representing the estimated
Notes:
- Filters
y_vecto finite values viais.finite()before processing. - If fewer than 2 finite values remain, returns
c(-1, 1)immediately. - The call to
.gdpar_canonical_inv_link_id_slot1(family$stan_id)is wrapped intryCatch; on error,link_iddefaults to0L(identity proxy). The docstring notes that unknown/exoticstan_ids fall back to identity with a warning, though the code itself does not issue a warning — the warning (if any) would come from.gdpar_canonical_inv_link_id_slot1. -
na.rm = TRUEis used inrange()andsd()as a secondary safety net. - The clamping constants
1e-6and1 - 1e-6preventlog(0)andlog(Inf)in the logit case.
Purpose: Unified entry point that assembles the W-related Stan data fields, dispatching over W$type into a single integer identifier W_type_id (0 = off, 1 = polynomial, 2 = B-spline) plus auxiliary fields for Cox–de Boor recursion. Used by the three assemble_stan_data* variants.
Arguments:
-
W(W_basisorNULL): The W basis object. -
use_W(integer/numeric): Flag indicating whether W is active.0Lmeans W is off. -
y_vec(numeric): The outcome vector, passed to.gdpar_estimate_theta_ref_proxy_rangefor B-spline boundary validation. -
family(list/environment): The family object, passed to.gdpar_estimate_theta_ref_proxy_range.
Mathematics: The returned W_type_id encodes the basis:
For B-splines, the augmented knot vector .gdpar_bspline_knots_full) is passed as W_knots_full with length W_n_knots_full, and W_degree
Returns: A named list of four fields, suitable for c()-merging into a stan_data list:
-
W_type_id(integer): 0, 1, or 2. -
W_n_knots_full(integer): Length of the augmented knot vector (0 for polynomial/off). -
W_knots_full(numeric): The augmented knot vector (empty for polynomial/off). -
W_degree(integer): The basis degree (0 for off).
Notes:
- If
use_W == 0LorWis NULL, returns the empty/default list immediately (all zeros/empties). - For
type == "polynomial": returnsW_type_id = 1L, no knots,W_degree = as.integer(W$degree). - For
type == "bspline":- If
W$boundary_knotsis NULL, aborts viagdpar_abortwith class"gdpar_input_error". - Calls
.gdpar_resolve_bspline_knots(W)to populate interior knots. - Calls
.gdpar_estimate_theta_ref_proxy_range(y_vec, family)to get the projected$\theta_{\text{ref}}$ range. - Calls
.gdpar_validate_bspline_boundary_range(W_resolved, proj_rng)to enforce the boundary containment contract. - Calls
.gdpar_bspline_knots_full(W_resolved)to build the augmented knot vector. - Returns
W_type_id = 2L,W_n_knots_full = length(knots_full),W_knots_full = as.double(knots_full),W_degree = as.integer(W_resolved$degree).
- If
- For any other type (including
"user"): aborts viagdpar_abortwith class"gdpar_unsupported_feature_error"anddata = list(W_type = W$type). - The
y_vecandfamilyarguments are only used in the B-spline branch.
Purpose: Validates that the B-spline boundary_knots contain the projected
Arguments:
-
W(W_basis): A W basis object withtypeandboundary_knotsslots. -
projected_range(numeric, length 2): The projected$\theta_{\text{ref}}$ range[lower, upper], typically from.gdpar_estimate_theta_ref_proxy_range.
Mathematics: Let
Equivalently, it aborts when
Returns: invisible(NULL) on success (or on early-exit conditions).
Notes:
- Early return (no-op) if
W$type != "bspline". - Early return (no-op) if
W$boundary_knotsis NULL. - Coerces
projected_rangeto numeric viaas.numeric(). - If
projected_rangedoes not have length exactly 2, or if any element is non-finite, aborts viagdpar_abortwith class"gdpar_internal_error"anddata = list(received = projected_range). This is classified as an internal error (not input error), suggesting it should not arise from user action. - On boundary violation, aborts via
gdpar_abortwith class"gdpar_input_error"anddata = list(boundary_knots = b, projected_range = rng). The message includes both ranges formatted with%g. - The function does not modify
W.
← Part IV — Exhaustive Function Reference (6/7) · gdpar Wiki Home · Part V — Stan Templates (1/3) →
- Part I — Conceptual Framework
- Part II — Mathematical Foundations
- Part III — Computational Architecture
- Part IV — Exhaustive Function Reference (1/7)
- Part IV — Exhaustive Function Reference (2/7)
- Part IV — Exhaustive Function Reference (3/7)
- Part IV — Exhaustive Function Reference (4/7)
- Part IV — Exhaustive Function Reference (5/7)
- Part IV — Exhaustive Function Reference (6/7)
- Part IV — Exhaustive Function Reference (7/7)
- Part V — Stan Templates (1/3)
- Part V — Stan Templates (2/3)
- Part V — Stan Templates (3/3)
- Part VI — Data, Benchmarks, Tests & References