Part IV Function Reference 3

← Part IV — Exhaustive Function Reference (2/7) · gdpar Wiki Home · Part IV — Exhaustive Function Reference (4/7) →

`.gdpar_eb_make_random_init(stan_data, seed_offset = 1L, base_seed = NULL)`

Purpose

Generates a list of random initial values for the Stan HMC sampler in the Empirical Bayes (EB) workflow. The structure of the returned list is conditioned on the flags and dimensions carried in stan_data, so that only parameters relevant to the configured model are initialised.

Arguments

stan_data (list): The data list prepared for Stan. The following fields are consulted:
- J_groups (integer): number of reference-parameter groups $J$.
- use_groups (integer flag, 0/1): whether group-level hyperparameters are active.
- use_a (integer flag, 0/1): whether the $a$ AMM component is active.
- J_a (integer): dimension of the $a$ component.
- use_b (integer flag, 0/1): whether the $b$ AMM component is active.
- J_b (integer): dimension of the $b$ component.
- use_W (integer flag, 0/1): whether the $W$ AMM component is active.
- dim_W (integer): row dimension of the $W$ matrix.
- d (integer): column dimension of the $W$ matrix (latent dimension).
- use_dispersion_y (integer flag, 0/1): whether an observation-level dispersion is active.
- use_dispersion_phi (integer flag, 0/1): whether a $\phi$ dispersion parameter is active.
seed_offset (integer, default 1L): integer added to base_seed to derive the RNG seed.
base_seed (integer or NULL, default NULL): base seed. If NULL, the global RNG state is left untouched.

Mathematics

When base_seed is non-NULL, the effective seed is

$$ \texttt{rng_seed} = \texttt{as.integer(base_seed)} + \texttt{seed_offset}. $$

The draws are:

$$ \theta_{ref,j} \sim \mathcal{N}(0,, 0.5^2), \quad j = 1,\dots,J $$

and, when the corresponding flag is set:

$$ \mu_{\theta_{ref}} \sim \mathcal{N}(0,, 0.5^2), \qquad \sigma_{\theta_{ref}} = |Z| + 0.1, \quad Z \sim \mathcal{N}(0,1) $$

$$ \sigma_a = |Z| + 0.1, \qquad a_{raw} \sim \mathcal{N}(0,, 0.5^2) \in \mathbb{R}^{J_a} $$

$$ \sigma_b = |Z| + 0.1, \qquad c_{b,raw} \sim \mathcal{N}(0,, 0.5^2) \in \mathbb{R}^{J_b} $$

$$ \sigma_W = |Z| + 0.1, \qquad W_{raw} \sim \mathcal{N}(0,, 0.5^2) \in \mathbb{R}^{\texttt{dim_W} \times d} $$

$$ \sigma_y = |Z| + 0.1, \qquad \phi = |Z| + 1 $$

Returns

A named list suitable for passing as init to a Stan sampler. Scalar parameters are wrapped in 1-element arrays via as.array(); W_raw is a matrix; theta_ref, a_raw, and c_b_raw are numeric vectors.

Notes

When base_seed is non-NULL, the function calls set.seed(rng_seed) and registers an on.exit handler that restores the prior .Random.seed state in .GlobalEnv (if it existed) upon return. If .Random.seed did not exist in .GlobalEnv, the handler does nothing (the seed set by set.seed persists).
The on.exit handler is registered with add = TRUE, so it composes with any pre-existing exit handlers.
Flags are tested with isTRUE(... == 1L), so any value other than exactly 1L (including TRUE or 1) is treated as inactive.

`.gdpar_eb_apply_correction(eb_correction, laplace_result, stan_data, p = 1L, verbose)`

Purpose

Entry point for the Proposition 7B coverage-discrepancy correction in the EB workflow. In the scalar regime ($p = 1$) it computes a scalar inflation constant directly; for $p > 1$ it delegates to .gdpar_eb_correction_matrix(). The correction is not applied to the raw draws here—only the scaling object is returned for downstream S3 methods.

Arguments

eb_correction (logical): whether the correction should be applied.
laplace_result (list): result of the Laplace approximation step. Must contain theta_ref_cov (a matrix, or at least an indexable object for the [1L, 1L] element in the scalar path).
stan_data (list): the Stan data list. Passed through but not directly used in the scalar computation.
p (integer, default 1L): dimensionality of the reference parameter for the correction.
verbose (logical): whether to emit a diagnostic warning when the correction is disabled.

Mathematics

Scalar form ($p = 1$), from v07 Section 6 / v07b Section 5.2:

$$ C_{g,\alpha} = \kappa(\alpha) \cdot \bigl(g'(\xi^*)\bigr)^2 \cdot \bigl(J^{\xi}\bigr)^2 \cdot \bigl(I_{\theta\theta}^{marg}\bigr)^{-1} $$

For the default identity functional $g(\xi) = \xi$ and $p = 1$, this reduces to:

$$ C_{g,\alpha} = \kappa(\alpha) \cdot \mathrm{Var}^{marg}(\theta_{ref}) $$

with $\kappa(\alpha_{95\%}) = 1.92$ hardcoded in the function.

Returns

A list with two elements:

applied (logical): TRUE if the correction was successfully computed, FALSE otherwise.
constant (numeric scalar): the scalar correction $C_{g,\alpha}$ when applied = TRUE; NA_real_ otherwise.

When $p > 1$, the return value is whatever .gdpar_eb_correction_matrix() produces (a $p \times p$ matrix in constant).

Notes

If eb_correction is FALSE and verbose is TRUE, a warning is issued via gdpar_warn() with class "gdpar_diagnostic_warning", stating that intervals will use nominal coverage and may under-cover by $O(n^{-1})$.
The marginal variance is extracted as laplace_result$theta_ref_cov[1L, 1L] inside a tryCatch; any error yields NA_real_.
If the marginal variance is not finite or is $\leq 0$, the function returns applied = FALSE, constant = NA_real_ silently (no warning).
For $p > 1$, p is coerced to integer before the delegation check.

`.gdpar_eb_correction_matrix(eb_correction, laplace_result, stan_data, p = 1L, verbose)`

Purpose

Computes the matrix-valued Proposition 7B* coverage-discrepancy correction for the multivariate regime ($p > 1$). This is the companion of the scalar path in .gdpar_eb_apply_correction() and implements v07b Section 5.1.

Arguments

eb_correction (logical): whether the correction should be applied.
laplace_result (list): Laplace approximation result containing theta_ref_cov.
stan_data (list): Stan data list (passed through, not used in computation).
p (integer, default 1L): dimension of the reference parameter.
verbose (logical): intended for diagnostics (not directly used in the body beyond being accepted).

Mathematics

Matrix form (Proposition 7B*, v07b Section 5.1):

$$ C^{*}_{g,\alpha} = \kappa(\alpha) \cdot J^{\xi,T} \cdot \Sigma^{marg}_{\theta_{ref}} \cdot J^{\xi} $$

For the default identity functional $g(\xi) = \xi$, the Jacobian $J^{\xi} = I_p$, so:

$$ C^{*}_{g,\alpha} = \kappa(\alpha) \cdot \Sigma^{marg}_{\theta_{ref}} $$

with $\kappa(\alpha_{95\%}) = 1.92$. At $p = 1$ this collapses to the scalar form $\kappa(\alpha) \cdot \mathrm{Var}^{marg}(\theta_{ref})$.

Returns

A list with two elements:

applied (logical): TRUE if the matrix correction was successfully computed.
constant (matrix): the $p \times p$ (or matching cov_mat dimension) correction matrix when applied = TRUE; an NA_real_ matrix of appropriate size otherwise.

Notes

The function aborts silently to applied = FALSE with an NA matrix in the following cases:
1. eb_correction is not TRUE.
2. laplace_result$theta_ref_cov is NULL, not a matrix, or non-square (extraction wrapped in tryCatch returning NULL on error).
3. Any element of cov_mat is non-finite.
4. Eigenvalues of cov_mat (computed via eigen(..., symmetric = TRUE, only.values = TRUE)) are non-finite, or any eigenvalue is $< -10^{-10}$ (i.e., the matrix is not positive semi-definite within tolerance).
When the PSD check fails, the returned NA matrix has dimensions matching nrow(cov_mat) / ncol(cov_mat), not necessarily p.
The eigenvalue extraction is wrapped in tryCatch returning NA_real_ on error, which then triggers the non-finite check.
Downstream S3 methods are expected to fall back to nominal credible intervals when applied = FALSE.

`.gdpar_eb_resolve_K_inputs(formula, amm, W, family, formula_set_input, amm_list_input, classic_with_amm_calls, family_is_named_list)`

Purpose

Resolves the three possible K-input patterns (formula set, named list of amm_spec, or classic formula with AMM wrapper calls) into a single canonical amm_list_canonical, and promotes the family scope accordingly. This mirrors the K-input dispatch logic of gdpar() and is the EB-path companion of .gdpar_K. The logic is intentionally duplicated rather than refactored to preserve bit-exact behaviour of golden tests.

Arguments

formula (formula or gdpar_formula_set): the model formula or formula set.
amm (amm_spec or named list of amm_spec): the AMM specification(s).
W (matrix or NULL): the $W$ matrix passed to AMM construction.
family (gdpar_family or named list of gdpar_family): the response family specification.
formula_set_input (logical): whether formula is a gdpar_formula_set.
amm_list_input (logical): whether amm is a named list of amm_spec.
classic_with_amm_calls (logical): whether the formula RHS contains a()/b()/W() wrapper calls.
family_is_named_list (logical): whether family is a named list (heterogeneous K-slot pattern).

Returns

A list with elements:

amm_list_canonical (named list of amm_spec): the resolved canonical AMM specifications, one per K-slot.
K (integer): length of amm_list_canonical.
outcome_name (character): the name of the outcome variable extracted from the formula.
formula_env (environment): the environment associated with the formula.
family_promoted: the family object after scope promotion (either a promoted gdpar_family or a heterogeneous family structure).
family_id_k_vector (integer vector or NULL): per-observation family IDs when the heterogeneous path is taken; NULL otherwise.

Notes

Three dispatch branches, evaluated in order:

formula_set_input branch: amm must be the default amm_spec() (checked via .gdpar_is_default_amm_spec()); otherwise an error of class "gdpar_input_error" is raised. The canonical list is built by .gdpar_formula_set_to_amm_spec_list(formula, W). outcome_name and formula_env are taken from formula$outcome and formula$env.
amm_list_input branch: amm is used directly as amm_list_canonical. Each slot name must be non-empty (checked via nzchar()), each entry must inherit from class "amm_spec", and slot names must be unique (anyDuplicated(...) == 0L). formula must be a two-sided formula (length(formula) == 3L). Violations raise "gdpar_input_error". outcome_name is as.character(formula[[2L]]); formula_env is environment(formula).
Classic (else) branch: amm must be the default amm_spec(). The first eligible parameter name is extracted from family—either family[[1L]]$param_specs[[1L]]$name (if family_is_named_list) or family$param_specs[[1L]]$name. A gdpar_formula_set is constructed via do.call(gdpar_formula_set, args_for_fs) with the formula named by that parameter, then .gdpar_formula_set_to_amm_spec_list(fs, W) builds the canonical list.

After resolution, $K = \texttt{length(amm\_list\_canonical)}$:

$K > 1$: If family_is_named_list, calls .gdpar_resolve_heterogeneous_family_K(family, names(amm_list_canonical)) and unpacks location_family and family_id_k_vector. Otherwise calls .gdpar_promote_scope_per_observation(family, names(amm_list_canonical)) with family_id_k_vector = NULL.
$K = 1$: If family_is_named_list, raises "gdpar_input_error" (heterogeneous path requires $K \geq 2$). Otherwise calls .gdpar_promote_scope_per_observation(family, k_name) with family_id_k_vector = NULL.

Errors raised (all via gdpar_abort with class = "gdpar_input_error"):

Formula set path with non-default amm.
Named-list amm with empty slot name, non-amm_spec entry, or duplicated names.
Named-list amm with formula that is not a two-sided formula.
Classic path with non-default amm.
Heterogeneous family (family_is_named_list = TRUE) resolved to $K = 1$.

The data field of the abort is populated for some errors (e.g., list(slot = ..., received = ...) and list(K = K)).

`.gdpar_eb_run_K(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, family_id_k_vector, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group, parametrization, id_check_rigor, eb_correction, laplace_control, call, ...)`

Purpose

Primary orchestrator for the Empirical-Bayes ("eb") estimation path under the regime $K > 1$ slots with $p = 1$ (univariate outcome shared across all slots). Internally labeled as "Sub-phase 8.6.C Path B." The function performs the complete pipeline: input validation, design-matrix construction for $K$ slots, anchor resolution, per-slot and $K$-level identifiability checks, group-aliasing checks, Stan data assembly, Laplace marginal maximization, conditional MCMC sampling, diagnostic computation, and EB correction application, returning a fitted object of class gdpar_eb_fit.

Arguments

Argument	Type	Meaning
`amm_list_canonical`	named list of length $K$	Canonical AMM (anchor model matrix) specifications. Each element is a list potentially containing `$a` (formula for the $a$-basis), `$b` (formula for the $b$-basis), and `$W` (pre-specified basis matrix or formula). Names become `slot_names`.
`family`	list / character	Response family specification passed to Stan code generators and data assemblers.
`data`	data.frame	Data containing the outcome column and all covariates referenced in slot formulas.
`prior`	list	Prior specification passed to Stan code generators.
`anchor`	various	Anchor specification (scalar, vector, or special keyword) resolved by `resolve_anchor_K`.
`outcome_name`	character	Name of the outcome column in `data`.
`formula_env`	environment	Environment attached to all internally constructed formulas via `stats::as.formula(..., env = formula_env)`.
`family_id_k_vector`	integer vector	Per-slot family identifiers of length $K$, forwarded to `.assemble_stan_data_K`.
`skip_id_check`	logical	If `TRUE`, all identifiability checks (per-slot and $K$-level) are bypassed and `id_report` is set to `NULL`.
`chains`	numeric/integer	Number of MCMC chains for the conditional model; coerced to integer.
`iter_warmup`	numeric/integer	Warmup iterations; coerced to integer.
`iter_sampling`	numeric/integer	Sampling iterations; coerced to integer.
`adapt_delta`	numeric	Stan NUTS `adapt_delta` control parameter.
`max_treedepth`	numeric/integer	Stan NUTS maximum tree depth; coerced to integer.
`refresh`	numeric/integer	Stan output refresh interval; coerced to integer.
`verbose`	logical	Controls `show_messages`, `show_exceptions` in Stan sampling, and verbosity of helper calls.
`seed`	integer or `NULL`	Random seed for Stan sampling and Laplace maximization. If non-`NULL`, coerced to integer.
`group`	various	Grouping specification for hierarchical structure, resolved by `.resolve_group_argument`.
`parametrization`	character	Requested parametrization (`"cp"` for centered, otherwise non-centered). In Path B both `cp_a` and `cp_W` are set uniformly across all $K$ slots.
`id_check_rigor`	various	Rigor level forwarded to `.check_identifiability_K` for the $K$-level check.
`eb_correction`	logical	Whether to apply the Proposition 7B coverage-discrepancy correction.
`laplace_control`	list	Control parameters forwarded to `.gdpar_eb_maximize_marginal`.
`call`	call	The original top-level function call, stored in the returned object.
`...`	any	Extra named arguments merged into the `sample_args` list passed to `cmdstanr`'s `$sample()` method, potentially overriding defaults.

Mathematics

The function implements a two-stage Empirical Bayes estimator:

Stage 1 — Laplace marginal maximization. The marginal Stan model is generated and compiled. The marginal log-posterior of the anchor parameters is maximized:

$$\hat{\boldsymbol{\theta}}_{\mathrm{ref}} = \arg\max_{\boldsymbol{\theta}_{\mathrm{ref}}} ; \log p(\boldsymbol{\theta}_{\mathrm{ref}} \mid \mathbf{y})$$

where $\boldsymbol{\theta}_{\mathrm{ref}} \in \mathbb{R}^{J_{\mathrm{groups}} \times K}$ is the vector of per-group, per-slot anchor parameters. The Laplace helper returns the mode $\hat{\boldsymbol{\theta}}_{\mathrm{ref}}$ and its standard error.

Stage 2 — Conditional MCMC. The conditional Stan model is generated, compiled, and sampled with $\hat{\boldsymbol{\theta}}_{\mathrm{ref}}$ plugged in as data (theta_ref_k_data), drawing from:

$$p(\boldsymbol{\xi} \mid \mathbf{y}, \hat{\boldsymbol{\theta}}_{\mathrm{ref}})$$

EB correction (Proposition 7B, scalar form at $p=1$). At $p = 1$, the tensor-valued correction degenerates to a per-slot scalar:

$$C_{g,\alpha}[k] = \kappa(\alpha) \cdot \Sigma^{\mathrm{marg}}_{\theta_{\mathrm{ref},k}}$$

where $\Sigma^{\mathrm{marg}}_{\theta_{\mathrm{ref},k}}$ is the marginal variance of the $k$-th slot's anchor. The correction is applied by .gdpar_eb_apply_correction.

Identifiability diagnostic test point. For each slot $k$, the diagnostic anchor is:

$$\theta_{\mathrm{diag},k} = \begin{cases} 1 & \text{if } b_k \text{ exists and } |\theta_{\mathrm{ref},k}| < 10^{-8} \ \theta_{\mathrm{ref},k} & \text{otherwise} \end{cases}$$

This avoids testing identifiability at a degenerate zero anchor when a $b$-basis is present.

Returns

A list with S3 class c("gdpar_eb_fit", "list") containing:

Element	Type / Structure	Description
`theta_ref_hat`	numeric	Laplace point estimate of the anchor (flat vector of length $J_{\mathrm{groups}} \times K$).
`theta_ref_se`	numeric	Standard error of the Laplace estimate.
`conditional_fit`	`CmdStanMCMC`	The cmdstanr fit object from the conditional model.
`amm_list_canonical`	named list	The input AMM list (with `$W` slots materialized).
`family`	—	The input family.
`prior`	—	The input prior.
`design_K`	list	Design structure from `.build_amm_design_K`, containing `Z_a_k_list`, `Z_b_k_list`, `X`, etc.
`anchor`	numeric vector	Resolved anchor values of length $K$ from `resolve_anchor_K`.
`stan_data`	list	Assembled Stan data list from `.assemble_stan_data_K`, augmented with `K_slots` and `p_dim`.
`identifiability_report`	named list or `NULL`	Per-slot identifiability reports (named by `slot_names`) with a `K_level` attribute; `NULL` when `skip_id_check = TRUE`.
`diagnostics`	—	MCMC diagnostics from `compute_diagnostics`.
`diagnostics_numerical`	—	Laplace optimizer diagnostics from `laplace_result$diagnostics`.
`parametrization`	list	Resolved parametrization with elements `cp_a` (logical), `cp_W` (logical), `cp_a_per_K` (`NULL`), and `meta` (list with `mode = "eb_K_path_B"`, `note`, `requested`).
`group_info`	list or `NULL`	Resolved group information from `.resolve_group_argument`.
`correction_applied`	logical	Whether the EB correction was applied.
`eb_correction_constant`	—	The correction constant from `.gdpar_eb_apply_correction`.
`call`	call	The original function call.
`path`	character	Always `"eb"`.
`K`	integer	Number of slots.
`slot_names`	character	Names of the AMM list elements.

Notes

Input validation errors (class gdpar_input_error):

If outcome_name is not a column in data.
If the outcome y is a matrix or an array with length(dim(y)) > 1 (Path B requires a length-$n$ univariate vector shared across all $K$ slots).
If y contains any non-finite values: for numeric y, any !is.finite(y) (NA, NaN, Inf); for non-numeric y, any is.na(y).

Identifiability errors (class gdpar_identifiability_error):

If any per-slot check gdpar_check_identifiability returns rep_k$passed != TRUE, the error data field contains slot (name) and report (the full report).
If the $K$-level check .check_identifiability_K returns passed != TRUE, the error data field contains report (the $K$-level report).
Both are bypassed when skip_id_check = TRUE.

Formula construction:

The union of all variables across all slots' $a and $b formulas is collected. If empty, the RHS is "1"; otherwise it is paste(union_vars, collapse = " + ").
The full formula is outcome_name ~ rhs_str with env = formula_env.
The RHS is extracted as formula_full[c(1L, 3L)] (a one-sided formula) and updated with ~ . + 0 to remove the intercept.

W basis materialization:

For each slot with a non-NULL $W element, materialize_W_basis(amm_list_canonical[[k]]$W, p = 1L) is called in place, mutating amm_list_canonical.

Parametrization resolution:

Both cp_a and cp_W are set to identical(parametrization, "cp"), meaning the same parametrization is applied uniformly across all $K$ slots. The meta$note explicitly states that per-slot preflight (cp_a_per_K) is queued but not yet implemented.

theta_ref_k_data reshaping:

theta_hat is extracted as as.numeric(laplace_result$theta_ref_hat).
J_groups_loc is read from stan_data$J_groups and coerced to integer.
The if/else branches both produce matrix(theta_hat, nrow = J_groups_loc, ncol = K, byrow = FALSE) — the two branches are functionally identical, suggesting a placeholder for future differentiation.
The resulting matrix is assigned to stan_data_cond$theta_ref_k_data, intended for Stan's array[J_groups] vector[K] consumer.

Stan model lifecycle:

The marginal Stan source is generated by .gdpar_eb_generate_stan_marginal, written to a tempfile via write_stan_to_tempfile, and compiled with cmdstanr::cmdstan_model.
The conditional Stan source is generated by .gdpar_eb_generate_stan_conditional and undergoes the same write-and-compile cycle.
Both tempfile paths are transient (side effect on the filesystem).

Sample arguments:

The sample_args list is constructed with explicit integer coercions for chains, iter_warmup, iter_sampling, max_treedepth, and refresh.
adapt_delta is passed without coercion.
show_messages and show_exceptions are both set to verbose.
seed is added only if non-NULL.
Extra arguments from ... are merged into sample_args by name, potentially overriding any of the above defaults.
Sampling is invoked via do.call(conditional_model$sample, sample_args).

Group aliasing:

When group_info is non-NULL, .check_group_aliasing_c7 is called for each slot $k$ with a design list containing Z_a = design_K$Z_a_k_list[[k]], Z_b = design_K$Z_b_k_list[[k]], and X = design_K$X.

Trailing roxygen block:

The section concludes with a roxygen @noRd documentation block for an internal function implementing the tensor-valued Proposition 7B* correction under $K > 1$ and $p > 1$. The function itself is not defined in this section; its documented signature includes parameters eb_correction, laplace_result_per_slot, K, p, and verbose, and it returns a list with applied, constant (3D array $[K, p, p]$), and slot_dispositions. This function appears in a subsequent section.

`.gdpar_eb_correction_tensor(eb_correction, laplace_result_per_slot, K = 2L, p = 1L, verbose = TRUE)`

Purpose

Builds a three-dimensional correction tensor for the Path C empirical-Bayes (EB) regime. The tensor scales each slot's reference-parameter covariance (extracted from per-slot Laplace results) by a fixed multiplier and is consumed downstream by S3 coverage methods. If any slot fails validation, the entire correction is disabled and downstream methods fall back to nominal coverage.

Arguments

eb_correction — logical scalar (or any value testable by isTRUE). When not TRUE, the function short-circuits and returns a disabled result with no slot processing.
laplace_result_per_slot — list of length K. Each element is a Laplace-fit result object expected to contain a theta_ref_cov_k field holding the $p \times p$ covariance of the reference parameters for that slot.
K — integer-ish scalar; number of slots. Coerced to integer. Default 2L.
p — integer-ish scalar; number of coordinates per slot. Coerced to integer. Default 1L.
verbose — logical scalar; when TRUE and at least one slot fails, a diagnostic warning is emitted via gdpar_warn.

Mathematics

For each slot $k \in \{1, \dots, K\}$ that passes validation, the correction constant is

$$C_k ;=; \kappa_{\alpha,0.95};\Sigma_k, \qquad \kappa_{\alpha,0.95} = 1.92,$$

where $\Sigma_k$ is the theta_ref_cov_k matrix for slot $k$. The resulting tensor has shape $K \times p \times p$ with slice $C[k,\cdot,\cdot] = C_k$.

The positive-semidefinite check uses eigenvalues $\lambda_i(\Sigma_k)$ computed via eigen(..., symmetric = TRUE, only.values = TRUE); a slot is rejected if any $\lambda_i < -10^{-10}$.

Returns

A named list with three components:

applied — logical scalar. TRUE only if every slot passed validation and the tensor was filled.
constant — numeric array of dimensions c(K, p, p). When applied = TRUE, filled with the scaled covariances; otherwise filled with NA_real_ (the "empty tensor").
slot_dispositions — named character vector of length K (names are seq_len(K) coerced to character). Each entry is one of: "disabled" (correction globally off), "missing" (covariance absent or wrong shape), "non_finite" (covariance contains non-finite entries), "non_psd" (eigenvalue check failed), or "ok".

Notes

The multiplier kappa_alpha_95 is hardcoded to 1.92 (not the exact 1.959964… standard-normal 97.5th percentile).
When eb_correction is not TRUE, the returned slot_dispositions are all "disabled" and names are set via setNames(rep("disabled", K), seq_len(K)).
When any slot fails, any_failed is set, the function returns applied = FALSE with an empty (NA) tensor, and—if verbose—a warning of class "gdpar_diagnostic_warning" is emitted via gdpar_warn summarising the count and unique failure types.
The PSD eigen-decomposition is wrapped in tryCatch; an error from eigen yields NA_real_ values, which then trigger the "non_psd" disposition.
The covariance shape check requires is.matrix(cov_k), nrow == p, and ncol == p.

`empty_tensor()` (nested closure inside `.gdpar_eb_correction_tensor`)

Purpose

Local helper that allocates a fresh $K \times p \times p$ array of NA_real_, used as the default/disabled constant tensor.

Arguments

None. Captures K and p from the enclosing .gdpar_eb_correction_tensor scope.

Returns

A numeric array of dimensions c(K, p, p) filled entirely with NA_real_.

Notes

Defined as a closure; not accessible outside its parent function.

`.build_amm_design_KxP(amm_list_canonical, data, formula_rhs)`

Purpose

Constructs the per-slot multivariate (ragged) design matrices for the Path C $K \times p$ specification. Iterates over the K slots of a canonicalised amm_spec list, enforces homogeneous $p \geq 2$ across slots, and delegates to .build_amm_design_multi() for each slot. The returned structure is the direct input consumed by .assemble_stan_data_KxP().

Arguments

amm_list_canonical — named list of length $K \geq 2$ of amm_spec objects. Each object must carry a $p field (defaulting to 1L via %||% if absent) that is $\geq 2$ and identical across all slots.
data — data frame containing the variables referenced by the per-slot AMM specifications. Validated by assert_data_frame().
formula_rhs — two-sided formula identifying the covariate columns of data used as the linear factor $x$. Passed through verbatim to .build_amm_design_multi() for each slot.

Returns

A named list with:

K — integer scalar; number of slots.
p — integer scalar; the homogeneous coordinate dimension (taken from the first slot).
slot_names — character vector of length K; the names() of amm_list_canonical.
design_per_slot — named list of length K. Each entry is the list returned by .build_amm_design_multi(a_k, data, formula_rhs) for that slot's amm_spec.

Notes

Aborts with class "gdpar_internal_error" via gdpar_abort if amm_list_canonical is not a list or has length < 2.
Aborts if any element lacks a non-empty name (is.null(slot_names) or any(!nzchar(slot_names))).
Aborts if the per-slot $p values are not all $\geq 2$ or not all identical. The error message includes the comma-separated p_per_slot vector.
Each slot is validated with assert_inherits(a_k, "amm_spec", ...) before delegation.
The $p extraction uses a$p %||% 1L, so a missing $p field is treated as 1L—which then triggers the homogeneous-$p \geq 2$ abort.

`.assemble_stan_data_KxP(design_KxP, family, amm_list_canonical, y_matrix, theta_anchor_kp, group_id = NULL, path = c("EB", "FB"), cp_W = FALSE)`

Purpose

Assembles the complete named-list data block consumed by the Path C Stan templates (amm_eb_marginal_KxP.stan / amm_eb_conditional_KxP.stan for the EB path; amm_canonical_pmulti_KxP.stan for the FB path). Dispatches on path to enforce or lift the Sub-phase 8.6.D first-iteration restrictions: EB hardcodes use_W = 0 and restricts stan_id to $\{1, 3\}$; FB enables the modulating $W$ (globally shared) and accepts the extended Path B family set $\{1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13\}$.

Arguments

design_KxP — list returned by .build_amm_design_KxP(). Must contain $design_per_slot, $K, and $p.
family — promoted gdpar_family object (validated by assert_inherits). Must carry a $stan_id field and a $name field.
amm_list_canonical — named list of K amm_spec objects with $p \geq 2$ per slot. Used to extract per-slot use_a / use_b flags and (FB path) $W$ metadata.
y_matrix — numeric or integer matrix of outcomes, shape $n \times p$.
theta_anchor_kp — numeric matrix of shape $K \times p$; per-slot per-coordinate anchors on the linear-predictor scale.
group_id — optional integer vector of length $n$. Resolved via .resolve_group_id().
path — character scalar; one of "EB" or "FB" (resolved by match.arg). Default "EB".
cp_W — logical scalar. Present in the signature but not referenced anywhere in the function body.

Mathematics

Per-slot per-coordinate design matrices are packed into 4D arrays with zero-padding:

$$Z_{a,kp}[k, j, \cdot, \cdot] \in \mathbb{R}^{n \times J_{a,\max}}, \qquad Z_{b,kp}[k, j, \cdot, \cdot] \in \mathbb{R}^{n \times J_{b,\max}},$$

where

$$J_{a,\max} = \max_{k,j} J_a^{(k,j)}, \qquad J_{b,\max} = \max_{k,j} J_b^{(k,j)},$$

and $J_a^{(k,j)} = \text{ncol}(d_k\$\text{Z\_a\_list}[[j]])$ (similarly for $b$). Slots/coordinates with fewer columns than the maximum are right-padded with zeros.

For the FB path with $W$ enabled, the total $W$ parameter dimension is

$$\text{dim_W}_{\text{total}} = K \cdot p \cdot W_{\text{per_kj_dim}},$$

with $W_{\text{per\_kj\_dim}} = \text{amm}\$W\$\text{dim}$ taken from the first slot that declares $W$.

The use_a_k / use_b_k flags are computed as

$$\text{use_a}_k = \mathbb{1}!\left[a_ka \neq \text{NULL} ;\lor; \exists d \in a_k\text{dims}:; da \neq \text{NULL}\right],$$

and analogously for use_b_k.

Returns

A named list. The base list (returned for both paths) contains:

Field	Type	Description
`n`	integer	Number of observations.
`K`	integer	Number of slots.
`p`	integer	Coordinate dimension.
`family_id_k_vector`	integer vector (length `K`)	Homogeneous `stan_id` replicated `K` times.
`inv_link_id_per_slot`	integer vector (length `K`)	Computed by `.gdpar_compute_inv_link_id_per_slot()`.
`use_a_k`	integer vector (length `K`)	Per-slot $a$-component flag.
`use_b_k`	integer vector (length `K`)	Per-slot $b$-component flag.
`use_W`	integer scalar	`0L` (EB) or `as.integer(any_W)` (FB).
`J_a_max`	integer	Maximum $a$-design column count.
`J_b_max`	integer	Maximum $b$-design column count.
`J_a_per_kp`	integer matrix ($K \times p$)	Per-slot per-coord $a$-design sizes.
`J_b_per_kp`	integer matrix ($K \times p$)	Per-slot per-coord $b$-design sizes.
`Z_a_kp`	numeric array ($K \times p \times n \times J_{a,\max}$)	Padded $a$-design matrices.
`Z_b_kp`	numeric array ($K \times p \times n \times J_{b,\max}$)	Padded $b$-design matrices.
`y_real`	numeric matrix ($n \times p$)	Real-valued outcomes (or zeros if `needs_real` is `FALSE`).
`y_int`	integer matrix ($n \times p$)	Integer-valued outcomes (or zeros if `needs_int` is `FALSE`).
`theta_anchor_kp`	list of `K` double vectors (each length `p`)	Row-wise decomposition of the input matrix.
`use_dispersion_y_k`	integer vector (length `K`)	Always zero in both paths.
`use_dispersion_phi_k`	integer vector (length `K`)	Always zero in both paths.
`use_groups`	(from `.resolve_group_id`)	Group flag.
`J_groups`	(from `.resolve_group_id`)	Number of groups.
`group_id`	(from `.resolve_group_id`)	Group index vector.
`K_slots`	integer	Redundant copy of `K`.
`p_dim`	integer	Redundant copy of `p`.

For the FB path only, the list is extended (c(base_list, ...)) with:

Field	Type	Description
`dim_W`	integer	Total $W$ dimension ($K \cdot p \cdot W_{\text{per\_kj\_dim}}$), or `0L`.
`d`	integer	Number of columns in the shared design matrix $X$.
`W_per_kj_dim`	integer	Per-(slot, coord) basis dimension.
`X`	numeric matrix ($n \times d$)	Shared linear-factor design matrix (or $n \times 0$).
`W_type_id`	(from `.gdpar_resolve_W_stan_data`)	$W$ basis type identifier.
`W_n_knots_full`	(from `.gdpar_resolve_W_stan_data`)	Knot count.
`W_knots_full`	(from `.gdpar_resolve_W_stan_data`)	Knot vector.
`W_degree`	(from `.gdpar_resolve_W_stan_data`)	Spline degree.

Notes

EB path restrictions: use_W is hardcoded to 0L; stan_id must be in $\{1, 3\}$ (Gaussian or Negative Binomial), otherwise a "gdpar_unsupported_feature_error" is raised. If any slot declares W != NULL on the EB path, a "gdpar_unsupported_feature_error" is raised.
FB path extensions: stan_id must be in $\{1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13\}$; otherwise a "gdpar_unsupported_feature_error" is raised. $W$ is enabled if any slot declares it; the first such slot's $W object defines the basis metadata (shared globally).
Outcome validation: For count families (stan_id $\in \{3, 10, 11, 12, 13\}$), every entry of y_matrix must be a finite, non-negative integer; otherwise a "gdpar_input_error" is raised. For continuous families (stan_id $\in \{1, 5, 6, 7, 8, 9\}$), every entry must be finite.
y_real / y_int population: needs_real is TRUE for stan_id $\in \{1, 5, 6, 7, 8, 9\}$; needs_int is TRUE for stan_id $\in \{3, 10, 11, 12, 13\}$. The unused matrix is zero-filled.
theta_anchor_kp is validated as a $K \times p$ matrix and then decomposed row-wise into a list of K length-p double vectors via lapply(seq_len(K), function(k) as.double(theta_anchor_kp[k, ])).
family_id_k_vector is rep(as.integer(stan_id), K)—homogeneous across slots regardless of path.
use_dispersion_y_k / use_dispersion_phi_k are zero vectors in both paths (the FB comment notes future B9.7+ may lift this).
cp_W is accepted as a parameter but never read.
Internal errors (class "gdpar_internal_error") are raised for: invalid design_KxP structure; K < 2 or p < 2; y_matrix not a matrix; y_matrix column count mismatch; theta_anchor_kp shape mismatch; FB path with dim_W <= 0 when use_W == 1.
Calls .resolve_group_id(), .gdpar_compute_inv_link_id_per_slot(), and (FB only) .gdpar_resolve_W_stan_data().
The pad_to local helper (see below) handles zero-padding of design matrices.

`pad_to(z, target_cols, n_rows)` (nested closure inside `.assemble_stan_data_KxP`)

Purpose

Zero-pads a design matrix z to target_cols columns. If target_cols is 0L, returns an $n_{\text{rows}} \times 0$ matrix. If z already has at least target_cols columns, returns z unchanged. Otherwise right-pads with a zero matrix.

Arguments

z — numeric matrix; the per-slot per-coordinate design matrix to pad.
target_cols — integer scalar; the target column count ($J_{a,\max}$ or $J_{b,\max}$).
n_rows — integer scalar; the number of rows to use when target_cols == 0L (i.e., $n$).

Returns

A numeric matrix with nrow(z) rows and max(ncol(z), target_cols) columns (or n_rows rows and 0 columns when target_cols == 0L).

Notes

Defined as a local closure inside .assemble_stan_data_KxP; captures nothing from the enclosing scope (all inputs are explicit arguments). When target_cols > 0L but ncol(z) >= target_cols, z is returned as-is (no truncation occurs even if z has more columns than the target).

`.gdpar_eb_make_random_init_KxP(stan_data, seed_offset = 1L, base_seed = NULL)`

Purpose Internal helper that fabricates a random initial-values list for the cmdstanr optimizer / Laplace approximation in the Path C K×p EB workflow. The returned list conforms to the cmdstanr automatic packing convention for the theta_ref_kp parameter (a 3D array [J, K, p]) and conditionally emits the auxiliary scale / raw-coefficient parameters that the K×p Stan template exposes when group structure or free a/b coefficients are active.

Arguments

stan_data — list. The Stan data environment. The following fields are consulted (via null-coalescing %||%): K_slots (fallback K), p_dim (fallback p), J_groups (fallback 1L), use_groups (fallback 0L), use_a_k, use_b_k, J_a_per_kp, J_b_per_kp.
seed_offset — integer scalar, default 1L. Integer added to base_seed to derive the per-start RNG seed, enabling distinct inits across multi-start iterations.
base_seed — integer scalar or NULL. When non-NULL, the function seeds the global RNG with as.integer(base_seed) + seed_offset and restores the prior .Random.seed state on exit. When NULL, no seeding is performed and the global RNG state is untouched.

Mathematics

The RNG seed is

$$ \text{rng_seed} = \begin{cases} \text{base_seed} + \text{seed_offset} & \text{base_seed} \neq \text{NULL} \ \text{NULL (no seeding)} & \text{otherwise} \end{cases} $$

Draws produced (all i.i.d. unless noted):

theta_ref_kp[g,k,c] $\sim \mathcal{N}(0,\, 0.1^2)$, shape $[J, K, p]$.
When use_groups == 1:
- mu_theta_ref_kp[1,k,c] $\sim \mathcal{N}(0,\, 0.1^2)$, shape $[1, K, p]$.
- sigma_theta_ref_kp[1,k,c] $= |\mathcal{N}(0.5,\, 0.05^2)|$, shape $[1, K, p]$.
When any use_a_k == 1:
- sigma_a_k[s] $= 0.1 + |\mathcal{N}(0,\, 0.02^2)|$ for $s = 1, \dots, n_{\sigma_a}$, where $n_{\sigma_a}$ is the count of slots $k$ satisfying use_a_k[k] == 1 and $\sum_{c} \mathbf{1}\{J_{a,\text{per\_kp}}[k,c] > 0\} > 0$.
- a_raw[j] $\sim \mathcal{N}(0,\, 0.1^2)$ for $j = 1, \dots, \sum_{k,c} J_{a,\text{per\_kp}}[k,c]$.
When any use_b_k == 1:
- sigma_b_k[k] $= 0.1 + |\mathcal{N}(0,\, 0.02^2)|$ for $k = 1, \dots, K$.
- c_b_kp_raw[j] $\sim \mathcal{N}(0,\, 0.1^2)$ for $j = 1, \dots, \sum_{k,c} J_{b,\text{per\_kp}}[k,c]$.

Returns

A named list. Always contains theta_ref_kp (a 3D numeric array of dim c(J, K, p)). Conditionally also contains:

mu_theta_ref_kp — 3D array [1, K, p] (only when use_groups == 1).
sigma_theta_ref_kp — 3D array [1, K, p] (only when use_groups == 1).
sigma_a_k — 1D numeric array of length n_sigma_a (only when any_use_a == 1 and n_sigma_a > 0).
a_raw — numeric vector of length total_J_a_free (only when any_use_a == 1 and total_J_a_free > 0).
sigma_b_k — 1D numeric array of length K (only when any_use_b == 1).
c_b_kp_raw — numeric vector of length total_J_b_free (only when any_use_b == 1 and total_J_b_free > 0).

Notes

Side effect: when base_seed is non-NULL, the global .Random.seed is overwritten via set.seed(rng_seed) and restored on function exit through an on.exit handler. If .Random.seed did not previously exist in .GlobalEnv, the handler performs no restoration (the seed state is left as set).
The slot-free-a mask is computed by coercing stan_data$J_a_per_kp to an integer matrix of shape K × p (row-major), then taking rowSums(.jap > 0L) > 0L intersected with use_a_k == 1L. This mirrors the n_sigma_a transformed-data quantity of the K×P Stan template; when every slot carries free a coefficients, $n_{\sigma_a} = K$ and the draw count is bit-identical to the unconditional case.
sigma_a_k and sigma_b_k are wrapped with as.array to ensure 1D-array typing expected by cmdstanr init packing.
No errors are raised by this function; malformed stan_data would propagate as errors from downstream coercions (e.g. as.integer, matrix).

`.gdpar_eb_maximize_marginal_KxP(model, stan_data, control, seed, verbose)`

Purpose

Step (i) of the EB workflow under Path C, specialized for the K×p regime. Runs a multi-start joint Laplace approximation over the full theta_ref_kp anchor tensor of shape [J_groups, K, p], selects the best init by marginal log-likelihood, draws from the Laplace approximation, extracts per-slot $p \times p$ covariance blocks (with canonical group-averaging when $J > 1$), applies per-slot adaptive Levenberg–Marquardt ridge perturbation, and gates the result on a condition-number threshold. The packaged output is consumed downstream by .gdpar_eb_correction_tensor().

Arguments

model — cmdstanr model object. Must expose $optimize and $laplace methods.
stan_data — list. Stan data list; must contain J_groups, K_slots, p_dim, plus whatever fields .gdpar_eb_make_random_init_KxP requires.
control — list. Must contain at least: multi_start_M (integer, number of starts), optim_algorithm (passed to model$optimize), laplace_draws (integer, number of Laplace draws), kappa_threshold (numeric, condition-number gate), and any fields consumed by .gdpar_eb_lm_perturb.
seed — integer scalar or NULL. Base seed for reproducibility; propagated to both the init generator and cmdstanr.
verbose — logical. Controls emission of informational messages via gdpar_inform and gdpar_warn.

Mathematics

Multi-start optimization. For $m = 1, \dots, M$:

$$ \theta^{(m)}_0 = \texttt{.gdpar_eb_make_random_init_KxP}(\text{stan_data},, m,, \text{seed}) $$

$$ \hat{\theta}^{(m)} = \arg\max_{\theta}, \log p(\theta \mid \text{data}) $$

with optimizer seed $\text{seed} + m$ (when seed non-NULL). The marginal log-likelihood of each start is $\ell_m = \texttt{opt\_m\$mle()["lp\_\_"]}$. The best start is

$$ m^\star = \arg\max_{m \in {1,\dots,M}} \ell_m $$

where inits that errored are skipped (their $\ell_m$ is NA_real_).

Laplace approximation. Given $\hat{\theta}^{(m^\star)}$,

$$ \theta^{(s)} \sim \mathcal{N}!\left(\hat{\theta}^{(m^\star)},, \left[-\nabla^2 \log p(\hat{\theta}^{(m^\star)} \mid \text{data})\right]^{-1}\right), \quad s = 1, \dots, S $$

with $S = \text{control\$laplace\_draws}$ and Laplace seed $\text{seed} + 1000$ (when seed non-NULL).

Posterior mean of the anchor tensor:

$$ \hat{\theta}_{g,k,c} = \frac{1}{S}\sum_{s=1}^{S} \theta^{(s)}_{g,k,c} $$

Per-slot covariance. For slot $k$, let $\mathcal{I}_k = \{(g, c) : g \in [J],\, c \in [p]\}$ index the $J \cdot p$ rows of the draws matrix belonging to slot $k$. The raw slot covariance is the sample covariance of those columns. When $J > 1$, the $J$ group-blocks of size $p \times p$ are averaged:

$$ \Sigma_k = \frac{1}{J}\sum_{g=1}^{J} \Sigma_{k,g} $$

where $\Sigma_{k,g}$ is the $p \times p$ sub-block of the slot covariance corresponding to group $g$. Each $\Sigma_k$ is then passed through .gdpar_eb_lm_perturb, yielding a ridge-perturbed $\tilde{\Sigma}_k$ with post-ridge condition number $\kappa_k$.

Dispersion diagnostic. Let $\mathcal{F} = \{m : \ell_m \text{ finite}\}$. Then

$$ \text{dispersion} = \begin{cases} \dfrac{\mathrm{sd}({\ell_m}_{m \in \mathcal{F}})}{\max!\left(\left|\overline{{\ell_m}_{m \in \mathcal{F}}}\right|,, 1\right)} & |\mathcal{F}| > 1 \ \text{NA_real_} & \text{otherwise} \end{cases} $$

Returns

A list with three top-level components:

theta_ref_kp_hat — 3D numeric array of dim c(J, K, p), the posterior-mean anchor tensor.
laplace_result_per_slot — named list of length K with names paste0("slot_", 1:K). Each entry is a list with element theta_ref_cov_k, a $p \times p$ numeric matrix (the ridge-perturbed, group-averaged slot covariance).
diagnostics — list with:
- kappa_per_slot — numeric vector of length K (post-ridge condition numbers).
- lm_lambda_per_slot — numeric vector of length K (LM ridge lambda used per slot).
- lm_n_iter_per_slot — integer vector of length K (LM iterations per slot).
- lm_status_per_slot — character vector of length K (LM status per slot; "not_needed" when no perturbation was required).
- multi_start_dispersion — numeric scalar or NA_real_.
- marginal_log_lik_history — numeric vector of length M (the $\ell_m$ values, with NA_real_ for failed starts).
- best_init_index — integer scalar $m^\star$.

Notes

Multi-start loop: each start's model$optimize call is wrapped in tryCatch; on error, verbose-gated gdpar_inform message of class gdpar_eb_message is emitted and the start is recorded as NULL (skipped via next). The lp__ extraction is itself wrapped in tryCatch returning NA_real_ on failure.
Best-init selection: a start becomes the new best if best_idx is NA, or if its $\ell_m$ is finite and strictly greater than the current best's recorded value (which may be NA). This means the first non-errored start with finite $\ell_m$ is always selected as a fallback.
All-starts-failed abort: if best_opt remains NULL after the loop, gdpar_abort is invoked with class c("gdpar_unsupported_feature_error") and data = list(history_lp = history_lp), recommending gdpar() (FB) instead.
Laplace failure abort: if model$laplace errors (caught by tryCatch returning NULL), gdpar_abort is invoked with class c("gdpar_eb_numerical_error") and data = list(history_lp = history_lp, best_idx = best_idx).
Missing-variable abort: the expected theta_ref_kp[g,k,c] variable names are enumerated via expand.grid(g = 1:J, k = 1:K, c = 1:p) (row-major in (g, k, c)). If the intersection with dimnames(draws)$variable does not cover all expected names, gdpar_abort is invoked with class c("gdpar_internal_error").
Condition-number gate: after per-slot LM perturbation, any slot with finite $\kappa_k > \text{control\$kappa\_threshold}$ or LM status "exhausted" triggers gdpar_abort of class c("gdpar_eb_numerical_error") with a detailed data list carrying per-slot kappa, lambda, n_iter, status, and the full history_lp.
Dispersion warning: when dispersion > 0.05 and verbose is TRUE, gdpar_warn of class c("gdpar_diagnostic_warning") is emitted with data = list(dispersion, history_lp), flagging possible multimodality.
Draws handling: uses posterior::subset_draws and posterior::as_draws_matrix; the %||% operator guards dimnames(draws)$variable.
Group-block indexing: within a slot's column subset, the $g$-th group block is assumed to occupy contiguous indices ((g-1)*p + 1):((g-1)*p + p), consistent with the (g, k, c) row-major enumeration from expand.grid.
Single-draws-column slot: when a slot has only one column in mat_kp (i.e. $J \cdot p = 1$ for that slot), the covariance is computed as matrix(stats::var(...), 1, 1) rather than stats::cov.
External dependencies: calls .gdpar_eb_lm_perturb, gdpar_inform, gdpar_abort, gdpar_warn (all defined elsewhere in the package).

`.resolve_anchor_KxP(anchor, family, y_matrix, K, p, verbose)`

Purpose

Internal helper that resolves the user-supplied anchor argument into a canonical $K \times p$ numeric matrix used by the Path C (multivariate, $K \ge 2$ AND $p \ge 2$) Empirical-Bayes pipeline. It supports four input modes: a finite numeric scalar (broadcast), a $K \times p$ finite matrix, the string "prior_mean", and the string "empirical_y".

Arguments

anchor — numeric scalar, matrix of shape $K \times p$, or character scalar in {"prior_mean", "empirical_y"}. The unresolved anchor specification.
family — a family object (list-like) exposing a linkfun function in its location slot; used only when anchor == "empirical_y".
y_matrix — numeric matrix of shape $n \times p$ containing the multivariate outcome; consumed only for the "empirical_y" branch via colMeans.
K — integer/numeric scalar; number of slots (rows of the resolved anchor).
p — integer/numeric scalar; number of coordinates per slot (columns of the resolved anchor).
verbose — logical scalar; when TRUE, an informational message is emitted for the "empirical_y" branch via gdpar_inform.

Mathematics

For the scalar branch the resolved matrix is

$$ A_{k,c} = a \in \mathbb{R}, \quad k=1,\dots,K,; c=1,\dots,p. $$

For "prior_mean",

$$ A_{k,c} = 0 \quad \forall, k,c. $$

For "empirical_y", let $\bar{y}_c = \frac{1}{n}\sum_{i=1}^{n} y_{i,c}$ be the column means of y_matrix and $g(\cdot)$ be family$linkfun. Then

$$ A_{k,c} = \begin{cases} g(\bar{y}_c) & k = 1, \ 0 & k = 2,\dots,K, \end{cases} \quad c = 1,\dots,p. $$

Returns

A matrix of mode double with nrow = K and ncol = p. In the "empirical_y" branch, row 1 holds $g(\bar y_1), \dots, g(\bar y_p)$ and rows $2,\dots,K$ are all zero.

Notes

Scalar branch: requires is.numeric(anchor), length(anchor) == 1L, and is.finite(anchor); otherwise falls through.
Matrix branch: requires is.matrix(anchor), nrow(anchor) == K, ncol(anchor) == p. If any entry is non-finite, raises an error of class c("gdpar_input_error", ...) via gdpar_abort with the message "Argument 'anchor' as a matrix must contain only finite values."
Character branch: only the two exact strings "prior_mean" and "empirical_y" are recognized. For "empirical_y", family$linkfun is applied per column inside a tryCatch; on error, gdpar_abort is invoked with class "gdpar_input_error" and a formatted message naming the offending column index and the captured condition message.
When verbose is TRUE and the "empirical_y" branch is taken, a message of class c("gdpar_anchor_message", ...) is emitted via gdpar_inform describing the computed anchor (formatted to 4 digits) and noting that all other slots are anchored at 0.
Fall-through (no recognized mode): gdpar_abort is called with class "gdpar_input_error" and data = list(received = anchor, K = K, p = p), with the message enumerating the four admissible forms.
No side effects beyond messages and errors; the function is pure with respect to its inputs.

`.gdpar_eb_run_KxP(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, family_id_k_vector, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group, parametrization, id_check_rigor, eb_correction, laplace_control, call, ...)`

Purpose

Internal orchestrator for the Path C Empirical-Bayes (EB) pipeline, invoked when $K \ge 2$ AND $p \ge 2$ (Sub-phase 8.6.D, first iteration). It validates inputs, builds the $K \times p$ AMM design, resolves the anchor, optionally runs per-slot identifiability checks, assembles Stan data, generates and compiles the marginal and conditional Stan models, runs Laplace mode-finding on the marginal model, runs HMC sampling on the conditional model, computes diagnostics, applies the EB correction tensor, and assembles a gdpar_eb_fit result object.

Arguments

amm_list_canonical — named list of length $K$ of canonicalized amm_spec objects. Each element is expected to expose fields $a, $b, $W, $dims, and $p. Names become slot_names.
family — family object/list; passed to .resolve_anchor_KxP, .gdpar_eb_check_stan_id_for_path, .assemble_stan_data_KxP, and the Stan-code generators. Must expose linkfun (used by the anchor resolver).
data — data.frame containing the outcome matrix-column and all RHS variables referenced by the AMM specifications.
prior — prior specification object; forwarded to the Stan-code generators .gdpar_eb_generate_stan_marginal and .gdpar_eb_generate_stan_conditional.
anchor — unresolved anchor specification; forwarded to .resolve_anchor_KxP.
outcome_name — character scalar naming the matrix-column in data that holds the $n \times p$ outcome.
formula_env — environment used as the enclosure for stats::as.formula constructions.
family_id_k_vector — (declared in the signature but not referenced in the body of this section) per-slot family-id vector.
skip_id_check — logical scalar. When TRUE, identifiability checking is skipped entirely and id_report is set to NULL.
chains — integer/numeric scalar; number of MCMC chains, coerced via as.integer.
iter_warmup — integer/numeric scalar; warmup iterations, coerced via as.integer.
iter_sampling — integer/numeric scalar; sampling iterations, coerced via as.integer.
adapt_delta — numeric scalar; CmdStan adapt_delta setting (passed uncoerced).
max_treedepth — integer/numeric scalar; CmdStan max_treedepth, coerced via as.integer.
refresh — integer/numeric scalar; CmdStan refresh, coerced via as.integer.
verbose — logical scalar; controls informational messages and show_messages/show_exceptions to CmdStan.
seed — NULL or integer/numeric scalar; PRNG seed for both Laplace mode-finding and HMC sampling.
group — group specification forwarded to .resolve_group_argument.
parametrization — character scalar; currently only "cp" is detected (both cp_a and cp_W are set to the same value); other values yield FALSE for both.
id_check_rigor — (declared in the signature but not referenced in the body of this section) rigor level for identifiability checks.
eb_correction — correction specification forwarded to .gdpar_eb_correction_tensor.
laplace_control — list of control parameters forwarded to .gdpar_eb_maximize_marginal_KxP.
call — the original call; stored verbatim in the returned object.
... — extra named arguments spliced into the sample_args list passed to cmdstanr::cmdstan_model$sample, overriding defaults.

Mathematics

Let $K$ be the number of slots and $p$ the (homogeneous) per-slot coordinate count. The function enforces

$$ p_k = p \quad \forall, k \in {1,\dots,K}, \qquad p \ge 2. $$

The anchor matrix $A \in \mathbb{R}^{K \times p}$ is resolved by .resolve_anchor_KxP. The Laplace stage produces a mode estimate

$$ \hat\theta^{\mathrm{ref}}_{g,k,c} \in \mathbb{R}^{J_{\mathrm{groups}} \times K \times p}, $$

together with per-slot covariance blocks $\Sigma_k \in \mathbb{R}^{p \times p}$. Per-group, per-slot, per-coordinate standard errors are obtained from the diagonal of each slot's covariance block and replicated across groups:

$$ \mathrm{SE}_{g,k,c} = \sqrt{\max\bigl([\Sigma_k]_{c,c},, 0\bigr)}, \qquad g = 1,\dots,J_{\mathrm{groups}}. $$

If $\Sigma_k$ is not a matrix, $\mathrm{SE}_{g,k,c} = \mathrm{NA}$ for all $g, c$ in that slot.

Returns

A list with S3 class c("gdpar_eb_fit", "list") containing:

theta_ref_kp_hat — the $J_{\mathrm{groups}} \times K \times p$ array of Laplace mode estimates (from laplace_result$theta_ref_kp_hat).
theta_ref_kp_se — array of shape $J_{\mathrm{groups}} \times K \times p$ of standard errors derived from per-slot covariance diagonals.
theta_ref_kp_cov_per_slot — list of length $K$ of per-slot covariance matrices (theta_ref_cov_k).
conditional_fit — the cmdstanr fit object returned by conditional_model$sample(...).
amm_list_canonical — the input amm_list_canonical (unchanged).
family — the input family.
prior — the input prior.
design_KxP — the design object returned by .build_amm_design_KxP.
anchor — the resolved $K \times p$ anchor matrix.
stan_data — the Stan data list returned by .assemble_stan_data_KxP.
identifiability_report — NULL when skip_id_check is TRUE; otherwise a named list of length $K$ of per-slot reports (entries may be NULL if a per-slot check errored).
diagnostics — object returned by compute_diagnostics(fit_cond, verbose = verbose).
diagnostics_numerical — laplace_result$diagnostics (numerical diagnostics from the Laplace stage).
parametrization — list with elements cp_a (logical), cp_W (logical), cp_a_per_K (NULL), and meta (a list with mode = "eb_KxP_path_C", a note string, and requested = list(parametrization = parametrization)).
group_info — value returned by .resolve_group_argument (may be NULL).
correction_applied — correction$applied.
correction_tensor_constant — correction$constant.
correction_tensor_dispositions — correction$slot_dispositions.
call — the input call.
path — "eb_KxP".
K — integer number of slots.
p — resolved per-slot coordinate count.
slot_names — names(amm_list_canonical).

Notes

Homogeneous-p requirement. Per-slot $p$ values are extracted via a$p %||% 1L. If any(p_per_slot != p_resolved), an error of class c("gdpar_unsupported_feature_error", ...) is raised with data = list(K = K, p_per_slot = p_per_slot), deferring heterogeneous-$p$ support to "Block 9.x".
Minimum $p$. If p_resolved < 2L, an error of class c("gdpar_internal_error", ...) is raised, indicating a dispatcher routing mistake.
Stan-id coverage. .gdpar_eb_check_stan_id_for_path(family, K, p_resolved) is called before any data validation; this is the gatekeeper for the supported (family, K, p) triples in the first iteration.
Outcome presence and shape. outcome_name must be a column of data; otherwise an error of class c("gdpar_input_error", ...) is raised. The outcome object must be a matrix (not a vector/factor); otherwise an error of class c("gdpar_input_error", ...) is raised with data = list(received_class = class(y_obj), outcome_name = outcome_name). Its column count must equal p_resolved; otherwise an error of class c("gdpar_input_error", ...) is raised with data = list(received_ncol = ncol(y_obj), p_resolved = p_resolved).
Outcome finiteness. For numeric outcomes, non-finite entries (NA, NaN, Inf) are flagged via !is.finite; for non-numeric outcomes, is.na is used. Any non-finite entry triggers an error of class c("gdpar_input_error", ...) reporting the count.
RHS union. The union of all.vars over each slot's $a, $b, and (if present) each element of $dims's $a/$b is collected. If empty, the RHS string defaults to "1"; otherwise it is paste(union_vars, collapse = " + "). A full formula outcome_name ~ rhs is constructed in formula_env, then the RHS is extracted as formula_full[c(1L, 3L)] and updated with ~ . + 0 to drop the intercept.
W component disabled. If any slot declares a non-NULL $W, an error of class c("gdpar_unsupported_feature_error", ...) is raised (per decision D39), directing the user to remove the W() wrapper or defer to Block 9.x.
Identifiability checks. When skip_id_check is TRUE, id_report is NULL. When FALSE, a verbose informational message of class c("gdpar_eb_message", ...) is emitted noting that the K-level joint check is deferred. Per-slot, the slot's own RHS variables are gathered (analogously to the union above), a per-slot RHS formula is built in formula_env, and gdpar_check_identifiability is called with theta_ref_init set to anchor_value[k, 1L] (or 1 when that anchor entry is near zero in absolute value — threshold 1e-8 — and the slot has a non-NULL $b). Each per-slot call is wrapped in tryCatch; on error, a verbose message of class c("gdpar_eb_message", ...) is emitted and the slot's report is set to NULL.
Group resolution. .resolve_group_argument(group, data, n = nrow(y_matrix), verbose = verbose) is called; if it returns non-NULL, its $group_id is forwarded to .assemble_stan_data_KxP as group_id.
Parametrization. Both cp_a and cp_W are set to identical(parametrization, "cp"); cp_a_per_K is NULL. The meta$note documents that the first iteration ships with NCP hardcoded in the Stan templates and that per-slot preflight is queued for Block 9.x.
Stan model generation and compilation. .gdpar_eb_generate_stan_marginal(prior, cp_a, cp_W, K, p, family) produces the marginal source, written to a tempfile via write_stan_to_tempfile and compiled with cmdstanr::cmdstan_model. The same pattern is used for the conditional source via .gdpar_eb_generate_stan_conditional.
Laplace stage. .gdpar_eb_maximize_marginal_KxP(model = marginal_model, stan_data = stan_data, control = laplace_control, seed = seed, verbose = verbose) returns laplace_result, expected to contain theta_ref_kp_hat (a 3D array) and laplace_result_per_slot (a list of length $K$ whose elements carry theta_ref_cov_k).
Conditional stage. stan_data_cond is a shallow copy of stan_data augmented with theta_ref_kp_data = laplace_result$theta_ref_kp_hat. The comment notes that cmdstanr's automatic packing accepts an R 3D array of shape $[J_{\mathrm{groups}}, K, p]$ for the Stan declaration array[J_groups, K] vector[p] theta_ref_kp_data. Sampling is invoked via do.call(conditional_model$sample, sample_args); sample_args includes data, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, show_messages = verbose, show_exceptions = verbose, and (if non-NULL) seed. Extra arguments from ... are spliced in by name, potentially overriding the defaults.
Diagnostics. compute_diagnostics(fit_cond, verbose = verbose) is called on the conditional fit; its return value is stored as diagnostics. The Laplace stage's own diagnostics are stored as diagnostics_numerical.
Correction tensor. .gdpar_eb_correction_tensor(eb_correction, laplace_result_per_slot, K, p_resolved, verbose) is called; its $applied, $constant, and $slot_dispositions are stored.
SE construction. A $J_{\mathrm{groups}} \times K \times p$ array of NA_real_ is allocated and filled per slot from sqrt(pmax(diag(cov_k), 0)) when cov_k is a matrix, else left as NA_real_. The same per-slot SE vector is replicated across all groups (per decision D43 = (a)).
Side effects. Writes two Stan source files to tempfiles; compiles two CmdStan models; runs Laplace optimization and HMC sampling (which may produce console output when verbose is TRUE); may emit gdpar_inform messages. No global state is mutated.
S3 class. The returned object is assigned class c("gdpar_eb_fit", "list"), enabling dispatch on gdpar_eb_fit methods downstream.
Unused formal arguments. family_id_k_vector and id_check_rigor appear in the signature but are not referenced in this section's body; they are preserved for API stability and downstream/future use.

R/families.R

`.gdpar_param_spec(name, link, did_status, did_condition, did_reference, support, prior_canonical_kind, scope, family_role)`

Purpose Internal constructor that builds a single structural-parameter specification object (class gdpar_param_spec). Each such object describes one parameter of a statistical family under the canonical contract established in the multi-parametric extension scoping (Block 8 Session 1, decision 1C): a family is represented as a list of marginal parameter specifications.

Arguments

Argument	Type	Meaning
`name`	character scalar	Short canonical name of the parameter (e.g. `"mu"`, `"sigma"`, `"phi"`, `"pi"`, `"p"`).
`link`	character scalar	Name of the link function to apply; passed to `.gdpar_link_funcs(link)`. Must be one of `"identity"`, `"log"`, `"logit"`.
`did_status`	character scalar	Identifiability status under Lemma 1B. One of `"holds"`, `"holds_under_condition"`, or `"user_responsible"`.
`did_condition`	character scalar or `NA_character_`	Human-readable description of the condition under which identifiability holds (relevant only when `did_status` is `"holds_under_condition"`).
`did_reference`	character scalar	Citation or pointer to the formal justification for the identifiability claim (e.g. `"Block 1, Section 6.4 (Lemma 1B)"`).
`support`	character scalar	Natural support of the parameter on the real line or a constrained domain. One of `"real_line"`, `"positive_real"`, `"unit_interval"`, `"bounded_open"`.
`prior_canonical_kind`	character scalar	Identifier for the canonical prior kind (decision 2b-iii), e.g. `"mu"`, `"log_sigma"`, `"logit_p"`, `"log_phi"`, `"log_shape"`, `"log_nu"`, `"power_p"`, `"logit_pi"`.
`scope`	character scalar	Scope of the parameter in the model hierarchy. One of `"per_observation"` (varies by observation) or `"population"` (shared across observations). Decision 2a-iii.
`family_role`	character scalar	Structural role of the parameter within the family. Must be one of the values returned by `.gdpar_known_family_roles()`.

Returns A list of class c("gdpar_param_spec", "list") with the following named elements:

Element	Value
`name`	The `name` argument.
`link`	The `link` argument.
`linkfun`	The link function $g(\mu)$ mapping from the natural parameter space to the linear predictor space.
`inv_link`	The inverse link function $g^{-1}(\eta)$ mapping from the linear predictor space back to the natural parameter space.
`did_status`	Identifiability status string.
`did_condition`	Identifiability condition string or `NA_character_`.
`did_reference`	Identifiability reference string.
`support`	Support string.
`prior_canonical_kind`	Prior kind identifier string.
`scope`	Scope string.
`family_role`	Role string.

Notes

The link functions (linkfun, inv_link) are obtained by calling .gdpar_link_funcs(link). If an unsupported link name is supplied, that helper will call gdpar_abort().
The class vector is c("gdpar_param_spec", "list"), enabling S3 dispatch on "gdpar_param_spec" first.
No validation of did_status, support, scope, or family_role against known values is performed inside this constructor; such validation is the responsibility of callers or upstream validators.

`.gdpar_known_family_roles()`

Purpose Returns the complete character vector of all recognised family_role values that a gdpar_param_spec entry may carry. Used by validators inside .gdpar_param_spec() and by codegen branches that emit family-specific Stan likelihood blocks.

Arguments None.

Returns A character vector of length 6:

c("location", "scale", "shape", "df", "mixture_pi", "power")

Value	Meaning
`"location"`	Position/location parameter ($\mu$ in any GLMM-like family; the per-observation primary parameter).
`"scale"`	Dispersion / scale parameter ($\sigma$ for Gaussian, $\phi$ for NB/Beta, $\phi$ for Tweedie).
`"shape"`	Shape parameter (Gamma shape, when relevant).
`"df"`	Degrees of freedom ($\nu$ for Student-t).
`"mixture_pi"`	Mixture probability (ZIP / ZINB / Hurdle).
`"power"`	Power parameter (Tweedie $p \in (1,2)$).

Notes

Pure function with no side effects.
The ordering of elements is fixed by the source code and may be relied upon for positional indexing.

`.gdpar_link_funcs(link)`

Purpose Link function factory. Given the name of a link, returns a list containing both the link function and its inverse. Centralizes the link-function switch used by the family and parameter-spec constructors.

Arguments

Argument	Type	Meaning
`link`	character scalar	Name of the link function. Must be one of `"identity"`, `"log"`, or `"logit"`.

Mathematics

For each supported link:

identity: $\mathrm{linkfun}(\mu) = \mu$, $\mathrm{inv\_link}(\eta) = \eta$
log: $\mathrm{linkfun}(\mu) = \log(\mu)$, $\mathrm{inv\_link}(\eta) = \exp(\eta)$
logit: $\mathrm{linkfun}(\mu) = \log\!\left(\dfrac{\mu}{1-\mu}\right)$, $\mathrm{inv\_link}(\eta) = \dfrac{1}{1 + \exp(-\eta)}$

Returns A named list with two elements:

Element	Type	Meaning
`linkfun`	function	The link function $g(\mu)$; takes a numeric vector on the natural parameter scale and returns the linear predictor scale.
`inv_link`	function	The inverse link function $g^{-1}(\eta)$; takes a numeric vector on the linear predictor scale and returns the natural parameter scale.

Notes

If link is not one of the three recognised names, the function calls gdpar_abort() with message "Internal error: unsupported link '<link>'." and error class "gdpar_internal_error". This abort occurs inside the switch() for inv_link first; the linkfun switch is never reached.
All returned closures are anonymous functions defined at call time.

`.gdpar_family_param_specs_for(name, link)`

Purpose Returns the canonical list of gdpar_param_spec objects describing every structural parameter that a built-in family admits as eligible for an individual specification. The first element is always the per-observation location parameter; subsequent elements are population-level auxiliary parameters in their canonical scope. The returned list is the registry of eligible parameters, not a subset selected for a particular model; promotion of auxiliary parameters to per-observation scope is handled downstream by .gdpar_promote_scope_per_observation().

Arguments

Argument	Type	Meaning
`name`	character scalar	Built-in family name. Must be one of: `"gaussian"`, `"poisson"`, `"neg_binomial_2"`, `"bernoulli"`, `"beta"`, `"gamma"`, `"student_t"`, `"tweedie"`, `"zip"`, `"zinb"`, `"hurdle_poisson"`, `"hurdle_neg_binomial_2"`.
`link`	character scalar	Link function name to use for the location ($\mu$) parameter. Passed through to `.gdpar_param_spec()`.

Returns A list of gdpar_param_spec objects (each of class c("gdpar_param_spec", "list")). The list length and contents depend on name:

Family	Parameters (in order)	`link` applied to
`"gaussian"`	`mu` (location), `sigma` (scale)	`mu`
`"poisson"`	`mu` (location)	`mu`
`"neg_binomial_2"`	`mu` (location), `phi` (scale)	`mu`
`"bernoulli"`	`mu` (location)	`mu`
`"beta"`	`mu` (location), `phi` (scale)	`mu`
`"gamma"`	`mu` (location), `shape` (shape)	`mu`
`"student_t"`	`mu` (location), `sigma` (scale), `nu` (df)	`mu`
`"tweedie"`	`mu` (location), `phi` (scale), `p` (power)	`mu`
`"zip"`	`mu` (location), `pi` (mixture_pi)	`mu`
`"zinb"`	`mu` (location), `phi` (scale), `pi` (mixture_pi)	`mu`
`"hurdle_poisson"`	`mu` (location), `pi` (mixture_pi)	`mu`
`"hurdle_neg_binomial_2"`	`mu` (location), `phi` (scale), `pi` (mixture_pi)	`mu`

All location parameters receive the user-supplied link. All auxiliary parameters use their own fixed link (always "log" for positive-real parameters, "logit" for unit-interval parameters, "identity" for the Tweedie power parameter).

Notes

If name does not match any recognised family, the function calls gdpar_abort() with message "Internal error: no canonical param_specs for family '<name>'." and error class "gdpar_internal_error".
Every location parameter has did_status = "holds", did_condition = NA_character_, and scope = "per_observation".
Every auxiliary parameter has did_status = "holds_under_condition" with a family-specific did_condition string and scope = "population".
The prior_canonical_kind for auxiliary parameters is determined by the family-specific convention (e.g. "log_sigma" for Gaussian sigma, "log_phi" for NB phi, "logit_pi" for ZIP pi).
For zip and zinb, the did_reference includes a literature citation (Lambert (1992) and Greene (1994) respectively). For hurdle_poisson and hurdle_neg_binomial_2, the reference includes Mullahy (1986).
The tweedie p parameter has support = "bounded_open" and prior_canonical_kind = "power_p", distinguishing it from all other auxiliary parameters which use "positive_real" or "unit_interval".
This function is a pure lookup; it does not mutate any global state.

`.gdpar_apply_did_override(param_specs, did_override, family_name)`

Purpose Implements the did_override plasticity argument of gdpar_family() (D2 of sub-phase 8.3.5a, 2026-05-21). Allows the user to keep the canonical likelihood and links of a built-in family but replace the identifiability descriptors of one or more parameter slots with their own declaration. Returns a new list of param-specs with the overrides applied; the input list is not mutated.

Arguments

Argument	Type	Meaning
`param_specs`	list of `gdpar_param_spec` objects	The canonical parameter specifications to modify.
`did_override`	`NULL` or a named list	If `NULL`, a no-op. Otherwise a named list whose names must be a subset of the parameter names in `param_specs`. Each entry is itself a named list with names from `{"did_status", "did_condition", "did_reference"}`.
`family_name`	character scalar	Family name used only for error messages.

Returns List of gdpar_param_spec objects with the user-supplied identifiability descriptor overrides applied.

Notes

The docstring is present in this section but the function body is not; the implementation follows in a subsequent section of R/families.R.
Validation requirements (documented but not visible in source here): did_override must be NULL or a named list whose names are a subset of the parameter names in the family's param_specs. Each entry must be a list whose names are a subset of c("did_status", "did_condition", "did_reference"); unknown fields raise an error. did_status must be one of "holds", "holds_under_condition", "user_responsible".
Design rationale: D2 of the sub-phase 8.3.5 scoping rejected registering a built-in inside .gdpar_K_custom_patterns() because that would blur the contract boundary between "package canonizes the likelihood" and "user declares D-ID". The override keeps the likelihood and links first-class while letting the user adjust identifiability semantics for their design.

`.gdpar_apply_did_override(param_specs, did_override, family_name)`

Purpose
Applies user-supplied overrides to the identifiability (D-ID) descriptors of one or more parameter specifications (param_specs) for a given family. This allows users to modify the default D-ID status, condition, and reference without altering the likelihood or link structure. It is an internal helper used during family construction.

Arguments

param_specs : A list of parameter specification objects (each a list with at least a name field). These represent the canonical parameters of the family.
did_override : Either NULL (no overrides) or a named list keyed by slot names (matching param_specs[*]$name). Each value must be a named list with optional fields: did_status, did_condition, did_reference.
family_name : A character scalar identifying the family (used in error messages).

Mathematics
No explicit mathematical formulas; implements input validation and slot-wise patching of D-ID descriptors.

Returns
The modified param_specs list with any applicable D-ID overrides applied.

Notes

Input validation is strict: did_override must be a named list with no empty names. Each slot entry must be a named list containing only allowed fields.
Allowed fields: "did_status", "did_condition", "did_reference".
Allowed values for did_status: "holds", "holds_under_condition", "user_responsible".
Raises a gdpar_input_error via gdpar_abort() if any validation fails, providing details about the offending slot or field.
If did_override is NULL, returns param_specs unchanged.
Modifies param_specs in-place (by reference) for any slot present in did_override.

`print.gdpar_param_spec(x, ...)`

Purpose
S3 print method for objects of class gdpar_param_spec. Provides a human-readable summary of a single parameter specification.

Arguments

x : An object of class gdpar_param_spec.
... : Additional arguments (unused; required for S3 generic compatibility).

Mathematics
None.

Returns
Invisibly returns the input x.

Notes

Exported as an S3 method for the base print generic.
Output includes: name, link, family_role, scope, support, did_status, did_condition (if not NA), and prior_canonical_kind.
Each field is printed on a separate line with fixed-width alignment.

`gdpar_family(name, link, did_override)`

Purpose
Constructs a standard (built-in) family object for use with the package, setting up the link function, parameter specifications, and Stan identifier. This function serves as the primary entry point for creating family objects for the built-in distributions.

Arguments

name : character scalar; one of "gaussian", "poisson", "neg_binomial_2", "bernoulli", "beta", "gamma", "student_t", "tweedie", "zip", "zinb", "hurdle_poisson", "hurdle_neg_binomial_2". Identifies the distributional family.
link : character scalar or NULL; the link function for the mean (location) parameter. If NULL, the default link for the chosen family is used.
did_override : list or NULL; if non-NULL, passed to .gdpar_apply_did_override to modify the default identifiability (D-ID) status of the family's parameters.

Returns
An object of class c("gdpar_family", "list") containing:

name : the family name.
link : the link function (character).
inv_link : the inverse-link function (closure).
linkfun : the link function itself (closure).
stan_id : integer identifier for the Stan template branch.
has_dispersion : logical; TRUE if the family has a dispersion parameter.
did_status : character; the identifiability status (e.g., "holds", "holds_under_condition", "user_responsible").
did_condition : character; description of condition under which identifiability holds, if any.
did_reference : character; reference supporting the identifiability claim.
param_specs : list of parameter specification lists (one per parameter, e.g., location and dispersion).

Notes

Raises a gdpar_input_error via gdpar_abort if the provided link is not in the set of allowed links for the chosen family.
The did_override argument is passed directly to .gdpar_apply_did_override (internal, not defined in this section) and can alter the default D-ID status of parameters.
The function internally calls .gdpar_family_param_specs_for (internal, not defined in this section) to obtain the default parameter specifications for the given family and link.

`gdpar_family_custom(name, link, did_holds, did_condition, stan_loglik_block, stan_log_lik_block, stan_y_pred_block, y_type, did_reference)`

Purpose
Constructs a custom family object for use with the package when the built-in families are insufficient. The user provides Stan code snippets for the likelihood, log-likelihood (for LOO-CV), and posterior predictive sampling, and explicitly declares identifiability (D-ID) status.

Arguments

name : character scalar; a unique identifier for the custom family. Must not coincide with any built-in family name.
link : character scalar; one of "identity", "log", "logit".
did_holds : logical scalar (TRUE or FALSE); whether the family is identifiable in its parameter. Missing values (NA) are not allowed.
did_condition : character scalar; description of the condition under which identifiability holds when did_holds = TRUE but identifiability is conditional. Use NA_character_ if identifiability holds unconditionally.
stan_loglik_block : character scalar; Stan code snippet for the model block that adds the log-likelihood for one observation. Must reference the linear predictor eta[i] and either y_real[i] (when y_type = "real") or y_int[i] (when y_type = "integer"). The legacy placeholder y[i] is not allowed.
stan_log_lik_block : character scalar; Stan code snippet for the generated quantities block that assigns to log_lik[i] for one observation. Used by gdpar_loo for LOO-CV.
stan_y_pred_block : character scalar; Stan code snippet for the generated quantities block that assigns to y_pred[i] for one observation. Used for posterior predictive checks.
y_type : character scalar; either "real" (outcome is real-valued, Stan references y_real[i]) or "integer" (outcome is integer-valued, Stan references y_int[i]).
did_reference : character scalar; citation or source supporting the identifiability declaration.

Returns
An object of class c("gdpar_family", "list") containing:

name : the custom family name.
link : the link function (character).
inv_link : the inverse-link function (closure).
linkfun : the link function itself (closure).
stan_id : NA_integer_ (custom families do not have a built-in Stan branch ID).
has_dispersion : FALSE (custom families are assumed to have only a location parameter).
did_status : derived from did_holds and did_condition (one of "holds", "holds_under_condition", "user_responsible").
did_condition : the condition text (or NA_character_).
did_reference : the user-provided reference.
stan_loglik_block : the user-provided model block snippet.
stan_log_lik_block : the user-provided generated quantities snippet for log_lik.
stan_y_pred_block : the user-provided generated quantities snippet for y_pred.
y_type : "real" or "integer".
is_custom : TRUE (flag indicating a custom family).
param_specs : a list of length 1 containing the parameter specification for the location parameter "mu".

Notes

Raises a gdpar_input_error via gdpar_abort for any validation failure (e.g., name coincides with a built-in, invalid link, missing did_holds, empty Stan snippets, legacy y[i] in snippets).
Emits a gdpar_did_message via gdpar_inform upon successful creation, reminding the user of their responsibility for the correctness of the Stan code and identifiability declaration.
The internal function .gdpar_param_spec (not defined in this section) is called to create the parameter specification for the location parameter "mu".

`.gdpar_K_custom_patterns()`

Purpose Internal (non-exported) registry function that returns a named list of canonical bi-parametric ($K=2$) likelihood patterns available for constructing custom families via gdpar_family_custom_K. Each pattern specifies the Stan log-probability density function identifier, the outcome type, and the slot-level descriptors (link, support, prior kind, scope, family role, D-ID metadata). The registry is the single source of truth for the descriptor-based wiring decided in sub-phase 8.3.4 (Block 8, D-A3.B option (b)).

Arguments

None.

Returns

A named list. Currently a single element is registered:

Name	`stan_id`	`y_type`	Slots
`lognormal_loc_scale`	`7L`	`"real"`	`mu` (location), `sigma` (scale)

Each element is itself a list with components:

stan_id — integer; the Stan-side family identifier used by the amm_distrib_K.stan dispatcher.
y_type — character; the Stan type of the outcome variable (e.g. "real").
slot_specs — a list of lists, each describing one distributional parameter slot. Every slot contains:
- name — parameter name (e.g. "mu", "sigma").
- link — link function name ("identity", "log").
- support — mathematical support ("real_line", "positive_real").
- prior_canonical_kind — canonical prior identifier ("mu", "log_sigma").
- scope — whether the parameter varies "per_observation" or is "population"-level.
- family_role — semantic role ("location", "scale").
- did_status — identifiability declaration ("holds", "holds_under_condition").
- did_condition — character or NA_character_; the condition under which D-ID holds.
- did_reference — citation string for the D-ID claim.

Notes

This function is never called directly by the user; it is consumed exclusively by gdpar_family_custom_K.
The registry is closed at the package level; adding a new $K=2$ pattern requires a source-code contribution (new list entry) rather than user-side injection, enforcing the auditability contract of D-A3.B option (b).
Currently only lognormal_loc_scale is registered (stan_id 7). The doc comments note that future sub-phases will extend the whitelist.

`gdpar_family_custom_K(name, stan_lpdf_id, did_holds = TRUE, did_condition = NULL, did_reference = NULL)`

Purpose Exported constructor for a $K=2$ custom distributional regression family. It selects a canonical bi-parametric likelihood pattern from the registry returned by .gdpar_K_custom_patterns, wires the descriptor-level metadata into gdpar_param_spec objects, and returns a gdpar_family object that routes through the same amm_distrib_K.stan branch as built-in families. This is the $K=2$ sibling of gdpar_family_custom (the $K=1$ free-form path).

Arguments

Argument	Type	Default	Meaning
`name`	character scalar	(required)	Unique identifier for the custom family. Must be non-empty, must not collide with any built-in family name.
`stan_lpdf_id`	character scalar	(required)	Registry key selecting a canonical pattern (currently only `"lognormal_loc_scale"` is valid).
`did_holds`	logical scalar	`TRUE`	Whether the D-ID condition of Lemma 1B holds for this family. If `FALSE`, every slot's `did_status` is overridden to `"user_responsible"`. `NA` is not allowed.
`did_condition`	character scalar or `NULL`	`NULL`	Optional override of the pattern-level D-ID condition string. When `NULL`, the registry's per-slot `did_condition` is used.
`did_reference`	character scalar or `NULL`	`NULL`	Optional override of the pattern-level D-ID citation. When `NULL`, the registry's per-slot `did_reference` is used.

Mathematics

No formula is computed; the constructor wires descriptors. However, the families it produces are intended for the AMM canonical form

$$y_i \sim D_k!\bigl(\cdot \mid \eta_{ik}\bigr), \qquad \eta_{ik} = g_k!\bigl(\theta_i[k]\bigr),$$

where $D_k$ is the likelihood selected by stan_lpdf_id, $g_k$ is the link function of slot $k$, and $\theta_i \in \mathbb{R}^K$ is the individual parameter vector. For the registered lognormal_loc_scale pattern the likelihood is

$$\log p(y_i \mid \mu_i, \sigma) = -\log(\sigma) - \frac{(\log y_i - \mu_i)^2}{2\sigma^2} - \frac{1}{2}\log(2\pi) - \log y_i,$$

with link $g_1(\mu)=\mu$ (identity) and $g_2(\sigma)=\log\sigma$ (log).

Returns

An object of class gdpar_family (also inherits "list") with components:

Component	Type	Description
`name`	character	The user-supplied family name.
`link`	character	Link function of the first (location) slot.
`inv_link`	function	Inverse-link function from the first slot's `gdpar_param_spec`.
`linkfun`	function	Link function object from the first slot's `gdpar_param_spec`.
`stan_id`	integer	Stan family identifier from the registry pattern (e.g. `7L`).
`has_dispersion`	logical	Always `TRUE` (both slots exist).
`did_status`	character	D-ID status of the first slot (possibly overridden).
`did_condition`	character or `NA`	D-ID condition of the first slot (possibly overridden).
`did_reference`	character	D-ID reference of the first slot (possibly overridden).
`y_type`	character	Stan outcome type from the pattern (e.g. `"real"`).
`is_custom`	logical	Always `TRUE`.
`stan_lpdf_id`	character	The registry key used.
`param_specs`	list of `gdpar_param_spec`	One per slot in the pattern (length 2 for the registered pattern).

Notes

Validation errors raised (via gdpar_abort with class "gdpar_input_error"):
- name is not a non-empty character scalar.
- name collides with a built-in family name (hard-coded vector of 12 built-in names: "gaussian", "poisson", "neg_binomial_2", "bernoulli", "beta", "gamma", "student_t", "tweedie", "zip", "zinb", "hurdle_poisson", "hurdle_neg_binomial_2").
- stan_lpdf_id is not a character scalar.
- stan_lpdf_id is not found in the registry.
- did_holds is not a non-NA logical scalar.
D-ID override logic: If did_holds is FALSE, every slot's did_status is set to "user_responsible" regardless of the registry value. The user-supplied did_condition and did_reference (when non-NULL) propagate to every slot, overriding the per-slot registry values.
Side effect: Emits an informational message (via gdpar_inform with class "gdpar_did_message") confirming the registration, the canonical pattern used, and the identifiability provenance.
No uniqueness check: The current source does not verify that name does not collide with a previously registered custom-K family in the session; only the hard-coded built-in vector is checked. This is consistent with the code as written; a session-level uniqueness guard may be added in a later sub-phase.
The returned object's top-level link, inv_link, and linkfun are taken from the first slot (param_specs[[1]]), i.e. the location parameter.

`print.gdpar_family(x, ...)`

Purpose S3 print method for objects of class gdpar_family. Provides a human-readable textual summary of the family's key metadata.

Arguments

Argument	Type	Meaning
`x`	`gdpar_family` object	The family to print.
`...`	(ignored)	Present for S3 generic compatibility; unused.

Returns

Invisibly returns x (the input object unchanged). Side effect is console output formatted as:

<gdpar_family>
  name           : <name>
  link           : <link>
  has_dispersion : <TRUE/FALSE>
  did_status     : <status>
  did_condition  : <condition>    # only if not NA
  did_reference  : <reference>
  param_specs    : <slot1_name (scope1)>, <slot2_name (scope2)>, ...

The param_specs line is only printed if x$param_specs is non-NULL. Each slot is formatted as "<name> (<scope>)" and the entries are comma-separated.

Notes

The did_condition line is suppressed when x$did_condition is NA (checked with !is.na()).
No validation of x is performed; the method trusts that the object has the expected structure.
This is the only print method for gdpar_family in this section. A separate print method for gdpar_family_multi is referenced in the gdpar_family_multi documentation but is not defined here.

`gdpar_family_multi(family, p, link)`

Purpose Exported constructor for a multivariate family object used with gdpar when the AMM specification has dimension $p > 1$. It builds a collection of $p$ univariate gdpar_family objects (one per coordinate of the individual parameter vector $\theta_i \in \mathbb{R}^p$) under the assumption that the likelihood factorizes across coordinates:

$$p(y_i \mid \theta_i) = \prod_{k=1}^{p} D_k(y_{ik} \mid \theta_i[k]),$$

with cross-dimensional coupling carried exclusively by the modulating component $W(\theta_{\text{ref}})$ of the AMM canonical form.

Arguments

Argument	Type	Default	Meaning
`family`	character scalar, `gdpar_family` object, or list of `gdpar_family`	(required)	When a character scalar, it is the name of a built-in family (one of `"gaussian"`, `"poisson"`, `"neg_binomial_2"`, `"bernoulli"`); `gdpar_family()` is called internally to construct the base family. When already a `gdpar_family` object, it is replicated $p$ times. When a list of length $p$, each element must be a `gdpar_family` (homogeneous restriction: all must share the same `stan_id`; heterogeneous per-coordinate families are deferred to a later sub-phase).
`p`	positive integer	(required)	Dimension of $\theta_i$. Must match the `p` of the `amm_spec` passed to `gdpar`.
`link`	character scalar	(implied: canonical link)	Link function used when `family` is supplied as a name. Ignored when `family` is already a `gdpar_family` object or a list.

Returns

An object of class gdpar_family_multi containing at minimum:

Component	Type	Description
`families`	list of `p` `gdpar_family` objects	One univariate family per coordinate.
`p`	integer	The dimension.
`homogeneous`	logical	`TRUE` when all $p$ families share the same `stan_id`.
`stan_id`	integer	Common Stan family identifier (present when homogeneous).
`has_dispersion`	logical	Common dispersion flag (present when homogeneous).
`name`	character	Common family name (present when homogeneous).
`did_status`	character	Common identifiability status (present when homogeneous).

A print method for gdpar_family_multi provides a human-readable summary.

Mathematics

The factorization follows architectural Option B of the Phase F decision:

$$p(y_i \mid \theta_i) = \prod_{k=1}^{p} D_k(y_{ik} \mid \theta_i[k]),$$

where $y_i \in \mathbb{R}^p$, $D_k$ is the univariate marginal distribution for coordinate $k$, and $\theta_i[k]$ is the $k$-th coordinate of the individual parameter vector. Cross-dimensional dependence enters through the AMM coupling $\theta_i = \theta_{\text{ref}} + W(\theta_{\text{ref}}) \, \delta_i$, not through the likelihood.

Notes

Scope restriction: This constructor handles the marginal factorization case only. Multi-parametric families (a single univariate outcome parametrized by the full vector $\theta_i \in \mathbb{R}^p$, e.g. a Gaussian with $\theta_i = (\mu_i, \log \sigma_i)$ in the distributional regression sense) are explicitly deferred to a dedicated post-validation block.
Homogeneity: In this version, even when a list of families is supplied, all entries must share the same stan_id. Heterogeneous per-coordinate families are deferred.
D-ID: The identifiability condition of Lemma 1B applies coordinate-wise: each univariate marginal $D_k$ identifies $\theta_i[k]$ from $y_{ik}$ independently. The cross-dimensional identifiability condition (C4-bis), guarding against aliasing between coordinates of $\theta_{\text{ref}}$ that share basis structure, is checked by gdpar_check_identifiability (Phase H, pending).
Dependency: Calls gdpar_family() when family is supplied as a character name.
The function body is not included in this section (section 4 of 7); only the roxygen block and the signature are present here. The implementation appears in a subsequent section.

`gdpar_family_multi(family, p, link = NULL)`

Purpose Constructs a multivariate family object (class gdpar_family_multi) representing a product of p univariate families across coordinates. This is the primary factory for multivariate response models in gdpar, enforcing homogeneity constraints in the current version.

Arguments

family: Input family specification. Accepted types:
- character(1): Name of a built-in family (e.g., "gaussian").
- gdpar_family: A pre-constructed univariate family object.
- list: A list of p gdpar_family objects, one per coordinate.
p: Integer > 0, the number of response dimensions (coordinates).
link: Optional character string specifying a link function. Ignored if family is already a gdpar_family object.

Mathematics None. This function is a constructor and validator.

Returns An object of class c("gdpar_family_multi", "list") with elements:

families: List of p gdpar_family objects.
p: Integer, the number of coordinates.
homogeneous: Logical, currently always TRUE (enforced).
stan_id, name, link, has_dispersion, did_status, did_condition, did_reference: Properties taken from the first (and only, due to homogeneity) family.
param_specs_per_coord: List of p param_specs lists.

Notes

Raises gdpar_input_error if p is not a positive integer, if family is an invalid type, if the list length mismatches p, or if any list element is not a gdpar_family.
Raises gdpar_unsupported_feature_error if the supplied families are heterogeneous (different stan_id or link). This is a known limitation; future versions will support heterogeneity.
If family is a character string, it is passed to gdpar_family() to create a base family, which is then replicated p times.
The link argument is ignored when a gdpar_family object is supplied to avoid silent overwrites; a warning-like abort is issued instead.

`print.gdpar_family_multi(x, ...)`

Purpose S3 print method for gdpar_family_multi objects, providing a concise summary.

Arguments

x: An object of class gdpar_family_multi.
...: Additional arguments (unused, present for S3 generic compatibility).

Mathematics None.

Returns Invisibly returns x.

Notes Prints a formatted summary to the console, including dimensions (p), homogeneity status, family name, link, and parameter specifications. If homogeneous is TRUE and param_specs_per_coord is available, it prints a summary of parameter names and scopes for the first coordinate.

`.gdpar_promote_scope_per_observation(family, k_names)`

Purpose Internal helper to promote specified parameters of a family to "per_observation" scope. This is used to implement K-individual specifications where some parameters vary per observation (e.g., in mixed-effects-like models). For multivariate families (gdpar_family_multi), promotion is applied to each coordinate's family.

Arguments

family: A gdpar_family or gdpar_family_multi object.
k_names: Character vector of parameter names to promote. May be NULL or empty (no-op).

Mathematics None. This function modifies metadata only.

Returns A copy of the input family object with updated scope fields in param_specs. The original is not mutated.

Notes

Raises gdpar_internal_error if k_names is not a character vector or if a family lacks param_specs.
Raises gdpar_input_error if any name in k_names is not among the eligible parameter names of the family.
For gdpar_family_multi, it updates each family in the families list and also updates the param_specs_per_coord field accordingly.

`.gdpar_canonical_inv_link_id_slot1(stan_id)`

Purpose Internal helper that returns the integer ID of the canonical inverse link function for the location slot (slot 1) of the built-in family identified by stan_id. This ID is used by Stan code for link dispatch.

Arguments

stan_id: Integer scalar, one of the built-in family Stan IDs (1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13).

Mathematics The mapping is:

stan_id == 1 (Gaussian) → identity link → ID 0
stan_id == 3 (Negative Binomial) → log link → ID 2
stan_id == 5 (Beta) → logit link → ID 1
stan_id == 6 (Gamma) → log link → ID 2
stan_id == 7 (Lognormal) → identity link (on log scale) → ID 0
stan_id == 8 (Student-t) → identity link → ID 0
stan_id == 9 (Tweedie) → log link → ID 2
stan_id == 10 (Zero-Inflated Poisson) → log link → ID 2
stan_id == 11 (Zero-Inflated Negative Binomial) → log link → ID 2
stan_id == 12 (Hurdle Poisson) → log link → ID 2
stan_id == 13 (Hurdle Negative Binomial) → log link → ID 2

Returns Integer scalar in {0, 1, 2}.

Notes

Raises gdpar_internal_error if stan_id is NULL or NA.
Raises gdpar_internal_error with a descriptive message if stan_id is not one of the recognized codes.

`.gdpar_canonical_inv_link_id(stan_id, slot_idx)`

Purpose Internal helper that returns the integer ID of the canonical inverse link function for an arbitrary slot index in the canonical parameter specifications of the built-in family identified by stan_id. This is used in the homogeneous branch of the heterogeneous resolver to obtain the correct link for a given slot.

Arguments

stan_id: Integer scalar, a built-in family Stan ID.
slot_idx: Integer scalar, 1-based slot index in the family's canonical param_specs.

Mathematics The mapping is family-specific and slot-specific. For example, for the Gaussian family (stan_id=1), slot 1 (mean) uses identity link (ID 0) and slot 2 (scale) uses log link (ID 2). The exact mapping is not provided in the visible code.

Returns Integer scalar in {0, 1, 2}.

Notes

The function body is not fully shown in the provided code, but it is expected to be a switch-case or lookup based on stan_id and slot_idx.
Likely raises gdpar_internal_error for invalid stan_id or slot_idx.

`.gdpar_canonical_inv_link_id_slot(stan_id, slot_idx)`

Purpose Internal helper that maps a given slot of a given family (identified by its Stan numeric ID) to a canonical integer inverse-link identifier used by the Stan template's apply_inv_link_by_id() dispatcher. It handles both built-in families (via a switch on stan_id) and the special custom-K pattern lognormal_loc_scale (stan_id 7).

Arguments

Argument	Type	Meaning
`stan_id`	Integer scalar (or coercible)	The Stan-family numeric identifier. Must not be `NULL` or `NA`.
`slot_idx`	Integer scalar	1-based index of the parameter slot whose inverse-link ID is requested. Must lie within `1:length(slot_specs)` for the relevant family/pattern.

Mathematics

The function does not compute a mathematical formula; it is a pure lookup/dispatch table. It returns the integer encoding of the link function associated with slot slot_idx:

$$ \text{inv_link_id} = \begin{cases} 0 & \text{if link} = \texttt{identity},\\ 1 & \text{if link} = \texttt{logit},\\ 2 & \text{if link} = \texttt{log}. \end{cases} $$

For stan_id 7 the slot specs are obtained from .gdpar_K_custom_patterns()[["lognormal_loc_scale"]]$slot_specs. For all other built-in families the specs come from .gdpar_family_param_specs_for(family_name, .gdpar_default_link_for(family_name)).

Returns An integer scalar in {0L, 1L, 2L} representing the canonical inverse-link identifier.

Notes

Raises a gdpar_internal_error if stan_id is NULL or NA.
For stan_id 7 (lognormal_loc_scale), the function bypasses the normal gdpar_family path and fetches specs directly from .gdpar_K_custom_patterns(), because lognormal_loc_scale is a custom-K registry pattern, not a built-in gdpar_family.
Raises a gdpar_internal_error if slot_idx is out of range for the resolved family/pattern.
Raises a gdpar_internal_error if stan_id does not correspond to any recognised built-in family (the family_name switch returns NULL).
Raises a gdpar_internal_error if the resolved link string for the slot is not one of identity, logit, or log.
No S3 dispatch; pure procedural logic.

`.gdpar_default_link_for(name)`

Purpose Returns the default (canonical) link-function string for a built-in family given its character name. Used internally by .gdpar_canonical_inv_link_id_slot to look up param_specs without the caller needing to know link conventions.

Arguments

Argument	Type	Meaning
`name`	Character scalar	One of the recognised built-in family names (see table below).

Returns A character scalar — the name of the default link function. The mapping is:

Family name	Default link
`gaussian`	`"identity"`
`poisson`	`"log"`
`neg_binomial_2`	`"log"`
`bernoulli`	`"logit"`
`beta`	`"logit"`
`gamma`	`"log"`
`student_t`	`"identity"`
`tweedie`	`"log"`
`zip`	`"log"`
`zinb`	`"log"`
`hurdle_poisson`	`"log"`
`hurdle_neg_binomial_2`	`"log"`

Notes

Implemented via a switch() with no default branch. If name does not match any of the listed families, switch() returns NULL silently (no explicit error is raised).
No S3 dispatch.

`.gdpar_canonical_support_slot1(stan_id)`

Purpose Returns the canonical support string of slot 1 (the location/first parameter) of the family identified by stan_id. Used by the heterogeneous validator (D4 of sub-phase 8.3.7): when family_id_k[k] is heterogeneous, the emitted support is the support of slot 1 of that family per the L1 refinement rule.

Arguments

Argument	Type	Meaning
`stan_id`	Integer scalar (or coercible)	The Stan-family numeric identifier.

Mathematics None — pure lookup table.

Returns A character scalar, one of "real_line", "positive_real", "unit_interval", or "bounded_open". The mapping is:

stan_id	Family	Slot 1 support
1	`gaussian`	`"real_line"`
3	`neg_binomial_2`	`"positive_real"`
5	`beta`	`"unit_interval"`
6	`gamma`	`"positive_real"`
7	`lognormal_loc_scale`	`"real_line"`
8	`student_t`	`"real_line"`
9	`tweedie`	`"positive_real"`
10	`zip`	`"positive_real"`
11	`zinb`	`"positive_real"`
12	`hurdle_poisson`	`"positive_real"`
13	`hurdle_neg_binomial_2`	`"positive_real"`

Notes

Raises a gdpar_internal_error if the integer-converted stan_id does not match any recognised key.
No S3 dispatch.

`.gdpar_support_subset_coherent(emitted, required)`

Purpose Predicate that tests whether the support emitted by a heterogeneous family's slot 1 is a coherent subset of the support required by the role of that slot in the location family. Used by the D4 heterogeneous validator.

Arguments

Argument	Type	Meaning
`emitted`	Character scalar	The support string emitted by the heterogeneous family's slot 1 (e.g. from `.gdpar_canonical_support_slot1`).
`required`	Character scalar	The support string required by the role of the slot in the location family's `param_specs`.

Mathematics

The subset-coherence relation is defined as:

$$ \begin{aligned} &\texttt{real_line} \subseteq {\texttt{real_line}}, \\ &\texttt{positive_real} \subseteq {\texttt{positive_real},; \texttt{real_line}}, \\ &\texttt{unit_interval} \subseteq {\texttt{unit_interval},; \texttt{positive_real},; \texttt{real_line}}, \\ &\texttt{bounded_open} \subseteq {\texttt{bounded_open},; \texttt{positive_real},; \texttt{real_line}}. \end{aligned} $$

The reflexive case (emitted == required) always returns TRUE.

Returns A logical scalar: TRUE if emitted is a coherent subset of required; FALSE otherwise (including the case where emitted is "custom" or any unrecognised string).

Notes

Custom supports ("custom") are implicitly rejected: they match none of the if branches and fall through to FALSE, because the heterogeneous family cannot have a custom slot 1 built-in.
Bounds are not checked for "bounded_open" (no built-in family registers a bounded range for slot 1).
No S3 dispatch.

`.gdpar_compatible_families_for_support(required, exclude_stan_ids = integer(0))`

Purpose Enumerates alternative built-in families whose slot 1 support satisfies the subset-coherence predicate against a given required support. Used by the D4 validator to build informative error messages suggesting replacement families when a user-supplied heterogeneous family fails validation.

Arguments

Argument	Type	Meaning
`required`	Character scalar	The support required by the role of the slot in the location family.
`exclude_stan_ids`	Integer vector (default `integer(0)`)	Stan IDs to exclude from the candidate list. Typically includes the location family's stan_id and the failing heterogeneous family's stan_id.

Mathematics None — iterates over a fixed candidate list and applies .gdpar_support_subset_coherent.

Returns A character vector of built-in family names whose slot 1 support coherently satisfies required, excluding any whose stan_id appears in exclude_stan_ids. May be length zero.

Notes

The candidate list is hardcoded and includes only six families: gaussian (1), neg_binomial_2 (3), beta (5), gamma (6), student_t (8), tweedie (9). Notably absent are zip, zinb, hurdle_poisson, hurdle_neg_binomial_2, poisson, and bernoulli — these are not offered as suggestions.
The location family itself is excluded via exclude_stan_ids (the homogeneous trivial case is not a suggestion).
No S3 dispatch.

`.gdpar_validate_heterogeneous_family_K(family_het_list, location_param_specs, slot_names)`

Purpose Validates a heterogeneous family list (a named list of gdpar_family objects, one per slot $k = 1, \dots, K$) against the location family's slot roles. Implements the D4 validator of sub-phase 8.3.7: for each slot $k \geq 2$, it checks that the support emitted by the heterogeneous family's slot 1 is a coherent subset of the support required by slot $k$'s role in the location family.

Arguments

Argument	Type	Meaning
`family_het_list`	Named list of `gdpar_family` objects, length $K$	The heterogeneous family specification. `[[1]]` is the location family. Names must match `slot_names`.
`location_param_specs`	List of $K$ `param_specs`	The location family's parameter specifications after promotion to per-observation, defining the canonical role assignment of the $K$ slots.
`slot_names`	Character vector, length $K$	Canonical slot names (e.g. `c("mu", "sigma", ...)`).

Mathematics None — validation logic only.

Returns invisible(NULL) on success.

Notes

Raises a gdpar_internal_error if the three inputs have inconsistent lengths $K$.
For each slot $k \geq 2$:
- Raises a gdpar_input_error if family_het_list[[k]] is not a gdpar_family object.
- Raises a gdpar_input_error if family_het_list[[k]]$stan_id is NA or NULL (custom free-form families are not admitted as heterogeneous slot entries; only built-in families and descriptor-based custom-K patterns are allowed).
- Skips validation (via next) if the heterogeneous family's stan_id equals the location family's stan_id (homogeneous slot).
- Raises a gdpar_input_error if the subset-coherence predicate fails. The error message includes the offending slot name, its role, the emitted and required supports, the location family name, and a suggestion list from .gdpar_compatible_families_for_support.
Slot 1 (the location family) is never validated here; it is the canonical reference that determines role assignment.
No S3 dispatch.

`.gdpar_inv_link_id_per_slot(family_id_k_vector, location_family)` (documentation only; body not in this section)

Purpose Computes the inv_link_id_per_slot integer vector of length $K$ passed to the Stan template's data block. Implements the L1 refined rule (D3.5 of 8.3.7) for dispatching inverse-link functions per slot.

Arguments

Argument	Type	Meaning
`family_id_k_vector`	Integer vector, length $K$	Per-slot Stan family ID. Element $k$ is the `stan_id` governing slot $k$.
`location_family`	A `gdpar_family` object	The location family (slot 1), used to look up canonical slot-$k$ links in the homogeneous case.

Mathematics

For each slot $k = 1, \dots, K$:

$$ \texttt{inv_link_id}[k] = \begin{cases} \text{canonical inv_link ID of slot } k \text{ of the location family} & \text{if } \texttt{family_id_k}[k] = \texttt{family_id_k}[1],\\ \text{canonical inv_link ID of slot } 1 \text{ of the family with stan_id } \texttt{family_id_k}[k] & \text{otherwise.} \end{cases} $$

The returned integer vector values lie in $\{0, 1, 2\}$, corresponding to identity (0), logit (1), and log (2) as defined in .gdpar_canonical_inv_link_id_slot.

Returns An integer vector of length $K$ with values in $\{0, 1, 2\}$.

Notes

Backward-compatibility note (D7 of 8.3.7): In the strict homogeneous regime (all family_id_k entries equal), the vector reproduces the canonical slot-by-slot links of the location family. The Stan refactor replaces previously hardcoded inv_link calls with apply_inv_link_by_id(), which preserves mathematical equivalence but is NOT guaranteed to be bit-for-bit identical with pre-8.3.7 outputs. Goldens for $K=2$ Beta and Gamma must be re-bootstrapped on close.
The function body is not present in this section; only the roxygen documentation block appears here.

`.gdpar_compute_inv_link_id_per_slot(family_id_k_vector, location_family)`

Purpose
Computes a vector of integer inverse-link identifiers for each slot in a heterogeneous family model. This is used to map the canonical inverse-link function of the location family to slots that share the same family as the location family, and to assign a slot-specific canonical inverse-link for slots with different families.

Arguments

family_id_k_vector (integer vector): Length-K vector of Stan family identifiers for each slot.
location_family (gdpar_family): The primary location family object whose stan_id determines the canonical inverse-link for matching slots.

Mathematics
For each slot $k = 1, \dots, K$:

If $\text{family\_id\_k\_vector}[k] == \text{location\_family\$stan\_id}$, then
$$ \text{out}[k] = \text{.gdpar_canonical_inv_link_id_slot}(\text{location_stan_id}, k). $$
Otherwise,
$$ \text{out}[k] = \text{.gdpar_canonical_inv_link_id_slot1}(\text{family_id_k_vector}[k]). $$ where .gdpar_canonical_inv_link_id_slot and .gdpar_canonical_inv_link_id_slot1 are helper functions (not defined in this section) that return integer inverse-link identifiers.

Returns
An integer vector of length $K$ where each element is the inverse-link identifier for the corresponding slot.

Notes

This is an internal helper function (not exported).
The function loops over $K$ slots.
The inverse-link identifier is used by Stan to compute the inverse-link function for the linear predictor in each slot.

`.gdpar_resolve_heterogeneous_family_K(family_input, slot_names)`

Purpose
Resolves a heterogeneous family argument (a named list of gdpar_family objects) into a canonical form for the model. It validates the input, ensures slot names match the formula set, checks for unsupported heterogeneity in certain location families, and propagates design (did) status overrides from the first slot (location family) to other slots that have the same family. This function implements the named-list public API (D5 of 8.3.7) and the materialization rule (D3.5 / L1 refined).

Arguments

family_input (either a gdpar_family object or a named list of gdpar_family objects):
- If a single gdpar_family, it is treated as homogeneous (all slots share the same family).
- If a named list, it must have one entry per slot, keyed by slot name.
slot_names (character vector): Length-K vector of canonical slot names from the gdpar_formula_set.

Returns
A named list with three components:

location_family (gdpar_family): The location family with per-slot design (did) status overrides propagated (only for slots that are heterogeneous).
family_id_k_vector (integer vector): Length-K vector of Stan family identifiers for each slot.
is_heterogeneous (logical): TRUE if at least one slot has a different Stan identifier from the first slot (i.e., the location family).

Notes

Input Validation:
- If family_input is not a gdpar_family or a list, an error of class gdpar_input_error is raised.
- If family_input is a list but not a named list (or has empty/duplicate names), an error is raised.
- The length of the list must equal $K$ (the number of slots).
- The names of the list must exactly match slot_names (no missing or extra names).
- Each element of the list must be a gdpar_family object.
Location Family Requirements:
- The first slot (location family) must have a non-NA stan_id (i.e., it must be a built-in or descriptor-based custom_K family).
Heterogeneity Restrictions:
- Heterogeneous families (different families across slots) are only supported for a subset of location families: Gaussian (1), negative binomial (3), beta (5), gamma (6), and lognormal_loc_scale (7). If heterogeneity is detected and the location family is not one of these, an error of class gdpar_unsupported_feature_error is raised.
Design (did) Propagation:
- For each slot $k \ge 2$ that has a different family from the location family, the did_status, did_condition, and did_reference fields in the location family's param_specs for that slot are overwritten with those from the heterogeneous family for that slot. The link, support, and prior canonical kind are not overwritten because the Stan side uses inv_link_id_per_slot and canonical priors are applied in linear predictor space.
Side Effects:
- The location_family returned is a modified copy of the original location family (promoted to have per-observation scope via .gdpar_promote_scope_per_observation).
- Validation of the heterogeneous family is performed by .gdpar_validate_heterogeneous_family_K (not defined in this section).
Error Handling:
- Uses gdpar_abort with appropriate classes and data for structured error reporting.
Internal Use:
- This is an internal function (not exported).

R/formula_spec.R

`formula_spec.R` (section 1 of 3)

`.gdpar_check_no_intercept_suppression(f, slot_name)`

Purpose

Internal validation guard that detects intercept-suppression syntax (- 1 or + 0) in a formula by inspecting the intercept attribute of the terms object produced by stats::terms(). In the AMM canonical form the population anchor $\theta_{\mathrm{ref}}$ is a structural term that cannot be removed; a user attempt to suppress it is treated as a contract violation and aborts with a gdpar_input_error.

Arguments

Argument	Type	Meaning
`f`	formula	A one-sided or two-sided formula object to be inspected.
`slot_name`	character (scalar)	Name of the slot within the enclosing `gdpar_formula_set`; used solely for constructing the error message.

Mathematics

The decision rule is:

$$ \text{abort} \iff \mathrm{attr}!\big(\mathrm{terms}(f),;\text{"intercept"}\big) = 0 $$

When stats::terms(f) raises an error (i.e. the formula cannot be parsed by terms), the function treats the formula as valid and returns TRUE:

$$ \mathrm{terms}(f) ;\text{fails} ;\Longrightarrow; \text{return } \mathrm{TRUE} $$

Returns

invisible(TRUE) in all non-aborting paths:

When stats::terms(f) fails and trm is NULL.
When the intercept attribute is non-zero (intercept retained).

When intercept suppression is detected, the function does not return; it calls gdpar_abort() with class = "gdpar_input_error" and a data list containing slot and formula (the deparsed formula at width.cutoff = 500L).

Notes

The error message is bilingual (English and Spanish) and directs users to glm()/brms/stan_glm() for anchor-free models.
There is a minor redundancy in the NULL branch: invisible(TRUE) is called and then immediately return(invisible(TRUE)) follows; the first call has no observable effect.
The function is marked @keywords internal and @noRd; it is not exported.
No S3 dispatch is involved; the function is a plain closure.

`.gdpar_assert_named_formula_list(args, fn_name)`

Purpose

Internal assertion that validates a list of arguments captured from ... in a constructor. It enforces four structural invariants: non-emptiness, full naming, name uniqueness, and that every element inherits from class "formula".

Arguments

Argument	Type	Meaning
`args`	list	A list captured from `...` in the calling constructor (e.g. `gdpar_formula_set`).
`fn_name`	character (scalar)	Name of the calling function, embedded in error messages for traceability.

Returns

invisible(TRUE) on success. On any validation failure, the function aborts via gdpar_abort() with class = "gdpar_input_error".

Notes

Validation proceeds in four sequential checks, each of which aborts on failure:

Non-emptiness. If length(args) == 0L, aborts with message "<fn_name>() requires at least one named formula." No data field is attached.
Naming. arg_names <- names(args). If arg_names is NULL or any element fails nzchar() (i.e. is ""), aborts with a message instructing the user to name formulas with the canonical parameter name (e.g. mu = y ~ a(x)). No data field is attached.
Uniqueness. anyDuplicated(arg_names) > 0L triggers an abort. Duplicates are collected via unique(arg_names[duplicated(arg_names)]) and reported with sQuote and comma separation. The data field is list(duplicates = dups).
Formula class. Iterates seq_along(args); if any element does not inherits(., "formula"), aborts with a message naming the slot, the function, and the received class(es) (via sQuote(class(args[[i]]))). The data field is list(slot = arg_names[[i]], received = class(args[[i]])).

The function is marked @keywords internal and @noRd.

`gdpar_formula_set(...)`

Purpose

Exported constructor that builds the canonical internal representation of a multi-parameter AMM formula set. This object is the single source of truth for which structural parameters of the distributional family receive an AMM (per-observation) design. It is consumed downstream by gdpar().

Arguments

Argument	Type	Meaning
`...`	Named formula objects	The first must be a two-sided formula whose LHS is a single symbol naming the outcome (e.g. `mu = y ~ a(x)`). Subsequent arguments must be one-sided formulas (e.g. `sigma = ~ a(x)`).

Mathematics

Let $\mathcal{F} = \{f_1, f_2, \dots, f_K\}$ be the set of $K$ formulas supplied. The constructor partitions $\mathcal{F}$ into:

$f_1$: a two-sided formula $\text{outcome} \sim \text{design}_1$, where $\text{outcome}$ is a bare symbol.
$f_k$ for $k \geq 2$: one-sided formulas $\sim \text{design}_k$.

Each formula name corresponds to a canonical parameter $\theta_k$ of the family. Downstream, gdpar() promotes every named parameter to:

$$ \mathrm{scope}(\theta_k) = \text{"per_observation"}, \quad k = 1, \dots, K $$

while unnamed family parameters retain their canonical scope (typically "population"). The population anchor $\theta_{\mathrm{ref}}$ is structural and cannot be suppressed.

Returns

An S3 object of class c("gdpar_formula_set", "list") with four components:

Component	Type	Description
`outcome`	character (scalar)	The outcome variable name extracted from the LHS of the first formula.
`formulas`	named list of formula	The original `args` list, preserving order and names. The first element is two-sided; the rest are one-sided.
`param_names`	character vector	Identical to `names(formulas)`.
`env`	environment	`environment(first_f)`, the environment of the first formula, used downstream for evaluation.

Notes

Validation is performed in a strict sequence; the first failing check aborts via gdpar_abort() with class = "gdpar_input_error":

Named formula list. Delegates to .gdpar_assert_named_formula_list(args, "gdpar_formula_set"), which checks non-emptiness, naming, uniqueness, and formula class.
First slot is two-sided. length(first_f) != 3L detects a non-two-sided first formula (R formulas have length 2 when one-sided, 3 when two-sided). The error data is list(slot = arg_names[[1L]], received = deparse(first_f, width.cutoff = 500L)).
First slot LHS is a bare symbol. outcome_expr <- first_f[[2L]]; if !is.symbol(outcome_expr), the function aborts. This rejects function calls on the LHS such as log(y). The error data is list(slot = arg_names[[1L]], received = deparse(outcome_expr, width.cutoff = 500L)). The outcome name is then extracted via as.character(outcome_expr).
Subsequent slots are one-sided. For i in 2:length(args), length(f) != 2L triggers an abort. The error data is list(slot = arg_names[[i]], received = deparse(f, width.cutoff = 500L)).
No intercept suppression. Iterates seq_along(args) and calls .gdpar_check_no_intercept_suppression(args[[i]], arg_names[[i]]) for every formula.

Set membership (slot names must be a subset of the family's eligible param_specs) is not enforced here; it is deferred to gdpar() once both the formula set and the family are available.

The function is exported (@export).

`gdpar_bf(...)`

Purpose

Exported brms-style sugar constructor that produces a gdpar_formula_set from a sequence of two-sided formulas. It is a convenience wrapper around gdpar_formula_set(), designed to feel familiar to users of brms::bf. The brms package is not a runtime dependency.

Arguments

Argument	Type	Meaning
`...`	Two-sided formula objects	The first carries the outcome on its LHS and defaults to slot name `"mu"`. Subsequent formulas carry the canonical parameter name on their LHS (e.g. `sigma ~ a(x)`).

Mathematics

The sugar translates a sequence of two-sided formulas into the canonical representation. For the default case:

$$ \texttt{gdpar_bf}\big(y \sim a(x_1),;\sigma \sim a(x_2)\big) ;\equiv; \texttt{gdpar_formula_set}\big(\mu = y \sim a(x_1),;\sigma = \sim a(x_2)\big) $$

When the first argument is explicitly named (e.g. theta = y ~ a(x)), that name overrides the default "mu":

$$ \texttt{gdpar_bf}(\theta = y \sim a(x)) ;\equiv; \texttt{gdpar_formula_set}(\theta = y \sim a(x)) $$

Returns

An object of class c("gdpar_formula_set", "list"), identical in structure to what gdpar_formula_set() returns (components: outcome, formulas, param_names, env).

Notes

The function body is not present in this source section. Only the roxygen documentation (@param, @return, @examples, @seealso, @export) appears in section 1 of 3. The implementation is expected in a subsequent section; based on the documentation, it internally calls gdpar_formula_set() after converting the two-sided formulas into the canonical form (first formula's LHS becomes the outcome, subsequent formulas' LHSs become slot names and their RHSs become one-sided formulas).
The function is exported (@export).
No S3 dispatch is involved at the constructor level.

`gdpar_bf(...)`

Purpose User-facing constructor that assembles a gdpar_formula_set from a variable number of formula arguments. It is the primary entry point for declaring the outcome formula together with one or more parameter (slot) formulas in a single call, performing full validation of shapes, names, and duplicates before delegating to gdpar_formula_set.

Arguments

... : formula objects. The first must be a two-sided formula whose left-hand side is the outcome variable (e.g. y ~ a(x)). Subsequent formulas must be two-sided with a single symbol on the left-hand side naming the parameter slot (e.g. sigma ~ a(x)), or one-sided formulas supplied via a named argument whose name matches the intended slot (e.g. sigma = ~ a(x)). Argument names are optional for positional use; when present they must agree with the LHS symbol of the corresponding formula.

Returns A gdpar_formula_set object, obtained by do.call(gdpar_formula_set, args) where args is the validated, named list of formulas. The first slot is named "mu" when no explicit name is supplied; subsequent slots are named after their LHS symbol (or their explicit argument name).

Notes

Raises a gdpar_input_error (via gdpar_abort) when:
- no arguments are supplied;
- any argument is not a formula;
- the first formula is not two-sided (length 3);
- any formula after the first is not two-sided;
- the LHS of a non-first formula is not a single symbol;
- an explicit argument name disagrees with the LHS symbol of its formula;
- duplicate slot names are detected.
For formulas after the first, the two-sided formula is converted to a one-sided formula ~ <RHS> preserving the original environment, before being placed in the named list passed to gdpar_formula_set.
The first formula is left untouched (still two-sided) in the list handed to gdpar_formula_set; downstream code is expected to extract the outcome from it.
Error objects carry a data field with diagnostic information (position, received class, duplicates, etc.).
Side effects: none beyond error signaling and construction of the returned object.

`print.gdpar_formula_set(x, ...)`

Purpose S3 print method for objects of class gdpar_formula_set. Renders a compact human-readable summary showing the number of slots K, the outcome variable, the parameter names in declaration order, and each slot's formula.

Arguments

x : a gdpar_formula_set object. Expected to be a list with components outcome (character scalar), param_names (character vector), and formulas (named list of formulas).
... : unused; present for S3 generic compatibility.

Returns Invisibly returns x.

Notes

Dispatches on class gdpar_formula_set.
Uses deparse(f, width.cutoff = 500L) to render each formula on a single logical line.
No validation of x is performed; malformed inputs may produce partial or confusing output.

`[[.gdpar_formula_set(x, i)`

Purpose S3 double-bracket subsetting operator for gdpar_formula_set objects. Extracts a single formula by slot name or integer index.

Arguments

x : a gdpar_formula_set object containing a formulas component (a named list).
i : a character scalar (slot name) or an integer scalar (position), passed directly to [[ on x$formulas.

Returns The single formula stored at slot i (not wrapped in a list).

Notes

Dispatches on class gdpar_formula_set.
Inherits the indexing semantics of R's [[ for lists: out-of-bounds integer indices return NULL, and partial matching may apply for character indices.
No re-validation is performed; the return is a bare formula.

`[.gdpar_formula_set(x, i)`

Purpose S3 single-bracket subsetting operator for gdpar_formula_set objects. Returns a named list of formulas for the requested slots.

Arguments

x : a gdpar_formula_set object containing a formulas component.
i : a character vector of slot names or an integer vector of positions, passed directly to [ on x$formulas.

Returns A named list of formulas (a plain list, not a gdpar_formula_set). The documentation explicitly notes that downstream re-validation requires reconstruction via gdpar_formula_set.

Notes

Dispatches on class gdpar_formula_set.
The return value loses the gdpar_formula_set class intentionally; callers needing a validated set must rebuild one.
Indexing semantics (recycling, out-of-bounds producing NA, partial matching) follow base R list behavior.

`names.gdpar_formula_set(x)`

Purpose S3 names method for gdpar_formula_set objects. Returns the slot (parameter) names in declaration order.

Arguments

x : a gdpar_formula_set object containing a param_names component (character vector).

Returns A character vector equal to x$param_names, giving the slot names in declaration order.

Notes

Dispatches on class gdpar_formula_set.
No validation is performed; the value is returned verbatim from x$param_names.

`length.gdpar_formula_set(x)`

Purpose S3 length method for gdpar_formula_set objects. Reports the number of parameter slots K.

Arguments

x : a gdpar_formula_set object containing a formulas component (a named list).

Returns An integer scalar equal to length(x$formulas), i.e. the number of slots (including the outcome slot mu).

Notes

Dispatches on class gdpar_formula_set.
The returned count includes the outcome slot, so it is the total number of formulas stored, not the number of auxiliary parameters.

`.gdpar_rhs_has_amm_calls(formula)`

Purpose Internal predicate used by the dispatch logic in gdpar (sub-phase 8.3.3, decision P-dispatch) to decide whether a classic two-sided formula should be routed through the new K-individual parser path or through the legacy single-amm_spec path. Returns TRUE when at least one top-level summand of the RHS is a call to one of the AMM wrappers a, b, or W.

Arguments

formula : a classic two-sided (or one-sided) formula object, or any R object.

Returns A logical scalar. FALSE when formula is not a formula, or when no top-level summand is a call to a, b, or W; TRUE otherwise.

Mathematics

Let the RHS expression be $e$. Define a predicate $H(e)$ recursively:

$$ H(e) = \begin{cases} H(e_2) \lor H(e_3) & \text{if } e \text{ is a binary call } e_1 + e_2 + e_3 \text{ with } e_1 \equiv +, \ \mathbb{1}{\mathrm{head}(e) \in {a, b, W}} & \text{if } e \text{ is a call with a symbol head}, \ \text{FALSE} & \text{otherwise.} \end{cases} $$

The function returns $H(e)$ where $e$ is formula[[3L]] for two-sided formulas or formula[[2L]] for one-sided formulas.

Notes

Non-throwing by design: an invalid RHS that the parser would later reject still returns TRUE here when an AMM wrapper is detected, so that dispatch routes the call to the parser which produces the canonical error message.
Only top-level summands of a + tree are inspected; nested calls inside other operators are not traversed (the recursion only descends through binary +).
A call head must be a symbol; calls with non-symbol heads (e.g. :: or $-qualified names) do not trigger TRUE.
Returns FALSE immediately when formula does not inherit from "formula".

`.gdpar_is_default_amm_spec(amm)`

Purpose Internal predicate that detects whether an amm_spec object is the canonical default Level 0 spec (no active components). Used by the dispatch in gdpar to treat an uncustomized amm argument as effectively absent when the new K-individual paths are taken.

Arguments

amm : any R object.

Returns A logical scalar. TRUE when amm inherits from "amm_spec" and all of the following hold: amm$a is NULL, amm$b is NULL, amm$W is NULL, amm$x_vars is NULL, amm$p is NULL or equal to 1L, and amm$dims is NULL. FALSE otherwise.

Notes

The check on p uses is.null(amm$p) || isTRUE(amm$p == 1L), so a missing p component also qualifies as default.
Accesses components via $, which returns NULL for absent components in lists; thus an amm_spec lacking some of these fields entirely is still considered default.
Does not throw; any non-amm_spec input returns FALSE via short-circuit of &&.

`.gdpar_parse_amm_formula(rhs_formula, slot_name)`

Purpose Internal parser that walks the RHS expression of a one-sided formula and extracts the AMM components declared via the special wrapper calls a(...), b(...), and W(). Produces a plain list descriptor that downstream dispatch combines with the external W_basis argument to build an amm_spec per slot.

Arguments

rhs_formula : a one-sided (or two-sided) formula whose RHS is to be parsed.
slot_name : a character scalar giving the slot name from the enclosing gdpar_formula_set, used solely for error messages.

Returns A named list with three components:

a_formula : a one-sided formula ~ <expr> when a(<expr>) appears in the RHS; NULL otherwise. The formula carries the environment of rhs_formula.
b_formula : a one-sided formula ~ <expr> when b(<expr>) appears in the RHS; NULL otherwise.
W_present : a logical scalar; TRUE when W() appears in the RHS.

Mathematics

Let the RHS expression be $e$, split into top-level summands $s_1, \ldots, s_n$ by .gdpar_split_amm_summands. For each summand $s_j$:

If $s_j \equiv 1$ (the symbol 1 or the numeric value 1), it is accepted as a Level 0 anchor-only term and skipped.
Otherwise $s_j$ must be a call whose head is a symbol in $\{a, b, W\}$.
For head $W$: the call must have zero arguments; sets W_present $\leftarrow$ TRUE.
For head $a$ or $b$: the call must have exactly one argument; the interior expression is wrapped as $\sim\,\text{interior}$ and stored in a_formula or b_formula respectively.

The output is $(A, B, w)$ where $A$ is a_formula, $B$ is b_formula, and $w$ is W_present.

Notes

Raises gdpar_internal_error (via gdpar_abort) when rhs_formula is not a formula; this is treated as an internal invariant violation.
Raises gdpar_input_error when:
- a summand is a bare term (not a call and not 1) — the message (partly in Spanish) instructs to wrap terms in a(), b(), or W(), or use ~ 1 for a degenerate Level 0 AMM;
- a call has a non-symbol head;
- a call's head is not one of a, b, W (unknown function);
- W() is called with any arguments (must be zero-argument);
- W() appears more than once in the same slot;
- a() or b() is called with a number of arguments other than one;
- a() or b() appears more than once in the same slot.
The interior of a() / b() is wrapped in ~ <deparse(interior)> using stats::as.formula with env = environment(rhs_formula), then passed to .gdpar_check_no_intercept_suppression for an additional validation (presumably rejecting 0 or -1 intercept-suppression syntax; the exact behavior is defined in that helper, not shown in this section).
The splitter .gdpar_split_amm_summands is invoked to flatten binary + trees; non-+ binary operators at the top level are rejected there.
P5 canonization: a bare RHS of 1 is accepted as Level 0 (anchor only); any other bare RHS without AMM wrappers aborts.
Error objects carry a data field with slot, term, fname, nargs, or allowed diagnostics as appropriate.

`.gdpar_split_amm_summands(expr, slot_name)`

Purpose

Internal helper used during AMM (Additive Multiplicative Model) formula parsing. It recursively decomposes an unevaluated R expression into a flat list of "summands" by walking down the top-level + nodes of the parse tree. It simultaneously enforces a syntactic restriction: at the top level of a slot's right-hand side, only additive composition (+) of wrapper calls (e.g. a(), b(), W()) is permitted; any other binary operator at the top level is rejected with a structured error. This guarantees that interactions and transformations are pushed inside the wrappers (e.g. a(x1:x2)) rather than appearing as top-level terms.

Arguments

expr — language (a call) or symbol/literal. The unevaluated R expression representing (a sub-tree of) the right-hand side of a slot formula. May be a single call, a symbol, or a literal constant.
slot_name — character(1). The name of the parameter slot currently being parsed; used only for constructing diagnostic error messages.

Mathematics

Given an expression tree $E$, the function computes a list $S(E)$ of top-level additive summands:

$$ S(E) = \begin{cases} S(E_L) \cup S(E_R) & \text{if } E = (E_L + E_R), \\ {E} & \text{otherwise,} \end{cases} $$

where the union preserves left-to-right order (i.e. c(S(E_L), S(E_R))). Before reaching the recursive base case, the function inspects any three-element call whose head is not +: if the head is one of $\{-, *, /, :, \wedge, |\}$, parsing is aborted.

Returns

A list of length $\geq 1$ whose elements are unevaluated R expressions (calls, symbols, or literals). Each element is a maximal sub-expression that is not itself a top-level + call. The list is in left-to-right reading order of the original expression.

Notes

The recursion is triggered only when expr is a call of length exactly 3L whose first element is identical to the symbol +. All other three-element calls are routed to the validation branch.
The set of forbidden top-level binary operators is exactly c("-", "*", "/", ":", "^", "|"). Note that + is handled separately (recursion) and unary - (a call of length 2) is not intercepted here.
When a forbidden operator is detected, the function calls gdpar_abort with:
- class = "gdpar_input_error",
- data = list(slot = slot_name, operator = op),
- a message (in Spanish) explaining that AMM canonizes only additive composition of the wrappers a(), b(), W(), and that interactions/transformations must go inside the wrapper.
The function performs no evaluation of expr; it is purely structural.
No side effects other than the potential abort.

`.gdpar_formula_set_to_amm_spec_list(fs, W_basis_arg = NULL)`

Purpose

Internal converter that bridges the user-facing multi-slot formula specification (a gdpar_formula_set) and the lower-level AMM machinery. For each parameter slot it extracts the right-hand side of the stored formula, parses it via .gdpar_parse_amm_formula(), validates the consistency between any declared W() term and the externally supplied W_basis object, and assembles an amm_spec object. The result is a named list of amm_spec objects — one per slot — suitable for K-individual dispatch in gdpar(). For $K = 1$ the function still returns a length-one named list; the caller is responsible for unwrapping it when routing to the legacy single-amm_spec path to preserve bit-exact backward compatibility with Block 7.

Arguments

fs — gdpar_formula_set. An S3 object containing at least the components $param_names (a character vector of slot names) and $formulas (a named list of formula objects, one per slot). The function aborts with an internal error if fs does not inherit from "gdpar_formula_set".
W_basis_arg — W_basis or NULL. The external W argument passed by the user to gdpar(). When non-NULL it must inherit from "W_basis"; it is attached to every slot whose parsed formula declares a W() term. Defaults to NULL.

Mathematics

Let the formula set contain $K$ slots with names $\{k_1, \dots, k_K\} = \text{fs\$param\_names}$. For each slot $k$ the function extracts the right-hand side $R_k$ of the stored formula $f_k$:

$$ R_k = \begin{cases} f_k[,c(1,3),] & \text{if } \text{length}(f_k) = 3 \quad (\text{i.e. } \sim \text{ with LHS}), \\ f_k & \text{otherwise} \quad (\text{i.e. } \sim \text{ with no LHS}), \end{cases} $$

and parses it into components $(a_k, b_k, w_k^{\text{present}}) = \text{.gdpar\_parse\_amm\_formula}(R_k, k)$. The effective $W$ basis for slot $k$ is

$$ W_k = \begin{cases} \text{W_basis_arg} & \text{if } w_k^{\text{present}} = \text{TRUE and } \text{W_basis_arg} \neq \text{NULL}, \\ \text{NULL} & \text{if } w_k^{\text{present}} = \text{FALSE}. \end{cases} $$

The output is the named list

$$ \text{out} = \bigl{, k \mapsto \text{amm_spec}(a = a_k,; b = b_k,; W = W_k) ;\big|; k \in {k_1,\dots,k_K} ,\bigr}. $$

Returns

A named list of length length(fs$param_names), with names equal to fs$param_names. Each element is an amm_spec object constructed by amm_spec(a = parsed$a_formula, b = parsed$b_formula, W = W_for_slot). Slots without a W() term receive W = NULL; slots with a W() term receive the (validated) W_basis_arg.

Notes

Input class check. If !inherits(fs, "gdpar_formula_set"), the function aborts via gdpar_abort with class = "gdpar_internal_error" and the message "Internal error: .gdpar_formula_set_to_amm_spec_list expected a gdpar_formula_set.". This is treated as a programmer error, not a user error.
RHS extraction. The right-hand side is obtained by indexing the formula object: when length(f) == 3L (a two-sided formula y ~ ...), rhs_only <- f[c(1L, 3L)] reconstructs a one-sided formula ~ ...; otherwise the formula is used as-is. This avoids deparsing or re-parsing and preserves the original language object.
W() consistency (decision P3, sub-phase 8.3.3). Two abort conditions are checked, in order, for each slot where parsed$W_present is TRUE:
1. If W_basis_arg is NULL, abort with class = "gdpar_input_error", data = list(slot = k), and a message instructing the user to pass an explicit W = W_basis(...) or remove the W() term.
2. Else if W_basis_arg does not inherit from "W_basis", abort with class = "gdpar_input_error", data = list(received = class(W_basis_arg)), and a message stating that the external W argument must be of class W_basis.
The W_basis_arg is shared across all slots that declare W(); there is no per-slot W override at this layer.
The function does not evaluate any formula; it only manipulates language objects and delegates parsing to .gdpar_parse_amm_formula.
No S3 dispatch is performed by this function itself; it relies on the amm_spec() constructor (whose dispatch is external to this section).
The loop iterates over fs$param_names in their stored order, so the output list preserves that order.
For $K = 1$ the return value is a length-one named list; the function does not unwrap it (the caller handles backward compatibility).

R/gdpar_coef.R

`build_coef_term_df(draws_mat, term_names)`

Purpose (role in the package). Constructs a tidy data.frame of posterior summary statistics for a generic block of model parameters. It is the workhorse called by coef.gdpar_fit() for both the "multi" and "scalar" branches, producing one row per parameter term with its mean and central credible-interval quantiles.

Arguments

Argument	Type	Meaning
`draws_mat`	`matrix` or `NULL`	Matrix of posterior draws with rows = MCMC/HMC samples and columns = parameters. If `NULL` or with zero columns the function returns `NULL`.
`term_names`	`character`	Column labels (parameter names) whose length must equal `ncol(draws_mat)`.

Mathematics

For each column $j$ of draws_mat with draws $\{\theta_j^{(s)}\}_{s=1}^S$ the function computes

$$\bar{\theta}_j = \frac{1}{S}\sum_{s=1}^S \theta_j^{(s)},$$

and the empirical quantiles

$$q_j^{(\alpha)} = \text{Quantile}\bigl({\theta_j^{(s)}},;\alpha\bigr), \quad \alpha \in {0.05,;0.50,;0.95}.$$

Returns

A data.frame with columns term (character), mean, q05, q50, q95 (all numeric), one row per parameter. Returns NULL when draws_mat is NULL or has zero columns.

Notes

Raises a class gdpar_internal_error (via gdpar_abort) when length(term_names) != ncol(draws_mat).
stats::quantile is called with names = FALSE and na.rm = TRUE.
The returned mean and quantile columns are unnamed.
stringsAsFactors = FALSE is explicitly set; row.names = NULL.

`build_coef_W_df(draws_mat, basis_dim, x_names)`

Purpose (role in the package). Builds a long-format data.frame summarising the posterior draws of a $W$ block (the spatially-varying basis-coefficient matrix). Each row of the output corresponds to one (basis index, covariate name) combination, reflecting the row-major flattening convention used inside gdpar for the $W$ matrix.

Arguments

Argument	Type	Meaning
`draws_mat`	`matrix` or `NULL`	Draw matrix of dimension $(S,\;\texttt{basis\_dim} \times d_x)$ where $S$ is the number of posterior samples and $d_x = \texttt{length(x\_names)}$. Columns store a row-major flattening of a `basis_dim` $\times$ `d_x` matrix. Returns `NULL` if this is `NULL` or has zero columns.
`basis_dim`	`integer(1)`	Number of spatial basis functions $K_{\text{basis}}$ (i.e.\ the number of rows of the $W$ matrix per covariate).
`x_names`	`character`	Names of the $d_x$ covariates whose coefficients are spatially varying.

Mathematics

The draws matrix stores a row-major vectorisation of

$$\mathbf{W} \in \mathbb{R}^{K_{\text{basis}} \times d_x},$$

so that column $j$ of draws_mat maps to element $(b, x)$ of $\mathbf{W}$ via $j = (x-1)\,K_{\text{basis}} + b$. For each such column the function computes $\bar{w}_{b,x}$ and $q_{b,x}^{(\alpha)}$ exactly as in build_coef_term_df.

Returns

A data.frame with columns basis_idx (integer, 1-based spatial-basis index), x_name (character, covariate name), mean, q05, q50, q95. Row count is basis_dim * length(x_names). Returns NULL when any degenerate condition is met (NULL draws, zero columns, basis_dim == 0, or empty x_names).

Notes

basis_idx is constructed via rep(seq_len(basis_dim), times = length(x_names)) and x_name via rep(x_names, each = basis_dim), matching the row-major ordering $b$ varying fastest.
Raises gdpar_internal_error when ncol(draws_mat) != basis_dim * length(x_names).
Quantiles and means are unnamed; na.rm = TRUE.

`build_coef_theta_ref_df(draws_mat, p)`

Purpose (role in the package). Produces a tidy data.frame of posterior summaries for the reference-anchor parameter vector $\boldsymbol{\theta}_{\text{ref}}$ in the ungrouped (single-group) case. Each row represents one of the $p$ coordinates.

Arguments

Argument	Type	Meaning
`draws_mat`	`matrix`	Posterior draws of dimension $(S, p)$ where $S$ is the number of samples and $p$ the number of model coordinates.
`p`	`integer(1)`	Expected number of coordinates; must equal `ncol(draws_mat)`.

Mathematics

For coordinate $k \in \{1,\dots,p\}$:

$$\bar{\theta}_{\text{ref},k} = \frac{1}{S}\sum_{s=1}^S \theta_{\text{ref},k}^{(s)}, \qquad q_k^{(\alpha)} = \text{Quantile}\bigl({\theta_{\text{ref},k}^{(s)}},;\alpha\bigr), \quad \alpha \in {0.05,0.50,0.95}.$$

Returns

A data.frame with columns k (integer, $1,\dots,p$), mean, q05, q50, q95, and exactly $p$ rows. Returns no NULL guard; assumes non-empty input.

Notes

Raises gdpar_internal_error if ncol(draws_mat) != p.
k is always seq_len(p), guaranteeing contiguous integer labels.
Called internally by build_coef_theta_ref_df_grouped when J_groups == 1.

`build_coef_theta_ref_df_grouped(arr, J_groups, p)`

Purpose (role in the package). Generalises build_coef_theta_ref_df to the grouped case (Block 6.5 of the model). Handles both univariate ($p=1$, 2-D matrix input) and multivariate ($p>1$, 3-D array input) scenarios. When J_groups == 1 it collapses to the ungrouped schema for backward compatibility.

Arguments

Argument	Type	Meaning
`arr`	`matrix` or `array`	Posterior draws. Shape depends on context: $(S, J_{\text{groups}})$ when $p=1$ (matrix), or $(S, J_{\text{groups}}, p)$ (3-D array) when $p \ge 1$. When `J_groups == 1`, several shapes are tolerated (see Notes).
`J_groups`	`integer(1)`	Number of groups $J_{\text{groups}} \ge 1$.
`p`	`integer(1)`	Number of coordinates per group.

Mathematics

When J_groups > 1 and $p \ge 1$, for group $g$ and coordinate $k$:

$$\bar{\theta}_{\text{ref},g,k} = \frac{1}{S}\sum_{s=1}^S \theta_{\text{ref},g,k}^{(s)}, \qquad q_{g,k}^{(\alpha)} = \text{Quantile}\bigl({\theta_{\text{ref},g,k}^{(s)}},;\alpha\bigr).$$

When J_groups == 1, the group dimension is dropped and the output schema matches build_coef_theta_ref_df.

Returns

If J_groups == 1: delegates to build_coef_theta_ref_df, returning a data.frame with columns (k, mean, q05, q50, q95) and $p$ rows.
If J_groups > 1 and arr is a matrix ($p=1$): data.frame with columns (g, k, mean, q05, q50, q95), $J_{\text{groups}}$ rows, and k is always 1L.
If J_groups > 1 and arr is a 3-D array: data.frame with columns (g, k, mean, q05, q50, q95), $J_{\text{groups}} \times p$ rows, constructed by looping over groups and coordinates.

Notes

When J_groups == 1, the function accepts three input shapes:
1. A 3-D array with dim[2] == 1: the single group slice arr[, 1, ] is extracted.
2. A matrix with one column and p == 1: used directly.
3. A matrix with p columns: used directly (the ungrouped case). Any other shape raises gdpar_internal_error.
When J_groups > 1 and arr is a matrix, it must have ncol == J_groups and p == 1; otherwise gdpar_internal_error.
When J_groups > 1 and arr is a 3-D array, dimensions must be exactly (S, J_groups, p).
The loop-based construction for the 3-D case appends single-row data.frames to a list, then calls do.call(rbind, out_rows).

`build_coef_hyper_df(draws_mat, p)`

Purpose (role in the package). Constructs a tidy data.frame of posterior summaries for a hyperparameter vector (specifically $\boldsymbol{\mu}_{\theta_{\text{ref}}}$ or $\boldsymbol{\sigma}_{\theta_{\text{ref}}}$) of dimension $p$. The output schema is identical to build_coef_theta_ref_df.

Arguments

Argument	Type	Meaning
`draws_mat`	`matrix` or `NULL`	Posterior draws of dimension $(S, p)$, as returned by `.extract_mu_sigma_theta_ref`. Returns `NULL` if `NULL`.
`p`	`integer(1)`	Expected number of coordinates; must equal `ncol(draws_mat)`.

Mathematics

Identical per-coordinate mean and quantile computation as in build_coef_theta_ref_df.

Returns

A data.frame with columns k ($1,\dots,p$), mean, q05, q50, q95, and $p$ rows. Returns NULL when draws_mat is NULL.

Notes

Raises gdpar_internal_error if ncol(draws_mat) != p (when draws_mat is not NULL).
Structurally identical to build_coef_theta_ref_df except for the explicit NULL guard, reflecting that hyperparameters may be absent when there is no grouping.

`validate_coef_slot(slot, p, expected_cols, slot_name)`

Purpose (role in the package). Internal validation helper that checks the structural integrity of a per-$k$ coefficient list slot inside a gdpar_coef object. Each such slot is either NULL (component absent at the AMM level) or a list of length $p$ whose elements are themselves either NULL (coordinate inactive) or data.frames with specific required columns.

Arguments

Argument	Type	Meaning
`slot`	`list` or `NULL`	The coefficient slot to validate.
`p`	`integer(1)`	Expected list length (number of coordinates).
`expected_cols`	`character`	Column names that every non-`NULL` `data.frame` element must contain.
`slot_name`	`character(1)`	Human-readable name used in error messages.

Mathematics

None (pure validation logic).

Returns

Returns invisible(NULL) unconditionally on success. Side-effect-only on failure: raises gdpar_internal_error.

Notes

Validation proceeds in order:

If slot is NULL, return immediately (component absent — valid).
Check is.list(slot) and length(slot) == p; abort otherwise.
For each $k \in 1,\dots,p$: if element $k$ is NULL skip (coordinate inactive); otherwise verify it is a data.frame and that all expected_cols are present in names(elem).

`validate_coef_theta_ref(theta_ref_df, p, J_groups)`

Purpose (role in the package). Validates the structural integrity of the theta_ref slot of a gdpar_coef object. Handles both ungrouped ($J_{\text{groups}} = 1$, no g column) and grouped schemas.

Arguments

Argument	Type	Meaning
`theta_ref_df`	`data.frame`	The `theta_ref` component to validate.
`p`	`integer(1)`	Expected number of coordinate levels ($1,\dots,p$).
`J_groups`	`integer(1)`	Expected number of groups. Defaults to `1L`.

Mathematics

None (pure validation logic).

Returns

Returns invisible(NULL) on success. Raises gdpar_internal_error on any failure.

Notes

Validation steps in order:

theta_ref_df must be a data.frame.
Expected columns are c("k", "mean", "q05", "q50", "q95"); if column "g" is present, "g" is prepended to the expected set.
All expected columns must exist; missing ones trigger an abort.
Row count must equal $J_{\text{groups}} \times p$ (when g present) or $p$ (when ungrouped).
Ungrouped path: theta_ref_df$k must contain exactly the integers $1,\dots,p$ without gaps (checked via sort(as.integer(...))).
Grouped path: theta_ref_df$g must contain exactly $1,\dots,J_{\text{groups}}$ and theta_ref_df$k must contain exactly $1,\dots,p$, each without gaps.

`validate_coef_hyper_df` (signature: `validate_coef_hyper_df(hyper_df, p, hyper_name)`)

Purpose (role in the package). Validates a hyperparameter summary data.frame (either mu_theta_ref or sigma_theta_ref) inside a gdpar_coef object. Expects a data.frame with columns (k, mean, q05, q50, q95) and exactly $p$ rows. A NULL slot is accepted (no grouping present).

Arguments

Argument	Type	Meaning
`hyper_df`	`data.frame` or `NULL`	The hyperparameter summary to validate. `NULL` is accepted (component inactive / no grouping).
`p`	`integer(1)`	Expected number of coordinate rows.
`hyper_name`	`character(1)`	Human-readable name used in error messages (typically `"mu_theta_ref"` or `"sigma_theta_ref"`).

Mathematics

None (pure validation logic).

Returns

Returns invisible(NULL) on success. Raises gdpar_internal_error on failure.

Notes

The function body is truncated in this section (the section ends immediately after the roxygen @noRd tag). Based on the documentation header, the expected behaviour is: accept NULL silently; otherwise verify the object is a data.frame with columns c("k", "mean", "q05", "q50", "q95") and exactly p rows. The full implementation is in a subsequent section.

`validate_coef_hyper(df, p, slot_name)`

Purpose Internal validation helper that checks whether a hyperparameter summary data.frame (such as mu_theta_ref or sigma_theta_ref) conforms to the expected structure. Called during new_gdpar_coef() construction.

Arguments

Argument	Type	Meaning
`df`	`data.frame` or `NULL`	The hyperparameter summary table to validate. If `NULL` the function returns silently (component is absent).
`p`	integer scalar	Expected number of rows (one per coordinate).
`slot_name`	character scalar	Human-readable name of the slot being validated; used in error messages.

Mathematics

No formula is implemented. The function enforces the invariant that, when present, a hyperparameter summary data.frame must contain exactly the columns $${\texttt{k},;\texttt{mean},;\texttt{q05},;\texttt{q50},;\texttt{q95}}$$ and must have exactly $p$ rows.

Returns invisible(NULL) on success. Never returns a value on failure; an error of class "gdpar_internal_error" is raised instead.

Notes

If df is NULL, returns immediately—this is the expected path when the hyperparameter component is absent.
If df is not a data.frame, gdpar_abort() is called with a message stating the slot must be NULL or a data.frame.
Missing columns are detected via setdiff(expected, names(df)) and reported in a single comma-separated list.
Row-count mismatch triggers an error message quoting the actual and expected ($p$) row counts.
All errors carry class "gdpar_internal_error".

`new_gdpar_coef(theta_ref, a, b, W, p, mu_theta_ref, sigma_theta_ref, J_groups, group_levels, summary_stats)`

Purpose Internal constructor for the gdpar_coef S3 class. Validates every slot and assembles a well-formed object that downstream methods can trust without further checks. This is the sole entry point for creating gdpar_coef instances.

Signature

new_gdpar_coef(theta_ref, a = NULL, b = NULL, W = NULL,
               p,
               mu_theta_ref = NULL,
               sigma_theta_ref = NULL,
               J_groups = 1L,
               group_levels = NULL,
               summary_stats = c("mean", "q05", "q50", "q95"))

Arguments

Argument	Type	Meaning
`theta_ref`	`data.frame`	Reference-anchored parameter summaries with columns `(k, mean, q05, q50, q95)` and `p` rows. May also include a `g` column for grouped models.
`a`	`NULL` or list of length `p`	Per-coordinate intercept/additive component slot. Each entry is `NULL` (inactive) or a `data.frame` with columns `(term, mean, q05, q50, q95)`.
`b`	`NULL` or list of length `p`	Per-coordinate slope/multiplier component slot. Same structure as `a`.
`W`	`NULL` or list of length `p`	Per-coordinate basis-expansion component slot. Each entry is `NULL` or a `data.frame` with columns `(basis_idx, x_name, mean, q05, q50, q95)`.
`p`	numeric scalar	Number of coordinates (time points, locations, etc.). Must be a positive integer.
`mu_theta_ref`	`NULL` or `data.frame`	Hyperparameter summary for the prior mean of $\theta_{\text{ref}}$. When present, validated with `validate_coef_hyper`.
`sigma_theta_ref`	`NULL` or `data.frame`	Hyperparameter summary for the prior standard deviation of $\theta_{\text{ref}}$. When present, validated with `validate_coef_hyper`.
`J_groups`	integer scalar	Number of distinct groups in a grouped model. Defaults to `1L`.
`group_levels`	`NULL` or vector	Labels for the `J_groups` groups; stored as-is.
`summary_stats`	character vector	Names of the summary statistics carried by the object (e.g. `c("mean","q05","q50","q95")`).

Returns A list of class "gdpar_coef" with elements theta_ref, a, b, W, p, mu_theta_ref, sigma_theta_ref, J_groups, group_levels, summary_stats.

Notes

p is coerced to integer after validation. The check requires it to be a length-1 numeric, non-NA, $\geq 1$, and equal to its own integer cast.
J_groups is coerced to integer unconditionally.
Validation is delegated to helpers: validate_coef_theta_ref for theta_ref; validate_coef_slot for a, b, W (each with its expected column set); validate_coef_hyper for mu_theta_ref and sigma_theta_ref.
Any validation failure raises an error of class "gdpar_internal_error" via gdpar_abort().
The returned object uses structure(...) to attach the "gdpar_coef" class.

`count_active_coords(slot)`

Purpose Internal utility that counts how many coordinates in a component slot (a, b, or W) have a non-empty data.frame, i.e. how many coordinates actively contribute a component.

Signature

count_active_coords(slot)

Arguments

Argument	Type	Meaning
`slot`	`NULL` or list	A per-coordinate component list (e.g. `x$a`).

Returns An integer scalar: the number of list elements that are non-NULL data.frames with at least one row. Returns 0L when slot is NULL.

Notes

Uses vapply with logical(1L) for type-safe element-wise testing.
An element is "active" if and only if all three conditions hold: !is.null(e), is.data.frame(e), and nrow(e) > 0L.
Called by print.gdpar_coef and summary.gdpar_coef to report component activity counts.

`print.gdpar_coef(x, level, digits, ...)`

Purpose S3 print method for gdpar_coef objects. Provides three verbosity levels for inspecting coefficient summaries.

Signature

print.gdpar_coef(x,
                 level = c("global", "coord", "full"),
                 digits = 4L,
                 ...)

Arguments

Argument	Type	Meaning
`x`	`gdpar_coef`	Object to print.
`level`	character, one of `"global"`, `"coord"`, `"full"`	Controls verbosity. `"global"` shows only the `theta_ref` table and active-component counts. `"coord"` appends per-coordinate component means (only the `mean` column). `"full"` appends per-coordinate data.frames with all summary columns.
`digits`	integer scalar	Number of significant digits passed to `format_coef_df()` for numeric formatting.
`...`	ignored	Present for S3 generic compatibility.

Returns invisible(x) — the input object, unmodified.

Notes

The header prints: class tag <gdpar_coef>, p, conditionally J_groups (only when $> 1$), summary_stats, and active-component counts formatted as a(n/p) b(n/p) W(n/p).
If mu_theta_ref or sigma_theta_ref is non-NULL, their data.frames are printed before theta_ref.
At level = "global" a hint is printed suggesting the other levels and as.data.frame().
At level = "coord" or "full", the method iterates $k = 1, \ldots, p$. For each coordinate, if at least one of a[[k]], b[[k]], W[[k]] is non-NULL and non-empty, its table is printed. At "coord" level only columns (term, mean) (for a/b) or (basis_idx, x_name, mean) (for W) are shown; at "full" level all columns are shown.
All numeric formatting is delegated to format_coef_df().
Uses match.arg() for level argument matching.

`format_coef_df(df, digits)`

Purpose Internal helper that formats numeric columns of a coefficient data.frame for pretty printing. Modifies numeric columns in-place (except integer-like columns) to character strings with a specified number of significant digits.

Signature

format_coef_df(df, digits)

Arguments

Argument	Type	Meaning
`df`	`data.frame` or `NULL`	The data.frame to format.
`digits`	integer scalar	Number of significant digits for `formatC()`.

Returns The same data.frame with selected numeric columns replaced by formatted character strings. Returns df unchanged if it is NULL or has zero rows.

Notes

Columns "g", "k", and "basis_idx" are skipped even if numeric, because they serve as integer identifiers rather than estimated quantities.
Formatting uses formatC(..., digits = digits, format = "g", flag = "-"), where "g" selects the shorter of %f / %e representation and "-" requests left-justification.
The function mutates and returns df in-place (standard R copy-on-modify semantics apply).

`summary.gdpar_coef(object, ...)`

Purpose S3 summary method that returns a compact list of aggregated statistics about a gdpar_coef object.

Signature

summary.gdpar_coef(object, ...)

Arguments

Argument	Type	Meaning
`object`	`gdpar_coef`	Object to summarize.
`...`	ignored	Present for S3 generic compatibility.

Mathematics

Computes the cross-coordinate average of posterior means: $$\bar{\theta}{\text{ref}} = \frac{1}{p}\sum{k=1}^{p} \theta_{\text{ref},k}^{\text{mean}}$$ via mean(object$theta_ref$mean).

Returns A named list with elements:

Element	Type	Content
`p`	integer	Number of coordinates.
`n_active`	named integer vector	Counts of active coordinates for components `a`, `b`, `W` (via `count_active_coords`).
`theta_ref_mean`	numeric scalar	Mean of the `mean` column of `theta_ref` across all coordinates.
`summary_stats`	character vector	The summary statistic names carried by the object.

Notes

The n_active vector has names "a", "b", "W".
mean() is called directly on object$theta_ref$mean; if theta_ref has a g column (grouped model), this averages over all group–coordinate combinations.

`as.data.frame.gdpar_coef(x, row.names, optional, ...)`

Purpose S3 method that coerces a gdpar_coef object into a single long-format (tidy) data.frame. Flattens the hierarchical per-coordinate, per-component structure into rows suitable for use with dplyr / ggplot2.

Signature

as.data.frame.gdpar_coef(x, row.names = NULL, optional = FALSE, ...)

Arguments

Argument	Type	Meaning
`x`	`gdpar_coef`	Object to coerce.
`row.names`	`NULL`	Ignored; required by the `as.data.frame` generic.
`optional`	logical	Ignored; required by the generic.
`...`	ignored	Present for generic compatibility.

Returns A data.frame with one row per summarized scalar coefficient and columns:

Column	Type	Content
`component`	character	`"theta_ref"`, `"mu_theta_ref"`, `"sigma_theta_ref"`, `"a"`, `"b"`, or `"W"`.
`g`	integer	Group index from `theta_ref` if a `g` column exists; `NA_integer_` for all other components.
`k`	integer	Coordinate index ($1, \ldots, p$).
`identifier`	character	`term` value for `a`/`b`; `basis_idx` (as character) for `W`; `NA_character_` for `theta_ref` and hyperparameter rows.
`x_name`	character	Predictor name from the `W` slot; `NA_character_` elsewhere.
`mean`	numeric	Posterior mean.
`q05`	numeric	5th percentile of the posterior.
`q50`	numeric	50th percentile (median) of the posterior.
`q95`	numeric	95th percentile of the posterior.

Notes

When x$theta_ref contains a column "g", each theta_ref row carries the corresponding integer group value; otherwise g is NA_integer_.
The helper function add_hyper(slot, comp_name) (defined inline) iterates over rows of mu_theta_ref / sigma_theta_ref and appends a row per entry. Its g column is always NA_integer_.
The helper function add_terms(slot, comp_name) (defined inline) iterates over coordinates and then over rows of the per-coordinate data.frame for components a and b, extracting the term column as the identifier.
For W, iteration is done directly (not via add_terms) because the identifier is basis_idx and an additional x_name column must be carried.
If no rows are accumulated (all slots are NULL), a zero-row data.frame with the correct column schema is returned.
Rows are accumulated in a list (rows) and combined with do.call(rbind, rows) at the end.

One-line formatter (truncated)

The section ends with the roxygen header for an exported function described as a "One-line formatter for gdpar_coef objects" but the function body is not included in this section. The documented signature is format.gdpar_coef(x, ...) (inferred from the generic dispatch implied by the @export tag and the parameter name x). It is documented as returning a length-1 character vector. Its implementation continues in section 3.

`format.gdpar_coef(x, ...)`

Purpose S3 method that provides a human-readable textual representation of a gdpar_coef object (the coefficient container for GDPAR models). It is automatically dispatched when format() or print() is called on an object of class "gdpar_coef", and is typically used for console display.

Arguments

Argument	Type	Meaning
`x`	`list` with class attribute `"gdpar_coef"`	The coefficient object whose structure is to be rendered. Expected components accessed are `$p`, `$a`, `$b`, and `$W`.
`...`	(ignored)	Additional arguments passed through the S3 generic; not used by this method.

Returns A single character string of the form:

<gdpar_coef> p=<p>, components=[theta_ref, a(<k_a>/<p>), b(<k_b>/<p>), W(<k_W>/<p>)]

where

$p$ is x$p (total number of predictors/anchors).
$k_a$, $k_b$, $k_W$ are the values returned by count_active_coords(x$a), count_active_coords(x$b), and count_active_coords(x$W) respectively — each counts how many coordinates in the corresponding array are non-zero (active).

The expression sprintf is used for interpolation; the %d placeholders are all supplied as integers.

Notes

S3 dispatch. The leading-dot naming convention .gdpar_coef after the generic name format is the standard R mechanism for S3 method dispatch. When format() (or print(), which calls format()) is invoked on any object whose class vector contains "gdpar_coef", this method is selected.
Dependency on count_active_coords. The helper function count_active_coords (defined elsewhere in the package) must return a scalar integer. If it returns a non-scalar or non-integer, sprintf will either coerce or warn.
No side effects. The function is purely informational; it does not modify x or any global state.
Assumptions on structure. The method assumes x contains list elements p, a, b, and W. If any are missing, a runtime error will be thrown at the point of access.

R/gdpar_loo.R

`gdpar_loo.R`

`gdpar_loo(fit, aggregation = c("subject", "cell"), r_eff = NULL, cores = 1L, ...)`

Purpose

Exported entry point for approximate leave-one-out cross-validation of a gdpar_fit object. It delegates to loo::loo() to perform Pareto-smoothed importance sampling LOO (PSIS-LOO), using the per-observation log-likelihood draws persisted by the Stan model in the generated quantity log_lik. For multivariate fits ($p > 1$) it offers two aggregation strategies that determine what counts as a single "observation" for the ELPD computation.

Arguments

Argument	Type	Meaning
`fit`	`gdpar_fit`	A fitted gdpar model object. Must inherit from `"gdpar_fit"` (checked via `assert_inherits`). The function accesses `fit$fit$draws(...)` (the underlying Stan/CmdStan fit) and `fit$amm$p` (the declared coordinate dimension).
`aggregation`	`character` scalar	One of `"subject"` (default) or `"cell"`. Matched via `match.arg`. Ignored when $p = 1$. `"subject"` sums log-likelihood over coordinates within each row; `"cell"` treats each $(i,k)$ pair as an independent observation.
`r_eff`	`numeric` vector or `NULL`	Relative effective sample sizes per observation. If `NULL` (default), computed from the draws via `loo::relative_eff()` using chain identifiers reconstructed from the draws array dimensions. Supplying it explicitly avoids recomputation across repeated LOO calls on the same fit.
`cores`	`integer` (coerced to `1L` by default)	Number of cores passed to `loo::loo()` and `loo::relative_eff()`. Defaults to `1L` (sequential) to avoid non-determinism.
`...`	(any)	Additional arguments forwarded to `loo::loo()`.

Mathematics

For a univariate fit ($p = 1$), the Stan model emits log_lik as a length-$n$ vector, and each of the $n$ rows of the input data is one observation.

For a multivariate fit ($p > 1$), the Stan model emits log_lik as an $n \times p$ matrix with $\texttt{log\_lik}[i,k] = \log p(y_{ik} \mid \theta_i[k])$. Under the coordinate-wise factorization

$$ p(y_i \mid \theta_i) = \prod_{k=1}^{p} D_k!\left(y_{ik} \mid \theta_i[k]\right), $$

the per-subject log-likelihood (aggregation "subject") is

$$ \log p(y_i \mid \theta_i) = \sum_{k=1}^{p} \log p(y_{ik} \mid \theta_i[k]). $$

For aggregation "cell", each $(i,k)$ pair is treated as an independent observation, yielding PSIS-LOO over $n \cdot p$ cells.

The relative effective sample size is computed (when r_eff = NULL) as

$$ \texttt{r_eff} = \texttt{loo::relative_eff}!\left(\exp(\texttt{log_lik_mat}),; \texttt{chain_id}\right), $$

where chain_id is constructed as rep(seq_len(n_chain), each = n_iter) from the draws array of shape [n_iter, n_chain, n_variables].

Returns

A loo object: an S3 object of class "psis_loo" containing the ELPD estimate (elpd_loo), its standard error, Pareto-$\hat{k}$ diagnostics, and pointwise contributions. This is the direct return value of loo::loo(log_lik_mat, r_eff = r_eff, cores = cores, ...).

Notes

Flagged @keywords experimental; the aggregation rule is stable but the signature may gain additional arguments in future versions.
Calls assert_inherits(fit, "gdpar_fit", "fit") — raises an error if fit does not inherit from "gdpar_fit".
Calls require_suggested("loo", "compute PSIS-LOO approximate cross-validation") — raises an error if the loo package is not installed.
Draws are extracted via fit$fit$draws(variables = "log_lik", format = "draws_array"), producing a 3-D array [iteration, chain, variable].
The coordinate dimension p is read from fit$amm$p; if that slot is NULL, p defaults to 1L.
The r_eff computation uses exp(log_lik_mat) (i.e., the likelihood on the natural scale) as required by loo::relative_eff().
Pareto-$\hat{k} > 0.7$ signals that the PSIS approximation is unreliable for the affected observations; the documentation recommends loo::loo_moment_match() or loo::reloo() as refinements.
The "subject" aggregation matches the convention used by brms multivariate fits with set_rescor(FALSE), yielding ELPD values directly comparable to per-coordinate competitors aggregated identically.
The "cell" aggregation is recommended for diagnostics only, not for reporting comparable ELPD values, because it conflates subject-level and coordinate-level cross-validation.

`aggregate_log_lik(draws_arr, p, aggregation)`

Purpose

Internal helper (not exported; documented @noRd) that converts a draws_array of the Stan-generated log_lik quantity into a plain numeric matrix suitable for loo::loo(). It handles the univariate case (pass-through), the multivariate subject-aggregation case (sum over coordinates), and the multivariate cell-aggregation case (concatenation in i-major, k-minor order).

Arguments

Argument	Type	Meaning
`draws_arr`	`draws_array` (from `posterior`/`cmdstanr`)	A draws array containing only the `log_lik` variable(s). Expected shape: `[n_iter, n_chain, n_variables]`.
`p`	`integer`	Declared coordinate dimension. If `1L`, the function returns the draws matrix unchanged. If $> 1$, variable names are parsed to extract indices $(i, k)$.
`aggregation`	`character` scalar	Either `"subject"` or `"cell"`. Determines whether coordinates are summed (subject) or concatenated (cell). Not validated inside this function (the caller `gdpar_loo` passes the result of `match.arg`).

Mathematics

For $p = 1$, the input matrix is returned as-is (shape $S \times n$, where $S$ is the total number of posterior draws).

For $p > 1$, variable names are parsed with the regular expression ^log_lik\[(\d+),(\d+)\]$, yielding index pairs $(i, k)$. Let $n = \max_i$ and $p_{\text{in}} = \max_k$.

Subject aggregation produces an $S \times n$ matrix where column $i$ is

$$ \texttt{out}[\cdot, i] = \sum_{k=1}^{p} \texttt{mat}[\cdot, \texttt{col}(i,k)], $$

with $\texttt{col}(i,k)$ being the column index in mat corresponding to variable log_lik[i,k].

Cell aggregation produces an $S \times (n \cdot p)$ matrix where the flat column index is

$$ \texttt{flat_col}(i, k) = (i - 1),p + k, $$

so columns are ordered as $((1,1), (1,2), \ldots, (1,p), (2,1), \ldots, (n,p))$ — i-major, k-minor.

Returns

A plain numeric matrix (no class attributes beyond matrix, since unclass() is applied to the draws_matrix):

$p = 1$: shape $S \times n$, identical to the input draws matrix.
$p > 1$, "subject": shape $S \times n$, columns indexed by subject $i$.
$p > 1$, "cell": shape $S \times (n \cdot p)$, columns indexed by flat $(i,k)$.

Here $S = \texttt{nrow(mat)}$ is the total number of posterior draws (iterations $\times$ chains).

Notes

The function begins by extracting variable names via posterior::variables(draws_arr) and converting to a plain matrix via unclass(posterior::as_draws_matrix(draws_arr)).
For $p = 1$, the function returns immediately without parsing variable names; both aggregation modes would yield the same result.
Parse error: If any variable name does not match ^log_lik\[(\d+),(\d+)\]$, the function calls gdpar_abort() with:
- class = "gdpar_loo_parse_error",
- data = list(unparsed = vars[bad]),
- a message listing up to 5 offending names (via utils::head(vars[bad], 5L) and sQuote).
Dimension mismatch error: If the parsed maximum $k$ (p_in) does not equal the declared p, the function calls gdpar_abort() with:
- class = "gdpar_loo_dim_mismatch",
- data = list(p_parsed = p_in, p_fit = p),
- a message of the form "log_lik dimension mismatch: parsed p = %d from draws; fit declares p = %d.".
The index matrix ij is built by do.call(rbind, lapply(parsed, function(z) as.integer(z[2:3]))), with column names c("i", "k").
The subject-aggregation loop iterates seq_along(vars) and accumulates mat[, col_idx] into out[, i]; the cell-aggregation loop places each mat[, col_idx] into out[, flat_col] without summation.
No validation of aggregation is performed inside this function; it assumes the caller has already matched the argument. If an unrecognized value is passed, the cell-aggregation branch is taken by default (since the if (aggregation == "subject") check fails), and the subject branch is skipped.

R/gdpar-package.R

R/gdpar-package.R contains no functions: it holds only the package-level documentation sentinel "_PACKAGE" (which Roxygen turns into the ?gdpar-package help topic, aliased gdpar-package) plus the package-wide @importFrom directives.

Package-level documentation. The roxygen block states the package's thesis — individual parameters decomposed as $\theta_i=\theta_{\text{ref}}+\Delta(x_i,\theta_{\text{ref}})$ in the Additive–Multiplicative–Modulated canonical form $\Delta(x,\theta)=a(x)+b(x)\odot\theta+W(\theta)x$ — and documents the three estimation paths and the ten theoretical vignettes (Blocks 0–9: canonical form & identifiability, gnoseological validity, special-case subsumptions, per-path asymptotics, EB vs FB, causal positioning, cognitive motivation).

The roxygen @section Three paths describes path = "bayes" (Path 1, hierarchical Bayesian via cmdstanr — the default and, in this release, the only operational path), path = "vcm" (Path 2, varying-coefficient via mgcv), and path = "hyper" (Path 3, hypernetwork via torch, explicitly "Not implemented in this release"; invoking it raises a structured gdpar_unsupported_feature_error, and torch is not a declared dependency). The stated default rationale: Path 1 admits the closed-form identifiability results (Theorems 1A/1E) and calibrated Bayesian uncertainty (Theorem 4C). (Note: consistent with the operational status documented throughout this wiki and the framework-overview vignette, the executable surface of gdpar 0.1.0 is Path 1 only; path = "vcm" and path = "hyper" abort with gdpar_unsupported_feature_error.)

Imports. @importFrom stats model.matrix terms update sd quantile median lm coef qnorm setNames qt and @importFrom methods is — the base-stats/methods symbols used across the package.

← Part IV — Exhaustive Function Reference (2/7) · gdpar Wiki Home · Part IV — Exhaustive Function Reference (4/7) →

gdpar Wiki

Home

Part IV Function Reference 3

.gdpar_eb_make_random_init(stan_data, seed_offset = 1L, base_seed = NULL)

.gdpar_eb_apply_correction(eb_correction, laplace_result, stan_data, p = 1L, verbose)

.gdpar_eb_correction_matrix(eb_correction, laplace_result, stan_data, p = 1L, verbose)

.gdpar_eb_resolve_K_inputs(formula, amm, W, family, formula_set_input, amm_list_input, classic_with_amm_calls, family_is_named_list)

.gdpar_eb_run_K(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, family_id_k_vector, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group, parametrization, id_check_rigor, eb_correction, laplace_control, call, ...)

.gdpar_eb_correction_tensor(eb_correction, laplace_result_per_slot, K = 2L, p = 1L, verbose = TRUE)

empty_tensor() (nested closure inside .gdpar_eb_correction_tensor)

.build_amm_design_KxP(amm_list_canonical, data, formula_rhs)

.assemble_stan_data_KxP(design_KxP, family, amm_list_canonical, y_matrix, theta_anchor_kp, group_id = NULL, path = c("EB", "FB"), cp_W = FALSE)

pad_to(z, target_cols, n_rows) (nested closure inside .assemble_stan_data_KxP)

.gdpar_eb_make_random_init_KxP(stan_data, seed_offset = 1L, base_seed = NULL)

.gdpar_eb_maximize_marginal_KxP(model, stan_data, control, seed, verbose)

.resolve_anchor_KxP(anchor, family, y_matrix, K, p, verbose)

.gdpar_eb_run_KxP(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, family_id_k_vector, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group, parametrization, id_check_rigor, eb_correction, laplace_control, call, ...)

R/families.R

.gdpar_param_spec(name, link, did_status, did_condition, did_reference, support, prior_canonical_kind, scope, family_role)

.gdpar_known_family_roles()

.gdpar_link_funcs(link)

.gdpar_family_param_specs_for(name, link)

.gdpar_apply_did_override(param_specs, did_override, family_name)

.gdpar_apply_did_override(param_specs, did_override, family_name)

print.gdpar_param_spec(x, ...)

gdpar_family(name, link, did_override)

gdpar_family_custom(name, link, did_holds, did_condition, stan_loglik_block, stan_log_lik_block, stan_y_pred_block, y_type, did_reference)

.gdpar_K_custom_patterns()

gdpar_family_custom_K(name, stan_lpdf_id, did_holds = TRUE, did_condition = NULL, did_reference = NULL)

print.gdpar_family(x, ...)

gdpar_family_multi(family, p, link)

gdpar_family_multi(family, p, link = NULL)

print.gdpar_family_multi(x, ...)

.gdpar_promote_scope_per_observation(family, k_names)

.gdpar_canonical_inv_link_id_slot1(stan_id)

.gdpar_canonical_inv_link_id(stan_id, slot_idx)

.gdpar_canonical_inv_link_id_slot(stan_id, slot_idx)

.gdpar_default_link_for(name)

.gdpar_canonical_support_slot1(stan_id)

.gdpar_support_subset_coherent(emitted, required)

.gdpar_compatible_families_for_support(required, exclude_stan_ids = integer(0))

.gdpar_validate_heterogeneous_family_K(family_het_list, location_param_specs, slot_names)

.gdpar_inv_link_id_per_slot(family_id_k_vector, location_family) (documentation only; body not in this section)

.gdpar_compute_inv_link_id_per_slot(family_id_k_vector, location_family)

.gdpar_resolve_heterogeneous_family_K(family_input, slot_names)

R/formula_spec.R

formula_spec.R (section 1 of 3)

.gdpar_check_no_intercept_suppression(f, slot_name)

.gdpar_assert_named_formula_list(args, fn_name)

gdpar_formula_set(...)

gdpar_bf(...)

gdpar_bf(...)

print.gdpar_formula_set(x, ...)

[[.gdpar_formula_set(x, i)

[.gdpar_formula_set(x, i)

names.gdpar_formula_set(x)

length.gdpar_formula_set(x)

.gdpar_rhs_has_amm_calls(formula)

.gdpar_is_default_amm_spec(amm)

.gdpar_parse_amm_formula(rhs_formula, slot_name)

.gdpar_split_amm_summands(expr, slot_name)

.gdpar_formula_set_to_amm_spec_list(fs, W_basis_arg = NULL)

R/gdpar_coef.R

build_coef_term_df(draws_mat, term_names)

build_coef_W_df(draws_mat, basis_dim, x_names)

build_coef_theta_ref_df(draws_mat, p)

build_coef_theta_ref_df_grouped(arr, J_groups, p)

build_coef_hyper_df(draws_mat, p)

validate_coef_slot(slot, p, expected_cols, slot_name)

validate_coef_theta_ref(theta_ref_df, p, J_groups)

validate_coef_hyper_df (signature: validate_coef_hyper_df(hyper_df, p, hyper_name))

validate_coef_hyper(df, p, slot_name)

new_gdpar_coef(theta_ref, a, b, W, p, mu_theta_ref, sigma_theta_ref, J_groups, group_levels, summary_stats)

count_active_coords(slot)

print.gdpar_coef(x, level, digits, ...)

format_coef_df(df, digits)

summary.gdpar_coef(object, ...)

as.data.frame.gdpar_coef(x, row.names, optional, ...)

One-line formatter (truncated)

format.gdpar_coef(x, ...)

R/gdpar_loo.R

gdpar_loo.R

`.gdpar_eb_make_random_init(stan_data, seed_offset = 1L, base_seed = NULL)`

`.gdpar_eb_apply_correction(eb_correction, laplace_result, stan_data, p = 1L, verbose)`

`.gdpar_eb_correction_matrix(eb_correction, laplace_result, stan_data, p = 1L, verbose)`

`.gdpar_eb_resolve_K_inputs(formula, amm, W, family, formula_set_input, amm_list_input, classic_with_amm_calls, family_is_named_list)`

`.gdpar_eb_run_K(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, family_id_k_vector, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group, parametrization, id_check_rigor, eb_correction, laplace_control, call, ...)`

`.gdpar_eb_correction_tensor(eb_correction, laplace_result_per_slot, K = 2L, p = 1L, verbose = TRUE)`

`empty_tensor()` (nested closure inside `.gdpar_eb_correction_tensor`)

`.build_amm_design_KxP(amm_list_canonical, data, formula_rhs)`

`.assemble_stan_data_KxP(design_KxP, family, amm_list_canonical, y_matrix, theta_anchor_kp, group_id = NULL, path = c("EB", "FB"), cp_W = FALSE)`

`pad_to(z, target_cols, n_rows)` (nested closure inside `.assemble_stan_data_KxP`)

`.gdpar_eb_make_random_init_KxP(stan_data, seed_offset = 1L, base_seed = NULL)`

`.gdpar_eb_maximize_marginal_KxP(model, stan_data, control, seed, verbose)`

`.resolve_anchor_KxP(anchor, family, y_matrix, K, p, verbose)`

`.gdpar_param_spec(name, link, did_status, did_condition, did_reference, support, prior_canonical_kind, scope, family_role)`

`.gdpar_known_family_roles()`

`.gdpar_link_funcs(link)`

`.gdpar_family_param_specs_for(name, link)`

`.gdpar_apply_did_override(param_specs, did_override, family_name)`

`.gdpar_apply_did_override(param_specs, did_override, family_name)`

`print.gdpar_param_spec(x, ...)`

`gdpar_family(name, link, did_override)`

`gdpar_family_custom(name, link, did_holds, did_condition, stan_loglik_block, stan_log_lik_block, stan_y_pred_block, y_type, did_reference)`

`.gdpar_K_custom_patterns()`

`gdpar_family_custom_K(name, stan_lpdf_id, did_holds = TRUE, did_condition = NULL, did_reference = NULL)`

`print.gdpar_family(x, ...)`

`gdpar_family_multi(family, p, link)`

`gdpar_family_multi(family, p, link = NULL)`

`print.gdpar_family_multi(x, ...)`

`.gdpar_promote_scope_per_observation(family, k_names)`

`.gdpar_canonical_inv_link_id_slot1(stan_id)`

`.gdpar_canonical_inv_link_id(stan_id, slot_idx)`

`.gdpar_canonical_inv_link_id_slot(stan_id, slot_idx)`

`.gdpar_default_link_for(name)`

`.gdpar_canonical_support_slot1(stan_id)`

`.gdpar_support_subset_coherent(emitted, required)`

`.gdpar_compatible_families_for_support(required, exclude_stan_ids = integer(0))`

`.gdpar_validate_heterogeneous_family_K(family_het_list, location_param_specs, slot_names)`

`.gdpar_inv_link_id_per_slot(family_id_k_vector, location_family)` (documentation only; body not in this section)

`.gdpar_compute_inv_link_id_per_slot(family_id_k_vector, location_family)`

`.gdpar_resolve_heterogeneous_family_K(family_input, slot_names)`

`formula_spec.R` (section 1 of 3)

`.gdpar_check_no_intercept_suppression(f, slot_name)`

`.gdpar_assert_named_formula_list(args, fn_name)`

`gdpar_formula_set(...)`

`gdpar_bf(...)`

`gdpar_bf(...)`

`print.gdpar_formula_set(x, ...)`

`[[.gdpar_formula_set(x, i)`

`[.gdpar_formula_set(x, i)`

`names.gdpar_formula_set(x)`

`length.gdpar_formula_set(x)`

`.gdpar_rhs_has_amm_calls(formula)`

`.gdpar_is_default_amm_spec(amm)`

`.gdpar_parse_amm_formula(rhs_formula, slot_name)`

`.gdpar_split_amm_summands(expr, slot_name)`

`.gdpar_formula_set_to_amm_spec_list(fs, W_basis_arg = NULL)`

`build_coef_term_df(draws_mat, term_names)`

`build_coef_W_df(draws_mat, basis_dim, x_names)`

`build_coef_theta_ref_df(draws_mat, p)`

`build_coef_theta_ref_df_grouped(arr, J_groups, p)`

`build_coef_hyper_df(draws_mat, p)`

`validate_coef_slot(slot, p, expected_cols, slot_name)`

`validate_coef_theta_ref(theta_ref_df, p, J_groups)`

`validate_coef_hyper_df` (signature: `validate_coef_hyper_df(hyper_df, p, hyper_name)`)

`validate_coef_hyper(df, p, slot_name)`

`new_gdpar_coef(theta_ref, a, b, W, p, mu_theta_ref, sigma_theta_ref, J_groups, group_levels, summary_stats)`

`count_active_coords(slot)`

`print.gdpar_coef(x, level, digits, ...)`

`format_coef_df(df, digits)`

`summary.gdpar_coef(object, ...)`

`as.data.frame.gdpar_coef(x, row.names, optional, ...)`

`format.gdpar_coef(x, ...)`

`gdpar_loo.R`

`gdpar_loo(fit, aggregation = c("subject", "cell"), r_eff = NULL, cores = 1L, ...)`

`aggregate_log_lik(draws_arr, p, aggregation)`