Part IV Function Reference 4

← Part IV — Exhaustive Function Reference (3/7) · gdpar Wiki Home · Part IV — Exhaustive Function Reference (5/7) →

R/gdpar.R

`dedup_message_blocks(lines)`

Purpose

Internal helper that deduplicates consecutive message blocks from captured stderr lines. It is used by gdpar() when verbose = FALSE to silence the duplicate divergence warnings emitted by cmdstanr during sampling. The function splits the captured lines into blocks delimited by empty lines, retains only the first occurrence of each unique block (compared by full text), and returns a flattened character vector suitable for re-emission via message().

Arguments

Argument	Type	Meaning
`lines`	Character vector	Captured `stderr` lines from `cmdstanr` sampling.

Mathematics

The algorithm operates in two passes:

Block segmentation. Given the input vector $\mathbf{l} = (l_1, l_2, \dots, l_n)$, consecutive non-empty strings are accumulated into a block $B_i$. An empty string acts as a block delimiter: when encountered and the current accumulator is non-empty, the accumulator is closed as a completed block. Leading empty lines (before any non-empty content) and consecutive empty lines between blocks are silently consumed because the accumulator is empty at those points. Trailing empty lines after the last non-empty content do not produce a block.
Deduplication. For each block $B_i$ in order of first appearance, a key $k_i = \text{paste}(B_i, \text{collapse} = \text{"\textbackslash n"})$ is computed. If $k_i \notin S$ where $S$ is the set of previously seen keys, the block's lines (followed by an empty-string separator) are appended to the output and $k_i$ is added to $S$. Otherwise the block is dropped.

A final trimming step removes the trailing empty-string separator from the output if present.

Returns

A character vector. If length(lines) == 0L, returns character(0). Otherwise returns the deduplicated blocks flattened into a single character vector, with blocks separated by empty strings (""), and the trailing separator removed. Ready for line-by-line re-emission via message().

Notes

Marked @keywords internal and @noRd; not exported.
The comparison key uses paste(b, collapse = "\n"), so blocks are compared by their full multi-line text content, not line-by-line.
The function does not modify its input; it builds new vectors via concatenation.
No side effects; no errors raised on any input (including NULL, which would cause length(NULL) == 0L to be TRUE, returning character(0)).

gdpar(formula, family = gdpar_family("gaussian"), amm = amm_spec(), W = NULL, data, prior = NULL, path = c("bayes", "vcm", "hyper"), anchor = "prior_mean", skip_id_check = FALSE, chains = 4L, iter_warmup = 1000L, iter_sampling = 1000L, adapt_delta = 0.95, max_treedepth = 12L, refresh = 100L, verbose = TRUE, seed = NULL, group = NULL, parametrization = c("auto", "ncp", "cp"), parametrization_a = NULL, parametrization_W = NULL, parametrization_aggregation = NULL, id_check_rigor = c("full", "fast"), ...)

Purpose

Main exported entry point of the gdpar package. Fits the AMM (Additive–Multiplicative Modulation) canonical decomposition of the individual parameter via Path 1 (hierarchical Bayesian inference through Stan/cmdstanr). The function orchestrates input validation, covariate standardization, basis-restricted identifiability diagnostics, Stan model code generation, compilation, HMC/NUTS sampling, and convergence diagnostic collection. It also dispatches to internal multivariate (.gdpar_multi), K-individual (.gdpar_K), and placeholder paths (Path 2/Path 3) based on the input configuration.

Arguments

Argument	Type	Meaning
`formula`	Formula or `gdpar_formula_set`	Two-sided formula (`y ~ ...`) or a `gdpar_formula_set` object. In the legacy two-sided form, the LHS is the outcome variable; the RHS either lists covariates entering the modulating component as the linear factor $x$ in $W(\theta)x$ (with AMM components declared via `amm`), or contains AMM wrapper calls `a(...)`, `b(...)`, `W()`. A `gdpar_formula_set` is the multi-parameter form with one slot per K-individual parameter.
`family`	`gdpar_family`, `gdpar_family_multi`, or named list of `gdpar_family`	Response distribution. Defaults to `gdpar_family("gaussian")`. A named list of `gdpar_family` objects triggers the heterogeneous-families-per-slot path (sub-phase 8.3.7), requiring $K \geq 2$.
`amm`	`amm_spec` or named list of `amm_spec`	AMM specification. Defaults to `amm_spec()` (Level 0 degenerate). A named list triggers the K-individual low-level path. Must remain at default when `formula` is a `gdpar_formula_set` or contains AMM wrapper calls in the RHS.
`W`	`W_basis` or `NULL`	Modulating basis for K-individual paths. `NULL` by default. Ignored in the legacy single-`amm_spec` path (where the basis travels via `amm$W`). Supplying a non-`NULL` `W` in the legacy path raises an error.
`data`	Data frame	Data containing variables referenced by `formula` and `amm`.
`prior`	`gdpar_prior` or `NULL`	Prior specification. Defaults to `gdpar_prior()` when `NULL`.
`path`	Character scalar	Estimation path: `"bayes"` (Path 1, default), `"vcm"` (Path 2), `"hyper"` (Path 3). Only `"bayes"` is implemented; the other two raise `gdpar_unsupported_feature_error`.
`anchor`	Numeric scalar or character	Anchor value $\theta_0$ in linear-predictor scale. `"prior_mean"` (default) or `"empirical_y"` (linkfun applied to outcome mean).
`skip_id_check`	Logical scalar	If `TRUE`, skip the Gram-matrix identifiability diagnostic. Default `FALSE`.
`chains`	Integer scalar	Number of HMC chains. Default `4L`.
`iter_warmup`	Integer scalar	Warmup iterations per chain. Default `1000L`.
`iter_sampling`	Integer scalar	Sampling iterations per chain. Default `1000L`.
`adapt_delta`	Numeric scalar	Target acceptance probability for NUTS. Default `0.95`. Validated to be in $[0.5, 0.999]$.
`max_treedepth`	Integer scalar	Maximum NUTS tree depth. Default `12L`.
`refresh`	Integer scalar	Progress-reporting frequency passed to `cmdstanr`. Default `100L`. Must be non-negative.
`verbose`	Logical scalar	Verbosity of package-level informational messages. Default `TRUE`. Also controls `show_messages` and `show_exceptions` in `cmdstanr`.
`seed`	Integer scalar or `NULL`	Random seed for `cmdstanr`. Default `NULL`. Also forwarded to the pre-flight diagnostic when `parametrization = "auto"`.
`group`	One-sided formula or `NULL`	Grouping variable for per-group hierarchical estimation of `theta_ref`. Default `NULL` (single global `theta_ref`). Only single-variable one-sided formulas accepted (e.g., `~ species`).
`parametrization`	Character scalar	CP/NCP parametrization selection: `"auto"` (default, runs pre-flight), `"ncp"` (force non-centered), `"cp"` (force centered).
`parametrization_a`	Character scalar or `NULL`	Override for component `a`: `"ncp"` or `"cp"`. `NULL` inherits from `parametrization`.
`parametrization_W`	Character scalar or `NULL`	Override for component `W`: `"ncp"` or `"cp"`. `NULL` inherits from `parametrization`. If both `parametrization_a` and `parametrization_W` are explicit, pre-flight is skipped.
`parametrization_aggregation`	Character scalar or `NULL`	Aggregation rule for multivariate per-coordinate CP/NCP decisions: `"any_ncp"`, `"majority"`, or `"per_k"`. Only forwarded to `.gdpar_multi`; not used in the scalar or K-individual paths.
`id_check_rigor`	Character scalar	Identifiability check strictness: `"full"` (default) or `"fast"`. Forwarded to `.gdpar_K` and `.gdpar_multi`; not directly used in the scalar path.
`...`		Additional named arguments forwarded to the underlying `cmdstanr` sampler (via `sample_args`), and also forwarded to `.gdpar_K` / `.gdpar_multi`.

Mathematics

AMM canonical decomposition. The individual parameter $\theta$ is decomposed via an additive component $a$ and a modulating component $W$:

$$\theta = a + W(\theta_{\text{ref}}) \cdot x$$

where $x$ is a vector of covariates entering the modulating component.

Anchor parametrization. The anchor value $\theta_0$ enters as a parametrization device:

$$W(\theta) - W(\theta_{\text{anchor}})$$

Changing the anchor changes the parametrization but not the data-generating model.

Group-level random intercept. When group is supplied, $\theta_{\text{ref}}$ is promoted to a vector indexed by group with a Normal hyperprior:

$$\theta_{\text{ref}}[g] \sim \mathrm{Normal}(\mu_{\theta_{\text{ref}}}, \sigma_{\theta_{\text{ref}}})$$

Identifiability diagnostic test point. The diagnostic test point $\theta_{\text{ref,init}}$ is set to:

$$\theta_{\text{ref,init}} = \begin{cases} 1 & \text{if } b \neq \text{NULL} \text{ and } |\theta_0| < 10^{-8} \ \theta_0 & \text{otherwise} \end{cases}$$

This avoids a degenerate zero test point when the anchor is numerically zero and a multiplicative component $b$ is present.

Covariate standardization. Covariates entering the additive basis $a$ are centered (enforcing empirical assumption C1). Covariates entering the modulating component $W$ are additionally scaled to unit standard deviation. The transformation is recorded for consistent prediction on new data.

Parametrization pre-flight (Path B'). When parametrization = "auto", a short pre-flight NCP fit is run and per-component CP/NCP decisions are made via three sequential filters:

A divergence-attribution $t$-test.
An E-BFMI threshold check.
A chain-aware block-bootstrap $z$-test on the posterior-to-prior contraction of the effective coefficient.

Returns

An object of class c("gdpar_fit", "list") with components:

Component	Type	Description
`fit`	`CmdStanFit` object	The underlying `cmdstanr` fit object.
`amm`	`amm_spec`	The (possibly materialized) AMM specification used.
`family`	`gdpar_family` / `gdpar_family_multi`	The (possibly promoted) family object.
`prior`	`gdpar_prior`	The prior specification.
`design`	List	The AMM design matrices and metadata from `build_amm_design()`.
`anchor`	Numeric scalar	The resolved anchor value $\theta_0$.
`stan_data`	List	The data list passed to Stan.
`identifiability_report`	List or `NULL`	Report from `gdpar_check_identifiability()`, or `NULL` if `skip_id_check = TRUE`.
`diagnostics`	List	Convergence diagnostics from `compute_diagnostics()`.
`parametrization`	List	Contains `cp_a` (logical), `cp_W` (logical), and `meta` (pre-flight diagnostic statistics when applicable).
`group_info`	List or `NULL`	Group information from `.resolve_group_argument()`.
`call`	`call`	The matched call.
`path`	Character scalar	The resolved estimation path (`"bayes"`).

Notes

Dispatch logic. The function has three major dispatch branches before reaching the scalar Stan pipeline:

K-individual paths — triggered when formula is a gdpar_formula_set, when amm is a named list of amm_spec objects, or when formula is a classic two-sided formula whose RHS contains AMM wrapper calls (a()/b()/W(), detected by .gdpar_rhs_has_amm_calls()). In all three sub-paths, amm must be at its default amm_spec(). The formula set / AMM wrappers are converted to amm_list_canonical via .gdpar_formula_set_to_amm_spec_list(). If $K > 1$, the function delegates to .gdpar_K(). If $K = 1$, the single amm_spec is extracted, the formula is reconstructed from union(all.vars(amm$a), all.vars(amm$b)), and execution falls through to the scalar path.
Multivariate path — triggered when amm$p > 1L (and amm is a single amm_spec). Delegates to .gdpar_multi().
Scalar univariate path — the main body, for amm$p being NULL or 1L.

Heterogeneous family detection. A named list is recognized as a heterogeneous family specification when all of the following hold: it is.list(), does not inherit from gdpar_family or gdpar_family_multi, has non-NULL names, all names are non-empty (nzchar), no duplicated names, and every element inherits from gdpar_family. If $K = 1$ with a heterogeneous family, an error is raised. If $K > 1$, the family is resolved via .gdpar_resolve_heterogeneous_family_K() (producing a location_family and a family_id_k_vector); otherwise .gdpar_promote_scope_per_observation() is used.

Errors raised (via gdpar_abort):

Condition	Error class
`path = "vcm"`	`gdpar_unsupported_feature_error`
`path = "hyper"`	`gdpar_unsupported_feature_error`
`formula` is a `gdpar_formula_set` but `amm` is not at default	`gdpar_input_error`
Named list `amm` with empty slot name	`gdpar_input_error`
Named list `amm` entry not inheriting `amm_spec`	`gdpar_input_error` (with `data` containing `slot` and `received`)
Named list `amm` with duplicated names	`gdpar_input_error`
Named list `amm` but `formula` not a two-sided formula	`gdpar_input_error`
Classic formula with AMM calls but `amm` not at default	`gdpar_input_error`
`W` supplied in the legacy single-`amm_spec` path	`gdpar_input_error`
`formula` not a two-sided formula	`gdpar_input_error`
`gdpar_family_multi` with `amm$p` NULL or 1	`gdpar_input_error` (with `data` containing `family_class` and `amm_p`)
Heterogeneous family with $K = 1$	`gdpar_input_error` (with `data` containing `K`)
`refresh` not a non-negative numeric scalar	`gdpar_input_error`
`verbose` not a logical scalar	`gdpar_input_error`
Outcome variable not in `data`	`gdpar_input_error`
Outcome contains non-finite values (NA, NaN, Inf)	`gdpar_input_error`
Identifiability check fails	`gdpar_identifiability_error` (with `data` containing `report`)

Assertion checks (via assert_* helpers):

family must inherit from c("gdpar_family", "gdpar_family_multi") (unless it is a recognized named list).
amm must inherit from "amm_spec".
data must be a data frame.
prior must inherit from "gdpar_prior".
chains, iter_warmup, iter_sampling, max_treedepth must be count scalars.
adapt_delta must be a numeric scalar in $[0.5, 0.999]$.

Side effects:

Calls require_suggested("cmdstanr", ...); aborts if cmdstanr is not installed.
Writes Stan source code to a temporary file via write_stan_to_tempfile().
Compiles the Stan model via cmdstanr::cmdstan_model().
Performs HMC/NUTS sampling via cs_model$sample().
When verbose = FALSE: creates a temporary .log file, opens a message sink, redirects stderr to the file during sampling, then on exit reads the file, deduplicates blocks via dedup_message_blocks(), and re-emits them via message(). The on.exit handler is registered with add = TRUE, after = FALSE.
When verbose = TRUE: passes show_messages = TRUE and show_exceptions = TRUE to the sampler.
Emits informational messages via gdpar_inform() for: skipped identifiability check, D-ID conditional status, and group resolution.
Calls .check_group_aliasing_c7() when a group is supplied, enforcing condition C7 of Block 6.5 (no perfect aliasing of a/b columns with the group indicator).
The ... arguments are merged into sample_args by name, potentially overriding default sampler settings.

D-ID status message. When family$did_status == "holds_under_condition" and verbose = TRUE, a message is emitted with the family name, the D-ID condition, and the D-ID reference. The package documents but does not verify this condition from data.

Group argument. Resolved via .resolve_group_argument(group, data, n = length(y), verbose = verbose). If non-NULL, the group_id integer vector is passed to assemble_stan_data() and .check_group_aliasing_c7() is called with the design, group ID, and variable name.

Parametrization resolution. resolve_parametrization() is called with parametrization, parametrization_a, parametrization_W, prior, stan_data, amm, preflight_seed = seed, and verbose. Note that parametrization_aggregation is NOT passed to resolve_parametrization() in the scalar path. The resolved cp_a and cp_W flags are forwarded to generate_stan_code().

Stan code generation. generate_stan_code(prior, cp_a, cp_W) substitutes prior placeholders into the static template at inst/stan/amm_main.stan.

W basis materialization. When amm$W is non-NULL, it is materialized via materialize_W_basis(amm$W, p = 1L) before design assembly.

Formula RHS processing. The RHS is extracted as formula[c(1L, 3L)] and updated with ~ . + 0 (removing the intercept), then passed as formula_rhs to build_amm_design() and gdpar_check_identifiability().

S3 dispatch. The returned object has class c("gdpar_fit", "list"), enabling S3 method dispatch for print, summary, etc. (defined elsewhere in the package).

`flush_captured()`

Purpose

A closure defined inside gdpar() (not a top-level function). It is registered as an on.exit handler (with add = TRUE, after = FALSE) when verbose = FALSE, to flush the stderr messages captured during cmdstanr sampling. It un-sinks the message stream, closes the temporary log file connection, reads and deduplicates the captured lines, re-emits them via message(), and cleans up the temp file.

Arguments

None. The function captures msg_file (the temp file path) and msg_con (the file connection) from the enclosing scope of gdpar().

Returns

NULL (invisibly). Called for its side effects.

Notes

This is a local closure, not a package-level function. It is defined only when verbose = FALSE within gdpar().
The function first checks sink.number(type = "message") > 0L before un-sinking, guarding against double-un-sink.
It checks isOpen(msg_con) before closing.
File reading is wrapped in tryCatch: on error, character(0) is returned, and no messages are re-emitted.
The temp file is deleted via unlink(msg_file) regardless of read success.
Captured lines are deduplicated via dedup_message_blocks() before re-emission.
Each line of the deduplicated output is re-emitted via message(line).
Registered via on.exit(flush_captured(), add = TRUE, after = FALSE), ensuring it runs before other on.exit handlers added earlier.

`.gdpar_multi(formula, family, amm, data, prior, anchor, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group = NULL, parametrization, parametrization_a, parametrization_W, parametrization_aggregation = NULL, id_check_rigor = "full", call, ...)`

Purpose

Internal workhorse implementing the multivariate ($p > 1$) Bayesian "Path 1" fitting route of gdpar(). It validates inputs, promotes a univariate family to a multivariate one when necessary, assembles the AMM design matrix, resolves the anchor and the CP/NCP parametrization, runs a basis-restricted identifiability check, generates and compiles Stan code via cmdstanr, executes HMC/NUTS sampling, computes posterior diagnostics, and returns a gdpar_fit object. It is the multivariate counterpart referenced by the trailing (not-yet-defined-in-this-section) K-individual path documented for $K > 1, p = 1$.

Arguments

formula: formula (length 3, two-sided). The LHS names the outcome column in data; the RHS is rebuilt internally with the intercept suppressed.
family: object of class gdpar_family or gdpar_family_multi. A bare gdpar_family is auto-promoted to gdpar_family_multi with p = amm$p when amm$p > 1.
amm: list representing the anchor model matrix specification; must expose integer element $p (the outcome dimension) and may expose $W (basis) and $b (offset). $W, if present, is materialized through materialize_W_basis().
data: data.frame containing the outcome column and any RHS variables.
prior: gdpar_prior object; if NULL, replaced by gdpar_prior().
anchor: anchor specification forwarded to resolve_anchor_multi().
skip_id_check: logical scalar; when TRUE, the basis-restricted identifiability check (C1–C4 + C4-bis) is bypassed.
chains: non-negative integer scalar; number of MCMC chains.
iter_warmup: non-negative integer scalar; warmup iterations per chain.
iter_sampling: non-negative integer scalar; sampling iterations per chain.
adapt_delta: numeric scalar in $[0.5, 0.999]$; NUTS step-size adaptation target.
max_treedepth: non-negative integer scalar; NUTS maximum tree depth.
refresh: non-negative numeric scalar; Stan progress refresh interval.
verbose: logical scalar; toggles informative messages, Stan show_messages/show_exceptions, and message-sink behavior.
seed: integer scalar or NULL; PRNG seed forwarded to cmdstanr and to the parametrization preflight.
group: NULL or group specification; resolved by .resolve_group_argument().
parametrization: forwarded to resolve_parametrization_multi(); uniform CP/NCP flag or "auto".
parametrization_a: forwarded to resolve_parametrization_multi(); per-component $a$ parametrization override.
parametrization_W: forwarded to resolve_parametrization_multi(); per-component $W$ parametrization override.
parametrization_aggregation: NULL (defaulting internally to "any_ncp") or a string; rule for aggregating per-coordinate CP/NCP decisions.
id_check_rigor: string (default "full"); rigor level passed to gdpar_check_identifiability().
call: language object; the original call, stored verbatim in the returned object.
...: extra named arguments merged into the sample_args list passed to cmdstanr::cmdstan_model$sample().

Mathematics

The function operationalizes the multivariate Path-1 Bayesian model. Let $p = \texttt{amm}\$p$ denote the outcome dimension and $n$ the number of observations. The outcome $Y \in \mathbb{R}^{n \times p}$ must be a finite matrix with exactly $p$ columns. The RHS design is built with intercept suppressed:

$$ \texttt{rhs} ;=; \texttt{update}\bigl(\texttt{formula}[c(1,3)],; \sim; . + 0\bigr). $$

The diagnostic test point for the identifiability check is chosen heuristically as

$$ \theta_{\text{ref}} ;=; \begin{cases} \mathbf{1}_p, & \texttt{amm}$b \neq \text{NULL} ;\land; \lVert \text{anchor_value} \rVert_\infty < 10^{-8}, \\ \text{anchor_value}, & \text{otherwise}, \end{cases} $$

i.e. when the anchor is numerically zero and a basis offset $b is present, the all-ones vector is substituted to avoid a degenerate diagnostic. The identifiability report is produced by gdpar_check_identifiability() at rigor id_check_rigor; failure raises a gdpar_identifiability_error.

The CP/NCP decision is resolved per coordinate $k \in \{1,\dots,p\}$ by resolve_parametrization_multi(), yielding boolean vectors cp_a_per_k and cp_W_per_k plus uniform flags cp_a/cp_W. When the per-coordinate $a$ decisions are non-constant,

$$ \bigl|{\texttt{cp_a_per_k}}\bigr| > 1 ;\Longrightarrow; \texttt{stan_data$cp_a_per_k_data} ;=; \texttt{as.integer}(\texttt{cp_a_per_k}), $$

activating the segment()-based per-coordinate prior wiring in the generated Stan code. The metadata note records that uniform-NCP and uniform-CP branches compile to byte-identical code relative to the "H.1 template", and that the mode is "multivariate_phase_h" (Phase H.2 preflight).

Returns

A list with S3 class c("gdpar_fit", "list") containing:

fit: the cmdstanr fit object returned by $sample().
amm: the (possibly $W-materialized) amm input.
family: the (possibly promoted) gdpar_family_multi object.
prior: the gdpar_prior object used.
design: the AMM design object from build_amm_design().
anchor: the resolved anchor value (numeric vector of length $p$).
stan_data: the assembled Stan data list.
identifiability_report: the report from gdpar_check_identifiability(), or NULL when skip_id_check = TRUE.
diagnostics: output of compute_diagnostics(fit, verbose).
parametrization: a list with elements cp_a, cp_W, cp_a_per_k, cp_W_per_k, report, and meta (containing mode, note, aggregation, and requested).
group_info: result of .resolve_group_argument() (possibly NULL).
call: the stored call.
path: the string "bayes".
p: the integer outcome dimension.

Notes

Errors raised (class gdpar_input_error): formula not a length-3 formula; data not a data frame; prior not inheriting gdpar_prior; chains/iter_warmup/iter_sampling/max_treedepth not non-negative integer scalars; adapt_delta outside $[0.5, 0.999]$; refresh not a non-negative numeric scalar; verbose not a logical scalar; family not promotable to gdpar_family_multi; family$p != amm$p; outcome name absent from data; outcome not a matrix/array; outcome containing any non-finite (NA, NaN, Inf) entry — Path 1 does not impute; outcome column count differing from $p$.
Errors raised (class gdpar_identifiability_error): basis-restricted identifiability check (C1–C4 + C4-bis) fails at the diagnostic test point; the error data field carries the full report.
Family auto-promotion: when family is a bare gdpar_family (not gdpar_family_multi) and amm$p > 1, it is promoted via gdpar_family_multi(family, p = p) and a gdpar_family_promotion_message is emitted when verbose.
D-ID informational message: when family$did_status == "holds_under_condition" and verbose, a gdpar_did_message is emitted reporting the condition and reference; the package documents but does not verify the condition from data.
Group aliasing: when a group is supplied, .check_group_aliasing_c7() is invoked against the design and group id before Stan data assembly.
Message capture under verbose = FALSE: a temporary .log file is opened, sink(..., type = "message") is engaged, and an on.exit handler flush_captured() (a closure defined inside the function) closes the sink, reads the file, deduplicates blocks via dedup_message_blocks(), emits each surviving line through message(), and unlinks the temp file. The handler is registered with add = TRUE, after = FALSE.
Extra arguments: any ... names overwrite or extend sample_args entries, allowing caller override of cmdstanr::cmdstan_model$sample() parameters.
S3 dispatch: the returned object is dispatched on class gdpar_fit; no methods are defined within this section.
Trailing roxygen block: the section ends with a @noRd/@keywords internal documentation block describing a not-yet-defined K-individual path ($K > 1, p = 1$) targeting the amm_distrib_K.stan template via .build_amm_design_K(), .assemble_stan_data_K(), and generate_stan_code_K(); that function's body is not present in this section and is therefore not described here.

Section 3: K-Individual Bayesian Fitting and Model Assembly

`.gdpar_K(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group = NULL, parametrization, parametrization_a, parametrization_W, id_check_rigor = "full", family_id_k_vector = NULL, call, ...)`

Purpose

Primary internal entry point for Path 1 (K-individual hierarchical Bayesian) model fitting. Validates all user-supplied arguments, delegates model/data assembly to .gdpar_K_build(), invokes cmdstanr MCMC sampling, computes post-sampling diagnostics and post-fit identifiability (information-contraction) checks, and assembles the final gdpar_fit S3 object.

Arguments

Argument	Type	Meaning
`amm_list_canonical`	list	Canonical AMM (anchor-map model) specifications, one per K slot.
`family`	`gdpar_family`	Response distribution family with K-individual slots promoted to per-observation scope.
`data`	data.frame	Model data containing the outcome and all covariates.
`prior`	`gdpar_prior` or `NULL`	Prior specification; if `NULL`, defaults to `gdpar_prior()`.
`anchor`	numeric scalar, numeric vector (length K), or character (`"prior_mean"`, `"empirical_y"`)	Reference-anchor value(s) for the K slots.
`outcome_name`	character (length 1)	Name of the outcome column in `data`.
`formula_env`	environment	Environment in which the model formula is constructed.
`skip_id_check`	logical	If `TRUE`, skip basis-restricted identifiability checks.
`chains`	integer (count)	Number of MCMC chains.
`iter_warmup`	integer (count)	Warmup iterations per chain.
`iter_sampling`	integer (count)	Post-warmup sampling iterations per chain.
`adapt_delta`	numeric scalar in $[0.5,\, 0.999]$	NUTS step-size adaptation target.
`max_treedepth`	integer (count)	Maximum NUTS tree depth.
`refresh`	non-negative numeric scalar	Stan progress-report refresh interval.
`verbose`	logical scalar	Whether to show Stan messages and informational diagnostics.
`seed`	integer or `NULL`	Random seed for reproducibility.
`group`	`NULL` or group specification	Grouping structure for hierarchical partial pooling.
`parametrization`	character	Global CP/NCP parametrization mode (`"cp"` or `"ncp"`).
`parametrization_a`	character or `NULL`	Per-component parametrization for the `a` coefficients; overrides `parametrization` when non-`NULL`.
`parametrization_W`	character or `NULL`	Per-component parametrization for the `W` basis; overrides `parametrization` when non-`NULL`.
`id_check_rigor`	character (default `"full"`)	Rigor level for the K-level identifiability check.
`family_id_k_vector`	`NULL` or vector	Per-slot family ID vector.
`call`	call	The original matched call (stored in the returned object).
`...`	named arguments	Extra arguments forwarded to `cs_model$sample()`, overriding defaults.

Mathematics

The function orchestrates Bayesian inference for a K-individual dynamic-parameter model. Each slot $k \in \{1, \dots, K\}$ carries its own AMM with design matrices $Z_a^{(k)}$, $Z_b^{(k)}$, and shared covariate matrix $X$. The posterior is explored via the No-U-Turn Sampler (NUTS) with adaptation target adapt_delta and maximum tree depth max_treedepth.

Post-fit, an information-contraction diagnostic is computed per slot. Let $\pi_0(\theta_k)$ denote the prior and $\pi(\theta_k \mid y)$ the posterior for slot $k$'s anchor parameter. The contraction ratio is defined (conceptually) as:

$$ r_k = 1 - \frac{\mathrm{Var}_{\pi(\theta_k \mid y)}[\theta_k]}{\mathrm{Var}_{\pi_0(\theta_k)}[\theta_k]} $$

Slots with $r_k < 0.1$ are flagged as information errors (posterior recovers prior); slots with $r_k \in [0.1, 0.5)$ are flagged as weak learning warnings.

Returns

A list with S3 class c("gdpar_fit", "list") containing:

Element	Type	Description
`fit`	CmdStanFit object	Raw posterior draws.
`amm_list_canonical`	list	Canonical AMM list (possibly with materialized $W$ bases).
`family`	`gdpar_family`	Resolved family object.
`prior`	`gdpar_prior`	Resolved prior object.
`design_K`	list	K-individual design (matrices $Z_a^{(k)}$, $Z_b^{(k)}$, $X$).
`anchor`	numeric vector (length K)	Resolved anchor values.
`stan_data`	list	Stan data list passed to the sampler.
`identifiability_report`	list or `NULL`	Pre-fit per-slot identifiability reports with K-level attribute.
`identifiability_post_fit`	list	Post-fit information-contraction report.
`diagnostics`	list	MCMC convergence diagnostics (R-hat, ESS, divergences, etc.).
`parametrization`	list	Resolved CP/NCP flags (`cp_a`, `cp_W`, `cp_a_per_K`, `meta`).
`group_info`	list or `NULL`	Resolved grouping information.
`call`	call	Original function call.
`path`	character	Always `"bayes"`.
`K`	integer	Number of slots.
`slot_names`	character vector	Canonical slot names.

Notes

Input validation: Asserts data is a data.frame, prior inherits from gdpar_prior, chains/iter_warmup/iter_sampling/max_treedepth are counts, adapt_delta is in $[0.5, 0.999]$, refresh is a non-negative numeric scalar, and verbose is a logical scalar. Violations raise errors of class gdpar_input_error.
Suggested package: Requires cmdstanr (via require_suggested).
Message capture: When verbose = FALSE, Stan messages are redirected to a temporary .log file via sink(). A closure flush_captured (registered on on.exit) closes the sink, reads the file, deduplicates message blocks via dedup_message_blocks(), and replays them through message().
Extra arguments: Named elements in ... are merged into sample_args after all defaults are set, so they can override any sampling argument.
Seed: Only added to sample_args when non-NULL; coerced to integer.
Init: Only added to sample_args when bd$init is non-NULL (Tweedie family case).
Post-fit warnings: If any slot has status == "information_error", a warning of class gdpar_information_error is issued (always). If any slot has status == "warn" and verbose is TRUE, a warning of class gdpar_information_warning is issued.
Side effects: Creates (and cleans up) a temporary log file when verbose = FALSE. Calls do.call(cs_model$sample, sample_args) which writes to disk and may print to console.

`.gdpar_K_build(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, skip_id_check, verbose, group = NULL, parametrization, parametrization_a, parametrization_W, id_check_rigor = "full", family_id_k_vector = NULL, compile_model_methods = FALSE)`

Purpose

Behaviour-preserving extraction of the build phase of .gdpar_K() — everything up to but not including the call to cs_model$sample(). Shared between .gdpar_K() (which passes compile_model_methods = FALSE) and gdpar_geom_fit() (which passes compile_model_methods = TRUE to expose $log_prob/$grad_log_prob/$hessian for the geometry engine). This ensures the model and data are built and compiled exactly once with no duplication.

Arguments

Argument	Type	Meaning
`amm_list_canonical`	list	Canonical AMM specifications, one per K slot.
`family`	`gdpar_family`	Response distribution family.
`data`	data.frame	Model data.
`prior`	`gdpar_prior` or `NULL`	Prior specification; defaults to `gdpar_prior()` if `NULL`.
`anchor`	numeric or character	Anchor specification for the K slots.
`outcome_name`	character	Name of the outcome column.
`formula_env`	environment	Environment for formula construction.
`skip_id_check`	logical	Whether to skip identifiability checks.
`verbose`	logical	Verbosity flag for informational messages.
`group`	`NULL` or group spec	Grouping structure.
`parametrization`	character	Global CP/NCP mode.
`parametrization_a`	character or `NULL`	Per-component `a` parametrization.
`parametrization_W`	character or `NULL`	Per-component `W` parametrization.
`id_check_rigor`	character (default `"full"`)	Rigor for K-level identifiability check.
`family_id_k_vector`	`NULL` or vector	Per-slot family ID vector.
`compile_model_methods`	logical (default `FALSE`)	If `TRUE`, compiles with `compile_model_methods = TRUE` to expose log-prob/gradient/Hessian methods.

Mathematics

Design construction. The union of all covariate variables across the K AMM slots is collected:

$$ \mathcal{V} = \bigcup_{k=1}^{K} \left( \mathrm{vars}(a_k) \cup \mathrm{vars}(b_k) \right) $$

A model formula $y \sim \mathcal{V} + 0$ is constructed (or $y \sim 1 + 0$ if $\mathcal{V} = \emptyset$), and the K-individual design is built via .build_amm_design_K().

Anchor resolution. The anchor is resolved via resolve_anchor_K(), producing a length-K numeric vector $\boldsymbol{\theta}_{\mathrm{ref}} = (\theta_{\mathrm{ref}}^{(1)}, \dots, \theta_{\mathrm{ref}}^{(K)})$.

Per-slot identifiability check. For each slot $k$, a diagnostic test point is chosen:

$$ \theta_{\mathrm{diag}}^{(k)} = \begin{cases} 1 & \text{if } b_k \neq \mathrm{NULL} \text{ and } |\theta_{\mathrm{ref}}^{(k)}| < 10^{-8} \ \theta_{\mathrm{ref}}^{(k)} & \text{otherwise} \end{cases} $$

This avoids testing at a degenerate zero anchor when a dispersion/basis term $b_k$ is present. The check is delegated to gdpar_check_identifiability().

K-level identifiability check. A cross-slot check (via .check_identifiability_K()) verifies:

D-B3: per-slot $Z_a^{(k)}$ rank sufficiency.
D-B2: cross-slot extended Gram matrix rank.

Parametrization resolution. CP/NCP flags are resolved with per-component overrides:

$$ \texttt{cp_a} = \begin{cases} \mathbb{1}[\texttt{parametrization_a} = \texttt{"cp"}] & \text{if } \texttt{parametrization_a} \neq \text{NULL} \ \mathbb{1}[\texttt{parametrization} = \texttt{"cp"}] & \text{otherwise} \end{cases} $$

and analogously for cp_W. Per-slot parametrization (cp_a_per_K) is NULL (deferred to a future sub-phase).

Tweedie init. For the Tweedie family (family$stan_id == 9), the structural constraint $p \in (1.01, 1.99)$ is enforced via a uniform prior, but the default cmdstanr init draws unconstrained parameters in $(-2, 2)$ which frequently land outside the support. The init function pins $\theta_{\mathrm{ref}}^{(K)}$ (the $p$ slot) at the midpoint $1.5$:

$$ \theta_{\mathrm{ref}}^{(j, k)} = \begin{cases} 0 & k < K \ 1.5 & k = K \end{cases}, \quad j = 1, \dots, J_{\mathrm{groups}} $$

Returns

A list with four top-level elements:

Element	Type	Description
`cs_model`	CmdStanModel	Compiled Stan model.
`stan_data`	list	Stan data list.
`init`	function or `NULL`	Per-chain init function (Tweedie only); `NULL` otherwise.
`meta`	list	Metadata list (see below).

The meta sub-list contains:

Element	Type	Description
`amm_list_canonical`	list	AMM list (with materialized $W$ bases).
`family`	`gdpar_family`	Resolved family.
`prior`	`gdpar_prior`	Resolved prior.
`design_K`	list	K-individual design.
`anchor_value`	numeric vector (length K)	Resolved anchor values.
`id_report`	list or `NULL`	Per-slot identifiability reports with `K_level` attribute.
`group_info`	list or `NULL`	Resolved group information.
`parametrization_resolved`	list	Resolved CP/NCP flags and metadata.
`slot_names`	character vector	Canonical slot names.
`K`	integer	Number of slots.
`stan_src`	character	Generated Stan source code.

Notes

Outcome validation: Aborts with gdpar_input_error if outcome_name is not a column in data, if the outcome is matrix/array-valued (Path 1 with $K > 1, p = 1$ requires a length-$n$ vector), or if the outcome contains any non-finite values (NA, NaN, Inf). Path 1 does not impute.
$W$ basis materialization: For each slot, if $W is non-NULL, it is materialized via materialize_W_basis(W, p = 1L).
Identifiability skip: When skip_id_check = TRUE and verbose = TRUE, an informational message of class gdpar_identifiability_message is issued; id_report is set to NULL.
Identifiability failure: Per-slot failures abort with class gdpar_identifiability_error (including the slot name and report in data). K-level failures abort similarly, identifying whether the failure was in the per-slot rank check (D-B3) or the cross-slot extended Gram check (D-B2).
D-ID conditional families: When family$did_status == "holds_under_condition" and verbose = TRUE, an informational message of class gdpar_did_message is issued documenting the condition and reference. The package does not verify the condition from data.
Group aliasing: When groups are present, .check_group_aliasing_c7() is called for each slot's design ($Z_a^{(k)}$, $Z_b^{(k)}$, $X$) against the group ID.
Model compilation: When compile_model_methods = FALSE, calls cmdstanr::cmdstan_model(stan_path) (bit-identical to pre-refactor behavior). When TRUE, calls cmdstanr::cmdstan_model(stan_path, compile_model_methods = TRUE).
Stan code generation: Delegates to generate_stan_code_K() with the resolved prior, CP/NCP flags, and family.
Tempfile: Stan source is written to a temporary file via write_stan_to_tempfile().

`flush_captured()` (local closure defined inside `.gdpar_K`)

Purpose

Closure defined inside .gdpar_K() when verbose = FALSE. Registered on on.exit() to cleanly close the message sink, read the captured Stan messages from the temporary log file, deduplicate them, and replay them through message().

Arguments

None. Captures msg_con (the file connection) and msg_file (the temp file path) from the enclosing scope.

Returns

NULL (invisible). Side effect: messages are replayed to the R message stream.

Notes

Checks sink.number(type = "message") > 0L before closing the sink to avoid errors if the sink was already closed.
Checks isOpen(msg_con) before closing the connection.
Reads the log file with readLines(msg_file, warn = FALSE); on error, returns character(0).
Unlinks the temporary file after reading.
Passes captured lines through dedup_message_blocks() before replaying.
Registered via on.exit(flush_captured(), add = TRUE, after = FALSE) so it executes before any other on.exit handlers.

Documentation for resolve_anchor_K appears at the end of this section (roxygen block only); its function definition is not present in this section and will be documented in a subsequent section.

`resolve_anchor_K(anchor, family, y, prior, slot_names, verbose)`

Purpose

Resolves the user-supplied anchor argument for the K-slot (multi-parameter-per-response) path of gdpar. It returns a length-K numeric vector on the linear-predictor scale, one anchor value per parameter slot (e.g. location, scale, shape). This is the slot-aware counterpart of the simpler resolve_anchor.

Arguments

anchor: A numeric scalar, a numeric vector of length K, or a single character string ("prior_mean" or "empirical_y").
family: A gdpar_family object. Only used when anchor = "empirical_y"; its $linkfun component is applied to the outcome mean for the location (first) slot.
y: Numeric vector of outcomes. Used only when anchor = "empirical_y".
prior: A gdpar_prior object. Accepted as part of the signature but not referenced in the function body.
slot_names: Character vector of length K giving the names of the parameter slots. Used for naming the output and for validating a named anchor vector.
verbose: Logical scalar. If TRUE, an informational message is issued when the empirical anchor is used.

Mathematics

For "prior_mean":

$$\boldsymbol{\eta}^{\text{anchor}} = \mathbf{0} \in \mathbb{R}^{K}$$

For "empirical_y":

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \eta_{\text{loc}} = g(\bar{y})$$

where $g$ is family$linkfun. The returned vector is:

$$\boldsymbol{\eta}^{\text{anchor}} = (\eta_{\text{loc}},; 0,; \ldots,; 0)^{\top}$$

i.e. only the first slot (location) receives the link-transformed mean; all remaining slots are anchored at $0$.

Returns

A numeric vector of length K with names set to slot_names.

Notes

Scalar broadcast: A finite numeric scalar is recycled to all K slots.
Named vector validation: If anchor is a length-K numeric vector with at least one non-empty name, the names must form the same set as slot_names (order-independent, checked via setequal). A mismatch raises a gdpar_input_error (class "gdpar_input_error") with data = list(received_names = ..., expected_names = ...).
Unnamed vector: A length-K numeric vector with no names is accepted positionally.
Link failure: If family$linkfun(yb) errors, the error is caught and re-raised as gdpar_input_error with the original condition message.
Invalid anchor: Any input not matching the accepted patterns raises gdpar_input_error with data = list(received = anchor, K = K, slot_names = slot_names).
The prior argument is unused in the body.
When verbose is TRUE and "empirical_y" is used, a message of class "gdpar_anchor_message" is emitted via gdpar_inform.

`resolve_anchor_multi(anchor, family, y, prior, p, verbose)`

Purpose

Multivariate-outcome counterpart of resolve_anchor. Resolves the anchor for a model with p outcome coordinates, returning a length-p numeric vector on the linear-predictor scale.

Arguments

anchor: A numeric scalar, a numeric vector of length p, or a single character string ("prior_mean" or "empirical_y").
family: A gdpar_family_multi object. Must contain a families component that is a list of length p, each element providing a $linkfun.
y: Numeric matrix of outcomes (n × p). Used only when anchor = "empirical_y".
prior: A gdpar_prior object. Accepted but not referenced in the body.
p: Integer scalar; the number of outcome coordinates (length of theta_ref).
verbose: Logical scalar. If TRUE, an informational message is issued when the empirical anchor is used.

Mathematics

For "prior_mean":

$$\boldsymbol{\eta}^{\text{anchor}} = \mathbf{0} \in \mathbb{R}^{p}$$

For "empirical_y":

$$\bar{y}_k = \frac{1}{n}\sum_{i=1}^{n} y_{ik}, \qquad \eta_k = g_k(\bar{y}_k), \quad k = 1, \ldots, p$$

where $g_k$ is family$families[[k]]$linkfun. The returned vector is:

$$\boldsymbol{\eta}^{\text{anchor}} = (\eta_1, \ldots, \eta_p)^{\top}$$

Returns

A numeric vector of length p (unnamed).

Notes

Scalar broadcast: A finite numeric scalar is recycled to length p.
Vector pass-through: A length-p finite numeric vector is returned as-is (coerced to double), without naming or name validation (unlike resolve_anchor_K).
Per-coordinate link: Each coordinate's link function is applied independently via vapply. If any fam_k$linkfun(yb[k]) errors, it is caught and re-raised as gdpar_input_error identifying the failing coordinate index k.
Invalid anchor: Raises gdpar_input_error with data = list(received = anchor, p = p).
The prior argument is unused in the body.
When verbose is TRUE and "empirical_y" is used, a message of class "gdpar_anchor_message" is emitted showing the formatted anchor values.

`resolve_anchor(anchor, family, y, prior, verbose)`

Purpose

Simplest (univariate) anchor resolver. Parses the user-supplied anchor argument and returns a single numeric value on the linear-predictor scale.

Arguments

anchor: A numeric scalar, or a single character string ("prior_mean" or "empirical_y").
family: A gdpar_family object. Must provide $linkfun (used for "empirical_y") and $link (a string naming the link, used in the verbose message).
y: Numeric vector of outcomes. Used only when anchor = "empirical_y".
prior: A gdpar_prior object. Accepted but not referenced in the body.
verbose: Logical scalar. If TRUE, an informational message is issued when the empirical anchor is used.

Mathematics

For "prior_mean":

$$\eta^{\text{anchor}} = 0$$

For "empirical_y":

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \eta^{\text{anchor}} = g(\bar{y})$$

where $g$ is family$linkfun.

Returns

A numeric scalar.

Notes

Unlike resolve_anchor_K and resolve_anchor_multi, this function does not accept a vector anchor; only a scalar or string is valid.
If family$linkfun(yb) errors, the error is caught and re-raised as gdpar_input_error with the original condition message.
Invalid anchor: Raises gdpar_input_error with data = list(received = anchor).
The prior argument is unused in the body.
When verbose is TRUE and "empirical_y" is used, a message of class "gdpar_anchor_message" is emitted, referencing family$link (the link name) and the computed anchor value.

`compute_diagnostics(fit, verbose = TRUE)`

Purpose

Collects convergence diagnostics from a cmdstanr fit object at fit time. Extracts R-hat and effective sample size (bulk and tail) summaries via the posterior package, counts divergent transitions and tree-depth saturations from sampler diagnostics, and optionally warns when thresholds are violated.

Arguments

fit: A cmdstanr fit object. Must provide $draws(), $diagnostic_summary().
verbose: Logical scalar (default TRUE). If TRUE and convergence thresholds are violated, individual warnings are issued.

Mathematics

$$\hat{R}_{\max} = \max_{v \in V} \hat{R}_v \leq 1.01$$

$$\text{ESS}_{\text{bulk,min}} = \min_{v \in V} \text{ESS}_{\text{bulk},v} \geq 400$$

$$\text{ESS}_{\text{tail,min}} = \min_{v \in V} \text{ESS}_{\text{tail},v} \geq 400$$

$$\frac{n_{\text{div}}}{N_{\text{iter}}} \leq 0.001, \qquad \frac{n_{\text{td}}}{N_{\text{iter}}} \leq 0.01$$

where $N_{\text{iter}} = n_{\text{iterations}} \times n_{\text{chains}}$, $n_{\text{div}}$ is the total divergent-transition count, and $n_{\text{td}}$ is the total tree-depth-saturation count. Overall convergence is:

$$\text{converged} = \bigwedge \text{ of all five conditions above}$$

Returns

An object of class c("gdpar_diagnostics", "list") with the following named components:

Component	Type	Description
`rhat_max`	numeric (scalar, possibly `NA_real_`)	Maximum R-hat across retained variables
`ess_bulk_min`	numeric (scalar, possibly `NA_real_`)	Minimum bulk ESS across retained variables
`ess_tail_min`	numeric (scalar, possibly `NA_real_`)	Minimum tail ESS across retained variables
`divergent_count`	integer	Total divergent transitions
`divergent_relative`	numeric (possibly `NA_real_`)	`divergent_count / total_iter`
`treedepth_saturated`	integer	Total tree-depth saturations
`treedepth_relative`	numeric (possibly `NA_real_`)	`treedepth_saturated / total_iter`
`efmi_min`	numeric (possibly `NA_real_`)	Minimum E-BFMI across chains
`converged`	logical	Whether all thresholds are satisfied
`summary`	data.frame	Full output of `posterior::summarise_draws` on retained variables
`messages`	character vector	Warning messages (populated only when `verbose && !converged`; otherwise empty)

Notes

Dependency check: Calls require_suggested("posterior", "summarize posterior draws") at entry.
Variable filtering: Variables whose names match the regex ^(eta|log_lik|y_pred|theta_i|a_coef|b_coef|a_raw|b_raw|W_raw) are excluded from convergence summaries. If this leaves zero variables, the function falls back to "theta_ref" (if present), then to the first variable in the draws.
Summary measures: Uses posterior::default_convergence_measures() appended to the summarise_draws call.
Diagnostic summary: fit$diagnostic_summary() is called inside tryCatch; if it errors or returns NULL, the divergent and treedepth counts default to 0L and efmi_min defaults to NA_real_.
Total iterations: Computed as posterior::niterations(draws) * posterior::nchains(draws) inside tryCatch; if this fails or is non-finite, set to NA_integer_, and the relative divergence/treedepth rates become NA_real_.
Warning behavior: When verbose = TRUE and converged = FALSE, each violated threshold produces a separate warning via gdpar_warn with class "gdpar_diagnostic_warning". The same messages are collected in the messages component.
Side effects: May emit up to five warnings (one per violated threshold) when verbose is TRUE.
S3 class: The returned object has class c("gdpar_diagnostics", "list"); no S3 methods for this class are defined in this section.

Diagnostics Accessor

`diagnostics(fit)`

Purpose

Exported accessor that retrieves the precomputed diagnostics component stored inside a fitted gdpar_fit object. It is the user-facing entry point for inspecting sampler-level diagnostics (e.g., divergences, R-hat, ESS, max treedepth) that were attached to the fit during model estimation. The function performs no recomputation; it simply validates the class of its input and returns the stored list element named diagnostics.

Arguments

Argument	Type	Meaning
`fit`	S3 object of class `"gdpar_fit"`	A fitted model object produced by the package's main estimation routine. Must contain a `diagnostics` element.

Returns

The object fit$diagnostics, returned as-is. The exact structure is whatever was attached at fit time (typically a named list of diagnostic quantities); this function does not coerce or reshape it.

Notes

Calls assert_inherits(fit, "gdpar_fit", "fit") before accessing the element. If fit does not inherit from class "gdpar_fit", an error is raised (with the argument name "fit" used in the message). The error is propagated from assert_inherits; no custom class is set by diagnostics itself.
No side effects: the function is pure with respect to fit.
S3 dispatch: none. This is an ordinary (non-generic) function; there is no UseMethod call, so subclasses of gdpar_fit cannot override behavior.
If fit$diagnostics is NULL or absent, NULL is returned silently (R's standard behavior for accessing a missing list element); no further validation of the element's presence is performed.

Group Argument Resolution

`.resolve_group_argument(group, data, n, verbose = TRUE)`

Purpose

Internal helper that translates the user-facing group argument — either NULL or a one-sided formula such as ~ species — into the integer grouping vector consumed by the Stan template underlying gdpar. It validates the formula, extracts the referenced variable from data, coerces it to a factor, and returns the integer codes together with metadata (variable name and level labels). When verbose is TRUE and at least one group has fewer than five observations, it emits an informational warning about the dominance of hierarchical shrinkage in small groups.

Arguments

Argument	Type	Meaning
`group`	`NULL` or `formula` of length 2 (one-sided)	The user-supplied grouping specification. `NULL` indicates no grouping (function returns `NULL` immediately). A one-sided formula must contain exactly one variable name on its right-hand side.
`data`	`data.frame`	The user-supplied data frame from which the grouping variable is extracted via `data[[var_name]]`.
`n`	`integer` (scalar)	The sample size of the outcome variable, used to validate that the grouping variable has matching length.
`verbose`	`logical` (scalar, default `TRUE`)	When `TRUE`, enables the small-group informational warning. When `FALSE` (or any non-`TRUE` value, tested via `isTRUE`), the warning is suppressed.

Mathematics

Let $g_i \in \{1, \dots, J\}$ denote the group index for observation $i$, obtained by factor-coercing the raw grouping variable. Let $n_g = \sum_{i=1}^{n} \mathbf{1}(g_i = g)$ be the per-group count. The small-group warning fires when

$$ \min_{g \in {1,\dots,J}} n_g < 5. $$

In that regime, the hierarchical anchor $\theta_{\text{ref},g}$ is dominated by shrinkage toward the population mean $\mu_{\theta_{\text{ref}}}$, so per-group posteriors may be uninformative.

Returns

If group is NULL: returns NULL.
Otherwise: a named list with three components:
- group_id: integer vector of length $n$, with values in $\{1, \dots, J\}$, obtained via as.integer(as.factor(raw)). The mapping from raw values to integers follows as.factor's level ordering (alphabetical by default for character inputs, or the factor's pre-existing level order).
- var_name: character scalar — the single variable name extracted from the formula via all.vars(group)[[1]].
- levels: character vector — levels(fac), i.e., the original group level labels in factor order.

Notes

Formula validation. If group is not NULL, it must inherits(group, "formula") and have length(group) == 2L (one-sided formulas have length 2: ~ plus RHS). Violations raise:
```
gdpar_abort("Argument 'group' must be a one-sided formula such as ~ species.",
            class = "gdpar_input_error")
```
Single-variable check. all.vars(group) must yield exactly one variable. If zero or more than one are found, an error of class "gdpar_input_error" is raised, with data = list(variables_found = vars) attached for programmatic inspection.
Column existence. The extracted var_name must appear in colnames(data); otherwise an error of class "gdpar_input_error" is raised with data = list(missing_variable = var_name).
Length consistency. length(raw) must equal n. A mismatch raises an error of class "gdpar_input_error" reporting both lengths.
Missing values. Any NA in raw is fatal: an error of class "gdpar_input_error" is raised reporting the count of NA values via sum(is.na(raw)). The message instructs the user to remove or impute before fitting.
Zero levels. After as.factor, if nlevels(fac) < 1L (defensive guard; under normal R semantics this cannot occur for a non-NA vector), an error of class "gdpar_input_error" is raised.
Small-group warning. When isTRUE(verbose) and length(counts) > 0L and min(counts) < 5L, a warning is emitted via gdpar_warn with class "gdpar_grouping_warning". The message includes var_name, min(counts), and J_groups. The warning is purely informational (not an error) and does not abort resolution.
Coercion semantics. The raw variable is coerced via as.factor(raw), so numeric grouping variables are treated by their printed values; factor inputs retain their level order; character inputs are sorted alphabetically by as.factor.
No side effects beyond the optional warning. The function does not modify data or any global state.
Not exported. Marked @keywords internal and @noRd; intended solely for internal use within the package's fitting pipeline.

R/geometry_bridge.R

`.gdpar_geom_enable_methods(fit, hessian = TRUE, seed = 1L)`

Purpose Internal helper that enables the standalone log_prob, grad_log_prob, and optionally hessian methods on a CmdStanFit / CmdStanMCMC object. It is called by gdpar_geom_bridge() to prepare a fitted cmdstan object for consumption by the geometry engine. The call is idempotent: init_model_methods() is a no-op when the methods are already attached.

Arguments

Argument	Type	Meaning
`fit`	`CmdStanMCMC` or `CmdStanFit`	A cmdstan fit object whose compiled executable should have the standalone methods exposed.
`hessian`	logical (length 1)	Whether to request compilation of the standalone Hessian (needed by the Riemannian / SoftAbs geometry level). Defaults to `TRUE`.
`seed`	integer-coercible (length 1)	Forwarded to `fit$init_model_methods(seed = ...)`. Seeds any internal RNG inside the standalone methods. Defaults to `1L`.

Mathematics None; this is a plumbing / compilation-gating function.

Algorithm (two-stage fallback)

Attempt fit$init_model_methods(seed, verbose = FALSE, hessian = TRUE).
If that succeeds, return list(has_hessian = TRUE).
If hessian = TRUE was requested and step 1 failed, retry with hessian = FALSE.
- If the retry succeeds, emit a gdpar_geometry_warning via gdpar_warn() and return list(has_hessian = FALSE).
- This fallback leaves the Euclidean, dense, and sub-Riemannian (expected-Fisher) levels fully operational; only the Riemannian SoftAbs level is disabled.
If both attempts fail (or hessian = FALSE was requested and the single attempt failed), call gdpar_abort() with class "gdpar_input_error", surfacing the original error message. A C++ toolchain capable of compiling the model methods is required.

Returns invisible(list(has_hessian = logical(1))) — a single-element list recording whether the Hessian method is available. Returned invisibly.

Notes

The tryCatch / try_init wrapper converts errors to condition objects rather than propagating them, enabling the two-stage fallback.
A failure to expose log_prob / grad_log_prob at all is fatal (gdpar_abort); only Hessian failure is tolerated.
Side effects: may emit a gdpar_geometry_warning via gdpar_warn() on Hessian fallback.

`.gdpar_geom_unconstrained_summary(fit)`

Purpose Internal helper that extracts the unconstrained-dimension and a posterior-mean warm-start vector from a cmdstan fit with methods already enabled. The geometry engine operates on the unconstrained (real-vector) scale, so this provides both dim (the dimension of that space) and reference (the posterior mean in that space, used as a warm-start for position-dependent levels).

Arguments

Argument	Type	Meaning
`fit`	`CmdStanMCMC` or `CmdStanFit`	A methods-enabled cmdstan fit object. Must support `$unconstrain_draws()`.

Mathematics Given the matrix of unconstrained draws $\mathbf{U} \in \mathbb{R}^{S \times d}$ where $S$ is the number of posterior samples and $d$ is the unconstrained dimension:

$$\text{dim} = d, \qquad \text{reference} = \frac{1}{S} \sum_{s=1}^{S} \mathbf{U}_{s,\cdot} \in \mathbb{R}^{d}$$

That is, the reference is the column-wise mean of the unconstrained draws (the posterior mean on the unconstrained scale).

Returns A named list with two elements:

Element	Type	Meaning
`dim`	integer (scalar)	Number of columns of the unconstrained draws matrix $d$.
`reference`	numeric vector (length $d$, unnamed)	Posterior mean on the unconstrained scale.

Notes

Calls posterior::as_draws_matrix(fit$unconstrain_draws()) then coerces to a base matrix via as.matrix().
If fit$unconstrain_draws() errors, gdpar_abort() is called with class "gdpar_input_error", forwarding the original message.
The reference vector is unname()-d to strip any column names.

`.gdpar_geom_bridge_core(model, stan_data, dim, fisher = NULL, reference = NULL, engine_fit = NULL, has_hessian = TRUE, extra = list())`

Purpose Internal factory that assembles the gdpar_geom_bridge list object from its constituent parts. Called by both gdpar_geom_bridge() (the post-hoc bridge) and gdpar_geom_fit() (the one-call entry). It constructs the target (for re-sampling diagnostics) and geom_target (for the geometry engine) components and attaches metadata.

Arguments

Argument	Type	Meaning
`model`	`CmdStanModel`	A compiled cmdstan model (with methods) that can be re-sampled. Used for the diagnostic `target`.
`stan_data`	named list	The Stan data list for the model.
`dim`	integer-coercible (scalar)	The unconstrained dimension $d$.
`fisher`	function or `NULL`	An optional function $\theta \mapsto F(\theta)$ returning the expected Fisher information matrix. Required by the sub-Riemannian geometry level. Must be `NULL` or a function; validated at entry.
`reference`	numeric vector or `NULL`	The unconstrained reference position (warm-start). Can be `NULL` if no warm-start is available.
`engine_fit`	`CmdStanMCMC` / `CmdStanFit` or `NULL`	A methods-enabled cmdstan fit to back `geom_target`. When `NULL`, the `geom_target` is derived from `model` instead (a cheap one-iteration sample would expose the methods).
`has_hessian`	logical (scalar)	Whether the Hessian method is available. When `FALSE`, the `$hessian` closure in `geom_target` is set to `NULL` so that the SoftAbs level degrades gracefully instead of erroring at call time.
`extra`	named list	Additional elements to merge into the returned object (e.g., extra metadata).

Returns An object of class c("gdpar_geom_bridge", "list") containing:

Element	Type / Class	Meaning
`target`	list with `$model`, `$dim`, `$data`	The re-samplable diagnostic target.
`geom_target`	`gdpar_geom_target` object	The engine sampling target on the unconstrained scale (exposes `$log_prob`, `$grad_log_prob`, and optionally `$hessian`).
`fisher`	function or `NULL`	The expected-Fisher function, if supplied.
`reference`	numeric vector or `NULL`	Warm-start position.
`dim`	integer	The unconstrained dimension.
`model`	`CmdStanModel`	The methods-enabled compiled model.
`stan_data`	named list	The Stan data list.
`...`	any	Any extra elements from the `extra` argument.

Notes

fisher is validated: if non-NULL and not a function, gdpar_abort() is called with class "gdpar_input_error".
geom_target is created by calling gdpar_geom_target(object = src, dim = dim, data = stan_data) where src is engine_fit when non-NULL, otherwise model.
When has_hessian is FALSE, geom_target$hessian is explicitly set to NULL so that downstream consumers (the SoftAbs level) degrade gracefully.
The dim argument is coerced to integer in both the target sub-list and the top-level object.

`gdpar_geom_bridge(object, fisher = NULL, reference = NULL, hessian = TRUE, methods_seed = 1L, ...)`

Purpose Exported function. The durable, path-agnostic core of the Block RG integration (RG.6 part ii). Takes an already-fitted gdpar_fit object, enables standalone log-probability / gradient / Hessian methods on its cmdstan fit, derives the unconstrained dimension and posterior-mean warm-start, recompiles a re-samplable CmdStanModel from the fit's own Stan source, and returns a gdpar_geom_bridge object consumable by gdpar_geom_orchestrate(). It never touches the gdpar() default fit path, so the golden tests remain bit-identical.

Arguments

Argument	Type	Meaning
`object`	`gdpar_fit`	A fitted gdpar model (result of `gdpar()`). Must carry `$fit` (a `CmdStanMCMC` / `CmdStanFit`) and `$stan_data`.
`fisher`	function or `NULL`	Optional function of the unconstrained $\theta$ returning the expected Fisher information matrix. Required by the sub-Riemannian level. Use `gdpar_geom_fisher_simulator()` when no closed form exists.
`reference`	numeric vector or `NULL`	Optional unconstrained reference position. Defaults to the posterior mean extracted from the fit's draws. If supplied, must have length equal to the unconstrained dimension $d$.
`hessian`	logical (length 1)	Whether to compile the standalone Hessian method (needed by the Riemannian SoftAbs level). Defaults to `TRUE`.
`methods_seed`	integer (length 1)	Seed forwarded to `init_model_methods()`. Defaults to `1L`.
`...`	—	Reserved for future extension; currently unused.

Algorithm

Input validation: verify object inherits from "gdpar_fit", that $fit is a CmdStanMCMC / CmdStanFit, that $stan_data is non-NULL, that reference (if non-NULL) is numeric.
Dependency check: call require_suggested("cmdstanr", ...) and require_suggested("posterior", ...).
Enable methods: call .gdpar_geom_enable_methods(csfit, hessian, seed) on the fitted cmdstan object. This returns meth with meth$has_hessian.
Unconstrained summary: call .gdpar_geom_unconstrained_summary(csfit) to obtain the dimension $d$ and the posterior-mean reference. If reference was NULL, use the posterior mean; otherwise validate its length equals $d$.
Recompile model: call cmdstanr::cmdstan_model(cmdstanr::write_stan_file(csfit$code()), compile_model_methods = TRUE) to get a fresh, re-samplable CmdStanModel from the fit's Stan source. Cmdstanr's content-hash cache makes this a cache hit when the methods variant already exists.
Assemble bridge: call .gdpar_geom_bridge_core(model, stan_data, d, fisher, reference, engine_fit = csfit, has_hessian = meth$has_hessian).

Returns An object of class c("gdpar_geom_bridge", "list") — see .gdpar_geom_bridge_core() for the full structure. Feed (target, geom_target, fisher, reference) to gdpar_geom_orchestrate().

Notes

Path-agnostic: works for K-individual, multi-coordinate, and single-coordinate gdpar_fit objects, provided they carry $fit and $stan_data.
Uses the fit's own $code() Stan source for recompilation, guaranteeing model identity.
Cmdstanr's hash-based compilation cache means recompilation is typically a no-op cache hit.
Errors at any validation or compilation step raise gdpar_abort() with class "gdpar_input_error".
This function is the tool RG.7 points at the real Tweedie count of benchmark 9.2.O.

`print.gdpar_geom_bridge(x, ...)`

Purpose S3 print method for objects of class "gdpar_geom_bridge". Provides a concise, human-readable summary of the bridge's key properties.

Arguments

Argument	Type	Meaning
`x`	`gdpar_geom_bridge`	The bridge object to print.
`...`	—	Unused; present for S3 method compatibility.

Output (to console)

<gdpar_geom_bridge>
  unconstrained dim: <d>
  fisher supplied:   <TRUE/FALSE>
  reference:         <description>
  feed (target, geom_target, fisher, reference) to gdpar_geom_orchestrate()

Where <description> is:

"zeros (default)" when x$reference is NULL.
"length <n> (warm-start)" when x$reference is non-NULL, with <n> = length(x$reference).

Returns invisible(x) — the input object, unchanged.

Notes Pure side-effect (console output). No validation beyond what cat() and is.null() provide.

`.gdpar_geom_fit_resolve_K(formula, family, amm, W, data)`

Purpose Internal helper used by gdpar_geom_fit() to resolve the front-end inputs for the K-individual (distributional) model-building path. It reuses the same resolution logic as gdpar() without touching gdpar() itself, and returns the arguments that .gdpar_K_build() needs. Scoped to $K > 1$.

Arguments

Argument	Type	Meaning
`formula`	`gdpar_formula_set`, formula, or other	A `gdpar_bf()` formula set, a classic formula with AMM wrapper calls on the RHS, or (with a named `amm` list) a two-sided `y ~ ...` formula.
`family`	`gdpar_family`, `gdpar_family_multi`, or named list of `gdpar_family`	The distribution family. A single `gdpar_family` is broadcast across all K slots; a named list enables heterogeneous families per slot.
`amm`	`amm_spec` (default) or named list of `amm_spec`	The additive multiresolution specification. When `formula` is a `gdpar_formula_set`, `amm` must stay at its default. When a named list of `amm_spec`, it is used directly.
`W`	modulating basis or `NULL`	Optional modulating basis for the K-individual paths.
`data`	data frame	The data (not directly used in resolution but part of the interface contract).

Algorithm / Resolution Logic

The function handles three mutually exclusive input patterns:

Case 1 — formula is a gdpar_formula_set:

amm must be at its default (amm_spec()); otherwise error.
Derive amm_list_canonical via .gdpar_formula_set_to_amm_spec_list(formula, W).
Extract outcome_name from formula$outcome and formula_env from formula$env.

Case 2 — amm is a named list of amm_spec:

Each entry must be a named amm_spec; validated in a loop.
formula must be a two-sided formula (length(formula) == 3L); otherwise error.
amm_list_canonical <- amm.
outcome_name extracted from formula[[2L]].
formula_env <- environment(formula).

Case 3 — Classic formula with AMM wrapper calls on the RHS:

Detected via .gdpar_rhs_has_amm_calls(formula).
amm must be at its default; otherwise error.
Derive the first eligible parameter name from family$param_specs[[1]]$name.
Build a gdpar_formula_set via do.call(gdpar_formula_set, list(formula)) named with that parameter.
Derive amm_list_canonical from the formula set.
Extract outcome_name and formula_env from the formula set.

If none of the three cases applies, gdpar_abort() is called with a message indicating that only the K-individual path is supported (RG.6 scope).

Post-resolution checks:

K <- length(amm_list_canonical). If $K \leq 1$, abort: gdpar_geom_fit() requires $K > 1$.
Family resolution:
- If family is a named list of gdpar_family (detected by checking it's a list, not a gdpar_family or gdpar_family_multi, has non-empty unique names, and every element inherits from "gdpar_family"): call .gdpar_resolve_heterogeneous_family_K(family, names(amm_list_canonical)) to get family_promoted and family_id_k_vector.
- Otherwise: call .gdpar_promote_scope_per_observation(family, names(amm_list_canonical)) to get a single family_promoted; family_id_k_vector is NULL.

Returns A named list:

Element	Type	Meaning
`amm_list_canonical`	named list of `amm_spec`	Canonical AMM specification list (length $K$).
`family`	`gdpar_family` or `gdpar_family_multi`	The resolved (possibly promoted) family.
`outcome_name`	character (length 1)	Name of the response variable.
`formula_env`	environment	The environment associated with the formula (for evaluating terms).
`family_id_k_vector`	integer vector or `NULL`	Per-slot family index for heterogeneous family lists; `NULL` when a single family is broadcast.
`K`	integer (scalar)	Number of distributional slots ($K > 1$).

Notes

This function is the shared seam between gdpar_geom_fit() and gdpar()'s internal K path: it reuses .gdpar_K_build() as the single source of model assembly.
The family_is_named_list detection checks: is.list(family), not inheriting from "gdpar_family" or "gdpar_family_multi", non-NULL names, all names non-empty (nzchar), no duplicated names, and every element inheriting from "gdpar_family".
All error paths use gdpar_abort() with class "gdpar_input_error".
The error for $K \leq 1$ includes data = list(K = K) for programmatic access to the slot count.

`gdpar_geom_fit(formula, family, amm, W, data, prior, anchor, skip_id_check, parametrization, parametrization_a, parametrization_W, id_check_rigor, group, fisher, budget, criteria, entry_level, level_map, reference, speed, rest_mass, laplace_fallback, laplace_draws, n_grid, seed, verbose, ...)`

Purpose
High-level entry point for fitting General Dynamic Parameter (GDP) models using geometry-adaptive optimization (opt-in controller) with reference anchoring. Builds the hierarchical Bayesian model once, then runs the geometry-adaptive controller (an alternative to the default gdpar() fit path). Requires cmdstanr and posterior packages.

Arguments

formula : formula – Model formula specifying response and predictors.
family : gdpar_family – Distribution family (default: gdpar_family("gaussian")).
amm : amm_spec – Adaptive Mixture of Multivariate normals specification (default: amm_spec()).
W : matrix or NULL – Weight matrix for the model (default: NULL).
data : data.frame – Data frame containing variables referenced in formula.
prior : list or NULL – Prior specifications for parameters (default: NULL).
anchor : character or numeric – Anchor for reference (default: "prior_mean").
skip_id_check : logical – If TRUE, skip identity checks (default: FALSE).
parametrization : character – One of "auto", "ncp" (non-centered), "cp" (centered); default "auto".
parametrization_a : character or NULL – Overrides parametrization for a parameters; must be "ncp" or "cp" if set.
parametrization_W : character or NULL – Overrides parametrization for W parameters; must be "ncp" or "cp" if set.
id_check_rigor : character – One of "full" or "fast" (default: "full").
group : character or NULL – Grouping variable for hierarchical structure (default: NULL).
fisher : matrix, list, or NULL – Precomputed Fisher information or specification (default: NULL).
budget : numeric or NULL – Computational budget for optimizer (default: NULL).
criteria : list or NULL – Convergence criteria for optimizer (default: NULL).
entry_level : integer or NULL – Initial level in hierarchy for geometry controller (default: NULL).
level_map : list or NULL – Mapping of levels for geometry controller (default: NULL).
reference : numeric or NULL – Reference point for anchoring (default: NULL).
speed : numeric – Speed parameter for geometry controller (default: 10).
rest_mass : numeric – Rest mass parameter for geometry controller (default: 1).
laplace_fallback : logical – If TRUE, use Laplace approximation as fallback (default: FALSE).
laplace_draws : integer – Number of draws from Laplace fallback (default: 0L).
n_grid : integer or NULL – Grid size for geometry controller (default: NULL).
seed : integer – Random seed for reproducibility (default: 20260603L).
verbose : logical – If TRUE, print progress messages (default: TRUE).
... : Additional arguments passed to internal functions.

Mathematics
The function orchestrates a two-stage process:

Model building: Constructs a Stan model via .gdpar_K_build, yielding a compiled model and data.
Geometry-adaptive fitting:
- A throwaway single-iteration MCMC sample (chains=1, iter_warmup=1, iter_sampling=1) extracts the unconstrained parameter dimension $d$ and enables model methods (e.g., Hessian).
- A bridge object (.gdpar_geom_bridge_core) defines the target density $\pi(\theta)$ and geometry-informed target $\tilde{\pi}(\theta)$, optionally using a Fisher information matrix $\mathcal{I}(\theta)$.
- The geometry controller (gdpar_geom_orchestrate) performs adaptive sampling over a hierarchy of levels, using speed $v$ and rest mass $m$ to navigate the parameter space. It may fall back to Laplace approximation $\mathcal{N}(\hat{\theta}, \mathcal{I}(\hat{\theta})^{-1})$ if requested.

Returns
An object of class gdpar_geom_fit (inheriting from list) with elements:

orchestration : Output from gdpar_geom_orchestrate.
bridge : Bridge object from .gdpar_geom_bridge_core.
status : Character scalar – optimization status (e.g., "resolved").
level : Numeric scalar – level at which resolution occurred.
metric : List – geometry metric used.
draws : Matrix – posterior draws (samples × parameters).
laplace : List – Laplace fallback results (if used).
certificate : List – verification certificate (if applicable).
stan_data : List – data passed to Stan.
family : gdpar_family – family object.
K : Integer – number of mixture components/groups.
slot_names : Character vector – model slot names.
call : Matched call.

Notes

Requires packages cmdstanr and posterior; raises error if missing.
The throwaway sample discards all draws; its purpose is only to initialize methods and compute $d$.
parametrization="auto" lets internal logic choose between non-centered (ncp) and centered (cp) parametrization.
The geometry controller adapts over levels; entry_level and level_map control this hierarchy.
If laplace_fallback=TRUE, uses laplace_draws posterior samples from the Laplace approximation.
seed ensures reproducibility of the geometry controller's internal sampling.
The printed message states that this function never alters the default gdpar() fit path.

`print.gdpar_geom_fit(x, ...)`

Purpose
S3 print method for gdpar_geom_fit objects. Provides a concise summary of the fit.

Arguments

x : gdpar_geom_fit – Object to print.
... : Additional arguments (currently unused).

Mathematics
None.

Returns
Invisibly returns the input object x.

Notes

Dispatches automatically for objects of class gdpar_geom_fit.
Output varies based on x$status:
- If status == "resolved", prints the level and draws dimensions (if draws exist).
- Otherwise, prints verdict and prescription count from x$certificate (if present).
If x$laplace is not NULL, appends the Laplace fit quality label.
Does not modify the object; side effect is printing to console.

R/geometry_diagnostic.R

`gdpar_geometry_thresholds()`

Purpose

Returns the default decision thresholds for the rule-based classifier inside gdpar_geometry_diagnostic. These are exposed as data rather than hard-coded so that the calibration of Block RG.1 can be tuned against gdpar_geometry_suite and so users can re-calibrate for their own settings.

Arguments

None.

Returns

A named list of 13 numeric scalars:

Element	Default	Role
`divergent_rate_high`	0.01	Upper tolerable divergence rate before a pathology is flagged.
`funnel_ebfmi_low`	0.35	Minimum E-BFMI below which a funnel-type energy pathology is suspected.
`heavy_cond_max`	25	Maximum tolerable condition number for the heavy-tail class.
`treedepth_sat_high`	0.20	Upper tolerable fraction of iterations hitting `max_treedepth`.
`condition_high`	12	Condition-number cut above which anisotropy is flagged.
`step_scale_ratio_low`	0.10	Lower cut on the adapted-step-to-scale ratio.
`nslope_grows`	0.80	Slope threshold for the difficulty-vs-$n$ curve above which difficulty is deemed to grow with $n$ (quasi-determinism indicator).
`flat_var_high`	600	Variance cut above which a flat-direction pathology is flagged.
`boundary_prox_high`	0.02	Fraction of draws within `boundary_eps` of a boundary above which boundary pile-up is flagged.
`boundary_eps`	0.01	Distance defining proximity to a parameter boundary.
`multimodal_high`	2.5	Cut on the multimodality signal above which multimodality is flagged.
`heavy_kurtosis_high`	1.8	Excess-kurtosis cut above which heavy tails are flagged.
`target_ess`	400	Target effective sample size used for cost extrapolation.

Notes

The defaults were calibrated in sub-phase RG.1.c (session B9.23) against a synthetic suite over a difficulty × pilot-budget × replica grid. Thresholds were proposed by a data-driven Youden cut, regularised to interpretable values on a calibration fold, and validated out-of-sample on a held-out fold (held-out balanced accuracy rose from 0.60 to 0.89).
Versus the initial hand-set values, several cuts were tightened to compensate for signal attenuation in the short pilots: condition_high 50 → 12, heavy_kurtosis_high 3 → 1.8, boundary_prox_high 0.10 → 0.02, nslope_grows 0.30 → 0.80, funnel_ebfmi_low 0.25 → 0.35, heavy_cond_max 8 → 25, flat_var_high 1000 → 600, multimodal_high 2 → 2.5.
The calibration is against an idealised synthetic suite and is not claimed to be optimal on real posteriors. The funnel and heavy-tail classes remain mutually confusable (per-class recall ≈ 0.6–0.7).
No side effects; no errors raised; pure data factory.

`gdpar_geometry_diagnostic(target, n_grid = NULL, difficulty = NULL, pilot_warmup = 150L, pilot_sampling = 150L, chains = 4L, adapt_delta = 0.8, max_treedepth = 10L, seed = 20260602L, thresholds = NULL, verbose = TRUE, ...)`

Purpose

Top-level exported entry point of the geometry-diagnostic subsystem. It is an opt-in forensic probe: it runs a sequence of short pilot Stan fits at one or more data sizes $n$, extracts sampler- and posterior-level "signals" from each pilot, fits a difficulty curve across $n$, classifies the posterior geometry into a pathology/remedy/level triple, estimates a computational cost, and assembles everything into an S3 object of class gdpar_geometry_diagnostic. It does not modify any fit.

Arguments

Argument	Type	Meaning
`target`	various (see `.gdpar_geom_normalize_target`)	The posterior target to probe: a `gdpar_geometry_target` suite object, a `list(type = "gdpar", formula, amm, data)` gdpar specification, a `list(stan_code, stan_data[, data_n_fn, bounds])` raw Stan target, or a `list(model, data[, data_n_fn, bounds])` with a compiled `CmdStanModel`.
`n_grid`	`numeric` / `NULL`	Grid of data sizes at which to run pilots. If `NULL`, the default grid supplied by the normalised target is used. Coerced to numeric, then `sort`-ed and de-duplicated via `unique`.
`difficulty`	various / `NULL`	Difficulty level forwarded to a suite target's `make(n, diff)`. Ignored for non-suite targets.
`pilot_warmup`	`integer` (count)	Warmup iterations per pilot fit. Asserted strictly positive integer via `assert_count`.
`pilot_sampling`	`integer` (count)	Sampling (post-warmup) iterations per pilot fit. Asserted via `assert_count`.
`chains`	`integer` (count)	Number of Markov chains per pilot. Asserted via `assert_count`.
`adapt_delta`	`numeric` scalar in $(0,1)$	Stan `adapt_delta`. Asserted via `assert_numeric_scalar` with `lower = 0, upper = 1`.
`max_treedepth`	`integer` (count)	Stan `max_treedepth`. Asserted via `assert_count`.
`seed`	`integer` (count)	Base RNG seed. Pilot $i$ receives seed `seed + i`. Asserted via `assert_count`.
`thresholds`	`list` / `NULL`	Classification thresholds. If `NULL`, defaults are obtained from `gdpar_geometry_thresholds()`.
`verbose`	`logical` scalar	If `TRUE`, emits an informational `gdpar_optin_message` describing the probe before running. Manually validated (must be length-1 logical).
`...`	forwarded	Extra arguments passed to `.gdpar_geom_run_pilot` (and thence to the Stan sampler / target `make`).

Mathematics

For each element $n_i$ of the sorted, de-duplicated grid $\{n_1, \dots, n_K\}$, a pilot is run with seed $s_i = \texttt{seed} + i$:

$$ \text{pilot}_i = \texttt{.gdpar_geom_run_pilot}(\text{norm},; n_i,; \text{controls},; s_i,; \dots) $$

Each pilot yields a signals sub-list; these are stacked row-wise into a data frame $\mathbf{S}$ with one row per grid point and columns:

$$ \mathbf{S} = \begin{bmatrix} n & \text{divergent_rate} & \text{ebfmi_min} & \text{treedepth_sat_rate} & \text{condition_number} & \text{step_scale_ratio} & \lambda_{\max,\text{cov}} & \text{mean_leapfrog} & \text{multimodality} & \text{heavy_kurtosis} & \text{boundary_proximity} & \text{failed} \end{bmatrix} $$

A difficulty curve is then estimated:

$$ \text{ncurve} = \texttt{.gdpar_geom_difficulty_curve}(\mathbf{S},; \text{thresholds}) $$

Classification uses the largest pilot's signals (pilot_max = pilots[[K]]), except that multimodality — being $n$-invariant — is aggregated across all pilots by taking the maximum (with na.rm = TRUE):

$$ \text{multimodality}_{\text{classify}} = \max_{i=1}^{K} \mathbf{S}_i[\text{multimodality}] $$

The classification and cost are then:

$$ \text{cls} = \texttt{.gdpar_geom_classify}(\text{sig_classify},; \text{ncurve},; \text{culprit},; \text{thresholds}) $$

$$ \text{cost} = \texttt{.gdpar_geom_cost}(\text{pilot_max},; \text{ncurve}) $$

Returns

A list with S3 class c("gdpar_geometry_diagnostic", "list") and elements:

Element	Content
`pathology`	Classified pathology string (from `cls$pathology`).
`confidence`	Classification confidence (from `cls$confidence`).
`recommended_geometry`	Recommended remedy (from `cls$remedy`).
`geometry_level`	Recommended sampler level (from `cls$level`).
`signals`	Data frame $\mathbf{S}$ (one row per grid point).
`difficulty_curve`	Result of `.gdpar_geom_difficulty_curve`.
`culprit`	Culprit information from the largest pilot (`pilot_max$culprit`).
`cost`	Result of `.gdpar_geom_cost`.
`rule_trace`	Trace from the classifier (`cls$trace`).
`reproducibility`	A list: `seed`, `n_grid`, `controls` (warmup/sampling/chains/adapt_delta/max_treedepth), `target_id` (`norm$meta$id`), `gdpar_version` (`utils::packageVersion("gdpar")` as character), `cmdstan_version` (`cmdstanr::cmdstan_version()` wrapped in `tryCatch`, yielding `NA_character_` on error).
`ground_truth`	Present only if `norm$meta$ground_truth` is non-`NULL`; copied verbatim.
`correct`	Present only if ground truth exists; `identical(cls$pathology, norm$meta$ground_truth$pathology)`.

Notes

Input validation order: assert_count is called on pilot_warmup, pilot_sampling, chains, max_treedepth, and seed; assert_numeric_scalar on adapt_delta (bounds $(0,1)$); verbose is checked manually (length-1 logical, else gdpar_abort with class gdpar_input_error).
Suggested-package gating: require_suggested("cmdstanr", ...) and require_suggested("posterior", ...) are called before any work begins.
Threshold defaulting: when thresholds is NULL, gdpar_geometry_thresholds() is called exactly once.
Grid coercion: n_grid (whether user-supplied or defaulted) is passed through sort(unique(as.numeric(n_grid))), so non-numeric input will be coerced (and may error inside as.numeric), duplicates are removed, and order is ascending.
Verbose message: the informational message reports length(n_grid) pilot fits, length(n_grid) sizes, chains chains, and pilot_warmup + pilot_sampling iterations. It is emitted with class gdpar_optin_message via gdpar_inform.
Multimodality aggregation: suppressWarnings(max(signals$multimodality, na.rm = TRUE)) is used; if all values are NA, this yields -Inf (with the warning suppressed).
Ground-truth comparison: correct uses identical, so the classified pathology string must match the ground-truth string exactly (including type and encoding).
No side effects beyond the optional informational message and the Stan fits launched by .gdpar_geom_run_pilot.

`.gdpar_geom_normalize_target(target, difficulty)`

Purpose

Internal three-way (effectively four-way) adapter that normalises the heterogeneous target argument into a uniform list consumed by the pilot runner. It inspects the class and structure of target and dispatches to one of four branches: suite target, gdpar specification, raw Stan code, or compiled CmdStanModel.

Arguments

Argument	Type	Meaning
`target`	various	The user-supplied target. See the four recognised forms below.
`difficulty`	various / `NULL`	Difficulty level; only consumed by the suite-target branch.

Mathematics

For the gdpar-specification branch, the default grid is three logarithmically spaced points between a lower bound and the full data size:

$$ n_{\text{grid,default}} = \mathrm{unique}!\left(\mathrm{round}!\left(\exp!\left(\mathrm{seq}!\left(\log L,; \log n_{\text{full}},; \text{length.out} = 3\right)\right)\right)\right), \quad L = \max!\left(30,; \left\lceil \frac{n_{\text{full}}}{4} \right\rceil\right) $$

where $n_{\text{full}} = \texttt{nrow(target\$data)}$.

Returns

A normalised list whose common fields are kind (a string tag), n_grid_default (numeric vector of default grid sizes), and meta (a list with id, bounds, ground_truth). Branch-specific additional fields:

Branch	`kind`	`n_grid_default`	`meta`	Extra fields
`inherits(target, "gdpar_geometry_target")`	`"suite"`	`target$n_grid`	`id = target$id`, `bounds = target$bounds`, `ground_truth = list(pathology, geometry_remedy, culprit, difficulty_scales_with_n)` extracted from the target	`make = function(n) target$make(n, diff)` where `diff = difficulty %
`is.list(target) && identical(target$type, "gdpar")`	`"gdpar"`	Log-spaced 3-point grid (formula above)	`id = "gdpar_spec"`, `bounds = NULL`, `ground_truth = NULL`	`gdpar_target = target`, `n_full = nrow(target$data)`
`is.list(target) && !is.null(target$stan_code)`	`"stan"`	`1`	`id = "stan_target"`, `bounds = target$bounds %
`is.list(target) && (inherits(target$model, "CmdStanModel") \|\| !is.null(target$model))`	`"model"`	`1`	`id = "cmdstan_model"`, `bounds = target$bounds %

Notes

Branch order matters: the suite-target branch is tested first (via inherits), then the gdpar-specification branch (via identical(target$type, "gdpar")), then the stan_code branch, then the model branch. A list that satisfies multiple conditions is dispatched to the earliest matching branch.
gdpar-specification validation: if target$data or target$formula is NULL, the function aborts with class gdpar_input_error and message "A gdpar target must supply at least 'formula' and 'data'.".
Model-branch guard: the condition inherits(target$model, "CmdStanModel") || !is.null(target$model) is logically equivalent to !is.null(target$model) (since a non-NULL value satisfies the second disjunct regardless of class). The inherits check is therefore redundant but harmless.
Unrecognised target: if none of the four branches matches, the function calls gdpar_abort with class gdpar_input_error and a message listing the four accepted forms.
%||% operator: used for bounds and data_n_fn defaults; the comment notes it is "defined canonically in R/preflight_multi.R".
difficulty for non-suite branches: silently ignored.
ground_truth for non-suite branches: always NULL, meaning the returned diagnostic object will not carry ground_truth or correct fields.
data_n_fn default: for the stan and model branches, if target$data_n_fn is not supplied, a closure is created that ignores its argument n and returns the fixed target$stan_data (or target$data). This means the raw-Stan and compiled-model branches are effectively single-$n$ targets (n_grid_default = 1) unless the user supplies a custom data_n_fn.
No side effects; pure transformation (the make closure captures target and diff but is not invoked here).

`.gdpar_geom_run_pilot(norm, n_knob, controls, seed, ...)`

Purpose

Top-level pilot runner for the geometry diagnostic. It attempts to draw a posterior sample at a given knob size n_knob, guards against sampling failures, and—on success—delegates to signal extraction and culprit localization. On failure it returns a sentinel structure so downstream code can proceed uniformly.

Arguments

Argument	Type	Meaning
`norm`	list	A "norm" descriptor of the target model. Must contain `$kind` (character), `$meta$bounds` (parameter bounds), and kind-specific fields consumed by `.gdpar_geom_sample`.
`n_knob`	numeric / integer	The knob value (typically data size) at which to run the pilot.
`controls`	list	Sampling control parameters. Must include `$max_treedepth`; other fields (`$chains`, `$warmup`, `$sampling`, `$adapt_delta`) are forwarded to the sampler.
`seed`	integer	Random seed passed to the sampler.
`...`	any	Extra arguments forwarded to `.gdpar_geom_sample` and ultimately to the underlying sampler or `gdpar()`.

Returns

A list with elements:

Element	Type	Description
`failed`	logical	`TRUE` if sampling raised an error; `FALSE` otherwise.
`signals`	list	Diagnostic signal list (real signals on success, all-`NA` via `.gdpar_geom_na_signals()` on failure).
`culprit`	list	Culprit localization (real on success, empty via `.gdpar_geom_empty_culprit()` on failure).
`elapsed`	numeric	Elapsed wall-clock seconds (present only when `failed = FALSE`).

Notes

Sampling errors are caught with tryCatch. On error, gdpar_warn() is called with class "gdpar_diagnostic_warning" and a message of the form "Geometry pilot at n = <n_knob> failed: <message>.".
The format(n_knob) call in the warning message uses base R's generic format, so the rendered string depends on the class of n_knob.
.gdpar_geom_empty_culprit() is called from another section; it is invoked here but not defined here.

`.gdpar_geom_sample(norm, n_knob, controls, seed, ...)`

Purpose

Dispatches posterior sampling according to norm$kind. Supports four model kinds: "gdpar" (delegates to .gdpar_geom_sample_gdpar), "suite" (builds a Stan instance from a generator), "stan" (uses pre-supplied Stan code and a data-size function), and a fallback "model" kind (uses a pre-compiled cmdstanr model object).

Arguments

Argument	Type	Meaning
`norm`	list	Model descriptor. Fields used depend on `norm$kind` (see below).
`n_knob`	numeric / integer	Knob value passed to the data-generating function or used for subsetting.
`controls`	list	Must contain `$chains`, `$warmup`, `$sampling`, `$adapt_delta`, `$max_treedepth`.
`seed`	integer	Seed for the sampler.
`...`	any	Extra arguments forwarded to `mod$sample()` or `gdpar()`.

Returns

The return value of .gdpar_geom_pack_fit(cs_fit, elapsed): a list with cs_fit, pm, chain_id, and elapsed.

Notes

norm$kind == "gdpar": immediately returns .gdpar_geom_sample_gdpar(...).
norm$kind == "suite": calls norm$make(n_knob) to obtain a list with $stan_code and $stan_data; compiles via .gdpar_geom_compile(code).
norm$kind == "stan": uses norm$stan_code directly and calls norm$data_n_fn(n_knob) for data; compiles via .gdpar_geom_compile(code).
Fallback (e.g. "model"): uses norm$model (a pre-compiled cmdstanr model) and norm$data_n_fn(n_knob) for data. No compilation step.
The mod$sample() call sets refresh = 0, show_messages = FALSE, show_exceptions = FALSE, and parallel_chains = controls$chains (i.e., one parallel chain per chain).
Elapsed time is measured with Sys.time() / difftime(..., units = "secs") and coerced to numeric.

`.gdpar_geom_sample_gdpar(norm, n_knob, controls, seed, ...)`

Purpose

Specialized sampler for norm$kind == "gdpar". Subsets the full gdpar target dataset to n_knob rows (randomly), assembles an argument list from the stored target, and invokes gdpar() via do.call.

Arguments

Argument	Type	Meaning
`norm`	list	Must contain `$gdpar_target` (a list of arguments for `gdpar()`, including `$data`) and `$n_full` (integer, total number of data rows).
`n_knob`	numeric / integer	Desired data subset size.
`controls`	list	Sampling controls (`$warmup`, `$sampling`, `$chains`, `$adapt_delta`, `$max_treedepth`).
`seed`	integer	Seed for `gdpar()`.
`...`	any	Extra arguments merged into the `gdpar()` call.

Mathematics

Row indices are drawn uniformly without replacement:

$$ \mathcal{I} \sim \text{UniformSubsample}\bigl({1, \ldots, n_{\text{full}}},; \min(n_{\text{knob}},, n_{\text{full}})\bigr) $$

and the data is subsetted as $\text{sub} = \text{data}[\mathcal{I}, ]$.

Returns

The return value of .gdpar_geom_pack_fit(fit$fit, elapsed), where fit is the object returned by gdpar().

Notes

The argument list is built by copying norm$gdpar_target and then overriding: type is set to NULL (removed), data is replaced by the subset, and all control/seed fields are set from controls/seed.
skip_id_check = TRUE and verbose = FALSE are hardcoded into the args, suppressing gdpar's internal identifier validation and verbose output.
refresh is set to 0L.
Extra arguments in ... are appended via c(args, list(...)) and thus override any同名 fields already in args.

`.gdpar_geom_pack_fit(cs_fit, elapsed)`

Purpose

Normalizes a cmdstanr fit object into a compact list containing the posterior draws matrix (excluding lp__), per-draw chain IDs, and elapsed time. This standardized structure is consumed by signal extraction and culprit localization.

Arguments

Argument	Type	Meaning
`cs_fit`	CmdStanMCMC	A cmdstanr fit object returned by `mod$sample()` or `gdpar()$fit`.
`elapsed`	numeric	Wall-clock seconds elapsed during sampling.

Returns

A list:

Element	Type	Description
`cs_fit`	CmdStanMCMC	The original fit object (retained for later diagnostic queries).
`pm`	numeric matrix	Posterior draws matrix; rows = draws, columns = parameters (excluding `lp__`).
`chain_id`	integer vector	Chain identifier for each draw (from `posterior::as_draws_df()$`.chain``).
`elapsed`	numeric	Passed through unchanged.

Notes

lp__ is explicitly removed from the variable set via setdiff(vars, "lp__").
posterior::as_draws_matrix() followed by as.matrix() ensures pm is a plain numeric matrix.
The chain ID extraction relies on the .chain column produced by posterior::as_draws_df().

`.gdpar_geom_compile(stan_code)`

Purpose

Writes Stan source code to a file and compiles it into a cmdstanr model object. Relies on cmdstanr's file-hash caching so that repeated diagnostic calls with identical code do not recompile.

Arguments

Argument	Type	Meaning
`stan_code`	character (scalar)	Stan model source code as a string.

Returns

A CmdStanModel object.

Notes

Uses cmdstanr::write_stan_file(stan_code) to materialize the code to disk (typically a temp file with content hashing).
Then cmdstanr::cmdstan_model(f) compiles (or retrieves the cached compilation).
No explicit error handling; compilation failures propagate to the caller.

`.gdpar_geom_extract_signals(cs_fit, pm, chain_id, bounds, max_treedepth, elapsed)`

Purpose

Extracts a comprehensive set of size-invariant geometric diagnostic signals from a single pilot fit. These signals characterize sampler health (divergences, tree-depth saturation, EBFMI), posterior geometry (condition number, step-scale ratio, multimodality, heavy kurtosis, boundary proximity), and computational cost (leapfrog count, elapsed time).

Arguments

Argument	Type	Meaning
`cs_fit`	CmdStanMCMC	The cmdstanr fit object.
`pm`	numeric matrix	Posterior draws matrix (rows = draws, cols = parameters, `lp__` excluded).
`chain_id`	integer vector	Chain ID per draw.
`bounds`	list / matrix	Parameter bounds, forwarded to `.gdpar_geom_boundary_prox()`.
`max_treedepth`	integer	Maximum NUTS tree depth; used to compute treedepth saturation rate.
`elapsed`	numeric	Elapsed wall-clock seconds.

Mathematics

Let $N$ be the total number of post-warmup draws and $d$ the number of parameters.

Sampler diagnostics:

$$ \text{divergent_rate} = \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}[\text{divergent}_i = 1] $$

$$ \text{treedepth_sat_rate} = \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}[\text{treedepth}_i \geq \text{max_treedepth}] $$

$$ \text{ebfmi_min} = \min_{c} \text{EBFMI}_c $$

$$ \text{mean_leapfrog} = \frac{1}{N}\sum_{i=1}^{N} n_{\text{leapfrog},i}, \qquad \text{step_size} = \frac{1}{N}\sum_{i=1}^{N} \text{stepsize}_i $$

Covariance-based geometry:

$$ \Sigma = \text{Cov}(\text{pm}), \qquad \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d ;\text{(eigenvalues of } \Sigma \text{)} $$

$$ \lambda_{\max} = \lambda_1, \qquad \lambda_{\min} = \max(\lambda_d,, 0) $$

$$ \kappa = \frac{\lambda_{\max}}{\max(\lambda_{\min},, \varepsilon)}, \qquad \varepsilon = \texttt{.Machine$double.eps} $$

$$ \sigma_j = \sqrt{\max(\Sigma_{jj},, \varepsilon)}, \qquad \text{step_scale_ratio} = \frac{\text{step_size}}{\min_j \sigma_j} $$

Returns

A named list:

Element	Type	Description
`divergent_rate`	numeric	Fraction of divergent transitions.
`ebfmi_min`	numeric	Minimum EBFMI across chains.
`treedepth_sat_rate`	numeric	Fraction of draws hitting `max_treedepth`.
`condition_number`	numeric	$\kappa$ as above; `NA_real_` if eigen-decomposition failed.
`lambda_max_cov`	numeric	Largest eigenvalue of the posterior covariance.
`step_scale_ratio`	numeric	Mean step size divided by smallest marginal SD.
`mean_leapfrog`	numeric	Mean number of leapfrog steps per draw.
`multimodality`	numeric	From `.gdpar_geom_multimodality()`.
`heavy_kurtosis`	numeric	From `.gdpar_geom_heavy_kurtosis()` (defined in another section).
`boundary_proximity`	numeric	From `.gdpar_geom_boundary_prox()` (defined in another section).
`elapsed`	numeric	Passed through.
`n_sampling`	integer	Total number of post-warmup draws ($N$).

Notes

Sampler diagnostics are retrieved via cs_fit$sampler_diagnostics(format = "draws_df") and cs_fit$diagnostic_summary(quiet = TRUE).
ebfmi_min is wrapped in suppressWarnings(min(ds$ebfmi)).
Eigen-decomposition uses eigen(cv, symmetric = TRUE, only.values = TRUE) inside a tryCatch that returns NA_real_ on error. If all eigenvalues are NA, both lambda_max and lambda_min are set to NA_real_, and condition_number is NA_real_.
lambda_min is clamped to be non-negative via max(min(ev), 0).
The condition number denominator uses max(lambda_min, .Machine$double.eps) to avoid division by zero.
Marginal SDs use sqrt(pmax(diag(cv), .Machine$double.eps)) to avoid zero or negative variances from numerical issues.

`.gdpar_geom_na_signals()`

Purpose

Returns a template signal list with every field set to NA_real_. Used by .gdpar_geom_run_pilot when sampling fails, so that downstream consumers receive a structurally complete (but missing-valued) signal object.

Arguments

None.

Returns

A named list with all twelve signal fields set to NA_real_:

divergent_rate, ebfmi_min, treedepth_sat_rate, condition_number, lambda_max_cov, step_scale_ratio, mean_leapfrog, multimodality, heavy_kurtosis, boundary_proximity, elapsed, n_sampling.

Notes

The field names and order exactly mirror those produced by .gdpar_geom_extract_signals().
All values are explicitly NA_real_ (typed missing), not NA or NA_integer_.

`.gdpar_geom_multimodality(pm, chain_id)`

Purpose

Computes a between-chain mean-separation signal relative to within-chain spread. Chains that settle in different modes produce large between-chain mean differences, inflating this metric. The design is intended to be robust to short pilot runs.

Arguments

Argument	Type	Meaning
`pm`	numeric matrix	Posterior draws matrix; rows = draws, columns = parameters.
`chain_id`	integer vector	Chain identifier for each row of `pm`. Length must equal `nrow(pm)`.

Mathematics

Let $C$ be the number of unique chains and $d$ the number of columns (parameters). For each coordinate $j \in \{1, \ldots, d\}$:

$$ \bar{x}_j^{(c)} = \frac{1}{n_c}\sum_{i:,\text{chain}_i = c} \text{pm}_{ij}, \qquad s_j^{(c)} = \text{sd}\bigl({\text{pm}_{ij} : \text{chain}_i = c}\bigr) $$

$$ \bar{s}_j = \frac{1}{C}\sum_{c=1}^{C} s_j^{(c)} \quad (\text{with } \texttt{na.rm=TRUE}) $$

If $\bar{s}_j$ is not finite or $\bar{s}_j \leq 0$, the score for coordinate $j$ is $0$. Otherwise:

$$ m_j = \frac{\text{sd}\bigl(\bar{x}_j^{(1)}, \ldots, \bar{x}_j^{(C)}\bigr)}{\bar{s}_j} $$

The returned multimodality signal is:

$$ M = \max_{j} m_j \quad (\text{with } \texttt{na.rm=TRUE}) $$

Returns

A single numeric value. Returns 0 if fewer than 2 unique chains are present.

Notes

Early return of 0 (not NA) when length(unique(chain_id)) < 2.
Per-chain means and SDs are computed with vapply over the unique chain set; the order of chains is determined by unique(chain_id).
Within-chain spread mean(csds, na.rm = TRUE) averages the per-chain standard deviations (not a pooled variance).
The guard !is.finite(within) || within <= 0 returns 0 for that coordinate, preventing division by zero or non-finite values.
The final max(scores, na.rm = TRUE) tolerates NA scores from individual coordinates.

Geometry Diagnostic — Culprit Localisation, Difficulty Curve, and Rule-Based Classifier

`.gdpar_geom_heavy_kurtosis(pm)`

Purpose

Computes a scalar summary of tail heaviness across all parameter marginals in a posterior matrix. Used by the classifier as the heavy_kurtosis signal to detect funnel-neck or heavy-tail pathologies.

Arguments

Argument	Type	Meaning
`pm`	numeric matrix	Posterior draws matrix with rows = draws, columns = parameters.

Mathematics

For each column $j$ of pm, let $x = \mathrm{pm}[:,\,j]$, $\bar{x} = \frac{1}{N}\sum_i x_i$, and $\sigma^2 = \frac{1}{N}\sum_i (x_i - \bar{x})^2$ (population variance). If $\sigma^2 \le 0$, the column contributes $0$. Otherwise the excess kurtosis is

$$ \kappa_j = \frac{\frac{1}{N}\sum_i (x_i - \bar{x})^4}{\sigma^4} - 3. $$

The returned value is

$$ \max_j \kappa_j \quad (\text{with } \texttt{na.rm=TRUE}). $$

Returns

A single numeric scalar — the maximum excess kurtosis over all parameter columns.

Notes

Uses vapply with numeric(1), so type stability is guaranteed.
na.rm = TRUE in the final max means columns that produced NA (e.g. all-constant columns returning 0, or columns with NaN) do not propagate.
The variance estimator is the biased (population) estimator, not the sample ($N-1$) estimator, so kurtosis values are slightly inflated relative to the unbiased form for small $N$.

`.gdpar_geom_boundary_prox(cs_fit, bounds, eps = 0.01)`

Purpose

Measures how tightly the posterior draws of bounded parameters pile against their declared lower or upper bound. Returns the worst-case (maximum) proximity fraction across all bounded parameters, used as the boundary_proximity signal.

Arguments

Argument	Type	Meaning
`cs_fit`	CmdStanFit-like object	A fitted model object exposing a `draws()` method compatible with the `posterior` package.
`bounds`	named list of length-2 numeric vectors, or `NULL`	Each element is `c(lo, hi)` giving the declared bound for the parameter named by the list element.
`eps`	numeric scalar (default `0.01`)	Relative tolerance: a draw is "near" a bound if it lies within `eps * (hi - lo)` of that bound.

Mathematics

For each bounded parameter $n$ with bounds $[\ell_n, u_n]$ and range $r_n = u_n - \ell_n$, the proximity fractions are

$$ f_{\ell,n} = \frac{1}{N}\sum_i \mathbf{1}!\left[x_{i,n} \le \ell_n + \varepsilon, r_n\right], \qquad f_{u,n} = \frac{1}{N}\sum_i \mathbf{1}!\left[x_{i,n} \ge u_n - \varepsilon, r_n\right]. $$

The per-parameter score is $p_n = \max(f_{\ell,n},\, f_{u,n})$, and the returned value is $\max_n p_n$ (with na.rm = TRUE).

Returns

A numeric scalar in $[0, 1]$. Returns 0 if bounds is NULL or empty, or if no bounded parameter name matches a variable in the draws.

Notes

Variables are retrieved via posterior::variables(draws_arr). Parameters in bounds that are absent from the draws silently contribute 0.
posterior::subset_draws and posterior::as_draws_matrix are used to extract a single variable; the result is coerced with as.numeric.
If hi - lo is zero or negative, rng is non-positive and the comparisons x <= lo + eps * rng / x >= hi - eps * rng still execute but the geometric meaning is undefined.

`.gdpar_geom_empty_culprit()`

Purpose

Factory function returning an empty culprit-localisation data frame with the correct schema, used as a sentinel when no culprit is identified.

Arguments

None.

Returns

A data.frame with zero rows and three columns:

Column	Type
`parameter`	character
`mechanism`	character
`score`	numeric

Constructed with stringsAsFactors = FALSE.

Notes

No side effects. Pure constructor.

`.gdpar_geom_localize(cs_fit, pm, chain_id, bounds)`

Purpose

Localises the geometric "culprit" parameters responsible for a detected pathology by combining three independent mechanisms: (a) flat/anisotropic directions from the covariance spectrum, (b) divergence-neck parameters that separate divergent from non-divergent draws, and (c) boundary-piling of bounded parameters.

Arguments

Argument	Type	Meaning
`cs_fit`	CmdStanFit-like object	Provides `sampler_diagnostics(format = "draws_df")` for divergence information and `draws()` (via `.gdpar_geom_boundary_prox`).
`pm`	numeric matrix	Posterior draws matrix (rows = draws, columns = parameters). Column names are used as parameter names.
`chain_id`	(any)	Accepted but never used in the function body.
`bounds`	named list of length-2 numeric vectors, or `NULL`	Declared parameter bounds, as in `.gdpar_geom_boundary_prox`.

Mathematics

(a) Flat / anisotropic direction. Let $\Sigma = \mathrm{cov}(\mathrm{pm})$ with eigendecomposition $\Sigma = V \Lambda V^\top$ (computed via eigen(..., symmetric = TRUE)). Let $v_1$ be the first eigenvector (largest eigenvalue $\lambda_1$). The loading magnitude is $|v_{1,k}|$ for parameter $k$. The top $\min(2, p)$ parameters by loading receive score $|v_{1,k}| \cdot \lambda_1$.

(b) Divergence neck. Let $D_i \in \{0, 1\}$ indicate whether draw $i$ is divergent. For each parameter $j$ with finite, positive sample standard deviation $s_j$, compute the standardised mean shift

$$ \Delta_j = \frac{\left|,\overline{x_j^{\mathrm{div}}} - \overline{x_j^{\mathrm{nondiv}}},\right|}{s_j}. $$

Parameters with $\Delta_j > 0.5$ are flagged with mechanism "divergence_neck" and score $\Delta_j$.

(c) Boundary pile. For each bounded parameter, .gdpar_geom_boundary_prox is called with a single-element bounds list. Parameters with proximity $> 0.05$ are flagged with mechanism "boundary_pile".

Returns

A data.frame with columns parameter (character), mechanism (character), score (numeric), sorted by score in decreasing order, with rownames set to NULL. If no culprit rows are produced, returns .gdpar_geom_empty_culprit().

Mechanism labels:

Mechanism string	Source
`"flat_or_anisotropic_direction"`	Covariance eigenvector (a)
`"divergence_neck"`	Divergence separation (b)
`"boundary_pile"`	Boundary proximity (c)

Notes

The eigendecomposition is wrapped in tryCatch; if it fails, mechanism (a) is silently skipped.
Mechanism (b) is only evaluated when any(div) && !all(div) — i.e., at least one but not all draws are divergent. If all draws diverge, the mechanism is skipped entirely.
In mechanism (b), columns with non-finite or non-positive sd(x) are skipped via next.
In mechanism (c), .gdpar_geom_boundary_prox is called once per bounded parameter with a single-element sublist bounds[nm], so the eps default of 0.01 is always used.
chain_id is a dead parameter — it is accepted for interface consistency but has no effect on the output.

`.gdpar_geom_difficulty_curve(signals, thresholds)`

Purpose

Fits a log–log regression of condition number against data size $n$ to determine whether geometric difficulty grows with $n$. Produces the grows_with_n flag consumed by the classifier.

Arguments

Argument	Type	Meaning
`signals`	data.frame	Must contain columns `failed` (logical), `condition_number` (numeric), and `n` (numeric). Typically a stacked table of signals across multiple fit sizes.
`thresholds`	named list	Must contain `nslope_grows` (numeric): the slope threshold above which difficulty is declared to grow with $n$.

Mathematics

Rows are filtered to those with !failed, finite condition_number, and finite n. Let the filtered set be $\{(n_k, \kappa_k)\}$. If fewer than 2 rows or fewer than 2 distinct $n$ values remain, the result is NA. Otherwise, an ordinary least-squares fit is performed:

$$ \log \kappa = \alpha + \beta \log n + \varepsilon. $$

The slope $\beta$ is extracted as coef(fit)[2]. The flag grows_with_n is TRUE iff $\beta >$ thresholds$nslope_grows.

Returns

A named list:

Element	Type	Meaning
`slope`	numeric	The fitted exponent $\beta$, or `NA_real_` if insufficient data.
`grows_with_n`	logical or `NA`	`TRUE` if slope exceeds threshold; `NA` if insufficient data.
`note`	character or `NULL`	`"need at least two distinct sizes"` if insufficient data; `NULL` otherwise.

Notes

The filtering predicate uses is.finite on both condition_number and n, so NA, NaN, Inf, and -Inf rows are excluded.
length(unique(s$n)) < 2L checks for at least two distinct sizes; duplicate sizes are allowed in the regression.
stats::lm is used without error handling; if the regression matrix is degenerate (e.g., all identical n after filtering, which is already guarded), lm would still execute but the guard prevents this.
unname is applied to the slope coefficient to strip the "log(n)" name.

`.gdpar_geom_remedy_for(pathology)`

Purpose

Maps a pathology label to a recommended remedy (metric / sampler strategy) and a difficulty level. Called by .gdpar_geom_classify via its inner decide closure.

Arguments

Argument	Type	Meaning
`pathology`	character scalar	One of the recognised pathology labels (see table below).

Returns

A named list with two elements:

Element	Type	Meaning
`remedy`	character	The recommended remedy string.
`level`	integer	A difficulty/severity level (can be `NA_integer_`).

Mapping table:

`pathology`	`remedy`	`level`
`"isotropic"`	`"euclidean_diagonal"`	`0L`
`"anisotropic"`	`"euclidean_dense"`	`1L`
`"funnel"`	`"riemannian"`	`3L`
`"heavy_tails"`	`"finsler_relativistic"`	`4L`
`"quasi_deterministic"`	`"sub_riemannian"`	`5L`
`"multimodal"`	`"tempering"`	`6L`
`"boundary"`	`"boundary_reparam"`	`6L`
`"flat_direction"`	`"reparam_eliminate"`	`-1L`
(any other)	`"unknown"`	`NA_integer_`

Notes

Implemented via switch with a default fallback, so unrecognised inputs never error.
The level values are not contiguous (level 2 is unused), suggesting an ordinal scale where some slots are reserved or deprecated.

`.gdpar_geom_classify(sig, ncurve, culprit, th)`

Purpose

The central rule-based classifier that consumes all geometric signals, the difficulty-vs-$n$ curve, and (nominally) the culprit localisation, then emits a pathology label, confidence score, remedy, and a trace of which rules fired.

Arguments

Argument	Type	Meaning
`sig`	named list	Signal values. Expected elements: `heavy_kurtosis`, `condition_number`, `boundary_proximity`, `multimodality`, `lambda_max_cov`, `ebfmi_min`, `divergent_rate`, `step_scale_ratio`.
`ncurve`	named list	Output of `.gdpar_geom_difficulty_curve`; must contain `grows_with_n`.
`culprit`	(any)	Accepted but never used in the function body.
`th`	named list	Threshold values. Expected elements: `heavy_kurtosis_high`, `condition_high`, `boundary_prox_high`, `multimodal_high`, `flat_var_high`, `funnel_ebfmi_low`, `divergent_rate_high`, `step_scale_ratio_low`, `heavy_cond_max`, `nslope_grows` (the latter is used only indirectly via `ncurve`).

Mathematics

Two boolean predicates are precomputed:

$$ \texttt{kurt_high} = [\texttt{sig$heavy_kurtosis} \ge \texttt{th$heavy_kurtosis_high}], \qquad \texttt{cond_high} = [\texttt{sig$condition_number} \ge \texttt{th$condition_high}]. $$

The confidence mapping function sq(x, ref) maps a signal value $x$ relative to its threshold $\rho$ to a confidence in $[0, 1]$:

$$ \mathrm{sq}(x, \rho) = \begin{cases} 0.5 & \text{if } x \text{ or } \rho \text{ is non-finite, or } \rho \le 0, \\ \min!\left(1,; \max!\left(0,; \frac{x}{2\rho} + 0.25\right)\right) & \text{otherwise.} \end{cases} $$

This yields confidence $0.25$ when $x = 0$, $0.75$ when $x = \rho$ (exactly at threshold), and $1.0$ when $x \ge 1.5\rho$.

The classification rules are evaluated in strict priority order (first match wins):

Priority	Pathology	Condition	Confidence
1	`boundary`	`sig$boundary_proximity >= th$boundary_prox_high`	`sq(boundary_proximity, boundary_prox_high)`
2	`multimodal`	`sig$multimodality >= th$multimodal_high`	`sq(multimodality, multimodal_high)`
3	`flat_direction`	`sig$lambda_max_cov >= th$flat_var_high` AND `!grows`	`sq(lambda_max_cov, flat_var_high)`
4	`quasi_deterministic`	`cond_high` AND `grows`	`sq(condition_number, condition_high)`
5	`anisotropic`	`cond_high` AND `!kurt_high`	`sq(condition_number, condition_high)`
6	`funnel`	(`kurt_high` AND `sig$ebfmi_min <= th$funnel_ebfmi_low`) OR (`sig$divergent_rate >= th$divergent_rate_high` AND `sig$step_scale_ratio <= th$step_scale_ratio_low`)	`sq(heavy_kurtosis, heavy_kurtosis_high)` if energy branch; `sq(divergent_rate, divergent_rate_high)` if divergence branch
7	`heavy_tails`	`kurt_high` AND `sig$condition_number <= th$heavy_cond_max`	`sq(heavy_kurtosis, heavy_kurtosis_high)`
8	`anisotropic` (residual)	`cond_high` (n-curve unknown)	`0.4` (fixed)
9	`isotropic` (default)	no threshold exceeded	`0.6` (fixed)

Returns

A named list:

Element	Type	Meaning
`pathology`	character	The matched pathology label.
`confidence`	numeric	Confidence in $[0, 1]$ (or a fixed value for rules 8–9).
`remedy`	character	Remedy string from `.gdpar_geom_remedy_for`.
`level`	integer or `NA_integer_`	Difficulty level from `.gdpar_geom_remedy_for`.
`trace`	character vector	Human-readable strings documenting which rule fired and why.

Notes

The inner function decide(p, conf) closes over the trace variable in the enclosing scope. Each rule appends a trace string to trace before calling decide, so the returned trace reflects the path taken.
culprit is a dead parameter — accepted for interface consistency but never referenced.
All threshold comparisons use isTRUE(...) wrappers, which means NA comparisons safely evaluate to FALSE rather than propagating NA. This is critical: if any signal is NA, the corresponding rule is simply skipped.
Rule 6 (funnel) has two sub-conditions: funnel_energy (high kurtosis + low E-BFMI) and funnel_div (high divergence rate + low step/scale ratio). If both are true, funnel_energy takes precedence in the trace message and confidence computation because the if (funnel_energy) branch in the trace if-else is evaluated first.
Rule 8 catches the case where cond_high is true but grows is NA (n-curve unknown) and kurt_high is false — rules 4, 5, and 6 all fail, but rule 8 fires with a fixed confidence of 0.4.
Rule 9 (isotropic) fires with a fixed confidence of 0.6 when no pathology threshold is exceeded. This is higher than rule 8's 0.4, reflecting higher confidence in a clean diagnosis than in a residual one.
The sq function's default return of 0.5 for non-finite inputs means that if a signal is NA but the threshold comparison somehow passed (which isTRUE prevents), the confidence would be 0.5.
The function does not modify sig, ncurve, culprit, or th; it is side-effect-free.

`.gdpar_geom_cost(pilot_max, ncurve)`

Purpose
Internal helper for geometry diagnostics. Computes a compact cost and tractability summary from a pilot run’s sampler signals and from the difficulty-vs-sample-size curve.

Arguments

pilot_max: list. Expected components:
- failed: scalar logical or any value tested by isTRUE; TRUE indicates the pilot run failed.
- signals: list with numeric scalar fields:
  - elapsed: elapsed time for the sampling phase.
  - n_sampling: number of sampling draws.
  - mean_leapfrog: mean leapfrog steps.
  - treedepth_sat_rate: tree-depth saturation rate.
ncurve: list. Expected component grows_with_n, tested by isTRUE; indicates whether difficulty grows with sample size.

Mathematics
Let $t = \text{elapsed}$, $n = \text{n\_sampling}$, $s = \text{treedepth\_sat\_rate}$, and $g = \mathrm{isTRUE}(\text{ncurve\$grows\_with\_n})$. If the pilot is not failed and $t$ is finite:

$$ \text{seconds_per_1000_draws} = \frac{t}{\max(n, 1)} \times 1000. $$

Tractability is classified as:

$$ \text{tractability} = \begin{cases} \text{unknown} & \text{if } \mathrm{isTRUE}(\text{failed}) \text{ or } t \text{ is not finite},\\ \text{intractable (escalate geometry or certify limit)} & \text{if } \mathrm{isTRUE}(s \ge 0.5) \text{ and } g,\\ \text{expensive} & \text{if } \mathrm{isTRUE}(s \ge 0.5) \text{ and not } g,\\ \text{tractable} & \text{otherwise}. \end{cases} $$

Returns
A list with elements:

seconds_per_1000_draws: numeric scalar, or NA_real_ in the failure branch.
mean_leapfrog: sig$mean_leapfrog, or NA_real_ in the failure branch.
treedepth_saturation: sig$treedepth_sat_rate, or NA_real_ in the failure branch.
tractability: character string; one of "unknown", "tractable", "expensive", or "intractable (escalate geometry or certify limit)".

Notes
The failure branch is triggered when isTRUE(pilot_max$failed) or !is.finite(sig$elapsed). The max(sig$n_sampling, 1) guard protects against zero draws, but not against NA; if n_sampling is NA, per_1000 becomes NA. isTRUE(sat >= 0.5) requires sat to be a length-one numeric/logical value; NA or length not equal to one yields FALSE, causing classification to fall through to less severe branches. Assumes pilot_max$signals exists; if failed is not TRUE and sig$elapsed is absent or NULL, the finite check may error. Internal function; no S3 dispatch.

`print.gdpar_geometry_diagnostic(x, ...)`

Purpose
S3 print method for objects of class gdpar_geometry_diagnostic. Renders a compact console summary of the diagnosed pathology, recommended geometry, difficulty curve, computational cost, culprit parameters, and optional ground-truth comparison.

Arguments

x: object of class gdpar_geometry_diagnostic, expected to be a list with components:
- pathology: character string.
- confidence: numeric scalar.
- recommended_geometry: character string.
- geometry_level: numeric or character geometry level.
- difficulty_curve: list with optional slope and grows_with_n.
- cost: list with seconds_per_1000_draws and tractability.
- culprit: data frame with column parameter; may have zero rows.
- correct: NULL, or a scalar logical/NA indicating whether classification was correct.
- ground_truth: list with pathology, used when correct is non-NULL.
...: unused; present for S3 generic compatibility.

Returns
Invisibly returns x.

Notes
Writes output to the console via cat. Printed lines include:

Header: <gdpar_geometry_diagnostic>.
Pathology with confidence formatted using digits = 2.
Recommended geometry and geometry level.
An n-curve line only if x$difficulty_curve$slope is not NULL; slope is formatted with digits = 3, followed by grows_with_n.
Cost line with seconds_per_1000_draws formatted with digits = 3 and tractability.
Culprit line only if nrow(x$culprit) > 0L; prints up to the first three values of x$culprit$parameter using utils::head(..., 3), comma-separated.
Ground-truth line only if !is.null(x$correct); this includes FALSE and NA values, not only TRUE, and accesses x$ground_truth$pathology.

Edge cases: if x$culprit is NULL, nrow(x$culprit) is NULL and the if condition errors due to length zero; the method assumes culprit is a data frame. If x$correct is non-NULL but x$ground_truth is missing, accessing x$ground_truth$pathology will error. The method does not validate the class or required fields. Exported as an S3 method.

← Part IV — Exhaustive Function Reference (3/7) · gdpar Wiki Home · Part IV — Exhaustive Function Reference (5/7) →

gdpar Wiki

Home

Part IV Function Reference 4

R/gdpar.R

dedup_message_blocks(lines)

flush_captured()

.gdpar_multi(formula, family, amm, data, prior, anchor, skip_id_check, chains, iter_warmup, iter_sampling, adapt_delta, max_treedepth, refresh, verbose, seed, group = NULL, parametrization, parametrization_a, parametrization_W, parametrization_aggregation = NULL, id_check_rigor = "full", call, ...)

Section 3: K-Individual Bayesian Fitting and Model Assembly

.gdpar_K_build(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, skip_id_check, verbose, group = NULL, parametrization, parametrization_a, parametrization_W, id_check_rigor = "full", family_id_k_vector = NULL, compile_model_methods = FALSE)

flush_captured() (local closure defined inside .gdpar_K)

resolve_anchor_K(anchor, family, y, prior, slot_names, verbose)

resolve_anchor_multi(anchor, family, y, prior, p, verbose)

resolve_anchor(anchor, family, y, prior, verbose)

compute_diagnostics(fit, verbose = TRUE)

Diagnostics Accessor

diagnostics(fit)

Group Argument Resolution

.resolve_group_argument(group, data, n, verbose = TRUE)

R/geometry_bridge.R

.gdpar_geom_enable_methods(fit, hessian = TRUE, seed = 1L)

.gdpar_geom_unconstrained_summary(fit)

.gdpar_geom_bridge_core(model, stan_data, dim, fisher = NULL, reference = NULL, engine_fit = NULL, has_hessian = TRUE, extra = list())

gdpar_geom_bridge(object, fisher = NULL, reference = NULL, hessian = TRUE, methods_seed = 1L, ...)

print.gdpar_geom_bridge(x, ...)

.gdpar_geom_fit_resolve_K(formula, family, amm, W, data)

gdpar_geom_fit(formula, family, amm, W, data, prior, anchor, skip_id_check, parametrization, parametrization_a, parametrization_W, id_check_rigor, group, fisher, budget, criteria, entry_level, level_map, reference, speed, rest_mass, laplace_fallback, laplace_draws, n_grid, seed, verbose, ...)

print.gdpar_geom_fit(x, ...)

R/geometry_diagnostic.R

gdpar_geometry_thresholds()

gdpar_geometry_diagnostic(target, n_grid = NULL, difficulty = NULL, pilot_warmup = 150L, pilot_sampling = 150L, chains = 4L, adapt_delta = 0.8, max_treedepth = 10L, seed = 20260602L, thresholds = NULL, verbose = TRUE, ...)

.gdpar_geom_normalize_target(target, difficulty)

.gdpar_geom_run_pilot(norm, n_knob, controls, seed, ...)

.gdpar_geom_sample(norm, n_knob, controls, seed, ...)

.gdpar_geom_sample_gdpar(norm, n_knob, controls, seed, ...)

.gdpar_geom_pack_fit(cs_fit, elapsed)

.gdpar_geom_compile(stan_code)

.gdpar_geom_extract_signals(cs_fit, pm, chain_id, bounds, max_treedepth, elapsed)

.gdpar_geom_na_signals()

.gdpar_geom_multimodality(pm, chain_id)

Geometry Diagnostic — Culprit Localisation, Difficulty Curve, and Rule-Based Classifier

.gdpar_geom_heavy_kurtosis(pm)

.gdpar_geom_boundary_prox(cs_fit, bounds, eps = 0.01)

.gdpar_geom_empty_culprit()

.gdpar_geom_localize(cs_fit, pm, chain_id, bounds)

.gdpar_geom_difficulty_curve(signals, thresholds)

.gdpar_geom_remedy_for(pathology)

.gdpar_geom_classify(sig, ncurve, culprit, th)

.gdpar_geom_cost(pilot_max, ncurve)

print.gdpar_geometry_diagnostic(x, ...)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gdpar Wiki

Clone this wiki locally

`dedup_message_blocks(lines)`

`flush_captured()`

`.gdpar_K_build(amm_list_canonical, family, data, prior, anchor, outcome_name, formula_env, skip_id_check, verbose, group = NULL, parametrization, parametrization_a, parametrization_W, id_check_rigor = "full", family_id_k_vector = NULL, compile_model_methods = FALSE)`

`flush_captured()` (local closure defined inside `.gdpar_K`)

`resolve_anchor_K(anchor, family, y, prior, slot_names, verbose)`

`resolve_anchor_multi(anchor, family, y, prior, p, verbose)`

`resolve_anchor(anchor, family, y, prior, verbose)`

`compute_diagnostics(fit, verbose = TRUE)`

`diagnostics(fit)`

`.resolve_group_argument(group, data, n, verbose = TRUE)`

`.gdpar_geom_enable_methods(fit, hessian = TRUE, seed = 1L)`

`.gdpar_geom_unconstrained_summary(fit)`

`.gdpar_geom_bridge_core(model, stan_data, dim, fisher = NULL, reference = NULL, engine_fit = NULL, has_hessian = TRUE, extra = list())`

`gdpar_geom_bridge(object, fisher = NULL, reference = NULL, hessian = TRUE, methods_seed = 1L, ...)`

`print.gdpar_geom_bridge(x, ...)`

`.gdpar_geom_fit_resolve_K(formula, family, amm, W, data)`

`gdpar_geom_fit(formula, family, amm, W, data, prior, anchor, skip_id_check, parametrization, parametrization_a, parametrization_W, id_check_rigor, group, fisher, budget, criteria, entry_level, level_map, reference, speed, rest_mass, laplace_fallback, laplace_draws, n_grid, seed, verbose, ...)`

`print.gdpar_geom_fit(x, ...)`

`gdpar_geometry_thresholds()`

`gdpar_geometry_diagnostic(target, n_grid = NULL, difficulty = NULL, pilot_warmup = 150L, pilot_sampling = 150L, chains = 4L, adapt_delta = 0.8, max_treedepth = 10L, seed = 20260602L, thresholds = NULL, verbose = TRUE, ...)`

`.gdpar_geom_normalize_target(target, difficulty)`

`.gdpar_geom_run_pilot(norm, n_knob, controls, seed, ...)`

`.gdpar_geom_sample(norm, n_knob, controls, seed, ...)`

`.gdpar_geom_sample_gdpar(norm, n_knob, controls, seed, ...)`

`.gdpar_geom_pack_fit(cs_fit, elapsed)`

`.gdpar_geom_compile(stan_code)`

`.gdpar_geom_extract_signals(cs_fit, pm, chain_id, bounds, max_treedepth, elapsed)`

`.gdpar_geom_na_signals()`

`.gdpar_geom_multimodality(pm, chain_id)`

`.gdpar_geom_heavy_kurtosis(pm)`

`.gdpar_geom_boundary_prox(cs_fit, bounds, eps = 0.01)`

`.gdpar_geom_empty_culprit()`

`.gdpar_geom_localize(cs_fit, pm, chain_id, bounds)`

`.gdpar_geom_difficulty_curve(signals, thresholds)`

`.gdpar_geom_remedy_for(pathology)`

`.gdpar_geom_classify(sig, ncurve, culprit, th)`

`.gdpar_geom_cost(pilot_max, ncurve)`

`print.gdpar_geometry_diagnostic(x, ...)`