NUTS sampler for GGM (RATTLE constrained HMC)#86
Merged
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #86 +/- ##
==========================================
- Coverage 89.82% 89.08% -0.75%
==========================================
Files 66 71 +5
Lines 10666 12325 +1659
==========================================
+ Hits 9581 10980 +1399
- Misses 1085 1345 +260 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Phase 1 of the GGM NUTS implementation: forward map, backward pass, and integration into GGMModel. New files: - graph_constraint_structure.h: precomputed per-column constraint info - ggm_gradient.h/.cpp: GGMGradientEngine with forward_map and logp_and_gradient (reverse-mode adjoint with FD cross-column terms) - ggm_gradient_interface.cpp: Rcpp test hooks for R-level validation - test-ggm-gradient.R: 14 tests covering forward map properties and gradient vs finite differences for 6 graph configs (p = 4/6/8) Modified files: - ggm_model.h/.cpp: has_gradient() returns true, overrides for parameter_dimension, get/set_vectorized_parameters, logp_and_gradient, get_active_inv_mass, storage_dimension, get_storage_vectorized_parameters - RcppExports: auto-generated for new test hooks - man/*.Rd: roxygen regeneration
- Extract edge indicator sweep into update_edge_indicators() with random-scan shuffling via prepare_iteration() - Add tune_proposal_sd() with Robbins-Monro adaptation (stage 3b) - Wire sampler_type/target_acceptance/max_tree_depth through R → C++ - Upgrade get_active_inv_mass() with null-space (N_q) rotation for constrained columns - Set constraint_dirty_/theta_valid_ flags on every edge toggle - Map NUTS diagnostic fields (treedepth/divergent/energy) in GGM build_output path - Add chain-level error handling in run_sampler (drop failed chains) - GGM defaults to adaptive-metropolis; NUTS available via explicit update_method='nuts'; HMC blocked with clear error - Update validation tests and guards for new sampler options
- Replace inv_sympd(K) with inv(trimatu(Phi)) in gradient backward pass — eliminates chain crashes when leapfrog pushes theta to extreme values during warmup - Add test-ggm-nuts.R with 69 tests covering: - Posterior moment comparison (means, variances, CI overlap) - Geweke forward-sampling test (200 reps) - Edge selection PIP agreement between NUTS and MH - Gradient accuracy near PD boundary (dense and sparse graphs) - NUTS diagnostic health checks (divergences, E-BFMI)
Add RATTLE integration for GGM NUTS with edge constraints: - Full-space gradient (no reverse-Givens adjoint) - Constrained leapfrog with position/momentum projection - NUTS build_tree routes constrained models to RATTLE Add include_edge parameter for user-specified fixed graph structure: - R-side validation and plumbing (bgm, bgm_spec, validate_model) - Sparse graph detection: has_constraints() triggers RATTLE - MLE initialization zeros excluded edges Tests: 151 new (5 test files), 6173 total pass, 0 failures. Comparison: Matt vs NUTS vs MH posterior means agree on full, sparse, and edge-selection graphs (max diff < 0.05).
Three bugs fixed: 1. ProjectFn bundled position+momentum projection, causing wasted PCG solves and missing cotangent projection after first half-step. Split into ProjectPositionFn (SHAKE) and ProjectMomentumFn (RATTLE). 2. Stage 3c step size was never re-tuned when constraints activate. Added heuristic_initial_step_size_constrained and call it when entering stage 3c. 3. Mass-matrix update used unconstrained step-size heuristic even for constrained models. Now uses constrained variant. Leapfrog now follows the 7-step RATTLE structure (Mici / Reich 1996). Initial momentum projection in nuts_step re-enabled (momentum-only).
Reuse previous PCG lambda solution as initial guess within each NUTS tree, reducing average PCG iterations by 54-66% (p=15: 10.3->3.5, p=20: 11.9->5.5). Cache is reset between trees via reset_projection_cache() to avoid stale initializers after position jumps. Add RattleProfiler singleton with fine-grained timing probes for NUTS tree operations, leapfrog phases, and PCG/SHAKE projections. Profiler resets at warmup/sampling boundary to report sampling-only costs.
Apply std::move throughout build_tree and nuts_step to eliminate redundant arma::vec copies when extracting from BuildTreeResult. Each move steals the internal pointer (~5ns) instead of allocating and copying (~100-200ns), saving ~35 vector copies per recursive merge. In logp_and_gradient_full, repurpose the P matrix in-place as Phi_bar (avoids one p×p allocation per gradient call), use arma::fill::none for the gradient vector (all elements are set explicitly), and std::move on return. Profiling shows NUTS tree overhead cut ~50%: p=7 unconstrained: 13.4ms → 6.6ms (total 35.2 → 28.6ms, -19%) p=15 unconstrained: 25.7ms → 11.5ms (total 103.7 → 90.2ms, -13%) p=20 unconstrained: ~45ms → 21.8ms (total ~263 → 252ms, -4%)
Replace Phi * S with arma::trimatu(Phi) * S in both gradient functions. Phi is upper-triangular (precision matrix), so this dispatches to dtrmm instead of dgemm, halving the FLOP count for the matrix multiply.
Split compound statements onto separate lines and apply styler formatting (if-brace wrapping, indentation).
488ea7f to
54677af
Compare
The discrete edge indicator MH used Cauchy(0, 0.5*pairwise_scale_) for the slab prior, but the gradient used Cauchy(0, pairwise_scale_). This mismatch caused the MH acceptance ratio to be computed against the wrong prior, degrading PIP accuracy. Changed to pairwise_scale_ to match. Also set constraint_dirty_ = true after discrete, continuous, and cross edge indicator updates so RATTLE re-projects after topology changes.
Add full-space gradient (logp_and_gradient_full), copy constructor, pack/unpack for the complete parameter vector, constraint projection (SHAKE/RATTLE for excluded edges and Cholesky factor), mass matrix diagonal, and dimension helpers for the mixed MRF model. Add R-callable gradient interface for testing and a test file for mixed RATTLE correctness checks. Fixes initializer list order in copy constructor to match declaration order, eliminating -Wreorder-ctor warning.
426dfde to
4dd963c
Compare
Add 58 constrained leapfrog trajectory tests for Mixed MRF covering reversibility, O(eps^2) scaling, energy conservation, constraint preservation, cotangent condition, SPD preservation, combined two-phase trajectory, marginal pseudolikelihood, and smoke tests. Add optional inv_mass parameter to all test wrappers (GGM and mixed) so projections use the mass-weighted overloads matching the real NUTS/HMC sampler path. Existing callers are unaffected (defaults to identity mass). Register ggm_test_forward_map in helper-internals.R (pre-existing omission).
Continuous-continuous edges retain Cholesky numerical noise (~1e-34) when gamma=0, so the vec != 0 proxy included all draws in the slab mean. Pass the indicator array through to summarize_slab and use ind == 1 for filtering. Also fix the Rhat loop in summarize_pair. Add test-mixed-nuts.R with 212 tests for NUTS vs MH posterior agreement on Mixed MRF models (6 conditions + downstream methods + diagnostics).
Nine conditions (4 GGM, 5 mixed MRF) check that divergence rate, E-BFMI, tree depth saturation, ESS, and Rhat remain healthy as problem size increases. Condition M5 stresses RATTLE with a near-singular Kyy (condition number ~106).
Fix two bugs in the MH sampler discovered by SBC testing:
1. Diagonal update (update_diagonal_parameter, tune_proposal_sd):
- Gamma(1,1) prior evaluated at exp(theta) = sqrt(Schur) instead
of K_ii; corrected to evaluate at precision_matrix_(i,i).
- Jacobian used 1x multiplier instead of 2x; corrected to
2*(theta_prop - theta_curr) since dK_ii/dtheta = 2*exp(2*theta).
2. Off-diagonal update (update_edge_parameter, tune_proposal_sd,
update_edge_indicator_parameter_pair):
- constrained_diagonal() changes K_jj deterministically but the
Gamma(1,1) prior on K_jj was missing from the acceptance ratio.
- Added dgamma(K_jj_prop) - dgamma(K_jj_curr) to all off-diagonal
MH acceptance ratios.
Same fixes applied to Mixed MRF (mixed_mrf_metropolis.cpp):
- Off-diagonal: Gamma evaluated at K_jj/2 instead of K_jj.
- Diagonal: Gamma evaluated at K_ii/2 instead of K_ii; Jacobian 1x.
- Edge-selection continuous: missing Gamma(K_jj) prior added.
Add SBC and parameter recovery tests that validate these fixes.
Add backlog item 20: SBC with edge selection.
…tions The 0.15 threshold failed on d1-d3 (ordinal-ordinal, true signal 0.15) where both samplers give ambiguous PIPs near 0.5. Relaxed to 0.20 for conditions C, D, E.
# Conflicts: # R/RcppExports.R # R/mcmc_summary.R # src/RcppExports.cpp
MaartenMarsman
added a commit
that referenced
this pull request
Apr 17, 2026
The 'Geweke-style forward-sampling test' added in #86 was never a valid Geweke joint distribution test: it drew K from a Wishart(p+1, I) prior while bgms targets a Cauchy(0, 2.5) + Gamma(1, 1) prior, ran 50 warmup iterations (so the kernel kept changing during 'one transition'), and started NUTS from a data-driven init. A bug-free sampler could not satisfy the equality it was asserting; the test has been quietly failing on every nightly run since it was introduced. The NUTS-vs-MH moment comparison and the SBC test cover the same correctness territory properly. A future commit will add a proper SBC replacement (TODO noted in the file header).
MaartenMarsman
added a commit
that referenced
this pull request
Apr 27, 2026
* feat: prior class system for configurable interaction and threshold priors
Replace hardcoded Cauchy/beta-prime priors with a BAS-inspired prior class
system. Users now specify priors via constructor functions:
bgm(x, interaction_prior = normal_prior(0.5),
threshold_prior = normal_threshold_prior(1),
edge_prior = beta_bernoulli_prior(2, 5))
R-level changes:
- R/priors.R: 7 prior constructors (cauchy_prior, normal_prior,
beta_prime_prior, normal_threshold_prior, bernoulli_prior,
beta_bernoulli_prior, sbm_prior) with print methods and unpackers
- bgm() and bgmCompare(): new interaction_prior, threshold_prior,
edge_prior object parameters; all old scalar params deprecated via
lifecycle with full backward compatibility
C++ changes:
- src/priors/interaction_prior.h: InteractionPriorType and
ThresholdPriorType enums with logp/grad inline functions
- All R::dcauchy() calls replaced with interaction_prior_logp() across
GGM, OMRF, MixedMRF, and bgmCompare model code (~50 call sites)
- All inline log_beta_prior lambdas replaced with threshold_prior_logp()
- Prior types flow from R -> spec -> C++ bridge -> model constructors
This enables Normal priors for SBC validation studies where Cauchy
priors generate degenerate data, and prepares the architecture for
future g-priors and other prior families.
* fix: prior type keys not passed to mixed MRF C++ sampler
sample_mixed.cpp checked for "interaction_prior" and "threshold_prior"
keys in the R input list, but run_sampler.R sends them as
"interaction_prior_type" and "threshold_prior_type". This mismatch
caused the mixed model to silently ignore user-specified priors and
always fall back to Cauchy/beta-prime defaults.
Adds test-prior-interface.R with 54 tests covering all prior
constructors × model types (ordinal, GGM, mixed, blume-capel,
bgmCompare), edge selection, backward compatibility, and stored
prior info.
* fix: GGM likelihood df correction, prior sampler, and edge update cleanup
- Use n-1 effective degrees of freedom in GGM likelihood for centered data,
fixing systematic overestimation of diagonal precision elements
- Add sample_ggm_prior() NUTS-based prior sampler for SBC validation,
supporting both full and constrained (sparse) graphs via RATTLE
- Remove redundant Gamma(1,1) diagonal prior from edge parameter and
edge indicator updates (cancels under constrained Cholesky parameterization)
- Guard initialize_precision_from_mle() against n=0 (prior-only sampling)
- Regenerate RcppExports (remove stale sample_mixed_prior binding)
- Document interaction_prior and threshold_prior params in bgmCompare
* fix ggm? (#90)
* Rename posterior_mean_associations to posterior_mean_pairwise; trim roxygen; update README; fix mixed MRF prior
- Revert posterior_mean_associations back to posterior_mean_pairwise across
all R code, man pages, and tests for consistency with other output and
extractor naming conventions.
- Trim detailed roxygen sections in bgm() and bgmCompare() to concise
summaries with a pointer to the package website.
- Update x parameter documentation to mention continuous variables and
GGM column-centering.
- Align README citation section with the website.
- Fix mixed MRF interaction prior: evaluate on Kyy = -Omega/2 scale with
chain-rule correction.
- Update NEWS.md with acceptance rate fix.
* fix: update remaining posterior_mean_associations references to posterior_mean_pairwise
* Unify parameter priors into bgms_parameter_prior class; add means_prior
- Merge bgms_interaction_prior and bgms_threshold_prior into a single
bgms_parameter_prior S3 class. All prior families (Cauchy, Normal,
Beta-Prime) are now valid for interactions, thresholds, and means.
- Constructors use dual class vectors for backward compatibility:
cauchy_prior/normal_prior inherit bgms_interaction_prior,
beta_prime_prior inherits bgms_threshold_prior.
- Remove normal_threshold_prior() (never released); use normal_prior().
- Add means_prior argument to bgm() for continuous variable means in
mixed MRF (currently plumbed to C++ but not yet consumed).
- Add unpack_parameter_prior() as unified unpacker; existing unpack
functions become thin wrappers.
- Convert test fixtures from deprecated string edge priors to S3 objects.
- Add tests for string edge_prior deprecation warnings and class-based
edge prior plumbing.
- Apply package-wide styler formatting.
* Add bgms_scale_prior class with gamma_prior() and exponential_prior()
- New bgms_scale_prior S3 class for positive scale parameters
(precision matrix diagonal).
- gamma_prior(shape, rate) and exponential_prior(rate) constructors.
- New precision_scale_prior argument in bgm() for GGM and mixed MRF.
- Plumbed through bgm_spec and run_sampler to inputFromR (C++ not yet
consuming the values).
- Add prior constructors section to _pkgdown.yml.
* Add C++ polymorphic BaseParameterPrior class hierarchy
New header src/priors/parameter_prior.h with:
- BaseParameterPrior abstract base class (virtual logp, grad, clone)
- CauchyPrior, NormalPrior, BetaPrimePrior for real-valued parameters
- GammaScalePrior for positive parameters (precision diagonal)
- Factory functions create_parameter_prior() and create_scale_prior()
- All concrete classes marked final for devirtualization in hot loops
Not yet wired into models (Phase 4).
* Wire polymorphic parameter priors into GGM model
Replace InteractionPriorType enum + pairwise_scale_ with
unique_ptr<BaseParameterPrior> interaction_prior_ and diagonal_prior_
throughout the GGM model:
- ggm_model.h: constructors, copy constructor, member declarations
- ggm_model.cpp: all Metropolis acceptance ratio computations
- ggm_gradient.h/cpp: rebuild() takes prior refs, logp/grad use
virtual dispatch instead of switch statements
- sample_ggm.cpp: construct priors from inputFromR via factories
- ggm_gradient_interface.cpp: update all test interface functions
* Wire polymorphic parameter priors into OMRF model
- Replace enum + scalar prior params with unique_ptr<BaseParameterPrior>
for both interaction and threshold priors in OMRF model.
- Add logp(x, scale_factor) and grad(x, scale_factor) virtual methods
to BaseParameterPrior for OMRF's per-edge pairwise_scaling_factors.
- Override in CauchyPrior and NormalPrior to apply multiplicative scale.
- Update factory function and sample_omrf.cpp to construct priors from
inputFromR.
* Wire polymorphic parameter priors into Mixed MRF model
Replace enum + scalar prior params with unique_ptr<BaseParameterPrior>
for all four prior roles in the Mixed MRF model:
- interaction_prior_: pairwise interactions (discrete-discrete, etc.)
- threshold_prior_: main effects / thresholds (discrete variables)
- means_prior_: continuous variable means (was hardcoded Normal(0,1))
- diagonal_prior_: precision diagonal (was hardcoded Gamma(1,1))
Updated: mixed_mrf_model.h/cpp (constructors, members, copy),
mixed_mrf_metropolis.cpp (all MH acceptance ratios),
mixed_mrf_gradient.cpp (log-posterior and gradient computations),
sample_mixed.cpp (prior construction from inputFromR),
mixed_gradient_interface.cpp (test interface functions).
* Wire polymorphic parameter priors into bgmCompare
Replace enum + scalar prior params with const BaseParameterPrior&
references throughout the bgmCompare subsystem:
- Three prior objects: interaction_prior (baseline pairwise),
difference_prior (group differences), threshold_prior (main effects)
- Updated all function signatures through the 6-deep call chain
- bgmCompare_interface.cpp constructs priors via factories
- Uses logp(x, scale_factor) for pairwise_scaling_factors
* Remove obsolete interaction_prior.h; clean up stale includes
Delete src/priors/interaction_prior.h (InteractionPriorType and
ThresholdPriorType enums, and the free logp/grad functions). All
models now use the polymorphic BaseParameterPrior hierarchy from
parameter_prior.h exclusively.
Remove redundant #include "priors/interaction_prior.h" from all
headers and source files that already include parameter_prior.h.
* Add comprehensive tests for polymorphic parameter priors
New test file test-parameter-prior-cpp.R (240 tests) covering:
1. C++ prior logp/grad correctness: each prior class (Cauchy, Normal,
Beta-Prime, Gamma, Exponential) verified against R reference
implementations at multiple parameter values.
2. Scaled variant correctness: logp(x, scale_factor) and
grad(x, scale_factor) verified against reference at scaled
parameters (used by OMRF/bgmCompare pairwise_scaling_factors).
3. Numerical gradient verification: GGM gradient engine tested with
Normal, Beta-Prime, and non-default Gamma diagonal priors on both
full and sparse graphs; analytic gradients match finite differences.
Both active-theta (NUTS) and full-space (RATTLE) parameterizations.
4. Prior consumption tests: verify that non-default precision_scale_prior
and means_prior produce different posteriors than defaults, confirming
the C++ code actually uses the new prior objects.
New C++ test interface: src/prior_test_interface.cpp exposing
test_parameter_prior(), test_scale_prior(), and prior-parameterized
gradient test functions.
* Add mixed MRF gradient tests with non-default priors
Extend mixed_test_logp_and_gradient and mixed_test_logp_and_gradient_full
C++ test interfaces to accept means_prior and diagonal_prior parameters
(previously hardcoded as Normal(0,1) and Gamma(1,1)).
Add three mixed MRF numerical gradient tests:
- Normal interaction prior
- Cauchy means prior + Gamma(2,1) diagonal prior
- All non-default priors (Normal interaction, Normal threshold,
Cauchy means, Gamma(0.5,2) diagonal)
All analytic gradients match finite differences to < 1e-4.
* Add edge selection tests with non-default parameter priors
Verify that edge selection (spike-and-slab) works correctly when
combined with non-default priors across all model types:
- GGM: normal_prior, beta_prime_prior, non-default diagonal prior
- OMRF: normal_prior interaction + cauchy threshold,
beta_prime_prior interaction
- Mixed MRF: all four non-default priors simultaneously
All tests verify inclusion indicators are in [0,1] and the sampler
runs without error.
* Add bgmCompare tests with non-default parameter priors
Verify bgmCompare runs correctly with:
- normal_prior interaction + cauchy threshold (no selection)
- beta_prime_prior interaction + beta_prime_prior threshold (no selection)
- normal_prior for both + difference selection enabled
* Rename bgms_edge_prior to bgms_indicator_prior; wire into bgmCompare
Rename the edge prior class to bgms_indicator_prior since the same
selection prior structure (Bernoulli, Beta-Bernoulli, SBM) applies to
both edge inclusion in bgm() and difference selection in bgmCompare().
- Rename class, print method, and unpack function throughout
- bgmCompare() now accepts indicator prior objects for difference_prior
(bernoulli_prior, beta_bernoulli_prior, sbm_prior)
- Deprecate string-valued difference_prior and loose difference_probability
- Convert test fixture from deprecated string to prior object
- SBM prior is now available for difference selection in bgmCompare
* Fix Unicode escape sequences in roxygen comments
Replace <U+201C>/<U+201D> (smart quotes), <U+2013> (en-dash),
<U+2014> (em-dash), and <U+2019> (smart apostrophe) with ASCII
equivalents across all R source files. These were introduced by
styler converting actual Unicode characters to escape sequences.
* Update indicator prior docs: edge inclusion -> inclusion indicators
Rename roxygen titles and descriptions from "Edge Inclusion" to
"Inclusion Indicators" for bernoulli_prior, beta_bernoulli_prior,
and sbm_prior. These priors apply to both edge selection in bgm()
and difference selection in bgmCompare().
* Fix remaining Unicode escapes; improve wrong-type error for edge_prior
- Replace <U+2192> (arrow) and <U+00D7> (multiplication) escape
sequences with ASCII -> and x in comments.
- Add explicit error when edge_prior receives a non-indicator,
non-string object (e.g. cauchy_prior), instead of falling through
to an unhelpful match.arg error. String values are still accepted
via the deprecated legacy path.
* Fix citation
* Improve warmup for sample_ggm_prior()
* Fix NUTS accept_prob accumulation across full trajectory
The top-level trajectory loop in nuts_step overwrote the Metropolis
contribution (alpha, n_alpha) with the latest subtree's values each
iteration instead of accumulating across the full trajectory. The
reported accept_prob therefore reflected only the last built subtree,
biasing the signal consumed by dual-averaging step-size adaptation.
Match Stan's base_nuts.hpp semantics: accumulate sum_metro_prob and
n_leapfrog across every subtree built (including ones that terminated
early), outside the s_prime == 1 guard, since every leapfrog step
contributes to the Metropolis diagnostic.
* Ignore .claude settings directory in R CMD check
Adds the Claude Code settings directory to .Rbuildignore so it is not
flagged as an unexpected hidden directory by --as-cran checks. The
directory is also listed in .gitignore but R CMD check scans the
working tree, not the git index.
* Expose NUTS mean acceptance probability in nuts_diag
Surface the per-iteration mean Metropolis acceptance probability in
fit$nuts_diag$accept_prob (chains x iterations), paralleling Stan's
accept_stat__ output. Adds a per-chain mean_accept_prob summary and
threads the value through the bgm and bgmCompare output pipelines.
The new field also serves as a direct regression guard for the
accept_prob accumulation bug fixed previously: the test in
test-reversibility-check.R asserts a well-tuned GGM produces mean
accept_prob above the dual-averaging target.
* Remove Geweke-style NUTS test; plan SBC replacement
The 'Geweke-style forward-sampling test' added in #86 was never a valid
Geweke joint distribution test: it drew K from a Wishart(p+1, I) prior
while bgms targets a Cauchy(0, 2.5) + Gamma(1, 1) prior, ran 50 warmup
iterations (so the kernel kept changing during 'one transition'), and
started NUTS from a data-driven init. A bug-free sampler could not
satisfy the equality it was asserting; the test has been quietly
failing on every nightly run since it was introduced.
The NUTS-vs-MH moment comparison and the SBC test cover the same
correctness territory properly. A future commit will add a proper SBC
replacement (TODO noted in the file header).
* Fail nightly validation CI on any test failure
The nightly workflow invoked 'devtools::test()', which returns a
testthat result object with exit code 0 even when tests fail. As a
result the job reported 'success' on every nightly run despite
testthat printing 'FAIL 4' (the Geweke test, now removed).
Switch to 'devtools::test(stop_on_failure = TRUE)' so a single test
failure propagates to the job status and the nightly turns red.
* Switch NUTS to multinomial candidate weighting (Stan base_nuts.hpp)
Replace the Hoffman-Gelman slice sampler inside nuts_step with Stan's
multinomial variant. Candidates are now weighted by log w_i = H0 - h_i
(the Hamiltonian offset of each leaf from the initial state) and
combined with log-sum-exp in log space:
- Base case: log_sum_weight = H0 - h (or -inf on divergence); the
non-reversible early return runs BEFORE logp evaluation, so broken
states never hit the posterior cache.
- Sibling subtrees (within build_tree) are combined with symmetric
multinomial sampling: accept the final subtree with probability
exp(w_final - log_sum_exp(w_init, w_final)).
- Top-level (in nuts_step) uses biased progressive sampling:
accept the new subtree with probability min(1, w_subtree / w_traj),
so later subtrees are preferred on average.
- Divergence is detected by (h - H0) > Delta_max (= 1000), matching
Stan; no slice variable is needed.
The Metropolis diagnostic (alpha and n_leapfrog) is computed for every
leaf regardless of divergence or reversibility and accumulated outside
the s_prime guard, so the reported accept_prob still reflects the full
trajectory.
Drops the vestigial theta_0 and r0 parameters from build_tree's
signature. Adds a numerically stable log_sum_exp helper under src/math/.
n_alpha is renamed to n_leapfrog and n_prime is replaced by
log_sum_weight in BuildTreeResult, matching Stan's vocabulary.
Validation:
- Full testthat suite green (7362 pass / 43 expected skips).
- test-ggm-nuts.R (slow) passes including NUTS-vs-MH moment agreement,
edge-selection accuracy, gradient-near-PD-boundary, and diagnostics.
- test-rattle-edge-selection.R (slow) passes.
- test-mixed-nuts.R (slow) passes.
- test-sbc-ggm.R NUTS part passes; pre-existing MH SBC failure is
unaffected by this change.
- Step 0 regression test (accept_prob >= 0.75 on a well-tuned GGM)
still passes; mean accept_prob ~ 0.94.
- Mean tree depth: 2.9100 (pre-swap slice) -> 3.0025 (multinomial),
within MC noise. Divergence rate unchanged at 0.
References:
Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian
Monte Carlo, section A.3.2. arXiv:1701.02434.
Stan Development Team. base_nuts.hpp.
https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/base_nuts.hpp
* Match Stan's absorption rule for final Stage-2 warmup window
bgms's doubling-window loop produced one extra small trailing window
at the end of Stage-2 whenever warmup \geq 1000. Each window boundary
triggers a mass-matrix update and a step-size reinit (heuristic +
dual-averaging restart), so the spurious extra update disrupted
step-size convergence during the most important part of warmup.
For warmup=1000 the window boundaries now match Stan exactly:
bgms (before): 100, 150, 250, 450, 850, 950 (6 windows)
bgms (after): 100, 150, 250, 450, 950 (5 windows, same as Stan)
The rule: when the window AFTER the next would overshoot stage3a_start,
stretch the current next_end to stage3a_start instead of emitting a
smaller trailing window. Matches Stan's
windowed_adaptation::compute_next_window().
Measured improvement at matched settings (3-seed average):
p=5 tridiagonal GGM, warmup=1000, iter=2000:
ESS_MIN 2841 -> 3121 (+10%)
ESS/leapfrog 0.181 -> 0.200
2D standard normal, warmup=1000, iter=4000:
ESS_MIN 3306 -> 3466 (+5%)
This is a partial close of the ~30% ESS gap versus Stan; the remaining
gap is still under investigation (step-size reinit heuristic, build-tree
early-return bookkeeping).
* Revert "Match Stan's absorption rule for final Stage-2 warmup window"
This reverts commit aee662f.
* Fix NUTS top-level U-turn bookkeeping to match Stan
The outer trajectory loop in nuts_step was tracking the four U-turn
boundary momenta (p_bck_bck / p_bck_fwd / p_fwd_bck / p_fwd_fwd and
their sharp counterparts) incorrectly:
1. Neither outer-end momentum (p_bck_bck for the leftmost end, p_fwd_fwd
for the rightmost) was maintained at all. Stan requires them so that
the 'save old outer end as new interior boundary' step can run
before each extension.
2. The backward branch assigned the subtree's BEG fields to the
'bck_bck' slot and the subtree's END fields to the 'bck_fwd' slot,
which is the opposite of what the names mean. For a backward
subtree, result.p_beg is the FIRST leaf (interior right, closest to
the rest of the trajectory) and result.p_end is the LAST leaf
(outer leftmost). The forward branch was correct; only the backward
branch had the asymmetry.
3. Neither branch saved the old outer end into the interior-boundary
slot before calling build_tree, so the extended U-turn checks
(checks 2 and 3) used stale momenta from whatever extension last
updated those variables — typically the initial r0 when the
sampler happened to extend in the same direction multiple times
in a row.
The combined effect: on targets where the trajectory's U-turn
geometry depends on the interior boundary (the higher the dimension
or the more non-isotropic the posterior, the stronger the effect),
bgms terminated trees at positions Stan would have continued past.
Trajectories were cut short, ESS was lost.
Fix: (1) declare p_bck_bck and p_fwd_fwd in nuts_step and keep them
updated; (2) save p_bck_bck -> p_fwd_bck before each backward
extension and p_fwd_fwd -> p_bck_fwd before each forward extension,
matching Stan's base_nuts.hpp; (3) normalize the backward-branch
post-call assignments so p_bck_bck holds the outer-leftmost momentum
(result.p_end) and p_bck_fwd holds the interior-right momentum
(result.p_beg).
Validation:
- 6 targets x 5 seeds (gauss_2d, ggm_p3/p5/p8/p12/p20): ESS_min up
on 6/6 targets, ranging +1.3% (p=8) to +95.1% (p=20). Treedepth
increases slightly (trees now run to real U-turns instead of
premature ones) but ESS/leapfrog also improves on 5/6.
- Posterior mean/sd differences <= 0.003 across all targets; sd
ratios in [0.97, 1.01]. Same target, same posterior.
- Zero divergences in every run.
- Full testthat suite: 7362/7362 fast tests pass; slow suite shows
zero new regressions (3 failing tests are pre-existing on HEAD,
unrelated to NUTS).
- External 65-D marginal-PL benchmark reports bgms/Stan ESS ratio
of 1.05 (up from 0.69) at matched settings; posterior-mean /
sd-ratio agreement preserved.
References:
Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian
Monte Carlo. arXiv:1701.02434.
Stan Development Team. base_nuts.hpp. Verbatim reference checked in
under dev/stan_reference/.
* Fix mixed-MRF marginal PL target and warmup schedule
Three changes, all verified by analytic tests and paired SBC against
the mixedGM Stan reference.
1. Interaction prior on Kyy scale (pure GGM + mixed MRF continuous
block). The `interaction_prior` now refers to off-diagonal entries
on the K_yy = -K/2 association scale, matching the mixed-MRF
convention. A `Normal(0, scale)` prior therefore constrains K_yy
off-diagonals with standard deviation `scale`, equivalent to
Normal(0, 2*scale) on the precision entries.
2. Mixed-MRF marginal pseudo-likelihood
(`pseudolikelihood = "marginal"`): the off-diagonal cross-induced
contribution to the effective interaction matrix
M = A_xx + 2 A_xy Sigma_yy A_xy' was under-counted by a factor of 2
in the rest-score computation. Forward log-posterior and analytic
gradient now match a direct Gaussian-integral reference and central
finite differences to machine precision. The fix affects both the
NUTS and adaptive-Metropolis paths.
3. NUTS Stage-2 warmup windowing: when the window after the next would
overshoot the Stage-3a boundary, the current next window is now
stretched to absorb the remaining Stage-2 budget instead of
emitting a small trailing window. Each window boundary triggers a
mass-matrix update and step-size reinitialization, so eliminating
the disruptive trailing window improves dual-averaging convergence
and matches Stan's windowed_adaptation::compute_next_window.
* Apply styler formatting
* Expose sample_ggm_prior with full prior-class plumbing and tests
The user-facing wrapper now validates inputs, rejects beta_prime_prior()
with a clear message, passes the scale_prior_type through to C++ via
create_scale_prior(), and wires verbose to the progress manager. The
return list is fully documented with examples.
Adds tests/testthat/test-sample-ggm-prior.R covering return shape,
edge_indicators constraint enforcement (algebraic vs RATTLE-projected
zeros), interaction- and scale-prior plumbing, and error paths.
* Apply bgms_style: <- -> = on merge artifacts
Styler pass via inst/styler/bgms_style.R picked up the assignment-style
violations introduced when keeping HEAD's prior-class additions in
R/bgm.R, R/bgm_spec.R, R/run_sampler.R, R/sample_ggm_prior.R, and
the auto-merged R/bgmCompare.R / tests/.
* Detect CRAN easybgm via source fingerprint, not version
Both the CRAN release of easybgm and its in-development rewrite are
stamped 0.4.0, so the previous version-based threshold
(version < 0.4.0) returned FALSE for both, sending an S7 fit to
old-API code that does class(fit) <- ... and fit$arguments$... and
crashing it.
Replace the version check with a source-code fingerprint: probe
the easybgm namespace for bgm_extract.package_bgms and inspect
its body for extract_category_thresholds (renamed to
extract_main_effects in bgms 0.2.0). The CRAN body still calls the
old name; the rewrite uses extract_main_effects.
Verified:
- easybgm CRAN main (0.4.0, old code): 182 pass, 0 fail
(was 87 pass + 3 errors under the version-only threshold)
- easybgm update_easybgm (0.4.0, new): 232 pass, 0 fail
* Restore version-based easybgm shim, threshold 0.5.0
Source-fingerprint detection was a workaround for the version
collision between CRAN main and the in-development rewrite (both
0.4.0). Once update_easybgm bumps to 0.5.0, the version-based
threshold is the cleaner discriminator.
Status under the 0.5.0 threshold:
- CRAN main 0.4.0: shim -> S3 list (still 182 pass)
- update_easybgm 0.4.0: shim -> S3 list (still 232 pass; the
new extractor API works on S3 too, so this is fine until the
version is bumped to 0.5.0)
- update_easybgm >= 0.5.0 (after bump): S7 path
* Rename sample_ggm_prior -> sample_precision_prior, list in _pkgdown.yml
The previous name was opaque: the function samples precision matrices
$K$ from the GGM prior (n=0, S=0 makes the likelihood flat), so
'precision_prior' captures both *what* it draws and *where* it draws
from. Also list it under "Simulation and prediction" in
_pkgdown.yml — its absence was breaking the GitHub Pages build.
Renamed:
- R/sample_ggm_prior.R -> R/sample_precision_prior.R
- tests/testthat/test-sample-ggm-prior.R -> test-sample-precision-prior.R
- C++ entry point sample_ggm_prior() -> sample_precision_prior()
- Rcpp export name sample_ggm_prior_cpp -> sample_precision_prior_cpp
Tests: 6663/6732 pass, 0 fail, 0 error.
* Document simulate_* vs sample_* convention in _pkgdown.yml
simulate_* = forward-generate data given known parameters (cheap, exact).
sample_* = MCMC draws from a target distribution (NUTS/Metropolis).
The two prefixes coexist on purpose; record that under the
'Simulation and prediction' section description so users browsing
the reference don't read it as accidental inconsistency.
---------
Co-authored-by: Don van den Bergh <donvdbergh@hotmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Add Hamiltonian Monte Carlo (HMC) and No-U-Turn sampler (NUTS) for the Gaussian graphical model, using RATTLE constrained HMC to sample precision elements while enforcing positive-definiteness of the precision matrix.
Problem / Motivation
The OMRF and mixed MRF models in bgms already use NUTS for sampling interaction parameters. The GGM model still relies on element-wise adaptive Metropolis-Hastings, making it the only model class without a gradient-based sampler. Bringing the GGM in line with the rest of the package is not straightforward because the precision matrix must remain positive definite -- a constraint that standard (unconstrained) NUTS cannot enforce. The RATTLE method extends HMC to manifold constraints.
Proposed Changes / New Functionality
update_method = "nuts"option inbgm()for GGM models (default for continuous data).Benchmarks on simulated graphs up to p = 20 show RATTLE delivers higher ESS/s than adaptive MH.
Type of Change
Files Edited
"nuts"as anupdate_methodchoice, build the model specification, and warn for constrained HMC.Testing and Validation
BGMS_RUN_SLOW_TESTS=true).R CMD INSTALL --preclean .with 0 errors/warnings.