Scalable contract: scale returns log Jacobian; sum-of-margins for Tree #70

Open
alexeid wants to merge 4 commits into master from scalable-contract

alexeid commented May 1, 2026

Summary

Refactors beast.base.inference.Scalable to a three-method contract whose invariants are mutually consistent on a single dilation axis:

double scale(double s)              // returns log Jacobian determinant of the move
double getScalableValue()           // current position on the dilation axis
default double setScalableValue(V)  // = scale(V / getScalableValue())

The contract requires (and a new test harness enforces):

  1. scale(s) makes getScalableValue() return s × original (scale-equivariance).
  2. setScalableValue(V) makes getScalableValue() return V (set followed by get round-trips).
  3. setScalableValue(get × s) produces the same state as scale(s) (set composes with scale).
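The three invariants can be illustrated with a minimal sketch. The nested Scalable interface, the ScalarParam class, and the satisfiesContract helper below are hypothetical stand-ins written for this example; they are not the beast.base types or the ScalableContractTest harness, and the scalar log Jacobian log(s) is an assumption for a one-dimensional parameter.

```java
import java.util.function.Supplier;

public class ScalableContractSketch {

    /** Hypothetical stand-in for the three-method contract; not the beast.base interface. */
    interface Scalable {
        double scale(double s);        // returns the log Jacobian of the move
        double getScalableValue();     // position on the dilation axis
        default double setScalableValue(double v) {
            return scale(v / getScalableValue());
        }
    }

    /** Toy scalar parameter: x -> s * x, with log Jacobian log(s) (assumed). */
    static final class ScalarParam implements Scalable {
        private double value;
        ScalarParam(double value) { this.value = value; }
        @Override public double scale(double s) {
            if (s <= 0) throw new IllegalArgumentException("scale factor must be positive");
            value *= s;
            return Math.log(s);
        }
        @Override public double getScalableValue() { return value; }
    }

    static boolean close(double a, double b) {
        return Math.abs(a - b) <= 1e-12 * Math.max(1.0, Math.abs(b));
    }

    /** Checks the three invariants on fresh instances produced by the factory. */
    static boolean satisfiesContract(Supplier<Scalable> factory, double s, double target) {
        // Invariant 1: scale(s) multiplies the summary by s.
        Scalable a = factory.get();
        double before = a.getScalableValue();
        a.scale(s);
        boolean equivariant = close(a.getScalableValue(), s * before);

        // Invariant 2: get after set returns the target.
        Scalable b = factory.get();
        b.setScalableValue(target);
        boolean roundTrips = close(b.getScalableValue(), target);

        // Invariant 3: set(get * s) matches scale(s). For a scalar the summary
        // is the whole state, so comparing summaries is sufficient here.
        Scalable c = factory.get();
        c.setScalableValue(c.getScalableValue() * s);
        boolean composes = close(c.getScalableValue(), a.getScalableValue());

        return equivariant && roundTrips && composes;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesContract(() -> new ScalarParam(2.5), 0.4, 7.0));
    }
}
```

For a richer state such as a tree, invariant 3 would need a comparison of the full state rather than only the summary.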

Tree.scale switches from affine internal-height scaling to interval (margin) scaling, so the move always succeeds for any positive s on heterochronous trees and the dilation summary getScalableValue = sum of margins is exactly s-equivariant. The previous affine semantics remain available as Tree.scaleToRootHeight(targetHeight), used by StarBeastStartState init code, outside the contract.

Operator-side updates are mechanical: scale return type int → double log Jacobian, summed directly. ScaleOperator and UpDownOperator (legacy + spec + Bactrian) preserve their existing kernel-symmetry corrections. AMVN routes through setScalableValue / getScalableValue, inheriting the contract.
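As a rough illustration of "summed directly", the sketch below adds up the log Jacobians returned by scale for an up/down style move. Everything in it is hypothetical: the one-method Scalable view, VectorParam, and upDown are invented for the example, the d × log(s) Jacobian is an assumption for a d-dimensional parameter scaled elementwise, and the kernel-symmetry corrections that the real operators apply are deliberately left out.

```java
import java.util.List;

public class LogJacobianSumSketch {

    /** Hypothetical one-method view of the contract: scale returns the log Jacobian. */
    interface Scalable {
        double scale(double s);
    }

    /** Toy vector parameter: each element is multiplied by s, so log|J| = d * log(s). */
    static final class VectorParam implements Scalable {
        final double[] values;
        VectorParam(double... values) { this.values = values.clone(); }
        @Override public double scale(double s) {
            for (int i = 0; i < values.length; i++) values[i] *= s;
            return values.length * Math.log(s);
        }
    }

    /** Scales "up" items by s and "down" items by 1/s, summing the returned log Jacobians. */
    static double upDown(List<? extends Scalable> up, List<? extends Scalable> down, double s) {
        double logJacobian = 0.0;
        for (Scalable x : up) logJacobian += x.scale(s);
        for (Scalable x : down) logJacobian += x.scale(1.0 / s);
        return logJacobian;
    }

    public static void main(String[] args) {
        VectorParam up = new VectorParam(1.0, 2.0, 3.0);   // 3 dimensions scaled by s
        VectorParam down = new VectorParam(4.0, 5.0);      // 2 dimensions scaled by 1/s
        double s = 1.5;
        double logJ = upDown(List.of(up), List.of(down), s);
        // The sum should agree with (3 - 2) * log(s) up to rounding.
        System.out.println(logJ + " vs " + (3 - 2) * Math.log(s));
    }
}
```

With an int dof return the operator had to multiply by log(s) itself; with a double return each Scalable can report whatever Jacobian its own move actually has.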

Closes #20 for the contract layer.

Why

Joelle Barido-Sottani is blocked porting custom-tree packages to beast3 for two reasons: the old int dof return on Scalable.scale couldn't express non-affine Hastings ratios (HRs), and the spec UpDownOperator was bypassing Tree.scale entirely (her March 26 comment on #20). The contract refactor is the fix: each Scalable now declares a single dilation axis, the three methods are mutually consistent or the contract test fails, and getScalableValue carries whatever summary the implementer's scale actually preserves.

The Tree.scale body change addresses a separate latent bug: the old affine Node.scale throws IllegalArgumentException on heterochronous trees for any s below a tree-shape-dependent threshold (e.g., on a tree with leaves at h=0 and h=2 and a parent at h=4, any s < 0.5 throws). Interval scaling preserves tip dates by construction and never throws for s > 0. dof equals numIntervals for any binary tree (with or without sampled ancestors), so operator HRs are unchanged.
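The example tree from the paragraph above (leaves at h=0 and h=2, parent at h=4) can be worked through in a small sketch. The Node class and helper names here are hypothetical, not beast's Node or Tree, and "margin" is assumed to mean an internal node's height above its older child, which is how the sketch reads "interval (margin) scaling".

```java
public class IntervalScaleSketch {

    /** Minimal binary-tree node; a hypothetical stand-in, not beast's Node class. */
    static final class Node {
        double height;
        final Node left, right;
        Node(double height) { this(height, null, null); }
        Node(double height, Node left, Node right) {
            this.height = height; this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null; }
    }

    /** Margin of an internal node: its height above its older child (assumed definition). */
    static double margin(Node n) {
        return n.height - Math.max(n.left.height, n.right.height);
    }

    /** Sum of margins over all internal nodes (the assumed dilation summary). */
    static double sumOfMargins(Node n) {
        if (n.isLeaf()) return 0.0;
        return margin(n) + sumOfMargins(n.left) + sumOfMargins(n.right);
    }

    /** Scales every margin by s, bottom-up; leaf heights never move. Returns the interval count. */
    static int intervalScale(Node n, double s) {
        if (n.isLeaf()) return 0;
        double m = margin(n);   // read the margin before the children move
        int count = intervalScale(n.left, s) + intervalScale(n.right, s);
        n.height = Math.max(n.left.height, n.right.height) + s * m;
        return count + 1;
    }

    /** Affine scaling of internal heights only; true if every parent stays at or above its children. */
    static boolean affineScaleIsValid(Node n, double s) {
        if (n.isLeaf()) return true;
        double l = n.left.isLeaf() ? n.left.height : s * n.left.height;
        double r = n.right.isLeaf() ? n.right.height : s * n.right.height;
        return s * n.height >= Math.max(l, r)
                && affineScaleIsValid(n.left, s) && affineScaleIsValid(n.right, s);
    }

    /** The tree from the text: leaves at h=0 and h=2, parent at h=4. */
    static Node example() {
        return new Node(4.0, new Node(0.0), new Node(2.0));
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println("affine valid at s=0.25: " + affineScaleIsValid(root, 0.25));
        int intervals = intervalScale(root, 0.25);
        System.out.println("intervals=" + intervals + ", root height=" + root.height
                + ", sum of margins=" + sumOfMargins(root));
    }
}
```

In this sketch the affine move at s = 0.25 would put the parent at h=1, below the leaf at h=2, whereas interval scaling shrinks the single margin from 2 to 0.5 and leaves both tip heights unchanged.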

Out of scope (deferred to follow-ups)

These are operator-design questions, separable from the interface contract:

  • spec UpDownOperator's actualScaler = lengthAfter / lengthBefore logic for tree + up/down combinations (lines 158-176) — preserved as-is.
  • Whether IntervalScaleOperator should collapse into UpDownOperator once Tree.scale is interval-scaling.
  • AMVN's covariance behaviour for non-ultrametric trees beyond the proposal-target fix delivered here. (Under interval scaling, an AMVN proposal now lands the tree summary on exactly V, but the choice of empirical covariance is a separate discussion.)
  • Whether tree-aware operators should use the kernel s or an "effective" tree-dilation factor for non-tree up/down parameters.

API impact

This is a binding interface change. Downstream packages that implement Scalable directly or store the int return must update:

  • Mascot, beast-classic, LPhyBeast core: no Scalable usage; compile cleanly against this branch.
  • ⚠️ BEASTLabs: 2 files (PrevalenceList, TreeScaleOperator) need a corresponding update. Parallel PR: Adapt to beast-base Scalable contract change BEAST2-Dev/BEASTLabs#29 on the matching scalable-contract branch (all 83 BEASTLabs tests pass against this beast3 branch).

The Scalable.scaleAll default helper is removed (no callers in any tracked repo).

Test plan

  • 408 existing beast-base unit tests pass
  • 9 new Scalable contract tests pass (ScalableContractTest harness + per-class tests for RealScalarParam, RealVectorParam, and Tree on ultrametric, heterochronous, and leaf-intruding fixtures)
  • 10 heterochronous MCMC integration tests pass under -Pslow-tests (TipTimeTest legacy + spec, ~4 minutes; chains converge to BEAST1 reference values within tolerance)
  • Mascot, beast-classic, LPhyBeast core compile cleanly against the new beast-base
  • BEASTLabs migrated on parallel branch; all 83 tests pass

Side-effect bug fixes

In spec AdaptableVarianceMultivariateNormalOperator.getValue, the tree branch was the bare expression statement t.getRoot().getHeight(); with no return, so for trees the method always fell through to throw new RuntimeException("programmer error: should not get here"). The migration to t.getScalableValue() adds the missing return.

Notes for review

  • The contract is enforced per-Scalable-instance, not per-operator. Operators remain free to choose what to do with scale(s)'s return — e.g., UpDownOperator keeps its −2 log(s) kernel correction; spec UpDownOperator does not. The contract just guarantees that scale, get, and set on a single Scalable are mutually consistent.
  • Node.intervalScale (new) sits alongside Node.scale (unchanged). Tree.scale calls the former; Tree.scaleToRootHeight calls the latter. Two recursions, two purposes, no overlap.
  • For sampled-ancestor trees, both recursions skip fake nodes, consistent with IntervalScaleOperator.resampleNodeHeight (the existing reference implementation from which this logic was lifted).

Refactors the Scalable interface to a three-method contract whose
invariants are mutually consistent on a single dilation axis:

  scale(s)            returns log Jacobian determinant of the move
  getScalableValue()  returns position on the dilation axis
  setScalableValue(V) default: scale(V / getScalableValue())

The contract requires:
  (1) scale(s) makes getScalableValue() return s × original
  (2) setScalableValue(V) makes getScalableValue() return V
  (3) setScalableValue(get × s) produces the same state as scale(s)

A new ScalableContractTest harness exercises these three invariants on
every concrete implementation. Per-class tests cover RealScalarParam,
RealVectorParam, and Tree on ultrametric, heterochronous, and
leaf-intruding fixtures.

Tree.scale switches from affine internal-height scaling to interval
(margin) scaling so the move always succeeds for any positive s on
heterochronous trees, and the dilation summary getScalableValue =
sum of margins is exactly s-equivariant. The previous affine semantics
remain available as Tree.scaleToRootHeight(targetHeight), used by
StarBeastStartState init code.

Operator-side updates are mechanical: scale return type int -> double
log Jacobian, summed directly. ScaleOperator and UpDownOperator (legacy
+ spec + Bactrian variants) preserve their existing kernel-symmetry
corrections. AMVN routes through setScalableValue / getScalableValue,
inheriting the contract; this also fixes a latent bug in spec AMVN
where getValue for trees had no return statement.

Out of scope for this PR (operator-design layer, deferred to follow-up):
  * spec UpDown actualScaler logic for tree+up/down combinations
  * IntervalScaleOperator collapse into UpDown
  * AMVN's covariance behaviour for non-ultrametric trees beyond the
    target-landing fix

Tested:
  * 408 existing beast-base unit tests pass
  * 9 new Scalable contract tests pass
  * 10 heterochronous MCMC integration tests pass (TipTimeTest legacy
    + spec under -Pslow-tests, ~4 minutes)
  * Mascot, beast-classic, LPhyBeast core compile cleanly against new
    beast-base
  * BEASTLabs requires corresponding 2-file update; see parallel
    BEASTLabs scalable-contract branch (all 83 tests pass)

API impact: binding interface change. Downstream packages that
implement Scalable directly or store the int return must update; the
beast.base implementations and BEASTLabs branch demonstrate the
migration pattern.

Closes/addresses: #20
@alexeid alexeid marked this pull request as ready for review May 1, 2026 23:01
alexeid added 3 commits May 2, 2026 12:24
Same change pattern as TreeTest and the legacy ScaleOperatorTest in
the parent commit. BactrianScaleOperatorTest is @Tag("slow"), so it
was not exercised in the default test run; the failure surfaced when
running mvn -Pslow-tests test.

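UpDownOperatorTest.testLogNormalDistribution and
RealRandomWalkOperatorTest.testNormalDistribution were failing
intermittently because their tolerance (5e-3) was only ~2.7 SE and
~2.2 SE of the sample mean respectively (using the SEs computed below),
so occasional failures were expected by design: roughly 0.7% and 2.7%
per run under a two-sided normal approximation.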

Computing 3 SE properly using ESS rather than nominal sample count:

  UpDownOperatorTest (LogNormal M=1, S=1, no tree, ~498k samples,
    ESS near nominal): SE = sqrt(1.72 / 498000) ~= 1.86e-3, 3 SE ~= 5.6e-3.
    Tolerance set to 6e-3.

  RealRandomWalkOperatorTest (Normal mean=1, var=1, documented Mirror
    ESS = 196k from existing comment): SE = sqrt(1 / 196000) ~= 2.26e-3,
    3 SE ~= 6.8e-3. Tolerance set to 7e-3.

  Re-enabled Randomizer.setSeed(127) in RealRandomWalkOperatorTest
    (it had been commented out, leaving the test non-deterministic
    against whatever Randomizer state preceded it).

Expected failure rate at the new bounds: at most ~0.27% per test run
(the two-sided normal tail at 3 SE).
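
The tolerance arithmetic above can be reproduced with a short sketch. The class and method names are hypothetical. It assumes that the 1.72 variance comes from a LogNormal whose mean M = 1 is given in real space, so that the variance is (exp(S^2) - 1) * M^2; that reading matches the figure quoted above but is not confirmed by the test source.

```java
public class ToleranceSketch {

    /** Variance of a LogNormal with real-space mean m and log-space sd s (assumed parameterisation). */
    static double logNormalVariance(double m, double s) {
        return (Math.exp(s * s) - 1.0) * m * m;
    }

    /** Standard error of the sample mean given the target variance and effective sample size. */
    static double standardError(double variance, double ess) {
        return Math.sqrt(variance / ess);
    }

    public static void main(String[] args) {
        // UpDownOperatorTest: LogNormal M=1, S=1, ~498k samples with ESS near nominal.
        double seUpDown = standardError(logNormalVariance(1.0, 1.0), 498_000);
        // RealRandomWalkOperatorTest: Normal with variance 1, documented ESS of 196k.
        double seRandomWalk = standardError(1.0, 196_000);

        System.out.printf("UpDown:     SE=%.3e  3SE=%.3e  old 5e-3 = %.2f SE%n",
                seUpDown, 3 * seUpDown, 5e-3 / seUpDown);
        System.out.printf("RandomWalk: SE=%.3e  3SE=%.3e  old 5e-3 = %.2f SE%n",
                seRandomWalk, 3 * seRandomWalk, 5e-3 / seRandomWalk);
    }
}
```

Both new tolerances (6e-3 and 7e-3) sit slightly above the computed 3 SE values, so they round up rather than down.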

These are fixes for pre-existing flakes that surfaced when running
mvn -Pslow-tests test against the Scalable contract change (#70). The
identical failing values reproduce on master, so they are not caused by
the contract change.
Development

Successfully merging this pull request may close these issues.

Scale/UpDown operators rely on StateNode.scale() -- replace by Scalable interface
