Skip to content

fix: log_prior_from_value sign convention — density form across Prior subclasses#1269

Merged
Jammy2211 merged 1 commit into
mainfrom
feature/log-prior-sign-convention
May 15, 2026
Merged

fix: log_prior_from_value sign convention — density form across Prior subclasses#1269
Jammy2211 merged 1 commit into
mainfrom
feature/log-prior-sign-convention

Conversation

@Jammy2211
Copy link
Copy Markdown
Collaborator

Summary

Switches Prior.log_prior_from_value to return log p(x) in density form (negative for low-density values, zero at the mode) across NormalMessage (Gaussian/TruncatedGaussian base), LogGaussianPrior, and LogUniformPrior. Closes #1266.

This is a behaviour-change fix that affects every MCMC / MLE fit with a non-uniform prior. Read the migration warning below before pulling.

What was broken

Fitness._call (autofit/non_linear/fitness.py:200) computes:

figure_of_merit = log_likelihood + sum(log_priors)

Emcee / Zeus / MLE-Drawer maximise figure_of_merit; LBFGS / BFGS minimise -2 * figure_of_merit. Both interpretations assume log_priors is in density form so that the maximand equals the log-posterior. But the existing concrete bodies returned cost form (+½z² for Gaussian, 1/value for LogUniform), inverting the sign of the prior contribution. Adding +cost to log_likelihood makes the maximand log_lik − log_prior_density, which pushes MCMC samples away from prior modes and makes LBFGS minimise the prior-mean direction.

Empirically demonstrated by two regression scripts added in the matching autofit_workspace_test PR — Emcee with a flat log-likelihood and GaussianPrior(mean=5, sigma=1) diverges to ~10^146 (should cluster at 5.0); LBFGS converges to 8e143 (should converge to 5.0). After this fix both behave correctly.

This regression has existed since commit db4016db42 (4 May 2022, "refactored priors into package") for LogUniformPrior and pre-dates that for the Gaussian family. Most production usage on this codebase is nested-sampler-based, which is why it stayed hidden — prior_transform-based samplers (Dynesty / Nautilus) bypass log_prior_from_value entirely.

API Changes

Prior.log_prior_from_value is a behaviour change, not a signature change. The kwarg surface is unchanged ((value, xp=np)). Returned values are sign-flipped on the JAX-family priors and replaced with the correct density-form expression on LogUniformPrior. The normalisation constants -log(σ · √(2π)) and -log(log(b/a)) are dropped from the returned value, matching UniformPrior.log_prior_from_value which drops -log(b-a) to return 0.0. The math of the search is invariant to these constants.

See full details below.

Migration warning — IMPORTANT

Cached samples.csv from Emcee / Zeus / MLE-Drawer / LBFGS / BFGS fits with ANY non-uniform prior (Gaussian, LogGaussian, TruncatedGaussian, LogUniform) are biased and should be re-run after this PR.

Sampler family Cached chains Stored log_prior column
Dynesty / Nautilus unaffected (priors via prior_transform, never touch log_prior_from_value) wrong-signed but auto-recovers when Aggregator re-runs model.log_prior_list_from_vector against existing parameter columns
Emcee / Zeus / MLE-Drawer / LBFGS / BFGS biased for non-uniform priors — re-run required will be correct in newly-written outputs

If you have important results from before this PR that used Emcee or LBFGS with a non-UniformPrior prior, re-run them. Uniform-prior-only fits are unaffected on any sampler.

Test Plan

  • pytest test_autofit — 1242 passed, 1 skipped
  • test_autofit/mapper/prior/test_prior.py — three pins updated to density-form values
  • test_autofit/mapper/model/test_model_mapper.py::test_log_prior_list_from_vector — one pin updated to density-form values
  • autofit_workspace_test/scripts/jax_assertions/priors_xp_dispatch.py — 28 assertions pass (existing parity + 4 new density-form gates)
  • autofit_workspace_test/scripts/prior_correctness/emcee_gaussian_bias_check.py — new permanent regression: Emcee samples cluster at mean=5.0 ± 0.2 with std≈1.0 (was diverging to 10^146 before)
  • autofit_workspace_test/scripts/prior_correctness/lbfgs_gaussian_bias_check.py — new permanent regression: L-BFGS-B converges to 5.0 ± 0.01 (was diverging to 8e143 before)
  • autofit_workspace/scripts/searches/{mcmc,mle,nest,start_point}.py — all 4 pass
  • autofit_workspace_test/scripts/graphical/ep.py — full EP loop runs to completion (EP uses Message.logpdf directly, confirmed unaffected by the bug)
  • Smoke pass: autofit_workspace 9/9, autogalaxy_workspace 8/8, autolens_workspace 9/9, autolens_workspace_test 12/12, HowToLens 6/6 — 44/44 across 5 workspaces
Full API Changes (for automation & release notes)

Changed Behaviour

  • NormalMessage.log_prior_from_value(value, xp=np) — now returns -(value - mean)**2 / (2 sigma**2) (density form, was the cost form +(value - mean)**2 / (2 sigma**2)). The constant -log(sigma * sqrt(2 * pi)) is dropped, matching UniformPrior's convention of dropping -log(b - a).
  • LogGaussianPrior.log_prior_from_value(value, xp=np) — now returns -(log(value) - mean)**2 / (2 sigma**2) - log(value) (density-form quadratic plus the change-of-variables Jacobian). NumPy and JAX paths both updated; out-of-support (value <= 0) returns -inf.
  • LogUniformPrior.log_prior_from_value(value, xp=np) — now returns -log(value) (NumPy path) or xp.where(in_bounds, -xp.log(value), -xp.inf) (JAX path). Previously returned 1.0 / value which was neither a density nor a cost — it was the Jacobian gradient d(log x)/dx.

Not Changed

  • UniformPrior.log_prior_from_value — already returned 0.0 (sign-agnostic, correct).
  • TruncatedNormalMessage.log_prior_from_value — already returned density form with -inf out-of-bounds.
  • Fitness._call, Fitness.call_wrap, Fitness._jit, Fitness._vmap — sign-invariant after the prior layer is corrected.
  • The entire autofit/graphical/ framework (factor graphs, expectation propagation, Laplace approximation) — EP routes priors through Message.logpdf / Message.cdf / Message.pdf directly and does not consult Prior.log_prior_from_value on the hot path.

Migration

No API migration required (signatures unchanged). Downstream callers see returned values with flipped sign (Gaussian family) or a different formula entirely (LogUniform). Existing Fitness._call-based math is now correct without changes.

Numerical values are different in three regression-test pinned locations, all updated:

  • test_autofit/mapper/prior/test_prior.py::TestLogUniformPrior::test__log_prior_from_value
  • test_autofit/mapper/prior/test_prior.py::TestGaussianPrior::test__log_prior_from_value
  • test_autofit/mapper/prior/test_prior.py::test_log_gaussian_prior_log_prior_from_value
  • test_autofit/mapper/model/test_model_mapper.py::TestInstances::test_log_prior_list_from_vector

🤖 Generated with Claude Code

… subclasses

Phase 0 of #1266: switch Prior.log_prior_from_value to return log p(x) in
density form (negative for low-density, zero at the mode) across the
Gaussian-family priors and LogUniformPrior. NumPy + JAX paths both updated.

What was broken
---------------
Fitness._call computes `figure_of_merit = log_likelihood + sum(log_priors)`
which Emcee / Zeus / MLE-Drawer maximise (and LBFGS minimises via
-2 * figure_of_merit). This expects log_priors in density form. But the
existing bodies returned cost form `-log p(x)` (positive for low-density):

  NormalMessage.log_prior_from_value:  +(value - mean)**2 / (2 * sigma**2)
  LogGaussianPrior.log_prior_from_value:  +(log value - mean)**2 / ... - log(value)
  TruncatedNormalMessage.log_prior_from_value:  density-form  (already correct)
  UniformPrior.log_prior_from_value:  0.0  (sign-agnostic, fine)
  LogUniformPrior.log_prior_from_value:  1.0 / value  (Jacobian gradient, not a log)

Adding a positive quadratic to log_likelihood turns the maximand into
log_lik - log_prior_density rather than log_lik + log_prior_density, so MCMC
samples drift AWAY from prior modes and LBFGS minimises in the wrong
direction. Empirically confirmed: Emcee + flat-likelihood + GaussianPrior(5,1)
diverges to ~10^146 instead of clustering at 5.0; LBFGS converges to 8e143
instead of 5.0. See autofit_workspace_test/scripts/prior_correctness/
{emcee,lbfgs}_gaussian_bias_check.py (added in the matching workspace_test PR).

What's fixed
------------
- NormalMessage.log_prior_from_value: returns -(value - mean)**2 / (2 sigma**2)
- LogGaussianPrior.log_prior_from_value: returns
  -(log value - mean)**2 / (2 sigma**2) - log(value) on both NumPy + JAX paths,
  matching the change-of-variables Jacobian convention
- LogUniformPrior.log_prior_from_value: returns -log(value); JAX path adds
  bounds-guard with xp.where(in_bounds, -xp.log(value), -xp.inf)

The -log(sigma * sqrt(2 * pi)) and -log(log(upper/lower)) normalisation
constants are dropped, mirroring UniformPrior dropping -log(b - a) to return
0.0 — these constants don't affect posterior shape and the existing
in-house convention is to omit them.

Unchanged (already correct)
---------------------------
- UniformPrior.log_prior_from_value (0.0, sign-agnostic)
- TruncatedNormalMessage.log_prior_from_value (density form with -inf out-of-bounds)
- Fitness._call / Fitness.call_wrap / Fitness._jit / Fitness._vmap
- The entire graphical / expectation-propagation framework (EP uses
  Message.logpdf directly, not Prior.log_prior_from_value — confirmed unaffected)

Test pin updates
----------------
- test_autofit/mapper/prior/test_prior.py — three updated pins to assert
  density-form values (LogUniform, Gaussian parametrize, LogGaussian)
- test_autofit/mapper/model/test_model_mapper.py — one updated pin for
  Model.log_prior_list_from_vector to assert density-form values

Migration warning
-----------------
Cached samples.csv from Emcee / Zeus / MLE-Drawer / LBFGS / BFGS fits with
ANY non-uniform prior (Gaussian, LogGaussian, TruncatedGaussian, LogUniform)
are BIASED and should be re-run. Nested-sampler chains (Dynesty / Nautilus)
are UNAFFECTED — those route priors through prior_transform and only the
stored log_prior column in samples.csv is wrong-signed; it is automatically
re-derived from parameters on next load.

Closes #1266

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pending-release PR queued for the next release build

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: log_prior_from_value sign-convention bug across Prior subclasses

1 participant