
Consume pinned Ledger artifacts and enforce the period contract #287

Open
MaxGhenis wants to merge 4 commits into main from ledger-artifact-period-contract


MaxGhenis commented Jul 2, 2026

Contributor

Summary

The populace side of the Ledger facts-only transition (PolicyEngine/ledger#71, shipped in PolicyEngine/ledger#73). Builds now consume pinned, hash-verified Ledger artifacts, and consuming an observation dollar level at a period other than its fact period is a hard error instead of a silent default.

Pinned artifact consumption (#160, #271). New populace.build.ledger_artifact (duck-typed, stdlib-only, like the rest of our Ledger consumption): load_ledger_consumer_artifact loads a Ledger consumer-artifact directory (manifest.json + consumer_facts.jsonl), verifies the fact file against the manifest hash, honors optional --ledger-facts-sha256 / --ledger-manifest-sha256 build-config pins, validates row assertions, and exposes a provenance() block that the release tool now records in both build_manifest.json and release_manifest.json — a release's target values are reproducible from a named Ledger artifact. Bare consumer-facts files still load (content-addressed by their own hash) so nothing breaks today.
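
A minimal sketch of the verify-then-load flow described above, not the actual `load_ledger_consumer_artifact` implementation. The function name `load_artifact_sketch`, the `facts_pin` parameter, the `ArtifactHashError` class, the `facts_sha256` manifest key, and the shape of the returned provenance block are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


class ArtifactHashError(ValueError):
    """Hypothetical error: a file's SHA-256 does not match the expected digest."""


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_artifact_sketch(directory, facts_pin=None):
    """Load manifest.json + consumer_facts.jsonl and verify the fact file.

    Assumption: the manifest stores the fact-file digest under "facts_sha256".
    `facts_pin` stands in for an optional build-config pin.
    """
    directory = Path(directory)
    manifest = json.loads((directory / "manifest.json").read_text())
    facts_path = directory / "consumer_facts.jsonl"

    actual = sha256_of(facts_path)
    if actual != manifest["facts_sha256"]:
        raise ArtifactHashError("fact file does not match the manifest hash")
    if facts_pin is not None and actual != facts_pin:
        raise ArtifactHashError("fact file does not match the build-config pin")

    # Rows are parsed only after the bytes have been verified.
    rows = [
        json.loads(line)
        for line in facts_path.read_text().splitlines()
        if line.strip()
    ]
    return {"rows": rows, "provenance": {"facts_sha256": actual}}
```

Verifying the raw bytes before parsing means a tampered or truncated fact file fails before any row reaches the target compiler.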

Period-contract enforcement (the populace#212 guard). compile_us_fiscal_target_registry now always ends with enforce_period_contract: an ageable observation dollar target (USD sum, not already period-aligned within-surface) whose fact period differs from the build period raises PeriodContractError naming every violation — unless aging transformed it or the build passes allow_unaged_dollar_targets=True, which stamps period_contract_waiver metadata on each affected target so the waived state is auditable in diagnostics. Publisher projections consumed at their published level (JCT/CBO-backed targets) are exempt by assertion; that's projection-basis by construction, not un-aged consumption.
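
The guard can be pictured roughly as follows. This is a simplified sketch, not the code in this PR: targets are plain dicts, the keys `name`, `unit`, `assertion`, `fact_period`, `aged`, and `metadata` are hypothetical, the exception class is a local stand-in, and the "already period-aligned within-surface" exemption is omitted.

```python
class PeriodContractError(ValueError):
    """Local stand-in for the error raised on un-aged cross-period dollars."""


def enforce_period_contract_sketch(
    targets, build_period, allow_unaged_dollar_targets=False
):
    """Return checked copies of `targets`, or raise naming every violation."""
    checked, violations = [], []
    for target in targets:
        target = dict(target)  # shallow copy so the caller's dicts are not mutated
        # Only observation dollar levels are subject to the contract here;
        # counts and publisher projections pass through untouched.
        ageable = target["unit"] == "USD" and target["assertion"] == "observation"
        misaligned = (
            target["fact_period"] != build_period
            and not target.get("aged", False)
        )
        if ageable and misaligned:
            if allow_unaged_dollar_targets:
                # Stamp the waiver so the waived state is visible downstream.
                target["metadata"] = {
                    **target.get("metadata", {}),
                    "period_contract_waiver": True,
                }
            else:
                violations.append(
                    f"{target['name']} (fact period {target['fact_period']})"
                )
        checked.append(target)
    if violations:
        raise PeriodContractError(
            f"un-aged dollar targets at build period {build_period}: "
            + ", ".join(violations)
        )
    return checked
```

Collecting all violations before raising mirrors the "naming every violation" behavior: one failed build reports the whole set rather than the first offender.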

Aging is the named, versioned alignment model. AGING_MODEL_ID = "cbo_growth_factor_aging" / AGING_MODEL_VERSION = "1.0.0"; every aged target records alignment_model_id/alignment_model_version alongside the existing basis/aging_factor/factor-lineage metadata — the declaration shape Ledger records but never executes. Bumping the model version changes no Ledger fact and no artifact hash, only build outputs. The release tool ages by default now (--no-age-targets to disable, in which case the contract requires the explicit waiver flag); #116's SOI dollar targets are the first user.

Assertion-aware consumption (ledger#73). Consumer rows' assertion and fact period flow into ledger_assertion/ledger_fact_period target metadata and reference selectors ({"assertion": "source_projection"} now selects). CBO rows that look like projections structurally but are typed observation are refused as growth factors — inconsistent data must not supply ratios.
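
One way to read the growth-factor refusal rule, as a hedged sketch: the row keys `period` and `published_year`, the structural test itself, and the function name are assumptions for illustration; only the `assertion` field and its `observation` / `source_projection` values come from this description.

```python
def usable_as_growth_factor(row):
    """Decide whether a CBO row may supply an aging ratio (illustrative only).

    Assumption: a row "looks like a projection" when its period is not
    earlier than the year it was published.
    """
    looks_projected = row["period"] >= row["published_year"]
    if not looks_projected:
        return False
    # A projection-shaped row explicitly typed as an observation is
    # inconsistent data, so it is refused. Rows typed source_projection,
    # and legacy rows with no assertion at all, remain eligible.
    return row.get("assertion") != "observation"
```

Under this reading, a missing assertion is treated differently from a contradictory one: only an explicit `observation` label on a projection-shaped row disqualifies it.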

Behavior changes to review:

  1. compile_us_fiscal_target_registry raises on silently un-aged cross-period observation dollars (waiver available and auditable).
  2. tools/build_us_fiscal_refresh_release.py defaults to --age-targets (was off).
  3. The bare-JSONL loader in the release tool is replaced by the artifact loader; --ledger-facts accepts a directory or a file.

Existing tests that compile TY2022/23 fixture surfaces at 2024/2025 un-aged now declare the explicit waiver (second commit) — the tested behavior is unchanged, the waived state visible. A new compile-level test asserts the unwaived path raises.

Test plan

  • uv run ruff check . and uv run pytest -q (full workspace) both pass locally.
  • New: test_ledger_artifact.py (hash verification, tamper detection, pins, assertion validation, provenance); period-contract unit + compile-level tests; alignment-declaration metadata; assertion selector/metadata; refused mistyped growth factors.

Refs #116, #160, #212, #271; PolicyEngine/ledger#71, PolicyEngine/ledger#73.

🤖 Generated with Claude Code

Measured implications (real feed)

Measured on a real Ledger bundle feed (105,216 consumer-fact rows built from fetched 2022/2023/2024 publisher artifacts, facts_sha256=f6fa6152…), compiling the identical US surface at the 2024 build period, un-aged vs the new default:

  • 5,490 targets total: 1,480 aged, 0 waived, 4,010 untouched (counts, same-period rows, and current-year filing-season anchors).
  • Aged-dollar surface: $184.0T → $202.7T (+10.2%) across 1,480 dollar targets. Since calibration hits targets near-exactly (see #212, "Cross-check: Populace US 2024 net income & tax ~37–43% below Enhanced CPS (benefits match)"), these are the deltas the next build's aggregates inherit.
  • Factors in play: ×1.1203 on 1,400 rows (TY2022 → 2024, chained: observed SOI AGI growth 1.0305 × CBO projected 1.0872), ×1.1062 on 52 wage rows, ×1.0872 direct on 26 national TY2023 rows, ×1.3154 on net capital gain (2023 crash → projected recovery).
  • Headline rows: national AGI $15.29T → $16.62T; income tax liability $2.11T → $2.36T; total itemized deductions $691B → $751B; state AGI (CA $1.99T → $2.23T) consistently scaled.
  • Coherence check: the TY2022 US AGI row ages to $16.56T and the TY2023 national path to $16.62T — independent vintages converge within 0.35% on the same build-year level.

The measurement also drove two fixes now in this PR / ledger: model v1.1.0 chaining (v1.0.0 aged only 28 national rows because the CBO detail starts at TY2023 — all 1,480 state TY2022 rows would have needed waivers, an incoherent joint surface), and the loader no longer stamps default assertions onto legacy rows (the stamp made unlabeled CBO projections look like mistyped observations and zeroed out factor eligibility). PolicyEngine/ledger#76 types the CBO projection record sets at the source.

MaxGhenis and others added 2 commits July 2, 2026 16:05
Loads Ledger consumer artifacts (manifest.json + consumer_facts.jsonl)
with hash verification and optional build-config pins, and records the
artifact identity in both build and release manifests, so a release's
target values are reproducible from a named Ledger artifact (#160,
#271). Bare consumer-facts files still load, content-addressed by their
own hash.

The period contract (PolicyEngine/ledger#71) is now enforced at compile
time: an ageable observation dollar level whose fact period differs
from the build period raises PeriodContractError instead of silently
calibrating at the wrong period — the populace#212 failure mode.
Builds either age the value under the named cbo_growth_factor_aging
model (now the release-tool default, with --no-age-targets to disable),
or pass an explicit allow_unaged_dollar_targets waiver that stamps
period_contract_waiver metadata on every affected target. Aged targets
carry alignment_model_id/alignment_model_version alongside the existing
factor lineage, publisher projections consumed at their published level
are exempt by assertion, and CBO rows typed as observations are refused
as growth factors. Consumer rows' assertion and fact period now flow
into ledger_* target metadata and reference selectors.

Refs #116, #160, #212, #271; PolicyEngine/ledger#71, PolicyEngine/ledger#73.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Existing tests compile 2024/2025 surfaces from TY2022/23 fixture facts
un-aged, which is exactly the state the contract no longer accepts silently. Each
compile call declares the explicit allow_unaged_dollar_targets waiver
so the tested behavior is unchanged and the waived state is visible,
and the release-builder test stubs the artifact loader instead of the
retired bare-JSONL helper. A new compile-level test asserts the
unwaived path raises.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MaxGhenis and others added 2 commits July 2, 2026 16:58
The artifact loader validated row assertions by writing the observation
default onto rows that omit the field. Measured against a real bundle
feed, that broke aging: pre-#73 CBO projection rows arrive unlabeled,
the stamp typed them as observations, and the growth-factor filter then
refused them — zero targets aged. Rows now pass through exactly as
published; an assertion is validated when present and never fabricated.
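
A small sketch of the "validate when present, never fabricate" rule described in this commit. The function name and the `KNOWN_ASSERTIONS` set are hypothetical; the set lists only the two assertion values named in this PR and may be incomplete.

```python
# Assumption: only the two assertion values mentioned in this PR are listed.
KNOWN_ASSERTIONS = frozenset({"observation", "source_projection"})


def validate_row_assertion(row):
    """Validate an assertion if the row carries one; never add a default."""
    if "assertion" in row and row["assertion"] not in KNOWN_ASSERTIONS:
        raise ValueError(f"unknown assertion: {row['assertion']!r}")
    # The row is returned exactly as published, so an unlabeled legacy row
    # stays unlabeled instead of being typed as an observation.
    return row
```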

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Measured against a real three-year Ledger bundle feed, v1.0.0 aged only
the 28 national TY2023 dollar targets: the CBO February 2026 projection
detail starts at TY2023, so all 1,480 TY2022 state rows had no
denominator and fell to period-contract waivers — an incoherent joint
surface (national anchors at 2024 levels, states at 2022).

cbo_growth_factor_aging v1.1.0 bridges years the projection detail does
not cover with the same series' national SOI actuals: factor =
soi(pivot)/soi(source) x cbo(build)/cbo(pivot), observed growth for
observed years and projected growth only where nothing is observed.
Chain sources are an explicit map (AGI via Table 1.1, wages and net
capital gain via Table 1.4); everything else keeps the AGI default,
now also chainable. Chained factors record both bridge facts in their
lineage.
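
The chained factor above can be sketched as a short function. This is an illustration under stated assumptions, not the v1.1.0 code: the function name and signature are hypothetical, and the series below are normalized index levels chosen only so that their ratios equal the growth rates quoted in the PR description (1.0305 observed, 1.0872 projected), not real SOI or CBO dollar levels.

```python
def chained_aging_factor(soi, cbo, source, pivot, build):
    """factor = soi(pivot)/soi(source) x cbo(build)/cbo(pivot).

    `soi` and `cbo` map a year to a level of the same series. `pivot` is the
    first year the projection detail covers. If the source year is already
    covered, the factor is the direct projected ratio.
    """
    if source >= pivot:
        return cbo[build] / cbo[source]
    observed_growth = soi[pivot] / soi[source]   # observed years
    projected_growth = cbo[build] / cbo[pivot]   # projected years only
    return observed_growth * projected_growth


# Normalized index levels (illustrative, not actual dollar amounts).
soi_index = {2022: 1.0, 2023: 1.0305}
cbo_index = {2023: 1.0, 2024: 1.0872}

chained = chained_aging_factor(soi_index, cbo_index, source=2022, pivot=2023, build=2024)
direct = chained_aging_factor(soi_index, cbo_index, source=2023, pivot=2023, build=2024)
```

With these rounded inputs the chained value comes out at about 1.120, consistent with the reported ×1.1203 up to rounding of the inputs, and the direct value is the ×1.0872 applied to TY2023 rows.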

On the same feed the full surface now ages: 1,480 targets, zero
waivers, aged-dollar total $184.0T -> $202.7T (+10.2%); the TY2022
US AGI row lands at $16.56T against $16.62T from the TY2023 national
path — independent vintages converging on the same build-year level.

Refs #116, #212; PolicyEngine/ledger#71.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>