Add opt-in period aging for US dollar targets #280

Merged: MaxGhenis merged 1 commit into main from target-aging on Jul 2, 2026

@MaxGhenis

Motivation

Populace calibrates the US 2024 weights against SOI TY2022/TY2023 dollar levels applied at the 2024 build period un-aged. Because the calibration hits those source-year levels almost exactly, simulated current-year (2025+) income aggregates run systematically ~6-10% under current-year projections. This is a period-vintage artifact, not a support or weighting error.

Evidence from the #212 closing investigation (release populace-us-2024-sparse-l0-refit-57k-71a0887-national-only-20260701):

| target | calibrated value | vs current-year projection |
| --- | --- | --- |
| irs_soi.ty2022.historic_table_2.us.all.income_tax_liability_amount | $2.105T (+0.01% hit) | FY2025 receipts ~$2.4T |
| AGI (simulated 2025) | $16.0T | CBO TY2025 ~$17T+ |
| income tax (simulated 2025) | $2.15T | FY2025 receipts ~$2.4T |

The calibration is exact against the source-year levels, so the residual reflects the period of those levels rather than the quality of the fit. #116 raised this for the SOI EITC-by-AGI surface; the #212 comment generalized it to every SOI dollar target, to be implemented as Ledger target aging, the same declared consumer-side transform mechanism as the #205 geography-vintage crosswalk. This PR is the systematic results-level fix.

Closes the implementation gap flagged in #116 (generalized beyond EITC-by-AGI).

What this does

Adds a compile-time aging pass (populace.build.us_runtime.target_aging.age_us_dollar_targets) that scales dollar-amount targets from their source period to the build period using growth ratios computed from CBO revenue-projection facts already present in the Ledger consumer feed (cbo.revenue_projection.tyYYYY.income_by_source.<series>.projected_amount, CBO Feb 2026 Revenue Projections).

No numeric factor is ever hardcoded. Every factor is a ratio of two source-published CBO projected_amount facts, and every aged (or deliberately un-aged) target records the exact fact lineage it used.

Factor policy (priority order)

  1. Matching CBO series. A target whose measured concept maps to a CBO income-by-source row is aged by that series' own projection ratio projected_amount(series, build) / projected_amount(series, source). Mapped series: AGI, wages (wages_and_salaries), net capital gain, qualified dividends, net business income (Schedule C and partnership/S-corp both map here).
  2. CBO AGI default. Any other dollar amount (e.g. income tax liability, deductions, taxable interest) is aged by the CBO AGI projection ratio.
  3. Counts stay raw. Return/claim/population counts (measure_mode == "indicator_sum") are never aged — a growing nominal aggregate does not imply a growing return count.

A target is aged only when both the build-year and source-year projection facts of the chosen series are present in the feed. If the source-aligned series is missing one period, it falls back to the AGI series; if no usable CBO pair exists, the target is left at its raw source-year value with aging_factor_source="unavailable" so the un-aged state is explicit in diagnostics rather than silent (the #212 lesson: the failure was silent un-aged consumption). Rows already period-aligned within-surface by the existing SOI/EITC uprating passes (uprating_factor present) are excluded to avoid double counting.
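The priority order and fallback described above can be sketched roughly as follows. This is an illustration only, not the code in target_aging.py: the index shape, the "agi" series key, and the names `_ratio` and `choose_factor` are assumptions made for the example, and the numbers in the usage lines are invented.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical index: (series, tax_year) -> CBO projected_amount.
ProjectionIndex = Mapping[Tuple[str, int], float]

AGI_SERIES = "agi"  # assumed series key; the real feed's naming may differ


def _ratio(index: ProjectionIndex, series: str,
           source_year: int, build_year: int) -> Optional[float]:
    """Return build/source for one series, or None when the pair is unusable."""
    source = index.get((series, source_year))
    build = index.get((series, build_year))
    # Both facts must be present and positive for the ratio to be meaningful.
    if source is None or build is None or source <= 0 or build <= 0:
        return None
    return build / source


def choose_factor(index: ProjectionIndex, matched_series: Optional[str],
                  source_year: int, build_year: int) -> Tuple[float, str]:
    """Try the matching series first, then AGI; otherwise leave the target raw."""
    for series in (matched_series, AGI_SERIES):
        if series is None:
            continue
        factor = _ratio(index, series, source_year, build_year)
        if factor is not None:
            return factor, series
    return 1.0, "unavailable"


# Invented numbers: the wages series lacks its 2023 fact, so AGI is used.
index = {
    ("agi", 2023): 100.0,
    ("agi", 2025): 110.0,
    ("wages_and_salaries", 2025): 60.0,
}
factor, used = choose_factor(index, "wages_and_salaries", 2023, 2025)
```

In this sketch the second element of the returned pair is a series name; the PR itself records a CBO fact source_record_id or a skip reason in aging_factor_source.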

Per-target diagnostics fields

Every spec carries, after the pass:

  • basis: "projection" (aged) or "fact" (raw)
  • source_period: the fact's source year
  • aged_to: the build period
  • aging_factor: the applied ratio (1 when not aged)
  • aging_factor_source: the CBO fact source_record_id used as the factor numerator, or a skip reason (not_dollar_amount, not_usd_unit, already_period_aligned, source_equals_build, unavailable)
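One way to picture how these fields get filled is the sketch below. The field names and skip reasons come from the list above; the `AgingDiagnostics` class, the `skip_reason` and `diagnose` helpers, the "USD" unit string, and the order in which the checks run are assumptions for illustration, not the merged implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgingDiagnostics:
    basis: str                # "projection" when aged, "fact" when left raw
    source_period: int        # the fact's source year
    aged_to: int              # the build period
    aging_factor: float       # 1.0 when not aged
    aging_factor_source: str  # numerator fact id, or a skip reason


def skip_reason(measure_mode: str, unit: str, uprating_factor: Optional[float],
                source_period: int, build_period: int) -> Optional[str]:
    """Return the first applicable skip reason (assumed check order), else None."""
    if measure_mode != "sum":
        return "not_dollar_amount"
    if unit != "USD":  # assumed unit label
        return "not_usd_unit"
    if uprating_factor is not None:
        return "already_period_aligned"
    if source_period == build_period:
        return "source_equals_build"
    return None


def diagnose(measure_mode: str, unit: str, uprating_factor: Optional[float],
             source_period: int, build_period: int,
             factor: Optional[float],
             numerator_fact_id: Optional[str]) -> AgingDiagnostics:
    """Build the diagnostics record so that no path is silent."""
    reason = skip_reason(measure_mode, unit, uprating_factor,
                         source_period, build_period)
    if reason is None and (factor is None or numerator_fact_id is None):
        reason = "unavailable"
    if reason is not None:
        return AgingDiagnostics("fact", source_period, build_period, 1.0, reason)
    return AgingDiagnostics("projection", source_period, build_period,
                            factor, numerator_fact_id)


# A count row is never aged, whatever factor is on offer.
count_row = diagnose("indicator_sum", "count", None, 2022, 2024,
                     1.1, "hypothetical-cbo-fact-id")
```

Returning a record on every branch, including the skipped ones, is what keeps the un-aged state visible in diagnostics.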

Opt-in / inert by default

The pass is gated behind:

  • age_targets: bool = False on compile_us_fiscal_target_registry(...)
  • --age-targets on tools/build_us_fiscal_refresh_release.py (default off)

With aging off, the compiled target surface is byte-identical to today (registry content hash unchanged) — preserving build step-isolation. The change is inert until a build explicitly enables it.
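The inertness property can be illustrated with a toy registry and a content hash. Everything here is a stand-in: `compile_registry`, `content_hash`, the dict-shaped targets, and the 1.1 factor are hypothetical and do not reflect the signature or hashing scheme of compile_us_fiscal_target_registry.

```python
import hashlib
import json
from typing import Dict, List


def content_hash(targets: List[Dict]) -> str:
    """Hash a canonical JSON encoding of the targets (assumed hashing scheme)."""
    payload = json.dumps(targets, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compile_registry(targets: List[Dict], *, age_targets: bool = False,
                     factor: float = 1.0) -> List[Dict]:
    """Toy compile step: with the flag off, the input is returned untouched."""
    if not age_targets:
        return targets
    return [
        {**t, "value": t["value"] * factor, "basis": "projection"}
        for t in targets
    ]


baseline = [{"name": "agi", "value": 100.0, "basis": "fact"}]
off = compile_registry(baseline)
on = compile_registry(baseline, age_targets=True, factor=1.1)

same_when_off = content_hash(off) == content_hash(baseline)
differs_when_on = content_hash(on) != content_hash(baseline)
```

Returning the original object on the off path, rather than a rebuilt copy, makes the byte parity hold by construction instead of relying on the copy being faithful.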

Eventual schema home

The fact-vs-computed boundary belongs in PolicyEngine/ledger#71: Ledger stays a facts-only store (including source-published projections as facts), and PolicyEngine-computed aged levels live in Populace as a named, versioned aging implementation that consumes growth-factor facts from Ledger and emits its own lineage. This module is that Populace-side implementation; it references #116 (the concrete SOI case) and #212 (the generalization).

Files touched

  • packages/populace-build/src/populace/build/us_runtime/target_aging.py (new): the aging pass, factor policy, CBO projection indexing, diagnostics.
  • packages/populace-build/src/populace/build/us_runtime/fiscal_targets.py: age_targets param on compile_us_fiscal_target_registry, applied as the final nominal transform after within-surface alignment.
  • tools/build_us_fiscal_refresh_release.py: --age-targets driver flag, plumbed to the compile call.
  • packages/populace-build/tests/test_us_fiscal_targets.py: factor-derivation tests (matching series, AGI fallback, series-year fallback, counts raw, unavailable, source==build, double-age guard) + inert-off byte parity + a compile-level test asserting basis="projection" and the right factor.

Expected results direction when enabled

Dollar-amount SOI/CBO targets sourced from TY2022/TY2023 are scaled up toward current-year (build-period) levels via CBO projection growth (~7-14% depending on series and source-year gap; e.g. AGI TY2023→TY2025 ≈ 1.14). Counts are unchanged. When a build calibrates against the aged surface, simulated current-year income and tax aggregates should rise ~6-10% toward CBO/receipts projections, closing the residual documented in #212 while benefits (already matching) stay put.

Tests

uv run pytest packages/populace-build packages/populace-calibrate: 798 passed, 6 skipped (heavy frame suite excluded). New aging tests: 8 targeted cases, all green. Ruff clean.

🤖 Generated with Claude Code

Populace calibrates US 2024 weights against SOI TY2022/TY2023 dollar levels
applied at the 2024 build period un-aged. Because calibration hits those
source-year levels almost exactly, simulated current-year (2025+) income
aggregates run ~6-10% under current-year projections (#212,
#116: AGI $16.0T vs CBO TY2025 ~$17T+; income tax $2.15T vs FY2025 receipts
~$2.4T). This is a period-vintage artifact, not a support or weighting error.

Add a compile-time aging pass that scales dollar-amount targets from their
source period to the build period using growth ratios computed from CBO
revenue-projection facts already in the Ledger consumer feed. No numeric factor
is ever hardcoded: every factor is a ratio of two source-published CBO
projected_amount facts, and each aged (or deliberately un-aged) target records
its fact lineage.

Factor policy:
1. Matching CBO series: a target whose measured concept maps to a CBO
   income-by-source row (AGI, wages, net capital gain, qualified dividends,
   net business income) is aged by that series' own projection ratio.
2. CBO AGI default: any other dollar amount uses the CBO AGI projection ratio.
3. Counts stay raw: return/claim/population counts are never aged.

A target is aged only when both the build-year and source-year projection facts
of the chosen series are present; otherwise it is left raw with
aging_factor_source="unavailable" so the un-aged state is explicit rather than
silent. Rows already period-aligned within-surface (uprating_factor present)
are excluded to avoid double counting.

Per-target diagnostics: basis (fact/projection), source_period, aged_to,
aging_factor, aging_factor_source.

The pass is opt-in via a new age_targets flag on
compile_us_fiscal_target_registry (default False) and a --age-targets driver
flag on tools/build_us_fiscal_refresh_release.py. With aging off the compiled
surface is byte-identical to today, preserving build step-isolation until a
build enables it.

The eventual fact-vs-computed boundary is PolicyEngine/ledger#71: facts-only
store, with PolicyEngine-computed aged levels living in Populace as a named,
versioned aging implementation consuming growth-factor facts from Ledger. This
module is that Populace-side implementation.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MaxGhenis

Adversarial review (independent agent, full diff + empirical verification scripts): MERGE-SAFE.

Attack angles checked and outcome:

  1. Double-aging — refuted: all three within-surface uprating passes stamp uprating_factor, which aging skips (already_period_aligned); the one pre-scaling path without it (CD→state hierarchy reconciliation) is same-period geographic rescaling, and the composition raw × geo × age was verified numerically correct with both metadata trails recorded.
  2. Ratio direction / period parsing — refuted: factor is build/source with zero/negative guards both sides; _period_year fails safe to unavailable on all 40 adversarial labels tested (the filing_season_week47 token never reaches period.value).
  3. Series mapping — refuted: explicit maps correct; AGI-default for deductions/credits/income tax is documented; program benefit totals would default to AGI growth but are current-period in production (no-op today) — noted as future polish.
  4. Count leakage (structurally impossible): measure_mode != "sum" → not_dollar_amount; verified zero non-sum rows aged.
  5. Inertness — holds by construction: age_targets=False returns the untouched apply_ledger_target_profile(...) object.
  6. Signed targets — sign preserved; positive factor can never flip it.
  7. Diagnostics — every path records an explicit aging_factor_source or skip reason; zero silent skips.
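The fail-safe period parsing mentioned in item 2 might look something like the sketch below. The real _period_year is not shown in this PR text, so the accepted label shapes (a bare four-digit year, optionally prefixed with "ty" or "fy") are assumptions. This sketch also applies the 1900 to 2100 bound to both the integer and string branches, which the review lists as not yet done for strings.

```python
import re
from typing import Optional

# Assumed label shapes: "2023", "ty2023", "FY2025". Anything else is rejected.
_YEAR_RE = re.compile(r"^(?:ty|fy)?(\d{4})$", re.IGNORECASE)


def period_year(value: object) -> Optional[int]:
    """Return a plausible year, or None so the caller records "unavailable"."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; never treat it as a year.
        return None
    if isinstance(value, int):
        year = value
    elif isinstance(value, str):
        match = _YEAR_RE.match(value.strip())
        if match is None:
            return None
        year = int(match.group(1))
    else:
        return None
    # Sanity bound applied to every branch in this sketch.
    return year if 1900 <= year <= 2100 else None


labels = ["ty2023", 2024, "filing_season_week47", "9999", None, True]
parsed = [period_year(label) for label in labels]
```

Returning None instead of raising lets the aging pass downgrade a malformed period to an explicit unavailable entry rather than failing the build.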

Non-blocking polish filed as a follow-up issue: (a) strengthen the inertness test to compare against a frozen pre-PR registry hash (current test compares two off-paths); (b) record the denominator (source-year) fact id alongside the numerator for full ratio provenance; (c) apply the 1900–2100 sanity bound to the string branch of _period_year; (d) a benefits/COLA index for program totals if they ever become cross-period.

83/83 tests pass in the aging test file; 798 pass across populace-build + populace-calibrate.

@MaxGhenis MaxGhenis merged commit 9f63a51 into main Jul 2, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the target-aging branch July 2, 2026 12:00