Postmortem: zero SPM thresholds in Enhanced CPS builds #975

MaxGhenis (Maintainer) · 2026-05-14T12:28:00Z

Scope

This postmortem covers only the zero-dollar SPM threshold failure at the policyengine-us / policyengine-us-data boundary. It does not cover EITC calibration targets, fiscal estimates, Social Security data, or the history of downstream proposal apps.

Summary

We moved SPM thresholds from stored data values into policyengine-us formulas. That was the right long-term direction: policyengine-us-data should emit non-formulaic leaf inputs, and policyengine-us should calculate deterministic outputs like thresholds and poverty status.

We created a base-year failure mode during that migration. The first threshold formula inferred a geographic adjustment from the prior-year stored threshold. A single-year 2024 Enhanced CPS artifact did not contain a 2023 threshold, so the formula read the missing prior-year value as zero and calculated a zero geographic adjustment. That zero flowed into the 2024 SPM threshold.

At the same time, policyengine-us-data had a generic formula-variable pruning step. Once policyengine-us classified spm_unit_spm_threshold as formula-backed, the exporter could drop the stored threshold that had previously served as the base-year anchor. We therefore removed the old fallback at the same time that the new formula still depended on it.

A zero threshold collapses poverty and deep poverty into the same test:

poverty: resources < threshold
deep poverty: resources < threshold / 2
if threshold is 0, both become resources < 0

The visible symptom was poverty equaling deep poverty.
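The collapse is easy to reproduce with a few made-up resource values. The sketch below is illustrative only: `poverty_flags` and the dollar amounts are assumptions for this example, not code or data from either repository.

```python
def poverty_flags(resources, threshold):
    """Return (in_poverty, in_deep_poverty) for one SPM unit."""
    return resources < threshold, resources < threshold / 2


# Hypothetical SPM unit resources in dollars.
resources = [-500, 12_000, 25_000, 60_000]

# With a positive threshold, the two flags can differ (the 25,000 unit is
# poor but not deeply poor against a 35,000 threshold).
healthy = [poverty_flags(r, 35_000) for r in resources]

# With a zero threshold, both flags reduce to `resources < 0`.
collapsed = [poverty_flags(r, 0) for r in resources]

print(healthy)
print(collapsed)
```

With a zero threshold, only the unit with negative resources is flagged, and it is flagged as both poor and deeply poor, which is why the two rates coincide.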

What Should Have Happened

policyengine-us should calculate SPM thresholds from current-year leaf inputs:

SPM unit adult and child composition
SPM tenure type
geography or geographic adjustment inputs sufficient to calculate the adjustment
Census/BLS reference thresholds and equivalence scale from spm-calculator

policyengine-us-data should emit those leaf inputs only. It should not persist formula outputs such as spm_unit_spm_threshold, person_in_poverty, in_poverty, or in_deep_poverty.

How We Created The Failure

1. We treated the threshold as a data input

Before the migration, policyengine-us defined spm_unit_spm_threshold as an input with CPI-U uprating. policyengine-us-data calculated thresholds from CPS SPM fields and geographic adjustments, then stored them in the H5. This duplicated logic, but it worked because the model treated the stored threshold as a data input.

2. We moved threshold logic into a formula that needed a prior-year anchor

policyengine-us#8020 replaced the input/uprating behavior with a spm-calculator formula. The formula tried to preserve local geography by backing out an implied geographic adjustment from the prior-year threshold:

prior_threshold = spm_unit("spm_unit_spm_threshold", period.last_year)
geoadj = prior_threshold / (prior_base * prior_equiv_scale)
return current_base * current_equiv_scale * geoadj

This passed tests that supplied a prior-year threshold. It failed on a base-year-only artifact. On the 2024 Enhanced CPS, the formula requested a 2023 threshold, found no stored 2023 value, and received the variable default of zero. It then treated 0 / denominator as a genuine geographic adjustment of zero.

The test suite missed the base-year case. It tested transitions with explicit prior-year thresholds, not a single-year dataset with no prior-year threshold input.
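To make the two cases concrete, here is a minimal standalone sketch of the same arithmetic. It is not the policyengine-us formula: `threshold_with_backout` is a hypothetical plain-Python stand-in, and the reference thresholds and equivalence scale are made-up numbers.

```python
def threshold_with_backout(prior_threshold, prior_base, prior_equiv_scale,
                           current_base, current_equiv_scale):
    """Back out an implied geographic adjustment from the prior-year threshold."""
    geoadj = prior_threshold / (prior_base * prior_equiv_scale)
    return current_base * current_equiv_scale * geoadj


# Hypothetical reference thresholds and equivalence scale, for illustration only.
PRIOR_BASE, CURRENT_BASE, EQUIV_SCALE = 30_000.0, 31_000.0, 1.0

# Multi-year test setup: a 2023 threshold is supplied, so the backout works.
with_prior = threshold_with_backout(
    33_000.0, PRIOR_BASE, EQUIV_SCALE, CURRENT_BASE, EQUIV_SCALE
)

# Base-year-only artifact: the missing 2023 threshold arrives as the default 0.0.
base_year_only = threshold_with_backout(
    0.0, PRIOR_BASE, EQUIV_SCALE, CURRENT_BASE, EQUIV_SCALE
)

print(round(with_prior, 2))
print(base_year_only)
```

The first call yields a threshold of roughly 34,100. The second yields exactly zero without raising anything, because the division is well defined and nothing distinguishes a missing value from a real zero.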

3. We let generic formula pruning remove the old fallback

policyengine-us-data#554 introduced _drop_formula_variables, which removed variables that policyengine-us computed with formulas, adds, or subtracts. When #554 landed, spm_unit_spm_threshold was still input-backed, so the exporter kept it.

After #8020, the exporter saw spm_unit_spm_threshold as formula-backed and dropped it. That removed the stored threshold that could have masked the missing prior-year anchor.
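The coupling can be sketched in a few lines. This is a simplified, hypothetical stand-in for a generic pruning step, not the actual `_drop_formula_variables` implementation; the kind labels, the dictionary layout, and the sample values are assumptions.

```python
def drop_formula_variables(dataset, variable_kinds):
    """Keep only the variables the model currently classifies as inputs."""
    return {
        name: values
        for name, values in dataset.items()
        if variable_kinds.get(name) == "input"
    }


# Hypothetical dataset contents.
dataset = {
    "spm_unit_spm_threshold": [34_000.0, 41_500.0],
    "employment_income": [52_000.0, 18_000.0],
}

# Model classification before and after the formula migration.
kinds_before = {"spm_unit_spm_threshold": "input", "employment_income": "input"}
kinds_after = {**kinds_before, "spm_unit_spm_threshold": "formula"}

exported_before = drop_formula_variables(dataset, kinds_before)
exported_after = drop_formula_variables(dataset, kinds_after)

print(sorted(exported_before))
print(sorted(exported_after))
```

The exporter code is identical in both runs. Only the model's classification changed, and the stored threshold disappeared from the export as a side effect.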

We also had a second design problem nearby: the CPS-only QRF-imputed list included spm_unit_spm_threshold. Thresholds are deterministic functions of geography, tenure, and composition, so we should never donor-impute them onto the clone half.

4. We let missing structural data become a plausible zero

The calculation did not fail when the prior-year threshold was absent. It returned zero and continued. For a poverty threshold, zero is not a plausible structural default. The model should either calculate the threshold from current-year leaf inputs or fail closed.

Timeline

2025-12-19: us-data starts constructing SPM thresholds

policyengine-us-data#453 added spm-calculator use in data construction. The pipeline still treated thresholds as data outputs and model inputs.

2026-03-04: us-data adds formula-variable pruning

policyengine-us-data#554 added _drop_formula_variables. This made sense as a space and contract cleanup, but it inferred export intent from the current policyengine-us variable kind. Later model-side changes could therefore change what us-data exported.

2026-04-17: policyengine-us migrates thresholds to formulas

policyengine-us#8020 made spm_unit_spm_threshold formulaic and inferred geography from the prior-year stored threshold. This introduced the zero-anchor failure mode for single-year datasets.

2026-05-05: us-data lands a tactical fix

policyengine-us-data#903 moved thresholds out of QRF imputation, recalculated them from assigned geography near the end of the Extended CPS build, and temporarily preserved spm_unit_spm_threshold through formula pruning.

That fixed the immediate artifact, but it still preserved a formulaic output in us-data as a transition step.

2026-05-07: us-data tightens threshold and clone handling

policyengine-us-data#917 aligned threshold and reference-threshold math and tightened clone priors. This improved the transitional data-side implementation but did not yet finish the ownership migration.

2026-05-07 to 2026-05-08: we move ownership to model formulas

policyengine-us#8246 introduced spm_unit_geographic_adjustment and made thresholds use current-year composition, tenure, and geographic adjustment.

policyengine-us-data#918 stopped materializing SPM thresholds in us-data and stored geographic adjustments instead.

policyengine-us#8249 removed the prior-year threshold backout and calculated geographic adjustment from current raw geography and tenure via spm-calculator.

policyengine-us-data#924 removed the remaining ACS SPM threshold materialization.

2026-05-12: us-data explicitly excludes formulaic SPM outputs

policyengine-us-data#954 centralized formulaic SPM outputs that us-data should not persist:

person_in_poverty
in_poverty
in_deep_poverty
spm_unit_is_in_spm_poverty
spm_unit_is_in_deep_spm_poverty
spm_unit_spm_threshold
spm_unit_geographic_adjustment

This made the SPM export rule explicit instead of relying only on broad formula detection.

How We Fixed It

We fixed the incident in layers:

We stabilized the artifact by recalculating thresholds from assigned geography and temporarily preserving them through pruning (Assign legacy CPS blocks and SPM thresholds by geography #903).
We stopped QRF-imputing thresholds, because thresholds are deterministic (Assign legacy CPS blocks and SPM thresholds by geography #903).
We tightened threshold/geography math and clone handling (Align SPM thresholds and clone priors #917).
We moved threshold ownership to policyengine-us, using current composition, tenure, and geographic adjustment (policyengine-us#8246).
We removed the prior-year threshold backout and calculated geographic adjustment from current geography and tenure (policyengine-us#8249).
We stopped persisting formulaic SPM outputs in us-data (Remove ACS SPM threshold materialization #924, Drop formulaic SPM outputs from exported H5s #954).

Why Tests Missed It

The old tests verified the intended formula behavior under a favorable setup:

They supplied a prior-year threshold.
They tested future-year transitions, not a base-year-only dataset.
They did not assert that 2024 thresholds stay positive when only 2024 inputs exist.
They did not run a generated-H5 smoke test that calculated SPM poverty variables from the saved artifact.
They did not detect that the exporter silently dropped variables once policyengine-us reclassified them as formula-backed.

The missing test was not "does spm-calculator return the right Census threshold?" It was "does the final one-year Enhanced CPS artifact plus the locked model package calculate positive thresholds and distinct poverty/deep-poverty rates?"

What We Already Hardened

policyengine-us now tests current-year geographic adjustment and, after #8249, includes a regression test confirming that prior-year stored thresholds no longer determine the geographic adjustment.
policyengine-us-data now explicitly drops formulaic SPM outputs through FORMULAIC_SPM_INPUTS_TO_DROP.
Current us-data main no longer writes spm_unit_spm_threshold from CPS SPM fields in add_spm_variables.
Current SPM threshold calculation no longer depends on a prior-year stored threshold.
policyengine-us-data#974 now proposes a fail-closed computed-export contract, so this class of formula-pruning break should fail the build instead of silently changing the final H5.

Hardening We Should Add Next

P0: generated-artifact SPM smoke test

After generating an Enhanced CPS artifact, instantiate a microsimulation from the saved H5 and calculate SPM variables from that artifact, not from intermediate dataframes.

Recommended checks, using MicroSeries operations rather than manual weights:

threshold = sim.calc("spm_unit_spm_threshold", period=2024)
assert (threshold <= 0).mean() < 0.001
assert 15_000 < threshold.mean() < 80_000

poverty = sim.calc("person_in_poverty", period=2024, map_to="person")
deep = sim.calc("in_deep_poverty", period=2024, map_to="person")
assert deep.mean() < poverty.mean()
assert abs(deep.mean() - poverty.mean()) > 0.01

The tolerances should stay broad. The test should catch collapse and impossible thresholds, not pin the model to one baseline.

P0: cross-repo variable-kind contract check

When policyengine-us-data bumps its locked policyengine-us version, CI should check every variable that us-data exports or uses as a construction anchor. If a variable changes from input/uprated input to formula/adds/subtracts, CI should fail unless us-data deliberately stopped exporting it or explicitly marked it as construction-only.

policyengine-us-data#974 covers the final-export side of this check. We should still keep a lock-bump-oriented contract check so the failure appears as close as possible to the model dependency update.
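One possible shape for the lock-bump check is sketched below. Everything here is an assumption for illustration: the kind labels, the function name `variable_kind_regressions`, and the idea that CI can obtain a name-to-kind mapping for the old and new model versions.

```python
INPUT_KINDS = {"input", "uprated_input"}
COMPUTED_KINDS = {"formula", "adds", "subtracts"}


def variable_kind_regressions(old_kinds, new_kinds, exported,
                              construction_only=frozenset()):
    """List exported variables that moved from input to computed in a model bump."""
    flagged = []
    for name in sorted(exported):
        if name in construction_only:
            continue  # explicitly marked as not part of the public export
        was_input = old_kinds.get(name) in INPUT_KINDS
        now_computed = new_kinds.get(name) in COMPUTED_KINDS
        if was_input and now_computed:
            flagged.append(name)
    return flagged


# Hypothetical classifications for the old and new locked model versions.
old_kinds = {"spm_unit_spm_threshold": "uprated_input", "employment_income": "input"}
new_kinds = {"spm_unit_spm_threshold": "formula", "employment_income": "input"}
exported = {"spm_unit_spm_threshold", "employment_income"}

flagged = variable_kind_regressions(old_kinds, new_kinds, exported)
acknowledged = variable_kind_regressions(
    old_kinds, new_kinds, exported, construction_only={"spm_unit_spm_threshold"}
)
print(flagged)
print(acknowledged)
```

CI would fail whenever the returned list is non-empty, so the reclassification has to be acknowledged in the same change that bumps the lock.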

P0: self-lag formula lint

policyengine-us should flag formulas that call the same variable in period.last_year. That pattern can be valid, but it should require:

a base-period fallback;
a no-prior-input unit test;
a single-year dataset test when microsimulation data uses the variable.

This would have caught the #8020 failure mode directly.
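A lint along these lines could be built on Python's standard `ast` module. The sketch below is a rough starting point under stated assumptions: it assumes the variable name equals the class name, it only inspects positional arguments, and `find_self_lag_calls` and `some_other_input` are hypothetical names.

```python
import ast


def find_self_lag_calls(source):
    """Return (class_name, line) for calls reading the class's own variable
    with a `<something>.last_year` period argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for call in ast.walk(node):
            if not isinstance(call, ast.Call) or not call.args:
                continue
            first = call.args[0]
            reads_self = isinstance(first, ast.Constant) and first.value == node.name
            uses_last_year = any(
                isinstance(arg, ast.Attribute) and arg.attr == "last_year"
                for arg in call.args[1:]
            )
            if reads_self and uses_last_year:
                findings.append((node.name, call.lineno))
    return findings


# The source is only parsed, never executed, so `Variable` need not exist.
EXAMPLE = '''
class spm_unit_spm_threshold(Variable):
    def formula(spm_unit, period, parameters):
        prior = spm_unit("spm_unit_spm_threshold", period.last_year)
        other = spm_unit("some_other_input", period)
        return prior
'''

print(find_self_lag_calls(EXAMPLE))
```

A real lint would also need an allowlist mechanism, since each flagged formula can be legitimate once it has the fallback and tests listed above.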

P1: fail-closed structural inputs

For structural inputs like geography, tenure, and poverty thresholds, missing data should not silently become zero when it drives classification outcomes. We should either document a safe default, such as geographic adjustment 1.0, or raise a hard error.
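A minimal sketch of that rule, assuming a hypothetical helper named `resolve_structural_input` and treating `None` and NaN as the markers for missing data:

```python
import math


def resolve_structural_input(name, value, safe_default=None):
    """Return a usable structural input, or raise instead of falling back to zero."""
    missing = value is None or (isinstance(value, float) and math.isnan(value))
    if not missing:
        return value
    if safe_default is not None:
        return safe_default  # documented, deliberately chosen default
    raise ValueError(f"Missing structural input: {name}")


# A geographic adjustment has a documented neutral default of 1.0.
geoadj = resolve_structural_input(
    "spm_unit_geographic_adjustment", None, safe_default=1.0
)

# A poverty threshold has no safe default, so a missing value raises.
try:
    resolve_structural_input("spm_unit_spm_threshold", None)
    raised = False
except ValueError:
    raised = True

print(geoadj, raised)
```

The point of the design is that zero never appears as an implicit fallback: each structural input either names its default or stops the calculation.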

P1: published-artifact validation before promotion

We should run the artifact-level SPM smoke test against the exact H5 we plan to publish. Source-level tests are not enough because this failure depended on what survived export and how the locked model calculated against the saved file.

P1: explicit export manifests

For each data build surface, us-data should declare:

public leaf inputs to emit;
construction-only intermediates to drop;
formula outputs that must never be persisted.

Then CI should fail if the final artifact contains undeclared formula outputs or omits declared leaf inputs.
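As a rough sketch, such a manifest check could compare the variable names in the final artifact against the three declared groups. The manifest layout, the function name `check_export_manifest`, and the entries `donor_match_id` and `mystery_column` are hypothetical.

```python
def check_export_manifest(artifact_variables, manifest):
    """Compare the variables in a final artifact against the declared manifest."""
    artifact = set(artifact_variables)
    emit = set(manifest["leaf_inputs"])
    forbidden = set(manifest["construction_only"]) | set(manifest["never_persist"])
    return {
        "missing_leaf_inputs": sorted(emit - artifact),
        "forbidden_present": sorted(artifact & forbidden),
        "undeclared": sorted(artifact - emit - forbidden),
    }


# Hypothetical manifest for one build surface.
manifest = {
    "leaf_inputs": ["age", "employment_income"],
    "construction_only": ["donor_match_id"],
    "never_persist": ["spm_unit_spm_threshold", "in_poverty", "in_deep_poverty"],
}

# Hypothetical variable names read back from a generated H5.
artifact_variables = ["age", "spm_unit_spm_threshold", "mystery_column"]

report = check_export_manifest(artifact_variables, manifest)
print(report)
```

CI would fail if any of the three lists is non-empty, which covers both directions of the contract: nothing forbidden or undeclared gets in, and nothing declared goes missing.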

Broader Lessons

Formula migrations are cross-repo migrations when data artifacts depend on the variable. A model-side PR can break us-data even when all model unit tests pass.
Prior-year formulas need base-year tests. Any formula that uses period.last_year needs a no-prior-input path.
Generic pruning hides contract breaks. The exporter should declare intent explicitly instead of deriving intent from the current model implementation.
We need to validate final artifacts, not just pipeline internals. The H5 plus the locked model package is the product.
Missing structural data should not become zero. For poverty thresholds, zero almost always means the contract failed.

References

policyengine-us#8020 - migrated spm_unit_spm_threshold to a formula using prior-year threshold backout.
policyengine-us-data#554 - introduced formula-variable pruning in data export.
policyengine-us-data#903 - tactical fix: recalculate thresholds from assigned geography and keep through pruning.
policyengine-us-data#917 - aligned SPM thresholds and clone priors.
policyengine-us#8246 - introduced current-year geographic adjustment for threshold calculation.
policyengine-us-data#918 - stopped materializing thresholds in us-data.
policyengine-us#8249 - removed prior-year threshold backout; compute geoadj from current geography and tenure.
policyengine-us-data#924 - removed remaining ACS threshold materialization.
policyengine-us-data#954 - added the explicit formulaic SPM output drop list.
policyengine-us-data#974 - proposed fail-closed computed-export contract.
