Microplex path to replace eCPS
This is the operational plan for replacing PolicyEngine's Enhanced CPS
(enhanced_cps_2024.h5) with Microplex-produced datasets.
It separates two products that should share source semantics, target
compilation, scoring, and artifact discipline but should not share the same
runtime envelope:
- mp-national: a national default dataset small enough for normal
  PolicyEngine microsimulation.
- mp-local: a larger local-area pipeline for state, congressional district,
  and other subnational analyses.
The replacement claim is not "Microplex copies eCPS." The claim is that
Microplex becomes the canonical data construction path for PolicyEngine-US,
while policyengine-us remains the measurement and microsimulation runtime.
Current decision
Microplex should replace eCPS in stages:
1. Ship mp-national as a beta national H5 once it passes schema compatibility
   and beats a pinned latest eCPS baseline on the broad target suite.
2. Make mp-national the default national dataset only after the microsimulation
   benchmark suite, runtime gate, and rollback path pass.
3. Continue mp-local as a heavier local-calibration pipeline, with a sparse
   analysis artifact or geography shards, before replacing local eCPS/L0
   workflows.
This avoids blocking the national replacement on the harder local-scale
engineering problem.
Product split
| Product | Intended use | Size target | Main blocker | Replacement bar |
| --- | --- | --- | --- | --- |
| mp-national | Default national PolicyEngine microsims and public analyses | roughly current small ASEC + ACS100k scale | model quality and benchmark confidence | beats latest eCPS on broad loss and microsim benchmarks while remaining fast enough for routine use |
| mp-local | State, CD, and local-area analyses | larger ACS/local target coverage; may be sharded | local runtime, disk, and sparse output design | beats PE local L0 packages on local targets and can run practical local microsims |
User-facing routing:
- National default microsimulation: mp-national.
- National results broken out by state or demographic group: mp-national,
  unless the analysis explicitly asks for local-calibrated weights.
- State-calibrated analyses: mp-local-fast state shard when available;
  otherwise fall back to mp-national only with an explicit "not
  state-calibrated" label.
- Congressional district and smaller geographies: mp-local-fast geography
  shard when available. Do not silently use mp-national for official local
  claims.
- Research sweeps and target debugging: mp-local-rich.
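The routing rules above can be sketched as a small lookup function. This is only an illustration: the function name, the `kind` labels, and the choice to raise an error when no district shard exists are assumptions, not part of any existing PolicyEngine or Microplex API. It also assumes that a national breakout that asks for local-calibrated weights is routed like a state-calibrated analysis, which the plan does not spell out.

```python
from typing import Optional, Tuple


def route_dataset(
    kind: str,
    shard_available: bool = False,
    wants_local_weights: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (dataset name, required label or None) for an analysis.

    `kind` is a hypothetical label: "national", "national_breakout",
    "state_calibrated", "district_or_smaller", or "research".
    """
    if kind == "research":
        return "mp-local-rich", None
    if kind == "national":
        return "mp-national", None
    if kind == "national_breakout" and not wants_local_weights:
        return "mp-national", None
    if kind in ("national_breakout", "state_calibrated"):
        if shard_available:
            return "mp-local-fast", None
        # Fallback is allowed only with an explicit label.
        return "mp-national", "not state-calibrated"
    if kind == "district_or_smaller":
        if shard_available:
            return "mp-local-fast", None
        # Assumption: refusing is one way to avoid a silent fallback.
        raise LookupError("no mp-local-fast shard for this geography")
    raise ValueError(f"unknown analysis kind: {kind!r}")
```

Returning the label alongside the dataset name keeps the "not state-calibrated" warning attached to the result instead of relying on callers to remember it.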
Shared architecture
flowchart TD
A["Arch target registry"] --> B["Microplex source providers"]
B --> C["Source semantics and donor specs"]
C --> D["Donor integration and imputation"]
D --> E["Synthetic population spine"]
E --> F["Fixed spine append: Forbes and other must-keep records"]
E --> G["Relational add-ons: capital gains lots, assets, diagnostics"]
F --> H["Residualized GD/L0 calibration"]
G --> H
H --> I["PE-ingestable H5 export"]
I --> J["PolicyEngine target scoring"]
I --> K["Microsimulation benchmark suite"]
J --> L["Run dashboard and release gate"]
K --> L
Important design rule: fixed records, such as Forbes top-tail records, should be
added after ordinary population synthesis and excluded from donor fitting. Their
weighted target contributions should be subtracted from calibration targets so
nonfixed records are calibrated to the residual population.
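A minimal sketch of that residualization step, assuming each target is a weighted sum over records: the residual target is the original target minus the sum of weight times value over the fixed records. The record layout, field names, and numbers below are illustrative only.

```python
from typing import Dict, List


def residualize_targets(
    targets: Dict[str, float],
    fixed_records: List[Dict[str, float]],
) -> Dict[str, float]:
    """Subtract the weighted contribution of fixed records from each target."""
    residual = dict(targets)  # copy so the original targets stay untouched
    for record in fixed_records:
        weight = record["weight"]
        for name in residual:
            residual[name] -= weight * record.get(name, 0.0)
    return residual


# Illustrative numbers only, not real Forbes or target data.
targets = {"net_worth": 1000.0, "person_count": 100.0}
fixed = [
    {"weight": 1.0, "net_worth": 200.0, "person_count": 1.0},
    {"weight": 1.0, "net_worth": 100.0, "person_count": 1.0},
]
residual = residualize_targets(targets, fixed)
print(residual)  # {'net_worth': 700.0, 'person_count': 98.0}
```

The nonfixed records are then calibrated against `residual`, while the fixed records keep their weights unchanged.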
mp-national build path
mp-national should use the best currently known Microplex construction path:
- CPS ASEC as the core demographic and program scaffold.
- ACS100k to improve household, geography, housing, and local demographic
  support without making the national H5 too slow.
- PUF donor integration for filer/tax variables, including top-end wages,
  capital gains, business income, retirement income, itemized deductions, and
  filing concepts.
- SIPP, SCF, SSA-style, and other Arch-backed sources where they improve
  disability, SSI, assets, and program variables.
- Forbes fixed-spine append for ultra-high-wealth units, with target
  residualization rather than asking the regular population to absorb those
  aggregates.
- Capital-gains lots as a relational extension, not as a reason to flatten
  every possible record-level detail into the primary H5.
- Gradient-based calibration as the standard Microplex weight path. If
  l0_lambda=0 remains best for national quality, the implementation should
  disable hard-concrete gates rather than carrying a fake sparsity mechanism.
National release gates
mp-national can become the default national dataset when all of these are
true:
- Pinned baseline: every comparison records the eCPS H5 path, eCPS SHA256,
  policyengine-us-data commit, policyengine-us version, target DB path,
  and target DB SHA256. "Latest eCPS" means that pinned baseline, not a moving
  label.
- Schema compatibility: passes an automated H5 contract check against the
  current policyengine-us loader, including entity tables, IDs, joins,
  weights, periods, dtypes, missing-value conventions, and absence of
  source-dataset diagnostic variables.
- Target loss: broad target loss is lower than the pinned eCPS baseline on
  common kept targets.
- Protected target families: each protected family is no worse than eCPS by
  more than 5% relative loss or 0.005 absolute loss, whichever is larger.
  Protected families are SSI, SNAP, wages, self-employment income, capital
  gains, interest, dividends, retirement income, disability, and household net
  income.
- High-salience aggregates: absolute percentage error is at least as good as
  eCPS, or the dashboard marks and explains the regression, for SSI recipients,
  SSI value, SNAP recipients, SNAP value, wage income, long-term capital gains,
  taxable interest, ordinary dividends, self-employment income, and household
  net income.
- Microsimulation benchmarks: passes the fixed benchmark suite covering SSI
  asset limits, CTC/EITC-style tax reforms, capital-gains indexing, and the
  frozen external Tara Watson SSI asset-limit benchmark. Pass means no
  unexplained fiscal or household-net-income delta exceeding 5% of the eCPS
  estimate or $5 billion, whichever is larger.
- Runtime: median runtime over the fixed national benchmark suite is no more
  than 1.25x eCPS. A candidate can enter beta up to 2.0x eCPS, but cannot
  become default above 1.25x without an explicit product decision.
- Artifact size: national H5 is no more than 2x the eCPS H5 size unless the
  extra size is from a separately loadable relational extension.
- Artifacts: writes a complete immutable bundle containing config, source
  versions, target DB hash, score files, target deltas, record counts, nonzero
  weights, effective sample size, and benchmark outputs.
- Release contract: has a stable H5 publication path, rollback path, and a
  documented policyengine-us-data integration point.
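The numeric gates above share a "relative or absolute tolerance, whichever is larger" shape, which can be sketched in a few lines. The function names are hypothetical, and two readings are assumptions rather than settled definitions: the 5% relative tolerance is taken relative to the eCPS value, and the runtime ratio is computed as candidate median over eCPS median.

```python
from statistics import median
from typing import Sequence


def protected_family_ok(candidate_loss: float, ecps_loss: float,
                        rel_tol: float = 0.05, abs_tol: float = 0.005) -> bool:
    """True if the candidate is not worse than eCPS beyond the allowed slack."""
    allowed = max(rel_tol * ecps_loss, abs_tol)
    return candidate_loss - ecps_loss <= allowed


def benchmark_delta_ok(candidate_estimate: float, ecps_estimate: float,
                       rel_tol: float = 0.05, abs_tol: float = 5e9) -> bool:
    """True if the delta is within 5% of the eCPS estimate or $5 billion.

    A False result means the delta needs an explanation, not that it
    fails automatically.
    """
    allowed = max(rel_tol * abs(ecps_estimate), abs_tol)
    return abs(candidate_estimate - ecps_estimate) <= allowed


def runtime_stage(candidate_seconds: Sequence[float],
                  ecps_seconds: Sequence[float]) -> str:
    """Classify a candidate by its median runtime relative to eCPS."""
    ratio = median(candidate_seconds) / median(ecps_seconds)
    if ratio <= 1.25:
        return "default-eligible"
    if ratio <= 2.0:
        return "beta-only"
    return "fail"
```

For example, a candidate estimate of $54 billion against an eCPS estimate of $50 billion stays inside the tolerance, because the $5 billion floor is larger than 5% of $50 billion.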
mp-local build path
mp-local should not be forced into the same artifact shape as mp-national. Local accuracy needs more ACS/local target support, but routine
local microsims need sparse or sharded outputs.
Recommended path:
1. Build from the same source and semantic registry as mp-national.
2. Expand ACS and local targets incrementally rather than jumping straight to a
   full monolithic artifact.
3. Calibrate against local target suites from Arch: state, congressional
   district, age/race/household-type, income, benefits, disability, and program
   participation where defensible.
4. Produce two outputs:
   - mp-local-rich: best-fit, larger research artifact.
   - mp-local-fast: sparse or sharded analysis artifact for routine
     PolicyEngine use.
5. Prefer geography shards or post-fit sparse output over requiring every
   national run to materialize the entire local record universe.
Local release gates
mp-local can replace the PE local L0 pipeline when all of these are true:
- Pinned local baselines: every comparison records the PE local small-L0 and
  big-L0 artifact paths or package IDs, their source commits, weight files,
  target DB path and SHA256, objective definition, and the exact target subset
  used by the incumbent objective.
- Beats PE local small-L0 and big-L0 packages on their pinned actual objective.
- Beats those packages on Microplex's broader Arch target suite.
- Has explicit held-out target evaluation so local overfitting is visible.
- Produces a fast analysis artifact whose median single-geography benchmark
  runtime is no more than 2x the incumbent PE local artifact.
- Can rebuild mp-local-fast on the standard cloud runner in less than 12
  hours without manual cleanup, or has sharded build jobs whose slowest shard is
  below that bound.
- Has recoverable, profiled build stages for donor integration, PE table
  materialization, scoring, and export.
Held-out target evaluation should hold out complete target groups, not random
rows inside a target. The default split should include at least one geography
family and one income/program family so epoch tuning cannot overfit only the
headline national aggregates.
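A group-level split along those lines might look like the sketch below. The target names, the `group` and `kind` fields, and the two kind labels are hypothetical, chosen only to show that whole groups move together and that the split is rejected when a required family type is missing.

```python
from typing import Dict, List, Set, Tuple

Target = Dict[str, str]


def split_targets(
    targets: List[Target], held_out_groups: Set[str]
) -> Tuple[List[Target], List[Target]]:
    """Hold out complete target groups; never split rows inside a group."""
    held = [t for t in targets if t["group"] in held_out_groups]
    train = [t for t in targets if t["group"] not in held_out_groups]
    # Require at least one geography family and one income/program family.
    kinds = {t["kind"] for t in held}
    missing = {"geography", "income_or_program"} - kinds
    if missing:
        raise ValueError(f"held-out split lacks families: {sorted(missing)}")
    return train, held


# Illustrative targets only.
targets = [
    {"name": "state_population_CA", "group": "state_population", "kind": "geography"},
    {"name": "state_population_NY", "group": "state_population", "kind": "geography"},
    {"name": "snap_recipients", "group": "snap", "kind": "income_or_program"},
    {"name": "snap_value", "group": "snap", "kind": "income_or_program"},
    {"name": "wages_total", "group": "wages", "kind": "income_or_program"},
]
train, held = split_targets(targets, {"state_population", "snap"})
```

Because membership is decided by group, no group can appear on both sides of the split.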
Dashboard contract
The living dashboard should be the source of truth for replacement readiness.
It should show, at minimum:
- latest mp-national candidate versus latest eCPS
- latest mp-local-rich and mp-local-fast candidates versus PE small-L0 and
  big-L0
- broad loss, local loss, PE-actual objective loss, and microsim benchmark
  deltas
- record counts, positive weights, effective sample size, weight concentration,
  H5 size, and median microsim runtime
- top target wins and losses by source family
- whether each release gate is passing, failing, or unmeasured
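The weight diagnostics in that list can be computed from the weight vector alone. The plan does not say which definitions the dashboard uses, so this sketch assumes Kish's effective sample size, (sum of weights)^2 divided by the sum of squared weights, and measures concentration as the share of total weight held by the heaviest records. Both choices and all function names are assumptions.

```python
from typing import Sequence


def positive_weight_count(weights: Sequence[float]) -> int:
    """Number of records with strictly positive weight."""
    return sum(1 for w in weights if w > 0)


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    squares = sum(w * w for w in weights)
    return total * total / squares if squares > 0 else 0.0


def top_weight_share(weights: Sequence[float], fraction: float = 0.01) -> float:
    """Share of total weight held by the heaviest `fraction` of records."""
    positive = sorted((w for w in weights if w > 0), reverse=True)
    if not positive:
        return 0.0
    k = max(1, int(len(positive) * fraction))
    return sum(positive[:k]) / sum(positive)
```

Equal weights give an effective sample size equal to the record count, while a single dominant record drives it toward 1.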
Every serious run should write a machine-readable loss record that the dashboard
can index without scraping logs.
Dashboard/indexing is a release-blocking workstream. It needs:
- a stable loss-result JSON schema
- a run indexer that can discover completed local artifacts
- dashboard cells for each release gate
- a published "current candidate" artifact path
- CI or scheduled refresh for static score files, where practical
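One possible shape for such a loss record is sketched below. Every field name is hypothetical, since the schema has not been defined yet, and the hash is a placeholder. The three gate states come from the dashboard requirements above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict

GATE_STATES = {"passing", "failing", "unmeasured"}


@dataclass
class LossRecord:
    schema_version: str
    product: str                  # e.g. "mp-national"
    baseline_sha256: str          # pinned eCPS baseline hash
    target_db_sha256: str
    broad_loss: float
    baseline_broad_loss: float
    gates: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject gate states the dashboard would not know how to render.
        bad = {k: v for k, v in self.gates.items() if v not in GATE_STATES}
        if bad:
            raise ValueError(f"invalid gate states: {bad}")


def dump_record(record: LossRecord) -> str:
    """Serialize with sorted keys so repeated runs diff cleanly."""
    return json.dumps(asdict(record), sort_keys=True, indent=2)


example = LossRecord(
    schema_version="0.1",
    product="mp-national",
    baseline_sha256="0" * 64,     # placeholder, not a real hash
    target_db_sha256="0" * 64,    # placeholder, not a real hash
    broad_loss=0.0,               # placeholder values
    baseline_broad_loss=0.0,
    gates={"target_loss": "unmeasured", "runtime": "unmeasured"},
)
serialized = dump_record(example)
```

Writing one such file per run lets an indexer discover results by globbing for JSON files instead of parsing logs.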
Cross-repo dependency graph
flowchart LR
A["arch-data: target facts and semantic scope"] --> B["microplex-us: build, score, and H5 export"]
B --> C["microplex-evals: microsim benchmark reports"]
B --> D["policyengine-us-data: artifact publication and loader integration"]
C --> D
D --> E["policyengine-us: default dataset switch and fallback behavior"]
Required handoffs:
- arch-data to microplex-us: source facts, target scopes, exclusions, and
  coverage reports are importable and pinned by content hash.
- microplex-us to microplex-evals: candidate H5, manifest, score files, and
  source/target provenance are sufficient to run benchmarks without rebuilding.
- microplex-us to policyengine-us-data: exported H5 and metadata satisfy the
  dataset publication contract.
- policyengine-us-data to policyengine-us: loader names, default dataset
  selection, fallback behavior, and release notes are stable.
- policyengine-us default switch: happens only after beta artifacts and
  benchmark reports are available for rollback comparison.
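Pinning by content hash, as the first handoff requires, can be done with the standard library alone. The sketch below streams a file through SHA-256 and refuses to proceed when the digest differs from the recorded pin. The helper names and the demo file name are illustrative, not an existing interface.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large H5 or DB artifacts do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pin(path: Path, expected_sha256: str) -> None:
    """Raise if the artifact on disk is not the pinned one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256:
        raise ValueError(f"{path} has hash {actual}, expected {expected_sha256}")


# Demo on a throwaway file; real pins would come from a manifest.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "targets.db"
    demo.write_bytes(b"abc")
    pinned = sha256_of_file(demo)
    verify_pin(demo, pinned)  # passes silently when the hash matches
```

Recording the digest next to the path in every comparison makes "latest" reproducible even if the file at that path is later replaced.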
Epics and issue-sized tasks
Epic 1: National replacement candidate
Build and score the current best mp-national path.
Child issues:
- microplex-us: produce small ASEC + ACS100k build with PUF, SIPP/SCF/Arch
  additions, Forbes fixed spine, and capital-gains lots enabled.
- microplex-us: score candidate against pinned latest eCPS and write top
  target delta report.
- microplex-evals: run microsimulation benchmark suite against candidate and
  eCPS.
Exit:
- decision on whether this candidate is release-track or needs another modeling
  iteration
Epic 2: Target registry hardening
Make Arch the canonical source of target semantics used by both national and
local Microplex builds.
Child issues:
- arch-data: add source/concept exclusions for misleading broad concepts.
- arch-data: add explicit target scope labels: filer, full population,
  recipient, household, tax unit, SPM unit, state, CD, local.
- microplex-us: consume importable target coverage reports by product.
- arch-data and microplex-us: add tests that prevent known semantic
  regressions, including proprietors income and SSI recipient/value confusion.
Exit:
- no material target in the release suite lacks source, scope, and entity
  provenance
Epic 3: Calibration simplification
Make the winning gradient-based weight path the standard Microplex path.
Child issues:
- microplex-us: disable hard-concrete gates automatically when l0_lambda=0.
- microplex-us: preserve L0 gates only for sparse local artifacts or explicit
  experiments.
- microplex-us: write loss curves and held-out target curves for every run.
- microplex-us: define epoch stopping rules from held-out target performance,
  not only training loss.
Exit:
- one standard national calibration command and one standard sparse-local
  calibration command
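One common way to derive a stopping epoch from held-out performance is patience-based early stopping, sketched here. The plan does not prescribe this rule; the function name, the `patience` and `min_delta` parameters, and the loss values are illustrative assumptions.

```python
from typing import Sequence


def pick_stopping_epoch(
    held_out_losses: Sequence[float], patience: int = 3, min_delta: float = 0.0
) -> int:
    """Return the zero-based epoch with the best held-out loss.

    Scanning stops once `patience` epochs pass without an improvement
    larger than `min_delta`.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(held_out_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


# Illustrative held-out curve: improves, then drifts upward.
curve = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 0.6]
print(pick_stopping_epoch(curve, patience=3))   # 2
print(pick_stopping_epoch(curve, patience=10))  # 6
```

The two calls show the tradeoff: a short patience stops at the first plateau, while a long one keeps going and finds the later minimum at the cost of more epochs.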
Epic 4: Microsimulation benchmark suite
Codify policy outcomes that must be compared before replacement.
Child issues:
- microplex-evals: add national benchmark suite covering SSI asset limits,
  CTC/EITC, capital-gains indexing, and the Tara Watson SSI asset-limit
  benchmark.
- microplex-evals: freeze a benchmark manifest before judging any release
  candidate, including reform definitions, periods, expected output fields, and
  pinned baseline artifacts.
- microplex-evals: report aggregate fiscal impact, household net income,
  winners/losers, poverty/SPM where applicable, and component deltas.
- microplex-evals: enforce PolicyEngine MicroSeries operations throughout;
  no manual weight math.
Exit:
- every candidate has a comparable benchmark report against eCPS
Epic 5: Dashboard and release readiness
Make replacement claims visible from a durable dashboard rather than ad hoc log
inspection.
Child issues:
- microplex-us: define stable loss-result JSON schema.
- microplex-us: add run indexer support for national, local-rich, and
  local-fast candidate artifacts.
- microplex-us: add dashboard gate cells for target loss, protected families,
  microsim benchmarks, compatibility, runtime, and artifact size.
- microplex-us: publish the current candidate artifact path and score bundle.
Exit:
Epic 6: Compatibility and publication contract
Make the H5 and metadata contract explicit before any default switch.
Child issues:
- microplex-us: add automated H5 compatibility check against policyengine-us.
- policyengine-us-data: add loader/publication path for Microplex national
  beta artifact.
- policyengine-us: define default dataset switch, feature flag, and rollback
  behavior.
- policyengine-us-data: document the eCPS incumbent baseline and Microplex
  replacement status.
Exit:
mp-national through the normal dataset interface with
a documented rollback path
Epic 7: Local pipeline scalability
Make local builds practical and diagnosable.
Child issues:
- microplex-us: add profiled stage timings and RSS for donor integration, PE
  table construction, calibration, scoring, and export.
- microplex-us: implement chunked or vectorized PE table construction where
  needed.
- microplex-us: choose sparse output strategy: stronger L0, post-fit pruning,
  geography shards, or a combination.
- microplex-us: add disk guardrails and resumable checkpoints.
Exit:
- mp-local-fast can run routine local microsims without a multi-day laptop
  pipeline
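Stage timings and RSS, as in the first child issue, could be collected with a small context manager like the one below. This is a sketch using only the standard library, and the helper name is hypothetical. Note two limits of `resource.getrusage`: `ru_maxrss` is a process-wide high-water mark rather than a per-stage figure, and its unit is kilobytes on Linux but bytes on macOS. The `resource` module is not available on Windows.

```python
import resource
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


@contextmanager
def profiled_stage(name: str, log: List[Dict[str, object]]) -> Iterator[None]:
    """Append wall-clock seconds and peak RSS for one build stage to `log`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        # Process-wide peak so far; platform-dependent unit (see above).
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        log.append({"stage": name, "seconds": elapsed, "peak_rss_raw": peak})


stage_log: List[Dict[str, object]] = []
with profiled_stage("donor_integration", stage_log):
    sum(range(100_000))  # stand-in for real work
with profiled_stage("export", stage_log):
    sorted(range(1_000))  # stand-in for real work
```

Recording the entry in a `finally` clause means a stage that fails still leaves a timing row, which helps when diagnosing where a long build stopped.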
Initial milestones
microplex-us
mp-national versus eCPS with target deltas
microplex-us, microplex-evals
arch-data, microplex-us
microplex-us
microplex-us
microplex-us, policyengine-us-data
mp-national loads through normal dataset interfaces
microplex-us
policyengine-us-data, policyengine-us
mp-national can replace eCPS in a controlled release
What not to do
replacement candidate.
incumbent.
to make Microplex easier to debug.
configs and scores.
unless PolicyEngine needs it at that entity level.
Open questions
stricter before public default switch?
mp-local, should the primary public artifact be state shards, CD
shards, or both?
prevent local overfit?
donor-imputed?