Microplex path to replace eCPS
This is the operational plan for replacing PolicyEngine's Enhanced CPS (enhanced_cps_2024.h5) with Microplex-produced datasets. It separates record-count size classes that should share source semantics, target compilation, scoring, and artifact discipline but should not share the same runtime envelope. Names are based only on approximate record count, similar to how LLMs are named by parameter scale:
mp-300k: a national default dataset small enough for normal PolicyEngine microsimulation.
mp-3m: a larger local-area pipeline for state, congressional district, and other subnational analyses.
mp-30m: a future full candidate universe that can produce smaller sparse tiers.
The exact realized row counts still live in artifact metadata. The names are record-count classes, not promises that every build has exactly that row count. The intended long-run path is hierarchical: build the largest feasible candidate universe, then use L0 sparsity and post-fit pruning to derive smaller tiers. In that framing, mp-300k and mp-3m should eventually derive from mp-30m or the largest available parent, rather than being unrelated builds.
That is not true of the first mp-300k candidate. The current mp-300k path is a directly sampled small build, without L0 culling from a larger parent universe. That is acceptable for the first replacement test because it answers whether the small record class can beat eCPS at routine microsimulation scale. It should not become the permanent construction contract. Over time, mp-300k should improve by selecting or culling from a larger candidate universe, not only by changing the initial sample.
The replacement claim is not "Microplex copies eCPS." The claim is that Microplex becomes the canonical data construction path for PolicyEngine-US, while policyengine-us remains the measurement and microsimulation runtime.
Current decision
Microplex should replace eCPS in stages, with the national artifact loop proven before broader local work expands:
mp-300k loop end to end: produce one candidate H5, verify PolicyEngine can load it, run a runtime smoke benchmark, compare it against a pinned eCPS baseline, and write a CI artifact-gate report that the dashboard can index.
mp-300k as a beta national H5 once it passes schema compatibility, remains inside the runtime envelope, and beats the pinned latest eCPS baseline on the broad target suite.
mp-300k the default national dataset only after the frozen microsimulation benchmark suite, runtime gate, and rollback path pass.
mp-3m as a separate local-calibration workstream, with sparse analysis artifacts or geography shards, after the national loop is mechanically reliable.
This avoids blocking the national replacement on the harder local-scale engineering problem and prevents target-quality work from masking loader, runtime, or release-contract failures.
Immediate execution loop
The first implementation slice is intentionally narrow. It is not a full local pipeline and not a permanent modeling contract, but it should be a standing CI gate set rather than a one-time readiness checklist. It answers one release-critical question on every candidate artifact: can Microplex produce a national candidate that PolicyEngine can load, run, compare, and index reproducibly?
The loop is:
mp-300k H5 plus a manifest with record counts, source versions, config, target DB hash, and artifact hash.
policyengine-us loader before judging model quality.
mp300k_artifact_gates.json report that CI can fail on and the dashboard can index without scraping logs.
A candidate that fails compatibility or runtime stops at that gate. eCPS comparison and benchmark results only matter for artifacts that can load, run, and be reproduced. Compatibility, runtime, artifact, and benchmark-manifest gates should continue after eCPS is deprecated.
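The stop-at-first-failure behavior described above can be sketched as follows. This is a minimal illustration, not the project's gate runner: the gate names, the report layout, and the helper functions `sha256_of`, `run_gates`, and `write_report` are all assumptions made for the example. Only the report filename and the ordering rule (compatibility and runtime before any quality comparison) come from the plan.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical gate names. The order mirrors the plan: loading and runtime
# are checked before any eCPS comparison or benchmark result is considered.
GATE_ORDER = ["compatibility", "runtime", "ecps_comparison", "benchmarks"]


def sha256_of(path):
    """Return the SHA256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_gates(artifact_path, checks):
    """Run gate checks in order and stop judging at the first failure.

    `checks` maps each gate name to a zero-argument callable that returns
    (passed, detail). Gates after a failure are recorded as "skipped".
    """
    report = {
        "artifact": str(artifact_path),
        "artifact_sha256": sha256_of(artifact_path),
        "gates": [],
        "passed": True,
    }
    for name in GATE_ORDER:
        if not report["passed"]:
            report["gates"].append({"gate": name, "status": "skipped"})
            continue
        passed, detail = checks[name]()
        status = "passed" if passed else "failed"
        report["gates"].append({"gate": name, "status": status, "detail": detail})
        report["passed"] = bool(passed)
    return report


def write_report(report, out_path):
    """Write the report as sorted JSON and return a CI-style exit code."""
    Path(out_path).write_text(json.dumps(report, indent=2, sort_keys=True))
    return 0 if report["passed"] else 1
```

A CI job could pass the return value of `write_report(report, "mp300k_artifact_gates.json")` to `sys.exit`, so the same file both fails the build and feeds the dashboard indexer.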
Size classes
mp-300k
mp-3m
mp-30m
mp-3m and mp-300k frontier after sparsification
User-facing routing:
mp-300k.
mp-300k, unless the analysis explicitly asks for local-calibrated weights.
mp-3m-fast state shard when available; otherwise fall back to mp-300k only with an explicit "not state-calibrated" label.
mp-3m-fast geography shard when available. Do not silently use mp-300k for official local claims.
mp-3m-rich.
Shared architecture
Important design rule: fixed records, such as Forbes top-tail records, should be added after ordinary population synthesis and excluded from donor fitting. Their weighted target contributions should be subtracted from calibration targets so nonfixed records are calibrated to the residual population.
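The residual-target rule can be written down in a few lines. This is a sketch under assumed array shapes, and the function name `residual_targets` is hypothetical: each fixed record carries a known weight and a known contribution to every target, and those weighted contributions are removed before the nonfixed records are calibrated.

```python
import numpy as np


def residual_targets(targets, fixed_values, fixed_weights):
    """Subtract fixed-record contributions from calibration targets.

    targets:       shape (n_targets,), target totals for the full population
    fixed_values:  shape (n_fixed, n_targets), each fixed record's
                   contribution to each target
    fixed_weights: shape (n_fixed,), weights that calibration must not change
    """
    targets = np.asarray(targets, dtype=float)
    fixed_values = np.asarray(fixed_values, dtype=float)
    fixed_weights = np.asarray(fixed_weights, dtype=float)
    # Weighted sum over fixed records for every target at once.
    fixed_contribution = fixed_weights @ fixed_values
    return targets - fixed_contribution


# Two fixed records, each with weight 1, contribute 300 to the first target
# and 2 to the second, so nonfixed records calibrate to what is left.
residual = residual_targets(
    targets=[1000.0, 50.0],
    fixed_values=[[100.0, 1.0], [200.0, 1.0]],
    fixed_weights=[1.0, 1.0],
)
print(residual)  # [700.  48.]
```

Calibrating the nonfixed weights against `residual` instead of the original targets keeps the fixed records from being counted twice.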
mp-300k build path
mp-300k should use the best currently known Microplex construction path:
l0_lambda=0 remains best for national quality, the implementation should disable hard-concrete gates rather than carrying a fake sparsity mechanism.
mp-300k release gates
mp-300k can become the default national dataset when all of these are true:
policyengine-us-data commit, policyengine-us version, target DB path, and target DB SHA256. "Latest eCPS" means that pinned baseline, not a moving label.
policyengine-us loader, including entity tables, IDs, joins, weights, periods, dtypes, missing-value conventions, and absence of source-dataset diagnostic variables.
policyengine-us-data integration point.
mp-3m build path
mp-3m should not be forced into the same artifact shape as mp-300k. Local accuracy needs more ACS/local target support, but routine local microsims need sparse or sharded outputs.
Recommended path:
mp-300k.
mp-3m-rich: best-fit, larger research artifact.
mp-3m-fast: sparse or sharded analysis artifact for routine PolicyEngine use.
mp-3m release gates
mp-3m can replace the PE local L0 pipeline when all of these are true:
mp-3m-fast on the standard cloud runner in less than 12 hours without manual cleanup, or has sharded build jobs whose slowest shard is below that bound.
Held-out target evaluation should hold out complete target groups, not random rows inside a target. The default split should include at least one geography family and one income/program family so epoch tuning cannot overfit only the headline national aggregates.
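A group-level split of this kind might look like the sketch below. The target schema (dicts with `name`, `group`, and `family` keys), the family labels, and the function name `split_by_target_group` are assumptions for illustration, not the Microplex target registry format. The point is that whole groups move to the held-out side, and every required family contributes at least one group.

```python
import random
from collections import defaultdict


def split_by_target_group(targets, required_families, seed=0):
    """Split targets into (train, held_out), holding out whole groups.

    One complete group is drawn from each required family, so no target
    group ever has rows on both sides of the split.
    """
    groups_by_family = defaultdict(set)
    for target in targets:
        groups_by_family[target["family"]].add(target["group"])

    rng = random.Random(seed)  # seeded so the split is reproducible
    held_out_groups = set()
    for family in required_families:
        candidates = sorted(groups_by_family.get(family, ()))
        if not candidates:
            raise ValueError(f"no target groups in family {family!r}")
        held_out_groups.add(rng.choice(candidates))

    train = [t for t in targets if t["group"] not in held_out_groups]
    held_out = [t for t in targets if t["group"] in held_out_groups]
    return train, held_out


example_targets = [
    {"name": "ca_population", "group": "state_population", "family": "geography"},
    {"name": "ny_population", "group": "state_population", "family": "geography"},
    {"name": "cd_population", "group": "cd_population", "family": "geography"},
    {"name": "snap_total", "group": "snap", "family": "income_program"},
    {"name": "snap_recipients", "group": "snap", "family": "income_program"},
    {"name": "national_agi", "group": "agi", "family": "national"},
]
train, held_out = split_by_target_group(
    example_targets, required_families=["geography", "income_program"]
)
```

Because the split is by group, a held-out geography group tests whether the weights generalize to targets the optimizer never saw, rather than to neighboring rows of a target it already fit.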
Dashboard contract
The living dashboard should be the source of truth for replacement gate status. It should show, at minimum:
mp-300k candidate versus latest eCPS
mp-3m-rich and mp-3m-fast candidates versus PE small-L0 and big-L0
Every serious run should write a machine-readable loss record that the dashboard can index without scraping logs.
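One way to make a loss record indexable is to append one JSON object per run to a line-delimited file. The sketch below assumes a record layout: every field name in `LossRecord` is hypothetical, as are the helpers `append_record` and `load_records`, and the real schema is still to be defined by the dashboard workstream.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LossRecord:
    """Hypothetical per-run loss record; field names are illustrative."""

    run_id: str
    size_class: str  # for example "mp-300k" or "mp-3m-fast"
    artifact_sha256: str
    target_db_sha256: str
    train_loss: float
    heldout_loss: float
    family_losses: dict = field(default_factory=dict)


def append_record(record, index_path):
    """Append one record as a single JSON line."""
    with open(index_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(record), sort_keys=True) + "\n")


def load_records(index_path):
    """Read every record back as a list of dicts, skipping blank lines."""
    with open(index_path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

An append-only file keeps earlier runs intact, and an indexer can pick the latest run per size class by reading the file instead of parsing training logs.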
Dashboard/indexing and CI gate publication are release-blocking workstreams. They need:
Cross-repo dependency graph
Required handoffs:
arch-data to microplex-us: source facts, target scopes, exclusions, and coverage reports are importable and pinned by content hash.
microplex-us to microplex-evals: candidate H5, manifest, score files, and source/target provenance are sufficient to run benchmarks without rebuilding.
microplex-us to policyengine-us-data: exported H5 and metadata satisfy the dataset publication contract.
policyengine-us-data to policyengine-us: loader names, default dataset selection, fallback behavior, and release notes are stable.
policyengine-us default switch: happens only after beta artifacts and benchmark reports are available for rollback comparison.
Epics and issue-sized tasks
Epic 1: National replacement candidate
Build and score the current best mp-300k path.
Child issues:
microplex-us: produce small ASEC + ACS100k build with PUF, SIPP/SCF/Arch additions, Forbes fixed spine, and capital-gains lots enabled.
microplex-us: score candidate against pinned latest eCPS and write top target delta report.
microplex-evals: run microsimulation benchmark suite against candidate and eCPS.
Exit:
Epic 2: Target registry hardening
Make Arch the canonical source of target semantics used by both national and local Microplex builds.
Child issues:
arch-data: add source/concept exclusions for misleading broad concepts.
arch-data: add explicit target scope labels: filer, full population, recipient, household, tax unit, SPM unit, state, CD, local.
microplex-us: consume importable target coverage reports by product.
arch-data and microplex-us: add tests that prevent known semantic regressions, including proprietors income and SSI recipient/value confusion.
Exit:
Epic 3: Calibration simplification
Make the winning gradient-based weight path the standard Microplex path.
Child issues:
microplex-us: disable hard-concrete gates automatically when l0_lambda=0.
microplex-us: preserve L0 gates only for mp-3m-fast, parent-to-child derivation, or explicit experiments.
microplex-us: write loss curves and held-out target curves for every run.
microplex-us: define epoch stopping rules from held-out target performance, not only training loss.
Exit:
mp-300k calibration command and one standard mp-3m-fast calibration command
Epic 4: Microsimulation benchmark suite
Codify policy outcomes that must be compared before replacement.
Child issues:
microplex-evals: add national benchmark suite covering SSI asset limits, CTC/EITC, capital-gains indexing, and the Tara Watson SSI asset-limit benchmark.
microplex-evals: freeze a benchmark manifest before judging any release candidate, including reform definitions, periods, expected output fields, and pinned baseline artifacts.
microplex-evals: report aggregate fiscal impact, household net income, winners/losers, poverty/SPM where applicable, and component deltas.
microplex-evals: enforce PolicyEngine MicroSeries operations throughout; no manual weight math.
Exit:
Epic 5: Dashboard and release gates
Make replacement claims visible from durable CI gate reports and a dashboard rather than ad hoc log inspection.
Child issues:
microplex-us: define stable artifact-gate and loss-result JSON schemas.
microplex-us: add run indexer support for national, local-rich, and local-fast candidate artifacts.
microplex-us: add dashboard gate cells for target loss, protected families, microsim benchmarks, compatibility, runtime, and artifact size.
microplex-us: publish the current candidate artifact path and score bundle.
Exit:
Epic 6: Compatibility and publication contract
Make the H5 and metadata contract explicit before any default switch.
Child issues:
microplex-us: add automated H5 compatibility check against policyengine-us.
policyengine-us-data: add loader/publication path for Microplex national beta artifact.
policyengine-us: define default dataset switch, feature flag, and rollback behavior.
policyengine-us-data: document the eCPS incumbent baseline and Microplex replacement status.
Exit:
mp-300k through the normal dataset interface with a documented rollback path
Epic 7: Local pipeline scalability
Make local builds practical and diagnosable.
Child issues:
microplex-us: add profiled stage timings and RSS for donor integration, PE table construction, calibration, scoring, and export.
microplex-us: implement chunked or vectorized PE table construction where needed.
microplex-us: choose sparse output strategy: stronger L0, parent-to-child derivation, post-fit pruning, geography shards, or a combination.
microplex-us: add disk guardrails and resumable checkpoints.
Exit:
mp-3m-fast can run routine local microsims without a multi-day laptop pipeline
Initial milestones
microplex-us
mp-300k candidate against pinned eCPS, while keeping non-eCPS gates reusable after eCPS is retired
mp-300k candidate scored
microplex-us
mp-300k versus eCPS with target deltas and compatibility/runtime gate status
microplex-evals, microplex-us
microplex-us, microplex-evals
arch-data, microplex-us
microplex-us
mp-300k and mp-3m-fast calibration commands
microplex-us, policyengine-us-data
mp-300k loads through normal dataset interfaces with a documented rollback path
mp-3m-fast artifact
microplex-us
policyengine-us-data, policyengine-us
mp-300k can replace eCPS in a controlled release
What not to do
Open questions
mp-3m, should the primary public artifact be state shards, CD shards, or both?