PE-US rebuild smoke expects missing policyengine_us_data/storage/uprating_factors.csv #148

@anth-volk

After working around #147 by copying storage/calibration_targets/soi_targets.csv to the legacy storage/soi.csv path, the local PE-US rebuild smoke moved further through PUF loading and then failed on another missing PE-US-data prerequisite.

Command shape:

uv run --no-sync python -m microplex_us.pipelines.pe_us_data_rebuild_checkpoint \
  --output-root artifacts/local_us_microplex_smoke \
  --version-id local-smoke-v1 \
  --baseline-dataset /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/enhanced_cps_2024.h5 \
  --targets-db /Users/administrator/Documents/PolicyEngine/calibration-diagnostics/.artifacts/policy_data.db \
  --policyengine-us-data-repo /Users/administrator/Documents/PolicyEngine/policyengine-us-data \
  --calibration-backend microcalibrate \
  --donor-imputer-backend zi_qrf \
  --policyengine-materialize-batch-size 100000 \
  --cps-sample-n 1000 --puf-sample-n 1000 --donor-sample-n 1000 \
  --n-synthetic 1000 \
  --defer-policyengine-harness \
  --defer-policyengine-native-score \
  --defer-native-audit \
  --defer-imputation-ablation

Failure:

FileNotFoundError: Could not find PE uprating factors at /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/uprating_factors.csv

Relevant traceback:

microplex_us/data_sources/puf.py:2253 load_frame
microplex_us/data_sources/puf.py:1988 _build_puf_tax_units
microplex_us/data_sources/puf.py:664 uprate_mapped_puf_with_pe_factors
microplex_us/data_sources/puf.py:505 _resolve_pe_uprating_factors_path

The failure occurs after this log output:

Loading processed CPS ASEC 2023 from ~/.cache/microplex/cps_asec_2023_processed_v20260601_ecps_spm_takeup_inputs.parquet
Loading PUF from ~/.cache/microplex/puf_2015.csv...
  Raw records: 207,692
Loading demographics from ~/.cache/microplex/demographics_2015.csv...
  After demographics merge: 207,692

This appears to be the same kind of fresh-checkout prerequisite contract gap as #147: --policyengine-us-data-repo points at a valid local checkout, but the expected storage artifact is not present and download_prerequisites.py does not appear to include it.
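One possible stopgap is a preflight check that reports every missing storage artifact up front instead of failing on them one at a time mid-run. The sketch below is only an illustration, not existing microplex_us code: `missing_storage_files` and `EXPECTED_STORAGE_FILES` are hypothetical names, the file list contains only the two paths surfaced so far (this issue and #147) and is likely incomplete, and it assumes the legacy `soi.csv` path sits in the same `policyengine_us_data/storage` directory as `uprating_factors.csv`.

```python
from pathlib import Path

# Hypothetical list: only the two files named in this issue and #147.
# The real set of prerequisites may be larger.
EXPECTED_STORAGE_FILES = (
    "soi.csv",
    "uprating_factors.csv",
)


def missing_storage_files(repo_root, expected=EXPECTED_STORAGE_FILES):
    """Return paths of expected storage files absent from a PE-US-data checkout.

    Assumes the artifacts live under <repo_root>/policyengine_us_data/storage,
    matching the path in the FileNotFoundError above.
    """
    storage_dir = Path(repo_root) / "policyengine_us_data" / "storage"
    return [
        storage_dir / name
        for name in expected
        if not (storage_dir / name).is_file()
    ]
```

Calling `missing_storage_files(...)` with the same path passed to --policyengine-us-data-repo before starting the rebuild would return an empty list for a complete checkout, or the full list of absent files otherwise.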
