Clean-main PE-US smoke fails when ACS donor H5 is absent from policyengine-us-data storage #153

@anth-volk

After overriding --policyengine-us-data-python to use an environment with numpy (#151), the clean-main PE-US rebuild smoke gets through CPS/PUF loading and then fails because the default ACS donor provider expects a local acs_2022.h5 in the policyengine-us-data checkout.

Clean-main worktree:

/Users/administrator/Documents/PolicyEngine/worktrees/microplex-us/fix-pe-rebuild-smoke-issues

Progress before failure:

Loading processed CPS ASEC 2023 from ~/.cache/microplex/cps_asec_2023_processed_v20260601_ecps_spm_takeup_inputs.parquet
Loading PUF from ~/.cache/microplex/puf_2015.csv...
  Raw records: 207,692
Loading demographics from ~/.cache/microplex/demographics_2015.csv...
  After demographics merge: 207,692
Expanded 1,000 tax units to 1,921 persons
Loading processed CPS ASEC 2023 from ~/.cache/microplex/cps_asec_2023_processed_v20260601_ecps_spm_takeup_inputs.parquet

Subprocess failure:

FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = '/Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/acs_2022.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Parent traceback:

microplex_us/data_sources/donor_surveys.py:880 load_frame
microplex_us/data_sources/donor_surveys.py:598 _default_acs_tables_loader
microplex_us/data_sources/donor_surveys.py:572 _run_policyengine_dataset_loader_from_spec
microplex_us/data_sources/donor_surveys.py:539 _run_policyengine_dataset_loader
subprocess.CalledProcessError

The CLI help says --no-include-acs is supported for an eCPS-shaped run that keeps SIPP/SCF, so I will use that as a workaround when acs_2022.h5 is not present locally. The default smoke path still needs attention, though: the input contract should probably either document ACS as a required local artifact, provide a download/materialization path, or preflight the missing H5 before loading CPS/PUF.
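
As a rough illustration of the preflight option, here is a minimal sketch that checks for the ACS donor file before any CPS/PUF loading starts. The function name `preflight_acs_donor`, the exception `MissingDonorArtifactError`, and the parameters are hypothetical and not part of microplex-us; only the storage path layout and the --no-include-acs flag come from the error and CLI help quoted above.

```python
from pathlib import Path


class MissingDonorArtifactError(FileNotFoundError):
    """Hypothetical error type for a required local donor file that is absent."""


def preflight_acs_donor(us_data_checkout, include_acs=True, filename="acs_2022.h5"):
    """Return the ACS donor path, or None when ACS is excluded.

    Hypothetical helper: the real loader may resolve this path differently.
    The layout below mirrors the path in the FileNotFoundError above.
    """
    if not include_acs:
        # Nothing to check when the run is started with --no-include-acs.
        return None

    path = Path(us_data_checkout) / "policyengine_us_data" / "storage" / filename
    if not path.is_file():
        # Fail early, before the expensive CPS/PUF loading steps run.
        raise MissingDonorArtifactError(
            f"ACS donor file not found: {path}. "
            "Provide the file or rerun with --no-include-acs."
        )
    return path
```

Calling something like this at the start of the smoke run would surface the missing H5 immediately, with a message that names the workaround, instead of a subprocess failure after CPS and PUF have already been loaded.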
