Summary
A workstation PE-US rebuild smoke run completed donor integration and entropy calibration, saved both durable checkpoints, then failed during final artifact export because the checkpoint paths were placed under the same versioned output directory that final artifact allocation expects to create.
The CLI accepts explicit checkpoint save paths, but if those paths are inside the directory formed by --output-root and --version-id, saving the checkpoints pre-creates the version directory. Later, _allocate_versioned_output_dir() raises FileExistsError instead of treating the directory as the active run's output directory.
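The collision can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the project's code: allocate_versioned_output_dir and reproduce_collision are illustrative names, and the existence check is an assumption that matches the observed error message but may differ from the real _allocate_versioned_output_dir().

```python
import tempfile
from pathlib import Path


def allocate_versioned_output_dir(output_root: Path, version_id: str) -> Path:
    """Hypothetical stand-in for _allocate_versioned_output_dir().

    Assumption: the real function refuses any pre-existing version directory.
    """
    output_dir = output_root / version_id
    if output_dir.exists():
        raise FileExistsError(
            f"Versioned artifact directory already exists: {output_dir}"
        )
    output_dir.mkdir(parents=True)
    return output_dir


def reproduce_collision() -> str:
    """Save a checkpoint under the version directory, then try to allocate it."""
    with tempfile.TemporaryDirectory() as tmp:
        output_root = Path(tmp) / "local_us_microplex_smoke"
        version_id = "local-smoke-v1-entropy"
        checkpoint = output_root / version_id / "checkpoints" / "post_imputation"
        # Creating the checkpoint directory also creates every parent,
        # including the version directory itself.
        checkpoint.mkdir(parents=True)
        try:
            allocate_versioned_output_dir(output_root, version_id)
        except FileExistsError as exc:
            return type(exc).__name__
        return "allocated"
```

Under these assumptions, any checkpoint save that lands under the version directory is enough to make the later allocation fail, regardless of how far the run has progressed.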
Command shape
.venv/bin/python -m microplex_us.pipelines.pe_us_data_rebuild_checkpoint \
--output-root artifacts/local_us_microplex_smoke \
--version-id local-smoke-v1-entropy \
--baseline-dataset /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/enhanced_cps_2024.h5 \
--targets-db /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/calibration/policy_data.db \
--policyengine-us-data-repo /Users/administrator/Documents/PolicyEngine/policyengine-us-data \
--policyengine-us-data-python /Users/administrator/Documents/PolicyEngine/worktrees/microplex-us/fix-pe-rebuild-smoke-issues/.venv/bin/python \
--calibration-backend entropy \
--donor-imputer-backend zi_qrf \
--policyengine-materialize-batch-size 100000 \
--cps-sample-n 1000 \
--puf-sample-n 1000 \
--donor-sample-n 1000 \
--n-synthetic 1000 \
--no-include-acs \
--defer-policyengine-harness \
--defer-policyengine-native-score \
--defer-native-audit \
--defer-imputation-ablation \
--pipeline-checkpoint-save-post-imputation-path artifacts/local_us_microplex_smoke/local-smoke-v1-entropy/checkpoints/post_imputation \
--pipeline-checkpoint-save-post-microsim-path artifacts/local_us_microplex_smoke/local-smoke-v1-entropy/checkpoints/post_microsim
Progress before failure
The run reached:
US microplex build: post-imputation checkpoint saved [path=artifacts/local_us_microplex_smoke/local-smoke-v1-entropy/checkpoints/post_imputation]
US microplex build: policyengine calibration start [backend=entropy]
US microplex build: post-microsim checkpoint saved [path=artifacts/local_us_microplex_smoke/local-smoke-v1-entropy/checkpoints/post_microsim]
US microplex build: policyengine calibration complete [backend=entropy, calibrated_rows=2741]
Failure
  File "/Users/administrator/Documents/PolicyEngine/worktrees/microplex-us/fix-pe-rebuild-smoke-issues/src/microplex_us/pipelines/artifacts.py", line 1834, in _allocate_versioned_output_dir
    raise FileExistsError(
        f"Versioned artifact directory already exists: {output_dir}"
    )
FileExistsError: Versioned artifact directory already exists: artifacts/local_us_microplex_smoke/local-smoke-v1-entropy
Expected behavior
One of these should happen:
- the rebuild CLI should reject checkpoint save paths inside the final versioned artifact directory before starting the expensive run,
- final artifact allocation should allow the active run's checkpoint-created version directory and write final outputs into it, or
- checkpoint paths should default to a separate checkpoint root that cannot collide with versioned artifact allocation.
This failure happened after the expensive stages had already completed, so it is costly even for a sampled smoke run.
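The first option, rejecting colliding paths before any work starts, could look roughly like the sketch below. It is a minimal illustration under stated assumptions: checkpoint_paths_inside_version_dir and validate_checkpoint_paths are hypothetical helpers, not existing functions in microplex_us, and where such a check would be wired into the CLI is left open.

```python
from pathlib import Path


def checkpoint_paths_inside_version_dir(output_root, version_id, checkpoint_paths):
    """Return the checkpoint paths that resolve to, or under, output_root/version_id.

    Hypothetical helper; None entries (unset CLI options) are skipped.
    """
    version_dir = (Path(output_root) / version_id).resolve()
    offending = []
    for raw in checkpoint_paths:
        if raw is None:
            continue
        candidate = Path(raw).resolve()
        # Compare whole path components so that a sibling directory such as
        # "<version-id>-2" is not mistaken for a child of the version directory.
        if candidate == version_dir or version_dir in candidate.parents:
            offending.append(str(raw))
    return offending


def validate_checkpoint_paths(output_root, version_id, checkpoint_paths):
    """Raise ValueError before the run starts if any checkpoint path collides."""
    offending = checkpoint_paths_inside_version_dir(
        output_root, version_id, checkpoint_paths
    )
    if offending:
        raise ValueError(
            "Checkpoint save paths must not be inside the versioned artifact "
            f"directory {Path(output_root) / version_id}: {offending}"
        )
```

Called with the values from the command above, this check would flag both checkpoint paths immediately, while a path under a separate root such as artifacts/checkpoints/ (an illustrative location) would pass.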