✅ Safe to merge from a data-protection standpoint. This PR creates a parallel public upload path that touches none of the FRS-derived plumbing CLAUDE.md protects. Code quality is solid; tests are unusually thorough. A few things worth a look before merge:
…blic-transfer-artifact-manifest # Conflicts: # policyengine_uk_data/datasets/frs.py
Pushed a fix for the uploader runtime error:
Verification:
I also tried the full artifact-manifest test file locally, but the HDF-reading tests fail in this Python 3.14 environment because PyTables cannot load
Current GitHub state after the push is still
Summary
- enhanced_cps_manifest_2025.json with row counts, checksums, git blob SHAs, exchange-rate/build assumptions, loss diagnostics, and weight diagnostics
- policybench_transfer_* files are legacy 1k artifacts while Python aliases point to the current enhanced_cps builder
Dependency
The disability-benefit category schema change depends on PolicyEngine UK PR #1656, which makes PIP, DLA, and Attendance Allowance category variables direct model input leaves and removes their reported amount variables from the model. The disability-benefit tests below were run with that local PE-UK branch installed.
Why
The public transfer artifact is now part of the PolicyBench path, but its provenance should not depend on the private eFRS deployment workflow. This keeps public artifact publication separate from private UKDS-derived artifacts and makes misuse harder. Disability benefit reported amounts are survey/data-prep signals, not policy model inputs, so the model should consume category leaves.
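For readers who want to check the provenance fields by hand, here is a minimal sketch of how a checksum and a git blob SHA for an artifact's bytes could be computed. The function names and the returned keys are hypothetical and are not taken from this PR's manifest code. It assumes SHA-256 for the checksum and git's default SHA-1 object format for the blob SHA.

```python
import hashlib


def git_blob_sha(data: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob with this content.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes.
    Assumes the repository uses the default SHA-1 object format.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def describe_artifact(data: bytes) -> dict:
    """Build a hypothetical manifest entry for one artifact's bytes."""
    return {
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "git_blob_sha": git_blob_sha(data),
    }


if __name__ == "__main__":
    # Example input only; a real entry would use the artifact file's bytes.
    print(describe_artifact(b"hello\n"))
```

The blob SHA should match what `git hash-object <file>` prints for the same bytes, so a published file can be compared against the committed one without access to the build environment.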
Validation
- UV_FROZEN=0 uv run --python 3.13 ruff check policyengine_uk_data/datasets/enhanced_cps.py policyengine_uk_data/datasets/__init__.py policyengine_uk_data/datasets/policybench_transfer.py policyengine_uk_data/utils/enhanced_cps_manifest.py policyengine_uk_data/storage/write_enhanced_cps_manifest.py policyengine_uk_data/storage/upload_public_transfer_dataset.py policyengine_uk_data/tests/test_enhanced_cps_artifact_manifest.py
- git diff --check
- UV_FROZEN=0 uv run --python 3.13 pytest policyengine_uk_data/tests/test_enhanced_cps_artifact_manifest.py policyengine_uk_data/tests/test_policybench_transfer.py policyengine_uk_data/tests/test_release_manifest.py policyengine_uk_data/tests/test_hf_destinations.py
- uv run --no-sync ruff check policyengine_uk_data/datasets/enhanced_cps.py policyengine_uk_data/datasets/frs.py policyengine_uk_data/datasets/imputations/frs_only.py policyengine_uk_data/tests/test_policybench_transfer.py policyengine_uk_data/tests/test_frs_only_imputation.py
- uv run --no-sync pytest -q policyengine_uk_data/tests/test_policybench_transfer.py policyengine_uk_data/tests/test_frs_only_imputation.py policyengine_uk_data/tests/test_legacy_benefit_proxies.py