Use full-population data for output weights by MaxGhenis · Pull Request #51 · PolicyEngine/policybench

MaxGhenis · 2026-05-19T13:11:27Z

Summary

- Compute household-impact and aggregate output weights from the full source microsimulation populations instead of the 100-household benchmark sample.
- Add a committed population-weight artifact and a package CLI to regenerate it from the full US Enhanced CPS and UK enhanced FRS.
- Refresh app data, snapshot reports, manifest hashes, and rendered paper assets so PIP and other sparse outputs keep a population-derived weight.
- Document the local-income-tax zero-weight limitation in the current full ECPS source.
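To illustrate why the weights come from the full population rather than the benchmark sample, here is a minimal sketch. It assumes each output's weight is its share of the household-weighted absolute total across all outputs; that formula, the function name `population_output_weights`, and the toy numbers are assumptions for illustration, not the actual policybench implementation.

```python
def population_output_weights(values_by_output, household_weights):
    """Return each output's share of the weighted absolute total (assumed formula)."""
    totals = {}
    for name, values in values_by_output.items():
        if len(values) != len(household_weights):
            raise ValueError(f"{name}: values and weights differ in length")
        # Weighted aggregate of absolute amounts for this output.
        totals[name] = sum(abs(v) * w for v, w in zip(values, household_weights))
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("all outputs have a zero weighted total")
    return {name: total / grand_total for name, total in totals.items()}


# Hypothetical toy "full population": only the last household receives PIP.
full_values = {
    "income_tax": [1000.0, 0.0, 3000.0, 500.0],
    "pip": [0.0, 0.0, 0.0, 400.0],
}
full_household_weights = [2.0, 1.0, 1.0, 0.5]

# Hypothetical small sample that happens to miss the PIP recipient.
sample_values = {name: values[:3] for name, values in full_values.items()}
sample_household_weights = full_household_weights[:3]

full_weights = population_output_weights(full_values, full_household_weights)
sample_weights = population_output_weights(sample_values, sample_household_weights)
```

In this toy case the sample assigns `pip` a weight of zero because no sampled household receives it, while the full population gives it a small positive weight.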

Fixes #49.

Tests

uv run ruff check policybench tests
uv run pytest -q
npm run lint && npx tsc --noEmit && npm run build
source .venv/bin/activate && quarto render paper/index.qmd --to html
source .venv/bin/activate && quarto render paper/index.qmd --to pdf

vercel · 2026-05-19T13:11:31Z


Project	Deployment	Actions	Updated (UTC)
policybench-site	Ready	Preview, Comment	May 19, 2026 1:14pm

The pr-46-review-era stack landed before PR #51 ("Use full-population data for output weights") and Daphne's PR #47 ACA-from-scoring removal. The result on main right now:

- ruff format has drifted on policybench/analysis.py, policybench/full_run_export.py, and tests/test_spec.py
- policybench/population_weights.json still contains premium_tax_credit, but the headline output set no longer does (post-#47)

That makes both lint and test red on every PR against current main. This commit:

- Reformats the three drifted files with ruff format
- Regenerates policybench/population_weights.json via policybench population-weights, which drops the now-out-of-scope premium_tax_credit entry
- Updates the matching SHA-256 in paper/snapshot/20260501/manifest.json so test_snapshot_manifest_hashes_match_population_weight_artifact passes against the new artifact

Verified: uv run pytest -m "not slow" -q → 258 passed; uv run ruff format --check . → clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
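The manifest check described above can be sketched as follows. This is a hedged illustration only: the flat manifest layout, the key `population_weights_sha256`, and the helper names are hypothetical, and the real manifest.json and test may be structured differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of_file(path):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_hash_matches(manifest_path, artifact_path, key):
    """True if the manifest's recorded hash equals the artifact's current hash."""
    manifest = json.loads(Path(manifest_path).read_text())
    return manifest.get(key) == sha256_of_file(artifact_path)


# Demo with temporary files standing in for the artifact and the manifest.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "population_weights.json"
    manifest = Path(tmp) / "manifest.json"
    key = "population_weights_sha256"  # hypothetical manifest key

    artifact.write_text(json.dumps({"income_tax": 0.9, "pip": 0.1}))
    manifest.write_text(json.dumps({key: sha256_of_file(artifact)}))
    matches_before = manifest_hash_matches(manifest, artifact, key)

    # Regenerating the artifact without updating the manifest breaks the match.
    artifact.write_text(json.dumps({"income_tax": 1.0}))
    matches_after = manifest_hash_matches(manifest, artifact, key)
```

Any regeneration of the artifact changes its digest, which is why the manifest hash has to be updated in the same commit.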

* Hero: replace inventory subtitle with a mission line

  The hero used to read "14 models on 100 households across 18 tax and benefit outputs." That information is already in the stats chip row below (Models / Households / Outputs / snapshot date), so the subtitle was redundant inventory rather than orienting copy.

  Replace with a one-line mission statement that says what the benchmark is actually for: "Testing how accurately language models calculate household taxes and benefits."

  The 100% = within-1% explainer continues immediately after as the muted continuation, so readers still learn what the score means.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Clean PolicyBench deployment config

* Fix CI: ruff format, regenerate population-weight artifact

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Use population data for output weights

bcb324b

vercel Bot deployed to Preview May 19, 2026 13:12 View deployment

Format population weight generator

85f997d

vercel Bot deployed to Preview May 19, 2026 13:14 View deployment

MaxGhenis merged commit b06d24d into main May 19, 2026
4 checks passed

MaxGhenis deleted the fix-population-variable-weights branch May 19, 2026 13:15
