@MaxGhenis
Contributor

Summary

  • Adds ETL pipeline for state-level individual income tax collections from Census Bureau's Annual Survey of State Government Tax Collections (STC)
  • Creates calibration targets using stratum_group_id 7 for state income tax
  • Uses hardcoded FY2023 data for all 50 states + DC ($531B total)
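A minimal sketch of how hardcoded collections could become calibration targets. The `Target` dataclass, the `build_targets` helper, and the choice to skip no-income-tax states are illustrative assumptions, not the actual contents of etl_state_income_tax.py. Only the Ohio figure is taken from this PR; the real script covers all 50 states + DC.

```python
from dataclasses import dataclass

# stratum_group_id reserved for state income tax, as stated in this PR.
STATE_INCOME_TAX_GROUP_ID = 7

# States this PR lists as having no income tax.
NO_INCOME_TAX_STATES = {"AK", "FL", "NV", "SD", "TX", "WA", "WY", "NH", "TN"}

# Hardcoded FY2023 collections in dollars. Only Ohio's value appears in this PR;
# the other states are omitted here rather than guessed.
FY2023_COLLECTIONS = {"OH": 9.52e9}


@dataclass(frozen=True)
class Target:
    """Hypothetical calibration-target record (not the real schema)."""
    state: str
    stratum_group_id: int
    variable: str
    period: int
    value: float


def build_targets(collections, period=2023):
    """Turn a {state: dollars} mapping into one target per taxing state."""
    targets = []
    for state, value in sorted(collections.items()):
        if state in NO_INCOME_TAX_STATES:
            # Assumption: states without an income tax get no target.
            continue
        targets.append(
            Target(state, STATE_INCOME_TAX_GROUP_ID, "state_income_tax", period, value)
        )
    return targets


if __name__ == "__main__":
    for target in build_targets(FY2023_COLLECTIONS):
        print(target)
```

Whether no-income-tax states should be skipped or given an explicit zero target is a design choice; the PR says only that they receive "proper handling".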

Test plan

  • Run make database to verify ETL executes successfully
  • Check database contains state income tax strata and targets
  • Verify validation passes (state_income_tax variable exists in policyengine-us)
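The second check above could be scripted roughly as follows. The table and column names (`strata`, `targets`, `stratum_id`, `variable`) are hypothetical stand-ins, since the real database schema is not shown in this PR; the demo runs against an in-memory SQLite database rather than the output of make database.

```python
import sqlite3

# stratum_group_id reserved for state income tax, as stated in this PR.
STATE_INCOME_TAX_GROUP_ID = 7


def make_demo_db():
    """Build an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE strata (
            stratum_id INTEGER PRIMARY KEY,
            stratum_group_id INTEGER NOT NULL,
            state TEXT
        );
        CREATE TABLE targets (
            target_id INTEGER PRIMARY KEY,
            stratum_id INTEGER NOT NULL REFERENCES strata(stratum_id),
            variable TEXT NOT NULL,
            period INTEGER NOT NULL,
            value REAL NOT NULL
        );
        """
    )
    # One state income tax stratum (Ohio) and one unrelated stratum.
    conn.execute("INSERT INTO strata VALUES (1, ?, 'OH')", (STATE_INCOME_TAX_GROUP_ID,))
    conn.execute("INSERT INTO strata VALUES (2, 1, 'OH')")
    conn.execute("INSERT INTO targets VALUES (1, 1, 'state_income_tax', 2023, 9.52e9)")
    conn.commit()
    return conn


def count_state_income_tax_rows(conn):
    """Return (number of strata, number of targets) for state income tax."""
    strata = conn.execute(
        "SELECT COUNT(*) FROM strata WHERE stratum_group_id = ?",
        (STATE_INCOME_TAX_GROUP_ID,),
    ).fetchone()[0]
    targets = conn.execute(
        """
        SELECT COUNT(*)
        FROM targets AS t
        JOIN strata AS s ON s.stratum_id = t.stratum_id
        WHERE s.stratum_group_id = ? AND t.variable = ?
        """,
        (STATE_INCOME_TAX_GROUP_ID, "state_income_tax"),
    ).fetchone()[0]
    return strata, targets


if __name__ == "__main__":
    print(count_state_income_tax_rows(make_demo_db()))
```

Against the real database, the same query shape would apply once the actual table and column names are substituted.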

Context

Addresses an issue in which PolicyEngine overestimated state income tax revenue because calibration targets were missing. For example, Ohio showed ~$24B against actual collections of ~$10B.

Closes #492

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits January 30, 2026 18:50
Adds ETL pipeline for state-level individual income tax collections using
Census Bureau's Annual Survey of State Government Tax Collections (STC).

- New ETL script: etl_state_income_tax.py with FY2023 data for all 50 states + DC
- Uses stratum_group_id 7 for state income tax strata
- Includes proper handling for states without income tax (AK, FL, NV, SD, TX, WA, WY, NH, TN)
- Ohio value confirmed at $9.52B matching external sources
- Total collections across all states: ~$531B

This addresses the calibration gap where PolicyEngine was overestimating state
income tax (e.g., $24B for Ohio vs actual $10B) due to missing calibration targets.

Closes #492

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MaxGhenis added a commit that referenced this pull request Jan 31, 2026
Adds ETL pipeline for state-level individual income tax collections
from Census Bureau's Annual Survey of State Government Tax Collections
(STC) using FY2023 data for all 50 states + DC ($531B total).

Recreated from PR #493 rebased onto main.

Closes #492

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis
Contributor Author

Superseded by a new PR rebased onto main (this one was targeting the stale db-work branch).

@MaxGhenis MaxGhenis closed this Jan 31, 2026