[corrected] populace_us_2024 itemized-deduction report retracted (measurement artifact); residual = state-level SOI income targets #122

@MaxGhenis

⚠️ Correction (2026-06-19) — original headline retracted

The original version of this issue claimed populace_us_2024 overstates itemized deductions ~2–4×. That was a measurement artifact and is retracted. I had summed itemized_taxable_income_deductions over all ~199M tax units — the potential ("if you itemized") amount for everyone — instead of claimed deductions by the ~18–21M who actually itemize. The potential metric overstates the level for every dataset and isn't comparable to SOI/JCT.
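To make the potential-vs-claimed distinction concrete, here is a toy sketch. Everything in it is hypothetical (the `TaxUnit` record, the weights, the dollar amounts, and the simplifying rule that a unit itemizes only when that beats its standard deduction); it is not populace data and not how PolicyEngine decides who itemizes. It only shows why summing the potential amount over everyone inflates the total.

```python
from dataclasses import dataclass


@dataclass
class TaxUnit:
    # All fields are made-up illustration values.
    weight: float              # number of real tax units this record represents
    potential_itemized: float  # deductions available *if* the unit itemized
    standard_deduction: float

    @property
    def itemizes(self) -> bool:
        # Simplifying assumption: itemize only when it beats the standard deduction.
        return self.potential_itemized > self.standard_deduction


units = [
    TaxUnit(weight=1000, potential_itemized=4_000, standard_deduction=15_000),
    TaxUnit(weight=800, potential_itemized=9_000, standard_deduction=30_000),
    TaxUnit(weight=200, potential_itemized=40_000, standard_deduction=30_000),
]


def potential_total(units):
    # The retracted metric: weighted sum over every unit, itemizer or not.
    return sum(u.weight * u.potential_itemized for u in units)


def claimed_total(units):
    # The corrected metric: weighted sum over units that actually itemize.
    return sum(u.weight * u.potential_itemized for u in units if u.itemizes)


print("potential:", potential_total(units))
print("claimed:  ", claimed_total(units))
print("ratio:    ", potential_total(units) / claimed_total(units))
```

In this toy population only the third record itemizes, so the potential sum includes deductions that two of the three groups never claim.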

On the correct metric (claimed deductions by itemizers + the actual itemized-repeal score, TY2026), populace itemized data is reasonable, including the build the Groundwork dashboard served:

| dataset | claimed itemized | claimed charitable | itemizers | itemized-repeal score |
|---|---|---|---|---|
| populace latest (c86a631) | $870B | $285B | 17.6M | $99B |
| populace a912aea (dashboard's build) | $1,013B | $328B | 20.9M | $109B |
| external anchor | SOI ~$650B | SOI ~$260B | | JCT ~$184B |

Claimed charitable is close to SOI (~$285–328B vs ~$260B), and the itemized-repeal score is below JCT (~$99–109B vs ~$184B). So populace's itemized-deduction data did not cause the Groundwork blow-up, and there is no 2–4× itemized inflation. Apologies for the noise.

Where the real problems actually live (not populace itemized data)

  • The Groundwork dashboard's "~$7T" result (implying a $3k/person credit + Medicare-for-All twice over) is ~10× the entire repeal stack (itemized ~$109B + standard-deduction ~$292B + CTC ~$110B + EITC ~$70B + ALD ~$130B ≈ $0.7T). That points to a reform-construction / economy-endpoint bug in refundable-credit-conversion (units/scaling/double-count) — not the dataset. Belongs in that repo.
  • The enhanced-CPS artifact currently on Hugging Face is broken (TY2026 claimed charitable $3,008B — impossible vs ~$560B total US giving; itemized-repeal $1,048B). That's a policyengine-us-data artifact problem (the live API appears to serve a good pinned eCPS), not populace.
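The two sanity checks above are plain arithmetic on the figures quoted in this issue; a quick sketch to reproduce them (the dictionary keys and variable names are just labels I chose for illustration):

```python
# Repeal-stack components as quoted in this issue, in $B.
repeal_stack_bn = {
    "itemized": 109,
    "standard_deduction": 292,
    "ctc": 110,
    "eitc": 70,
    "ald": 130,
}
stack_total_bn = sum(repeal_stack_bn.values())

# Dashboard headline (~$7T) relative to the whole repeal stack.
dashboard_result_bn = 7_000
dashboard_ratio = dashboard_result_bn / stack_total_bn

# Broken eCPS artifact: claimed charitable vs total US giving, both in $B.
charitable_ratio = 3_008 / 560

print(f"repeal stack: ${stack_total_bn}B")
print(f"dashboard / stack: {dashboard_ratio:.1f}x")
print(f"eCPS charitable / total giving: {charitable_ratio:.1f}x")
```

Both ratios are far outside anything a dataset-level calibration miss would plausibly produce, which is why I read them as construction or artifact bugs.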

Genuine populace residual (needs triage, not yet root-caused)

The latest build's own calibration_diagnostics.json (c86a631) reports 189 / 4,408 targets >100% off (521 >10%), concentrated in state-level SOI income components — e.g. qbi_amount (US +485%), state rental/royalty income (NY estimate $81.7B vs ~$0 target), state taxable-interest (GA +1209%). Some of these are likely degenerate (near-zero targets — overlaps #104) and some look like real misses; I have not verified which. National income, AGI, wages, EITC, CTC, and (now confirmed) itemized deductions calibrate fine.
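One way the triage could go, as a rough sketch: separate targets whose relative error is meaningless because the target is near zero from targets that are genuinely missed. The record layout (`name`, `target`, `estimate`), the `floor` and `tol` thresholds, and the sample rows are all assumptions for illustration; the real schema of calibration_diagnostics.json is not shown here, and only the NY $81.7B estimate is taken from this issue (with its "~$0" target written as 0).

```python
def classify(target, estimate, floor=1e6, tol=1.0):
    """Label a calibration target as 'degenerate', 'miss', or 'ok'.

    floor: targets smaller than this (in dollars) are treated as near-zero,
           so a relative error against them is not informative (assumed value).
    tol:   relative error above which a non-degenerate target counts as a
           miss; 1.0 corresponds to ">100% off".
    """
    if abs(target) < floor:
        return "degenerate"
    rel_err = abs(estimate - target) / abs(target)
    return "miss" if rel_err > tol else "ok"


# Hypothetical rows; only the NY estimate comes from the issue text.
rows = [
    {"name": "ny_rental_royalty", "target": 0.0, "estimate": 81.7e9},
    {"name": "illustrative_large_miss", "target": 10e9, "estimate": 58.5e9},
    {"name": "illustrative_ok", "target": 100e9, "estimate": 104e9},
]

labels = {r["name"]: classify(r["target"], r["estimate"]) for r in rows}
print(labels)
```

A split like this would not say which degenerate targets are harmless, but it would at least show how much of the 189-target tail is a denominator problem rather than a calibration problem.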

This overlaps #104 (zero-valued target weighting), #67 (income-tax regression in a fiscal build), and #57 (structured target fields). Worth deciding whether the state-level SOI income-target tail needs its own gate or folds into those.

Reproduction (corrected metric)

from huggingface_hub import hf_hub_download
from policyengine_core.reforms import Reform
from policyengine_us import Microsimulation
from policyengine_us.data import USSingleYearDataset

# Pinned populace build (c86a631).
p = hf_hub_download(
    "policyengine/populace-us",
    "populace_us_2024.h5",
    repo_type="dataset",
    revision="populace-us-2024-c86a631-6e1bcd0271a5-20260619T002242Z",
)
sim = Microsimulation(dataset=USSingleYearDataset(file_path=p))

# Restrict to tax units that actually itemize, so the sums are claimed (not potential) amounts.
itz = sim.calculate("tax_unit_itemizes", 2026)
print("claimed itemized $B", round(sim.calculate("itemized_taxable_income_deductions", 2026)[itz].sum() / 1e9, 1))
print("claimed charitable $B", round(sim.calculate("charitable_deduction", 2026)[itz].sum() / 1e9, 1))


class Repeal(Reform):
    # Neutralize itemized deductions to score their repeal.
    def apply(self):
        self.neutralize_variable("itemized_taxable_income_deductions")


# Repeal score = income tax under the reform minus baseline income tax.
base = sim.calculate("income_tax", 2026).sum()
ref = Microsimulation(dataset=USSingleYearDataset(file_path=p), reform=Repeal).calculate("income_tax", 2026).sum()
print("itemized-repeal score $B", round((ref - base) / 1e9, 1))  # ~99

Reframing or closing per maintainers' preference — the original headline bug does not exist; the residual above is the real (narrower) question.
