⚠️ Correction (2026-06-19) — original headline retracted
The original version of this issue claimed `populace_us_2024` overstates itemized deductions ~2–4×. That was a measurement artifact and is retracted. I had summed `itemized_taxable_income_deductions` over all ~199M tax units, which gives the potential ("if you itemized") amount for everyone, instead of claimed deductions by the ~18–21M who actually itemize. The potential metric overstates the level for every dataset and isn't comparable to SOI/JCT.
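To make the potential-vs-claimed distinction concrete, here is a minimal sketch on four invented tax units. All numbers and variable names are illustrative assumptions, not populace data, and the itemization rule (itemize when the potential amount exceeds the standard deduction) is a simplification of what the model does.

```python
# Toy tax units; every value here is invented for illustration.
weights = [1000.0, 1000.0, 1000.0, 1000.0]               # tax units each record represents
potential_itemized = [30_000.0, 8_000.0, 5_000.0, 12_000.0]  # "if you itemized" amount
standard_deduction = [15_000.0, 15_000.0, 15_000.0, 15_000.0]

# Simplified rule: a unit itemizes only when itemizing beats the standard deduction.
itemizes = [p > s for p, s in zip(potential_itemized, standard_deduction)]

# Wrong metric: weighted sum over every tax unit (potential deductions).
potential_total = sum(p * w for p, w in zip(potential_itemized, weights))

# Correct metric: weighted sum over itemizers only (claimed deductions).
claimed_total = sum(
    p * w for p, w, itz in zip(potential_itemized, weights, itemizes) if itz
)

print("potential $M", potential_total / 1e6)
print("claimed   $M", claimed_total / 1e6)
print("overstatement factor", round(potential_total / claimed_total, 2))
```

In this toy case only one of four units itemizes, so the all-units sum is almost twice the claimed amount even though the underlying data is unchanged.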
On the correct metric (claimed deductions by itemizers + the actual itemized-repeal score, TY2026), populace itemized data is reasonable, including the build the Groundwork dashboard served:
| dataset | claimed itemized | claimed charitable | itemizers | itemized-repeal score |
| --- | --- | --- | --- | --- |
| populace latest (c86a631) | $870B | $285B | 17.6M | $99B |
| populace a912aea (dashboard's build) | $1,013B | $328B | 20.9M | $109B |
| external anchor | SOI ~$650B | SOI ~$260B | — | JCT ~$184B |
Claimed charitable ≈ SOI, and the itemized-repeal score is below JCT (~$99–109B vs ~$184B). So populace's itemized-deduction data did not cause the Groundwork blow-up, and there is no 2–4× itemized inflation. Apologies for the noise.
Where the real problems actually live (not populace itemized data)
- The Groundwork dashboard's "~$7T" result (implying a $3k/person credit + Medicare-for-All twice over) is ~10× the entire repeal stack (itemized ~$109B + standard-deduction ~$292B + CTC ~$110B + EITC ~$70B + ALD ~$130B ≈ $0.7T). That points to a reform-construction / economy-endpoint bug in `refundable-credit-conversion` (units/scaling/double-count), not the dataset. Belongs in that repo.
- The enhanced-CPS artifact currently on Hugging Face is broken (TY2026 claimed charitable $3,008B, which is impossible against ~$560B total US giving; itemized-repeal $1,048B). That's a `policyengine-us-data` artifact problem (the live API appears to serve a good pinned eCPS), not populace.
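The "~10×" figure in the first bullet is plain arithmetic on the approximate component scores quoted there. A quick check, using only those rounded numbers (the dictionary keys are just labels chosen for this sketch):

```python
# Approximate repeal scores from the bullet above, in $ billions.
repeal_stack_billions = {
    "itemized": 109,
    "standard_deduction": 292,
    "ctc": 110,
    "eitc": 70,
    "ald": 130,
}

stack_total = sum(repeal_stack_billions.values())  # total repeal stack, $B
dashboard_billions = 7000                          # the dashboard's "~$7T" result

ratio = dashboard_billions / stack_total
print("repeal stack $B", stack_total)
print("dashboard / stack", round(ratio, 1))
```

The stack sums to about $0.7T, so the dashboard figure is roughly an order of magnitude larger than everything being repealed.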
Genuine populace residual (needs triage, not yet root-caused)
The latest build's own `calibration_diagnostics.json` (c86a631) reports 189 / 4,408 targets >100% off (521 >10%), concentrated in state-level SOI income components, e.g. `qbi_amount` (US +485%), state rental/royalty income (NY estimate $81.7B vs ~$0 target), state taxable-interest (GA +1209%). Some of these are likely degenerate (near-zero targets, overlapping #104) and some look like real misses; I have not verified which. National income, AGI, wages, EITC, CTC, and (now confirmed) itemized deductions calibrate fine.
This overlaps #104 (zero-valued target weighting), #67 (income-tax regression in a fiscal build), and #57 (structured target fields). Worth deciding whether the state-level SOI income-target tail needs its own gate or folds into those.
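One possible way to start the triage is to separate near-zero targets, where relative error is not meaningful, from targets with a real denominator. The sketch below is an assumption-heavy illustration: the record fields (`name`, `target`, `estimate`), the $1M floor, and the sample values are all hypothetical and are not the actual schema or contents of `calibration_diagnostics.json`.

```python
# Hypothetical records; field names and dollar values are invented for illustration.
records = [
    {"name": "NY rental/royalty income", "target": 0.0, "estimate": 81.7e9},
    {"name": "US qbi_amount", "target": 100e9, "estimate": 585e9},
    {"name": "GA taxable interest", "target": 1e9, "estimate": 13.09e9},
    {"name": "US wages", "target": 1000e9, "estimate": 1020e9},
]

NEAR_ZERO_FLOOR = 1e6  # assumed cutoff in dollars below which a target counts as degenerate


def classify(record, floor=NEAR_ZERO_FLOOR):
    """Label one target as degenerate, a >100% miss, a >10% miss, or ok."""
    target, estimate = record["target"], record["estimate"]
    if abs(target) < floor:
        return "degenerate_target"  # relative error undefined or unstable
    rel_err = abs(estimate - target) / abs(target)
    if rel_err > 1.0:
        return "miss_over_100pct"
    if rel_err > 0.1:
        return "miss_over_10pct"
    return "ok"


labels = {r["name"]: classify(r) for r in records}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Counting each bucket would show how much of the 189-target tail is a weighting question for #104 versus misses that need their own investigation.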
Reproduction (corrected metric)
```python
from huggingface_hub import hf_hub_download
from policyengine_core.reforms import Reform
from policyengine_us import Microsimulation
from policyengine_us.data import USSingleYearDataset

# Download the pinned populace build (c86a631).
p = hf_hub_download(
    "policyengine/populace-us",
    "populace_us_2024.h5",
    repo_type="dataset",
    revision="populace-us-2024-c86a631-6e1bcd0271a5-20260619T002242Z",
)
sim = Microsimulation(dataset=USSingleYearDataset(file_path=p))

# Restrict to tax units that actually itemize, so the sums are claimed (not potential) amounts.
itz = sim.calculate("tax_unit_itemizes", 2026)
print("claimed itemized $B", round(sim.calculate("itemized_taxable_income_deductions", 2026)[itz].sum() / 1e9, 1))
print("claimed charitable $B", round(sim.calculate("charitable_deduction", 2026)[itz].sum() / 1e9, 1))


class Repeal(Reform):
    # Neutralize itemized deductions to score their repeal.
    def apply(self):
        self.neutralize_variable("itemized_taxable_income_deductions")


# Repeal score = income tax under the reform minus baseline income tax.
base = sim.calculate("income_tax", 2026).sum()
ref = Microsimulation(dataset=USSingleYearDataset(file_path=p), reform=Repeal).calculate("income_tax", 2026).sum()
print("itemized-repeal score $B", round((ref - base) / 1e9, 1))  # ~99
```
Reframing or closing per maintainers' preference — the original headline bug does not exist; the residual above is the real (narrower) question.