Summary
The US bundle carries capital gain distributions only for filers who report them without Schedule D (non_sch_d_capital_gains = PUF E01100, ~$13.7B in 2026 across 4.0M tax units, SOI-calibrated). Filers with a Schedule D report their distributions on Schedule D line 13, where they fold into long_term_capital_gains (PUF P23250) and are no longer separable. That Schedule-D-routed slice is most of the dollars: ICI reports shareholders reinvested $234B of capital gain distributions in 2023 (all account types), and ~23M households hold ~$7T of long-term mutual fund assets in nonretirement accounts.
Without separating this component, reforms that treat fund distributions differently from other realized gains can't be modeled. The immediate request is the GROWTH Act (H.R. 2089 / S. 1839), which defers tax on automatically reinvested RIC capital gain dividends. Companion engine fix: PolicyEngine/policyengine-us#8839 (E01100 was never entering AGI/NII; fixed and ready).
Proposal
Add a capital_gain_distributions_in_ltcg (name TBD) imputed component of long_term_capital_gains as a US donor stage:
- Donor/targets: SOI Sales of Capital Assets study publishes capital gain distribution totals and their share of net long-term gains by AGI class; SOI Individual Complete Report line-item aggregates become Ledger target references by AGI band.
- Imputation: predict the distribution share of each record's LTCG conditional on AGI, total LTCG, age, and filing status, using the same microimpute + calibrate pattern as the existing PUF-variable stages in packages/populace-build/src/populace/build/us/sources.py.
- RIC/REIT split: Schedule D line 13 includes REIT capital gain dividends, which are outside the bill (subchapter M part II vs part I). Apportion with ICI (fund industry) vs NAREIT data; GROWTH-style reforms apply only to the RIC share.
- Consistency + joint-distribution guard: non_sch_d_capital_gains + the new component should reconcile against total distribution aggregates, and the imputation should preserve the E01100/Schedule-D disjointness. Currently 0.6% of E01100 dollars sit on records that also carry Schedule D gains. That is impossible on a real return (E01100 is by definition reported without Schedule D), and it interacts with the QDCGT worksheet's no-Schedule-D branch in the engine.
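The consistency guard in the last bullet could be checked along these lines. This is a minimal sketch, not the populace implementation: the function names `overlap_share` and `reconciliation_gap` are hypothetical, the toy values are made up, and using a nonzero long_term_capital_gains value as the marker for "carries Schedule D gains" is an assumption.

```python
from typing import Sequence


def overlap_share(
    non_sch_d: Sequence[float],
    ltcg: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted share of E01100 dollars on records that also show Schedule D gains.

    Assumption: a nonzero long_term_capital_gains value stands in for
    "carries Schedule D gains"; the real check may use another indicator.
    """
    total = sum(w * e for e, w in zip(non_sch_d, weights))
    if total == 0:
        return 0.0
    overlap = sum(
        w * e for e, g, w in zip(non_sch_d, ltcg, weights) if e > 0 and g != 0
    )
    return overlap / total


def reconciliation_gap(
    non_sch_d: Sequence[float],
    in_ltcg: Sequence[float],
    weights: Sequence[float],
    target_total: float,
) -> float:
    """Relative gap between modeled total distributions and an external aggregate."""
    modeled = sum(w * (a + b) for a, b, w in zip(non_sch_d, in_ltcg, weights))
    return modeled / target_total - 1.0


# Toy records (made-up values, not PUF data).
non_sch_d = [100.0, 0.0, 50.0, 0.0]    # non_sch_d_capital_gains (E01100)
ltcg = [0.0, 2000.0, 300.0, -500.0]    # long_term_capital_gains (P23250)
in_ltcg = [0.0, 400.0, 60.0, 0.0]      # hypothetical imputed component
weights = [1000.0, 500.0, 10.0, 200.0]

share = overlap_share(non_sch_d, ltcg, weights)
gap = reconciliation_gap(non_sch_d, in_ltcg, weights, target_total=300_000.0)
print(f"overlap share: {share:.4%}, reconciliation gap: {gap:.4%}")
```

A build-time assertion that the overlap share is zero (or below a small tolerance) would keep the disjointness from regressing once the imputation is added.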
The base is inherently taxable-account-only (tax-deferred accounts don't pass distributions through to 1099-DIV), so no retirement-account carve-out is needed.
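To make the imputation and RIC/REIT bullets concrete, here is a rough sketch of how a predicted distribution share might be turned into dollar components on one record. Everything named here is hypothetical: `split_ltcg`, `DistributionSplit`, and the `RIC_SHARE` value are placeholders, not existing populace code or an ICI/NAREIT estimate. Clamping the share to [0, 1] and assigning zero distributions to records with a net long-term loss are simplifying assumptions.

```python
from typing import NamedTuple


class DistributionSplit(NamedTuple):
    total: float  # capital gain distributions folded into LTCG
    ric: float    # fund (RIC) portion, the part GROWTH-style reforms touch
    reit: float   # REIT portion, outside the bill


def split_ltcg(ltcg: float, predicted_share: float, ric_share: float) -> DistributionSplit:
    """Split one record's LTCG into distribution components.

    Assumptions: the predicted share is clamped to [0, 1], and a record
    with a net long-term loss is assigned zero distributions.
    """
    share = min(max(predicted_share, 0.0), 1.0)
    total = max(ltcg, 0.0) * share
    ric = total * ric_share
    return DistributionSplit(total, ric, total - ric)


RIC_SHARE = 0.9  # placeholder value, not an ICI/NAREIT estimate

# (long_term_capital_gains, predicted distribution share) per record; toy values.
records = [(10_000.0, 0.25), (-2_000.0, 0.40), (500.0, 1.20)]
splits = [split_ltcg(g, s, RIC_SHARE) for g, s in records]

for (g, s), sp in zip(records, splits):
    print(f"ltcg={g:>9.0f} share={s:.2f} -> total={sp.total:.0f} ric={sp.ric:.0f} reit={sp.reit:.0f}")
```

The loss clamp is the weakest of these assumptions: a real return can report positive line 13 distributions alongside a net long-term loss, so the actual stage may need to model gross distributions rather than a share of net LTCG.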
Motivation
Requested by The Tax Project for GROWTH Act analysis (revenue and household impacts of deferring reinvested capital gain distributions). Downstream layers — reinvestment/DRIP fraction as an ICI-sourced sensitivity parameter, and multi-year recapture timing — build on this and are out of scope here.
(Refiled from policyengine-us-data#1176 — that repo is deprecated in favor of populace.)