
Impute the Schedule-D-routed share of capital gain distributions within long_term_capital_gains (US) #274

Description

@MaxGhenis

Summary

The US bundle carries capital gain distributions only for filers who report them without Schedule D (non_sch_d_capital_gains = PUF E01100, ~$13.7B in 2026 across 4.0M tax units, SOI-calibrated). Filers with a Schedule D report their distributions on Schedule D line 13, where they fold into long_term_capital_gains (PUF P23250) and are no longer separable. That Schedule-D-routed slice is most of the dollars: ICI reports shareholders reinvested $234B of capital gain distributions in 2023 (all account types), and ~23M households hold ~$7T of long-term mutual fund assets in nonretirement accounts.

Without separating this component, reforms that treat fund distributions differently from other realized gains can't be modeled. The current request is for the GROWTH Act (H.R. 2089 / S. 1839), which defers tax on automatically reinvested RIC capital gain dividends. Companion engine fix: PolicyEngine/policyengine-us#8839 (E01100 was never entering AGI/NII; fixed and ready).

Proposal

Add a capital_gain_distributions_in_ltcg (name TBD) imputed component of long_term_capital_gains as a US donor stage:

  1. Donor/targets: SOI Sales of Capital Assets study publishes capital gain distribution totals and their share of net long-term gains by AGI class; SOI Individual Complete Report line-item aggregates become Ledger target references by AGI band.
  2. Imputation: predict the distribution share of each record's LTCG conditional on AGI, total LTCG, age, and filing status, following the same microimpute + calibrate pattern as the existing PUF-variable stages in packages/populace-build/src/populace/build/us/sources.py.
  3. RIC/REIT split: Schedule D line 13 includes REIT capital gain dividends, which are outside the bill (subchapter M part II vs part I). Apportion with ICI (fund industry) vs NAREIT data; GROWTH-style reforms apply only to the RIC share.
  4. Consistency + joint-distribution guard: non_sch_d_capital_gains + the new component should reconcile against total distribution aggregates, and the imputation should preserve the E01100/Schedule-D disjointness. Currently 0.6% of E01100 dollars sit on records that also carry Schedule D gains, which is impossible on a real return (E01100 is by definition no-Schedule-D) and interacts with the QDCGT worksheet's no-Schedule-D branch in the engine.
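
A minimal sketch of how steps 2 and 3 might combine on a single record, to make the intended decomposition concrete. Everything here is an assumption for illustration: the `TaxUnit` class, the `SHARE_BY_AGI_BAND` table, the `RIC_SHARE` constant, and all numeric values are hypothetical placeholders, not SOI, ICI, or NAREIT figures, and a flat band lookup stands in for the microimpute model. It also treats the distribution component as a share of positive net LTCG only, which ignores records where losses offset distributions.

```python
from dataclasses import dataclass


@dataclass
class TaxUnit:
    agi: float
    long_term_capital_gains: float  # net LTCG (PUF P23250), may be negative


# Hypothetical (AGI lower bound, distribution share of LTCG) pairs.
# Placeholder values only; real shares would come from the imputation stage.
SHARE_BY_AGI_BAND = [(0.0, 0.30), (100_000.0, 0.20), (1_000_000.0, 0.05)]

# Hypothetical RIC fraction of Schedule D line 13 dollars (the rest is REIT).
RIC_SHARE = 0.9


def distribution_share(agi: float) -> float:
    """Return the share for the highest band whose lower bound is <= agi."""
    share = SHARE_BY_AGI_BAND[0][1]  # negative AGI falls into the first band
    for lower_bound, band_share in SHARE_BY_AGI_BAND:
        if agi >= lower_bound:
            share = band_share
    return share


def split_ltcg(unit: TaxUnit) -> dict:
    """Split a record's LTCG into distribution (RIC/REIT) and other gains."""
    # Distributions are never negative, so only positive LTCG is apportioned.
    positive_ltcg = max(unit.long_term_capital_gains, 0.0)
    distributions = positive_ltcg * distribution_share(unit.agi)
    ric = distributions * RIC_SHARE
    return {
        "capital_gain_distributions_in_ltcg": distributions,
        "ric_share": ric,
        "reit_share": distributions - ric,
        "other_ltcg": unit.long_term_capital_gains - distributions,
    }
```

By construction the distribution component plus `other_ltcg` equals the record's original long_term_capital_gains, so the split leaves the existing LTCG total unchanged.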

The base is inherently taxable-account-only (tax-deferred accounts don't pass distributions through to 1099-DIV), so no retirement-account carve-out is needed.
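
The guard in item 4 could be checked along these lines. This is a sketch under stated assumptions: records are plain dicts, and the `weight`, `schedule_d_gains`, and `capital_gain_distributions_in_ltcg` keys, the function names, and the 5% tolerance are hypothetical; only `non_sch_d_capital_gains` is a name taken from this issue.

```python
def disjointness_violation_share(records: list[dict]) -> float:
    """Weighted share of non-Schedule-D distribution dollars that sit on
    records which also carry Schedule D gains (should be zero)."""
    total = 0.0
    violating = 0.0
    for rec in records:
        dollars = rec["weight"] * rec["non_sch_d_capital_gains"]
        total += dollars
        # Any nonzero Schedule D amount marks the record as a Schedule D filer.
        if rec["non_sch_d_capital_gains"] > 0 and rec["schedule_d_gains"] != 0:
            violating += dollars
    return violating / total if total else 0.0


def reconciles(records: list[dict], target_total: float,
               rel_tol: float = 0.05) -> bool:
    """Check that both distribution components together land within
    rel_tol of an external aggregate for total distributions."""
    modeled = sum(
        rec["weight"]
        * (rec["non_sch_d_capital_gains"]
           + rec["capital_gain_distributions_in_ltcg"])
        for rec in records
    )
    return abs(modeled - target_total) <= rel_tol * abs(target_total)
```

A build step could fail or warn when `disjointness_violation_share` is above zero after imputation, and when `reconciles` returns False against the chosen aggregate.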

Motivation

Requested by The Tax Project for GROWTH Act analysis (revenue and household impacts of deferring reinvested capital gain distributions). Downstream layers — reinvestment/DRIP fraction as an ICI-sourced sensitivity parameter, and multi-year recapture timing — build on this and are out of scope here.

(Refiled from policyengine-us-data#1176 — that repo is deprecated in favor of populace.)
