Prevent NIPA proprietors' income from mapping to Schedule C self-employment income #2

@MaxGhenis

Problem

The current Arch-to-Microplex target mapping allows BEA/NIPA proprietors_income_with_inventory_valuation_and_capital_consumption_adjustments to flow through as PolicyEngine self_employment_income.

That PE variable is too narrow for this target, so the comparison produces a false diagnostic failure. PE self_employment_income is Schedule C / non-farm self-employment income, while NIPA proprietors' income is a much broader business/proprietor-income concept.

Concrete local evidence:

  • Arch BEA artifact: /private/tmp/arch_bea_full_bundle_2024/consumer_facts.jsonl
  • Target: arch_target_4
  • Canonical concept: bea_nipa.proprietors_income_with_inventory_valuation_and_capital_consumption_adjustments
  • 2024 value: $2.02308T
  • Current Microplex adapter mapping observed in microplex_us/targets/arch.py:
    • proprietors_income_amount -> self_employment_income

Compared against that target, both datasets (Legacy PE ECPS and Microplex) look catastrophically wrong:

| Source concept | Target | Legacy PE ECPS | Microplex |
| --- | --- | --- | --- |
| BEA/NIPA proprietors' income | $2.023T | $0.281T, 86.1% error | $0.397T, 80.4% error |

But the clean SOI Schedule C comparison shows Microplex is fine on the narrower self_employment_income concept:

| Source concept | Target | Legacy PE ECPS | Microplex |
| --- | --- | --- | --- |
| SOI Schedule C net income | $424.5B | $339.6B, 20.0% error | $421.7B, 0.67% error |

And an IRS broader business-income sanity check is much closer to NIPA scale:

| Source concept | Target | Legacy PE ECPS | Microplex |
| --- | --- | --- | --- |
| IRS Schedule C net + partnership/S-corp net | $2.358T | $1.466T, 37.8% error | $2.446T, 3.8% error |

So this is a semantic mapping/harness issue, not a Microplex self-employment imputation failure.
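
The error figures in the three comparisons can be roughly reproduced with a few lines of Python. This is a minimal sketch that assumes the harness reports absolute relative error, |estimate - target| / |target|; that metric, and the names `COMPARISONS`, `relative_error`, and `error_rows`, are assumptions for illustration. Because the inputs are the rounded values quoted above, the last digit may differ slightly from the reported percentages.

```python
# Values in trillions of USD, copied from the three comparisons above:
# (target, Legacy PE ECPS estimate, Microplex estimate).
COMPARISONS = {
    "BEA/NIPA proprietors' income": (2.023, 0.281, 0.397),
    "SOI Schedule C net income": (0.4245, 0.3396, 0.4217),
    "IRS Schedule C net + partnership/S-corp net": (2.358, 1.466, 2.446),
}


def relative_error(estimate: float, target: float) -> float:
    """Absolute relative error (assumed metric, not confirmed by the harness)."""
    return abs(estimate - target) / abs(target)


def error_rows(comparisons):
    """Return (concept, ECPS error, Microplex error) for each comparison."""
    rows = []
    for concept, (target, ecps, microplex) in comparisons.items():
        rows.append(
            (concept, relative_error(ecps, target), relative_error(microplex, target))
        )
    return rows


for concept, ecps_err, mp_err in error_rows(COMPARISONS):
    print(f"{concept}: ECPS {ecps_err:.1%}, Microplex {mp_err:.1%}")
```

The point of the exercise is the contrast: the same Microplex dataset is far off the NIPA target but within about one percent of the Schedule C target, which is what a concept mismatch looks like.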

Desired fix

Do not map BEA/NIPA proprietors' income directly to PE self_employment_income.

One of these is probably right:

  1. Emit/map the BEA/NIPA fact to a distinct canonical target concept such as proprietors_income / nipa_proprietors_income, with no direct PE variable binding until PolicyEngine has the right composite variable; or
  2. Map it only to an explicit composite expression that includes the relevant business-income components, e.g. Schedule C plus partnership/S-corp and any farm/nonfarm proprietors adjustments required by the NIPA definition.
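
The two options could be represented along these lines. This is only a sketch: `TargetBinding` and its fields are hypothetical and are not the existing structure in microplex_us/targets/arch.py, and the component variable names in the composite are placeholders for whatever the NIPA definition actually requires.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TargetBinding:
    """Hypothetical record tying a canonical target concept to PE variables."""

    concept: str
    pe_variable: Optional[str] = None   # single direct binding, if any
    components: Tuple[str, ...] = ()    # explicit composite, if any

    @property
    def is_composite(self) -> bool:
        return len(self.components) > 1


# Option 1: a distinct concept with no direct PE variable binding yet.
unbound = TargetBinding(concept="nipa_proprietors_income")

# Option 2: an explicitly declared composite (component names are placeholders).
composite = TargetBinding(
    concept="nipa_proprietors_income",
    components=(
        "self_employment_income",
        "partnership_s_corp_income",
        "farm_income",
    ),
)
```

Either way, the NIPA fact never carries a bare `pe_variable="self_employment_income"`, which is the binding this issue asks to eliminate.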

Harness guardrail requested

Please adjust the Arch harness so this cannot recur. Suggested acceptance test:

  • Load the BEA NIPA personal-income consumer facts for 2024.
  • Locate bea_nipa.proprietors_income_with_inventory_valuation_and_capital_consumption_adjustments / source series A041RC.
  • Assert it does not get exposed as a plain downstream self_employment_income target.
  • If it is exposed to a PolicyEngine/Microplex-compatible target, assert the target variable/concept is a dedicated proprietors-income concept or an explicitly declared composite, not Schedule C self_employment_income.
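
A minimal sketch of that acceptance test, assuming each exposed target can be flattened into a dict with `arch_concept`, `variable`, and an optional `components` list. That record shape and the helper name `plain_se_violations` are hypothetical; a real test would build the records from the Arch provider output rather than from literals.

```python
NIPA_PROPRIETORS = (
    "bea_nipa.proprietors_income_with_inventory_valuation_"
    "and_capital_consumption_adjustments"
)
PLAIN_SE_VARIABLE = "self_employment_income"


def plain_se_violations(targets):
    """Return targets that expose NIPA proprietors' income as plain
    self_employment_income without a declared composite."""
    return [
        t
        for t in targets
        if t.get("arch_concept") == NIPA_PROPRIETORS
        and t.get("variable") == PLAIN_SE_VARIABLE
        and not t.get("components")
    ]


def test_nipa_proprietors_income_is_not_plain_self_employment_income():
    # Stand-in records; in the harness these would come from the 2024
    # BEA NIPA personal-income consumer facts.
    targets = [
        {"arch_concept": NIPA_PROPRIETORS, "variable": "nipa_proprietors_income"},
    ]
    assert plain_se_violations(targets) == []
```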

A useful additional guardrail: add a source-concept compatibility allowlist for high-level NIPA concepts, so that concepts containing proprietors_income fail the harness if mapped one-to-one to self_employment_income without an explicit composite mapping. A magnitude sanity check against SOI Schedule C would also catch this: NIPA proprietors' income ($2.023T) is about 4.8x the SOI Schedule C net target ($424.5B) in 2024, so treating them as the same concept should fail loudly.
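
Both guardrails can be sketched as small standalone checks. Everything here is illustrative: the rule table `INCOMPATIBLE_BINDINGS`, the function names, and especially the `max_ratio=2.0` threshold are assumptions, not existing harness settings.

```python
# Hypothetical rule table: concept-name fragment -> PE variables it must not
# bind to one-to-one.
INCOMPATIBLE_BINDINGS = {
    "proprietors_income": frozenset({"self_employment_income"}),
}


def check_binding(concept: str, variable: str, is_composite: bool = False) -> None:
    """Raise if a high-level concept is bound one-to-one to a narrower variable."""
    for fragment, banned in INCOMPATIBLE_BINDINGS.items():
        if fragment in concept and variable in banned and not is_composite:
            raise ValueError(
                f"{concept!r} must not map one-to-one to {variable!r}; "
                "declare an explicit composite instead"
            )


def check_magnitude(candidate: float, reference: float, max_ratio: float = 2.0) -> float:
    """Raise if two targets claimed to be the same concept differ by more than
    max_ratio in either direction; otherwise return the ratio."""
    ratio = candidate / reference
    if ratio > max_ratio or ratio < 1 / max_ratio:
        raise ValueError(
            f"targets differ by a factor of {ratio:.2f}; "
            "they are unlikely to measure the same concept"
        )
    return ratio
```

With the 2024 figures from this issue, `check_magnitude(2023.08, 424.5)` would raise, since the ratio is well above any plausible same-concept tolerance, while a Microplex-scale Schedule C estimate against the SOI target would pass.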

Reproduction notes

Local Arch artifacts used:

  • /private/tmp/arch_bea_full_bundle_2024/consumer_facts.jsonl
  • /Users/maxghenis/CosilicoAI/arch/macro/targets.db

Local Microplex command that surfaced the mapping:

```shell
cd /Users/maxghenis/CosilicoAI/microplex-us
uv run python - <<'PY'
from microplex_us.targets.arch import resolve_arch_sqlite_target_provider
from microplex.targets import TargetQuery

provider = resolve_arch_sqlite_target_provider('/private/tmp/arch_bea_full_bundle_2024/consumer_facts.jsonl')

# Ask for BEA-sourced targets that are exposed as self_employment_income.
ts = provider.load_target_set(TargetQuery(period=2024, provider_filters={
    'sources': ('BEA',),
    'variables': ('self_employment_income',),
}))

# Print the first few matches along with the Arch concept they came from.
for target in ts.targets[:3]:
    print(target.name, target.source, target.measure, target.value, target.metadata.get('arch_concept'))
PY
```

Expected after fix: this should return no plain self_employment_income target for BEA/NIPA proprietors' income, or should return a differently named/composite target with an explicit concept contract.
