Skip to content

Add explicit preTP boundary objects to fix aggregate indexing#1117

Open
SeaCelo wants to merge 3 commits intoPSLmodels:masterfrom
SeaCelo:feature/pr1073-pretp-boundary
Open

Add explicit preTP boundary objects to fix aggregate indexing#1117
SeaCelo wants to merge 3 commits intoPSLmodels:masterfrom
SeaCelo:feature/pr1073-pretp-boundary

Conversation

@SeaCelo
Copy link
Copy Markdown

@SeaCelo SeaCelo commented Apr 28, 2026

This is an alternative to #1073 that takes only the boundary fix, without reindexing the rest of the demographic path.

Master's pre-time-path boundary uses period-0 rates as a stand-in for prior-year rates, which leaves a small accounting residual at period 0 (the issue #1073 set out to fix). This PR addresses it by exposing explicit rho_preTP, imm_rates_preTP, and g_n_preTP fields. aggregates.py reads them with a getattr fallback so existing callers keep working. When OG-Core is fetching demographic data itself, get_pop_objs now fetches one prior year of UN mortality so the boundary identity holds to numerical noise. A new optional pre_mort_rates kwarg lets a caller like OG-USA's Calibration pass prior-year data directly.

The main demographic path arrays (omega, rho, imm_rates, g_n) are byte-identical to master. None of #1073's other changes — timeline back-shift, T+1, [1:] slicing on returned arrays, fixper adjustment, pre_pop_dist removal — are included, as they aren't required for the boundary correction.

For evidence: the population identity at the boundary should satisfy omega[0, s] · (1 + g_n_preTP) ≈ (1 - rho_preTP[s-1]) · omega_S_preTP[s-1] + imm_rates_preTP[s] · omega_S_preTP[s] for s ∈ [1, S-1]. Measured on a synthetic year-varying universe in the regression test:

preTP boundary residual
without pre_mort_rates (legacy alias of period-0 rates) 3.3e-3
with pre_mort_rates supplied 5.5e-17

Tests in tests/test_pretp_boundary.py cover both directions: aggregates use the new fields in SS and TPI, the legacy getattr fallback still works, the existing g_n timing contract is pinned, the prior-year UN fetch produces distinct boundary objects, and the boundary identity collapses to numerical noise when pre_mort_rates is supplied. tests/test_aggregates.py and tests/test_demographics.py pass unchanged.

cc @jdebacker @rickecon

Comment thread ogcore/aggregates.py Outdated
p.rho[0, :].reshape(1, p.S), p.rho[: p.T - 1, :], axis=0
)
rho = np.append(rho_preTP.reshape(1, p.S), p.rho[: p.T - 1, :], axis=0)
growth = np.hstack((pop_growth_rate_preTP, p.g_n[1 : p.T]))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SeaCelo Isn't the appending of $g_{n,preTP}$ inconsistent with Equation 173, which would have ${ g_{n,1},...g_{n,T} }$ in the denominator?

Maybe that equation is incorrect, but I can't see it.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jdebacker . The change you noted is mostly to have a naming symmetry, not an actual change in the indexing. This is how I was thinking about it in this PR:

The first row of the TPI bequest calculation uses the pre-time-path population distribution. For the row to be internally consistent, the mortality and growth rate in it need to describe the same transition — pre-period into the first model period — so all three inputs refer to the same year boundary. Here is what each input is, before and after this PR:

First-row input Master uses This PR uses
population distribution p.omega_S_preTP (preTP) p.omega_S_preTP (unchanged)
mortality p.rho[0, :] (period 0→1) p.rho_preTP (preTP→0)
population growth rate p.g_n[0] (preTP→0) p.g_n_preTP (preTP→0)

In that reading, g_n_preTP is not an additional term before the sequence in Eq. 173. It is the first term in that sequence, $\tilde{g}_{n,1}$ in the equation's notation: the growth rate from the pre-period into the first TPI period.

In the default flow, g_n_preTP and g_n[0] come from the same source (g_n_path[0] in demographics.py). So np.hstack((g_n_preTP, g_n[1:p.T])) evaluates to the same vector as master's p.g_n[:p.T]. No numerical change to the equation.

The numerical fix in this PR is on rho_preTP (and imm_rates_preTP in get_B) — those are the boundary fields master had pointing at the wrong year. Master's g_n[0] was already correct. The g_n_preTP rename is naming symmetry: all three boundary inputs read as named fields rather than mixing *_preTP with [0] indexing.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One source of why I was confused at first was the math-to-code mapping

I'm reading the theory equation as using model-period notation, where the first TPI bequest period has denominator $\tilde{g}_{n,1}$. In Python, that same transition is zero-indexed and historically lives at p.g_n[0], because it is the growth from the pre-period population into omega[0]. The next transition, first-TPI-period to second-TPI-period, is p.g_n[1].

Theory notation Code object Transition
$\tilde{g}_{n,1}$ g_n_preTP / historically p.g_n[0] preTP → first TPI period
$\tilde{g}_{n,2}$ p.g_n[1] first TPI period → second TPI period
... ... ...
$\tilde{g}_{n,T}$ p.g_n[T-1] (T−1)th TPI period → Tth TPI period

That's why np.hstack((g_n_preTP, g_n[1:p.T])) is the code equivalent of $\lbrace \tilde{g}_{n,1}, \ldots, \tilde{g}_{n,T} \rbrace$.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SeaCelo I do think there is a misunderstanding. Let me see if I can clarify.

In the master branch, the indexing should be g_n[0:p.T] = ${ g_{n,1},...g_{n,T} }$. But also in master, since demographics.py doesn't return a g_n_preTP we are assuming that g_n[0]=g_n_preTP.

PR #1073 tries to avoid making that assumption be explicitly computing g_n_preTP. So g_n_preTP is the growth rate in the population from the year before the model start year and g[0]=$g_{n,1}$ is the growth rate in the first period of the model.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jdebacker. You're right that equation 173's denominator is {g_{n,1}, ..., g_{n,T}} — no preTP term. The hstack I added was a cosmetic rename of p.g_n[0] and didn't change any numbers, but it does make the PR look like it's taking a position on the broader g_n contract, which isn't the goal here.

I'll narrow the change. The PR's only boundary overrides will be on mortality and immigration — the two inputs master was getting from the wrong year. Growth stays exactly as master wrote it. Concretely:

  • Revert the TPI hstack to p.g_n[:p.T]
  • Revert both SS/preTP branches (get_B and get_BQ) to use p.g_n[0] directly
  • Keep g_n_preTP as a returned field from get_pop_objs (symmetric with rho_preTP and imm_rates_preTP), but aggregates.py won't read it

Will push the narrower diff shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants