Add explicit preTP boundary objects to fix aggregate indexing#1117
Add explicit preTP boundary objects to fix aggregate indexing#1117SeaCelo wants to merge 3 commits intoPSLmodels:masterfrom
Conversation
| p.rho[0, :].reshape(1, p.S), p.rho[: p.T - 1, :], axis=0 | ||
| ) | ||
| rho = np.append(rho_preTP.reshape(1, p.S), p.rho[: p.T - 1, :], axis=0) | ||
| growth = np.hstack((pop_growth_rate_preTP, p.g_n[1 : p.T])) |
There was a problem hiding this comment.
@SeaCelo Isn't the appending of
Maybe that equation is incorrect, but I can't see it.
There was a problem hiding this comment.
Thanks @jdebacker . The change you noted is mostly to have a naming symmetry, not an actual change in the indexing. This is how I was thinking about it in this PR:
The first row of the TPI bequest calculation uses the pre-time-path population distribution. For the row to be internally consistent, the mortality and growth rate in it need to describe the same transition — pre-period into the first model period — so all three inputs refer to the same year boundary. Here is what each input is, before and after this PR:
| First-row input | Master uses | This PR uses |
|---|---|---|
| population distribution | p.omega_S_preTP (preTP) |
p.omega_S_preTP (unchanged) |
| mortality | p.rho[0, :] (period 0→1) |
p.rho_preTP (preTP→0) |
| population growth rate | p.g_n[0] (preTP→0) |
p.g_n_preTP (preTP→0) |
In that reading, g_n_preTP is not an additional term before the sequence in Eq. 173. It is the first term in that sequence,
In the default flow, g_n_preTP and g_n[0] come from the same source (g_n_path[0] in demographics.py). So np.hstack((g_n_preTP, g_n[1:p.T])) evaluates to the same vector as master's p.g_n[:p.T]. No numerical change to the equation.
The numerical fix in this PR is on rho_preTP (and imm_rates_preTP in get_B) — those are the boundary fields master had pointing at the wrong year. Master's g_n[0] was already correct. The g_n_preTP rename is naming symmetry: all three boundary inputs read as named fields rather than mixing *_preTP with [0] indexing.
There was a problem hiding this comment.
One source of why I was confused at first was the math-to-code mapping
I'm reading the theory equation as using model-period notation, where the first TPI bequest period has denominator p.g_n[0], because it is the growth from the pre-period population into omega[0]. The next transition, first-TPI-period to second-TPI-period, is p.g_n[1].
| Theory notation | Code object | Transition |
|---|---|---|
g_n_preTP / historically p.g_n[0]
|
preTP → first TPI period | |
p.g_n[1] |
first TPI period → second TPI period | |
| ... | ... | ... |
p.g_n[T-1] |
(T−1)th TPI period → Tth TPI period |
That's why np.hstack((g_n_preTP, g_n[1:p.T])) is the code equivalent of
There was a problem hiding this comment.
@SeaCelo I do think there is a misunderstanding. Let me see if I can clarify.
In the master branch, the indexing should be g_n[0:p.T] = master, since demographics.py doesn't return a g_n_preTP we are assuming that g_n[0]=g_n_preTP.
PR #1073 tries to avoid making that assumption be explicitly computing g_n_preTP. So g_n_preTP is the growth rate in the population from the year before the model start year and g[0]=$g_{n,1}$ is the growth rate in the first period of the model.
There was a problem hiding this comment.
Thanks @jdebacker. You're right that equation 173's denominator is {g_{n,1}, ..., g_{n,T}} — no preTP term. The hstack I added was a cosmetic rename of p.g_n[0] and didn't change any numbers, but it does make the PR look like it's taking a position on the broader g_n contract, which isn't the goal here.
I'll narrow the change. The PR's only boundary overrides will be on mortality and immigration — the two inputs master was getting from the wrong year. Growth stays exactly as master wrote it. Concretely:
- Revert the TPI hstack to
p.g_n[:p.T] - Revert both SS/preTP branches (
get_Bandget_BQ) to usep.g_n[0]directly - Keep
g_n_preTPas a returned field fromget_pop_objs(symmetric withrho_preTPandimm_rates_preTP), butaggregates.pywon't read it
Will push the narrower diff shortly.
This is an alternative to #1073 that takes only the boundary fix, without reindexing the rest of the demographic path.
Master's pre-time-path boundary uses period-0 rates as a stand-in for prior-year rates, which leaves a small accounting residual at period 0 (the issue #1073 set out to fix). This PR addresses it by exposing explicit
rho_preTP,imm_rates_preTP, andg_n_preTPfields.aggregates.pyreads them with agetattrfallback so existing callers keep working. When OG-Core is fetching demographic data itself,get_pop_objsnow fetches one prior year of UN mortality so the boundary identity holds to numerical noise. A new optionalpre_mort_rateskwarg lets a caller like OG-USA'sCalibrationpass prior-year data directly.The main demographic path arrays (
omega,rho,imm_rates,g_n) are byte-identical to master. None of #1073's other changes — timeline back-shift,T+1,[1:]slicing on returned arrays,fixperadjustment,pre_pop_distremoval — are included, as they aren't required for the boundary correction.For evidence: the population identity at the boundary should satisfy
omega[0, s] · (1 + g_n_preTP) ≈ (1 - rho_preTP[s-1]) · omega_S_preTP[s-1] + imm_rates_preTP[s] · omega_S_preTP[s]for s ∈ [1, S-1]. Measured on a synthetic year-varying universe in the regression test:pre_mort_rates(legacy alias of period-0 rates)pre_mort_ratessuppliedTests in
tests/test_pretp_boundary.pycover both directions: aggregates use the new fields in SS and TPI, the legacygetattrfallback still works, the existingg_ntiming contract is pinned, the prior-year UN fetch produces distinct boundary objects, and the boundary identity collapses to numerical noise whenpre_mort_ratesis supplied.tests/test_aggregates.pyandtests/test_demographics.pypass unchanged.cc @jdebacker @rickecon