Synth-aggregated duration g-computation in build_netstats (#73) #74
Merged
Closes the partial task on #63: under method = "joint" + duration.method = "joint_lm" in build_netparams, the within-ARTnet stratum aggregation that build_netparams emits is now overridden in build_netstats with synth-population aggregation.

Stratum-level mean.dur.adj values feeding the diss.byage dissolution offset are computed from per-synthetic-ego predictions of joint_lm log-duration, marginalized over partner-race uncertainty using joint_nm_race_model, then median-aggregated within (same.age.grp x index.age.grp) cells.

Downstream effect: when the synthetic target population's joint attribute distribution differs from ARTnet's, dissolution offsets diverge from the within-ARTnet estimates the previous code path produced. Verified on a NHBS-like shifted race.prop run: casl durations diverge meaningfully across strata (matched.1: 68.9 vs 47.9 weeks Atlanta vs NHBS), main is largely unchanged because the joint_lm fit on main has weak race effects.

Implementation:
- New private helper .aggregate_synth_byage_durations() in NetStats.R. Predicts joint_lm log-duration at same.race = 0 and 1 per ego, weights by P(same.race | ego) from joint_nm_race_model, exponentiates, median-aggregates per stratum, applies the existing geometric transformation (1 / (1 - 2^(-1/median))) for mean.dur.adj.
- In build_netstats joint synth-prediction block: alias synth$index.age.grp <- synth$age.grp (joint_lm uses index.age.grp RHS naming; joint_nm_*_model uses age.grp), then call helper for main and casl. Skip when joint_duration_model is NULL (which is the case under duration.method = "empirical" or method = "existing").
- diss.byage dissolution_coefs() calls now read from a local override vector when present, else fall back to netparams[[layer]]$durs.<layer>.byage$mean.dur.adj.
- diss.homog still uses the within-ARTnet aggregation; not consumed by EpiModelHIV-Template's tergm offset, so synth analog can land in a follow-up.
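The helper's steps listed above can be sketched roughly as follows. This is a minimal illustration on toy data, not the code in NetStats.R: `dur_fit` and `race_fit` are plain `lm`/`glm` stand-ins for joint_lm and joint_nm_race_model, every data frame, column value, and function name here is hypothetical, strata are simplified to index.age.grp alone, and the choice to weight on the log scale before exponentiating is an assumption about the order of operations.

```r
set.seed(1)

# Toy "ARTnet-like" partnership data (hypothetical values throughout).
n <- 500
artnet <- data.frame(
  index.age.grp = sample(1:3, n, replace = TRUE),
  race          = sample(1:3, n, replace = TRUE),
  same.race     = rbinom(n, 1, 0.6)
)
artnet$log.dur <- 3 + 0.2 * artnet$index.age.grp +
  0.3 * artnet$same.race + rnorm(n, sd = 0.5)

# Stand-in for joint_lm: log-duration on ego attributes plus same.race.
dur_fit <- lm(log.dur ~ factor(index.age.grp) + factor(race) + same.race,
              data = artnet)
# Stand-in for joint_nm_race_model: P(same.race = 1 | ego).
race_fit <- glm(same.race ~ factor(race), family = binomial, data = artnet)

# Synthetic egos with a shifted race distribution.
synth <- data.frame(
  age.grp = sample(1:3, 2000, replace = TRUE),
  race    = sample(1:3, 2000, replace = TRUE, prob = c(0.6, 0.3, 0.1))
)
# Alias so the duration model finds the column name it was fit with.
synth$index.age.grp <- synth$age.grp

aggregate_byage_sketch <- function(synth, dur_fit, race_fit, dead_row = FALSE) {
  # Per-ego probability that the partner is the same race.
  p_same <- predict(race_fit, newdata = synth, type = "response")
  # Per-ego log-duration predictions under both partner-race scenarios.
  log_d0 <- predict(dur_fit, newdata = transform(synth, same.race = 0))
  log_d1 <- predict(dur_fit, newdata = transform(synth, same.race = 1))
  # Marginalize over partner race (assumed: on the log scale), then exponentiate.
  dur <- exp(p_same * log_d1 + (1 - p_same) * log_d0)
  # Median within each stratum, then the median-to-mean geometric transform.
  med <- tapply(dur, synth$index.age.grp, median)
  out <- as.numeric(1 / (1 - 2^(-1 / med)))
  # Optionally append the trailing mean.dur.adj = 1 row used under sex.cess.mod.
  if (dead_row) out <- c(out, 1)
  out
}

override <- aggregate_byage_sketch(synth, dur_fit, race_fit, dead_row = TRUE)
fallback <- c(100, 120, 140, 1)  # placeholder for the within-ARTnet vector
durs <- if (!is.null(override)) override else fallback
```

The last two lines mirror the override-else-fallback read described for the diss.byage calls: the synth-aggregated vector is used when it exists, and the netparams vector otherwise.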
sex.cess.mod handling: helper preserves the deterministic post-cessation "dead row" (mean.dur.adj = 1) at the end of the override vector so dissolution_coefs sees the same shape it did before this PR.

Validation:
- Backward-compat snapshot harness: 3/3 match on default and explicit method = "existing".
- New tests in test-duration-gcomp-synth.R (7 blocks, 17 assertions): override fires under joint + joint_lm; falls back under empirical; falls back under method = "existing"; diverges under shifted race.prop (>1% on at least one casl stratum); preserves the sex.cess.mod dead row; produces well-formed disscoef objects; handles race = FALSE without joint_nm_race_model.
- Full testthat suite: 525 / 525 pass.
- R CMD check: 0 errors / 0 warnings / 0 notes.
- End-to-end EpiModelHIV-Template run: all 6 ERGMs converge under Stochastic-Approximation; main coef.form drifts -17.570 -> -17.602 under the new synth-aggregated durations vs PR #71, consistent with matched.5 mean.dur.adj moving 445.6 -> 491.5.

Closes the dyad-level synthetic-pair gap raised by PR #71. Logical follow-ups now possible: #65 (Phase 1.5 validation suite, now unblocked); #72 (formation-stat sampling-bias work).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 25, 2026
Closes the partial task on #63 raised by PR #71's review: under `method = "joint"` + `duration.method = "joint_lm"` in `build_netparams`, the within-ARTnet stratum aggregation that `build_netparams` emits is now overridden in `build_netstats` with synth-population aggregation.

## What changed

- New private helper `.aggregate_synth_byage_durations()` in `R/NetStats.R`:
  - Predicts `joint_lm` log-duration at `same.race = 0` and `1` per synth ego.
  - Weights by `P(same.race | ego)` from `joint_nm_race_model`.
  - Exponentiates, median-aggregates per stratum, and applies the existing geometric transformation `mean.dur.adj = 1 / (1 - 2^(-1/median))`.
- `build_netstats` joint synth-prediction block now aliases `synth$index.age.grp <- synth$age.grp` (joint_lm uses `index.age.grp` RHS naming; joint_nm_*_model uses `age.grp`) and calls the helper for main and casl. Skips when `joint_duration_model` is `NULL`.
- `diss.byage` dissolution_coefs calls read from a local override vector when present, else fall back to `netparams[[layer]]$durs.<layer>.byage$mean.dur.adj`.
- `diss.homog` still uses the within-ARTnet aggregation (not consumed by EpiModelHIV-Template's tergm offset; can land in a follow-up).
- `sex.cess.mod` handling: helper preserves the deterministic post-cessation "dead row" (`mean.dur.adj = 1`) at the end of the override vector so `dissolution_coefs` sees the same shape it did before.

## Empirical effect

Stratum-level `mean.dur.adj` (Atlanta + race = TRUE, N = 10k):

Main durations move modestly (joint_lm has weak race effects on duration); casl moves substantially under a shifted population (matched.1: 68.9 Atlanta vs 47.9 NHBS), which is exactly the marginal-vs-joint correction the refactor was designed to apply. This is the dyad-level analog of the formation-stat divergence we documented in the PR #68 / #69 reviews.
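To make the size of that casl shift concrete, the geometric transformation `mean.dur.adj = 1 / (1 - 2^(-1/median))` can be inverted to recover the stratum median implied by each reported value. The sketch below is illustrative only: the function names are hypothetical, and the two inputs are the matched.1 casl values quoted above.

```r
# Under a geometric dissolution model with per-step dissolution probability p,
# the median duration m satisfies (1 - p)^m = 1/2, so p = 1 - 2^(-1/m)
# and the mean duration is 1 / p.
median_to_mean <- function(med) 1 / (1 - 2^(-1 / med))

# Algebraic inverse of the transformation above.
mean_to_median <- function(mean_dur) -1 / log2(1 - 1 / mean_dur)

# casl matched.1 mean.dur.adj values reported in this PR.
reported <- c(Atlanta = 68.9, NHBS = 47.9)

implied_median <- mean_to_median(reported)
print(round(implied_median, 1))

# Round trip: transforming the implied medians returns the reported values.
stopifnot(isTRUE(all.equal(median_to_mean(implied_median), reported)))
```

For medians of this size the transformation is close to `median / log(2) + 0.5`, so a relative change in a stratum median carries over almost proportionally to `mean.dur.adj`.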
## Validation

- Backward-compat snapshot harness: 3/3 match on default and explicit `method = "existing"`.
- New tests in `tests/testthat/test-duration-gcomp-synth.R`, 7 blocks, 17 assertions. Covers: override fires under joint + joint_lm; falls back under empirical durations; falls back under method = "existing"; diverges under shifted race.prop; preserves sex.cess.mod dead row; produces well-formed `disscoef` objects; handles race = FALSE without `joint_nm_race_model`.
- End-to-end EpiModelHIV-Template run: all 6 ERGMs converge under `Stochastic-Approximation`. Main `coef.form` shifts `-17.570 → -17.602` between the path of PR #71 ("Duration methods: empirical + joint_lm (#63 phase 3)") and this PR's, consistent with `matched.5` mean.dur.adj moving `445.6 → 491.5` under synth aggregation.

## Approach note

This implements Option A from #63: ego attributes only on the prediction RHS, with partner-race marginalized via the existing `joint_nm_race_model`. No explicit synthetic partnership-pair construction (which would be Option B). The simpler choice is consistent with how nodematch/absdiff are handled in PR #69 and gives correct results for the cross-sectional-age-of-extant-ties target Steve Goodreau articulated in the PR #71 review.

## Test plan

`method = "existing"`)

Depends on #71 (merged). Part of #63; closes the partial task. Unblocks #65 (Phase 1.5 validation suite).