Add backward-compatibility validation harness for joint g-comp refactor #66
Merged
Sets up inst/validation/ as the pre/post regression harness for the joint g-computation refactor (issues #61-#65). The harness captures golden snapshots of build_netparams() + build_netstats() output on main-era code, then diffs against the refactor branch to verify that method = "existing" (or the legacy default) reproduces prior behavior byte-identically.

Contents:
- README.md documents the capture/compare workflow.
- validate_backward_compat.R provides capture_snapshot() and compare_to_snapshot() entry points. Iterates PARAM_SETS covering Atlanta+race, national no-geog, and Atlanta no-race. Uses a fixed seed so stochastic bits of build_netstats() are reproducible, and strips additive fields (e.g. $joint_model) before comparison.
- epimodelhiv_template_ref/ pins verbatim copies of the downstream consumer scripts from EpiModelHIV-Template/R/A-networks/ so the backward-compat contract is explicit and does not drift silently.
- netstats_contract.md distills exactly which netstats fields the template ERGM specs read.
- snapshots/*.rds is gitignored (large, local).
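The capture/compare workflow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of validate_backward_compat.R: `fake_build()`, `strip_additive()`, `capture_sketch()`, `compare_sketch()`, and the seed value are hypothetical stand-ins, and `fake_build()` merely imitates a stochastic build_netstats()-style output so the example runs without ARTnet or ARTnetData.

```r
# Hypothetical stand-in for build_netparams() + build_netstats(): any
# function whose output depends on the RNG state would do here.
fake_build <- function(n = 5) {
  list(
    edges = rnorm(n),               # stochastic component
    joint_model = "additive field"  # field the refactor adds on top
  )
}

# Drop fields that the refactor adds to the legacy output (assumed name).
strip_additive <- function(x, fields = "joint_model") {
  x[setdiff(names(x), fields)]
}

# Capture a golden snapshot under a fixed (hypothetical) seed.
capture_sketch <- function(path, seed = 20260425) {
  set.seed(seed)
  saveRDS(strip_additive(fake_build()), path)
  invisible(path)
}

# Rebuild under the same seed and compare with zero tolerance.
compare_sketch <- function(path, seed = 20260425) {
  set.seed(seed)
  current <- strip_additive(fake_build())
  golden <- readRDS(path)
  # all.equal() returns TRUE or a character vector describing differences
  isTRUE(all.equal(golden, current, tolerance = 0))
}

snap <- tempfile(fileext = ".rds")
capture_sketch(snap)
compare_sketch(snap)
```

Stripping the additive field on both sides means a refactor may add new output without failing the comparison, while any change to a legacy field still does.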
smjenness added a commit that referenced this pull request on Apr 25, 2026
A 2,800-word standalone writeup at inst/validation/method_refactor_report.md documenting the methodological refactor delivered by PRs #66-#77. Structured as introduction / methods / results / discussion + references + reproducibility section.

Sections cover:
- Intro: ARTnet's role in EpiModelHIV-p; the marginal-vs-joint problem the legacy univariate approach exposed; the ARTnetPredict motivation for fixing the within-ARTnet baseline before forward projection.
- Methods: the three new arguments (`method`, `duration.method`, `target_pop`); per-layer joint Poisson + binomial + Gaussian + log-linear fits; g-computation aggregation in build_netstats; the cross-sectional age-of-extant-ties target for dissolution; the validation infrastructure (snapshot harness, method comparison, GHA CI).
- Results: 229/363 cells (63%) shift > 5% across four scenarios; worst shifts on dissolution durations in matched-and-old strata (-47%), one-time nodematch in older age groups (-51%), and high-deg.main casual nodefactor (+40%); decomposition of the -15% Atlanta main-edges shift attributed to ARTnet's 80.7% White vs Atlanta's 51.5% Black composition; coefficient strengthening on deg.casl (-0.24 -> -0.55), hiv2 (+0.09 -> +0.25), age slope, and the AIC-selected age:deg.casl interaction; end-to-end ERGM convergence with netdx |Z| <= 2.05 across 1000 sims.
- Discussion: implications for EpiModelHIV-p simulations (Atlanta-specific models over-target main edges by 15%); three explicit limitations (geometric tergm dissolution can't honor Weibull k != 1, length-bias and 5-truncation in formation stats not yet addressed in #72, joint_lm uses ongoing partnerships only); ARTnetPredict's three unblocked next steps (corrected 2017-18 baseline, 2022-24 AMIS projection via target_pop data.frame, NHBS post-stratification as a one-line argument); methods paper outline.
Numbers cited are spot-checked against the committed inst/validation/method_comparison.md to ensure the report and the machine-generated comparison agree. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
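For readers unfamiliar with the "cells shifting > 5%" summary in the Results item, the arithmetic can be sketched as below. The definition of shift as `(joint - legacy) / legacy` is an assumption about how the report computes it, and the four target values are made up for illustration; they are not taken from method_comparison.md.

```r
# Hypothetical target statistics; these are NOT values from the report.
legacy <- c(edges = 100, nodematch = 40, nodefactor = 25, duration = 200)
joint  <- c(edges = 85,  nodematch = 41, nodefactor = 35, duration = 106)

# Signed relative change per cell (assumed definition of "shift").
rel_shift <- (joint - legacy) / legacy

# Cells whose absolute shift exceeds the 5% threshold.
flagged <- abs(rel_shift) > 0.05

round(100 * rel_shift, 1)  # percent shift per cell
sum(flagged)               # number of flagged cells
mean(flagged)              # share of flagged cells

# The headline share quoted in the report follows the same arithmetic:
round(100 * 229 / 363)     # 63
```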
Summary
Sets up `inst/validation/` as the pre/post regression harness for the upcoming joint g-computation refactor (issues #61-#65). Locks in a golden reference of `build_netparams()` + `build_netstats()` output from current `main` so future refactors can prove byte-for-byte backward compatibility under `method = "existing"` (or whatever the legacy flag ends up being).

- `validate_backward_compat.R`: `capture_snapshot()` and `compare_to_snapshot()` entry points. Iterates `PARAM_SETS` (Atlanta+race, national no-geog, Atlanta no-race), seeds RNG for reproducibility, strips additive fields (`$joint_model`) before diffing, uses `all.equal(tolerance = 0)`.
- `epimodelhiv_template_ref/`: verbatim pinned copies of the downstream consumer scripts (`initialize.R`, `model_{main,casl,ooff}.R`) from `EpiModelHIV-Template/R/A-networks/` so the backward-compat contract is explicit.
- `netstats_contract.md`: distilled list of exactly which `netstats` fields those scripts read.
- `README.md`: the capture / compare workflow.
- `.gitignore` updated: snapshot `.rds` files stay local (~12 MB each, not worth committing).

Sanity-run locally: all 3 param sets captured successfully and identity-compare reports ALL MATCH.
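The choice of `all.equal(tolerance = 0)` matters because the default tolerance of `all.equal()` (about 1.5e-8) treats small numeric drift as equal. A short base-R illustration, using a made-up value rather than any real netstats field:

```r
golden  <- c(edges = 1234.5)
drifted <- golden * (1 + 1e-10)   # tiny relative change, hypothetical

isTRUE(all.equal(golden, drifted))                 # TRUE: drift is below the default tolerance
isTRUE(all.equal(golden, drifted, tolerance = 0))  # FALSE: any numeric difference is reported
isTRUE(all.equal(golden, golden, tolerance = 0))   # TRUE: identical values still match
```

Wrapping the call in `isTRUE()` is needed because `all.equal()` returns a character description of the differences, not `FALSE`, when the objects differ.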
Test plan
- `capture_snapshot()` runs cleanly against ARTnetData on `main`-era code
- `compare_to_snapshot()` reports ALL MATCH (identity check)
- `method = "existing"` default, re-run compare and confirm ALL MATCH before merging the refactor

Ref: CLAUDE.md §4.7.