Method comparison validation suite (#65, Phase 1.5) by smjenness · Pull Request #76 · EpiModel/ARTnet

smjenness · 2026-04-25T18:23:22Z

Closes #65. Now that the joint g-comp refactor is fully landed (PRs #67 / #68 / #69 / #71 / #74), this PR adds the systematic marginal-vs-joint comparison the validation phase calls for.

What's new

inst/validation/method_comparison.R — three entry points, designed to be sourced via system.file():

source(system.file("validation/method_comparison.R", package = "ARTnet"))
res <- compare_methods()              # full suite, ~30s
summarize_comparison(res)             # console summary
render_comparison_report(res)         # writes inst/validation/method_comparison.md

The harness walks every numeric target stat in netstats per layer (edges, nodefactor_*, nodematch_*, absdiff_*, concurrent, dissolution_duration) and produces a long-format data.frame with one row per (scenario, layer, stat, level) cell: existing and joint values side by side, plus abs_diff and pct_diff.
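As a rough illustration of the long-format layout, the sketch below builds a toy table with the same column names and derives the two difference columns. The values are made up, the `material` flag is a hypothetical helper column, and the definitions of `abs_diff` (joint minus existing) and `pct_diff` (percent change relative to existing) are assumptions; the harness may define them differently.

```r
# Toy long-format table; values are invented for illustration only.
cells <- data.frame(
  scenario = "toy_scenario",
  layer    = c("main", "casl", "inst"),
  stat     = c("edges", "concurrent", "nodematch_age.grp"),
  level    = c(NA, NA, 5),
  existing = c(1000, 200, 40),
  joint    = c(950, 230, 20),
  stringsAsFactors = FALSE
)

# Assumed definitions: signed difference, and percent change relative to existing.
cells$abs_diff <- cells$joint - cells$existing
cells$pct_diff <- 100 * cells$abs_diff / cells$existing

# Hypothetical helper column: flag cells that move by more than 5% in either direction.
cells$material <- abs(cells$pct_diff) > 5

print(cells)
```

Under these assumed definitions a cell that moves by exactly 5% is not flagged, since the threshold is a strict inequality.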

Four scenarios:

| Scenario | Description |
| --- | --- |
| `atlanta_default` | Baseline EpiModelHIV-Template config (Atlanta, `race = TRUE`) |
| `national_no_geog` | No geographic stratification (sanity check) |
| `atlanta_nhbs_shifted` | Atlanta config with `race.prop = c(0.35, 0.25, 0.40)` (NHBS-MSM-like) |
| `atlanta_no_race` | `race = FALSE` path (sanity check) |

inst/validation/method_comparison.md is the canonical rendered report of this run, committed to the repo so anyone can read the comparison without running the harness. It will feed methods-paper drafting work directly.

Headline findings (full breakdown in `method_comparison.md`)

363 target-stat cells across 4 scenarios
229 cells (~63%) shift > 5% under joint vs existing
Per scenario:
- atlanta_default: 63 / 96 cells materially shifted
- national_no_geog: 51 / 96
- atlanta_nhbs_shifted: 66 / 96 (slightly more than default — NHBS race mix stresses the correction further)
- atlanta_no_race: 49 / 75 (race off, but dyad-level corrections still drive the divergence)
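The per-scenario counts above can be cross-checked against the headline totals with a few lines of base R. This only re-does the arithmetic on the numbers quoted in this PR; it does not call the harness.

```r
# Counts quoted in the headline findings above.
tally <- data.frame(
  scenario = c("atlanta_default", "national_no_geog",
               "atlanta_nhbs_shifted", "atlanta_no_race"),
  shifted  = c(63, 51, 66, 49),
  total    = c(96, 96, 96, 75)
)

# Share of cells shifted by more than 5% within each scenario.
tally$pct_shifted <- round(100 * tally$shifted / tally$total, 1)
print(tally)

# Overall totals: these should match the 229 of 363 (~63%) headline.
overall_shifted <- sum(tally$shifted)
overall_total   <- sum(tally$total)
overall_pct     <- round(100 * overall_shifted / overall_total)
cat(overall_shifted, "of", overall_total, "cells, about", overall_pct, "percent\n")
```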

Largest shifts cluster around:

dissolution_duration matched-and-old strata (matched.5 main: −47% across all scenarios). This is the tradeoff between a multivariate fit and a stratum-only empirical estimate with smoothing, documented in PR #71 (Duration methods: empirical + joint_lm, #63 phase 3).
inst$nodematch_age.grp[5] (−51% in atlanta_default): small targets dominated by AIC-selected interactions in the joint dyad fits from PR #69 (Joint dyad-level modeling: nodematch + absdiff, #63 phases 1 & 2).
casl$nodefactor_deg.main[3] (+35–43%): high-deg.main outliers are sparse in ARTnet, so the Poisson joint fit produces noticeably different per-stratum sums than the marginal × deg.main.dist multiplication.
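To show why a joint Poisson fit and a marginal multiplication can disagree on per-stratum sums, here is a toy sketch on simulated data. It is not the ARTnet implementation: the data-generating model, the variable names `samp` and `target`, and the exact form of both estimators are assumptions made for illustration. Only the `race.prop` values are taken from the atlanta_nhbs_shifted scenario.

```r
set.seed(1)

# Simulated "sample" whose race mix differs from the target population.
n <- 2000
samp <- data.frame(
  deg.main = sample(0:2, n, replace = TRUE, prob = c(0.60, 0.35, 0.05)),
  race     = sample(1:3, n, replace = TRUE, prob = c(0.10, 0.10, 0.80))
)
# Casual degree depends on both deg.main and race (assumed toy model).
samp$deg.casl <- rpois(n, exp(-0.5 - 0.4 * samp$deg.main + 0.3 * (samp$race == 1)))

# Simulated target population with the NHBS-like race mix.
N <- 5000
target <- data.frame(
  deg.main = sample(0:2, N, replace = TRUE, prob = c(0.60, 0.35, 0.05)),
  race     = sample(1:3, N, replace = TRUE, prob = c(0.35, 0.25, 0.40))
)

# Marginal approach: sample mean per deg.main stratum times target stratum size.
stratum_mean <- tapply(samp$deg.casl, factor(samp$deg.main, levels = 0:2), mean)
stratum_size <- table(factor(target$deg.main, levels = 0:2))
marginal <- as.numeric(stratum_mean) * as.numeric(stratum_size)

# Joint approach: fit a Poisson model on both covariates, predict for every
# target node, then sum the predictions within each deg.main stratum.
fit <- glm(deg.casl ~ factor(deg.main) + factor(race),
           family = poisson(), data = samp)
target$pred <- predict(fit, newdata = target, type = "response")
joint <- as.numeric(tapply(target$pred, factor(target$deg.main, levels = 0:2), sum))

cmp <- data.frame(
  deg.main = 0:2,
  marginal = marginal,
  joint    = joint,
  pct_diff = 100 * (joint - marginal) / marginal
)
print(cmp)
```

In this toy setup the two columns differ because the marginal means inherit the sample's race mix, while the joint predictions are evaluated on the target's race mix. The sparse top stratum (about 5% of nodes here) is also where the stratum mean is least stable.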

What it confirms

The joint refactor is doing material work, not a no-op: a substantial fraction of every layer's target stats is corrected.
The correction direction is consistent across scenarios (e.g., main dissolution durations always pulled down for older matched strata, whether we're looking at default Atlanta, NHBS-shifted, or national).
atlanta_nhbs_shifted sees more material shifts than atlanta_default, consistent with the marginal-vs-joint critique: the bigger the gap between target-population and ARTnet sample composition, the more the correction matters.

Tests

tests/testthat/test-method-comparison.R (4 blocks): structure check, arithmetic check, dissolution_duration scoping, baseline divergence sanity. The tests use a 2000-node mini scenario for speed; the full suite stays at 5000 nodes.

Test plan

Backward-compat snapshot harness 3/3
Comparison runs without error across all 4 scenarios
Output structure matches spec (long format, all expected stats)
Unit tests pass (4/4); full suite 546/546 passes
R CMD check 0/0/0
inst/validation/method_comparison.md rendered and committed

Closes #65. Logical follow-up: #64 (post-stratification API) — with these comparison results in hand, we know the magnitudes of population-shift effects and can size the API surface accordingly.

Systematic side-by-side comparison of method = "existing" vs method = "joint" + duration.method = "joint_lm" across four scenarios:

- atlanta_default: Baseline EpiModelHIV-Template config
- national_no_geog: No geographic stratification (sanity)
- atlanta_nhbs_shifted: Atlanta with race.prop = c(0.35, 0.25, 0.40)
- atlanta_no_race: race = FALSE path (sanity)

Adds inst/validation/method_comparison.R with three entry points:

    source(system.file("validation/method_comparison.R", package = "ARTnet"))
    res <- compare_methods()              # runs full suite (~30s)
    summarize_comparison(res)             # console summary, top shifts
    render_comparison_report(res)         # writes inst/validation/method_comparison.md

The harness walks every numeric target stat in netstats per layer (edges, nodefactor_*, nodematch_*, absdiff_*, concurrent, dissolution_duration) and produces a long-format data.frame with columns scenario, layer, stat, level, existing, joint, abs_diff, pct_diff. Output is suitable both for interactive inspection and for serializing to publications/vignettes.

Headline findings (full table in method_comparison.md):

- 363 target-stat cells across 4 scenarios
- 229 (~63%) shift > 5% under joint vs existing
- Largest shifts on dissolution_duration (matched.5 main: -47%) and inst nodematch_age.grp[5] (-51% in atlanta_default)
- Population shift (atlanta_nhbs_shifted) produces 66/96 cells >5% shifted, slightly more than atlanta_default's 63/96 -- the NHBS-like race composition stresses the marginal-vs-joint correction further than Atlanta's default
- atlanta_no_race shifts 49/75 cells; without race in the joint formulas, the dyad-level corrections still dominate

inst/validation/method_comparison.md is the canonical report that will feed any methods-paper drafting work.

New tests in tests/testthat/test-method-comparison.R (4 blocks): verify long-format structure; abs_diff and pct_diff arithmetic; dissolution_duration restricted to main/casl; at least one cell materially shifted on Atlanta default. Use a 2000-node mini scenario for speed; full suite in inst/validation/ stays at 5000.

Closes #65 (Phase 1.5 validation suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

smjenness merged commit 4c539b7 into main Apr 25, 2026
1 check passed

smjenness deleted the feature/method-comparison-validation branch April 25, 2026 18:28

smjenness merged 1 commit into main from feature/method-comparison-validation
