Context
build_netstats() currently assembles a synthetic target population from a mix of reference sources:
- age: NCHS 2020 general-population pyramid (not MSM-specific)
- race: ARTnetData::race.dist national or city-specific (not MSM-specific)
- deg.casl: ARTnet sample's own deg.casl.dist
- deg.main: ARTnet sample's own deg.main.dist
- role.class: ARTnet sample distribution
- risk.grp: uniform (5 equal quintiles)
This is a patchwork — no single coherent post-stratification target. If a user wants to parametrize for, say, CDC NHBS 2023 MSM demographics, there is no clean API to do so.
Proposed approach
Add a target_pop argument to build_netstats() that accepts one of:
1. A named list of marginal distributions (current behavior as default):
    target_pop = list(
      age.pyramid = full.age.pyr,  # length = nAges
      race.props = c(Black = 0.15, Hispanic = 0.20, White.Other = 0.65),
      deg.casl = c(0.45, 0.30, 0.15, 0.10),
      deg.main = c(0.60, 0.35, 0.05),
      role.class = c(0.18, 0.27, 0.55),
      risk.grp = rep(0.2, 5)
    )
2. A pre-built data frame of synthetic respondents (user has their own joint distribution):
    target_pop = my_synthetic_pop  # data.frame with age, race, deg.casl, etc.
3. A built-in reference (character flag):
    target_pop = 'nhbs_msm_2022'  # package-provided MSM demographics
For #3, we'd add built-in reference population data to ARTnetData (CDC NHBS or similar).
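To make the three-way argument concrete, here is a minimal sketch of how the input could be normalized before build_netstats() uses it. Everything in it is an assumption for discussion, not existing ARTnet code: the helper name resolve_target_pop(), the builtin registry argument, the set of required data frame columns, and the choice to allow a partial list (unspecified marginals would fall back to the current defaults).

```r
# Hypothetical helper (not part of ARTnet): normalizes `target_pop`
# into a tagged list so downstream code can branch on `type`.
resolve_target_pop <- function(target_pop = NULL, builtin = list()) {
  # NULL: caller keeps the current default sources unchanged
  if (is.null(target_pop)) {
    return(list(type = "default", data = NULL))
  }

  # Data frame must be checked before list, since a data.frame is a list
  if (is.data.frame(target_pop)) {
    # Assumed column set; the issue only says "age, race, deg.casl, etc."
    required <- c("age", "race", "deg.casl", "deg.main",
                  "role.class", "risk.grp")
    missing_cols <- setdiff(required, names(target_pop))
    if (length(missing_cols) > 0) {
      stop("target_pop is missing columns: ",
           paste(missing_cols, collapse = ", "))
    }
    return(list(type = "joint", data = target_pop))
  }

  # Named list of marginals; element names follow the example in this issue
  if (is.list(target_pop)) {
    allowed <- c("age.pyramid", "race.props", "deg.casl", "deg.main",
                 "role.class", "risk.grp")
    nms <- names(target_pop)
    if (is.null(nms) || any(!nms %in% allowed)) {
      stop("target_pop list must be named, using only: ",
           paste(allowed, collapse = ", "))
    }
    # Proportion check; age.pyramid is skipped because its scale
    # (counts vs. proportions) is not specified in the issue
    for (nm in setdiff(nms, "age.pyramid")) {
      p <- target_pop[[nm]]
      if (!is.numeric(p) || any(p < 0) || abs(sum(p) - 1) > 1e-8) {
        stop("target_pop$", nm, " must be non-negative and sum to 1")
      }
    }
    return(list(type = "marginals", data = target_pop))
  }

  # Character flag: look up a package-provided reference population
  if (is.character(target_pop) && length(target_pop) == 1L) {
    if (!target_pop %in% names(builtin)) {
      stop("unknown built-in target population: ", target_pop)
    }
    return(list(type = "builtin", data = builtin[[target_pop]]))
  }

  stop("target_pop must be NULL, a named list, a data.frame, or a string")
}
```

Returning type = "default" for NULL keeps the existing code path untouched, which is one way to satisfy the byte-identical requirement in the acceptance criteria.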
Tasks
- target_pop argument API (three-option: list / data.frame / character).
- Built-in reference population data in ARTnetData (CDC NHBS MSM or similar); coordinate with Sam on data source.
Acceptance criteria
- build_netstats(..., target_pop = NULL) produces current output byte-identically.
- build_netstats(..., target_pop = user_df) uses the user's joint distribution.
- At least one built-in reference MSM population is available.
Related