regrake provides an interface for regularized raking in R.
This more general formulation of the weighting problem, following the approach of Barratt et al. (2021), allows more flexible ways of matching population targets, meaningful regularization, and ultimately more expressive and efficient survey weights.
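For readers who want the shape of the problem, here is a sketch of the optimization as posed in Barratt et al. (2021). The notation is summarized from that paper, not from this README, and the mapping to regrake's arguments is an assumption. Note also that the paper normalizes weights to sum to 1, while regrake reports weights that sum to the sample size.

```latex
\begin{aligned}
\underset{w}{\text{minimize}} \quad & \ell(f, f^{\mathrm{des}}) + \lambda\, r(w) \\
\text{subject to} \quad & f = F w, \quad w \ge 0, \quad \mathbf{1}^\top w = 1
\end{aligned}
```

Here $w$ holds one weight per sample unit, $F$ stacks the per-unit values of the functions being matched (indicators for categories, raw values for means), $f$ is the resulting weighted sample summary, and $f^{\mathrm{des}}$ is the vector of population targets. The loss $\ell$ controls how strictly targets are matched (roughly the role of the rr_*() helpers), and the regularizer $r$ with strength $\lambda$ keeps the weights well behaved (roughly the role of the regularizer argument).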
remotes::install_github("andytimm/regrake")
library(regrake)
set.seed(605)
n <- 500
sample_data <- data.frame(
  sex = sample(c("F", "M"), n, replace = TRUE, prob = c(0.55, 0.45)),
  age_group = sample(c("18-34", "35-54", "55+"), n, replace = TRUE,
                     prob = c(0.40, 0.35, 0.25)),
  income = rnorm(n, mean = 58000, sd = 14000)
)
# autumn-style target table: variable / level / target
pop_targets <- data.frame(
  variable = c("sex", "sex", "age_group", "age_group", "age_group", "income"),
  level = c("F", "M", "18-34", "35-54", "55+", "mean"),
  target = c(0.51, 0.49, 0.30, 0.40, 0.30, 62000)
)
fit <- regrake(
  data = sample_data,
  formula = ~ rr_exact(sex) + rr_exact(age_group) + rr_mean(income),
  population_data = pop_targets,
  pop_type = "proportions",
  regularizer = "entropy",
  bounds = c(0.3, 3)
)
fit
head(fit$weights)
fit$balance
Constraint helpers include:
- rr_exact(): exact matching
- rr_l2(): least-squares matching
- rr_kl(): KL matching
- rr_range() / rr_between(): bounded matching
- rr_mean(): continuous mean matching
- rr_var(): continuous variance matching
- rr_quantile(x, p): quantile matching
Interactions are supported with : (for example rr_l2(sex:age_group)).
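To make the interaction syntax concrete, here is a small base R sketch of the joint cells that sex:age_group refers to. It does not call regrake: the formula is only constructed, never evaluated, and how regrake expects interaction levels to be labeled in a target table is not covered here. The names `cell` and `cell_share` are illustrative.

```r
set.seed(605)
n <- 500
sample_data <- data.frame(
  sex = sample(c("F", "M"), n, replace = TRUE, prob = c(0.55, 0.45)),
  age_group = sample(c("18-34", "35-54", "55+"), n, replace = TRUE,
                     prob = c(0.40, 0.35, 0.25))
)

# An interaction constrains the joint cells (2 sexes x 3 age groups = 6 cells),
# not just the two one-way margins.
cell <- interaction(sample_data$sex, sample_data$age_group, sep = ":")
cell_share <- prop.table(table(cell))
cell_share

# Building a formula does not evaluate rr_l2(), so this line runs
# even without regrake loaded.
f <- ~ rr_l2(sex:age_group)
all.vars(f)
```

A joint target therefore needs one value per cell, which is usually more demanding of the population data than matching the margins alone.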
regrake() supports:
- pop_type = "proportions": autumn-style table with variable, level, target
- pop_type = "raw": one row per population unit
- pop_type = "weighted": population microdata plus a weight column
- pop_type = "anesrake": named list of numeric vectors
- pop_type = "survey": margin/category/value table
- pop_type = "survey_design": survey.design object
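As an illustration of how the same categorical targets look in a few of these shapes, here is a base R sketch that reshapes a proportions table into an anesrake-style named list and a margin/category/value table. It does not call regrake, and the column names `margin`, `category`, and `value` are taken from the description above, so check the package documentation for the exact names it expects.

```r
# "proportions": one row per variable/level with its target share.
pop_targets <- data.frame(
  variable = c("sex", "sex", "age_group", "age_group", "age_group"),
  level    = c("F", "M", "18-34", "35-54", "55+"),
  target   = c(0.51, 0.49, 0.30, 0.40, 0.30)
)

# "anesrake": a named list with one named numeric vector per variable.
by_var <- split(pop_targets, pop_targets$variable)
anesrake_targets <- lapply(by_var, function(d) setNames(d$target, d$level))

# "survey": the same information as a margin/category/value table
# (column names assumed from the list above).
survey_targets <- data.frame(
  margin   = pop_targets$variable,
  category = pop_targets$level,
  value    = pop_targets$target
)

str(anesrake_targets)
survey_targets
```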
A fitted object contains:
- weights: calibrated weights (sum to sample size)
- balance: achieved vs target values by constraint
- diagnostics: convergence and weight-quality diagnostics
- solution: solver internals
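These components can also be sanity-checked by hand. The following is a minimal base R sketch, not regrake's own diagnostics: the weights `w` are stand-ins computed by simple post-stratification on sex, and the names `balance`, `ess`, and `deff` are illustrative. With a real fit, fit$weights would take the place of `w`.

```r
set.seed(605)
n <- 500
sex <- sample(c("F", "M"), n, replace = TRUE, prob = c(0.55, 0.45))
target <- c(F = 0.51, M = 0.49)

# Stand-in weights (NOT regrake output): target share / sample share.
sample_share <- prop.table(table(sex))
w <- as.numeric(target[sex] / sample_share[sex])
w <- w * n / sum(w)  # rescale so the weights sum to the sample size

# Achieved vs target shares, similar in spirit to a balance table.
achieved <- tapply(w, sex, sum) / sum(w)
balance <- data.frame(
  level    = names(target),
  target   = as.numeric(target),
  achieved = as.numeric(achieved[names(target)])
)

# Kish effective sample size and the implied design effect.
ess  <- sum(w)^2 / sum(w^2)
deff <- n / ess

balance
c(ess = ess, deff = deff)
```

An effective sample size well below n signals that a few units carry much of the weight, which is the situation that regularization and weight bounds are meant to limit.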
- Barratt et al. (2021): overview of the underlying ADMM optimization formulation.
- NYOSPM regrake materials: broader motivation and practical context for survey weighting.
1.0.0 released; should get this bad boy on CRAN soon.
Apache License 2.0.