The mia package implements methods to estimate conditional outcome
means in settings with missingness-not-at-random and incomplete
auxiliary variables. Specifically, this package implements the
marginalization over incomplete auxiliaries (MIA) method.

You can install the development version of mia from GitHub with:

# install.packages("devtools")
devtools::install_github("stmcg/mia")

We first load the package.

library(mia)

We will use the example dataset dat.sim included in the package. The
dataset contains 9,297 observations with a continuous outcome Y, a
binary auxiliary variable W, and binary predictors X1 and X2. The
first 10 rows of dat.sim are:

dat.sim[1:10,]
#>           Y X1 X2  W
#> 1        NA  0 NA  0
#> 2        NA  1  1 NA
#> 3  6.066826  1  1  1
#> 4  6.113787  1  1  1
#> 5        NA  1  1 NA
#> 6        NA NA NA  0
#> 7        NA NA NA  0
#> 8        NA  1 NA  0
#> 9  6.439700  1  1 NA
#> 10 6.859992  1 NA  1

The MIA method estimates the conditional outcome mean $E[Y \mid X = x]$ for specified predictor values $x$.
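
Because the method targets data in which the outcome, predictors, and auxiliary variable are all incompletely observed, it can help to tabulate the missingness before modeling. The sketch below does this in base R for the ten rows of dat.sim printed in this README, re-entered by hand into a data frame called first10 (a hypothetical name, used only so the example runs without the package installed). With the package loaded, the same calls could be applied to dat.sim directly.

```r
# The first 10 rows of dat.sim as printed in this README, re-entered by hand.
# `first10` is an illustrative name, not an object from the mia package.
first10 <- data.frame(
  Y  = c(NA, NA, 6.066826, 6.113787, NA, NA, NA, NA, 6.439700, 6.859992),
  X1 = c(0, 1, 1, 1, 1, NA, NA, 1, 1, 1),
  X2 = c(NA, 1, 1, 1, 1, NA, NA, NA, 1, NA),
  W  = c(0, NA, 1, 1, NA, 0, 0, 0, NA, 1)
)

# Number of missing values in each column
n_missing <- colSums(is.na(first10))
n_missing

# Number of rows with no missing values at all
n_complete <- sum(complete.cases(first10))
n_complete
```

In these ten rows only rows 3 and 4 are fully observed, so a complete-case analysis would discard most of the data shown.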

The function mia implements the MIA method to obtain point estimates
of the identifying functionals of $E[Y \mid X = x]$ at two sets of
predictor values, and of their difference. It requires specifying the
following models:

- Y_model: Formula for the outcome model
- W_model: Formula for the auxiliary model when the auxiliary variable is univariate, or a list of formulas for each component of the auxiliary variable

It also requires specifying the names of the predictor variable(s), X_names,
and the two sets of predictor values to compare, X_values_1 and X_values_2.
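
To give some intuition for how an outcome model and an auxiliary model can combine into a conditional mean, here is a minimal base R sketch of plain marginalization over a binary auxiliary variable. It rests on several assumptions: the data are simulated and fully observed, the data-generating values are arbitrary, and the helper marginal_mean is hypothetical. It is not the estimator implemented by mia, which additionally accounts for missingness in the outcome, predictors, and auxiliary variable.

```r
# Simulated, fully observed data (illustrative values only)
set.seed(1)
n  <- 5000
X1 <- rbinom(n, 1, 0.5)
X2 <- rbinom(n, 1, 0.5)
W  <- rbinom(n, 1, plogis(-0.5 + X1))
Y  <- 1 + 2 * W + X1 + 0.5 * X2 + rnorm(n)
sim <- data.frame(Y, X1, X2, W)

# Outcome model and auxiliary model, written as formulas of the kind
# passed to the Y_model and W_model arguments
fit_Y <- lm(Y ~ W + X1 + X2, data = sim)
fit_W <- glm(W ~ X1 + X2, family = binomial(), data = sim)

# Hypothetical helper: marginalize the outcome model over W at fixed
# predictor values, i.e. sum over w of E[Y | W = w, X = x] * P(W = w | X = x)
marginal_mean <- function(x1, x2) {
  newx <- data.frame(X1 = x1, X2 = x2)
  p_w1 <- predict(fit_W, newdata = newx, type = "response")
  m_w1 <- predict(fit_Y, newdata = cbind(newx, W = 1))
  m_w0 <- predict(fit_Y, newdata = cbind(newx, W = 0))
  unname(p_w1 * m_w1 + (1 - p_w1) * m_w0)
}

est_1 <- marginal_mean(0, 1)  # conditional mean at X1 = 0, X2 = 1
est_2 <- marginal_mean(0, 0)  # conditional mean at X1 = 0, X2 = 0
est_1 - est_2                 # mean difference
```

With complete data this reduces to the law of total expectation applied to the two fitted models. Averaging over W, rather than conditioning on it, is what lets the auxiliary variable inform the estimate without appearing in the target quantity.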

An application of mia to estimate $E[Y \mid X_1 = 0, X_2 = 1]$, $E[Y \mid X_1 = 0, X_2 = 0]$, and their difference is given below:

set.seed(1234)
res <- mia(data = dat.sim,
           X_names = c("X1", "X2"),
           X_values_1 = c(0, 1), X_values_2 = c(0, 0),
           Y_model = Y ~ W + X1 + X2, W_model = W ~ X1 + X2)
res
#> MIA METHOD FOR CONDITIONAL MEAN ESTIMATION
#> ==========================================
#>
#> Setting:
#> Outcome variable type: continuous
#> Auxiliary variable(s) type: binary (W)
#>
#> Results:
#> Predictor values: X1=0, X2=1
#> Mean estimate: 2.1335
#>
#> Predictor values: X1=0, X2=0
#> Mean estimate: -0.1636
#>
#> Mean difference estimate: 2.2971

We can obtain 95% confidence intervals around our estimates by applying
the get_CI function to the output of the mia function. The get_CI
function performs a nonparametric bootstrap. Here, we use the percentile
method with 100 bootstrap replicates for ease of computation.

get_CI(res, n_boot = 100, type = 'perc')
#> BOOTSTRAP CONFIDENCE INTERVALS FOR MIA METHOD
#> =============================================
#>
#> Setting:
#> Confidence level: 0.95
#> Interval type: perc
#> Number of replicates: 100
#>
#> Results:
#> Predictor values: X1=0, X2=1
#> CI for mean: (2.0350, 2.2495)
#>
#> Predictor values: X1=0, X2=0
#> CI for mean: (-0.2638, -0.0588)
#>
#> CI for difference: (2.1565, 2.4539)
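
For readers unfamiliar with the percentile method, the following base R sketch shows the general recipe on a deliberately simple statistic, the sample mean of a simulated vector. The data, the number of replicates, and the variable names are all illustrative assumptions. It is not the code used inside get_CI, which resamples the dataset and refits the MIA estimator in each replicate.

```r
# Simulated sample (illustrative values only)
set.seed(1)
y <- rnorm(200, mean = 2, sd = 1)

# Resample the data with replacement and recompute the statistic each time
n_boot <- 1000
boot_means <- replicate(n_boot, mean(sample(y, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap estimates
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci
```

Because the interval endpoints are empirical quantiles of the bootstrap estimates, a small number of replicates (such as the 100 used above for speed) gives noisier endpoints than a larger number would.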