Marginalization over Incomplete Auxiliaries

mia

The mia package implements methods for estimating conditional outcome means when data are missing not at random and the auxiliary variables are themselves incompletely observed. Specifically, it implements the marginalization over incomplete auxiliaries (MIA) method.

Installation

You can install the development version of mia from GitHub with:

# install.packages("devtools")
devtools::install_github("stmcg/mia")

Example

We first load the package.

library(mia)

Data Set

We will use the example dataset dat.sim included in the package. The dataset contains 9,297 observations with a continuous outcome Y, a binary auxiliary variable W, and binary predictors X1 and X2. The first 10 rows of dat.sim are:

dat.sim[1:10,]
#>           Y X1 X2  W
#> 1        NA  0 NA  0
#> 2        NA  1  1 NA
#> 3  6.066826  1  1  1
#> 4  6.113787  1  1  1
#> 5        NA  1  1 NA
#> 6        NA NA NA  0
#> 7        NA NA NA  0
#> 8        NA  1 NA  0
#> 9  6.439700  1  1 NA
#> 10 6.859992  1 NA  1

MIA Method

The MIA method estimates the conditional outcome mean $\mu_{\text{MIA}}(x)$, which is identified by

$$\int_{w} E [ Y | X=x, W=w, M=1 ] \, p( w | X=x, R_W = R_X = 1 ) \, dw$$

where $R_W$ and $R_X$ are indicators that $W$ and $X$, respectively, are non-missing, and $M$ is an indicator of the complete-case pattern (i.e., $Y$, $X$, and $W$ are all non-missing). The MIA method estimates this identifying functional by fitting a model for the conditional mean of $Y$ and a model for the conditional density of $W$, then approximating the integral by Monte Carlo integration.
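
The plug-in and Monte Carlo steps described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the variable names (`toy`, `fit_Y`, `fit_W`, `n_mc`), and the choice of `lm` and `glm` as working models are all assumptions made for illustration, and the toy missingness is completely at random only to keep the example short. Because $W$ is binary here, the integral reduces to a two-term sum, which gives an exact value to compare the Monte Carlo approximation against.

```r
# Hypothetical toy data (not dat.sim): binary X, binary W, continuous Y
set.seed(1)
n <- 5000
X <- rbinom(n, 1, 0.5)
W <- rbinom(n, 1, plogis(-0.5 + X))
Y <- 1 + 2 * W + X + rnorm(n)

# Impose missing values in W and Y (completely at random, for illustration only)
W[runif(n) < 0.2] <- NA
Y[runif(n) < 0.3] <- NA
toy <- data.frame(Y = Y, X = X, W = W)

# Outcome model E[Y | X, W, M = 1]: lm() drops incomplete rows by default,
# so it is fit on complete cases only
fit_Y <- lm(Y ~ W + X, data = toy)

# Auxiliary model p(w | X, R_W = R_X = 1): glm() uses rows where W and X are observed
fit_W <- glm(W ~ X, family = binomial, data = toy)

# Target predictor value x
x <- 1
p_w1 <- predict(fit_W, newdata = data.frame(X = x), type = "response")

# Monte Carlo integration: draw W from the fitted auxiliary model,
# then average the predicted outcome means over the draws
n_mc <- 10000
w_draws <- rbinom(n_mc, 1, p_w1)
mu_mc <- mean(predict(fit_Y, newdata = data.frame(W = w_draws, X = x)))

# Exact check for binary W: the integral is a weighted sum over w = 0, 1
mu_y <- predict(fit_Y, newdata = data.frame(W = c(0, 1), X = x))
mu_exact <- sum(mu_y * c(1 - p_w1, p_w1))

c(monte_carlo = mu_mc, exact_sum = mu_exact)
```

The two values should agree up to Monte Carlo error, which shrinks as `n_mc` grows. For a continuous $W$ no such closed-form sum is available, which is where Monte Carlo integration becomes necessary.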

The function mia implements the MIA method to obtain point estimates of the identifying functionals of $\mu_{\text{MIA}}(x_1)$ and $\mu_{\text{MIA}}(x_2)$ as well as contrasts between them (differences, ratios). This function requires specifying the following regression models:

  • Y_model: Formula for the outcome model
  • W_model: Formula for the auxiliary model when the auxiliary variable is univariate, or a list of formulas for each component of the auxiliary variable

It also requires the names of the variable(s) $X$, given in X_names, and the values $x_1$ and $x_2$ at which to evaluate them, given in X_values_1 and X_values_2, respectively.

An application of mia to estimate $\mu_{\text{MIA}}((0, 1)^\top)$ and $\mu_{\text{MIA}}((0, 0)^\top)$ as well as their difference is given below. Note that we set a random seed because the function uses Monte Carlo integration, so results would otherwise vary slightly between runs.

set.seed(1234)
res <- mia(data = dat.sim,
           X_names = c("X1", "X2"), 
           X_values_1 = c(0, 1), X_values_2 = c(0, 0),
           Y_model = Y ~ W + X1 + X2, W_model = W ~ X1 + X2)
res
#> MIA METHOD FOR CONDITIONAL MEAN ESTIMATION
#> ==========================================
#> 
#> Setting:
#>   Outcome variable type:       continuous
#>   Auxiliary variable(s) type:  binary (W)
#> 
#> Results:
#>   Predictor values:            X1=0, X2=1
#>   Mean estimate:               2.1335
#> 
#>   Predictor values:            X1=0, X2=0
#>   Mean estimate:               -0.1636
#> 
#>   Mean difference estimate:    2.2971

We can obtain 95% confidence intervals for our estimates by applying the get_CI function to the output of the mia function. The get_CI function uses the nonparametric bootstrap. Here, we use the percentile method with 100 bootstrap replicates to keep computation time short.

get_CI(res, n_boot = 100, type = 'perc')
#> BOOTSTRAP CONFIDENCE INTERVALS FOR MIA METHOD
#> =============================================
#> 
#> Setting:
#>   Confidence level:        0.95
#>   Interval type:           perc
#>   Number of replicates:    100
#> 
#> Results:
#>   Predictor values:        X1=0, X2=1
#>   CI for mean:             (2.0350, 2.2495)
#> 
#>   Predictor values:        X1=0, X2=0
#>   CI for mean:             (-0.2638, -0.0588)
#> 
#>   CI for difference:       (2.1565, 2.4539)
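
To make the percentile method concrete, here is a minimal base R sketch of a nonparametric percentile bootstrap for a simple sample mean. It is a generic illustration, not the internals of get_CI: the sample `y` and the names `n_boot`, `boot_est`, and `ci` are hypothetical, and the statistic is a plain mean rather than an MIA estimate.

```r
# Hypothetical sample; in practice the statistic would be re-estimated
# on each resampled dataset
set.seed(1)
y <- rnorm(200, mean = 2)

# Resample the data with replacement and recompute the statistic each time
n_boot <- 100
boot_est <- replicate(n_boot, mean(sample(y, replace = TRUE)))

# Percentile method: take the 2.5% and 97.5% quantiles of the bootstrap estimates
ci <- quantile(boot_est, probs = c(0.025, 0.975))
ci
```

With only 100 replicates the interval endpoints are themselves noisy, so a larger number of replicates is usually preferable when computation time allows.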
