An R package to make imputation simple. Currently supported methods include:

  • Model based (optionally add [non-]parametric random residual)
    • linear regression
    • robust linear regression (M-estimation)
    • ridge/elasticnet/lasso regression (from version >= 0.2.1)
    • CART models
    • Random forest
  • Model based, multivariate
    • Imputation based on EM-estimated parameters (from version >= 0.2.1)
    • missForest (from version >= 0.2.1)
  • Donor imputation (including various donor pool specifications)
    • k-nearest neighbour (based on Gower's distance)
    • sequential hotdeck (LOCF, NOCB)
    • random hotdeck
    • Predictive mean matching
  • Other
    • (groupwise) median imputation (optional random residual)
    • Proxy imputation (copy from other variable)
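
The sequential hotdeck variants in the list above fill a missing value from a neighbouring record: LOCF carries the last observation forward, NOCB carries the next observation backward. A minimal base R sketch of that idea for a single vector follows. The helper names locf and nocb are made up for illustration, and this is not how simputation implements it.

```r
# Hypothetical helpers, base R only (not part of simputation).

# Last observation carried forward; leading NAs stay NA.
locf <- function(x) {
  idx <- cumsum(!is.na(x))          # position of the most recent observed value
  c(NA, x[!is.na(x)])[idx + 1]      # slot 1 holds NA for "nothing observed yet"
}

# Next observation carried backward: LOCF applied to the reversed vector.
nocb <- function(x) rev(locf(rev(x)))

x <- c(NA, 1, NA, NA, 4, NA)
locf(x)  # NA  1  1  1  4  4
nocb(x)  #  1  1  4  4  4 NA
```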


To install simputation together with all packages needed to support the various imputation models, run the following.

install.packages("simputation", dependencies=TRUE)
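
Several methods in the list above require version 0.2.1 or later, so it can be worth checking what is installed. The sketch below uses only base R and utils functions. The variable names are illustrative.

```r
# Check whether simputation is available and which version is installed.
installed <- requireNamespace("simputation", quietly = TRUE)

if (installed) {
  v <- utils::packageVersion("simputation")
  message("simputation ", as.character(v), " is installed")
  if (v < "0.2.1") {
    message("elasticnet, EM and missForest imputation need version >= 0.2.1")
  }
} else {
  message("simputation is not installed")
}
```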

Example usage

Create some data with missing values.

library(simputation) # current package
library(magrittr)    # for the %>% not-a-pipe operator
dat <- iris
# empty a few fields
dat[1:3,1] <- dat[3:7,2] <- dat[8:10,5] <- NA
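
To confirm which fields were emptied, count the missing values per column. This is plain base R. The block rebuilds dat so that it can be run on its own.

```r
dat <- iris
# empty a few fields, as above
dat[1:3, 1] <- dat[3:7, 2] <- dat[8:10, 5] <- NA

# number of missing values in each column
n_missing <- colSums(is.na(dat))
n_missing
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
#            3            5            0            0            3
```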

Now impute Sepal.Length and Sepal.Width by regression on Petal.Length and Species, and impute Species using a CART model that uses all other variables (including the imputed variables in this case).

dat %>%
  impute_lm(Sepal.Length + Sepal.Width ~ Petal.Length + Species) %>%
  impute_cart(Species ~ .) %>% # use all variables except 'Species' as predictor
  head(10)                     # show the first rows, which contain all emptied fields
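
Other methods from the list follow the same formula interface. The sketch below assumes that impute_median treats the right-hand side of the formula as grouping variables and that impute_lm accepts add_residual = "normal", as the package documentation describes. Check ?impute_median and ?impute_lm for your installed version.

```r
library(simputation)

dat2 <- iris
dat2[1:3, 1] <- NA   # empty some Sepal.Length fields
dat2[3:7, 2] <- NA   # empty some Sepal.Width fields

set.seed(1)          # the random residual below makes the result stochastic

# groupwise median: impute Sepal.Length with the median within each Species
imp <- impute_median(dat2, Sepal.Length ~ Species)

# linear regression plus a normally distributed random residual
imp <- impute_lm(imp, Sepal.Width ~ Petal.Length + Species,
                 add_residual = "normal")

anyNA(imp)           # check whether any missing values remain
```

Plain function calls are used here instead of %>% so that the block needs no package other than simputation.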


Beta versions can be installed from my drat repo. If you use the OS whose name shall not be spoken (Windows), first install Rtools.

if(!require(drat)) install.packages("drat")