R C Shell
Switch branches/tags
Nothing to show
Clone or download

README.md

Build Status Coverage Status drat version CRAN DownloadsMentioned in Awesome Official Statistics

simputation

An R package to make imputation simple. Currently supported methods include

  • Model based (optionally add [non-]parametric random residual)
    • linear regression
    • robust linear regression (M-estimation)
    • ridge/elasticnet/lasso regression (from version >= 0.2.1)
    • CART models
    • Random forest
  • Model based, multivariate
    • Imputation based on EM-estimated parameters (from version >= 0.2.1)
    • missForest (from version >= 0.2.1)
  • Donor imputation (including various donor pool specifications)
    • k-nearest neigbour (based on gower's distance)
    • sequential hotdeck (LOCF, NOCB)
    • random hotdeck
    • Predictive mean matching
  • Other
    • (groupwise) median imputation (optional random residual)
    • Proxy imputation (copy from other variable)

Installation

To install simputation and all packages needed to support various imputation models do the following.

install.packages("simputation", dependencies=TRUE)

Example usage

Create some data suffering from missings

library(simputation) # current package
library(magrittr)    # for the %>% not-a-pipe operator
dat <- iris
# empty a few fields
dat[1:3,1] <- dat[3:7,2] <- dat[8:10,5] <- NA
head(dat,10)

Now impute Sepal.Length and Sepal.Width by regression on Petal.Length and Species, and impute Species using a CART model, that uses all other variables (including the imputed variables in this case).

dat %>% 
  impute_lm(Sepal.Length + Sepal.Width ~ Petal.Length + Species) %>%
  impute_cart(Species ~ .) %>% # use all variables except 'Species' as predictor
  head(10)

Materials

Beta versions can be installed from my drat repo. If you use the OS whose name shall not be spoken, first install Rtools.

if(!require(drat)) install.packages("drat")
drat::addRepo("markvanderloo")
install.packages("simputation",type="source")