Can a Python package do what mice can?

Missing data frequently complicate data analysis. Multiple imputation is a robust technique for addressing them. In R, multiple imputation is commonly implemented through the mice package, which uses the multiple imputation by chained equations (MICE) algorithm. MICE imputes missing values iteratively, one variable at a time, and can yield unbiased and confidence-valid inferences under many missing data conditions. Python, however, does not yet have such an established standard choice.

This repository contains code for a model-based simulation study that evaluates whether different Python imputation methods can produce valid inferences under different missingness mechanisms and proportions. The methods considered are KNNImputer, IterativeImputer, miceforest, and MIDASpy.
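As a rough illustration of what varying the missingness mechanism and proportion can look like, the sketch below removes values from a complete variable in two ways. The README does not say which mechanisms the study uses, so the two shown here (missing completely at random, and missing at random depending on a fully observed variable) are assumptions, and the function names `ampute_mcar` and `ampute_mar` are hypothetical and not taken from this repository.

```python
import random


def ampute_mcar(values, proportion, rng):
    """MCAR: each value is set to None independently with probability `proportion`."""
    return [None if rng.random() < proportion else v for v in values]


def ampute_mar(values, driver, proportion, rng):
    """MAR: the chance of a missing value rises with the rank of the observed `driver`.

    Probabilities grow linearly with rank and average to `proportion`
    (exactly so for proportion <= 0.5; above that they are capped at 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: driver[i])
    rank = {idx: r for r, idx in enumerate(order)}
    out = []
    for i, v in enumerate(values):
        p = min(1.0, 2 * proportion * (rank[i] + 0.5) / n)
        out.append(None if rng.random() < p else v)
    return out


# Usage: simulate two correlated variables, then remove about 30% of y.
rng = random.Random(2024)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
y_mcar = ampute_mcar(y, 0.3, rng)
y_mar = ampute_mar(y, x, 0.3, rng)
```

In a simulation study of this kind, each amputed dataset would then be passed to an imputation method and the resulting estimates compared against the known true values.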