To Impute or not to Impute? Missing Data in Treatment Effect Estimation
J. Berrevoets, F. Imrie, T. Kyono, J. Jordon, M. van der Schaar [AISTATS 2023]

In this repository we provide code for our AISTATS 2023 paper introducing mixed confounded missingness (MCM), a novel missingness mechanism for treatment effect inference. Note that this code is meant for research purposes and is not intended for use in practice.

Code author: J. Berrevoets (jb2384@cam.ac.uk)

Installation

pip install -r requirements.txt

Repository structure

This repository is organised as follows:

mcm/
    |- src/
        |- data/
            |- data_module.py               # code to simulate MCM data
            |- utils.py                     # code to split data
    |- notebooks/
        |- <experiment>.ipynb               # one dedicated notebook per experiment
        |- simple_setup.ipynb               # self-contained notebook with a basic experiment

Please run the installation command above in a newly created virtual environment to avoid clashing dependencies.
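
For readers who want a feel for the kind of data MCM describes before opening the notebooks, the following is a minimal sketch written for this README. It is not the simulator in src/data/data_module.py. The function name `simulate_mcm`, its parameters, and all coefficients are hypothetical choices made for illustration. It only mirrors the idea stated in the paper's abstract: some missingness determines treatment selection, and other missingness is determined by treatment selection.

```python
import numpy as np


def simulate_mcm(n=1000, d_pre=2, d_post=2, seed=0):
    """Toy MCM-style data. Illustrative only, not the repository's simulator."""
    rng = np.random.default_rng(seed)
    d = d_pre + d_post

    # Complete covariates; a learner would only ever see the masked version.
    x_full = rng.normal(size=(n, d))

    # Missingness that arises BEFORE treatment: these masks feed into the
    # treatment decision (missingness determines treatment selection).
    m_pre = rng.random((n, d_pre)) < 0.3

    # Treatment depends on one covariate and on how many pre-treatment
    # covariates are missing (coefficients are arbitrary assumptions).
    logits = 0.5 * x_full[:, 0] + 1.0 * m_pre.sum(axis=1) - 0.5
    p_treat = 1.0 / (1.0 + np.exp(-logits))
    w = rng.random(n) < p_treat

    # Missingness that arises AFTER treatment: the chance of a missing value
    # depends on the treatment (missingness determined by treatment).
    p_miss_post = np.where(w, 0.5, 0.1)[:, None]
    m_post = rng.random((n, d_post)) < p_miss_post

    mask = np.hstack([m_pre, m_post])       # True where a value is missing
    x_obs = np.where(mask, np.nan, x_full)  # observed covariates with NaNs

    # Outcome with a constant treatment effect of 2.0, for simplicity.
    y = x_full.sum(axis=1) + 2.0 * w + rng.normal(scale=0.1, size=n)
    return x_obs, mask, w.astype(int), y


if __name__ == "__main__":
    x_obs, mask, w, y = simulate_mcm()
    print("missing rate, pre-treatment columns: ", mask[:, :2].mean())
    print("missing rate, post-treatment columns:", mask[:, 2:].mean())
```

The returned mask keeps the two kinds of missingness in separate column groups (the first `d_pre` columns versus the last `d_post` columns), which is the distinction that selective imputation, as described in the abstract, relies on.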

Citing

If you use this code, please cite the associated paper:


@InProceedings{mcm23,
  title = 	 {To Impute or not to Impute? Missing Data in Treatment Effect Estimation},
  author =       {Berrevoets, Jeroen and Imrie, Fergus and Kyono, Trent and Jordon, James and van der Schaar, Mihaela},
  booktitle = 	 {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {3568--3590},
  year = 	 {2023},
  editor = 	 {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = 	 {206},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--27 Apr},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v206/berrevoets23a/berrevoets23a.pdf},
  url = 	 {https://proceedings.mlr.press/v206/berrevoets23a.html},
  abstract = 	 {Missing data is a systemic problem in practical scenarios that causes noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness a particularly tricky endeavour. A key reason for this is that standard assumptions on missingness are rendered insufficient due to the presence of an additional variable, treatment, besides the input (e.g. an individual) and the label (e.g. an outcome). The treatment variable introduces additional complexity with respect to why some variables are missing that is not fully explored by previous work. In our work we introduce mixed confounded missingness (MCM), a new missingness mechanism where some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poor performing treatment effects models, as the act of imputation effectively removes information necessary to provide unbiased estimates. However, no imputation at all also leads to biased estimates, as missingness determined by treatment introduces bias in covariates. Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data. We highlight that our experiments encompass both average treatment effects and conditional average treatment effects.}
}

