This repository contains the implementation of the ICML 2023 paper [1].
The dependencies are specified in requirements.txt.
- The input data is an N by D matrix whose missing values are indicated by numpy.nan (N is the number of data samples and D is the number of feature dimensions).
- The completed data can also be provided (for evaluation only, not required), i.e., another N by D matrix with the missing values filled in.
- We provide a dataset used in our paper, named seeds, under the datasets folder. It is preprocessed from UCI Datasets.
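The data format described above can be illustrated with a minimal sketch. All names here (X_full, X_miss, mask, X_imputed) are hypothetical and are not taken from this repository. The column-mean imputation is only a placeholder so the example runs end to end; it is not the method of the paper, and the mean absolute error on the missing entries is an assumed evaluation metric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical completed data: N samples, D features (used for evaluation only).
N, D = 6, 3
X_full = rng.normal(size=(N, D))

# Mark roughly 30% of the entries as missing, completely at random.
mask = rng.random((N, D)) < 0.3
mask[0, 0] = True    # guarantee at least one missing entry
mask[-1, :] = False  # keep one fully observed row so every column mean is defined

# Input data: an N by D matrix with missing values set to numpy.nan.
X_miss = X_full.copy()
X_miss[mask] = np.nan

# Placeholder imputation (NOT the paper's method): fill with observed column means.
col_means = np.nanmean(X_miss, axis=0)
X_imputed = np.where(np.isnan(X_miss), col_means, X_miss)

# Assumed metric: mean absolute error on the entries that were missing.
mae = np.abs(X_imputed[mask] - X_full[mask]).mean()
print(f"missing entries: {mask.sum()}, MAE on missing entries: {mae:.4f}")
```

Any imputation method can replace the placeholder step, as long as it takes the N by D matrix with numpy.nan entries and returns a matrix of the same shape with no missing values.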
Simply run demo.py.
We gratefully thank the authors of the following software and datasets:
- UCI Datasets (We used the datasets for evaluation)
- MissingDataOT (We used the functions of data preparation, mask generation, and evaluation)
- hyperimpute (We used the implementations of several baselines)
- FrEIA (We used the implementations of the invertible neural networks)
[1] He Zhao, Ke Sun, Amir Dezfouli, Edwin V. Bonilla, Transformed Distribution Matching for Missing Value Imputation, ICML 2023.
@inproceedings{zhao2023transformed,
title={Transformed Distribution Matching for Missing Value Imputation},
author={Zhao, He and Sun, Ke and Dezfouli, Amir and Bonilla, Edwin V},
booktitle={International Conference on Machine Learning},
pages={42159--42186},
year={2023},
organization={PMLR}
}
All the authors of the paper are with CSIRO's Data61.
The code comes without support.