Code for the paper "FlipTest: Fairness Testing via Optimal Transport". Currently, the code supports exact optimal transport mappings for the Lipton synthetic hiring dataset (Lipton et al., 2018) and the Strategic Subject List (City of Chicago, 2017), as well as GAN approximations for the Lipton hiring dataset.
Links to the paper
How to run the code
You will need an installation of Python 3 with some commonly used data analysis packages, such as
Exact optimal transport
Requires Gurobi. This is proprietary software, but free licenses are available for academic users.
Go to the exact-ot/ directory and run python main.py lipton or python main.py ssl to find the exact optimal transport mapping on the Lipton hiring dataset or the Strategic Subject List, respectively. Alternatively, you can import main.py and run
X1, X2, y1, y2, columns, forward, reverse = run_lipton()  # or run_ssl()
to load the mapping into the current namespace.
X1 and X2 are 2-D numpy arrays of the input features, y1 and y2 are 1-D numpy arrays containing the response, and columns is a list of the feature names. forward and reverse are defined by the following relation: if X1[i] maps to X2[j] under the optimal transport mapping, then forward[i] = j and reverse[j] = i.
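The relation between forward and reverse can be illustrated with a minimal sketch. It does not call run_lipton() or require Gurobi. The toy arrays and the particular mapping are hypothetical, and it assumes that forward and reverse behave like integer index arrays describing a one-to-one mapping, which may not match the exact types returned by main.py.

```python
import numpy as np

# Hypothetical toy features standing in for the arrays returned by
# run_lipton() or run_ssl(); these values are made up for illustration.
X1 = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
X2 = np.array([[4.1, 5.2], [0.2, 0.9], [2.1, 2.8]])

# Assumed toy mapping: X1[0] -> X2[1], X1[1] -> X2[2], X1[2] -> X2[0].
forward = np.array([1, 2, 0])

# Build the inverse index array so that reverse[forward[i]] == i.
reverse = np.empty_like(forward)
reverse[forward] = np.arange(len(forward))

# Check the stated relation: forward[i] = j implies reverse[j] = i.
for i, j in enumerate(forward):
    assert reverse[j] == i

# Row i of X2_matched is the counterpart of X1[i] under the mapping,
# and row j of X1_matched is the counterpart of X2[j].
X2_matched = X2[forward]
X1_matched = X1[reverse]
```

Under these assumptions, indexing with forward or reverse lines up each point with its counterpart, so the two groups can be compared row by row.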
In the gan/ directory, there are Jupyter notebooks containing the results of the GAN experiments on the Lipton hiring dataset. Due to GPU nondeterminism, these results differ slightly from those reported in the paper. The notebooks can be rerun with TensorFlow 2.0 if desired.
(Lipton et al., 2018) Zachary Lipton, Julian McAuley, and Alexandra Chouldechova. Does mitigating ML's impact disparity require treatment disparity? Neural Information Processing Systems, 2018.
(City of Chicago, 2017) City of Chicago. Strategic Subject List. https://data.cityofchicago.org/Public-Safety/Strategic-Subject-List/4aki-r3np, 2017.