Reproducibility code for Differentially Private Permutation Tests: Applications to Kernel Methods, by Kim and Schrab

antoninschrab/dpkernel-paper

Reproducibility code for dpMMD and dpHSIC (Differentially Private Permutation Test)

This GitHub repository contains the code for the reproducible experiments presented in our paper Differentially Private Permutation Tests: Applications to Kernel Methods.

The code is written in JAX, which can leverage GPUs to provide considerable computational speedups.

Dependencies

  • Python 3.9

The packages in env_cpu.yml/env_gpu.yml are required to run our tests and the ones we compare against.

The instructions to install only the dependencies required to run dpMMD and dpHSIC are available on the dpkernel repository.

Installation

In a chosen directory, clone the repository and change to its directory by executing

git clone git@github.com:antoninschrab/dpkernel-paper.git
cd dpkernel-paper

We then recommend creating a conda environment with the required dependencies:

  • for GPU:
    conda env create -f env_gpu.yml
    conda activate dpkernel-env
    # can be deactivated by running:
    # conda deactivate
  • or, for CPU:
    conda env create -f env_cpu.yml
    conda activate dpkernel-env
    # can be deactivated by running:
    # conda deactivate
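
Once the environment is activated, it can be useful to confirm which backend JAX actually picked up. The following is a minimal sketch, not part of the repository: the function name describe_jax_backend is hypothetical, and it only relies on jax.default_backend() and jax.devices().

```python
def describe_jax_backend() -> str:
    """Return a one-line summary of the JAX backend, or a note if JAX is missing."""
    try:
        import jax  # imported lazily so the script also runs without JAX
    except ImportError:
        return "jax is not installed in this environment"

    # jax.devices() lists the devices of the default backend (cpu, gpu or tpu).
    devices = ", ".join(str(device) for device in jax.devices())
    return f"backend={jax.default_backend()} devices=[{devices}]"


if __name__ == "__main__":
    print(describe_jax_backend())
```

If the GPU environment was created but the reported backend is cpu, the GPU build of JAX is probably not visible to this Python interpreter.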

Downloading CelebA dataset

The img_align_celeba.zip file containing the CelebA images can be downloaded either manually from the official website or by executing

wget https://cseweb.ucsd.edu/\~weijian/static/datasets/celeba/img_align_celeba.zip

The file then needs to be unzipped in the main dpkernel-paper directory by running

unzip img_align_celeba.zip

The resulting img_align_celeba directory then contains all the CelebA images.
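
A quick sanity check after unzipping is to count the extracted images. The sketch below is an illustration only: count_images is a hypothetical helper, and it assumes the images sit directly inside img_align_celeba with a .jpg extension. The aligned CelebA release is usually described as containing 202,599 images, so a much smaller count may indicate an interrupted download.

```python
from pathlib import Path


def count_images(directory: str, suffix: str = ".jpg") -> int:
    """Count files with the given suffix located directly inside `directory`."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(
            f"{directory} is not a directory; was the archive unzipped here?"
        )
    # Compare suffixes case-insensitively so .JPG files are counted as well.
    return sum(
        1 for path in root.iterdir()
        if path.is_file() and path.suffix.lower() == suffix
    )


if __name__ == "__main__":
    try:
        print(count_images("img_align_celeba"), "images found")
    except FileNotFoundError as error:
        print(error)
```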

The list of attributes for the CelebA images is already provided as the list_attr_celeba.txt file.
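
For readers who want to inspect the attributes outside the notebooks, here is a minimal parsing sketch. It assumes list_attr_celeba.txt follows the layout of the official CelebA release (first line: number of images, second line: attribute names, then one line per image with the file name followed by 1 or -1 flags). The function parse_celeba_attributes is hypothetical, and sampler_celeba.py may read the file differently.

```python
from typing import Dict, List, Tuple


def parse_celeba_attributes(text: str) -> Tuple[List[str], Dict[str, Dict[str, bool]]]:
    """Parse CelebA-style attribute text into (attribute names, per-image flags)."""
    lines = [line for line in text.splitlines() if line.strip()]
    declared_count = int(lines[0])   # first line: number of images
    names = lines[1].split()         # second line: attribute names
    table: Dict[str, Dict[str, bool]] = {}
    for line in lines[2:]:
        filename, *flags = line.split()
        if len(flags) != len(names):
            raise ValueError(f"{filename}: expected {len(names)} flags, got {len(flags)}")
        # A flag of "1" means the attribute is present, "-1" means absent.
        table[filename] = {name: flag == "1" for name, flag in zip(names, flags)}
    if len(table) != declared_count:
        raise ValueError(f"header declares {declared_count} images, found {len(table)}")
    return names, table


if __name__ == "__main__":
    # Tiny inline sample in the assumed layout; real usage would read the file.
    sample = "2\nSmiling Male\n000001.jpg  1 -1\n000002.jpg -1 -1\n"
    attribute_names, flags_by_image = parse_celeba_attributes(sample)
    print(attribute_names, flags_by_image["000001.jpg"])
```

To parse the real file, pass the result of open("list_attr_celeba.txt").read() to the function.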

Reproducibility of the experiments

The code to reproduce the experiments of the paper can be found in the following notebooks:

These rely on the samplers sampler_perturbations.py and sampler_celeba.py, and can, for example, be opened through Jupyter Lab by running

jupyter lab

The results of the experiments are saved in the results directory. Running the code of the figures.ipynb notebook generates the figures and saves them in the figures directory.

How to use dpMMD and dpHSIC in practice?

Our proposed dpMMD and dpHSIC tests are implemented in dpkernel.py.

To use our tests in practice, we recommend using our dpkernel package which is available on the dpkernel repository. It can be installed by running

pip install git+https://github.com/antoninschrab/dpkernel.git

Installation instructions and example code are available on the dpkernel repository.

We also illustrate how to use the dpMMD and dpHSIC tests in the demo.ipynb notebook.
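
As background for readers new to permutation testing, the sketch below shows a plain, non-private permutation two-sample test with an MMD statistic. It is not the dpMMD implementation from dpkernel.py: it adds no privacy noise, works on one-dimensional samples only, uses the standard library rather than JAX, and all function names are hypothetical. The p-value is computed as (1 + number of permuted statistics at least as large as the observed one) / (1 + number of permutations).

```python
import math
import random
from typing import Callable, Sequence


def gaussian_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0) -> float:
    """Biased (V-statistic) estimate of the squared MMD for 1-D samples."""
    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(
    xs: Sequence[float],
    ys: Sequence[float],
    statistic: Callable[[Sequence[float], Sequence[float]], float] = mmd_squared,
    num_permutations: int = 99,
    seed: int = 0,
) -> float:
    """Non-private permutation p-value for the null hypothesis that xs and ys share a distribution."""
    rng = random.Random(seed)
    observed = statistic(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # random reassignment of points to the two groups
        if statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + num_permutations)


if __name__ == "__main__":
    same = permutation_p_value([0.1 * i for i in range(10)], [0.1 * i + 0.05 for i in range(10)])
    shifted = permutation_p_value([0.1 * i for i in range(10)], [5.0 + 0.1 * i for i in range(10)])
    print("similar samples p-value:", same)
    print("shifted samples p-value:", shifted)
```

The null hypothesis is rejected at level alpha when the p-value is at most alpha. For the private tests, use the dpkernel package or dpkernel.py rather than this sketch.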

References

We implement two general methods for privatising the non-private MMD and HSIC tests, which we refer to as TOT (tot.py) and SARRM (sarrm.py):

We also compare dpMMD to the test of A Differentially Private Kernel Two-Sample Test (Anant Raj, Ho Chung Leon Law, Dino Sejdinovic, Mijung Park), using their implementation from the private_tst repository, which corresponds to the cloned directory private_me.

Contact

If you have any issues running our code, please do not hesitate to contact Antonin Schrab.

Affiliations

Centre for Artificial Intelligence, Department of Computer Science, University College London

Gatsby Computational Neuroscience Unit, University College London

Inria London

Bibtex

@unpublished{kim2023differentially,
title={Differentially Private Permutation Tests: {A}pplications to Kernel Methods}, 
author={Ilmun Kim and Antonin Schrab},
year={2023},
url = {https://arxiv.org/abs/2310.19043},
eprint={2310.19043},
archivePrefix={arXiv},
primaryClass={math.ST}
}

License

MIT License (see LICENSE.md).

Related tests

  • mmdagg: MMD Aggregated MMDAgg test
  • ksdagg: KSD Aggregated KSDAgg test
  • agginc: Efficient MMDAggInc HSICAggInc KSDAggInc tests
  • mmdfuse: MMD-Fuse test
  • dckernel: Robust to Data Corruption dcMMD dcHSIC tests
