Differentially Private Database Release via Kernel Mean Embeddings

Matej Balog, Ilya Tolstikhin, Bernhard Schölkopf

35th International Conference on Machine Learning (ICML 2018)

[PDF] [arXiv]

This repository contains scripts to reproduce the experiments appearing in this academic paper.

Setup

Conda environment setup:

conda create -n RKHS-private-database python=3.6.3 matplotlib=2.1.0 numpy=1.13.3 pytorch=0.2.0 scikit-learn=0.19.0
source activate RKHS-private-database

Data generation

Two synthetic data files were used to generate the plots in the paper:

  • D=2: data/mixture_of_Gaussians_N100000_D2{.npz, .json}
  • D=5: data/mixture_of_Gaussians_N100000_D5{.npz, .json}

You can re-generate these files yourself by executing:

python data.py 100000 2
python data.py 100000 5
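
The README does not document what the generated .npz archives contain, so it can help to inspect one before using it. The following is a minimal sketch, not part of the repository: the helper name describe_npz is hypothetical, and it makes no assumption about array names, listing whatever is stored.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # np.load on an .npz file returns a lazy archive; `.files` lists its array names.
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }
```

For example, describe_npz("data/mixture_of_Gaussians_N100000_D2.npz") would return the stored array names with their shapes and dtypes. Adjust the path to wherever the data files live relative to your working directory.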

Experiments

Figure 1 ("Publishable subset" experiments)

Results of the experiments shown in Figure 1 are stored in the two files

  • D=2: results/D2_alg1_leak_M10000.json
  • D=5: results/D5_alg1_leak_M10000.json

You can re-generate these files by re-running the respective experiments as follows (the relative paths assume the commands are run from the directory that contains experiments.py):

python experiments.py ../data/mixture_of_Gaussians_N100000_D2 leak --M 10000 1
python experiments.py ../data/mixture_of_Gaussians_N100000_D5 leak --M 10000 1

To then re-generate the plots shown in Figure 1, execute:

python plot.py --alg1 ../results/D2_alg1_leak_M10000.json --path_save ../figures/leaksD2
python plot.py --alg1 ../results/D5_alg1_leak_M10000.json --path_save ../figures/leaksD5
[Figure 1: plots saved to figures/leaksD2 (D=2) and figures/leaksD5 (D=5)]

Figure 2 ("No publishable subset" experiments)

To re-run the experiments shown in Figure 2:

python experiments.py ../data/mixture_of_Gaussians_N100000_D2 random --M 10000 1
python experiments.py ../data/mixture_of_Gaussians_N100000_D5 random --M 10000 1
python experiments.py ../data/mixture_of_Gaussians_N100000_D2 random --M 10000 2
python experiments.py ../data/mixture_of_Gaussians_N100000_D5 random --M 10000 2
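
The four commands above differ only in the data dimension and in the final positional argument. A minimal sketch of generating them programmatically follows. The helper name figure2_commands is hypothetical. Reading the final argument as the algorithm number is an assumption, inferred from the alg1/alg2 result file names rather than stated in the README.

```python
import itertools


def figure2_commands(dims=(2, 5), algorithms=(1, 2), M=10000, N=100000):
    """Build the Figure 2 command lines as argument lists.

    Hypothetical helper; the final positional argument is ASSUMED to be
    the algorithm number (inferred from the alg1/alg2 result file names).
    """
    commands = []
    # Outer loop over algorithms, inner loop over dimensions, matching the order above.
    for alg, D in itertools.product(algorithms, dims):
        data = "../data/mixture_of_Gaussians_N{}_D{}".format(N, D)
        commands.append(
            ["python", "experiments.py", data, "random", "--M", str(M), str(alg)]
        )
    return commands


for cmd in figure2_commands():
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) to launch the experiment. The sketch only prints the commands so that they can be checked first.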

To then re-generate the plots shown in Figure 2, execute:

python plot.py --alg1 ../results/D2_alg1_random_M10000.json --alg2 ../results/D2_alg2_random_M10000.json --path_save ../figures/nodataD2
python plot.py --alg1 ../results/D5_alg1_random_M10000.json --alg2 ../results/D5_alg2_random_M10000.json --path_save ../figures/nodataD5
[Figure 2: plots saved to figures/nodataD2 (D=2) and figures/nodataD5 (D=5)]

BibTeX

@inproceedings{balog2018privacy,
  author = {Balog, Matej and Tolstikhin, Ilya and Sch\"olkopf, Bernhard},
  title = {Differentially {Private} {Database} {Release} via {Kernel} {Mean} {Embeddings}},
  booktitle = {35th International Conference on Machine Learning (ICML)},
  year = {2018},
  month = {July}
}
