Differentially private classification of MNIST in PyTorch. This project replicates the results from arXiv:1607.00133v2 [1].
The goal of the project is to train a simple classifier on the MNIST dataset in a differentially private way. A full explanation of the experiment and the techniques used can be found in [1]. As a quick overview, the experiment consists of the following steps:
First, dimension reduction is performed using a differentially private version of PCA (Principal Component Analysis), adapted from [2]. This is done mainly to reduce training time, but it also increases model accuracy by around 2%. Furthermore, the accuracy is fairly stable across different levels of noise applied to the PCA [1].
This step is carried out in src/ppca.py.
The dimension-reduced MNIST dataset is used to train the model provided in src/DPClassifier.py. The training is carried out in mnist_1607.00133.py.
The differentially private optimizer (DPSGD) and the moments accountant are adopted from [3].
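As a rough illustration of the dimension-reduction step, the sketch below perturbs the covariance matrix with symmetric Gaussian noise before extracting the principal directions, in the spirit of the approach described in [1]. It is not the code in src/ppca.py, which is adapted from [2] and may use a different mechanism. The function name noisy_pca_projection, the noise_std parameter, the choice of 60 components, and the random stand-in data are all assumptions made for this example, and noise_std is not calibrated to any particular privacy budget.

```python
import torch


def noisy_pca_projection(x, n_components, noise_std, generator=None):
    """Return a (d, n_components) projection from a noise-perturbed covariance.

    Hypothetical helper for illustration only; not the project's src/ppca.py.
    """
    # Normalize each row to unit L2 norm so one sample's contribution is bounded.
    x = x / x.norm(dim=1, keepdim=True).clamp_min(1e-12)
    cov = x.T @ x  # (d, d) unnormalized covariance matrix

    # Build symmetric Gaussian noise: sample a full matrix, keep the upper
    # triangle (including the diagonal) and mirror it below the diagonal.
    noise = torch.randn(cov.shape, generator=generator, dtype=cov.dtype) * noise_std
    noise = torch.triu(noise) + torch.triu(noise, diagonal=1).T

    # eigh returns eigenvalues in ascending order, so the leading principal
    # directions are the last n_components eigenvector columns.
    _, eigvecs = torch.linalg.eigh(cov + noise)
    return eigvecs[:, -n_components:]


# Usage with random stand-in data shaped like flattened MNIST images.
g = torch.Generator().manual_seed(0)
images = torch.rand(256, 784, generator=g)
proj = noisy_pca_projection(images, n_components=60, noise_std=4.0, generator=g)
reduced = images @ proj  # (256, 60) inputs for the classifier
```

Because the noise is added to the covariance matrix rather than to the data, the projection itself is the only quantity that depends on the private samples in this sketch.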
The results are included as a plot in the results/ folder. In that plot, the noise level increases from left to right, showing how this affects the accuracy of the model.
You can replicate the results yourself using the provided code. Clone the repo and install the requirements with:
$ pip install -r requirements.txt
Note: This command installs the pyvacy library [3] by cloning its repo and running its setup.py.
After setup, run the experiment file:
$ python mnist_1607.00133.py
Disclaimer: The combination of differentially private SGD and the moments accountant used in this project requires computing the gradient for every individual sample. Therefore, the training does not make use of GPU parallelization, making it very slow compared to today's standards for training neural networks.
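To make the per-sample requirement concrete, here is a minimal sketch of a single DP-SGD update along the lines of the algorithm in [1]: each sample's gradient is computed separately, clipped to a maximum L2 norm, summed, perturbed with Gaussian noise, and averaged. This is not the pyvacy implementation [3]. The function dp_sgd_step, its parameter names, the linear stand-in model, and the hyperparameter values are assumptions chosen for illustration.

```python
import torch
from torch import nn


def dp_sgd_step(model, loss_fn, xb, yb, lr, clip_norm, noise_multiplier):
    """Apply one noisy, clipped SGD update in place (hypothetical helper)."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # One backward pass per sample: this loop is the reason training is slow.
    for x, y in zip(xb, yb):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip the full per-sample gradient to an L2 norm of at most clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    with torch.no_grad():
        for p, s in zip(params, summed):
            # Gaussian noise with standard deviation noise_multiplier * clip_norm
            # is added to the summed gradient before averaging over the batch.
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.sub_(lr * (s + noise) / len(xb))


# Usage on random stand-in data shaped like PCA-reduced MNIST.
torch.manual_seed(0)
model = nn.Linear(60, 10)
xb = torch.randn(8, 60)
yb = torch.randint(0, 10, (8,))
dp_sgd_step(model, nn.CrossEntropyLoss(), xb, yb,
            lr=0.1, clip_norm=1.0, noise_multiplier=1.1)
```

The clipping bound limits how much any single sample can move the parameters, which is what allows the added noise to be translated into a privacy guarantee by an accountant.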
[1] Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang. Deep Learning with Differential Privacy. arXiv:1607.00133v2
[2] IBM. IBM Differential Privacy Library.
[3] Chris Waites. PyVacy: Privacy Algorithms for PyTorch.