
Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type Conscious Transformation (UAE-PUPET)

Directory Structure:

uae-pupet
├── .DS_Store     
├── __pycache__   
├── adversaries_and_utility         #weak and strong adversaries and the utility provider
│   ├── MNIST
│   ├── FashionMNIST           
│   ├── UCIAdult            
│   └── USCensus
├── generated_images                #original vs. privatized images are saved here (sample images in the "results" folder)
├── results                         #save results here
│   ├── Figures
│   ├── UPTCurves
│   ├── ...
├── TrainData.csv                   #training data for US Demographic Census Data (USCensus)                      
├── TestData.csv                    #testing data for US Demographic Census Data (USCensus)  
├── adult.csv                       #training and testing data for UCI Adult (UCIAdult)
├── dataset.py                      #dataset pre-processing
├── model.py                        #build models with respect to the dataset
├── original_vs_private_image.py    #generates original vs private image
├── train_preprocessing.py          #additional preprocessing before training the model
├── get_results.py                  #outputs result of the best performing adversary and utility provider
├── loss_curves.py                  #outputs loss curves after training is complete
├── main.py                         #main file for data type ignorant conditions; saves results (file to run)
├── data-type-aware-main.py         #main file for data type aware conditions; saves results (file to run)
├── get_final_results.py            #compile all the results together
└── run.sh                          #bash script to run main.py or data-type-aware-main.py (optional)

Requirements

  1. Use Google Colaboratory to avoid the hassle of installing the different libraries needed to run the source code in this repository.
  2. Otherwise, you can create your own virtual environment and install the following dependencies: python = 3.7.12, pandas = 1.3.5, numpy = 1.19.5, tensorflow = 2.7.0, keras = 2.7.0, scikit-learn = 1.0.2, matplotlib = 3.2.2.

Note: This repository doesn't contain a "requirements.txt" file.
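If you go the virtual-environment route, a requirements file assembled from the versions listed above could look like the sketch below (only an illustration; the repository does not ship such a file):

# sketch of a requirements.txt (assumes a Python 3.7.12 interpreter; Python itself is not installed via pip)
pandas==1.3.5
numpy==1.19.5
tensorflow==2.7.0
keras==2.7.0
scikit-learn==1.0.2
matplotlib==3.2.2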

Usage

For data type ignorant conditions, run main.py with appropriate command line arguments:

  1. Dataset: -d MNIST, FashionMNIST, UCIAdult or USCensus
  2. Generator: -g UAE, AE or VAE
  3. Epochs: -e 40 (by default)
  4. Lambda_p: -p 30 (suggested range from our experiments: [0, 10, 20, ..., 100])
  5. Overwrite: -o true or false (the lambda_p passed with -p is used only if -o true; otherwise a dataset-specific default value is used)

Example:

python PUPET-Official/main.py -d MNIST -g UAE -p 40 -o true

For data type aware conditions, run data-type-aware-main.py with appropriate command line arguments:

  1. Generator: -g UAE, AE or VAE
  2. Epochs: -e 100 (by default)
  3. Lambda_p: -p 10 (by default; suggested range from our experiments: [0, 1, 2, ..., 10])

Example:

python PUPET-Official/data-type-aware-main.py -g UAE -e 90 -p 5

Alternatively, you can adapt the bash script to train the models and collect results; example usage:

bash PUPET-Official/run.sh

Results

  1. The "results" folder contains accuracy and AUROC scores for the different datasets.
    1. General structure of an output file: results/MNIST/UAE-10-private_acc.txt
    2. This example file holds the result for the MNIST dataset when lambda_p = 10.
    3. The generator used is the UAE (uncertainty autoencoder), and the reported value is the inference accuracy of the private feature. (A small parsing sketch is given after this list.)
  2. UPTCurves contains the utility-privacy tradeoff (UPT) curves.
  3. Figures contains some other figures related to the paper and this repository.
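As an illustration of the naming convention above, the hypothetical Python snippet below gathers the reported scores from such files. The directory layout and the one-number-per-file format are assumptions for this sketch, not guarantees about the repository's output format.

import glob
import os
import re

# Hypothetical helper: collect scores from files named like
# results/<DATASET>/<GENERATOR>-<lambda_p>-<metric>.txt, e.g. results/MNIST/UAE-10-private_acc.txt
# Assumes each file stores a single numeric value.
FILENAME_PATTERN = re.compile(r"(?P<gen>\w+)-(?P<lam>\d+)-(?P<metric>.+)\.txt")

def collect_scores(results_dir="results", dataset="MNIST"):
    scores = {}
    for path in glob.glob(os.path.join(results_dir, dataset, "*.txt")):
        match = FILENAME_PATTERN.match(os.path.basename(path))
        if match is None:
            continue  # skip files that do not follow the naming convention
        with open(path) as f:
            scores[(match["gen"], int(match["lam"]), match["metric"])] = float(f.read().strip())
    return scores

if __name__ == "__main__":
    for (gen, lam, metric), value in sorted(collect_scores().items()):
        print(f"{gen}  lambda_p={lam}  {metric}: {value:.4f}")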

Generated Images (Original vs Privatized)

Odd columns show the original images and even columns show the corresponding privatized images.

Note: the generated_images folder is kept empty (to save space). Once you run the main file, images will be saved there.

Fig.1 - Original v/s Privatized MNIST image

Fig.2 - Original v/s Privatized Fashion MNIST image

Latent Variable Visualisation

Fig.3 - Visualisation of the latent variable before and after the privacy mechanism (UAE-PUPET) for the MNIST dataset when the dimension of the latent variable = 2. Note that there is no closed structure because we do not enforce the latent variable to be Gaussian.
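A minimal sketch (not part of the repository) of how such a 2-D latent visualisation could be produced; the random arrays below are placeholders standing in for the encoder outputs before and after the privacy mechanism:

import matplotlib.pyplot as plt
import numpy as np

def plot_latent_2d(z_before, z_after, labels):
    # Scatter the 2-D latent codes side by side, coloured by class label
    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
    for ax, z, title in zip(axes, (z_before, z_after),
                            ("Before privacy mechanism", "After UAE-PUPET")):
        sc = ax.scatter(z[:, 0], z[:, 1], c=labels, cmap="tab10", s=5)
        ax.set_title(title)
        ax.set_xlabel("z1")
    axes[0].set_ylabel("z2")
    fig.colorbar(sc, ax=axes.tolist(), label="class")
    plt.show()

if __name__ == "__main__":
    # Placeholder latent codes; replace with the real 2-D encoder outputs
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 10, size=1000)
    plot_latent_2d(rng.normal(size=(1000, 2)), rng.normal(size=(1000, 2)), labels)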

UPT Curves

Lines in the plots connect the upper convex hull points. We focus on those points because, ideally, we do not want points that fall in the shadowed region. (Interpretation: points lying towards the north-west region are favourable.)

Fig.4 - UPT curves for all experiments under data-type-ignorant conditions. Each point is the mean of 25 experiments for a particular lambda_p value.
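For illustration, the sketch below (not taken from the repository) extracts the upper convex hull from a set of 2-D points using a monotone-chain sweep; it assumes each point is a 2-D (privacy, utility) pair of the kind plotted in the UPT curves:

def _cross(o, a, b):
    # 2-D cross product of vectors OA and OB; > 0 means a counter-clockwise turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_convex_hull(points):
    # Sweep from the rightmost to the leftmost point, keeping only the
    # upper part of the convex hull; the result is returned left to right.
    pts = sorted(set(map(tuple, points)))
    hull = []
    for p in reversed(pts):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull[::-1]

if __name__ == "__main__":
    # Placeholder (x, y) points, e.g. one mean result per lambda_p value
    pts = [(0.15, 0.40), (0.20, 0.70), (0.35, 0.90), (0.60, 0.95), (0.95, 0.98), (0.50, 0.60)]
    print(upper_convex_hull(pts))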

Loss Curves

Fig.5 - Loss curves for the MNIST dataset. Here we only show privacy requirement values from 0 to 30; our actual experiments covered the range (0, 100).


Citation

If you find UAE-PUPET useful in your research, please consider citing our paper:

@INPROCEEDINGS{9892789,
  author={Mandal, Bishwas and Amariucai, George and Wei, Shuangqing},
  booktitle={2022 International Joint Conference on Neural Networks (IJCNN)}, 
  title={Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type Conscious Transformation}, 
  year={2022},
  volume={},
  number={},
  pages={1-8},
  doi={10.1109/IJCNN55064.2022.9892789}}
