VMS-6511/online-data-deletion

online-data-deletion

This repository contains the code to reproduce the results of the NeurIPS 2022 paper "Algorithms that Approximate Data Removal: New Results and Limitations".

This code is heavily based on the code for "Certified Data Removal from Machine Learning Models" (Guo et al., ICML 2020).

Dependencies

torch, torchvision, scikit-learn, pytorch-dp

Setup

We assume the following project directory structure:

<root>/
--> save/
--> final_results/
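The two folders above can be created by hand. As a convenience, here is a minimal sketch that creates them with the Python standard library. The helper name make_project_dirs is hypothetical and not part of this repository.

```python
from pathlib import Path


def make_project_dirs(root="."):
    """Create save/ and final_results/ under root if they do not exist yet."""
    root = Path(root)
    created = []
    for name in ("save", "final_results"):
        path = root / name
        # exist_ok=True makes the call safe to repeat on an existing checkout
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_project_dirs(".") from the repository root should leave both folders in place without touching anything already inside them.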

Training a differentially private (DP) feature extractor

Training a (0.1, 1e-5)-differentially private feature extractor for SVHN:

python train_svhn.py --data-dir <SVHN path> --train-mode private --std 6 --delta 1e-5 --normalize --save-model
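For readers unfamiliar with private training, the following is a rough sketch of a single DP-SGD style gradient step: clip each per-example gradient, sum, add Gaussian noise, and average. It is not the code in train_svhn.py. It assumes that --std plays the role of the noise multiplier, which this README does not confirm, and it leaves out the privacy accounting that maps the noise level to an (epsilon, delta) guarantee. All function names are illustrative.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for j, g in enumerate(clip(grad, max_norm)):
            total[j] += g
    # Assumption: the noise standard deviation is noise_multiplier * max_norm.
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, -0.4], [-6.0, 8.0]]  # toy per-example gradients
step = noisy_mean_gradient(grads, max_norm=1.0, noise_multiplier=6.0, rng=rng)
print(step)
```

With only three examples the noise dominates the averaged gradient; in practice the batch is far larger, so the noise is averaged over many more examples.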

Extracting features using the differentially private extractor:

python train_svhn.py --data-dir <SVHN path> --test-mode extract --std 6 --delta 1e-5

Removing data from an MNIST 3 vs. 8 model

Training a removal-enabled binary logistic regression classifier for MNIST 3 vs. 8 and removing 1000 training points:

python ./scripts/test_removal_<method>.py --data-dir <MNIST path> --verbose --extractor none --dataset MNIST --train-mode binary --std 0.01 --lam 1e-3 --num-steps 100
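To give a feel for what approximate removal computes, here is a toy sketch on a one-parameter L2-regularized logistic regression. It compares retraining from scratch with two single-step updates: a Newton step on the remaining data, and the same step reusing the full-data Hessian, which is the kind of approximation the infinitesimal jackknife is built on. This is an illustration under simplifying assumptions (scalar parameter, made-up data, no added noise), not the implementation in the scripts.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def grad(theta, data, lam):
    """Derivative of sum_i log(1 + exp(-y_i x_i theta)) + lam/2 * theta^2."""
    return sum(-y * x * sigmoid(-y * x * theta) for x, y in data) + lam * theta


def hess(theta, data, lam):
    """Second derivative of the same objective (at least lam, so positive)."""
    return sum(x * x * sigmoid(y * x * theta) * sigmoid(-y * x * theta)
               for x, y in data) + lam


def fit(data, lam, iters=200):
    """Minimize by bisection on the derivative, which is increasing in theta."""
    bound = sum(abs(x) for x, _ in data) / lam  # minimizer lies in [-bound, bound]
    lo, hi = -bound, bound
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if grad(mid, data, lam) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Toy data: (feature, label) pairs with labels in {-1, +1}.
data = [(0.5, 1), (-1.2, -1), (2.0, 1), (1.5, 1),
        (-0.7, -1), (0.9, -1), (-2.1, -1), (1.1, 1)]
lam = 1.0
k = 5  # index of the training point to remove

theta_full = fit(data, lam)
remaining = data[:k] + data[k + 1:]
theta_retrained = fit(remaining, lam)  # baseline: retrain from scratch

x_k, y_k = data[k]
g_k = -y_k * x_k * sigmoid(-y_k * x_k * theta_full)  # gradient of the removed point's loss

# One Newton step on the remaining data, starting from theta_full.
theta_newton = theta_full + g_k / hess(theta_full, remaining, lam)
# Same step, but reusing the full-data Hessian instead of recomputing it.
theta_full_hess = theta_full + g_k / hess(theta_full, data, lam)

print(theta_full, theta_retrained, theta_newton, theta_full_hess)
```

Both single-step updates land much closer to the retrained parameter than the original model does, at the cost of one gradient and one Hessian evaluation rather than a full retraining run.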

Removing data from an SVHN 3 vs. 8 model

Training a removal-enabled binary logistic regression classifier for SVHN 3 vs. 8 and removing 1000 training points:

python ./scripts/test_removal_<method>.py --data-dir <SVHN path> --verbose --extractor none --dataset SVHN --train-mode binary --std 0.01 --lam 1e-3 --num-steps 2500

Removing data from a Warfarin dosage model

Training a removal-enabled model for Warfarin dosage prediction with the proximal (_prox) variant of each script and removing training points:

python ./scripts/test_removal_<method>_prox.py --data-dir <SVHN path> --verbose --extractor none --dataset SVHN --train-mode binary --std 0.01 --lam 1e-3 --num-steps 1000

In each of the commands above, replace <method> with one of exact (retraining from scratch), sekhari, or IJ (infinitesimal jackknife, our method).
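To run all three methods in turn, the commands can be generated rather than typed out. The sketch below builds one argument list per method from the flags shown in this README. The helper removal_command is hypothetical, and it assumes the method tag is substituted into the script name verbatim (including the capitalization of IJ).

```python
import shlex

METHODS = ("exact", "sekhari", "IJ")


def removal_command(method, data_dir, dataset, num_steps, prox=False):
    """Return the argv list for one removal run, mirroring the README commands."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    suffix = "_prox" if prox else ""
    return [
        "python", f"./scripts/test_removal_{method}{suffix}.py",
        "--data-dir", data_dir,
        "--verbose",
        "--extractor", "none",
        "--dataset", dataset,
        "--train-mode", "binary",
        "--std", "0.01",
        "--lam", "1e-3",
        "--num-steps", str(num_steps),
    ]


# Placeholder path kept as in the README; replace it with the real MNIST location.
commands = [removal_command(m, "<MNIST path>", "MNIST", 100) for m in METHODS]
for argv in commands:
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) from the repository root once the placeholder path has been replaced.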

Reference

This code builds on code from the following paper:

Chuan Guo, Tom Goldstein, Awni Hannun, and Laurens van der Maaten. Certified Data Removal from Machine Learning Models. ICML 2020.
