README

Update (from Logan Engstrom): this repository has been modified to fit
the API of the https://www.robust-ml.org/ project.

----

Update: this repository is out of date. It contains strictly less
useful code than the repository at the following URL:

https://github.com/carlini/nn_robust_attacks


In particular, do not use the l0 attack in this repository; it is
only effective at breaking defensive distillation (not other defenses).


----


Defensive Distillation was recently proposed as a defense to
adversarial examples.
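
(For context: distillation trains the network with a temperature-scaled
softmax, where higher temperatures produce softer output distributions.
A minimal NumPy sketch of that softmax follows; the temperature value is
illustrative, not the one used in this repository.)

    import numpy as np

    def softmax_with_temperature(logits, T=100.0):
        # Temperature-scaled softmax: higher T yields softer (more
        # uniform) probabilities. T=100 here is illustrative only.
        z = (logits - np.max(logits)) / T  # shift by the max for stability
        e = np.exp(z)
        return e / np.sum(e)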

Unfortunately, distillation is not secure. We show this in our paper,
available at
http://nicholas.carlini.com/papers/2016_defensivedistillation.pdf

We strongly believe that research should be reproducible, and so we are
releasing the code required to train a baseline model on MNIST, train
a defensively distilled model on MNIST, and attack the defensively
distilled model.

To run the code, you will need Python 3.x with TensorFlow. It will be slow
unless you have a GPU to train on.
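
As a quick sanity check (this sketch assumes a TensorFlow 1.x
installation), you can confirm that TensorFlow sees a GPU before
starting a long training run:

    import tensorflow as tf

    # Prints the installed version and whether a usable GPU is visible.
    # Training still works on CPU, just considerably slower.
    print(tf.__version__)
    print(tf.test.is_gpu_available())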

Begin by running train_baseline.py and train_distillation.py; together
they will create three model files, two of which are useful. Both scripts
should report a final accuracy of around 99.3% +/- 0.2%.
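
Defensive distillation itself is a two-stage recipe: train a teacher
network at temperature T, relabel the training set with the teacher's
softened predictions, then train the distilled model on those soft labels
at the same T. The sketch below illustrates that flow with tf.keras on
MNIST; the architecture, temperature, and training settings are
placeholders and do not match what train_distillation.py actually does.

    import tensorflow as tf

    T = 20.0  # distillation temperature; illustrative only

    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
    y_train = tf.keras.utils.to_categorical(y_train, 10)

    def make_model():
        # Small fully connected net; the repository trains a conv net.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(256, activation="relu",
                                  input_shape=(784,)),
            tf.keras.layers.Dense(10),  # logits
            tf.keras.layers.Lambda(lambda z: tf.nn.softmax(z / T)),
        ])

    # 1) Train the teacher at temperature T on the hard labels.
    teacher = make_model()
    teacher.compile(optimizer="adam", loss="categorical_crossentropy")
    teacher.fit(x_train, y_train, epochs=1, batch_size=128)

    # 2) Replace the hard labels with the teacher's softened predictions.
    soft_labels = teacher.predict(x_train, batch_size=128)

    # 3) Train the distilled model on the soft labels at the same T.
    #    (At test time defensive distillation evaluates the network
    #    with the temperature set back to 1.)
    distilled = make_model()
    distilled.compile(optimizer="adam", loss="categorical_crossentropy")
    distilled.fit(x_train, soft_labels, epochs=1, batch_size=128)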

To construct adversarial examples, run l0_attack.py, passing as its
argument either models/baseline or models/distilled. This will run the
modified l0 adversary on the given model. The attack should succeed with
probability ~95% while modifying ~35 pixels.
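
The two numbers reported above correspond to whether the model
misclassifies the adversarial example and to how many pixels were changed
(the l0 distance). Below is a hedged sketch of how they could be computed
for a single (original, adversarial) pair; the function names are
placeholders, not this repository's API:

    import numpy as np

    def l0_distance(original, adversarial, eps=1e-8):
        # Number of pixels whose value changed between the two images.
        return int(np.sum(np.abs(original - adversarial) > eps))

    def attack_succeeded(predict_fn, adversarial, true_label):
        # predict_fn is a placeholder for whatever function returns the
        # model's predicted class for a single image.
        return int(predict_fn(adversarial)) != int(true_label)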