# Deflecting Adversarial Attacks with Pixel Deflection

The code in this repository demonstrates that Deflecting Adversarial Attacks with Pixel Deflection (Prakash et al. 2018) is ineffective in the white-box threat model.
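For context, the defense's core transform randomly "deflects" pixels: a number of randomly chosen pixels are replaced with the values of randomly chosen nearby pixels. The snippet below is a minimal sketch of that transform, assuming an H x W x C numpy image; the parameter names and defaults are illustrative, and the repository's actual implementation in methods.py (which also uses robust activation maps and wavelet denoising) may differ.

```python
# Hedged sketch of the pixel deflection transform from Prakash et al. 2018:
# replace randomly chosen pixels with the values of randomly chosen nearby pixels.
# Illustrative only; see methods.py for the repository's actual implementation.
import numpy as np

def pixel_deflection(img, ndeflections=200, window=10):
    """img: H x W x C array; returns a deflected copy."""
    img = img.copy()
    h, w, _ = img.shape
    for _ in range(ndeflections):
        r, c = np.random.randint(h), np.random.randint(w)
        # Pick a random neighbor within a `window`-pixel box, clipped to the image.
        r2 = np.clip(r + np.random.randint(-window, window + 1), 0, h - 1)
        c2 = np.clip(c + np.random.randint(-window, window + 1), 0, w - 1)
        img[r, c] = img[r2, c2]
    return img
```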

With an L-infinity perturbation bounded by 4/255, we generate targeted adversarial examples with a 97% success rate and reduce classifier accuracy to 0%.
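The actual break lives in main.py and robustml_attack.py. As a rough illustration of the approach, the sketch below runs a targeted projected gradient descent (PGD) loop under the 4/255 L-infinity constraint. It is written in PyTorch for brevity (the repository itself uses TensorFlow); `model`, `x`, and `target` are hypothetical stand-ins, and the real attack additionally handles the defense's randomized, non-differentiable transform (see our note for how).

```python
# Minimal targeted L-infinity PGD sketch (illustrative only, not the repo's attack).
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=4/255, step=1/255, iters=100):
    """Return x_adv with ||x_adv - x||_inf <= eps that the model labels `target`."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        # Step down the loss toward the target class, then project back into
        # the eps-ball around x and the valid pixel range.
        x_adv = (x_adv - step * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```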

See our note (https://arxiv.org/abs/1804.03286) for more context and details.

## Citation

```bibtex
@unpublished{cvpr2018breaks,
  author = {Anish Athalye and Nicholas Carlini},
  title = {On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses},
  year = {2018},
  url = {https://arxiv.org/abs/1804.03286},
}
```

## robustml evaluation

Run with:

```bash
python robustml_attack.py --imagenet-path <path>
```
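For reference, robustml_attack.py follows the standard robustml evaluation pattern: wrap the defended classifier in a robustml model, wrap the attack, and hand both to the evaluator along with an ImageNet provider. The sketch below is only an approximation of that wiring; `PixelDeflectionModel` and `BPDAAttack` are placeholder names for the classes defined in model_robustml.py and robustml_attack.py, and the exact arguments in the repository's script may differ.

```python
# Hedged sketch of a typical robustml evaluation loop; class names are placeholders.
import robustml

model = PixelDeflectionModel()   # placeholder: defended classifier (model_robustml.py)
attack = BPDAAttack(model)       # placeholder: white-box attack (robustml_attack.py)

# Provider yielding ImageNet validation examples in the model's input shape.
provider = robustml.provider.ImageNet('/path/to/imagenet-val', model.dataset.shape)

# Run the attack on a range of examples; robustml reports the attack success rate.
success_rate = robustml.evaluate.evaluate(model, attack, provider, start=0, end=100)
print('attack success rate: {:.1%}'.format(success_rate))
```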

## Credits

Thanks to Nicholas Carlini for implementing the break and Dimitris Tsipras for writing the robustml model wrapper.