This repository contains PyTorch code for the ARD method from "Adversarially Robust Distillation" by Micah Goldblum, Liam Fowl, Soheil Feizi, and Tom Goldstein.
Adversarially Robust Distillation is a method for transferring robustness from a robust teacher network to the student network during distillation. In our experiments, small ARD student models outperform adversarially trained models with identical architecture.
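To make the idea concrete, here is a minimal sketch of a distillation loss of this kind, not the paper's exact training code: adversarial examples are crafted against the student with a PGD-style attack on the distillation objective, and the student's softened outputs on those examples are matched to the teacher's softened outputs on clean inputs. The attack budget, temperature, and alpha weighting below are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F


def ard_loss_sketch(student, teacher, x, y,
                    eps=8 / 255, step_size=2 / 255, attack_steps=10,
                    temperature=30.0, alpha=1.0):
    """Illustrative adversarially robust distillation loss (sketch only)."""
    teacher.eval()
    with torch.no_grad():
        teacher_soft = F.softmax(teacher(x) / temperature, dim=1)

    # PGD-style attack on the student, ascending the KL divergence
    # between the student's softened outputs and the teacher's soft labels.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(attack_steps):
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(student(x_adv) / temperature, dim=1),
                      teacher_soft, reduction="batchmean")
        grad = torch.autograd.grad(kl, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)

    # Distillation term on adversarial inputs plus clean cross-entropy.
    distill = (temperature ** 2) * F.kl_div(
        F.log_softmax(student(x_adv) / temperature, dim=1),
        teacher_soft, reduction="batchmean")
    clean_ce = F.cross_entropy(student(x), y)
    return alpha * distill + (1 - alpha) * clean_ce
```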
The code requires:
- Python 3
- PyTorch
- CUDA
Here is an example of how to run our program:
$ python main.py --teacher_path INSERT-YOUR-TEACHER-PATH
A MobileNetV2 ARD model distilled from a TRADES WideResNet (34-10) teacher on CIFAR-10 can be found here.
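If you download the pretrained student, loading it might look like the sketch below. The import path for the MobileNetV2 class and the checkpoint layout are assumptions; check the repository's model definitions and the downloaded file, and substitute your own path for the placeholder.

```python
import torch
from models.mobilenetv2 import MobileNetV2  # import path assumed; adjust to this repo's layout

# Checkpoint format is assumed: either a bare state_dict or a dict wrapping one.
model = MobileNetV2()
checkpoint = torch.load("INSERT-YOUR-STUDENT-PATH", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)
model.load_state_dict(state_dict)
model.eval()
```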