Synthesizing robust adversarial examples
My entry for the ICLR 2018 Reproducibility Challenge, for the paper "Synthesizing Robust Adversarial Examples": https://openreview.net/pdf?id=BJDH5M-AW
Reproducibility Report
This report was produced as part of the ICLR 2018 Reproducibility Challenge: http://www.cs.mcgill.ca/~jpineau/ICLR2018-ReproducibilityChallenge.html
Author: Prabhant Singh, University of Tartu, firstname.lastname@example.org
Abstract: The paper's main goal is to provide an algorithm for generating adversarial examples that are robust across any chosen distribution of transformations. The authors demonstrate the algorithm in two and three dimensions and show that adversarial examples are a practical concern for real-world systems. In reproducing the paper, we implemented the authors' algorithm in the 2D scenario and were able to verify their claim. We also checked transferability using images of the paper's 3D adversarial example taken in a real-world environment. This report additionally checks the robustness of the adversarial examples in a black-box scenario, which was not covered in the selected paper.
Experimental methodology: After reproducing the Expectation Over Transformation (EOT) algorithm, we generated adversarial examples against the InceptionV3 model pre-trained on the ImageNet dataset. The adversarial examples were robust under the predefined distribution. One interesting observation is that whenever we rotated the image outside the distribution, the confidence of the prediction dropped, while the target class that was predefined when creating the adversarial example remained within the top 10 probabilities. The probability of the target class decreased as we rotated the image away from the distribution and increased as we rotated it back toward it. As the paper states, there are no guarantees that adversarial examples are robust outside the chosen distribution, but the adversarial example was still able to reduce the confidence of the prediction.
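The idea behind EOT can be illustrated with a small, self-contained sketch. It is not the code used for this report: it assumes a toy linear softmax classifier in place of InceptionV3 and a hypothetical transformation distribution (random brightness scaling plus additive noise) in place of image rotations. All function and variable names are illustrative. Only the learning rate (2e-1) and epsilon (8.0/255.0) are taken from the parameters reported in this document. At each step the sketch samples a batch of transformations, takes a gradient step that raises the average log-probability of the target class over that batch, and projects the result back into the epsilon ball around the original input.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def sample_transforms(rng, n, dim):
    # Hypothetical distribution T: t(x) = scale * x + noise (an assumption, not rotations).
    scale = rng.uniform(0.8, 1.2, size=(n, 1))
    noise = rng.normal(0.0, 0.01, size=(n, dim))
    return scale, noise

def expected_target_logprob(W, x, target, scale, noise):
    # Monte Carlo estimate of E_t[log P(target | t(x))] for the toy classifier W.
    logits = (scale * x + noise) @ W.T
    return log_softmax(logits)[:, target].mean()

def eot_attack(W, x, target, rng, epsilon=8.0 / 255.0, lr=2e-1, steps=100, batch=32):
    x_adv = x.copy()
    for _ in range(steps):
        scale, noise = sample_transforms(rng, batch, x.size)
        logits = (scale * x_adv + noise) @ W.T           # shape (batch, classes)
        probs = np.exp(log_softmax(logits))
        onehot = np.zeros_like(probs)
        onehot[:, target] = 1.0
        # Gradient of log p_target(t(x)) w.r.t. x is scale * W^T (onehot - probs);
        # averaging over the batch approximates the expectation over T.
        grad = ((onehot - probs) @ W * scale).mean(axis=0)
        x_adv = x_adv + lr * grad                         # ascent step
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # project into the L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # keep a valid pixel range
    return x_adv

rng = np.random.default_rng(0)
dim, classes = 16, 5
W = rng.normal(0.0, 0.25, size=(classes, dim))            # toy classifier weights
x = rng.uniform(0.2, 0.8, size=dim)                       # toy "image"
true_class = int(np.argmax(W @ x))
target = (true_class + 1) % classes                       # any class other than the true one

x_adv = eot_attack(W, x, target, rng)

# Evaluate both inputs on the same held-out batch of transformations.
eval_scale, eval_noise = sample_transforms(np.random.default_rng(1), 256, dim)
before = expected_target_logprob(W, x, target, eval_scale, eval_noise)
after = expected_target_logprob(W, x_adv, target, eval_scale, eval_noise)
print(f"expected target log-prob: {before:.3f} -> {after:.3f}")
```

With such a small epsilon, the toy classifier's top prediction is not guaranteed to flip. The sketch only shows that the expected target log-probability rises while the perturbation stays within the bound.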
Transferability: Transferability was checked on four images. The first image was generated by EOT, and the other three were images of the adversarial turtle mentioned in the paper. Transferability was tested on six different architectures pre-trained on the ImageNet dataset (ResNet50, InceptionV3, InceptionResNetV2, Xception, VGG16, VGG19). Our adversarial examples were generated using the TensorFlow pre-trained Inception model, and transferability was checked with pre-trained Keras models. The results of the experiments are listed below:
Adversarial image generated using EOT. Parameters: learning rate: 2e-1; epsilon: 8.0/255.0. True class: tabby cat. Target class: guacamole.
- InceptionV3: Prediction: Flatworm, Confidence: 100%
- InceptionResNetV2: Prediction: Comic book, Confidence: 100%
- Xception: Prediction: Necklace, Confidence: 92.5%
- ResNet50: Prediction: Tabby cat, Confidence: 35%
- VGG19: Prediction: Tabby cat, Confidence: 47.9%
- VGG16: Prediction: Tabby cat, Confidence: 34.8%
Image of the 3D adversarial turtle mentioned in the paper. True class: turtle.
- InceptionV3: Prediction: Pencil sharpener, Confidence: 67.7%
- InceptionResNetV2: Prediction: Comic book, Confidence: 100%
- Xception: Prediction: Table lamp, Confidence: 84.8%
- ResNet50: Prediction: Bucket, Confidence: 20%
- VGG19: Prediction: Mask, Confidence: 10.9%
- VGG16: Prediction: Turtle, Confidence: 3.6%
The other images of the adversarial turtle produced similar results.
Both the adversarial turtle and cat images were classified incorrectly, with high confidence, by the Inception-related architectures. Both images were classified as "comic book" with 100 percent confidence by InceptionResNetV2. The adversarial examples reduced the confidence by a wide margin, about 50-60 percent in the case of the tabby cat. Only VGG16 classified the turtle correctly, but with a very low confidence of 3.6%. Similar results were found when we rotated, cropped, and zoomed out of the image. In the case of the adversarial turtle, the photo was taken outside the chosen distribution (that is, not within the camera distance of 2.5 cm to 3.0 cm mentioned in the paper), yet the image was still misclassified.
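The per-model results listed above can be tabulated to check these statements. The sketch below simply re-enters the reported predictions and confidences, with model names and labels normalised in spelling. The record layout and the helper name `summarize` are illustrative and are not part of the repository. For each image it reports which models were fooled and the fraction of untargeted misclassifications.

```python
from collections import namedtuple

Result = namedtuple("Result", "image true_class model predicted confidence")

# Predictions and confidences (in percent) as reported in the lists above.
RESULTS = [
    Result("cat", "tabby cat", "InceptionV3", "flatworm", 100.0),
    Result("cat", "tabby cat", "InceptionResNetV2", "comic book", 100.0),
    Result("cat", "tabby cat", "Xception", "necklace", 92.5),
    Result("cat", "tabby cat", "ResNet50", "tabby cat", 35.0),
    Result("cat", "tabby cat", "VGG19", "tabby cat", 47.9),
    Result("cat", "tabby cat", "VGG16", "tabby cat", 34.8),
    Result("turtle", "turtle", "InceptionV3", "pencil sharpener", 67.7),
    Result("turtle", "turtle", "InceptionResNetV2", "comic book", 100.0),
    Result("turtle", "turtle", "Xception", "table lamp", 84.8),
    Result("turtle", "turtle", "ResNet50", "bucket", 20.0),
    Result("turtle", "turtle", "VGG19", "mask", 10.9),
    Result("turtle", "turtle", "VGG16", "turtle", 3.6),
]

def summarize(results):
    """Group results by image and compute the untargeted misclassification rate."""
    summary = {}
    for image in sorted({r.image for r in results}):
        rows = [r for r in results if r.image == image]
        fooled = sorted(r.model for r in rows if r.predicted != r.true_class)
        correct = sorted(r.model for r in rows if r.predicted == r.true_class)
        summary[image] = {
            "fooled": fooled,
            "correct": correct,
            "rate": len(fooled) / len(rows),
        }
    return summary

summary = summarize(RESULTS)
for image, stats in summary.items():
    print(f"{image}: fooled {len(stats['fooled'])} of 6 models, "
          f"still correct: {', '.join(stats['correct']) or 'none'}")
```

On these numbers, the cat image fools the three Inception-related models (3 of 6), and the turtle image fools every model except VGG16 (5 of 6).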
The authors successfully generated adversarial examples that are robust under the given distribution in the case of targeted misclassification. The adversarial examples were also robust in the case of untargeted misclassification under every transformation we tried, including ones outside the chosen distribution, when classified by Inception-related models. For non-Inception architectures, the adversarial examples reduced the confidence by a wide margin. The image of the 3D adversarial turtle can be considered robust outside the chosen distribution as well: it was misclassified by every architecture except VGG16, which classified it correctly but with a very low confidence.
Sources:
- The images of the adversarial turtle were taken at the recent NIPS conference from a number of viewpoints outside the given distribution.
- Pre-trained Keras models: https://keras.io/applications/
- The source code and details of the experiments can be found in this GitHub repo: https://github.com/prabhant/synthesizing-robust-adversarial-examples