## Introduction

This notebook shows how to load and evaluate the MNIST and CIFAR-10 models synthesized and trained as described in the following paper:

M. Sinn, M. Wistuba, B. Buesser, M.-I. Nicolae, M. N. Tran: **Evolutionary Search for Adversarially Robust Neural Networks**, *ICLR SafeML Workshop 2019 (arXiv link to the paper will be added shortly)*.

In [None]:
from keras.datasets import mnist, cifar10
from keras.models import load_model
from keras.utils.np_utils import to_categorical

from art.classifiers import KerasClassifier
from art.attacks import ProjectedGradientDescent

import numpy as np

## MNIST

Three different MNIST models are available under `../models/`:
- `mnist_ratio=0.h5`: trained on 100% benign samples
- `mnist_ratio=0.5.h5`: trained on 50% benign and 50% adversarial samples
- `mnist_ratio=1.h5`: trained on 100% adversarial samples
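
The file names follow a single pattern, so a path can be derived from the dataset name and the fraction of adversarial training samples. A minimal sketch, assuming the `../models/` layout listed above; `model_path` is a hypothetical helper and not part of this notebook or of ART:

```python
def model_path(dataset, ratio, model_dir="../models"):
    """Hypothetical helper: build the path of a pretrained model file.

    `ratio` is the fraction of adversarial samples used during training
    (0, 0.5 or 1 for the models listed above).
    """
    if ratio not in (0, 0.5, 1):
        raise ValueError("only the ratios 0, 0.5 and 1 are listed as available")
    # '%g' drops a trailing '.0', so 1.0 -> '1' while 0.5 stays '0.5'
    return "%s/%s_ratio=%g.h5" % (model_dir, dataset, ratio)


print(model_path("mnist", 0.5))
```

The result can be passed directly to `load_model`.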


Load data:

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
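# Add a channel axis and scale pixels to [0, 1] to match clip_values and eps used below
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32') / 255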
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
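
To make the shapes explicit, the reshaping and one-hot encoding can be reproduced on a tiny stand-in array without downloading MNIST. This is only an illustrative sketch: the `images` and `labels` arrays are made up, and `np.eye` is used here in place of `to_categorical`.

```python
import numpy as np

# Stand-in for raw MNIST: 2 grayscale images of 28x28 uint8 pixels with integer labels
images = np.zeros((2, 28, 28), dtype="uint8")
labels = np.array([3, 7])

# Add a trailing channel axis so each image has shape (28, 28, 1)
X = images.reshape(images.shape[0], 28, 28, 1).astype("float32")

# One-hot encode: indexing the identity matrix picks row `label` for each sample
Y = np.eye(10, dtype="float32")[labels]

print(X.shape, Y.shape)
```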

For example, load the model trained on 50% benign and 50% adversarial samples:

In [None]:
model = load_model('../models/mnist_ratio=0.5.h5')
classifier = KerasClassifier(model=model, use_logits=False, clip_values=[0,1])

Assess accuracy on the first `n` benign test samples:

In [None]:
n = 10000
y_pred = classifier.predict(X_test[:n])
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_test[:n], axis=1))
print("Accuracy on first %i benign test samples: %f" % (n, accuracy))

Define adversarial attack:

In [None]:
attack = ProjectedGradientDescent(classifier, eps=0.3, eps_step=0.01, max_iter=40, targeted=False, random_init=True) 
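
For intuition about the parameters: `eps` bounds the perturbation size, `eps_step` is the step taken per iteration, `max_iter` is the number of iterations, and `random_init` starts the search from a random point within the `eps` bound. The sketch below is a simplified NumPy version of untargeted projected gradient descent under the L-infinity norm on a toy linear softmax model. It is not ART's implementation, which may differ in details; `pgd_linf`, `grad_fn`, and the random weights `W` are hypothetical names introduced only for this illustration.

```python
import numpy as np


def pgd_linf(x, y, grad_fn, eps, eps_step, max_iter, clip_min, clip_max,
             random_init=True, seed=0):
    """Untargeted L-infinity PGD sketch; grad_fn(x, y) returns dLoss/dx."""
    rng = np.random.RandomState(seed)
    x_adv = x.copy()
    if random_init:
        # Start from a random point inside the eps-ball around x
        x_adv = x_adv + rng.uniform(-eps, eps, size=x.shape)
        x_adv = np.clip(x_adv, clip_min, clip_max)
    for _ in range(max_iter):
        # Increase the loss by stepping along the sign of the gradient
        x_adv = x_adv + eps_step * np.sign(grad_fn(x_adv, y))
        # Project back onto the eps-ball, then onto the valid input range
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, clip_min, clip_max)
    return x_adv


# Toy softmax-regression model with random weights (stand-in for a real classifier)
rng = np.random.RandomState(1)
W = rng.normal(size=(4, 3))


def grad_fn(x, y):
    logits = x @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # Gradient of the cross-entropy loss w.r.t. the input of a linear softmax model
    return (p - y) @ W.T


x = rng.uniform(0, 1, size=(5, 4))
y = np.eye(3)[rng.randint(0, 3, size=5)]
x_adv = pgd_linf(x, y, grad_fn, eps=0.3, eps_step=0.01, max_iter=40,
                 clip_min=0.0, clip_max=1.0)
print(np.abs(x_adv - x).max())  # never exceeds eps
```

With 40 steps of size 0.01 the iterate could move by up to 0.4 per coordinate, so the projection onto the `eps=0.3` ball is what enforces the perturbation bound.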

Assess accuracy on the first `n` adversarial test samples:

In [None]:
n = 10
X_test_adv = attack.generate(X_test[:n], y=y_test[:n])
y_adv_pred = classifier.predict(X_test_adv)
accuracy = np.mean(np.argmax(y_adv_pred, axis=1) == np.argmax(y_test[:n], axis=1))
print("Accuracy on first %i adversarial test samples: %f" % (n, accuracy))

## CIFAR-10

Three different CIFAR-10 models are available under `../models/`:
- `cifar-10_ratio=0.h5`: trained on 100% benign samples
- `cifar-10_ratio=0.5.h5`: trained on 50% benign and 50% adversarial samples
- `cifar-10_ratio=1.h5`: trained on 100% adversarial samples


Load data:

In [None]:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.reshape(X_train.shape[0], 32, 32, 3).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 32, 32, 3).astype('float32')
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

For example, load the model trained on 100% adversarial samples:

In [None]:
model = load_model('../models/cifar-10_ratio=1.h5')
classifier = KerasClassifier(model=model, use_logits=False, clip_values=[0,255])

Assess accuracy on the first `n` benign test samples:

In [None]:
n = 100
y_pred = classifier.predict(X_test[:n])
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_test[:n], axis=1))
print("Accuracy on first %i benign test samples: %f" % (n, accuracy))

Define adversarial attack:

In [None]:
attack = ProjectedGradientDescent(classifier, eps=8, eps_step=2, max_iter=10, targeted=False, random_init=True) 
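
Note that `eps=8` and `eps_step=2` are expressed on the [0, 255] pixel scale given by `clip_values`. To compare them with budgets quoted on a [0, 1] scale, the values can be rescaled. A small sketch, where `rescale_budget` is a hypothetical helper and both scales are assumed to start at 0:

```python
def rescale_budget(value, old_max=255.0, new_max=1.0):
    """Convert a perturbation budget between two pixel scales that both start at 0."""
    return value * new_max / old_max


eps_unit = rescale_budget(8)       # eps expressed on a [0, 1] scale, i.e. 8/255
eps_step_unit = rescale_budget(2)  # step size expressed on a [0, 1] scale, i.e. 2/255
print("eps = %.4f, eps_step = %.4f" % (eps_unit, eps_step_unit))
```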

Assess accuracy on the first `n` adversarial test samples:

In [None]:
n = 100
X_test_adv = attack.generate(X_test[:n], y=y_test[:n])
y_adv_pred = classifier.predict(X_test_adv)
accuracy = np.mean(np.argmax(y_adv_pred, axis=1) == np.argmax(y_test[:n], axis=1))
print("Accuracy on first %i adversarial test samples: %f" % (n, accuracy))
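
The benign and adversarial evaluations follow the same pattern for every model, so they can be wrapped in one function. The sketch below is one possible way to do this; `evaluate_robustness` and the two stub callables are hypothetical and exist only so the example runs without Keras or ART.

```python
import numpy as np


def evaluate_robustness(predict_fn, attack_fn, X, y, n):
    """Return (benign accuracy, adversarial accuracy) on the first n samples.

    predict_fn(X) returns class scores; attack_fn(X, y) returns adversarial samples.
    """
    X, y = X[:n], y[:n]
    labels = np.argmax(y, axis=1)
    benign = np.mean(np.argmax(predict_fn(X), axis=1) == labels)
    X_adv = attack_fn(X, y)
    adversarial = np.mean(np.argmax(predict_fn(X_adv), axis=1) == labels)
    return float(benign), float(adversarial)


def predict_stub(X):
    # Stub model: the two input features are used directly as class scores
    return X


def attack_stub(X, y):
    # Stub attack: swapping the two features flips every prediction of predict_stub
    return X[:, ::-1]


X = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]])
y = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
print(evaluate_robustness(predict_stub, attack_stub, X, y, n=4))
```

With the objects defined in this notebook, the call would presumably look like `evaluate_robustness(classifier.predict, lambda X, y: attack.generate(X, y=y), X_test, y_test, n=100)`.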