# Defending Against Adversarial Attacks

One of the easiest ways to defend against adversarial attacks is to train your model on adversarial images.

For example, if we are worried about nefarious users applying FGSM attacks to our model, then we can “inoculate” our neural network by training it on FGSM images of our own.
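For reference, FGSM (Fast Gradient Sign Method) perturbs an input by a small step `eps` in the direction of the sign of the loss gradient with respect to that input: `x_adv = x + eps * sign(grad_x loss(x, y))`. The sketch below illustrates the idea on a toy logistic-regression model in NumPy, where the input gradient has a closed form. The weights `w`, bias `b`, and the function names are hypothetical and are not part of the `pyimagesearch` helpers used later in this notebook.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one input."""
    p = sigmoid(np.dot(w, x) + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fgsm(x, y, w, b, eps=0.1):
    """Return an FGSM-perturbed copy of x for a logistic-regression model."""
    p = sigmoid(np.dot(w, x) + b)
    grad_x = (p - y) * w                 # d(loss)/dx in closed form
    x_adv = x + eps * np.sign(grad_x)    # step in the direction that raises the loss
    return np.clip(x_adv, 0.0, 1.0)      # keep values in the valid pixel range

# Toy example (all values are hypothetical)
w = np.array([2.0, -1.0, 0.5, -3.0])
b = 0.1
x = np.array([0.8, 0.2, 0.5, 0.1])
y = 1.0

x_adv = fgsm(x, y, w, b, eps=0.1)
print("loss before:", bce_loss(x, y, w, b))
print("loss after: ", bce_loss(x_adv, y, w, b))
```

Each input value moves by at most `eps`, yet the loss on the true label goes up, which is the effect the attack relies on.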

Typically, this type of adversarial inoculation is applied by either:

1. Training our model on a given dataset, generating a set of adversarial images, and then fine-tuning the model on the adversarial images
2. Generating mixed batches of both the original training images and adversarial images, followed by fine-tuning our neural network on these mixed batches

**The first method is simpler and requires less computation** (since we need to generate only one set of adversarial images). **The downside is that this method tends to be less robust** since we’re only fine-tuning the model on adversarial examples at the end of training.

**The second method is much more complicated and requires significantly more computation**, because we need to use the model to generate adversarial images for every batch the network trains on.
**The benefit is that the model tends to be more robust because it sees both original training images and adversarial images in every batch update during training.**
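
The second approach can be sketched as a generator that, for every batch, samples clean images, produces adversarial versions of a second sample against the current model, and yields both together. This is a minimal sketch under stated assumptions: `make_adversarial` is a hypothetical callable standing in for whatever attack is used, and the 50/50 split (`adv_fraction`) is an illustrative choice rather than something this notebook prescribes.

```python
import numpy as np

def mixed_batch_generator(images, labels, make_adversarial, batch_size=64,
                          adv_fraction=0.5, seed=0):
    """Yield shuffled batches that mix clean and adversarial examples.

    make_adversarial(x, y) is a hypothetical callable that returns
    adversarial versions of x generated against the *current* model.
    """
    rng = np.random.default_rng(seed)
    n_adv = int(batch_size * adv_fraction)
    n_clean = batch_size - n_adv
    while True:
        clean_idx = rng.choice(len(images), size=n_clean, replace=False)
        adv_idx = rng.choice(len(images), size=n_adv, replace=False)
        # Adversarial images are regenerated for every batch so that they
        # track the model as its weights change.
        adv_x = make_adversarial(images[adv_idx], labels[adv_idx])
        batch_x = np.concatenate([images[clean_idx], adv_x], axis=0)
        batch_y = np.concatenate([labels[clean_idx], labels[adv_idx]], axis=0)
        order = rng.permutation(batch_size)
        yield batch_x[order], batch_y[order]

# Stand-in data and a placeholder "attack", purely for illustration
images = np.random.default_rng(1).random((256, 28, 28, 1))
labels = np.eye(10)[np.random.default_rng(2).integers(0, 10, size=256)]
placeholder_attack = lambda x, y: np.clip(x + 0.1, 0.0, 1.0)  # not a real attack

gen = mixed_batch_generator(images, labels, placeholder_attack, batch_size=64)
batch_x, batch_y = next(gen)
print(batch_x.shape, batch_y.shape)
```

Because the generator never terminates on its own, a training loop that consumes it needs an explicit number of steps per epoch.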

In [1]:
#---Utilities
from pyimagesearch.simplecnn import SimpleCNN 
from pyimagesearch.datagen import generate_adversarial_batch
#---Tensorflow
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist
#---Others
import numpy as np

In [2]:
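(trainX, trainY), (testX, testY) = mnist.load_data()
# scale pixel values to [0, 1] and add an explicit channel dimension: (N, 28, 28) -> (N, 28, 28, 1)
trainX = np.expand_dims(trainX / 255.0, axis=-1)
testX = np.expand_dims(testX / 255.0, axis=-1)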

In [3]:
# one-hot encode the labels (10 digit classes)
trainY = to_categorical(trainY, 10)
testY = to_categorical(testY, 10)

In [6]:
model = SimpleCNN.build(width=28, height=28, depth=1, classes=10)
model.compile(loss='categorical_crossentropy', optimizer = Adam(learning_rate=1e-3), metrics=["accuracy"])

In [7]:
model.fit(trainX, trainY, validation_data=(testX, testY),batch_size=64, epochs=20, verbose=1)

Epoch 1/20




Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7f89cc0f0160>

In [9]:
#make predictions on the test set and get the loss and accuracy

(loss, acc) = model.evaluate(testX, testY, verbose=0)
print("[INFO] loss: {:.4f}, \nacc: {:.4f}".format(loss, acc))

[INFO] loss: 0.0420, 
acc: 0.9903


In [10]:
# generate len(testX) adversarial images from the test set (eps=0.1)
(advX, advY) = next(generate_adversarial_batch(model, len(testX), testX, testY, (28,28,1), eps=0.1))

In [12]:
#re-evaluate the model on the adversarial images
(loss, acc) = model.evaluate(x=advX, y=advY, verbose=0)
print("[INFO] loss: {:.4f}, \nacc: {:.4f}".format(loss, acc))

[INFO] loss: 17.8904, 
acc: 0.0230


In [13]:
# lower the learning rate and re-train the model on the adversarial images

model.compile(loss="categorical_crossentropy", optimizer=Adam(learning_rate=1e-4), metrics=["accuracy"])


In [14]:
model.fit(advX, advY, batch_size=64, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f89c439faf0>

In [20]:
#evaluate it on the test set (i.e., non-adversarial) again to see if performance has degraded
(loss, acc) = model.evaluate(x=testX, y=testY, verbose=0)
print(" normal testing images *after* fine-tuning:")
print("loss: {:.4f}, \nacc: {:.4f}".format(loss, acc))

 normal testing images *after* fine-tuning:
loss: 0.0480, 
acc: 0.9865


In [21]:
# do a final evaluation of the model on the adversarial images
(loss, acc) = model.evaluate(x=advX, y=advY, verbose=0)
print("adversarial images *after* fine-tuning:")
print("loss: {:.4f}, \nacc: {:.4f}".format(loss, acc))

adversarial images *after* fine-tuning:
loss: 0.0474, 
acc: 0.9858
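
Keep in mind that `advX` was generated from the test set, used for fine-tuning, and then reused for this final evaluation, so the 0.9858 accuracy is measured on images the model has already trained on. One way to get a held-out estimate, sketched below, is to split the adversarial set before fine-tuning. The helper `split_holdout`, the 80/20 ratio, and the stand-in arrays are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np

def split_holdout(x, y, holdout_fraction=0.2, seed=42):
    """Shuffle (x, y) together and split into fine-tune and held-out parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_holdout = int(len(x) * holdout_fraction)
    holdout_idx, tune_idx = order[:n_holdout], order[n_holdout:]
    return (x[tune_idx], y[tune_idx]), (x[holdout_idx], y[holdout_idx])

# Stand-in arrays (shapes are illustrative); in the notebook these would be advX / advY
adv_x = np.zeros((1000, 28, 28, 1), dtype="float32")
adv_y = np.zeros((1000, 10), dtype="float32")

(tune_x, tune_y), (held_x, held_y) = split_holdout(adv_x, adv_y)
print(tune_x.shape, held_x.shape)
# Fine-tune on (tune_x, tune_y), then evaluate on (held_x, held_y).
```

Accuracy on the held-out part would give a more conservative picture of how well the fine-tuned model handles adversarial images it has not seen.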
