Neural networks can sometimes be easily fooled. Try to create a random-looking image that makes the NN from [CNNIntro.ipynb](CNNIntro.ipynb) think that it sees a cat. 

(You need to run that notebook first, because it produces the model file that we load with `load_model` below!)

In [None]:
from tensorflow import keras
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

In [None]:
labels = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]

In [None]:
model = keras.models.load_model("CNNIntro_model.h5")

Start with this random image:

In [None]:
rnd_img = np.random.rand(1, 32, 32, 3)

In [None]:
plt.imshow(rnd_img[0])

What does our model predict for this image?

In [None]:
model(rnd_img)

In [None]:
plt.bar(labels, model(rnd_img)[0])
plt.xticks(rotation=90)
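
To read the prediction as text instead of from a bar chart, the class probabilities can be sorted. The following is only a sketch: `top_predictions` is a hypothetical helper and `example_probs` is a made-up probability vector, not real model output. In this notebook one would pass something like `model(rnd_img)[0].numpy()` instead.

```python
import numpy as np

labels = ["airplane", "automobile", "bird", "cat", "deer",
          "dog", "frog", "horse", "ship", "truck"]

def top_predictions(probs, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = np.asarray(probs)
    order = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(labels[i], float(probs[i])) for i in order]

# Made-up probabilities for illustration only (they sum to 1)
example_probs = np.array([0.05, 0.02, 0.30, 0.10, 0.03,
                          0.08, 0.25, 0.07, 0.06, 0.04])
top3 = top_predictions(example_probs, labels)
for name, p in top3:
    print(f"{name}: {p:.2f}")
```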

Let's see if we can modify the image such that the NN thinks this is a cat. So we want to maximise the output for cat:

In [None]:
def output_for(model, img, label):
    return model(img)[:, labels.index(label)]

In [None]:
output_for(model, rnd_img, "cat")

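We do this by calculating the gradient of the cat output with respect to the input image. It tells us how each pixel value has to change to increase that output: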

In [None]:
input_img = tf.constant(rnd_img)
with tf.GradientTape() as tape:
    # calling tape.watch will ensure this also works
    # when input_img is a Tensor (tf.constant) instead of a tf.Variable
    tape.watch(input_img)
    out = output_for(model, input_img, "cat")
grad_wrt_input = tape.gradient(out, input_img)

In [None]:
grad_wrt_input.shape
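
The gradient has the same shape as the input: each entry states how much the cat output changes when the corresponding pixel value is nudged slightly. As a minimal sketch of this idea, independent of the CNN, the snippet below uses a hypothetical toy softmax classifier in plain NumPy (`toy_W`, `toy_probs` and `toy_grad` are made up for illustration) and compares the analytic gradient with a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_classes = 12, 10                     # toy sizes, not the CNN's 32*32*3 inputs
toy_W = rng.normal(size=(n_pixels, n_classes))   # hypothetical weights

def toy_probs(x):
    """Softmax output of a linear toy classifier for a flat input x."""
    z = x @ toy_W
    e = np.exp(z - z.max())                      # subtract max for numerical stability
    return e / e.sum()

def toy_grad(x, k):
    """Analytic gradient of toy_probs(x)[k] with respect to x."""
    p = toy_probs(x)
    # softmax derivative: d p_k / d z_j = p_k * (delta_kj - p_j)
    dz = -p[k] * p
    dz[k] += p[k]
    return toy_W @ dz                            # chain rule through z = x @ toy_W

x = rng.random(n_pixels)
k = 3                                            # position of "cat" in the labels list
analytic = toy_grad(x, k)

# Central finite differences: nudge one input value at a time
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(n_pixels):
    d = np.zeros_like(x)
    d[i] = eps
    numeric[i] = (toy_probs(x + d)[k] - toy_probs(x - d)[k]) / (2 * eps)

max_abs_diff = np.abs(analytic - numeric).max()
print(analytic.shape, max_abs_diff)
```

`tf.GradientTape` computes the analytic version of this automatically for the real model, so no such loop over pixels is needed there.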

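Now repeat this for several steps, adding the gradient to the input image until the model predicts a high probability for cat. After each step we clip the pixel values to the range [0, 1] so that the result stays a valid image.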

In [None]:
def step(input_img, step_size=1.0, label="cat"):
    with tf.GradientTape() as tape:
        tape.watch(input_img)
        out = output_for(model, input_img, label)
    grad_wrt_input = tape.gradient(out, input_img)
    new_img = input_img + step_size * grad_wrt_input
    # set values below 0 to 0 and values above 1 to 1
    new_img = tf.clip_by_value(new_img, 0, 1)
    return new_img
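
The `step` function above performs one step of gradient ascent followed by clipping. The same loop can be sketched without TensorFlow on a toy problem. Everything here is an assumption for illustration: the toy classifier lets pixel `j` vote for class `j`, and `SCALE`, `toy_probs`, `toy_grad` and `toy_step` are hypothetical names that have nothing to do with the CNN.

```python
import numpy as np

SCALE = 8.0  # hypothetical sharpness of the toy classifier

def toy_probs(x):
    """Toy classifier: pixel j votes for class j, followed by a softmax."""
    z = SCALE * x
    e = np.exp(z - z.max())
    return e / e.sum()

def toy_grad(x, k):
    """Gradient of toy_probs(x)[k] with respect to x."""
    p = toy_probs(x)
    g = -p[k] * p
    g[k] += p[k]
    return SCALE * g

def toy_step(x, k, step_size=1.0):
    """One gradient-ascent step, then clip the values back into [0, 1]."""
    return np.clip(x + step_size * toy_grad(x, k), 0.0, 1.0)

rng = np.random.default_rng(0)
x = rng.random(10)           # random start "image" with 10 pixels
k = 3                        # class we want to maximise
history = [toy_probs(x)[k]]  # probability of class k after each step
for _ in range(1000):
    x = toy_step(x, k)
    history.append(toy_probs(x)[k])
    if history[-1] > 0.99:
        break

print(f"start: {history[0]:.4f}, end: {history[-1]:.4f}, steps: {len(history) - 1}")
```

In this toy setting every step raises the target pixel and lowers the others, so the target probability never decreases. For the real CNN there is no such guarantee, which is why the loop in the notebook caps the number of iterations.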

In [None]:
def print_status(it, out):
    print(f"{it}: {out:.3f} cat")

tuned_img = tf.constant(rnd_img)
for it in range(1000):
    tuned_img = step(tuned_img, step_size=1.0)
    out = output_for(model, tuned_img, "cat").numpy()[0]
    if it < 10 or (it % 5) == 0:
        print_status(it, out)
    if out > 0.99:
        break
print_status(it, out)

In [None]:
plt.bar(labels, model(tuned_img)[0])
plt.xticks(rotation=90)

In [None]:
plt.imshow(tuned_img[0])

Now we do the same for each of the 10 class labels, always starting from the same random image, and compare the results.

In [None]:
tuned_images = {}
for label in labels:
    print(f"Tuning for {label}")
    tuned_img = tf.constant(rnd_img)
    for it in range(1000):
        tuned_img = step(tuned_img, step_size=1.0, label=label)
        out = output_for(model, tuned_img, label=label).numpy()[0]
        if out > 0.99:
            break
    tuned_images[label] = tuned_img

In [None]:
fig, axs = plt.subplots(nrows=2, ncols=5, figsize=(20,6))
for ax, (label, img) in zip(axs.ravel(), tuned_images.items()):
    ax.imshow(img[0])
    ax.set_title(label)
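
To check how "random-looking" the tuned images still are, one can measure how far they have moved from the starting image. The following is a sketch with a hypothetical helper `perturbation_stats`. The arrays used here are stand-ins generated with NumPy, not results from the model. In this notebook one could call it as `perturbation_stats(rnd_img, tuned_images["cat"].numpy())`.

```python
import numpy as np

def perturbation_stats(original, tuned):
    """Summarise the per-pixel difference between two images of equal shape."""
    diff = np.asarray(tuned, dtype=np.float64) - np.asarray(original, dtype=np.float64)
    abs_diff = np.abs(diff)
    return {
        "mean_abs": float(abs_diff.mean()),            # average change per pixel value
        "max_abs": float(abs_diff.max()),              # largest single change
        "changed_fraction": float((diff != 0).mean()), # share of values that changed
    }

# Stand-in data for illustration: a random image and a slightly perturbed copy
rng = np.random.default_rng(0)
original = rng.random((1, 32, 32, 3))
tuned = np.clip(original + 0.05 * rng.standard_normal(original.shape), 0.0, 1.0)

stats = perturbation_stats(original, tuned)
print(stats)
```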