# Traffic Signs Adversarial Detector with KFServing

![demo](demo.png)

Prerequisites:

 * Running cluster with KFServing installed and authenticated for use with `kubectl`
    * Istio with Istio Gateway exposed on a LoadBalancer
 * Knative Eventing installed
 * Download the Traffic Signs model: run `make model_signs` (requires `gsutil`)
 * Pip install the `alibi-detect` library.

## Setup Resources

Enable eventing on the default namespace. This activates a default Knative Broker.

In [None]:
!kubectl label namespace default knative-eventing-injection=enabled

Create a Knative service to dump events it receives. This will be the example final sink for adversarial events.

In [None]:
!pygmentize message-dumper.yaml

In [None]:
!kubectl apply -f message-dumper.yaml

Create the KFServing image classification model for traffic signs. We add a `logger` for requests; the default destination is the namespace's Knative Broker.

In [None]:
!pygmentize signs.yaml

In [None]:
!kubectl apply -f signs.yaml

Create the pretrained Traffic Signs Adversarial detector. We forward replies to the message-dumper we started.

In [None]:
!pygmentize signsad.yaml

In [None]:
!kubectl apply -f signsad.yaml

Create a Knative trigger to forward logging events to our Adversarial Detector.

In [None]:
!pygmentize trigger.yaml

In [None]:
!kubectl apply -f trigger.yaml

Get the IP address of the Istio Ingress Gateway. This assumes you have installed Istio with a LoadBalancer.

In [None]:
CLUSTER_IPS=!(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
CLUSTER_IP=CLUSTER_IPS[0]
print(CLUSTER_IP)

In [None]:
SERVICE_HOSTNAMES=!(kubectl get inferenceservice signs -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME_SIGNS=SERVICE_HOSTNAMES[0]
print(SERVICE_HOSTNAME_SIGNS)

In [None]:
SERVICE_HOSTNAMES=!(kubectl get ksvc ad-signs -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME_SIGNSAD=SERVICE_HOSTNAMES[0]
print(SERVICE_HOSTNAME_SIGNSAD)

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import tensorflow as tf
import cv2
from PIL import Image
import os

def load_signs(train_folder="./traffic_data/train/"):
    """Load the training images, resize them to 30x30 and split them 80/20."""
    data = []
    labels = []

    height = 30
    width = 30
    classes = 43

    # Each class has its own sub-folder named after its integer id (0..42).
    for i in range(classes):
        path = train_folder + "{0}/".format(i)
        for fname in os.listdir(path):
            try:
                # Note: cv2.imread returns BGR channel order; no conversion is applied here.
                image = cv2.imread(path + fname)
                image_from_array = Image.fromarray(image, 'RGB')
                size_image = image_from_array.resize((height, width))
                data.append(np.array(size_image))
                labels.append(i)
            except AttributeError:
                # cv2.imread returns None for files it cannot read; skip them.
                print("Skipping unreadable file:", path + fname)

    Cells = np.array(data)
    labels = np.array(labels)

    # Randomize the order of the input images (fixed seed for reproducibility)
    s = np.arange(Cells.shape[0])
    np.random.seed(43)
    np.random.shuffle(s)
    Cells = Cells[s]
    labels = labels[s]

    # The first 20% of the shuffled data is held out; the remaining 80% is used for training
    n_val = int(0.2 * len(labels))
    X_train, X_val = Cells[n_val:], Cells[:n_val]
    y_train, y_val = labels[n_val:], labels[:n_val]

    train, test = (X_train, y_train), (X_val, y_val)

    return train, test

In [None]:
train, test = load_signs()
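
The shuffle-and-split step inside `load_signs` can be tried on toy data. The sketch below is only an illustration: the arrays `cells` and `labels` are made-up stand-ins for the images and class ids, and it assumes the same seed and 20% hold-out fraction as the function above.

```python
import numpy as np

n = 10
cells = np.arange(n) * 10   # stand-in for the image array
labels = np.arange(n)       # stand-in for the class ids

# Shuffle an index array once and apply it to both arrays so pairs stay aligned.
order = np.arange(n)
np.random.seed(43)
np.random.shuffle(order)
cells, labels = cells[order], labels[order]

# The first 20% is held out, the remaining 80% is used for training.
n_val = int(0.2 * n)
X_val, X_train = cells[:n_val], cells[n_val:]
y_val, y_train = labels[:n_val], labels[n_val:]

print(X_train.shape, X_val.shape)
```

Shuffling indices rather than the arrays themselves keeps every image paired with its label after the split.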

In [1]:
import matplotlib.pyplot as plt
import requests
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

tf.keras.backend.clear_session()


X_train, y_train = train
X_test, y_test = test

X_train = X_train.reshape(-1, 30, 30, 3).astype('float32') / 255
X_test = X_test.reshape(-1, 30, 30, 3).astype('float32') / 255
y_train = to_categorical(y_train, 43)
y_test = to_categorical(y_test, 43)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

# One label per traffic sign class (43 classes, ids 0..42)
classes = tuple(str(i) for i in range(43))
img_shape = (30, 30, 3)

def show(Xs):
    for X in Xs:
        plt.imshow(np.squeeze(X))
        plt.axis('off')
        plt.show()

    
meta_folder="./traffic_data/meta/"   
def show_prediction(idxs):
    for idx in idxs.tolist():
        class_image=cv2.imread(meta_folder+"{}.png".format(idx))
        image = Image.fromarray(class_image, 'RGB')
        image = np.array(image)
        show([image])
    
def predict(X):
    formData = {
    'instances': X.tolist()
    }
    headers = {}
    headers["Host"] = SERVICE_HOSTNAME_SIGNS
    res = requests.post('http://'+CLUSTER_IP+'/v1/models/signs:predict', json=formData, headers=headers)
    if res.status_code == 200:
        show_prediction(np.array(res.json()["predictions"]))
    else:
        print("Failed with ",res.status_code)
        
    
def detect(X):
    formData = {
    'instances': X.tolist()
    }
    headers = {"alibi-detect-return-instance-score":"true"}
    headers["Host"] = SERVICE_HOSTNAME_SIGNSAD
    res = requests.post('http://'+CLUSTER_IP+'/', json=formData, headers=headers)
    if res.status_code == 200:
        ad = res.json()
        ad["data"]["instance_score"] = np.array(ad["data"]["instance_score"])
        return ad
    else:
        print("Failed with ",res.status_code)
        return []


## Normal Prediction

In [None]:
idx = 2
X = X_train[idx:idx+1]
show(X)
predict(X)

Show logs from message-dumper. The last cloud event should show a line like:

```JSON
"{\"data\": {\"feature_score\": null, \"instance_score\": null, \"is_adversarial\": [0]}, \"meta\": {\"name\": \"AdversarialVAE\", \"detector_type\": \"offline\", \"data_type\": null}}"
```

This shows the last event was not an adversarial attack.
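
To check the flag programmatically rather than by eye, the payload can be decoded in Python. This is a minimal sketch that assumes the event data appears in the logs exactly as in the example line above, that is, as a JSON string that itself contains JSON. The helper name `parse_detector_event` is hypothetical.

```python
import json

# Example payload copied from the message-dumper output shown above.
raw = r'"{\"data\": {\"feature_score\": null, \"instance_score\": null, \"is_adversarial\": [0]}, \"meta\": {\"name\": \"AdversarialVAE\", \"detector_type\": \"offline\", \"data_type\": null}}"'

def parse_detector_event(line):
    """Decode a double-encoded detector payload into a dict (hypothetical helper)."""
    inner = json.loads(line)   # first pass: unwrap the outer JSON string
    return json.loads(inner)   # second pass: parse the detector response

event = parse_detector_event(raw)
flags = event["data"]["is_adversarial"]
print(flags, any(flags))  # prints: [0] False
```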

In [None]:
!kubectl logs $(kubectl get pod -l serving.knative.dev/configuration=message-dumper -o jsonpath='{.items[0].metadata.name}') user-container

## Generate adversarial instances

The `cleverhans` adversarial attack methods assume that the model outputs logits, so we will create a modified model by simply removing the softmax output layer:

In [None]:
from alibi_detect.utils.saving import  load_tf_model
filepath = './model_signs/'
model = load_tf_model(filepath)

In [None]:
model_logits = Model(inputs=model.inputs, outputs=model.layers[-2].output)
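
Dropping the softmax layer does not change which class is predicted, because softmax preserves the ordering of its inputs. The sketch below illustrates this with made-up logits. It assumes, as the text above states, that the model's final layer is a plain softmax.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Made-up logits for two inputs and four classes (illustrative values only).
logits = np.array([[2.0, -1.0, 0.5, 0.1],
                   [0.3, 0.2, 3.1, -2.0]])
probs = softmax(logits)

# The predicted class is the same with or without the softmax.
print(np.argmax(logits, axis=-1), np.argmax(probs, axis=-1))  # prints: [0 2] [0 2]
```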

Select observations for which we will create adversarial instances:

In [None]:
ids = np.arange(2,7)
X_to_adv = X_test[ids]
print(X_to_adv.shape)

Launch the adversarial attack. It follows the [Basic Iterative Method (Kurakin et al. 2016)](https://arxiv.org/pdf/1607.02533.pdf) when `rand_init` is set to 0, or the [Madry et al. (2017)](https://arxiv.org/pdf/1706.06083.pdf) method when `rand_minmax` is larger than 0:

In [None]:
# Adversarial attack method. The latest release of the `cleverhans` package does
# not support TensorFlow 2 yet, so we need to install from the master branch:
# pip install git+https://github.com/tensorflow/cleverhans.git#egg=cleverhans
from cleverhans.future.tf2.attacks import projected_gradient_descent
X_adv = projected_gradient_descent(model_logits,
                                   X_to_adv,
                                   eps=2.,
                                   eps_iter=1.,
                                   nb_iter=10,
                                   norm=2,
                                   clip_min=X_train.min(),
                                   clip_max=X_train.max(),
                                   rand_init=None,
                                   rand_minmax=.3,
                                   targeted=False,
                                   sanity_checks=False
                                  ).numpy()
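
To make the roles of `eps`, `eps_iter`, `nb_iter`, `clip_min` and `clip_max` more concrete, here is a simplified NumPy sketch of an untargeted L2 projected gradient descent loop. It is not the `cleverhans` implementation: it uses a toy linear classifier `W` with an analytic gradient, omits the random initialisation, and the names `loss_grad` and `pgd_l2` are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_grad(W, x, y):
    """Gradient of softmax cross-entropy w.r.t. x for linear logits W @ x."""
    p = softmax(W @ x)
    p[y] -= 1.0
    return W.T @ p

def pgd_l2(W, x, y, eps, eps_iter, nb_iter, clip_min, clip_max):
    """Untargeted L2 PGD on a toy linear model (illustrative sketch only)."""
    x_adv = x.copy()
    for _ in range(nb_iter):
        g = loss_grad(W, x_adv, y)
        g = g / (np.linalg.norm(g) + 1e-12)   # unit-norm ascent direction
        x_adv = x_adv + eps_iter * g          # step of L2 length eps_iter
        eta = x_adv - x
        norm = np.linalg.norm(eta)
        if norm > eps:                        # project back onto the L2 ball of radius eps
            eta = eta * (eps / norm)
        x_adv = np.clip(x + eta, clip_min, clip_max)  # keep a valid pixel range
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))           # toy 3-class linear model on 8 features
x = rng.uniform(0.0, 1.0, size=8)     # toy input in [0, 1]
y = int(np.argmax(W @ x))             # attack the currently predicted class
eps = 0.5

x_adv = pgd_l2(W, x, y, eps=eps, eps_iter=0.1, nb_iter=10, clip_min=0.0, clip_max=1.0)
print("perturbation norm:", np.linalg.norm(x_adv - x))
```

The projection keeps the total perturbation within `eps`, while the clipping keeps each value inside the range the model saw during training.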

In [None]:
# predict() displays the predicted classes and returns nothing, so there is no value to keep
predict(X_to_adv)
predict(X_adv)

We can look at the logs of the message dumper and see the last 2 cloud events, which should show the results of our batch prediction of ordinary examples and of our modified adversarial examples:

```JSON
"{\"data\": {\"feature_score\": null, \"instance_score\": null, \"is_adversarial\": [0, 0, 0, 0, 0]}, \"meta\": {\"name\": \"AdversarialVAE\", \"detector_type\": \"offline\", \"data_type\": null}}"
```

and 

```JSON
"{\"data\": {\"feature_score\": null, \"instance_score\": null, \"is_adversarial\": [1, 1, 1, 1, 1]}, \"meta\": {\"name\": \"AdversarialVAE\", \"detector_type\": \"offline\", \"data_type\": null}}"
```

This shows the first batch of 5 were not adversarial but the second 5 were:

  * `is_adversarial: [0, 0, 0, 0, 0]`
  * `is_adversarial: [1, 1, 1, 1, 1]`

In [None]:
!kubectl logs $(kubectl get pod -l serving.knative.dev/configuration=message-dumper -o jsonpath='{.items[0].metadata.name}') user-container

In [None]:
y_pred = np.argmax(model(X_to_adv).numpy(), axis=-1)
y_pred_adv = np.argmax(model(X_adv).numpy(), axis=-1)

n_rows = 5
n_cols = 4
figsize = (15, 20)
img_shape = (30, 30, 3)

fig5 = plt.figure(constrained_layout=False, figsize=figsize)
widths = [5, 1, 5, 1]
heights = [5, 5, 5, 5, 5]
spec5 = fig5.add_gridspec(ncols=4, nrows=5, width_ratios=widths,
                          height_ratios=heights)

for row in range(n_rows):
    ax_0 = fig5.add_subplot(spec5[row, 0])    
    ax_0.imshow(X_to_adv[row].reshape(img_shape))
    if row == 0:
        ax_0.title.set_text('Original')
    ax_0.axis('off')
    
    ax_1 = fig5.add_subplot(spec5[row, 1])
    class_image=cv2.imread(meta_folder+"{}.png".format(y_pred[row]))
    image = Image.fromarray(class_image, 'RGB')
    image = np.array(image)
    ax_1.imshow(image)
    ax_1.title.set_text('Pred original: {}'.format(y_pred[row]))
    ax_1.axis('off')
    
    ax_2 = fig5.add_subplot(spec5[row, 2])
    ax_2.imshow(X_adv[row].reshape(img_shape))
    if row == 0:
        ax_2.title.set_text('Adversarial')
    ax_2.axis('off')
    
    ax_3 = fig5.add_subplot(spec5[row, 3])
    class_image_adv=cv2.imread(meta_folder+"{}.png".format(y_pred_adv[row]))
    image_adv = Image.fromarray(class_image_adv, 'RGB')
    image_adv = np.array(image_adv)   
    ax_3.imshow(image_adv)
    ax_3.title.set_text('Pred adversarial: {}'.format(y_pred_adv[row]))
    ax_3.axis('off')

## Get Adversarial Scores

We call the adversarial detector directly to get instance scores.


In [None]:
from alibi_detect.utils.visualize import plot_instance_score
X = np.concatenate([X_to_adv, X_adv], axis=0)
ad_preds = detect(X)
labels = ['Normal', 'Adversarial']
target = np.array([0 if i < X_to_adv.shape[0] else 1 for i in range(X.shape[0])])
plot_instance_score(ad_preds, target, labels, 0.5)
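
The threshold passed to `plot_instance_score` separates the two groups visually. The sketch below shows the same comparison numerically on made-up scores. The values are illustrative only, and the threshold the deployed detector uses for `is_adversarial` is part of its saved configuration, so it may differ from the 0.5 used here.

```python
import numpy as np

# Made-up instance scores: the first five stand for normal inputs,
# the last five for adversarial ones.
scores = np.array([0.02, 0.10, 0.05, 0.31, 0.08, 0.91, 0.77, 0.64, 0.99, 0.58])
target = np.array([0 if i < 5 else 1 for i in range(scores.shape[0])])
threshold = 0.5  # same value as passed to plot_instance_score above

# Flag an instance as adversarial when its score exceeds the threshold.
flags = (scores > threshold).astype(int)
accuracy = float((flags == target).mean())
print(flags.tolist(), accuracy)  # prints: [0, 0, 0, 0, 0, 1, 1, 1, 1, 1] 1.0
```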