<div style="width:100%; height:140px">
    <img src="https://www.kuleuven.be/internationaal/thinktank/fotos-en-logos/ku-leuven-logo.png/image_preview" width="300" align="left">
</div>


KUL H02A5a Computer Vision: Group Assignment 2
---------------------------------------------------------------
Student numbers: <span style="color:red">r1, r2, r3, r4, r5</span>. (fill in your student numbers!)

In this group assignment your team will delve into some deep learning applications for computer vision. The assignment will be delivered in the same groups from *Group assignment 1* and you start from this template notebook. The notebook you submit for grading is the last notebook you submit in the [Kaggle competition](https://www.kaggle.com/t/d11be6a431b84198bc85f54ae7e2563f) prior to the deadline on **Tuesday 24 May 23:59**. Closely follow [these instructions](https://github.com/gourie/kaggle_inclass) for joining the competition, sharing your notebook with the TAs and making a valid notebook submission to the competition. A notebook submission not only produces a *submission.csv* file that is used to calculate your competition score, it also runs the entire notebook and saves its output as if it were a report. This way it becomes an all-in-one-place document for the TAs to review. As such, please make sure that your final submission notebook is self-contained and fully documented (e.g. provide strong arguments for the design choices that you make). Most likely, this notebook format is not appropriate to run all your experiments at submission time (e.g. the training of CNNs is a memory hungry and time consuming process; due to limited Kaggle resources). It can be a good idea to distribute your code otherwise and only summarize your findings, together with your final predictions, in the submission notebook. For example, you can substitute experiments with some text and figures that you have produced "offline" (e.g. learning curves and results on your internal validation set or even the test set for different architectures, pre-processing pipelines, etc). We advise you to first go through the PDF of this assignment entirely before you really start. Then, it can be a good idea to go through this notebook and use it as your first notebook submission to the competition. 
You can make use of the *Group assignment 2* forum/discussion board on Toledo if you have any questions. Good luck and have fun!

---------------------------------------------------------------
NOTES:
* This notebook is just a template. Please keep the five main sections, but feel free to adjust further in any way you please!
* Clearly indicate the improvements that you make! You can for instance use subsections like: *3.1. Improvement: applying loss function f instead of g*.


# 1. Overview
This assignment consists of *three main parts* for which we expect you to provide code and extensive documentation in the notebook:
* Image classification (Sect. 2)
* Semantic segmentation (Sect. 3)
* Adversarial attacks (Sect. 4)

In the first part, you will train an end-to-end neural network for image classification. In the second part, you will do the same for semantic segmentation. For these two tasks we expect you to put a significant effort into optimizing performance and as such competing with fellow students via the Kaggle competition. In the third part, you will try to find and exploit the weaknesses of your classification and/or segmentation network. For the latter there is no competition format, but we do expect you to put significant effort in achieving good performance on the self-posed goal for that part. Finally, we ask you to reflect and produce an overall discussion with links to the lectures and "real world" computer vision (Sect. 5). It is important to note that only a small part of the grade will reflect the actual performance of your networks. However, we do expect all things to work! In general, we will evaluate the correctness of your approach and your understanding of what you have done that you demonstrate in the descriptions and discussions in the final notebook.

## 1.1 Deep learning resources
If you did not yet explore this in *Group assignment 1 (Sect. 2)*, we recommend using the TensorFlow and/or Keras library for building deep learning models. You can find a nice crash course [here](https://colab.research.google.com/drive/1UCJt8EYjlzCs1H1d1X0iDGYJsHKwu-NO).

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
import numpy as np
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.metrics import f1_score, precision_score, recall_score
from keras import backend as bck
from sklearn.utils.class_weight import compute_class_weight
from keras.preprocessing import image  
from keras.layers import Activation, MaxPooling2D, BatchNormalization, Conv2D, Dense, GlobalAveragePooling2D, Dropout, Flatten
from keras.callbacks import ModelCheckpoint
from keras.models import Sequential, Model
from keras.preprocessing.image import ImageDataGenerator
import cv2
from sklearn.model_selection import train_test_split
import random

## 1.2 PASCAL VOC 2009
For this project you will be using the [PASCAL VOC 2009](http://host.robots.ox.ac.uk/pascal/VOC/voc2009/index.html) dataset. This dataset consists of colour images of various scenes with different object classes (e.g. animal: *bird, cat, ...*; vehicle: *aeroplane, bicycle, ...*), totalling 20 classes.

In [2]:
# Loading the training data
# train_df = pd.read_csv('/kaggle/input/kul-h02a5a-computer-vision-ga2-2022/train/train_set.csv', index_col="Id")
train_df = pd.read_csv('../input/kul-h02a5a-computer-vision-ga2-2022/train/train_set.csv', index_col="Id")

labels = train_df.columns
train_df["img"] = [np.load('../input/kul-h02a5a-computer-vision-ga2-2022/train/img/train_{}.npy'.format(idx)) for idx, _ in train_df.iterrows()]
train_df["seg"] = [np.load('../input/kul-h02a5a-computer-vision-ga2-2022/train/seg/train_{}.npy'.format(idx)) for idx, _ in train_df.iterrows()]
print("The training set contains {} examples.".format(len(train_df)))

# Show some examples
fig, axs = plt.subplots(2, 20, figsize=(10 * 20, 10 * 2))
for i, label in enumerate(labels):
    df = train_df.loc[train_df[label] == 1]
    axs[0, i].imshow(df.iloc[0]["img"], vmin=0, vmax=255)
    axs[0, i].set_title("\n".join(label for label in labels if df.iloc[0][label] == 1), fontsize=40)
    axs[0, i].axis("off")
    axs[1, i].imshow(df.iloc[0]["seg"], vmin=0, vmax=20)  # with the absolute color scale it will be clear that the arrays in the "seg" column are label maps (labels in [0, 20])
    axs[1, i].axis("off")
    
plt.show()

# The training dataframe contains for each image 20 columns with the ground truth classification labels and 20 column with the ground truth segmentation maps for each class
train_df.head(1)

In [6]:
train_size = train_df.index.size
train_size

In [4]:
labels = train_df.columns[:20].tolist()
labels

In [None]:
# Loading the test data
test_df = pd.read_csv('../input/kul-h02a5a-computer-vision-ga2-2022/test/test_set.csv', index_col="Id")
test_df["img"] = [np.load('../input/kul-h02a5a-computer-vision-ga2-2022/test/img/test_{}.npy'.format(idx)) for idx, _ in test_df.iterrows()]
test_df["seg"] = [-1 * np.ones(img.shape[:2], dtype=np.int8) for img in test_df["img"]]
print("The test set contains {} examples.".format(len(test_df)))

# The test dataframe is similar to the training dataframe, but here the values are -1 --> your task is to fill in these as good as possible in Sect. 2 and Sect. 3; in Sect. 6 this dataframe is automatically transformed in the submission CSV!
test_df.head(1)
test_size = test_df.index.size

## Dataset analysis

A classifier trains best when the dataset is balanced, i.e. when every class occurs roughly equally often.

Let us analyze the dataset. The distribution of the classes is shown below.

In [None]:
# Number of training images that contain each class
counts = train_df[labels].sum().values

fig, ax = plt.subplots(figsize=(15, 10))
# Use the labels in column order so that every bar matches its count
ax.barh(labels, counts, align='center')
ax.set_title('Counts')
ax.xaxis.tick_top()
plt.show()

The dataset is clearly unbalanced. Several methods exist to compensate for this: upsampling (increasing the occurrences of the minority classes), downsampling (decreasing the occurrences of the majority classes), and class weights (a high weight for minority classes and a low weight for the others).
Because we want the network to learn general behaviour, we use neither upsampling nor downsampling. Upsampling gives too much importance to certain images by duplicating them in the dataset. Downsampling discards information: a single picture can contain both a minority and a majority class, so deleting it to reduce the majority class also removes a minority example. We therefore use class weights.

## Preprocessing

The preprocessing consists of three steps: resizing, normalization, and shuffling of the data.

We resize the images to 64x64 rather than 128x128 so that the CNN trains faster.

In [None]:
im_size = 64

In [None]:
def preprocess(img, im_size):

  # Resize + normalize
  
  img = cv2.resize(img, dsize=(im_size, im_size), interpolation=cv2.INTER_LINEAR)
  img = img/255.0
  
  return img.astype('float32')

In [None]:
processed_train_data_images = np.zeros((train_size, im_size,im_size,3))
processed_test_data_images = np.zeros((test_size, im_size,im_size,3))

for i in range(train_size):
  processed_train_data_images[i] = preprocess(train_df.loc[i]['img'], im_size)

for i in range(test_size):
  processed_test_data_images[i] = preprocess(test_df.loc[i]['img'], im_size)

processed_train_data_labels = np.array(train_df.drop(['img', 'seg'], axis=1))

In [None]:
X_train, X_test, y_train, y_test = train_test_split(processed_train_data_images, processed_train_data_labels, test_size=0.33)

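`train_test_split` already shuffles the data by default. If an additional shuffle is wanted, images and labels must be permuted with the same index order so that each image keeps its label.

In [None]:
# random.shuffle must not be used on multi-dimensional NumPy arrays: it swaps row views and can duplicate rows
perm = np.random.default_rng(30).permutation(len(X_train))
X_train = X_train[perm]
y_train = y_train[perm]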
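
Shuffling one list of indices, rather than the data itself, keeps every image paired with its label. The sketch below illustrates this in plain Python on toy data; `shuffle_in_unison` and the toy lists are hypothetical names used only for this illustration. With NumPy arrays the same permutation can be applied by indexing both arrays with it.

```python
import random

def shuffle_in_unison(images, labels, seed=30):
    """Return copies of both sequences reordered by one shared permutation."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # shuffle the indices, not the data
    return [images[i] for i in order], [labels[i] for i in order]

# Toy data: each "image" name ends with the label it belongs to.
toy_images = ["img_a", "img_b", "img_c", "img_d"]
toy_labels = ["a", "b", "c", "d"]
shuffled_images, shuffled_labels = shuffle_in_unison(toy_images, toy_labels)
print(list(zip(shuffled_images, shuffled_labels)))
```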

## Classification

Now that the dataset is preprocessed, we can move on to the classification.

First, we address the class imbalance. For every class we assign one weight to the label 0 and one to the label 1.
This prevents the model from appearing to learn while only ever predicting the majority label.

In [None]:
def weights_calculator(y):
  w = np.empty([20, 2])
  for i in range(20):
    # classes is passed as a NumPy array, which recent scikit-learn versions require
    w[i] = compute_class_weight(class_weight="balanced", classes=np.array([0., 1.]), y=y[:, i])
  return w

class_weights = weights_calculator(y_train)
class_weights
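
To make the resulting weights easier to interpret, the sketch below reproduces the "balanced" heuristic for a single binary label column, assuming the formula documented by scikit-learn, `n_samples / (n_classes * count)`. The helper `balanced_binary_weights` and the toy column are hypothetical and only serve as an illustration.

```python
def balanced_binary_weights(column):
    """Weights for the labels 0 and 1 using n_samples / (n_classes * count)."""
    n_samples = len(column)
    n_pos = sum(column)
    n_neg = n_samples - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both labels must occur at least once")
    return n_samples / (2 * n_neg), n_samples / (2 * n_pos)

# Toy column: the class is present in 1 image out of 10.
toy_column = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
w_neg, w_pos = balanced_binary_weights(toy_column)
print(w_neg, w_pos)  # the rare positive label gets the larger weight
```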

## Accuracy metrics

Another important step is the choice of the metric used to evaluate the classifier. As we are dealing with a multi-label classification problem, we use the $F_\beta$ score (with $\beta = 1$).

The $F_\beta$ score takes into account:
- The precision, the fraction of predicted positives that are correct: $Prec = \frac{TP}{TP+FP}$
- The recall, the fraction of actual positives that are detected: $Rec = \frac{TP}{TP+FN}$

Maximizing the precision minimizes FP, and maximizing the recall minimizes FN.

The $F_\beta$ score combines precision and recall via the following formula:
$F_{\beta} = \frac{(1+\beta^2) \cdot Prec \cdot Rec}{\beta^2 \cdot Prec + Rec}$

We take $\beta = 1$ to give equal importance to precision and recall.
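
As a worked example of the formulas above, the sketch below computes precision, recall and $F_\beta$ from raw counts. The function name `f_beta` and the counts are made up for illustration.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) computed from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    score = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, score

# Toy counts: 6 true positives, 2 false positives, 4 false negatives.
prec, rec, f1 = f_beta(tp=6, fp=2, fn=4)
print(prec, rec, f1)
```

With these counts the precision is 6/8 and the recall is 6/10; for $\beta = 1$ the score equals $\frac{2TP}{2TP+FP+FN} = \frac{12}{18}$.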

scikit-learn provides an $F_\beta$ implementation:

In [None]:
from sklearn.metrics import fbeta_score

In [None]:
fbeta_score([1,0,0,1],[1,0,0,1],beta=1.0)

Here is a version written with the Keras backend so that it can be used as a metric during training:

In [None]:
# Based on https://github.com/kenanEkici/multilabel-class-pascalvoc
def f_beta_1(y, y_pred):
  true_positives = bck.sum(bck.round(bck.clip(y * y_pred, 0, 1)))
  possible_positives = bck.sum(bck.round(bck.clip(y, 0, 1)))
  predicted_positives = bck.sum(bck.round(bck.clip(y_pred, 0, 1)))
  precision = true_positives / (predicted_positives + bck.epsilon())
  recall = true_positives / (possible_positives + bck.epsilon())
  f_beta_1 = 2*(precision*recall)/(precision+recall + bck.epsilon())
  return f_beta_1

## Loss function

Now we have to choose a loss function suited to our multi-label classification problem.

We use the binary cross-entropy loss, applied to sigmoid outputs. We cannot use softmax because one picture can contain multiple classes. The loss pushes each output towards 0 or 1.

This loss function measures the error between the predicted probability and the ground-truth label of each class.

The choice of the sigmoid function is essential for multi-label classification: it turns the task into N independent binary classification problems, whereas softmax makes the classes compete for a total probability of 1.

To take the weight of each class into account (to deal with our unbalanced dataset), we define a new loss function that multiplies the binary cross-entropy of each class by the weight of its ground-truth label.

In [None]:
# Based on https://github.com/kenanEkici/multilabel-class-pascalvoc
def loss_binary_weights(w):
      def weighted_loss(y, y_pred):
        return bck.mean((w[:,0]**(1-y))*(w[:,1]**(y))*bck.binary_crossentropy(y, y_pred), axis=-1)
      return weighted_loss
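
The weighting rule `w[:,0]**(1-y) * w[:,1]**y` selects the weight of label 0 when `y = 0` and the weight of label 1 when `y = 1`. The plain-Python sketch below mirrors that rule for one sample; `weighted_bce` and the toy weights are hypothetical and are not the Keras implementation.

```python
import math

def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    """Mean over classes of w(y) * BCE, with w(y) = w0**(1 - y) * w1**y."""
    total = 0.0
    for y, p, (w0, w1) in zip(y_true, y_pred, weights):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        weight = (w0 ** (1 - y)) * (w1 ** y)
        total += weight * bce
    return total / len(y_true)

# Class 0 is rare (large weight on label 1), class 1 is balanced.
toy_weights = [(0.55, 5.0), (1.0, 1.0)]
missed_rare = weighted_bce([1, 0], [0.1, 0.1], toy_weights)
missed_common = weighted_bce([0, 1], [0.1, 0.1], toy_weights)
print(missed_rare, missed_common)
```

Missing the rare class costs more than missing the balanced one, which is the intended effect of the weights.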

## Neural network

As we are dealing with images, we use a convolutional neural network (CNN).

Below, we propose a CNN composed of several types of layers:
- Convolutional: detects local patterns in the image
- Batch normalization: normalizes the activations, which stabilizes and speeds up training and has a mild regularizing effect
- Max pooling: keeps the maximum of each region covered by the pooling window
- Dropout: randomly zeroes the contribution of some neurons during training, which reduces overfitting
- Dense: at the end of the network, after flattening, produces the N outputs corresponding to the number of classes

Note that the activation function at the end of the network is a sigmoid and not a softmax, as we are dealing with a multi-label classification problem (several classes are possible for one instance).
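
The difference between the two activations can be illustrated with a small sketch in plain Python. The logits below are made-up values for an image that contains two classes at once.

```python
import math

def sigmoid(logits):
    """Independent probability for every class."""
    return [1 / (1 + math.exp(-z)) for z in logits]

def softmax(logits):
    """Probabilities that compete and sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [3.0, 3.0, -2.0]  # two classes present, one absent
sig = sigmoid(toy_logits)
soft = softmax(toy_logits)
print(sig)
print(soft)
```

With sigmoid both present classes can receive a probability above 0.9, while softmax has to split the probability mass between them.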

In [None]:
def CNN():
  model = Sequential()
  input_shape = (64,64,3)
  channel_dim = -1
  # For "channels_first" backends the channel axis comes before height and width
  if bck.image_data_format() == "channels_first":
    input_shape = (3, 64, 64)
    channel_dim = 1
  model.add(Conv2D(32, (3, 3), padding="same", input_shape=input_shape))
  model.add(Activation("relu"))
  model.add(BatchNormalization(axis=channel_dim))
  model.add(MaxPooling2D(pool_size=(3, 3)))
  model.add(Dropout(0.25))

  model.add(Conv2D(64, (3, 3), padding="same"))
  model.add(Activation("relu"))
  model.add(BatchNormalization(axis=channel_dim))
  model.add(Conv2D(64, (3, 3), padding="same"))
  model.add(Activation("relu"))
  model.add(BatchNormalization(axis=channel_dim))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))

  model.add(Conv2D(128, (3, 3), padding="same"))
  model.add(Activation("relu"))
  model.add(BatchNormalization(axis=channel_dim))
  model.add(Conv2D(128, (3, 3), padding="same"))
  model.add(Activation("relu"))
  model.add(BatchNormalization(axis=channel_dim))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))

  model.add(Flatten())
  model.add(Dense(1024))
  model.add(Activation("relu"))
  model.add(BatchNormalization())
  model.add(Dropout(0.5))
  model.add(Dense(20)) 
  model.add(Activation('sigmoid'))
  
  return model
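
To see how large the flattened layer is, the sketch below traces the spatial size through the three pooling layers. It assumes 64x64 inputs and the Keras defaults for `MaxPooling2D` (no padding, stride equal to the pool size); the `"same"` convolutions leave the spatial size unchanged. The helper `pooled_size` is hypothetical.

```python
def pooled_size(size, pool):
    """Output size of unpadded max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

size = 64
for pool in (3, 2, 2):  # the three MaxPooling2D layers of CNN()
    size = pooled_size(size, pool)

flattened = size * size * 128              # 128 filters in the last conv block
dense_params = flattened * 1024 + 1024     # weights plus biases of Dense(1024)
print(size, flattened, dense_params)
```

Under these assumptions most of the trainable parameters sit in the first dense layer.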

## Training

Now we put everything together and we can start the training of our classification problem.

Parameters initialization

In [None]:
model = CNN()
optimizer = 'adam'
batch_size = 32
epochs = 10
filepath = 'save_CNN'

Conversion Int64 to Float32

In [None]:
y_train = np.float32(y_train)
y_test = np.float32(y_test)

### Data augmentation

To improve the learning of the model we will use data augmentation, i.e. we add data to our dataset to have more samples to train on. To do so, we apply several transformations to our initial images such as rotation, translation etc.
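
As a rough illustration of what such transformations do, the sketch below applies a horizontal flip and a one-pixel shift to a tiny image stored as nested lists. It is a simplified stand-in, not the `ImageDataGenerator` implementation; the edge handling imitates `fill_mode="nearest"` by repeating the border pixel. The labels of an image stay the same after these transformations.

```python
def horizontal_flip(image):
    """Mirror every row of the image."""
    return [list(reversed(row)) for row in image]

def shift_right(image, pixels):
    """Shift columns to the right; vacated pixels repeat the nearest edge value."""
    return [[row[max(j - pixels, 0)] for j in range(len(row))] for row in image]

toy_image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
flipped = horizontal_flip(toy_image)
shifted = shift_right(toy_image, 1)
print(flipped)
print(shifted)
```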

In [None]:
save_best = ModelCheckpoint(filepath, monitor='val_get_f1', verbose=0, save_best_only=True, mode='max', period=1)

model.compile(loss=loss_binary_weights(class_weights),optimizer=optimizer,metrics=[f_beta_1])

# data augmentation
datagen = ImageDataGenerator(rotation_range=90, width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,zoom_range=0.2, fill_mode="nearest", horizontal_flip=True)

datagen.fit(X_train)

# training
history = model.fit(datagen.flow(X_train, y_train, batch_size=batch_size), validation_data=(X_test, y_test), steps_per_epoch = len(X_train) // batch_size, epochs=epochs, verbose=1, workers=4, callbacks=[save_best], shuffle=True)

### Results

Plot training and validation losses

In [None]:
plt.style.use("ggplot")
plt.figure()
plt.ylim(top=5)
plt.plot(np.arange(0, epochs), history.history["loss"], label="training loss")
plt.plot(np.arange(0, epochs), history.history["val_loss"], label="validation loss")
plt.title("Training result")
plt.xlabel("Epoch")
plt.legend(loc="upper right")
plt.show()

Plot training and validation F1 score

In [None]:
plt.style.use("ggplot")
plt.figure()
plt.ylim(top=1)
plt.plot(np.arange(0, epochs), history.history["f_beta_1"], label="training F1")
plt.plot(np.arange(0, epochs), history.history["val_f_beta_1"], label="validation F1")
plt.title("Training result")
plt.xlabel("Epoch")
plt.legend(loc="upper right")
plt.show()

#### Predictions

In [None]:
# Based on https://github.com/kenanEkici/multilabel-class-pascalvoc

threshold = 0.4

y_pred = model.predict(X_test)

y_pred[y_pred>=threshold] = 1
y_pred[y_pred<threshold] = 0

print('Precision: ', precision_score(y_test, y_pred, average='samples', zero_division=0)) 
print('Recall: ', recall_score(y_test, y_pred, average='samples')) 
print('F beta 1 score: ', f1_score(y_test, y_pred, average='samples'), '\n') 

print('Per class F beta 1 score: ')
f1_scores = f1_score(y_test, y_pred, average=None)
for i in range(len(f1_scores)):
  # labels is in column order, which is the order of the columns of y_test
  print(labels[i], round(f1_scores[i],2))

To choose the threshold, we plot the precision-recall curve (the ROC curve is the usual choice for a balanced dataset, the precision-recall curve for an unbalanced one). A good threshold is the one whose point offers the best trade-off between precision and recall, for instance the one with the highest F1 score.

In [None]:
# Based on https://github.com/kenanEkici/multilabel-class-pascalvoc

fig, ax = plt.subplots(figsize= (5,5))
ax.set_title("Precision recall")
ax.set_ylabel("Precision")
ax.set_xlabel("Recall")
ax.set_xlim(xmin=0, xmax=1)
ax.set_ylim(ymin=0, ymax=1)

x = []
y = []
# Predict once and reuse the probabilities for every threshold
y_prob = model.predict(X_test)
for threshold in np.arange(0, 1.1, 0.1):
  y_pred = (y_prob >= threshold).astype('float32')
  p = precision_score(y_test, y_pred, average='samples', zero_division=0)
  r = recall_score(y_test, y_pred, average='samples')
  ax.annotate(str(round(threshold,2)), (r, p))
  x.append(r)
  y.append(p)
  
ax.plot(x, y)
ax.lines[-1].set_label("CNN")

plt.legend(loc="upper right")
plt.show()
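
One way to turn the curve into a single threshold is to sweep the candidates and keep the one with the highest F1 score. The sketch below does this in plain Python on made-up scores for one class; `f1_at_threshold` and the toy data are hypothetical.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 score obtained when scores >= threshold are predicted as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.7, 0.45, 0.4, 0.2, 0.1, 0.55, 0.3]
thresholds = [k / 10 for k in range(1, 10)]
best_threshold = max(thresholds, key=lambda t: f1_at_threshold(toy_true, toy_scores, t))
best_f1 = f1_at_threshold(toy_true, toy_scores, best_threshold)
print(best_threshold, best_f1)
```

To avoid an optimistic estimate, the threshold should be selected on validation data that is not used for the final evaluation.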

We notice that our model does not learn very well. To improve it, we use transfer learning.

## Transfer learning

Transfer learning consists of reusing a network that has already been trained on another dataset. We first reuse its learned weights in our own model and then fine-tune the model on our training dataset (in the code below the pre-trained base is frozen, so only the new output layer is updated).

We use MobileNetV2, which is available with weights pre-trained on the ImageNet dataset, a collection of millions of images with thousands of classes.

We set the input size to 224, as the pre-trained model works with 224x224 images.

In [None]:
im_size = 224

In [None]:
processed_train_data_images = np.zeros((train_size, im_size,im_size,3))
processed_test_data_images = np.zeros((test_size, im_size,im_size,3))

for i in range(train_size):
  processed_train_data_images[i] = preprocess(train_df.loc[i]['img'], im_size)

for i in range(test_size):
  processed_test_data_images[i] = preprocess(test_df.loc[i]['img'], im_size)

processed_train_data_labels = np.array(train_df.drop(['img', 'seg'], axis=1))

X_train, X_test, y_train, y_test = train_test_split(processed_train_data_images, processed_train_data_labels, test_size=0.33)

Let us create our pre-trained model.

Parameters initialization

In [None]:
optimizer = 'adam'
batch_size = 32
epochs = 40
filepath = 'save_CNN'

Here we use the convolutional part of MobileNetV2 as a feature extractor. To speed up convergence we replace the hidden layers at the end with a global average pooling (GAP) layer followed by a single dense layer of 20 outputs (because we have 20 classes). The GAP layer takes the mean of each height x width feature map, so it reduces a height x width x depth tensor to a vector of length depth. This decreases the number of parameters to learn and thus leads to faster convergence.
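
The sketch below illustrates GAP on a tiny feature map and compares the size of the output layer with and without it. The 7 x 7 x 1280 shape is an assumption about the output of MobileNetV2 without its top for 224x224 inputs (it can be checked with `base_model.summary()`); `global_average_pool` is a hypothetical helper.

```python
def global_average_pool(feature_map):
    """feature_map[h][w][c] -> list with one mean per channel."""
    height = len(feature_map)
    width = len(feature_map[0])
    depth = len(feature_map[0][0])
    return [
        sum(feature_map[h][w][c] for h in range(height) for w in range(width))
        / (height * width)
        for c in range(depth)
    ]

# Toy 2 x 2 feature map with 2 channels.
toy_map = [[[1.0, 10.0], [3.0, 20.0]],
           [[5.0, 30.0], [7.0, 40.0]]]
pooled = global_average_pool(toy_map)

# Parameters of the final Dense(20) layer (weights plus biases).
h, w, d, n_classes = 7, 7, 1280, 20   # assumed feature-map shape
params_flatten = h * w * d * n_classes + n_classes
params_gap = d * n_classes + n_classes
print(pooled, params_flatten, params_gap)
```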

In [None]:
from tensorflow.keras.applications import MobileNetV2

base_model = MobileNetV2(input_shape=(224,224,3), include_top=False, weights='imagenet')
base_model.trainable = False
model = Sequential([base_model,GlobalAveragePooling2D(),Dense(20, activation='sigmoid')])

model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=[f_beta_1])

Here again, we will use data augmentation to improve the learning of our model.

In [None]:
datagen = ImageDataGenerator(rotation_range=40, width_shift_range=0.2 ,height_shift_range=0.2, shear_range=0.2,zoom_range=0.2, fill_mode="nearest", horizontal_flip=True)
datagen.fit(X_train)

save_best = ModelCheckpoint(filepath, monitor='val_f_beta_1', verbose=0, save_best_only=True, mode='max', period=1)

# training the binary classifier
history = model.fit(datagen.flow(X_train, y_train, batch_size=batch_size),validation_data=(X_test, y_test),epochs=epochs, verbose=1, workers=4, shuffle=True, callbacks=[save_best])

### Results

Plot training and validation losses

In [None]:
plt.style.use("ggplot")
plt.figure()
plt.ylim(top=1)
plt.plot(np.arange(0, epochs), history.history["loss"], label="training loss")
plt.plot(np.arange(0, epochs), history.history["val_loss"], label="validation loss")
plt.title("Training result")
plt.xlabel("Epoch")
plt.legend(loc="upper right")
plt.show()

Plot training and validation F1 score

In [None]:
plt.style.use("ggplot")
plt.figure()
plt.ylim(top=1)
plt.plot(np.arange(0, epochs), history.history["f_beta_1"], label="training F1")
plt.plot(np.arange(0, epochs), history.history["val_f_beta_1"], label="validation F1")
plt.title("Training result")
plt.xlabel("Epoch")
plt.legend(loc="upper right")
plt.show()

## Adversarial attacks

In this section, the goal is to fool our model into misclassifying some images. To simplify the problem, we consider only a binary classification. We choose two classes with a similar number of images, car and sofa. Cars and sofas also have somewhat similar shapes, which keeps the task from being too easy.

In [164]:
## Binary Classifier
class_1 = 'car'
class_2 = 'sofa'

train_bin = train_df[['img', class_1, class_2]].copy()

In [165]:
def label_to_string(label_):
    if label_ == 0:
        return class_1
    if label_ == 1:
         return class_2

In [214]:
im_size = 64

def preprocess(img, im_size):
  #resize
  img = cv2.resize(img, dsize=(im_size, im_size), interpolation=cv2.INTER_LINEAR)
  
  #normalize
  img = img/255.0
  
  return img.astype('float32')

In [215]:
train_data_images = []
train_data_labels = []

for i in range(train_size):
  row = train_bin.iloc[i]
  # keep only the images that contain exactly one of the two classes
  if row[class_1] != row[class_2]:
    train_data_images.append(preprocess(row['img'], im_size))
    # label 0 = class_1 (car), label 1 = class_2 (sofa)
    train_data_labels.append(0 if row[class_1] == 1 else 1)

train_data_images = np.array(train_data_images)
train_data_labels = np.array(train_data_labels)

In [216]:
X_train, X_val, y_train, y_val = train_test_split(train_data_images, train_data_labels, test_size=0.15)

In [217]:
#car

car_label = y_train[y_train==0][0]
car = X_train[y_train==0][0]
print('label', car_label, '->', label_to_string(0))
plt.imshow(car)
plt.show()

In [218]:
#sofa

sofa_label = y_train[y_train==1][0]
sofa = X_train[y_train==1][0]
print('label', sofa_label, '->', label_to_string(1))
plt.imshow(sofa)
plt.show()

Now let us train our model. We again use transfer learning. As we are doing binary classification, the network has a single output.

In [219]:
from tensorflow.keras.applications import MobileNetV2

In [220]:
base_model = MobileNetV2(input_shape=(im_size,im_size,3), include_top=False, weights='imagenet')
base_model.trainable = False
model = Sequential([
  base_model,
  GlobalAveragePooling2D(),
  Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

In [221]:
##with data augmentation
# datagen = ImageDataGenerator(rotation_range=40, width_shift_range=0.2,height_shift_range=0.2, shear_range=0.2,zoom_range=0.2,fill_mode="nearest", horizontal_flip=True)

# datagen.fit(X_train)

# history = model.fit(datagen.flow(X_train, y_train, batch_size=4), validation_data=(X_val, y_val), epochs=10, verbose=1, workers=4, shuffle=True)

##without data augmentation
history = model.fit(X_train, y_train, batch_size=4, validation_data=(X_val, y_val), epochs=10, verbose=1)

In [222]:
[score,acc] = model.evaluate(X_train, y_train)
[score,acc] = model.evaluate(X_val, y_val)

When data augmentation is enabled, the accuracy reported during training is computed on augmented batches, so it does not match the accuracy obtained by evaluating on the original images.
On this binary task the accuracy is very good.

### Adversarial model

#### Deceptive labels

In [223]:
# make false labels
y_train_dec = 1 - y_train  # swap the labels: 0 <-> 1
y_val_dec = 1 - y_val

print("Check")
print(y_train[:5])
print(y_train_dec[:5])

#### Encoder-decoder model

In [208]:
from keras.layers import Input, Lambda, UpSampling2D

In [224]:
#adversary network

input_shape = (im_size,im_size,3)

input_img = Input(shape=input_shape)
x = Conv2D(64, (3, 3), padding='same')(input_img)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(16, (3, 3), padding='same')(encoded)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(3, (3, 3), padding='same')(x)
x = BatchNormalization()(x)
decoded = Activation('sigmoid')(x)

Add perturbation to our input image

In [225]:
eps_ = 100.

In [226]:
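def perturb(args):
    decoded,input_img,eps = args
    # Rescale the decoder output so that its L2 norm is at most eps.
    # Note: bck.sum has no axis argument, so the norm is taken over the whole batch, not per image.
    return decoded*bck.minimum(bck.pow(bck.sqrt(bck.sum(bck.square(decoded))),-1)*eps,1) 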

# Add perturbation
def perturb_img(args):
    perturbation,input_img = args
    return bck.clip(perturbation+input_img,0,1)

perturbation = Lambda(perturb, output_shape=input_shape, name='perturb')([decoded,input_img, eps_])
perturbed_img = Lambda(perturb_img, output_shape=input_shape, name='perturb_img')([perturbation,input_img])
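
The scaling in `perturb` multiplies the perturbation by `min(eps / norm, 1)`, which projects it onto an L2 ball of radius `eps`. The sketch below shows the same rule in plain Python on a single flat vector; `project_l2` is a hypothetical helper, and unlike the notebook code it does not operate on a whole batch.

```python
import math

def project_l2(values, eps):
    """Scale `values` so that its L2 norm is at most eps; shorter vectors are unchanged."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return list(values)
    scale = min(eps / norm, 1.0)
    return [v * scale for v in values]

toy_perturbation = [3.0, 4.0]                        # L2 norm 5
clipped = project_l2(toy_perturbation, eps=1.0)      # scaled down to norm 1
untouched = project_l2(toy_perturbation, eps=10.0)   # already inside the ball
print(clipped, untouched)
```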

##### Make sure that our clean network is not trained, the weights are frozen.

In [227]:
model.trainable = False

In [228]:
outputs = model(perturbed_img)

Train the adversary model.

In the training log we can clearly see that the validation accuracy, measured against the correct labels, decreases.

In [229]:
batch_size = 4
epochs = 20

adv_CNN = Model(input_img, outputs, name='adv')
adv_CNN.compile(optimizer='adam',loss='binary_crossentropy', metrics = ['accuracy'])

history = adv_CNN.fit(X_train, y_train_dec,validation_data=(X_val, y_val), batch_size=batch_size, epochs=epochs,verbose=1)

Check whether the accuracy of the network has been lowered.

In [230]:
[score,acc] = adv_CNN.evaluate(X_train, y_train_dec)
[score,acc] = adv_CNN.evaluate(X_train, y_train)

In [231]:
[score,acc] = adv_CNN.evaluate(X_val, y_val_dec)
[score,acc] = adv_CNN.evaluate(X_val, y_val)

In [232]:
# check if the original network still works
[score,acc] = model.evaluate(X_train, y_train)

In [233]:
adv_model = adv_CNN
inter_layer_perturb_img = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb_img').output)
inter_layer_perturb = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb').output)
generated_img = inter_layer_perturb_img.predict(X_train)
perturbation = inter_layer_perturb.predict(X_train)
output = adv_model.predict(X_train)

In [None]:
pred_train = np.round(output).reshape(len(X_train))
succesfull_fakes = X_train[abs(pred_train - y_train).astype(bool)]

acc_ = np.mean(pred_train != y_train)
print('Fraction of wrongly classified images:', acc_)

In [234]:
# plot some images that are misclassified after the perturbation
fooled_idx = np.flatnonzero(pred_train != y_train)  # indices of the fooled images
for img_i in fooled_idx[:6]:
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(14,10))
    ax1.imshow(X_train[img_i])
    ax1.set_title('Original')
    ax2.imshow(generated_img[img_i])
    ax2.set_title('Perturbed image')
    ax3.imshow(perturbation[img_i])
    ax3.set_title('Perturbation')
    plt.show()
    print('Real Label:', y_train[img_i], '->', label_to_string(y_train[img_i]))
    print('Adversary Label:', y_train_dec[img_i], '->', label_to_string(y_train_dec[img_i]))
    

### Do the same for a lower eps_ value.

In [235]:
eps_ = 10.

In [236]:
def perturb(args):
    decoded,input_img,eps = args
    return decoded*bck.minimum(bck.pow(bck.sqrt(bck.sum(bck.square(decoded))),-1)*eps,1) 

# Add perturbation
def perturb_img(args):
    perturbation,input_img = args
    return bck.clip(perturbation+input_img,0,1)

perturbation = Lambda(perturb, output_shape=input_shape, name='perturb')([decoded,input_img, eps_])
perturbed_img = Lambda(perturb_img, output_shape=input_shape, name='perturb_img')([perturbation,input_img])

In [237]:
model.trainable = False
outputs = model(perturbed_img)

batch_size = 4
epochs = 20

adv_CNN = Model(input_img, outputs, name='adv')
adv_CNN.compile(optimizer='adam',loss='binary_crossentropy', metrics = ['accuracy'])

history = adv_CNN.fit(X_train, y_train_dec,validation_data=(X_val, y_val), batch_size=batch_size, epochs=epochs,verbose=1)

In [238]:
adv_model = adv_CNN
inter_layer_perturb_img = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb_img').output)
inter_layer_perturb = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb').output)
generated_img = inter_layer_perturb_img.predict(X_train)
perturbation = inter_layer_perturb.predict(X_train)
output = adv_model.predict(X_train)

In [None]:
pred_train = np.round(output).reshape(len(X_train))
succesfull_fakes = X_train[abs(pred_train - y_train).astype(bool)]

acc_ = np.mean(pred_train != y_train)
print('Fraction of wrongly classified images:', acc_)

In [240]:
# plot some images that are misclassified after the perturbation
fooled_idx = np.flatnonzero(pred_train != y_train)  # indices of the fooled images
for img_i in fooled_idx[:6]:
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(14,10))
    ax1.imshow(X_train[img_i])
    ax1.set_title('Original')
    ax2.imshow(generated_img[img_i])
    ax2.set_title('Perturbed image')
    ax3.imshow(perturbation[img_i])
    ax3.set_title('Perturbation')
    plt.show()
    print('Real Label:', y_train[img_i], '->', label_to_string(y_train[img_i]))
    print('Adversary Label:', y_train_dec[img_i], '->', label_to_string(y_train_dec[img_i]))
    

As expected, with a lower perturbation value few pictures are misclassified.

### Do the same for a higher eps_ value.

In [241]:
eps_ = 600.

In [242]:
def perturb(args):
    decoded,input_img,eps = args
    return decoded*bck.minimum(bck.pow(bck.sqrt(bck.sum(bck.square(decoded))),-1)*eps,1) 

# Add perturbation
def perturb_img(args):
    perturbation,input_img = args
    return bck.clip(perturbation+input_img,0,1)

perturbation = Lambda(perturb, output_shape=input_shape, name='perturb')([decoded,input_img, eps_])
perturbed_img = Lambda(perturb_img, output_shape=input_shape, name='perturb_img')([perturbation,input_img])

In [243]:
model.trainable = False
outputs = model(perturbed_img)

batch_size = 4
epochs = 20

adv_CNN = Model(input_img, outputs, name='adv')
adv_CNN.compile(optimizer='adam',loss='binary_crossentropy', metrics = ['accuracy'])

history = adv_CNN.fit(X_train, y_train_dec,validation_data=(X_val, y_val), batch_size=batch_size, epochs=epochs,verbose=1)

In [244]:
adv_model = adv_CNN
inter_layer_perturb_img = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb_img').output)
inter_layer_perturb = Model(inputs=adv_model.input,outputs=adv_model.get_layer('perturb').output)
generated_img = inter_layer_perturb_img.predict(X_train)
perturbation = inter_layer_perturb.predict(X_train)
output = adv_model.predict(X_train)

In [249]:
pred_train = np.round(output).reshape(len(X_train))
succesfull_fakes = X_train[abs(pred_train - y_train).astype(bool)]

acc_ = np.mean(pred_train != y_train)
print('Fraction of wrongly classified images:', acc_)

In [245]:
# plot some images that are misclassified after the perturbation
fooled_idx = np.flatnonzero(pred_train != y_train)  # indices of the fooled images
for img_i in fooled_idx[:6]:
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(14,10))
    ax1.imshow(X_train[img_i])
    ax1.set_title('Original')
    ax2.imshow(generated_img[img_i])
    ax2.set_title('Perturbed image')
    ax3.imshow(perturbation[img_i])
    ax3.set_title('Perturbation')
    plt.show()
    print('Real Label:', y_train[img_i], '->', label_to_string(y_train[img_i]))
    print('Adversary Label:', y_train_dec[img_i], '->', label_to_string(y_train_dec[img_i]))

# Conclusion

Adversary: a higher image resolution usually requires a higher perturbation value, because the L2 norm of the perturbation sums over more pixels.
With a high perturbation value the image is so distorted that it no longer resembles the original, so it is unsurprising that such an image is misclassified.
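
The link between resolution and perturbation value can be checked with a short calculation, assuming the perturbation changes every pixel value by the same amount `delta`: its L2 norm is then `delta * sqrt(height * width * channels)`. The function below is a hypothetical helper for this arithmetic.

```python
import math

def l2_norm_of_uniform_change(height, width, channels, delta):
    """L2 norm of a perturbation that changes every value by `delta`."""
    return delta * math.sqrt(height * width * channels)

low_res = l2_norm_of_uniform_change(64, 64, 3, delta=0.1)
high_res = l2_norm_of_uniform_change(224, 224, 3, delta=0.1)
print(low_res, high_res, high_res / low_res)
```

For the same per-pixel change, the norm at 224x224 is 3.5 times the norm at 64x64, so the budget has to grow with the image side length.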