# Deep Learning - Student Research Project

You can use the link on the left to run this Jupyter notebook on Google Colab.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/TobiasSchaffner/cnn/blob/master/cnn.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />
    Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/TobiasSchaffner/cnn/blob/master/cnn.ipynb">
    <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />
    View source on GitHub</a>
  </td>
</table>

## Imports

We first import the required libraries. I used TensorFlow Keras on Google Colab. NumPy is used for math and matrices, Matplotlib for visualization, and OpenCV (`cv2`) together with `imutils` for the image manipulation.

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf

from tensorflow.keras import datasets, layers, models, optimizers
import matplotlib.pyplot as plt

import cv2
import imutils
import numpy

## Loading of the CIFAR10 dataset

We use Keras to download the CIFAR10 dataset.

In [0]:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
number_of_train_images: int = len(train_images)

## Remapping of the classes

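The CIFAR10 dataset is labeled with ten classes (0 to 9). We remap them to a binary label: if the picture is in one of the classes 2 to 7 (bird, cat, deer, dog, frog, horse), we label it as living; otherwise (airplane, automobile, ship, truck) we label it as not living.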

In [0]:
def is_living(label: int) -> int:
  """
  Map a CIFAR10 class label to one for living and zero for not living.

  :param label: A CIFAR10 class label in range between zero and nine.
  :type label:  int

  :result:      One for living and zero for not living.
  :type result: int
  """
  return int(label in (2, 3, 4, 5, 6, 7))

def map_to_is_living(labels: numpy.ndarray) -> None:
  """Relabel a CIFAR10 label array of shape (n, 1) in place."""
  for i in range(len(labels)):
    labels[i][0] = is_living(labels[i][0])
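
As a cross-check of the mapping, the following sketch lists the ten CIFAR10 class names in label order and applies the same rule in vectorized form with `numpy.isin`. The names `CIFAR10_CLASSES`, `LIVING_CLASSES`, and `to_living` are illustrative and not part of the notebook; unlike `map_to_is_living`, this variant returns a new array instead of modifying its input.

```python
import numpy

# CIFAR10 class names in label order (index == label).
CIFAR10_CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']
LIVING_CLASSES = (2, 3, 4, 5, 6, 7)

def to_living(labels: numpy.ndarray) -> numpy.ndarray:
  """Return a new array with 1 for living classes and 0 otherwise."""
  return numpy.isin(labels, LIVING_CLASSES).astype(labels.dtype)

# Toy label column in the same (n, 1) layout that Keras returns.
example = numpy.array([[0], [2], [7], [9]], dtype=numpy.uint8)
remapped = to_living(example)
for old, new in zip(example[:, 0], remapped[:, 0]):
  print(CIFAR10_CLASSES[old], '->', new)
```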

The dataset loaded above is not yet labeled as needed, so we use our new mapping function to relabel the training and test labels in place.

In [0]:
class_names = ['not living', 'living']

map_to_is_living(train_labels)
map_to_is_living(test_labels)

## Regularization by image manipulation

Some handy helper functions to flip, rotate, and translate the images.

In [0]:
def rotate_images(images: numpy.ndarray, angle: float) -> numpy.ndarray:
  result = images.copy()
  for i in range(len(images)):
    result[i] = imutils.rotate(images[i], angle)
  return result

def translate_images(images: numpy.ndarray, x_trans: int, y_trans: int) -> numpy.ndarray:
  result = images.copy()
  for i in range(len(images)):
    result[i] = imutils.translate(images[i], x_trans, y_trans)
  return result

def flip_images(images: numpy.ndarray) -> numpy.ndarray:
  result = images.copy()
  for i in range(len(images)):
    result[i] = cv2.flip(images[i], flipCode=1)
  return result
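
For intuition, the flip and the whole-pixel translation can also be written with plain NumPy slicing on an entire batch at once. This is a sketch under the assumption that images are stored as `(n, height, width, channels)` arrays and that pixels shifted in from outside the image are filled with zeros; `flip_batch` and `translate_batch` are hypothetical names, not functions from `cv2` or `imutils`.

```python
import numpy

def flip_batch(images: numpy.ndarray) -> numpy.ndarray:
  """Mirror every image horizontally by reversing the width axis."""
  return images[:, :, ::-1, :].copy()

def translate_batch(images: numpy.ndarray, x_trans: int, y_trans: int) -> numpy.ndarray:
  """Shift every image by whole pixels; uncovered pixels are set to zero."""
  _, height, width, _ = images.shape
  result = numpy.zeros_like(images)
  # Source and destination windows that stay inside the image.
  src_x = slice(max(0, -x_trans), min(width, width - x_trans))
  dst_x = slice(max(0, x_trans), min(width, width + x_trans))
  src_y = slice(max(0, -y_trans), min(height, height - y_trans))
  dst_y = slice(max(0, y_trans), min(height, height + y_trans))
  result[:, dst_y, dst_x, :] = images[:, src_y, src_x, :]
  return result

# Two tiny 4x4 RGB "images" with distinct pixel values.
batch = numpy.arange(2 * 4 * 4 * 3, dtype=numpy.uint8).reshape(2, 4, 4, 3)
flipped = flip_batch(batch)
shifted = translate_batch(batch, x_trans=1, y_trans=0)  # one pixel to the right
```

The per-image loops in the notebook remain necessary for the rotation, which is not a simple slicing operation.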

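We use these functions to create additional training data (data augmentation). Each of the 14 variants below is a full copy of the training set, so the training set grows by a factor of 14.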

In [0]:
regularizations = ['normal', 'flip', 'ror', 'flip_ror', 'rol', 'flip_rol', 'left', 'flip_left', 'right', 'flip_right', 'up', 'flip_up', 'down', 'flip_down']

train_images = numpy.concatenate((train_images,
                                  flip_images(train_images),
                                  rotate_images(train_images, 10),
                                  rotate_images(flip_images(train_images), 10),
                                  rotate_images(train_images, -10),
                                  rotate_images(flip_images(train_images), -10),
                                  translate_images(train_images, -5, 0),
                                  translate_images(flip_images(train_images), -5, 0),
                                  translate_images(train_images, 5, 0),
                                  translate_images(flip_images(train_images), 5, 0),
                                  translate_images(train_images, 0, -5),
                                  translate_images(flip_images(train_images), 0, -5),
                                  translate_images(train_images, 0, 5),
                                  translate_images(flip_images(train_images), 0, 5)))

train_labels = numpy.tile(train_labels, (14, 1))

train_images, test_images = train_images / 255.0, test_images / 255.0
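
Because every augmented copy keeps the original image order, the labels only need to be repeated once per copy. The toy example below, with made-up label values, illustrates what `numpy.tile` does to a column of labels and how an index into the concatenated array maps back to the original image.

```python
import numpy

labels = numpy.array([[1], [0], [1]])    # three toy labels, shape (3, 1)
copies = 14                              # one copy per augmentation variant

tiled = numpy.tile(labels, (copies, 1))  # shape (42, 1)

# Image i of variant k sits at index k * len(labels) + i.
k, i = 5, 2
same_label = tiled[k * len(labels) + i][0] == labels[i][0]
print(tiled.shape, same_label)
```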

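## Training data preview

Let's preview the first ten training images of every variant using matplotlib. Each row shows one variant, and the class label is printed below the last row.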

In [0]:
figure = plt.figure(figsize=(20, 15))

for regularization in range(len(regularizations)):
  for i in range(10):
    plt.subplot(len(regularizations), 10, regularization * 10 + i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[regularization * number_of_train_images + i],
               cmap=plt.cm.binary)
    if regularization == len(regularizations) - 1:
      plt.xlabel(class_names[train_labels[i][0]])
    if i == 0:
      plt.ylabel(regularizations[regularization])

plt.show()

## Convolutional neural network architecture



We add further regularization by applying dropout directly to the input.

In [0]:
model = models.Sequential()
model.add(layers.Dropout(0.2, input_shape=(32, 32, 3)))
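
As a rough illustration of what a dropout layer does during training, the sketch below implements inverted dropout in NumPy: each value is zeroed with probability `rate`, and the surviving values are scaled by `1 / (1 - rate)` so that the expected sum stays unchanged. This is a simplified stand-in written for this explanation, not the Keras implementation, and `dropout` is a hypothetical helper name.

```python
import numpy

def dropout(x: numpy.ndarray, rate: float, rng: numpy.random.Generator) -> numpy.ndarray:
  """Zero each element with probability `rate` and rescale the rest."""
  keep_mask = rng.random(x.shape) >= rate
  return x * keep_mask / (1.0 - rate)

rng = numpy.random.default_rng(seed=0)
image = numpy.ones((32, 32, 3))          # stand-in for one normalized input image
dropped = dropout(image, rate=0.2, rng=rng)

# Roughly 20 % of the values are zero; the others are about 1 / 0.8 = 1.25.
print((dropped == 0).mean(), dropped.max())
```

At inference time dropout is inactive, so the input passes through unchanged.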

First step of the convolutional neural network. We start with a low number of filters but with three convolutional layers.

In [0]:
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.2))

Second step. We lower the number of convolutional layers to two but increase the number of filters.

In [0]:
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.2))

In the third step we flatten the output instead of adding a pooling layer.

In [0]:
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dropout(0.2))
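
To see why flattening is reasonable at this point, it helps to trace the spatial size of the feature maps through the layers defined so far. The sketch below uses the standard size rules for stride-1 convolutions (`'same'` keeps the size, the default `'valid'` padding shrinks it by `kernel - 1`) and for non-overlapping 2x2 max pooling (integer division by 2). The helper names `conv_size` and `pool_size` are illustrative.

```python
def conv_size(size: int, kernel: int = 3, padding: str = 'valid') -> int:
  """Output size of a stride-1 convolution along one spatial axis."""
  return size if padding == 'same' else size - kernel + 1

def pool_size(size: int, pool: int = 2) -> int:
  """Output size of non-overlapping max pooling along one spatial axis."""
  return size // pool

size = 32                                # CIFAR10 images are 32x32
# First step: same, same, valid, pool.
size = conv_size(size, padding='same')   # 32
size = conv_size(size, padding='same')   # 32
size = conv_size(size)                   # 30
size = pool_size(size)                   # 15
# Second step: same, valid, pool.
size = conv_size(size, padding='same')   # 15
size = conv_size(size)                   # 13
size = pool_size(size)                   # 6
# Third step: valid convolution with 128 filters, then flatten.
size = conv_size(size)                   # 4
flattened = size * size * 128
print(size, flattened)                   # prints "4 2048"
```

With only a 4x4 map left, the dense layer receives 2048 values per image, which can be compared against the output of `model.summary()`.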

Last step. We summarize with a softmax layer and a final sigmoid unit. Increasing the number of neurons in the last steps leads to worse results.

In [0]:
model.add(layers.Dense(10, activation='softmax'))
model.add(layers.Dense(1, activation='sigmoid'))
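
Numerically, the last two layers compute a softmax over ten values and then a sigmoid of a weighted sum of those ten probabilities. The sketch below reproduces this with NumPy using arbitrary, made-up weights (in the real model they are learned), only to show the shapes and value ranges involved.

```python
import numpy

def softmax(z: numpy.ndarray) -> numpy.ndarray:
  """Numerically stable softmax over the last axis."""
  shifted = z - z.max(axis=-1, keepdims=True)
  exp = numpy.exp(shifted)
  return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(z: numpy.ndarray) -> numpy.ndarray:
  """Logistic function, mapping any real value into (0, 1)."""
  return 1.0 / (1.0 + numpy.exp(-z))

rng = numpy.random.default_rng(seed=0)
logits = rng.normal(size=(4, 10))    # made-up pre-activations of Dense(10) for 4 images
weights = rng.normal(size=(10, 1))   # made-up weights of Dense(1)
bias = 0.0

probabilities = softmax(logits)                          # shape (4, 10), rows sum to 1
living_score = sigmoid(probabilities @ weights + bias)   # shape (4, 1), values in (0, 1)
print(probabilities.sum(axis=1), living_score.ravel())
```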

Let's print a summary of the model.

In [0]:
model.summary()

## Training

We use Adam as the optimizer, a gradient descent variant with momentum and adaptive per-parameter learning rates. We use a small learning rate to avoid oscillations, and binary_crossentropy as the loss function since we only need a binary output.

In [0]:
model.compile(optimizer=optimizers.Adam(learning_rate=0.0001), loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=40, validation_data=(test_images, test_labels), verbose=2)
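
For reference, binary cross-entropy for a label `y` in {0, 1} and a predicted probability `p` is `-(y * log(p) + (1 - y) * log(1 - p))`, averaged over the samples. A minimal NumPy sketch follows; the clipping constant `eps` is an assumption chosen here to avoid `log(0)` and may differ from what Keras uses internally.

```python
import numpy

def binary_crossentropy(y_true, y_pred, eps=1e-7):
  """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
  y_true = numpy.asarray(y_true, dtype=float)
  # Clip so that log() never receives exactly 0 or 1.
  y_pred = numpy.clip(numpy.asarray(y_pred, dtype=float), eps, 1.0 - eps)
  losses = -(y_true * numpy.log(y_pred) + (1.0 - y_true) * numpy.log(1.0 - y_pred))
  return float(losses.mean())

# Invented predictions: confident and right versus confident and wrong.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The loss grows quickly for confident wrong predictions, which is what pushes the sigmoid output toward the correct side during training.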

## Evaluation

Last but not least, we plot the training and validation accuracy over the epochs and evaluate the model on the test set.

In [0]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
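
A single accuracy number does not show whether the errors are concentrated in one class. The sketch below thresholds predicted probabilities at 0.5 (assumed here to match how the accuracy metric treats a single sigmoid output) and splits the result into a small confusion matrix. In the notebook, the probabilities would come from `model.predict(test_images)`; the values here are invented for illustration.

```python
import numpy

labels = numpy.array([1, 0, 1, 1, 0, 0])                     # invented ground truth
probabilities = numpy.array([0.9, 0.2, 0.4, 0.7, 0.6, 0.1])  # invented sigmoid outputs

predictions = (probabilities > 0.5).astype(int)
accuracy = (predictions == labels).mean()

# Rows: true class (not living, living); columns: predicted class.
confusion = numpy.zeros((2, 2), dtype=int)
for true, predicted in zip(labels, predictions):
  confusion[true, predicted] += 1

print(accuracy)
print(confusion)
```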