# Image Classification

In this JupyterLab notebook, we will train a neural network to classify happy and sad smileys :)

In [None]:
!unzip data.zip -d data


## Import
Here we import the libraries we need for our example.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

import pathlib

# set a seed to have reproducible results
tf.keras.utils.set_random_seed(
    2012
)

## Prepare our dataset
We photographed and cut out all the smileys you drew. We put all the happy smileys in a folder called "happy" and all the sad ones in a folder called "sad".

We also took a small part of the smileys (an equal number of sad and happy ones) and moved them to another folder. The network will not see these smileys during training; we use them only to check how well it recognizes new, 'unknown' smileys.

In [None]:
# the smileys we will use to train the network
smilies_dir_train = pathlib.Path("data/smilies_train")
# the smileys we will use to evaluate the network
smilies_dir_val = pathlib.Path("data/smilies_val")

In [None]:
image_count = len(list(smilies_dir_train.glob('*/*.png')))
print(f"We have {image_count} images loaded and ready to use :)")

In [None]:
# we open the first 'happy' smiley to check that everything was loaded correctly
happy = list(smilies_dir_train.glob('happy/*'))
PIL.Image.open(str(happy[0]))

In [None]:
# and another one
PIL.Image.open(str(happy[1]))

In [None]:
# we do the same for the sad smileys to be extra sure
sad = list(smilies_dir_train.glob('sad/*'))
PIL.Image.open(str(sad[0]))

In [None]:
# and another one
PIL.Image.open(str(sad[1]))

## Create a tensorflow dataset

Our images are now ready and stored in the named folders. Keras lets us load these folders directly into our program and recognizes from the folder name whether an image shows a happy or a sad smiley.

The images are loaded into a so-called dataset. From there we can easily access them to train the model or to further process the images.
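
As a rough sketch of the idea (this is not the Keras implementation): the class names are the subfolder names in sorted order, and the label of each image is the index of its folder. The helper `labels_from_folders` and the tiny throwaway folder structure are hypothetical and only serve as an illustration.

```python
import pathlib
import tempfile


def labels_from_folders(root):
    """Map each image file name to the index of its parent folder (hypothetical helper)."""
    root = pathlib.Path(root)
    # the class names are the subfolder names in sorted order
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    labels = {}
    for index, name in enumerate(class_names):
        for image_path in sorted((root / name).glob("*.png")):
            labels[image_path.name] = index
    return class_names, labels


# build a tiny throwaway folder structure with empty files to try it out
with tempfile.TemporaryDirectory() as tmp:
    for folder, files in {"sad": ["s1.png"], "happy": ["h1.png", "h2.png"]}.items():
        (pathlib.Path(tmp) / folder).mkdir()
        for file_name in files:
            (pathlib.Path(tmp) / folder / file_name).touch()
    demo_class_names, demo_labels = labels_from_folders(tmp)

print(demo_class_names)
print(demo_labels)
```

With this layout, "happy" gets label 0 and "sad" gets label 1, because the folder names are sorted.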

When working with neural networks, it is common to show the network several images at the same time in one step. This helps it to learn features that appear in several images and can also shorten the training time.
How many such images the network sees in one step is configured by the batch size. Usually one takes a large number, even several hundred images. Since our dataset is not that large today, we will work with a batch size of 4.
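
To get a feeling for the batch size, here is a minimal sketch that computes how many steps one pass over the training data (one epoch) takes. The image count of 50 is a made-up number for illustration, not the size of our dataset, and we assume the last, smaller batch is kept.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to show every image once."""
    # the last batch may contain fewer images than batch_size
    return math.ceil(num_images / batch_size)


example_image_count = 50  # hypothetical number of training images
example_steps = steps_per_epoch(example_image_count, batch_size=4)
print(f"{example_image_count} images with batch size 4 -> {example_steps} steps per epoch")
```

A larger batch size means fewer steps per epoch, while a smaller batch size means more, smaller steps.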

Our network can only work with images of the same size. Since we cut the smileys out by hand, their sizes vary, so we need to bring them to a uniform size. Here we use 64x64 pixels.


In [None]:
# Set some hyperparameters
batch_size = 4
img_height = 64
img_width = 64

In [None]:
# load the data in our train dataset
train_ds = tf.keras.utils.image_dataset_from_directory(
    smilies_dir_train,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size
)

In [None]:
# load the data in our validation dataset
val_ds = tf.keras.utils.image_dataset_from_directory(
    smilies_dir_val,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size
)

In [None]:
class_names = train_ds.class_names
print(class_names)

Let's take a look at some samples from our dataset to make sure everything has worked so far.

In [None]:
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(4):
        ax = plt.subplot(2, 2, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

In [None]:
# check dimensions of batch with multiple images
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

In [None]:
# cache the images and prefetch upcoming batches so that loading data does not slow down training
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
# pixel values will be between 0 and 1 afterwards
normalization_layer = layers.Rescaling(1./255)

In [None]:
# check if normalization worked
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixel values are now in `[0., 1]`.
print(np.min(first_image), np.max(first_image))
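
The rescaling itself is just a multiplication of every pixel value by 1/255. A minimal sketch in plain Python, using a made-up 2x2 grayscale image and the hypothetical helper `rescale`:

```python
def rescale(pixels, scale=1.0 / 255):
    """Multiply every pixel value by the given scale."""
    return [[value * scale for value in row] for row in pixels]


# made-up 2x2 grayscale image with values between 0 and 255
tiny_image = [[0, 128], [255, 64]]
rescaled = rescale(tiny_image)

smallest = min(min(row) for row in rescaled)
largest = max(max(row) for row in rescaled)
print(smallest, largest)
```

The darkest possible pixel (0) stays 0, and the brightest possible pixel (255) ends up at about 1.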

### Convolutional layer
![image.png](attachment:a827f077-6caf-4faa-9727-70576b8477c6.png)
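
To illustrate what a convolutional layer computes, here is a simplified sketch in plain Python: a single 3x3 kernel slides over a single-channel image, the border is padded with zeros so the output keeps the size of the input, and ReLU removes negative values. The image and the kernel are hand-picked for the example. In the real layer the kernel values are learned, there are several kernels and color channels, and a bias is added.

```python
def conv2d_same(image, kernel):
    """Slide a square kernel over a 2D image with zero padding ('same' output size)."""
    height, width = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    output = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    r, c = row + i - pad, col + j - pad
                    # positions outside the image count as 0 (zero padding)
                    if 0 <= r < height and 0 <= c < width:
                        total += image[r][c] * kernel[i][j]
            output[row][col] = total
    return output


def relu(feature_map):
    """Replace negative values with 0."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# a 4x4 image with a vertical edge between dark (0) and bright (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# hand-picked kernel that responds to dark-to-bright transitions from left to right
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = relu(conv2d_same(image, kernel))
for row in feature_map:
    print(row)
```

The output has the same 4x4 size as the input, and the largest values appear in the two middle columns, right where the edge is.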


In [None]:
# how many classes do we have?
num_classes = len(class_names)

# create neural network
model = Sequential([
    layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    # here is our first convolutional layer -> 16 Kernels, Kernel Size = 3x3, padding to keep dimension of image
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    # here is our second convolutional layer -> 32 Kernels, Kernel Size = 3x3, padding to keep dimension of image
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    # reduce our image dimension here
    layers.MaxPooling2D(pool_size=(2, 2)),
    # we flatten our images to get 1D data
    layers.Flatten(),
    layers.Dense(32, activation='relu'),
    # the output layer has one output per class - the class with the highest value is the prediction
    layers.Dense(num_classes)
])

In [None]:
# we want to minimize the loss - and use the adam optimizer to do this
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)
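
What does this loss measure? For a single image, it is the negative logarithm of the probability that the network assigns to the correct class. The sketch below computes it by hand from raw outputs (logits). The function name and the example logits are made up, and Keras additionally averages the loss over all images in a batch.

```python
import math


def sparse_categorical_crossentropy_from_logits(logits, label):
    """Loss for one example: -log(softmax(logits)[label]), computed via log-sum-exp."""
    highest = max(logits)
    # subtracting the maximum keeps exp() numerically stable
    log_sum_exp = highest + math.log(sum(math.exp(v - highest) for v in logits))
    return log_sum_exp - logits[label]


# made-up raw outputs (logits) for the classes ["happy", "sad"]
logits = [2.0, -1.0]
loss_if_happy = sparse_categorical_crossentropy_from_logits(logits, label=0)
loss_if_sad = sparse_categorical_crossentropy_from_logits(logits, label=1)
print(f"true class happy: {loss_if_happy:.3f}, true class sad: {loss_if_sad:.3f}")
```

The loss is small when the highest output belongs to the correct class and large when the network is confidently wrong.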

In [None]:
# what does our model look like?
model.summary()
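
The parameter counts in the summary can be checked by hand. The sketch below assumes the layers exactly as defined above, 64x64 images with 3 color channels and 2 classes (happy and sad). The helper functions are hypothetical names for this illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights of all kernels plus one bias per kernel."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input and neuron plus one bias per neuron."""
    return inputs * units + units


img_height, img_width, num_classes = 64, 64, 2  # assumed values from this notebook

conv1 = conv2d_params(3, in_channels=3, filters=16)
conv2 = conv2d_params(3, in_channels=16, filters=32)
# max pooling halves height and width; flattening turns 32x32x32 values into one vector
flattened = (img_height // 2) * (img_width // 2) * 32
dense1 = dense_params(flattened, 32)
dense2 = dense_params(32, num_classes)
total = conv1 + conv2 + dense1 + dense2

print(conv1, conv2, dense1, dense2, total)
```

Almost all parameters sit in the first dense layer, because every one of its 32 neurons is connected to all flattened values.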

### Fully connected layer
![image](https://miro.medium.com/max/439/1*sVvC9YwPFD5RJ9xgxrYHPw.png)
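
In a fully connected (dense) layer, every neuron multiplies each input by its own weight, sums the results, adds a bias and optionally applies an activation function. Here is a minimal sketch in plain Python with made-up weights; in the real network the weights are learned during training.

```python
def dense(inputs, weights, biases, activation=None):
    """Compute the outputs of a fully connected layer for one input vector."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # weighted sum of all inputs plus the bias of this neuron
        value = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(value)
    if activation == "relu":
        outputs = [max(0.0, value) for value in outputs]
    return outputs


inputs = [1.0, 2.0, 3.0]
# made-up weights: one row per neuron, one weight per input
weights = [[0.5, -1.0, 0.0], [1.0, 1.0, -1.0]]
biases = [0.0, 0.5]

raw_outputs = dense(inputs, weights, biases)
relu_outputs = dense(inputs, weights, biases, activation="relu")
print(raw_outputs, relu_outputs)
```

The first neuron produces a negative value, which ReLU turns into 0, while the positive value of the second neuron passes through unchanged.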

In [None]:
%%time

# let's train for 10 epochs
epochs = 10

# here the actual training happens
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

In [None]:
# just some code to plot the training and validation accuracy and loss
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

Now we have the first predictions from our trained model. Let's have a look at the examples we have reserved for validation.

The left value is the correct class, the right value is the class predicted by our model.
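
Comparing the two values for every image is exactly how the accuracy is computed: we count how often the prediction matches the correct class. A minimal sketch with made-up labels (these are not results of our model):

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of examples where the prediction matches the correct class."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# made-up example: one of four smileys is classified incorrectly
example_true = ["happy", "sad", "sad", "happy"]
example_predicted = ["happy", "sad", "happy", "happy"]
example_accuracy = accuracy(example_true, example_predicted)
print(f"accuracy: {example_accuracy:.2f}")
```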

In [None]:
for element, labels in val_ds:
    predictions = model.predict(element)
    batch_len = len(predictions)

    plt.figure(figsize=(10, 10))
    for idx, (image, label, prediction) in enumerate(zip(element, labels, predictions)):
        ax = plt.subplot(1, 4, idx + 1)
        img = image.numpy().astype("uint8")
        plt.imshow(img)
        pred_label = class_names[np.argmax(prediction)]
        plt.title(f"{class_names[label]} - {pred_label}")
        plt.axis("off")


## Audience poll:
### We observed some problems here


#### How can we get better?



## Improvement: Data Augmentation

In [None]:
# here we augment the data
data_augmentation = keras.Sequential(
  [
    # we randomly flip them horizontally
    layers.RandomFlip(
        "horizontal",
        input_shape=(img_height, img_width, 3)),
    # we randomly rotate them
    layers.RandomRotation(0.1),
    # we randomly zoom
    layers.RandomZoom(0.1),
  ]
)

def augment(img):
    return data_augmentation(img)
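
To make the augmentation steps more tangible, here is a small sketch in plain Python. It shows what a horizontal flip does to a made-up 2x3 image and converts the rotation factor into degrees, assuming (as in Keras) that the factor is a fraction of a full turn. Both helper functions are hypothetical and not part of Keras.

```python
def flip_horizontal(image):
    """Mirror the image from left to right by reversing every row."""
    return [row[::-1] for row in image]


def max_rotation_degrees(factor):
    """Largest rotation in degrees for a factor given as a fraction of a full turn."""
    return factor * 360.0


# made-up 2x3 grayscale image
tiny_image = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = flip_horizontal(tiny_image)
print(flipped)
print(f"rotation of up to +/- {max_rotation_degrees(0.1):.0f} degrees")
```

A flipped smiley is still happy or sad, so the network gets new training examples without anyone having to draw more smileys.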

In [None]:
plt.figure(figsize=(10, 10))
for images_raw, _ in train_ds:
    for i in range(9):
        augmented_images = augment(images_raw)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")
    break

In [None]:
model = Sequential([
    # here we add the additional data augmentation layer:
    data_augmentation,
    layers.Rescaling(1./255),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(32, activation='relu'),
    layers.Dense(num_classes)
])

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
%%time
# we train the model again, this time for more epochs,
# because augmentation makes overfitting less of a concern
epochs = 16
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

In [None]:
# same plotting code again
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

In [None]:
for element, labels in val_ds:
    predictions = model.predict(element)
    batch_len = len(predictions)

    plt.figure(figsize=(10, 10))
    for idx, (image, label, prediction) in enumerate(zip(element, labels, predictions)):
        ax = plt.subplot(1, 4, idx + 1)
        img = image.numpy().astype("uint8")
        plt.imshow(img)
        pred_label = class_names[np.argmax(prediction)]
        plt.title(f"{class_names[label]} - {pred_label}")
        plt.axis("off")


## Improvement: Dropout
Another way to reduce overfitting is "dropout": in every training step, a randomly chosen fraction of the neuron outputs is set to 0, so the network cannot rely too much on individual neurons.
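
A minimal sketch of the idea in plain Python, assuming the common "inverted dropout" variant that Keras also uses: each value is set to 0 with probability `rate`, and the remaining values are scaled up by 1 / (1 - rate) so that their expected sum stays the same. The function and the input values are made up for illustration. Outside of training, dropout is switched off and the values pass through unchanged.

```python
import random


def dropout(activations, rate, rng):
    """Randomly zero out values with probability `rate` and rescale the rest."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * keep_scale for a in activations]


rng = random.Random(2012)  # fixed seed so the example is reproducible
activations = [1.0] * 1000  # made-up neuron outputs
dropped = dropout(activations, rate=0.05, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
print(f"fraction of values set to 0: {zero_fraction:.3f}")
```

With a rate of 0.05, roughly 5% of the values are set to 0 in each step.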

In [None]:
model = Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Flatten(),
    layers.Dense(32, activation='relu'),
    # we add the dropout layer here
    layers.Dropout(0.05),
    layers.Dense(num_classes)
])

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
%%time
epochs = 12
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

## Visualization
In this section, we visualize what the neural network predicts on our validation set.

In [None]:
for element, labels in val_ds:
    predictions = model.predict(element)
    batch_len = len(predictions)

    plt.figure(figsize=(10, 10))
    for idx, (image, label, prediction) in enumerate(zip(element, labels, predictions)):
        ax = plt.subplot(1, 4, idx + 1)
        img = image.numpy().astype("uint8")
        plt.imshow(img)
        pred_label = class_names[np.argmax(prediction)]
        plt.title(f"{class_names[label]} - {pred_label}")
        plt.axis("off")


In [None]:
from IPython.display import display, Javascript, Image
# note: google.colab is only available when this notebook runs in Google Colab
from google.colab.output import eval_js
from base64 import b64decode, b64encode
import cv2
import numpy as np
import PIL
import io
import html
import time


# function to convert the JavaScript object into an OpenCV image
def js_to_image(js_reply):
  """
  Params:
          js_reply: JavaScript object containing image from webcam
  Returns:
          img: OpenCV BGR image
  """
  # decode base64 image
  image_bytes = b64decode(js_reply.split(',')[1])
  # convert bytes to numpy array
  jpg_as_np = np.frombuffer(image_bytes, dtype=np.uint8)
  # decode numpy array into OpenCV BGR image
  img = cv2.imdecode(jpg_as_np, flags=1)

  return img



def take_photo(filename='photo.jpg', quality=0.8):
  js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const capture = document.createElement('button');
      capture.textContent = 'Capture';
      div.appendChild(capture);

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: {width: { ideal: 640 },
        height: { ideal: 640 }} });

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // Wait for Capture to be clicked.
      await new Promise((resolve) => capture.onclick = resolve);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')
  display(js)

  # get photo data
  data = eval_js('takePhoto({})'.format(quality))
  # get OpenCV format image
  img = js_to_image(data)

  cv2.imwrite(filename, img)

  return filename

# Test the model on your own examples

In [None]:
# Execute this code block to save a picture from your webcam
try:
  file = take_photo('photo.jpg')
  display(Image(file))
except Exception as err:
  # Errors will be thrown if the user does not have a webcam or if they do not
  # grant the page permission to access it.
  print(str(err))

In [None]:
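img = PIL.Image.open('photo.jpg')
# PIL expects (width, height); the resulting array has shape (height, width, channels)
img = np.array(img.resize((img_width, img_height)))
# keep the three color channels and add a batch dimension: (1, height, width, 3)
img = np.reshape(img[:, :, :3], (1, img_height, img_width, 3))
# the model outputs logits, so we apply softmax to turn them into probabilities
softmax_layer = layers.Softmax()
prediction = softmax_layer(model.predict(img)).numpy()[0]
pred_class = np.argmax(prediction)
print("Prediction: ", class_names[pred_class], " Probability: ", prediction[pred_class])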
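
The softmax step is what turns the raw outputs of the network into the probability that is printed. A minimal sketch in plain Python with made-up raw outputs (not actual predictions of our model):

```python
import math


def softmax(logits):
    """Turn raw outputs into probabilities that sum to 1."""
    highest = max(logits)
    # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [1.5, -0.5]  # made-up raw outputs for ["happy", "sad"]
probabilities = softmax(example_logits)
predicted_index = probabilities.index(max(probabilities))
print(probabilities, predicted_index)
```

The order of the outputs does not change, so the class with the highest raw output is also the class with the highest probability.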

# Thank you!
If you read this you really deserve a big THANK YOU!

THANK YOU for participating, you are awesome!
Now it's up to you! Use this notebook as a template and train your own network!
Just swap in folders with your own training data; you can even add more classes or whatever you like!
Be creative! Have FUN!

## Contact
If you have any questions, don't hesitate to get in touch:
* LinkedIn: https://www.linkedin.com/in/paul-puntschart-279506a2/
* Email:
  * paul.puntschart@cloudflight.io
  * marcel.brunnbauer@cloudflight.io
  * john.uroko@cloudflight.io