The following is from [this article](https://medium.com/towards-data-science/anomaly-detection-in-tensorflow-and-keras-using-the-autoencoder-method-5600aca29c50) in Medium.

In this tutorial, I will explain in detail how an autoencoder works with a working example.

For this example, I chose to use a [public dataset](https://github.com/AlexOlsen/DeepWeeds) (Apache License 2.0) named deep_weeds.

In [1]:
import tensorflow as tf

import tensorflow_datasets as tfds

In [2]:
ds = tfds.load("deep_weeds", split="train", shuffle_files=True)

In [3]:
ds

<PrefetchDataset element_spec={'image': TensorSpec(shape=(256, 256, 3), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>

# Data Preparation

We need to prepare a dataset for this unsupervised anomaly detection example. Only one class will be taken as our main class that will be considered as the valid class. And I will put a few data from another class as an anomaly. Then we will develop the model to see if we can find that few anomaly data.

I chose class 5 as the valid class and class 1 as the anomaly. In the code block below, I am taking all the data of classes 5 and 1 first and creating lists of the images and their corresponding labels.

In [4]:
import numpy as np

In [5]:
images_main = []
images_anomaly = []
labels_main = []
labels_anomaly = []

In [6]:
ds = ds.prefetch(tf.data.AUTOTUNE)

In [7]:
ds

<PrefetchDataset element_spec={'image': TensorSpec(shape=(256, 256, 3), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>

In [8]:
for example in ds:
    # print(np.array(example['label']))
    if np.array(example["label"]) == 5:
        images_main.append(example["image"])
        labels_main.append(example["label"])
    if np.array(example["label"]) == 1:
        images_anomaly.append(example["image"])
        labels_anomaly.append(example["label"])

Let’s see the shape of the main image (images of class 5) data here:

In [9]:
np.array(images_main).shape

(1009, 256, 256, 3)

The image shapes are (256, 256, 3) and we have a total of 1009 data for class 5.

However, we do not need all the data from class 1. Because class 1 is the anomaly class. So, only 1% of the class 1 data will be taken for the training.

In [10]:
parc = round(len(labels_anomaly) * 0.01)
images_anomaly = np.array(images_anomaly)[:parc]

In [11]:
# stacking the main images and anomaly images together
total_images = np.vstack([images_main, images_anomaly])

The shape of the total_images:

In [12]:
total_images.shape

(1020, 256, 256, 3)

We have a total of 1020 images for training. As we saw earlier, we have 1009 class 5 images, and we took 1020–1009 = 11 of class 1 images which is our anomaly.

Let’s see if we can develop an autoencoder model in Keras and Tensorflow to detect these anomalies.

# Model Development

In [13]:
# import the necessary packages
import random

import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import backend as K
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import (
    Activation,
    BatchNormalization,
    Conv2D,
    Conv2DTranspose,
    Dense,
    Flatten,
    Input,
    LeakyReLU,
    Reshape,
)
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam

Some of the data should be kept separately for testing purposes. The train_test_split method from the sklearn library can be used for that. Remember, as this is an unsupervised learning method, the labels are not necessary. We will only split the images.

In [14]:
train_x, test_x = train_test_split(total_images, test_size=0.2, random_state=0)

Finally, the autoencoder model. We will build a `Convolution_Autoencoder` class which is a Convolutional Neural Network. The class has the build method where we will define the Autoencoder model.

The ‘build’ takes `width`, `depth`, `height`, `filters`, and `latentDim` as parameters. Here, width, depth, and height are the dimensions of the images that is (256, 256, 3) for us as we have seen with the `total_images.shape` method above.

The parameter `filters` is the filter for the convolution layers.

The `latentDim` is the size of our compressed layer after the encoder method.

In this build method, the first part is an encoder model which is a simple Convolutional Neural Network.

Once the encoder portion is done, a decoder model is developed using `Conv2DTranspose` layers to reconstruct the data again.

Then, we construct the autoencoder model which is actually a combination of both encoder and decoder models.

Finally, we return the encoder, decoder, and autoencoder models.

In [15]:
class Convolution_Autoencoder:
    @staticmethod
    def build(width, height, depth, filters=(16, 32, 64), latentDim=32):
        input_shape = (height, width, depth)
        chanDim = -1

        inputs = Input(shape=input_shape)
        x = inputs

        for f in filters:
            x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.3)(x)
            x = BatchNormalization(axis=chanDim)(x)

        volume = K.int_shape(x)
        x = Flatten()(x)
        latent = Dense(latentDim)(x)

        # encoder model
        encoder = Model(inputs, latent, name="encoder")

        # compressed representation
        latent_layer_input = Input(shape=(latentDim,))
        x = Dense(np.prod(volume[1:]))(latent_layer_input)

        x = Reshape((volume[1], volume[2], volume[3]))(x)

        # Recostructing the image with a decoder model
        for f in filters[::-1]:
            x = Conv2DTranspose(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.3)(x)
            x = BatchNormalization(axis=chanDim)(x)

        x = Conv2DTranspose(depth, (3, 3), padding="same")(x)

        outputs = Activation("sigmoid")(x)

        decoder = Model(latent_layer_input, outputs, name="decoder")

        autoencoder = Model(inputs, decoder(encoder(inputs)), name="autoencoder")

        return (encoder, decoder, autoencoder)

Model development is done. It’s time to run the model and see if it works. It should run like any other TensorFlow model.

Here we will compile the model first with Adam optimizer. And also, I used a decay in the learning rate and the ‘mse’ as the loss.

In [16]:
epochs = 50
lr_start = 0.001
batchSize = 32

In [17]:
(encoder, decoder, autoencoder) = Convolution_Autoencoder.build(256, 256, 3)

In [18]:
opt = tf.keras.optimizers.legacy.Adam(learning_rate=lr_start, decay=lr_start / epochs)

In [19]:
autoencoder.compile(loss="mse", optimizer=opt)

Finally, running the model. Remember, this is an unsupervised learning method. So there won't be any label in the model training. Instead, we need to pass two training features which will be just train_x twice. If you notice the build method in the Convolution_Autoencoder class, autoencoder looks like this there:

In the Model above, we need to pass inputs which is train_x first, and then decoder(encoder(inputs)) where we need to pass the train_x again. Same for the test_x as well.

Before you begin the model training, I should warn you that it is very slow in the default setting of Google Colab. You can make it way faster by running this in the GPU. Please change the settings of your Google Colab notebook before you run this.

In [20]:
history = autoencoder.fit(
    train_x, train_x, validation_data=(test_x, test_x), epochs=30, batch_size=batchSize
)

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


As you can see there are not many changes to losses, simply because here we do not have labels. Instead, we pass the training features to it twice. Losses come from comparing the original images to the reconstructed images by autoencoders.

# Model Evaluation

Model evaluation is different from a regular supervised learning model in autoencoders as this is not a supervised learning method. Let’s do that step by step.

First, we will do the prediction as usual, which will be the decoded images by the autoencoder model.

Then, you calculate the mean squared error using the original errors and the reconstructed error and save it to the ‘errors’ list. Here is the code for that.

In [21]:
decoded = autoencoder.predict(test_x)



In [22]:
errors = []

for image, recon in zip(total_images, decoded):
    mse = np.mean((image - recon) ** 2)
    errors.append(mse)

As we have the ‘mse’ for all the images in the test set, we choose a threshold. Here I am using 95% quantile using np. quantile method and getting indices from the ‘errors’ where ‘mse’ is greater than the threshold. When ‘mse’ is greater than the threshold error we decided we will consider them as an anomaly.

In [23]:
threshold = np.quantile(errors, 0.95)

In [24]:
idxs = np.where(np.array(errors) >= threshold)[0]
idxs

array([  6,   9,  35,  55,  88, 135, 145, 184, 196, 201, 202], dtype=int64)

Now, let’s get back to the image dataset ‘total_images’ that we prepared for the training earlier. We need to check if the indices we have which are more than the threshold are actually the anomaly:

In [25]:
for i in idxs:
    if total_images[i] in images_anomaly:
        print(True)

True
True
True
True
True
True
True
True
True
True
True


Yes!! They are all anomaly data. If you count the number of ‘True’ above we have 11 ‘True’ here. We can check how many anomaly data we originally had in the ‘images_anomaly’:

In [26]:
len(images_anomaly)

11

So, we found all the anomaly data using the autoencoder model.