This notebook trains an encoder using the daml package. We will train the encoder on a series of images from the xView dataset.

In [1]:
import json
import math

import cv2
import numpy as np
from daml.metrics.outlier_detection import OD_AE

2024-02-27 09:40:38.566445: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-27 09:40:38.603752: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


After we import the necessary packages, we then need to load in the data that we will be training our encoder on.

In [2]:
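import os

file_path = "./data/unperturbed/"
im_folder = sorted(os.listdir(file_path))
ims = []
for im in im_folder:
    # cv2.imread returns None for files it cannot decode, so skip those.
    img = cv2.imread(os.path.join(file_path, im))
    if img is None:
        continue
    # OpenCV loads images as BGR; reverse the channel axis to get RGB.
    ims.append(img[:, :, ::-1])
print(len(ims))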

20


The raw xView images are too large to train on directly, so we crop each one into a set of smaller images. We do this by laying a grid over each image and creating a new image from each cell in the grid.

In [3]:
def extract_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Extracts patches from an image.

    This function splits an image into a grid of non-overlapping square
    patches of side patch_size. Cells that run past the bottom or right edge
    of the image are zero-padded to the full patch size.

    Args:
      image: A 3D numpy array representing an image (height, width, channels).
      patch_size: The side length, in pixels, of the patches to extract.

    Returns:
      A 4D numpy array of patches with shape
      (num_patches, patch_size, patch_size, channels).
    """
    # Get the shape of the image.
    height, width, channels = image.shape

    # Calculate the number of grid cells needed to cover the image.
    num_patches_h = int(math.ceil(height / patch_size))
    num_patches_w = int(math.ceil(width / patch_size))
    num_patches = num_patches_h * num_patches_w

    # Create a zero-filled array to store the patches.
    patches = np.zeros((num_patches, patch_size, patch_size, channels))

    # Iterate over the grid cells and extract each one from the image.
    for i in range(num_patches_h):
        for j in range(num_patches_w):
            # Calculate the starting and ending indices of the patch.
            start_h = i * patch_size
            end_h = min(start_h + patch_size, height)
            start_w = j * patch_size
            end_w = min(start_w + patch_size, width)

            # Extract the patch from the image.
            patch = image[start_h:end_h, start_w:end_w, :]

            # Store the patch; edge patches may be smaller than patch_size,
            # so write into the matching top-left region and leave the rest zero.
            patches[i * num_patches_w + j, : patch.shape[0], : patch.shape[1], :] = patch

    return patches

In [4]:
patch_shape = 64
# Collect the patches from every image, then stack them once at the end.
all_patches = [extract_patches(img, patch_shape) for img in ims]
img_patches = np.concatenate(all_patches, axis=0)

print(img_patches.shape)

(1280, 64, 64, 3)
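
The grid split can also be written without explicit loops. The sketch below is an alternative, not the function used in this notebook: `grid_patches` is a hypothetical helper that assumes any remainder at the bottom or right edge can be cropped away rather than padded, and it is shown on a small synthetic array instead of an xView image.

```python
import numpy as np


def grid_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an image into non-overlapping square patches using reshape."""
    height, width, channels = image.shape
    rows, cols = height // patch_size, width // patch_size

    # Crop away any remainder so both dimensions are exact multiples.
    image = image[: rows * patch_size, : cols * patch_size]

    # (rows, p, cols, p, c) -> (rows, cols, p, p, c) -> (rows * cols, p, p, c)
    return (
        image.reshape(rows, patch_size, cols, patch_size, channels)
        .swapaxes(1, 2)
        .reshape(rows * cols, patch_size, patch_size, channels)
    )


# Synthetic 4x6 "image" with 3 channels, split into 2x2 patches.
demo = np.arange(4 * 6 * 3).reshape(4, 6, 3)
demo_patches = grid_patches(demo, 2)
print(demo_patches.shape)  # (6, 2, 2, 3)
```

Patches come out in row-major grid order, the same ordering as `i * num_patches_w + j` above, and the input dtype is preserved.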


Now that we have our set of smaller images, we are ready to train our autoencoder. An autoencoder works in two stages: the encoder first compresses each image into a smaller feature space, and the decoder then attempts to recreate the original image from that feature vector. Training minimizes the difference between the reconstruction and the original image.
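
To make the encode and decode stages concrete, here is a minimal sketch that uses a linear stand-in for the autoencoder. It is not the network that `OD_AE` trains: it uses PCA (computed with an SVD), which gives the best linear encoder and decoder pair under squared reconstruction error. The data, the sizes, and the `encode` and `decode` helpers are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 flattened "patches" of 16 values each
# that lie close to a 4-dimensional subspace.
latent = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 16))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 16))

# Fit the linear encoder/decoder: the top 4 principal directions.
mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
components = vt[:4]  # shape (4, 16)


def encode(x: np.ndarray) -> np.ndarray:
    """Stage 1: project inputs into the smaller feature space."""
    return (x - mean) @ components.T


def decode(z: np.ndarray) -> np.ndarray:
    """Stage 2: reconstruct inputs from their feature vectors."""
    return z @ components + mean


codes = encode(data)
reconstruction = decode(codes)
mse = float(np.mean((data - reconstruction) ** 2))
print(codes.shape, reconstruction.shape)  # (200, 4) (200, 16)
```

The reconstruction error `mse` plays the same role as the loss a trained autoencoder minimizes: it is small when the feature vectors retain enough information to rebuild the input.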

In [5]:
detector = OD_AE()
detector.fit_dataset(img_patches, epochs=1, verbose=True, batch_size=1)

2024-02-27 09:40:54.284615: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355

1280/1280 [=] - 59s 41ms/step - loss_ma: 2277.2720


We are only interested in the encoder half of the autoencoder, so we extract that portion of the model and save it to the desired output location.

In [6]:
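import pickle

# Keep only the encoder half of the trained autoencoder.
encoder = detector.detector.ae.encoder

out_file = "./model/encoder.pkl"
os.makedirs(os.path.dirname(out_file), exist_ok=True)
# json cannot serialize a model object; pickle assumes the encoder is picklable.
with open(out_file, "wb") as handle:
    pickle.dump(encoder, handle)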
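
A `.pkl` file opened in binary mode is normally written with the standard `pickle` module, since `json.dump` writes text and only handles basic types such as dicts, lists, strings, and numbers. The round trip below is a sketch that uses a hypothetical stand-in object (a plain dict) in place of the trained encoder and a temporary directory in place of `./model/`. Whether the real encoder object can be pickled depends on the installed framework version, so treat that as an assumption to verify.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained encoder: a plain dict of weight lists.
stand_in_encoder = {"layer_0": [0.1, -0.2, 0.3], "layer_1": [1.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    out_file = os.path.join(tmp_dir, "model", "encoder.pkl")
    # Create the output folder first; open() does not create missing folders.
    os.makedirs(os.path.dirname(out_file), exist_ok=True)

    # Write the object in binary mode, then read it back the same way.
    with open(out_file, "wb") as handle:
        pickle.dump(stand_in_encoder, handle)
    with open(out_file, "rb") as handle:
        restored = pickle.load(handle)

print(restored == stand_in_encoder)  # True
```

Only unpickle files from sources you trust, because loading a pickle can execute arbitrary code.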