# Course 4 - Project - Part 1: Feature extraction

<a name="top-1"></a>
This notebook is concerned with *Part 1: Feature extraction*.

**Contents:**
* [Step 1: Take a first look at the data set](#step-1.1)
* [Step 2: Set up a pretrained model](#step-1.2)
* [Step 3: Extract features](#step-1.3)

## Step 1: Take a first look at the data set<a name="step-1.1"></a> ([top](#top-1))
---

In [1]:
# Standard library.
import pathlib

We assume that the Swissroads data set has been downloaded and extracted into a directory named _data/_.

In [2]:
base_path = pathlib.Path.cwd() / 'data' / 'swissroads'
assert base_path.is_dir()

We see that the data set is rather small (469 images) and has already been divided into 3 subsets.

In [3]:
for kind in ['train', 'valid', 'test']:
    path = base_path / kind
    num = sum(1 for _ in path.glob('**/*.png'))
    print(f'{kind}: {num} images')

train: 280 images
valid: 139 images
test: 50 images
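
To also see how the images are distributed across labels, a per-label count can help. The following is a minimal sketch, assuming the `<split>/<label>/*.png` layout that the loader in Step 3 expects; `count_images_per_label` is a hypothetical helper and not part of the original notebook.

```python
import collections
import pathlib
import typing as T


def count_images_per_label(root: pathlib.Path) -> T.Dict[str, int]:
    """Counts PNG files per label directory, assuming ``<root>/<label>/*.png``."""
    counts = collections.Counter()
    for image_path in pathlib.Path(root).glob('*/*.png'):
        # The name of the parent directory is interpreted as the label.
        counts[image_path.parent.name] += 1
    # Sort by label so that the result is easy to compare across subsets.
    return dict(sorted(counts.items()))
```

Calling `count_images_per_label(base_path / 'train')` would then return a dictionary that maps each label to its number of images.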


## Step 2: Set up a pretrained model<a name="step-1.2"></a> ([top](#top-1))
---

In [4]:
# 3rd party.
import tensorflow as tf
import tensorflow_hub as hub

We decide to use the MobileNet v2 CNN model from TensorFlow Hub.

In [5]:
# MobileNet V2.
# (An updated implementation exists but it requires TF 1.15 or TF 2.)
MOBILENET_V2_VERSION = 3  # implementation version
MOBILENET_V2_URL = f'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/{MOBILENET_V2_VERSION}'

We download and set up the pretrained model.

In [6]:
# Create graph.
img_graph = tf.Graph()

with img_graph.as_default():
    # Download module.
    module_url = MOBILENET_V2_URL
    module = hub.Module(module_url)
    
    # Get the expected size.
    height, width = hub.get_expected_image_size(module)
    
    # Create an input placeholder.
    # n [samples] x height [pixels] x width [pixels] x 3 [color channels]
    input_imgs = tf.placeholder(dtype=tf.float32, shape=[None, height, width, 3])
    
    # Get a node with the features.
    imgs_features = module(input_imgs)
    
    # Collect initializers.
    init_op = tf.group([
        tf.global_variables_initializer(), tf.tables_initializer()
    ])
    
img_graph.finalize()  # Make the graph "read-only".

INFO:tensorflow:Using /var/folders/nv/rl462mms4sg561l5l80lhh_hqg2chr/T/tfhub_modules to cache modules.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


## Step 3: Extract features<a name="step-1.3"></a> ([top](#top-1))
---

In [7]:
# Standard library.
import os
import pathlib
import typing as T

# 3rd party.
import numpy as np
import PIL as pil
import PIL.Image  # ``import PIL`` alone does not load the ``Image`` submodule.

As per the documentation of MobileNet v2, we need to resize images and scale color channels:

> The input images are expected to have color values in the range [0,1], following the common image input conventions. For this module, the size of the input images is fixed to height x width = 224 x 224 pixels.
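
The scaling and batching that this implies can be sketched with NumPy alone. This is only an illustration on a synthetic image (random pixel values stand in for a real photo, and resizing is left out); it is not the loading code used in this notebook.

```python
import numpy as np

# A synthetic 224 x 224 RGB image with 8-bit color values (stand-in for a real photo).
rng = np.random.default_rng(seed=0)
image_uint8 = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Scale the color values from [0, 255] to [0, 1].
image = image_uint8.astype(np.float32) / 255

# Add a leading "samples" axis: 1 x height x width x 3.
batch = image[np.newaxis, ...]

print(batch.shape, batch.dtype)  # (1, 224, 224, 3) float32
```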

We need to load and normalize images while keeping track of their labels. It may be possible to "hijack" an `ImageDataGenerator` by disabling all data augmentations and reading the exact number of images. Here, we decide to do it by hand.

In [8]:
def image_to_array(path: os.PathLike,
                   rescale: T.Optional[float] = 1/255,
                   target_size: T.Tuple[int, int] = (224, 224),
                   resample: int = pil.Image.BICUBIC) -> np.ndarray:
    """\
    Loads an image from file.

    Args:
        path: The path to the image file.
        rescale: An optional rescaling factor. If ``None`` or zero, no rescaling is applied.
            Otherwise, all values are multiplied by the rescaling factor.
        target_size: A tuple of integers ``(height, width)`` that specifies the dimensions to which
            the image will be resized.
        resample: The ID of an optional resampling filter for resizing. See
            https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize

    Returns:
        The image, returned as a NumPy array of shape
        ``1 [samples] x height [pixels] x width [pixels] x 3 [color channels]``.
    """
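    with pil.Image.open(path) as img:
        # Force 3 color channels (drops an alpha channel, expands grayscale).
        img = img.convert('RGB')
        # PIL expects (width, height), whereas target_size is (height, width).
        img = img.resize((target_size[1], target_size[0]), resample)
        array = np.asarray(img, dtype=np.float32)  # height x width x 3
    array = array[np.newaxis, :, :, :]  # 1 x height x width x 3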
    if rescale:  # Do not rescale if None or 0.
        array *= rescale
    return array


def load_images(path: os.PathLike,
                rescale: T.Optional[float] = 1/255,
                target_size: T.Tuple[int, int] = (224, 224),
                resample: int = pil.Image.BICUBIC) -> T.Iterator[T.Tuple[np.ndarray, str]]:
    """\
    Loads images from a directory structure. The names of the sub-directories are interpreted as
    labels for the images that they contain. The expected directory structure is::

        <path>/<label>/*.png
  
    Args:
        path: The path to the root of the directory structure.
        rescale: An optional rescaling factor. If ``None`` or zero, no rescaling is applied. 
            Otherwise, all values are multiplied by the rescaling factor.
        target_size: A tuple of integers ``(height, width)`` that specifies the dimensions to which
            the image will be resized.
        resample: The ID of an optional resampling filter for resizing. See
            https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize

    Returns:
        An iterator over pairs ``(image, label)``. Each image is returned as a NumPy array of shape
        ``1 [samples] x height [pixels] x width [pixels] x 3 [color channels]``.
    """
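    # Accept any path-like object, not only ``pathlib.Path``.
    label_paths = [entry for entry in pathlib.Path(path).iterdir() if entry.is_dir()]
    for label_path in sorted(label_paths):
        label = label_path.name
        # Sort for a reproducible order (``glob`` makes no ordering guarantee).
        image_paths = sorted(label_path.glob('*.png'))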
        for image_path in image_paths:
            array = image_to_array(image_path, rescale, target_size, resample)
            yield (array, label)


def load_dataset(path: os.PathLike,
                 rescale: T.Optional[float] = 1/255,
                 target_size: T.Tuple[int, int] = (224, 224),
                 resample: int = pil.Image.BICUBIC) -> T.Dict[str, T.Any]:
    """\
    Loads a dataset from a directory structure. The names of the sub-directories are interpreted as
    labels for the images that they contain. The expected directory structure is::

        <path>/<label>/<name>.png

    Args:
        path: The path to the root of the directory structure.
        rescale: An optional rescaling factor. If ``None`` or zero, no rescaling is applied. 
            Otherwise, all values are multiplied by the rescaling factor.
        target_size: A tuple of integers ``(height, width)`` that specifies the dimensions to which
            the image will be resized.
        resample: The ID of an optional resampling filter for resizing. See
            https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize

    Returns:
        A dataset as a dictionary with the following entries:
        
        - ``data``: The list of images, as a NumPy array of shape
          ``n [samples] x height [pixels] x width [pixels] x 3 [color channels]``. 
        - ``labels``: The list of numeric labels of the images, as a NumPy array of shape
          ``n [samples]``. Text labels can be reconstructed using ``names[labels]``.
        - ``names``: The list of unique text labels of the images, as a NumPy array of shape
          ``k [categories]``.
    """
    buf_images = []
    buf_labels = []
    for image, label in load_images(path, rescale, target_size, resample):
        buf_images.append(image)
        buf_labels.append(label)
    # Collect all images into a single array.
    images = np.concatenate(buf_images)
    # Collect all labels into a single array. 
    labels = np.array(buf_labels)
    # Figure out numeric indices for the labels.
    label_names, label_idxs = np.unique(labels, return_inverse=True)
    # We mimic the naming used in CIFAR-10.
    dataset = {
        'data': images,
        'labels': label_idxs,
        'names': label_names
    }
    return dataset
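
The label encoding in `load_dataset` relies on `np.unique` with `return_inverse=True`. A small sketch with made-up labels shows what the two returned arrays contain and how text labels are recovered with `names[labels]`:

```python
import numpy as np

# Text labels in the order in which the images were loaded (made-up values).
labels = np.array(['bike', 'car', 'bike', 'van', 'car'])

# names: the sorted unique labels.
# idxs: for each image, the index of its label in names.
names, idxs = np.unique(labels, return_inverse=True)

print(names)        # ['bike' 'car' 'van']
print(idxs)         # [0 1 0 2 1]
print(names[idxs])  # ['bike' 'car' 'bike' 'van' 'car']
```

Because `np.unique` sorts the unique values, the numeric index of a label depends only on the set of label names, not on the order in which the images were read.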

We process the training, validation and test datasets. Since the dataset is rather small, we decide to use bicubic interpolation when resizing the images and to save both the images and the extracted features in the same NPZ file (storing the decoded images this way most likely takes more disk space than the original PNG files).
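
For reference, an NPZ file with this layout can be read back with `np.load`. The sketch below uses a tiny made-up dataset and a temporary file named `example-features.npz` (both are assumptions for illustration); only the keys mirror the ones used in this notebook.

```python
import pathlib
import tempfile

import numpy as np

# A tiny made-up dataset with the same keys as the ones saved by this notebook.
dataset = {
    'data': np.zeros((2, 224, 224, 3), dtype=np.float32),
    'labels': np.array([0, 1]),
    'names': np.array(['bike', 'car']),
    'features': np.ones((2, 1280), dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = pathlib.Path(tmp_dir) / 'example-features.npz'
    # np.savez stores the arrays uncompressed; np.savez_compressed would compress them.
    np.savez(path, **dataset)

    # np.load returns a dictionary-like object; each array is read on access.
    with np.load(path) as npz:
        loaded = {key: npz[key] for key in npz.files}

print(loaded['features'].shape)           # (2, 1280)
print(loaded['names'][loaded['labels']])  # ['bike' 'car']
```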

In [9]:
separator = ''.center(80, '-')

# Create a session.
with tf.Session(graph=img_graph) as sess:
    # Initialize the session.
    sess.run(init_op)
    
    # Extract the features.
    for kind in ['train', 'valid', 'test']:
        print(separator)
        print(f'Dataset: {kind}')
        
        # Load the dataset.
        path = base_path / kind
        print(f'Loading dataset ({path})...')
        dataset = load_dataset(path)
        
        # Extract the features and add them to the dataset.
        print('Extracting features...')
        features = sess.run(imgs_features, feed_dict={input_imgs: dataset['data']})
        print(f'Features: shape={features.shape}, dtype={features.dtype}')
        dataset['features'] = features
        
        # Save the dataset.
        output_path = pathlib.Path.cwd() / 'data' / f'swissroads-features-{kind}.npz'
        print(f'Saving ({output_path})...')
        np.savez(output_path, **dataset)

--------------------------------------------------------------------------------
Dataset: train
Loading dataset (/Users/taariet1/ContEd/Adsml/git/course-04-project/data/swissroads/train)...
Extracting features...
Features: shape=(280, 1280), dtype=float32
Saving (/Users/taariet1/ContEd/Adsml/git/course-04-project/data/swissroads-features-train.npz)...
--------------------------------------------------------------------------------
Dataset: valid
Loading dataset (/Users/taariet1/ContEd/Adsml/git/course-04-project/data/swissroads/valid)...
Extracting features...
Features: shape=(139, 1280), dtype=float32
Saving (/Users/taariet1/ContEd/Adsml/git/course-04-project/data/swissroads-features-valid.npz)...
--------------------------------------------------------------------------------
Dataset: test
Loading dataset (/Users/taariet1/ContEd/Adsml/git/course-04-project/data/swissroads/test)...
Extracting features...
Features: shape=(50, 1280), dtype=float32
Saving (/Users/taariet1/ContEd/Adsml/gi