# Feature Extraction Notebook
In this notebook I'm using pre-trained MobileNet models (V1, V2, V3 Large and V3 Small) to extract features from the 3D image dataset.

**Author**: Arthur G.
***

## Loading Dependencies
In this section I'm loading and setting up the dependencies for this notebook.

In [1]:
import os
import typing as t

# set before importing TensorFlow so the C++ log level applies from import time
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import numpy as np
import tensorflow as tf
from numpy.typing import NDArray
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.applications import MobileNetV3Small
from tensorflow.keras.applications import MobileNetV3Large
from tensorflow.keras.applications.mobilenet import MobileNet
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2

seed = 42
IMG_HEIGHT, IMG_WIDTH = 224, 224

## Helper Functions
In this section I'm writing a set of helper functions to automate feature extraction.

In [2]:
def feature_extraction_model_fn(base_model: t.Any) -> Model:
    """
    Add a GlobalAveragePooling2D layer at the end of the
    feature extractor.
    """
    x = base_model.layers[-1].output
    x = GlobalAveragePooling2D()(x)
    
    return Model(inputs=base_model.input, outputs=x)

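def feature_extract(
    dataset: NDArray[t.Any],
    model: Model,
    preprocess_fn: t.Callable[[NDArray[t.Any]], NDArray[t.Any]],
    num_processes: int = 8
) -> NDArray[t.Any]:
    """
    Preprocess a numpy dataset of image matrices and run it through the
    feature extraction model, returning one feature vector per image.

    Note: `num_processes` is currently unused; `model.predict` handles
    batching internally.
    """
    preprocessed_images = preprocess_fn(dataset)
    # the pooled output is already (n_images, n_features); squeeze only drops
    # singleton axes (including the batch axis if a single image is passed)
    features = np.squeeze(model.predict(preprocessed_images))

    return features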
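
To make the role of the pooling layer concrete, here is a minimal NumPy sketch of what global average pooling does to a batch of feature maps. The function name `global_average_pool` and the synthetic array are assumptions made for illustration; in the notebook the actual computation happens inside Keras' `GlobalAveragePooling2D`.

```python
import numpy as np

def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average each channel over the spatial axes: (N, H, W, C) -> (N, C)."""
    return feature_maps.mean(axis=(1, 2))

# synthetic batch: 4 images, 7x7 spatial grid, 1280 channels (illustrative sizes)
fake_maps = np.random.default_rng(42).random((4, 7, 7, 1280))
pooled = global_average_pool(fake_maps)
print(pooled.shape)  # (4, 1280)

# np.squeeze leaves this shape untouched, but drops the batch axis for one image
single = np.squeeze(global_average_pool(fake_maps[:1]))
print(single.shape)  # (1280,)
```

The second print shows why `np.squeeze` deserves some care: with a single input image the result becomes a flat vector rather than a one-row matrix.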

## Loading Dataset
In this section I'm loading the augmented image dataset.

In [3]:
dataset = np.load(os.path.join("..", "data", "processed", "augmented_image_dataset.npz"))
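
Before running extraction it can help to check what the archive contains. The sketch below builds a tiny stand-in archive in a temporary directory and lists each array's shape and dtype. Only the key names mirror the ones this notebook reads; the file name, array contents and sizes are made up.

```python
import os
import tempfile

import numpy as np

# build a small stand-in archive; key names follow the notebook, data is synthetic
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_dataset.npz")
np.savez(
    demo_path,
    train_images=np.zeros((6, 224, 224, 3), dtype=np.uint8),
    train_targets=np.zeros(6, dtype=np.int64),
)

# np.load on an .npz returns a lazy NpzFile: arrays are read on key access
with np.load(demo_path) as demo:
    summary = {key: (demo[key].shape, str(demo[key].dtype)) for key in demo.files}

for key, (shape, dtype) in summary.items():
    print(f"{key}: shape={shape}, dtype={dtype}")
```

The same loop over `dataset.files` should work on the real archive and gives a quick view of the image dtype and pixel layout before any preprocessing is applied.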

## Image Feature Extraction
In this section I'm using the chosen pre-trained models to extract features from the augmented image dataset.

### Defining Feature Extraction Models
In this subsection I'm defining the feature extraction models.

In [4]:
model_params = {"weights": "imagenet", "include_top": False, "input_shape": (IMG_HEIGHT, IMG_WIDTH, 3)}

# feature extraction models definition
v1_model = feature_extraction_model_fn(base_model=MobileNet(**model_params))
v2_model = feature_extraction_model_fn(base_model=MobileNetV2(**model_params))
v3_large_model = feature_extraction_model_fn(base_model=MobileNetV3Large(**model_params))
v3_small_model = feature_extraction_model_fn(base_model=MobileNetV3Small(**model_params))

# preprocessing functions definition
v1_preproc_fn = tf.keras.applications.mobilenet.preprocess_input
v2_preproc_fn = tf.keras.applications.mobilenet_v2.preprocess_input
v3_preproc_fn = tf.keras.applications.mobilenet_v3.preprocess_input
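
As a rough illustration of what these preprocessing functions do: to my knowledge, the MobileNet V1 and V2 `preprocess_input` functions rescale pixel values from [0, 255] to [-1, 1], while in recent TensorFlow versions the V3 one is a pass-through because the V3 models rescale their inputs internally by default. The NumPy function below, `scale_to_unit_range`, is a hypothetical stand-in for the V1/V2 scaling, not the Keras implementation, and it assumes the images are stored as raw [0, 255] pixels.

```python
import numpy as np

def scale_to_unit_range(images: np.ndarray) -> np.ndarray:
    """Map raw [0, 255] pixels to [-1, 1] without modifying the input array."""
    return images.astype(np.float32) / 127.5 - 1.0

pixels = np.array([0, 127.5, 255], dtype=np.float32)
scaled = scale_to_unit_range(pixels)
print(scaled)  # [-1.  0.  1.]
```

If the stored images were already normalized to [0, 1], this scaling would squeeze them into a narrow band just above -1, so checking the pixel range of the dataset first is worthwhile.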

### MobileNet V1
Extracting features with MobileNet V1.

In [5]:
v1_train = feature_extract(dataset["train_images"], model=v1_model, preprocess_fn=v1_preproc_fn)
v1_test = feature_extract(dataset["test_images"], model=v1_model, preprocess_fn=v1_preproc_fn)
v1_valid = feature_extract(dataset["validation_images"], model=v1_model, preprocess_fn=v1_preproc_fn)



### MobileNet V2
Extracting features with MobileNet V2.

In [6]:
v2_train = feature_extract(dataset["train_images"], model=v2_model, preprocess_fn=v2_preproc_fn)
v2_test = feature_extract(dataset["test_images"], model=v2_model, preprocess_fn=v2_preproc_fn)
v2_valid = feature_extract(dataset["validation_images"], model=v2_model, preprocess_fn=v2_preproc_fn)



### MobileNet V3 Large
Extracting features with MobileNet V3 Large.

In [7]:
v3_large_train = feature_extract(dataset["train_images"], model=v3_large_model, preprocess_fn=v3_preproc_fn)
v3_large_test = feature_extract(dataset["test_images"], model=v3_large_model, preprocess_fn=v3_preproc_fn)
v3_large_valid = feature_extract(dataset["validation_images"], model=v3_large_model, preprocess_fn=v3_preproc_fn)



### MobileNet V3 Small
Extracting features with MobileNet V3 Small.

In [8]:
v3_small_train = feature_extract(dataset["train_images"], model=v3_small_model, preprocess_fn=v3_preproc_fn)
v3_small_test = feature_extract(dataset["test_images"], model=v3_small_model, preprocess_fn=v3_preproc_fn)
v3_small_valid = feature_extract(dataset["validation_images"], model=v3_small_model, preprocess_fn=v3_preproc_fn)
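
The four extraction cells above repeat the same three calls per model. One possible way to condense them is a loop over the splits that also builds the key names used later when saving. In the sketch below, `SPLIT_KEYS`, `extract_splits` and the `fake_extract` stand-in (a plain spatial mean instead of a real model) are hypothetical names introduced for illustration.

```python
import typing as t

import numpy as np

# maps the short split name used in the saved keys to the dataset key
SPLIT_KEYS = {"train": "train_images", "test": "test_images", "valid": "validation_images"}

def extract_splits(
    dataset: t.Mapping[str, np.ndarray],
    extract_fn: t.Callable[[np.ndarray], np.ndarray],
    prefix: str,
) -> t.Dict[str, np.ndarray]:
    """Apply extract_fn to every split and name results '<prefix>_<split>_features'."""
    return {
        f"{prefix}_{split}_features": extract_fn(dataset[key])
        for split, key in SPLIT_KEYS.items()
    }

def fake_extract(images: np.ndarray) -> np.ndarray:
    """Stand-in for feature_extract: averages over the spatial axes."""
    return images.mean(axis=(1, 2))

# synthetic dataset with the same keys the notebook reads
demo_dataset = {key: np.ones((2, 8, 8, 3)) for key in SPLIT_KEYS.values()}
v1_features = extract_splits(demo_dataset, fake_extract, prefix="v1")
print(sorted(v1_features))
```

With the real helpers, `extract_fn` could be `functools.partial(feature_extract, model=v1_model, preprocess_fn=v1_preproc_fn)`, and the resulting dictionaries could be unpacked into `np.savez` with `**`.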



### Serializing Augmented Dataset Features
In this subsection I'm serializing the augmented dataset's extracted features.

In [9]:
np.savez(
    os.path.join("..", "data", "finalized", "augmented_images_features_dataset.npz"),
    v1_train_features=v1_train,
    v1_test_features=v1_test,
    v1_valid_features=v1_valid,
    v2_train_features=v2_train,
    v2_test_features=v2_test,
    v2_valid_features=v2_valid,
    v3_large_train_features=v3_large_train,
    v3_large_test_features=v3_large_test,
    v3_large_valid_features=v3_large_valid,
    v3_small_train_features=v3_small_train,
    v3_small_test_features=v3_small_test,
    v3_small_valid_features=v3_small_valid,
    train_targets=dataset["train_targets"],
    test_targets=dataset["test_targets"],
    validation_targets=dataset["validation_targets"]
)
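
After saving, a quick sanity check is to reload the archive and confirm that each feature matrix has one row per target. The sketch below does this on a small synthetic archive written to a temporary directory. The feature width of 1280 and the sample count are placeholder values, and `check_alignment` is a hypothetical helper, not part of the notebook.

```python
import os
import tempfile

import numpy as np

def check_alignment(path: str, feature_key: str, target_key: str) -> bool:
    """Return True if the feature matrix has one row per target."""
    with np.load(path) as archive:
        return archive[feature_key].shape[0] == archive[target_key].shape[0]

# synthetic archive using two of the key names saved by the notebook
demo_path = os.path.join(tempfile.mkdtemp(), "demo_features.npz")
np.savez(
    demo_path,
    v2_train_features=np.zeros((5, 1280), dtype=np.float32),
    train_targets=np.arange(5),
)
aligned = check_alignment(demo_path, "v2_train_features", "train_targets")
print(aligned)  # True
```

Note that `np.savez` writes an uncompressed archive; `np.savez_compressed` takes the same arguments if file size becomes a concern.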