# Visual examination of relations between known classes for neural network classifiers

Code for creating a model for our proposed dataset (UtilityVehicles) and for saving the category representations that are then used by the visualization methods based on hierarchical clustering and multidimensional scaling (MDS).

In [1]:
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
import numpy as np
import tensorflow as tf

In [2]:
tf.keras.utils.set_random_seed(1)
tf.config.experimental.enable_op_determinism()

Preparing the dataset (reading the data, applying the preprocessing function suitable for the MobileNetV2 model).

In [3]:
image_size = (224, 224)
batch_size = 32

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "UtilityVehicles/UtilityVehicles",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='categorical',
    shuffle=False
)
normalized_ds = train_ds.map(lambda x, y: (preprocess_input(x), y))

Found 2000 files belonging to 10 classes.
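As a rough illustration of what the preprocessing step does: the MobileNetV2 `preprocess_input` function rescales pixel values from the range [0, 255] to [-1, 1]. The NumPy sketch below reproduces that scaling on a few sample values; the helper name `mobilenet_v2_scale` is hypothetical and is not part of the notebook or of Keras.

```python
import numpy as np

def mobilenet_v2_scale(pixels):
    """Hypothetical helper: map pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype("float32") / 127.5 - 1.0

# darkest, middle and brightest pixel values
example = np.array([0, 127.5, 255])
scaled = mobilenet_v2_scale(example)
print(scaled)
```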


In [4]:
# we take MobileNetV2 (pretrained on ImageNet) without its
# top layers (used for classification) - this is our feature extractor
base_model = MobileNetV2(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3)
)
# we freeze all the weights of the feature extractor.
base_model.trainable = False

Using the MobileNetV2 feature extractor trained on ImageNet, we create our specialized model that we will use for fine-tuning. It has 10 output neurons (for 10 vehicle classes).

In [5]:
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)

model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 mobilenetv2_1.00_224 (Funct  (None, 7, 7, 1280)       2257984   
 ional)                                                          
                                                                 
 global_average_pooling2d (G  (None, 1280)             0         
 lobalAveragePooling2D)                                          
                                                                 
 dense (Dense)               (None, 10)                12810     
                                                                 
Total params: 2,270,794
Trainable params: 12,810
Non-trainable params: 2,257,984
_________________________________________________________________
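The shapes and parameter counts in the summary can be checked by hand. The sketch below uses a random array as a stand-in for the feature extractor output of a single image (an assumption made only for illustration), applies global average pooling as a mean over the two spatial axes, and recomputes the number of parameters of the final dense layer (weights plus biases).

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for the feature extractor output of one image: (7, 7, 1280)
feature_map = rng.normal(size=(7, 7, 1280))

# global average pooling: mean over the two spatial axes -> (1280,)
pooled = feature_map.mean(axis=(0, 1))

n_features = pooled.shape[0]
n_classes = 10
# dense layer: one weight per (feature, class) pair plus one bias per class
dense_params = n_features * n_classes + n_classes

print(pooled.shape, dense_params)
```

The result matches the 12,810 trainable parameters reported for the `dense` layer, which are the only parameters updated during training because the feature extractor is frozen.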


Compile and train the model for vehicle classification. We use categorical crossentropy as our loss function and the Adam optimizer. We train our model for 10 epochs.

In [6]:
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(normalized_ds, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1a549afb2b0>
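For readers unfamiliar with the loss used above, the following is a minimal NumPy sketch of categorical cross-entropy for one-hot labels. It is not the Keras implementation; the rescaling of the predictions so that they sum to 1 and the clipping constant `eps` are assumptions made here so that the example also behaves sensibly for outputs that are not produced by a softmax.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Sketch of categorical cross-entropy for one-hot labels (per sample)."""
    # assumption: rescale the predictions so that each row sums to 1
    y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
    # avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -(y_true * np.log(y_pred)).sum(axis=-1)

# one sample, three classes, true class is index 1
y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.2, 0.6, 0.2]])
loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

Only the predicted probability of the true class contributes to the loss, so the value here is the negative logarithm of 0.6.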

Finally, we create a CSV file in which each row represents a single class: the row holds the weights of the corresponding output neuron, and the class name is stored in an additional `name` column.

In [7]:
# human-readable names of classes
mapping = {0: "manual_forklift_with_lift", 1: "manual_forklift_yellow",
           2: "manual_forklift_orange", 3: "petrol_lawn_mower", 4: "electric_mower",
           5: "large_trolley", 6: "trolley", 7: "robot", 8: "tractor", 9: "trailer"}

In [8]:
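import pandas as pd

# kernel of the final Dense layer: shape (1280, 10), one column per class
weights = model.layers[-1].get_weights()[0]
# move the class axis to the front so that each row holds the weights of one class
df2 = pd.DataFrame(np.moveaxis(weights, 0, -1),
                   columns=range(weights.shape[0]))
df2['name'] = list(mapping.values())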
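To see why the table has one row per class, note that a dense layer kernel has shape (features, classes), so the class axis has to be moved to the front. The toy example below uses a small made-up kernel with 4 features and 3 classes; for a 2-D array, `np.moveaxis(kernel, 0, -1)` is the same as the transpose.

```python
import numpy as np

# toy kernel (made-up values): 4 input features, 3 classes
kernel = np.arange(12, dtype=float).reshape(4, 3)

# move the feature axis to the end: shape (3, 4), one row per class
class_vectors = np.moveaxis(kernel, 0, -1)

print(class_vectors.shape)
# row i of the result is column i of the kernel, i.e. the weights of class i
print(np.array_equal(class_vectors[1], kernel[:, 1]))
```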

The CSV file saved below can be loaded, for example, into the Orange visual tool to read the category data and produce the hierarchical clustering and MDS results. We provide an example pipeline, but other implementations of these two algorithms can also be used (e.g. those available in Python's scikit-learn). We use Orange because it has several built-in interactive tools for MDS and hierarchical clustering.

In [9]:
df2.to_csv('mobilenetv2_weights_transfer_learning.csv', index=False)
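As an alternative to the Orange pipeline, the two analyses could be sketched in Python roughly as follows. This is only an illustration under several assumptions: the weights are replaced by random stand-in data with the same layout (10 classes, 1280 values each), Ward linkage on Euclidean distances and a cut into 3 clusters are arbitrary choices made here, and SciPy and scikit-learn are used in place of Orange.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.manifold import MDS

# Stand-in for the saved weights (random values, illustration only).
# With the real file, one could instead load it with pandas:
#   df = pd.read_csv("mobilenetv2_weights_transfer_learning.csv")
#   names = df["name"].tolist()
#   weights = df.drop(columns="name").to_numpy()
rng = np.random.default_rng(0)
weights = rng.normal(size=(10, 1280))
names = [f"class_{i}" for i in range(10)]

# Hierarchical clustering: Ward linkage on Euclidean distances (assumed choice)
Z = linkage(weights, method="ward")
# cut the tree into at most 3 flat clusters (arbitrary number for illustration)
cluster_ids = fcluster(Z, t=3, criterion="maxclust")

# MDS: embed the class vectors in 2-D based on their Euclidean distances
embedding = MDS(n_components=2, random_state=0).fit_transform(weights)

for name, cid, (x, y) in zip(names, cluster_ids, embedding):
    print(f"{name}: cluster {cid}, position ({x:.2f}, {y:.2f})")
```

The linkage matrix `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, and the 2-D `embedding` can be shown as a scatter plot labelled with the class names.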