# Training the Iris Dataset with InceptionV3 on CPU

### Generate image data generators for training, validation and test data


To ingest the data for training we use the Keras **ImageDataGenerator** class, which reads a directory tree in which each category has its own folder.  Earlier, during the Exploration phase, we structured the data this way, with separate folders for train, test and validation.  We'll create one generator for each of those three folders.
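
As a rough illustration of that folder convention, the sketch below builds a throwaway directory tree and derives a class-to-index mapping by sorting the subfolder names, which is the order `flow_from_directory` uses for `class_indices` when no explicit class list is passed. The folder names and the helper `class_indices_from_dir` are hypothetical and only stand in for the real `Process_Data` layout.

```python
import os
import tempfile

def class_indices_from_dir(root):
    """Map each class subfolder to an integer index, in sorted name order."""
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: idx for idx, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    train_root = os.path.join(root, "train")
    # Create the class folders out of order to show that sorting decides the index
    for name in ["Class 1", "Class 0", "Class 2"]:
        os.makedirs(os.path.join(train_root, name))
    mapping = class_indices_from_dir(train_root)
    print(mapping)
```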

At this point we also plan to use **Inception V3**, which expects **299x299** input images, so we define the width and height here and reuse them throughout the rest of the notebook.  The generator resizes every image to that size before feeding it into training, validation or testing.
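
For reference, the InceptionV3 `preprocess_input` function used below rescales each pixel channel from the 0 to 255 range to the -1 to 1 range. The helper `inception_scale` is a hypothetical stand-in that mirrors that arithmetic on single values; it is not the Keras implementation.

```python
def inception_scale(pixel):
    """Scale one 8-bit channel value from [0, 255] to [-1.0, 1.0]."""
    return pixel / 127.5 - 1.0

# Black, mid-gray and white map to the ends and center of the target range
scaled = [inception_scale(p) for p in (0, 127.5, 255)]
print(scaled)
```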

In [2]:
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.inception_v3 import preprocess_input, decode_predictions

WIDTH=299
HEIGHT=299
BATCH_SIZE=64
test_dir = 'Process_Data/test/'
train_dir = 'Process_Data/train/'
val_dir = 'Process_Data/val/'

# Training data generator (InceptionV3 preprocessing only, no augmentation)
print("\nTraining Data Set")
train_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
train_flow = train_generator.flow_from_directory(
    train_dir,
    target_size=(HEIGHT, WIDTH),
    batch_size = BATCH_SIZE
)

# Validation data generator (InceptionV3 preprocessing only, no augmentation)
print("\nValidation Data Set")
val_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
val_flow = val_generator.flow_from_directory(
    val_dir,
    target_size=(HEIGHT, WIDTH),
    batch_size = BATCH_SIZE
)

# Test data generator (InceptionV3 preprocessing only, no augmentation)
print("\nTest Data Set")
test_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
test_flow = test_generator.flow_from_directory(
    test_dir,
    target_size=(HEIGHT, WIDTH),
    batch_size = BATCH_SIZE
)


Training Data Set
Found 414 images belonging to 5 classes.

Validation Data Set
Found 27 images belonging to 5 classes.

Test Data Set
Found 66 images belonging to 5 classes.


### Optimizations for CPU


In [3]:
from keras.models import Sequential, Model, load_model
from keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard, CSVLogger
from keras import optimizers, models
from keras.layers import Dense, Dropout, GlobalAveragePooling2D
from keras import applications
from tensorflow.compat.v1.keras import backend as K
import tensorflow as tf
import os

NUM_PARALLEL_EXEC_UNITS = 4

#TensorFlow
config = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=NUM_PARALLEL_EXEC_UNITS,
    inter_op_parallelism_threads=1
)

session = tf.compat.v1.Session(config=config)
K.set_session(session)

#MKL and OpenMP
os.environ["OMP_NUM_THREADS"] = str(NUM_PARALLEL_EXEC_UNITS)
os.environ["KMP_BLOCKTIME"] = "1"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"]= "granularity=fine,verbose,compact,1,0"
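
One possible variation, sketched here under the assumption that the OpenMP runtime reads these variables when it initializes, is to export them from a small helper that runs before TensorFlow is imported. The helper name `configure_cpu_threads` is hypothetical, and `os.cpu_count()` reports logical rather than physical cores, so the resulting value is only a starting point for tuning.

```python
import os

def configure_cpu_threads(num_threads):
    """Export the OpenMP/MKL settings used above; intended to run before importing TensorFlow."""
    settings = {
        "OMP_NUM_THREADS": str(num_threads),
        "KMP_BLOCKTIME": "1",
        "KMP_SETTINGS": "1",
        "KMP_AFFINITY": "granularity=fine,verbose,compact,1,0",
    }
    os.environ.update(settings)
    return settings

# os.cpu_count() may return None, so fall back to a single thread
num_threads = os.cpu_count() or 1
applied = configure_cpu_threads(num_threads)
print(applied["OMP_NUM_THREADS"])
```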

### Initialize Training Top Layers

Start by bringing in the pre-defined **InceptionV3** network provided by Keras.  We include the ImageNet weights because we want to reuse them for Transfer Learning, which speeds up training significantly.  We leave out the **Top Layers**, since we don't want to predict the 1,000 ImageNet classes, and then modify the network to fit our dataset.

Take the base model's output and pass it through a GlobalAveragePooling2D layer, followed by a 1024-unit **Dense Layer** with ReLU activation.  We then add a final **Dense Layer** (or **Fully Connected Layer**) with a **softmax activation**, which predicts over the classes in our dataset.  To keep this versatile we size that layer from the length of the train_flow generator's class_indices, so it automatically matches the number of classes in the dataset.
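
To get a feel for how small the new head is, the following sketch counts its weights. It assumes the pooled InceptionV3 feature vector has 2048 channels and uses the five classes reported by the generators; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

POOLED_FEATURES = 2048  # assumed channel count of InceptionV3's final feature map
HIDDEN_UNITS = 1024     # the intermediate ReLU layer
NUM_CLASSES = 5         # classes found by the data generators

head_params = (dense_params(POOLED_FEATURES, HIDDEN_UNITS)
               + dense_params(HIDDEN_UNITS, NUM_CLASSES))
print(head_params)
```

These are the only weights updated while the base model stays frozen.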

Now we iterate over all the layers of the base model and freeze them by setting layer.trainable to False.  This means we'll only train the new layers that we added specifically for our dataset.

Then compile the model with the optimizer you want to use.  In this case we'll use **Adam** with a **Learning Rate** of **0.001**, and **Categorical Crossentropy** as the **loss**, since this is a multi-class classification problem.
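
As a minimal sketch of what that loss measures, the snippet below computes categorical crossentropy for a single one-hot target by hand. The probabilities are made up for illustration, and the function is a simplified stand-in rather than the Keras implementation.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [1, 0, 0]  # the first class is the correct one
confident = categorical_crossentropy(y_true, [0.7, 0.2, 0.1])
unsure = categorical_crossentropy(y_true, [0.4, 0.3, 0.3])
print(confident, unsure)
```

The loss shrinks as more probability is placed on the correct class, which is what the optimizer pushes the softmax output towards.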


In [4]:
# Initialize InceptionV3 with transfer learning
base_model = applications.InceptionV3(weights='imagenet',
                                include_top=False,
                                input_shape=(HEIGHT, WIDTH, 3))

# add a global spatial average pooling layer
x = base_model.output

x = GlobalAveragePooling2D()(x)
# and a dense layer
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(train_flow.class_indices), activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy', 'top_k_categorical_accuracy'])
model.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 299, 299, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 149, 149, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 149, 149, 32) 96          conv2d[0][0]                     
__________________________________________________________________________________________________
activation (Activation)         (None, 149, 149, 32) 0           batch_normalization[0][0]        
_______________________________________________________________________________________

### Start Training / Training Callbacks


In [5]:
import math
top_layers_file_path="top_layers.iv3.hdf5"

checkpoint = ModelCheckpoint(top_layers_file_path, monitor='loss', verbose=1, save_best_only=True, mode='min')
tb = TensorBoard(log_dir='./logs', batch_size=val_flow.batch_size, write_graph=True, update_freq='batch')
early = EarlyStopping(monitor="loss", mode="min", patience=5)
csv_logger = CSVLogger('./logs/iv3-log.csv', append=True)

history = model.fit_generator(train_flow, 
                              epochs=5, 
                              verbose=1,
                              validation_data=val_flow,
                              validation_steps=math.ceil(val_flow.samples/val_flow.batch_size),
                              steps_per_epoch=math.ceil(train_flow.samples/train_flow.batch_size),
                              callbacks=[checkpoint, early, tb, csv_logger])

Instructions for updating:
Please use Model.fit, which supports generators.
Epoch 1/5
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
Epoch 00001: loss improved from inf to 1.96147, saving model to top_layers.iv3.hdf5
Epoch 2/5
Epoch 00002: loss improved from 1.96147 to 1.02583, saving model to top_layers.iv3.hdf5
Epoch 3/5
Epoch 00003: loss improved from 1.02583 to 0.73406, saving model to top_layers.iv3.hdf5
Epoch 4/5
Epoch 00004: loss improved from 0.73406 to 0.58812, saving model to top_layers.iv3.hdf5
Epoch 5/5
Epoch 00005: loss improved from 0.58812 to 0.52100, saving model to top_layers.iv3.hdf5
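
The `steps_per_epoch` and `validation_steps` arguments above are the number of batches needed to see every image once. A quick check of that arithmetic, using the image counts reported by the generators:

```python
import math

BATCH_SIZE = 64
samples = {"train": 414, "val": 27, "test": 66}  # counts printed by the generators

# The last batch of each split is partial, hence the ceiling
steps = {name: math.ceil(count / BATCH_SIZE) for name, count in samples.items()}
print(steps)  # {'train': 7, 'val': 1, 'test': 2}
```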


### Evaluate Model

In [7]:
model.load_weights(top_layers_file_path)
loss, acc, top_5 = model.evaluate_generator(
    test_flow,
    verbose = True,
    steps=math.ceil(test_flow.samples/test_flow.batch_size))
print("Loss: ", loss)
print("Acc: ", acc)
print("Top 5: ", top_5)

Instructions for updating:
Please use Model.evaluate, which supports generators.
Loss:  1.1684497594833374
Acc:  0.5757575631141663
Top 5:  1.0
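
A note on reading the Top 5 figure: the Keras `top_k_categorical_accuracy` metric defaults to k=5, and with only five classes the true label is always among the top five scores. The sketch below uses made-up scores and a hypothetical `top_k_accuracy` helper to show why that value is 1.0 regardless of how well the model ranks the classes.

```python
def top_k_accuracy(y_true, y_prob, k=5):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for true_idx, probs in zip(y_true, y_prob):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        hits += true_idx in ranked[:k]
    return hits / len(y_true)

# Illustrative scores for three samples over five classes (not real model output)
y_true = [0, 3, 4]
y_prob = [
    [0.10, 0.50, 0.20, 0.15, 0.05],
    [0.05, 0.05, 0.10, 0.70, 0.10],
    [0.30, 0.30, 0.20, 0.15, 0.05],
]
top1 = top_k_accuracy(y_true, y_prob, k=1)
top5 = top_k_accuracy(y_true, y_prob, k=5)
print(top1, top5)
```

For a five-class problem a smaller k, such as 2 or 3, would be a more informative choice.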


### Write Labels File

When we train our network, each class is represented by a numerical value, and that value is what the network predicts for every sample in a batch.  The network itself doesn't care what the actual string class name is, only that it's optimizing for one of the n classes you have in your dataset.

So when we later use the network we need to know which class name each numerical value represents.  We can get this by iterating over the class_indices dictionary of any of the data generators and using a list comprehension to extract the class names.  We then write the names, in index order, to a text file so that each line number maps to a class name for future use.
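
A minimal round-trip sketch of that idea follows. The mapping is a hypothetical stand-in shaped like `train_flow.class_indices`, and the file is written to a temporary directory rather than to the notebook's real labels file.

```python
import os
import tempfile

# Hypothetical mapping in the shape of train_flow.class_indices
class_indices = {"Class 2": 2, "Class 0": 0, "Class 1": 1}

# Order names by their index so that line N holds the label for output N
labels = [name for name, idx in sorted(class_indices.items(), key=lambda kv: kv[1])]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "labels.txt")
    with open(path, "w") as f:
        f.write("\n".join(labels))
    # Reading the file back recovers the index-to-name lookup
    with open(path) as f:
        restored = f.read().splitlines()

print(restored)
```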


In [8]:
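# Order class names by their numeric index so line N of the file is class N
label = [k for k, v in sorted(train_flow.class_indices.items(), key=lambda kv: kv[1])]
with open('iv3-labels.txt', 'w') as file:
    file.write("\n".join(label))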

### Test Model with Sample image


In [11]:
from keras.preprocessing import image
import numpy as np
import glob
import random

file_list = glob.glob("Process_Data/test/*/*")
img_path = random.choice(file_list)
img_class = os.path.split(os.path.dirname(img_path))[1]
print("Image Category: ", img_class)
img = image.load_img(img_path, target_size=(299, 299))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print("Raw Predictions: ", preds)

top_x = 3
top_args = preds[0].argsort()[-top_x:][::-1]
preds_label = [label[p] for p in top_args]
print("\nTop " + str(top_x) + " confidence: " + " ".join(map(str, sorted(preds[0])[-top_x:][::-1])))
print("Top " + str(top_x) + " labels: " + " ".join(map(str, preds_label)))

Image Category:  Class 0
Raw Predictions:  [[0.6237114  0.13880964 0.23111694 0.00147715 0.00488484]]

Top 3 confidence: 0.6237114 0.23111694 0.13880964
Top 3 labels: Class 0 Class 2 Class 1
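
The ranking step in the cell above can be reproduced without NumPy. This sketch reuses the raw predictions printed above; the label list is an assumption (only Class 0, Class 1 and Class 2 appear in the output), and `top_n` is a hypothetical helper.

```python
preds = [0.6237114, 0.13880964, 0.23111694, 0.00147715, 0.00488484]
labels = ["Class 0", "Class 1", "Class 2", "Class 3", "Class 4"]  # assumed label order

def top_n(probs, names, n=3):
    """Return (name, probability) pairs for the n highest probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], probs[i]) for i in order[:n]]

result = top_n(preds, labels)
for name, prob in result:
    print(name, prob)
```

Pairing each label with its probability avoids sorting the indices and the confidences separately.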


### Fine Tuning the Entire Network

We previously trained only the new top layers of the network.  Now we're going to unfreeze part of the pre-trained network as well, but train it with a lower learning rate.  This lets the network refine the weights carried over from the ImageNet checkpoint without disturbing them too much.

We'll start by unfreezing the top two inception blocks in our model and then compiling the model again.  The remaining code is almost identical to the above, except that we change the file path names to indicate that this run trains the top two inception blocks.
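
The freeze-then-unfreeze pattern can be sketched on stand-in objects. The helper `unfreeze_from` and the ten dummy layers are hypothetical; a real Keras model exposes the same `trainable` attribute on each entry of `model.layers`.

```python
from types import SimpleNamespace

def unfreeze_from(layers, first_trainable):
    """Freeze layers before `first_trainable` and unfreeze the rest."""
    for index, layer in enumerate(layers):
        layer.trainable = index >= first_trainable

# Stand-in objects instead of real Keras layers
layers = [SimpleNamespace(name=f"layer_{i}", trainable=False) for i in range(10)]
unfreeze_from(layers, first_trainable=7)
trainable_names = [layer.name for layer in layers if layer.trainable]
print(trainable_names)
```

In Keras the model has to be compiled again after changing these flags for the change to take effect.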

In [12]:
model = load_model(top_layers_file_path)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use Adam with a low learning rate
model.compile(optimizer=optimizers.Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy', 'top_k_categorical_accuracy'])
model.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 299, 299, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 149, 149, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 149, 149, 32) 96          conv2d[0][0]                     
__________________________________________________________________________________________________
activation (Activation)         (None, 149, 149, 32) 0           batch_normalization[0][0]        
_______________________________________________________________________________________

In [13]:
# Start training the top two inception blocks
inception_layers_file_path="inception_layers.iv3.hdf5"
checkpoint = ModelCheckpoint(inception_layers_file_path, monitor='loss', verbose=1, save_best_only=True, mode='min')

train_flow.reset()
val_flow.reset()
history = model.fit_generator(train_flow, 
                              epochs=5, 
                              verbose=1,
                              validation_data=val_flow,
                              validation_steps=math.ceil(val_flow.samples/val_flow.batch_size),
                              steps_per_epoch=math.ceil(train_flow.samples/train_flow.batch_size),
                              callbacks=[checkpoint, early, tb, csv_logger])

Epoch 1/5
Epoch 00001: loss improved from inf to 0.60026, saving model to inception_layers.iv3.hdf5
Epoch 2/5
Epoch 00002: loss improved from 0.60026 to 0.18809, saving model to inception_layers.iv3.hdf5
Epoch 3/5
Epoch 00003: loss improved from 0.18809 to 0.07266, saving model to inception_layers.iv3.hdf5
Epoch 4/5
Epoch 00004: loss improved from 0.07266 to 0.02847, saving model to inception_layers.iv3.hdf5
Epoch 5/5
Epoch 00005: loss improved from 0.02847 to 0.01508, saving model to inception_layers.iv3.hdf5


In [14]:
#Load Trained Model and Test
model.load_weights(inception_layers_file_path)
test_flow.reset()
loss, acc, top_5 = model.evaluate_generator(
    test_flow,
    verbose = True,
    steps=math.ceil(test_flow.samples/test_flow.batch_size))
print("Loss: ", loss)
print("Acc: ", acc)
print("Top 5: ", top_5)

Loss:  2.0552220344543457
Acc:  0.6666666865348816
Top 5:  1.0


In [15]:
file_list = glob.glob("Process_Data/test/*/*")
img_path = random.choice(file_list)
img_cat = os.path.split(os.path.dirname(img_path))[1]
print("Image Category: ", img_cat)
img = image.load_img(img_path, target_size=(299, 299))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print("Raw Predictions: ", preds)

top_x = 3
top_args = preds[0].argsort()[-top_x:][::-1]
preds_label = [label[p] for p in top_args]
print("\nTop " + str(top_x) + " confidence: " + " ".join(map(str, sorted(preds[0])[-top_x:][::-1])))
print("Top " + str(top_x) + " labels: " + " ".join(map(str, preds_label)))

Image Category:  Class 0
Raw Predictions:  [[9.9689084e-01 1.7883326e-03 9.3937689e-04 2.4918217e-04 1.3226899e-04]]

Top 3 confidence: 0.99689084 0.0017883326 0.0009393769
Top 3 labels: Class 0 Class 1 Class 2
