<h1> Build the discriminative model </h1>
 
This notebook contains the code needed to build a discriminative model that classifies images into 11 types of food plus a non-food class.


In [2]:
import os
import logging
import argparse
import requests

import numpy as np

from PIL import Image
from io import BytesIO

from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.models import Model, load_model
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from keras.optimizers import SGD

In [3]:
# config logs
logging.basicConfig(level=logging.INFO)

logger = logging.getLogger(__name__)

<h4> SETTINGS </h4>
Define the settings of the model

In [14]:
BATCH_SIZE = 32
OUTPUT_MODEL_FILE = "inceptionv3-ft120_910acc.model"

IM_WIDTH, IM_HEIGHT = 299, 299  # fixed size for InceptionV3
NB_EPOCHS_FINETUNE = 100
NB_EPOCHS_TRANSFERLEARNING = 10

BAT_SIZE = 32
FC_SIZE = 1024
NB_IV3_LAYERS_TO_FREEZE = 120

SOURCE_PATH = 'Food12/'

<h3> Prepare the data </h3>

The data is read from disk, pre-processed and augmented with the ImageDataGenerator class of Keras. Its advantage is that you only have to point it at a path: if the folders follow the expected layout (one subfolder per class), the images are loaded and labelled automatically.

The pre-processing and data augmentation steps are declared when the generator is created, so they are applied as each image is loaded.

With ImageDataGenerator there is no need to worry about the size of the dataset, because Keras loads the data "on the fly" in batches, so the images are never all held in memory at once.
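To make the "on the fly" idea concrete, here is a minimal sketch of lazy batching in plain Python. It is not how Keras implements its generators: `iter_batches`, the `load` argument and the file names are hypothetical, and the sketch makes a single pass over the data, whereas the Keras generators loop indefinitely.

```python
def iter_batches(paths, batch_size, load=lambda p: p):
    """Yield one batch at a time; items are loaded only when their batch is requested."""
    for start in range(0, len(paths), batch_size):
        yield [load(p) for p in paths[start:start + batch_size]]


# hypothetical file names, standing in for images on disk
paths = ["img_{}.jpg".format(i) for i in range(70)]

batch_sizes = [len(batch) for batch in iter_batches(paths, 32)]
print(batch_sizes)  # [32, 32, 6]
```

Only one batch exists in memory at any moment, which is why the total size of the dataset does not matter.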

In [15]:
# define the food and non-food classes

classes = ['Bread', 'Dairy product', 'Dessert', 'Egg', 'Fried food', 'Meat', 'Noodles/Pasta', 'Rice', 'Seafood', 'Soup',
           'Vegetable/Fruit', 'Non food']

class_to_ix = dict(zip(classes, range(len(classes))))
ix_to_class = dict(zip(range(len(classes)), classes))

In [16]:
# define the paths of the training, validation and evaluation subsets
training_path = os.path.join(SOURCE_PATH, 'training/')
validation_path = os.path.join(SOURCE_PATH, 'validation/')
evaluation_path = os.path.join(SOURCE_PATH, 'evaluation/')

In [17]:
# Use ImageDataGenerator to use data augmentation directly.

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)

test_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
)

In [18]:
# create the image generators for training, validation and testing, so the data is loaded directly from disk
# and pre-processed as specified in the ImageDataGenerator objects above

train_generator = train_datagen.flow_from_directory(
    training_path,
    target_size=(IM_WIDTH, IM_HEIGHT),
    batch_size=BATCH_SIZE,
)

validation_generator = test_datagen.flow_from_directory(
    validation_path,
    target_size=(IM_WIDTH, IM_HEIGHT),
    batch_size=BATCH_SIZE,
)

evaluation_generator = test_datagen.flow_from_directory(
    evaluation_path,
    target_size=(IM_WIDTH, IM_HEIGHT),
    batch_size=BATCH_SIZE,
)

Found 711 images belonging to 12 classes.
Found 612 images belonging to 12 classes.
Found 577 images belonging to 12 classes.


In [19]:
# Get the number of classes
nb_classes = train_generator.num_classes

In [20]:
# Get the number of samples
nb_train_samples = train_generator.samples
nb_val_samples = validation_generator.samples
nb_eval_samples = evaluation_generator.samples

<h3> Create the model </h3>

The model is built on the InceptionV3 architecture, using the implementation provided by Keras.

The weights are initialised from the 'imagenet' pre-trained weights when the model is loaded.

The top (final) layer of the original network is left out, because we add our own layers whose output size matches the number of classes.

In [21]:
base_model = InceptionV3(weights='imagenet', include_top=False)  # include_top=False excludes final FC layer

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(FC_SIZE, activation='relu')(x)  # new FC layer, random init
predictions = Dense(nb_classes, activation='softmax')(x)  # new softmax layer

model = Model(inputs=base_model.input, outputs=predictions)  # Keras 2 argument names

  


We do not want to train every layer at this stage, so we freeze the layers of the base model (the Inception architecture); only the two new dense layers are trained.

In [22]:
for layer in base_model.layers:
    layer.trainable = False

Once the model is defined, we need to choose the optimizer, the loss and any metrics we want to track.

In [23]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

<h3> Prepare the training </h3>

Once our model architecture is defined and compiled, we can start training. Before that, to prevent over-fitting, we define an "early stopping" callback: when the validation loss (the monitored quantity) does not improve for three consecutive epochs, training stops.
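The patience rule can be sketched in a few lines of plain Python. This is an illustration of the idea, not the Keras source: `early_stop_epoch` is a hypothetical helper and the loss values are made up.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# made-up validation losses: the best value (0.8) is reached at epoch 1
stop_at = early_stop_epoch([1.0, 0.8, 0.9, 0.85, 0.81, 0.7])
print(stop_at)  # 4
```

Epochs 2, 3 and 4 fail to beat 0.8, so training stops at epoch 4 and the later value 0.7 is never reached. A larger patience trades longer training for a lower risk of stopping too early.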

In [24]:
early_tl = EarlyStopping(monitor="val_loss", patience=3)

<h3> Train the model </h3>

Here the model is trained. At the end of each epoch, the model is validated on the validation subset.
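One epoch is a full pass over a subset, so the number of batches per pass follows from the sample count and the batch size. The sketch below uses the counts printed by the generators earlier in the notebook, assuming they are listed in the order training, validation, evaluation; `steps_for` is a hypothetical helper name.

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


BATCH_SIZE = 32
# counts as reported by flow_from_directory, assumed order: training, validation, evaluation
counts = {"training": 711, "validation": 612, "evaluation": 577}

steps = {name: steps_for(n, BATCH_SIZE) for name, n in counts.items()}
print(steps)  # {'training': 23, 'validation': 20, 'evaluation': 19}
```

Rounding up rather than down makes sure the final, partially filled batch is not dropped.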

In [None]:
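history_tl = model.fit_generator(
    train_generator,
    epochs=NB_EPOCHS_TRANSFERLEARNING,
    # Keras 2 counts batches per epoch, not samples
    steps_per_epoch=int(np.ceil(nb_train_samples / BATCH_SIZE)),
    validation_data=validation_generator,
    validation_steps=int(np.ceil(nb_val_samples / BATCH_SIZE)),
    # class_weight expects a dict {class_index: weight}; 'auto' is not a documented value, so it is left out
    callbacks=[early_tl])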

  
  


Epoch 1/10

<h3> Fine tune the model </h3>

After this first training stage the performance may still not be optimal, so we change some settings and train again to improve it.

The changes we are going to make are:
 * Train more layers
 * Change the optimizer
 * Modify the learning rate
 * Apply momentum
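To show what the learning rate and the momentum do, here is a sketch of a single-weight SGD update with classical momentum, using the same values as the fine-tuning cell (learning rate 0.0001, momentum 0.9). The function name, the starting weight and the constant gradient are assumptions made for illustration.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.0001, momentum=0.9):
    """One SGD update with classical momentum for a single scalar weight."""
    velocity = momentum * velocity - lr * grad  # keep part of the previous step
    return weight + velocity, velocity


# illustrative values: start at weight 1.0 with a constant gradient of 2.0
w, v = 1.0, 0.0
for _ in range(2):
    w, v = sgd_momentum_step(w, v, grad=2.0)

print(w, v)
```

The second step is larger than the first because the velocity accumulates, while the small learning rate keeps every step tiny so that the pre-trained weights are only adjusted gently.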
 

In [None]:
# freeze the first layers, which are not going to be trained
for layer in model.layers[:NB_IV3_LAYERS_TO_FREEZE]:
    layer.trainable = False
# make the remaining layers trainable
for layer in model.layers[NB_IV3_LAYERS_TO_FREEZE:]:
    layer.trainable = True
    
# compile the model with new parameters    
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])

We can now train the model again with the new settings.

In [None]:
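history_ft = model.fit_generator(
    train_generator,
    # Keras 2 counts batches per epoch, not samples
    steps_per_epoch=int(np.ceil(nb_train_samples / BATCH_SIZE)),
    epochs=NB_EPOCHS_FINETUNE,
    validation_data=validation_generator,
    validation_steps=int(np.ceil(nb_val_samples / BATCH_SIZE)),
    # class_weight expects a dict {class_index: weight}; 'auto' is not a documented value, so it is left out
    callbacks=[early_tl])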

Now that the model is trained and we are satisfied with its performance, we save it so that it can be reused later without repeating the whole procedure.

In [None]:
# save model
model.save(OUTPUT_MODEL_FILE)

<h3> Test the model </h3>

The model is evaluated on the testing subset, which tells us how well it really performs on data it has not seen.

In [None]:
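# the number of steps must be an integer; round up so the last partial batch is included
score = model.evaluate_generator(evaluation_generator, steps=int(np.ceil(nb_eval_samples / BATCH_SIZE)))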

In [None]:
print(f'The loss of the model is {score[0]}, and its accuracy is {score[1]}')

<h3> Test on single images </h3>

Once training and testing are done, we will occasionally want the model to classify individual images.

The image can be stored locally, or a URL can be provided instead.

The input size of the model is 299 x 299, so an image of any other size has to be resized.

In [None]:
target_size = (IM_WIDTH, IM_HEIGHT)  # 299 x 299, fixed size for InceptionV3 architecture

In [None]:
# load the image or give an image url
img_str = ''

In [None]:
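# load the image and resize it if necessary
img = None
if img_str.startswith('http://') or img_str.startswith('https://'):
    response = requests.get(img_str)
    img = Image.open(BytesIO(response.content))
elif os.path.isfile(img_str):
    img = Image.open(img_str)
if img is None:
    # a bare `return` is not valid at cell level, so fail loudly instead
    raise ValueError("Could not load an image from '{}'".format(img_str))
img = img.convert('RGB')  # the model expects three colour channels
if img.size != target_size:
    img = img.resize(target_size)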

We can load the model that we saved during training.

In [None]:
model = load_model(OUTPUT_MODEL_FILE)

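The class index used by Keras does not match our own label: Keras numbers the class folders in alphanumeric order of their names, so a folder named '10' comes before one named '2'. We therefore create a map from our label to the Keras index.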
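The map can be derived instead of typed by hand. The sketch below assumes the class folders are named '0' to '11' (the keys of the map in this notebook suggest so) and reproduces the string ordering that `flow_from_directory` applies to folder names.

```python
# assumed folder names: our own labels 0..11 written as strings
folders = [str(i) for i in range(12)]

# strings are compared character by character, so '10' and '11' sort before '2'
keras_order = sorted(folders)

# our label (folder name) -> Keras index, and Keras index -> our label
class_idx_map = {name: idx for idx, name in enumerate(keras_order)}
idx_class_map = {idx: int(name) for name, idx in class_idx_map.items()}

print(keras_order)  # ['0', '1', '10', '11', '2', '3', '4', '5', '6', '7', '8', '9']
```

With a real generator the same information is available as `train_generator.class_indices`, which avoids relying on the naming assumption.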

In [None]:
class_idx_map = {'11': 3, '10': 2, '1': 1, '0': 0, '3': 5, '2': 4, '5': 7, '4': 6, '7': 9, '6': 8, '9': 11, '8': 10}
idx_class_map = {_v: int(_k) for _k, _v in class_idx_map.items()}  # .items() replaces the Python 2 .iteritems()

Before classifying the image, it needs to be pre-processed.
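For InceptionV3, `preprocess_input` rescales pixel values from the range [0, 255] to [-1, 1]. The sketch below shows that rescaling on single values; `scale_pixel` is a hypothetical name, and the real function works on whole arrays.

```python
def scale_pixel(value):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return value / 127.5 - 1.0


print(scale_pixel(0), scale_pixel(127.5), scale_pixel(255))  # -1.0 0.0 1.0
```

The other pre-processing step, `np.expand_dims(x, axis=0)`, only adds a leading batch axis, so an image of shape (299, 299, 3) becomes a batch of shape (1, 299, 299, 3).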

In [None]:
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

Once the image is ready, we can feed it to the model.

In [None]:
preds = model.predict(x)

The model returns a vector of probabilities whose length is the number of labels, so we take the label with the highest probability.
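The decoding step can be tried without a trained model. In this sketch the probability vector is made up, `decode` is a hypothetical helper, and the index map repeats the one defined in this notebook (Keras index to our label).

```python
classes = ['Bread', 'Dairy product', 'Dessert', 'Egg', 'Fried food', 'Meat',
           'Noodles/Pasta', 'Rice', 'Seafood', 'Soup', 'Vegetable/Fruit', 'Non food']

# Keras index -> our label, the inverse of class_idx_map in this notebook
idx_class_map = {0: 0, 1: 1, 2: 10, 3: 11, 4: 2, 5: 3, 6: 4, 7: 5, 8: 6, 9: 7, 10: 8, 11: 9}


def decode(probs):
    """Return (class name, probability) for the most likely class."""
    pred_idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return classes[idx_class_map[pred_idx]], probs[pred_idx]


# made-up model output: Keras index 4 is the most likely
probs = [0.01] * 12
probs[4] = 0.89

name, prob = decode(probs)
print(name, prob)  # Dessert 0.89
```

Keras index 4 corresponds to our label 2, which is why the result is 'Dessert' and not 'Fried food': skipping the index map would silently mislabel most classes.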

In [None]:
probs = preds[0]
pred_idx = np.argmax(probs)
pred_class = idx_class_map[pred_idx]
pred_class_name = ix_to_class[pred_class]

The image is now classified, and we can report the result.

In [None]:
logger.info("\tClass: '{}'. Prob: {:.5f}".format(pred_class_name, probs[pred_idx]))