# Blogpost: training and serving TensorFlow models with tf.keras

Keras is a high-level interface for neural networks that runs on top of multiple backends. Its functional API is very user-friendly, yet flexible enough to build all kinds of applications. Keras quickly gained traction after its introduction, and in 2017 the Keras API was integrated into core TensorFlow as ```tf.keras```. Although ```tf.keras``` and Keras have separate code bases, they are tightly coupled. With the [updated documentation](https://www.tensorflow.org/tutorials/) and [programmer guides](https://www.tensorflow.org/guide/keras) as of TensorFlow 1.9, ```tf.keras``` is clearly the high-level API to reach for when building neural networks with TensorFlow.

In this notebook, we will work through the process of training, exporting and serving a neural network with ```tf.keras```. As an example, we will train a convolutional neural network on the Kaggle Planet dataset to predict labels for satellite images of the Amazon forest. The goal is to illustrate an end-to-end pipeline for a real-world use case. **Note that you'll need to install the nightly version of TensorFlow, 1.11.0, to follow along until the end.**

Also, when running on Google Colab, make sure to change the runtime settings to Python 3. If you want to run the notebook on your own machine, you can use the requirements.txt file to set up a virtual environment.

# Data preparation

The data is available for download on [Kaggle](https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data). The code below will download the required data automatically. You'll need to create a [Kaggle API token](https://github.com/Kaggle/kaggle-api#api-credentials) first and upload it here.

In [None]:
# Make sure the latest version of TF is installed
!pip install tf-nightly
!pip install h5py # required for saving Keras models

In [None]:
import os

# Upload the API token.
def get_kaggle_credentials():
    token_dir = os.path.join(os.path.expanduser("~"),".kaggle")
    token_file = os.path.join(token_dir, "kaggle.json")
    if not os.path.isdir(token_dir):
        os.mkdir(token_dir)
    try:
        with open(token_file,'r') as f:
            pass
    except IOError as no_file:
        try:
            from google.colab import files
        except ImportError:
            raise no_file

        uploaded = files.upload()

        if "kaggle.json" not in uploaded:
            raise ValueError("You need an API key! see: "
                           "https://github.com/Kaggle/kaggle-api#api-credentials")
        with open(token_file, "wb") as f:
            f.write(uploaded["kaggle.json"])
        os.chmod(token_file, 0o600)  # octal mode: read/write for the owner only

get_kaggle_credentials()
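
One detail of the token setup is worth a closer look: `os.chmod` takes the permission bits as a number, and the familiar `chmod 600` from the command line is an octal value. The sketch below uses a throwaway temporary file (a stand-in, not the real `kaggle.json`) to show the difference between decimal `600` and octal `0o600`.

```python
import os
import stat
import tempfile

# Decimal 600 and octal 600 are different numbers.
print(oct(600))   # 0o1130
print(0o600)      # 384

# Create a throwaway file that stands in for the API token.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp_path = tmp.name

# Restrict the file to read/write for the owner only.
os.chmod(tmp_path, 0o600)
mode = stat.S_IMODE(os.stat(tmp_path).st_mode)
print(oct(mode))

os.remove(tmp_path)
```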

Let's download the data. In this notebook, we will only require the training data from the competition. Make sure to accept the [competition rules](https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/rules) first so that you can actually download the data.

In [None]:
!pip install kaggle

In [None]:
import kaggle
# This will download the data (600 MB)
competition_name = 'planet-understanding-the-amazon-from-space'

kaggle.api.competition_download_file(competition_name, file_name='train-jpg.tar.7z')

# This will extract the data
import subprocess
subprocess.call('7z x ./train-jpg.tar.7z'.split(' '))

import tarfile
with tarfile.open('./train-jpg.tar', 'r:') as f:
    f.extractall()

The training data consists of approximately 40,000 labeled images of the Amazon rainforest. Each image is associated with:
* Exactly one out of four possible 'weather' labels: clear, haze, cloudy or partly cloudy
* One or more out of 13 possible 'ground' labels: agriculture, bare_ground, habitation, road, water...

I constructed a Pandas Dataframe with columns for the image names and the weather and ground labels encoded as one-hot binary vectors. The dataframe is available as a .csv file on my github page:

In [None]:
import pandas as pd
df_train = pd.read_csv('https://raw.githubusercontent.com/sdcubber/keras-training-serving/master/KagglePlanetMCML.csv')
df_train.head()
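
To make the encoding concrete, here is a minimal sketch of how a weather label and a set of ground labels turn into binary vectors. The vocabularies are assumptions: the four weather classes and only the five ground classes named above, in an arbitrary order that may differ from the column order in the real CSV file. The helper names `one_hot` and `multi_hot` are illustrative as well.

```python
import ast

# Assumed vocabularies (the real dataset has 13 ground labels).
WEATHER = ['clear', 'haze', 'cloudy', 'partly_cloudy']
GROUND = ['agriculture', 'bare_ground', 'habitation', 'road', 'water']

def one_hot(label, vocabulary):
    """Exactly one position is set to 1."""
    return [1 if name == label else 0 for name in vocabulary]

def multi_hot(labels, vocabulary):
    """One position is set to 1 for every label that is present."""
    return [1 if name in labels else 0 for name in vocabulary]

weather_vec = one_hot('haze', WEATHER)
ground_vec = multi_hot({'agriculture', 'road'}, GROUND)
print(weather_vec)  # [0, 1, 0, 0]
print(ground_vec)   # [1, 0, 0, 1, 0]

# In a CSV file such a vector is stored as text;
# ast.literal_eval turns the text back into a Python list.
stored = str(ground_vec)
restored = ast.literal_eval(stored)
print(restored == ground_vec)  # True
```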

We want to train a model that can accurately predict these labels for new images. We'll try to do this with a network that has two separate outputs for the weather and the ground labels. Predicting the weather labels is an example of a *multi-class classification* problem, whereas the ground labels can be modeled as a *multi-label classification* problem. The two outputs therefore need different loss functions.
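
The difference between the two problem types shows up directly in the loss. The sketch below computes both losses by hand for a single image, in plain Python and with made-up predictions. It assumes the usual definitions: categorical cross-entropy looks only at the probability assigned to the one true class, while binary cross-entropy treats every label as an independent yes/no decision and averages over the labels.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Multi-class loss: only the probability of the true class counts."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

def binary_crossentropy(y_true, y_pred):
    """Multi-label loss: one yes/no decision per label, averaged."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)

# Made-up targets and predictions for one image
weather_true, weather_pred = [0, 1, 0, 0], [0.1, 0.7, 0.1, 0.1]
ground_true, ground_pred = [1, 0, 0, 1, 0], [0.9, 0.2, 0.1, 0.6, 0.3]

weather_loss = categorical_crossentropy(weather_true, weather_pred)
ground_loss = binary_crossentropy(ground_true, ground_pred)
print(round(weather_loss, 4))
print(round(ground_loss, 4))
```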

# Specifying the model

Since this is a computer vision problem, we will use a convolutional neural network. We will build our own model from scratch. We will go for a fairly classical configuration with some convolutional layers, relu activations and two dense classifiers on top. If you are running this notebook without a GPU, it's best to pick a small image size to reduce the training time.

In [None]:
import tensorflow as tf
IM_SIZE = 16 # image size

image_input = tf.keras.Input(shape=(IM_SIZE, IM_SIZE, 3), name='input_layer')

# Some convolutional layers
conv_1 = tf.keras.layers.Conv2D(32,
                                kernel_size=(3, 3),
                                padding='same',
                                activation='relu')(image_input)
conv_1 = tf.keras.layers.MaxPooling2D(padding='same')(conv_1)
conv_2 = tf.keras.layers.Conv2D(32,
                                kernel_size=(3, 3),
                                padding='same',
                                activation='relu')(conv_1)
conv_2 = tf.keras.layers.MaxPooling2D(padding='same')(conv_2)

# Flatten the output of the convolutional layers
conv_flat = tf.keras.layers.Flatten()(conv_2)

# Some dense layers with two separate outputs
fc_1 = tf.keras.layers.Dense(128,
                             activation='relu')(conv_flat)
fc_1 = tf.keras.layers.Dropout(0.2)(fc_1)
fc_2 = tf.keras.layers.Dense(128,
                             activation='relu')(fc_1)
fc_2 = tf.keras.layers.Dropout(0.2)(fc_2)

# Output layers: separate outputs for the weather and the ground labels
weather_output = tf.keras.layers.Dense(4,
                                       activation='softmax',
                                       name='weather')(fc_2)
ground_output = tf.keras.layers.Dense(13,
                                      activation='sigmoid',
                                      name='ground')(fc_2)
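
The two output layers differ in their activation function, and a small plain-Python sketch shows why that matters. Softmax makes the scores compete so that the probabilities sum to one, which fits the "exactly one weather label" case. Sigmoid squashes every score independently, so several ground labels can be likely at the same time. The scores are made up for illustration.

```python
import math

def softmax(logits):
    """Probabilities that compete with each other and sum to 1."""
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Independent probabilities, one per label."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

scores = [2.0, 1.0, 0.5, -1.0]
weather_probs = softmax(scores)
ground_probs = sigmoid(scores)

print(sum(weather_probs))  # sums to 1 (up to rounding)
print(sum(ground_probs))   # no such constraint
```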

We have two output layers, so these should be passed as a list of outputs when specifying the model. Note the different activation functions for the weather and the ground output layers. Conveniently, the ```tf.keras``` implementation of ```Model``` comes with the handy ```summary()``` method:

In [None]:
model = tf.keras.Model(inputs=image_input, outputs=[weather_output, ground_output])
model.summary()  # prints the summary itself; wrapping it in print() would also print "None"
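
The shapes and parameter counts reported by the summary can be reproduced by hand, which is a useful sanity check when changing the image size. The sketch below assumes the architecture defined above: 3x3 convolutions with `'same'` padding, the default 2x2 max pooling, and an RGB input. The helper names are illustrative.

```python
import math

IM_SIZE = 16  # same value as in the model definition

def pooled(size):
    # 2x2 max pooling with padding='same' halves the size, rounding up
    return math.ceil(size / 2)

def conv_params(kernel, in_channels, filters):
    # one kernel per (input channel, filter) pair, plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

side = pooled(pooled(IM_SIZE))  # 16 -> 8 -> 4
flat = side * side * 32         # size of the flattened feature map

params = {
    'conv_1': conv_params(3, 3, 32),
    'conv_2': conv_params(3, 32, 32),
    'fc_1': dense_params(flat, 128),
    'fc_2': dense_params(128, 128),
    'weather': dense_params(128, 4),
    'ground': dense_params(128, 13),
}
total_params = sum(params.values())
print(flat, total_params)  # 512 94513
```

Most of the weights sit in the first dense layer, so a larger `IM_SIZE` mainly grows `fc_1`.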

Upon compiling the model, the two different loss functions can be provided as a dictionary that maps the names of the output layers to losses:

In [None]:
model.compile(optimizer='adam',
              loss={'weather': 'categorical_crossentropy',
                    'ground': 'binary_crossentropy'})

Compiling a model configures it for training: it attaches the loss functions and lets us choose an optimization algorithm for training the network.

# Training the model

Let's train the model! I will be training this model on my laptop, which does not have enough RAM to hold the entire dataset in memory. With image data, this is very often the case. Keras provides the ```model.fit_generator()``` method, which can train on a custom Python generator that yields images from disk. However, as of Keras 2.0.6, we can use a ```Sequence``` object instead of a generator. A ```Sequence``` allows for safe multiprocessing, which can mean significant speedups and less risk of bottlenecking your GPU if you have one. The Keras documentation already provides good example code, which I will customize a bit to:
* make it work with a dataframe that maps image names to labels
* shuffle the training data after every epoch

In [None]:
import ast
import os
import numpy as np
import random
import math
from tensorflow.python.keras.preprocessing.image import img_to_array, load_img

In [None]:
def load_image(image_path, size):
    return img_to_array(load_img(image_path, target_size=(size, size))) / 255.

class KagglePlanetSequence(tf.keras.utils.Sequence):
    """
    Custom Sequence object to train a model on out-of-memory datasets. 
    """
    
    def __init__(self, df, data_path, im_size, batch_size, mode='train'):
        """
        df: pandas dataframe that contains columns with image names and labels
        data_path: path that contains the training images
        im_size: image size
        batch_size: number of images per batch
        mode: when in training mode, data will be shuffled between epochs
        """
        self.df = df
        self.batch_size = batch_size
        self.im_size = im_size
        self.mode = mode
        
        # Take labels and a list of image locations in memory
        self.wlabels = self.df['weather_labels'].apply(ast.literal_eval).tolist()
        self.glabels = self.df['ground_labels'].apply(ast.literal_eval).tolist()
        self.image_list = self.df['image_name'].apply(lambda x: os.path.join(data_path, x + '.jpg')).tolist()

        # Set up the initial sample order (shuffled in training mode)
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch
        return int(math.ceil(len(self.df) / float(self.batch_size)))

    def on_epoch_end(self):
        # Reset the sample order and reshuffle it in training mode
        self.indexes = list(range(len(self.image_list)))
        if self.mode == 'train':
            random.shuffle(self.indexes)

    def get_batch_indexes(self, idx):
        # Indexes of the samples that belong to batch number idx
        return self.indexes[idx * self.batch_size: (idx + 1) * self.batch_size]

    def get_batch_labels(self, idx):
        # Fetch a batch of labels: [weather labels, ground labels]
        batch_indexes = self.get_batch_indexes(idx)
        return [np.array([self.wlabels[i] for i in batch_indexes]),
                np.array([self.glabels[i] for i in batch_indexes])]

    def get_batch_features(self, idx):
        # Fetch a batch of images
        batch_indexes = self.get_batch_indexes(idx)
        return np.array([load_image(self.image_list[i], self.im_size) for i in batch_indexes])

    def __getitem__(self, idx):
        batch_x = self.get_batch_features(idx)
        batch_y = self.get_batch_labels(idx)
        return batch_x, batch_y
    
seq = KagglePlanetSequence(df_train,
                       './train-jpg/',
                       im_size=IM_SIZE,
                       batch_size=32)
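
The batching arithmetic is easy to check in isolation. The sketch below is a stripped-down stand-in (the hypothetical `ToySequence`, no TensorFlow required) that follows the same protocol: `__len__` returns the number of batches, `__getitem__` returns one batch, and `on_epoch_end` reshuffles the sample order. It is not the class used for training, only an illustration of the index bookkeeping.

```python
import math
import random

class ToySequence:
    """Minimal stand-in that mimics the Sequence protocol on a plain list."""

    def __init__(self, items, batch_size, shuffle=True):
        self.items = list(items)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.indexes = list(range(len(self.items)))
        self.on_epoch_end()

    def __len__(self):
        # The last batch may be smaller, hence the ceiling
        return math.ceil(len(self.items) / self.batch_size)

    def on_epoch_end(self):
        if self.shuffle:
            random.shuffle(self.indexes)

    def __getitem__(self, idx):
        batch = self.indexes[idx * self.batch_size:(idx + 1) * self.batch_size]
        return [self.items[i] for i in batch]

toy = ToySequence(range(10), batch_size=4)
batches = [toy[i] for i in range(len(toy))]
print(len(toy))                   # 3
print([len(b) for b in batches])  # [4, 4, 2]

# Every item shows up exactly once per epoch, whatever the order
seen = sorted(item for batch in batches for item in batch)
print(seen == list(range(10)))    # True
```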

This ```Sequence``` object can be used instead of a custom generator together with ```fit_generator()``` to train the model. Note that there is no need to provide the number of steps per epoch, since the ```__len__``` method implements that logic for the generator. Furthermore, ```tf.keras``` provides access to all the available Keras callbacks that can be used to enhance the training loop. These can be quite powerful and provide options for early stopping, learning rate scheduling, storing files for TensorBoard... Here, we will use the ```ModelCheckpoint``` callback to save the model after every epoch so that we can resume training later if we want. By default, the model architecture, training configuration, state of the optimizer and the weights are stored, such that the entire model can be recreated from a single file.
Let's train the model for a single epoch:

In [None]:
callbacks = [
    tf.keras.callbacks.ModelCheckpoint('./model.h5', verbose=1)
]

model.fit_generator(generator=seq,
                    verbose=1, 
                    epochs=1,
                    use_multiprocessing=False,
                    workers=1,
                    callbacks=callbacks)
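
With a fixed file name such as `./model.h5`, each epoch overwrites the previous checkpoint. The Keras documentation describes that the checkpoint path may contain named formatting fields that are filled in with the epoch number and the logged metrics. The sketch below only reproduces that string formatting in plain Python; the template and the loss value are illustrative assumptions, not part of the training run above.

```python
# Hypothetical path template with an epoch field and a loss field
template = './model.{epoch:02d}-{loss:.2f}.h5'

# Made-up metrics, standing in for the logs of an epoch
logs = {'loss': 0.4321}

paths = [template.format(epoch=epoch, **logs) for epoch in (1, 2, 10)]
for path in paths:
    print(path)
# ./model.01-0.43.h5
# ./model.02-0.43.h5
# ./model.10-0.43.h5
```

Keeping one file per epoch costs disk space, but makes it possible to go back to an earlier state of the model.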

If we want to fine-tune the model at a later stage, we can simply read the model file and resume training, even without explicitly recompiling:

In [None]:
another_model = tf.keras.models.load_model('./model.h5')
another_model.fit_generator(generator=seq, verbose=1, epochs=1)

Finally, it's good to verify that our ```Sequence``` effectively passes over all the data by instantiating a ```Sequence``` in test mode (that is, without shuffling) and using it to make predictions for the entire dataset:

In [None]:
test_seq = KagglePlanetSequence(df_train,
                       './train-jpg/',
                       im_size=IM_SIZE,
                       batch_size=32,
                       mode='test') # test mode disables shuffling

predictions = model.predict_generator(generator=test_seq, verbose=1)
# We get a list of two prediction arrays: one for the weather labels and one for the ground labels

In [None]:
len(predictions[1]) == len(df_train)  # Total number of images in dataset

# Wait, what about the Dataset API?

The ```tf.data``` API is a powerful library that allows us to consume data from various sources and pass it to TensorFlow models. Can we train our ```tf.keras``` model using the ```tf.data``` API instead of the ```Sequence``` object? Yes. First of all, let's serialize the images and labels together into a ```TFRecord``` file, which is the recommended format for serializing data in TensorFlow:

In [None]:
# Serialize images, together with labels, to TF records
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

tf_records_filename = './KagglePlanetTFRecord_{}'.format(IM_SIZE)
writer = tf.python_io.TFRecordWriter(tf_records_filename)

# List of image paths, np array of labels
im_list = [os.path.join('./train-jpg/', v + '.jpg') for v in df_train['image_name'].tolist()]
w_labels_arr = np.array([ast.literal_eval(l) for l in df_train['weather_labels']])
g_labels_arr = np.array([ast.literal_eval(l) for l in df_train['ground_labels']])

for i in range(len(df_train)):
    w_labels = w_labels_arr[i].astype(np.float32)
    g_labels = g_labels_arr[i].astype(np.float32)
    im = np.array(img_to_array(load_img(im_list[i], target_size=(IM_SIZE, IM_SIZE))) / 255.)
    # tobytes() returns the same raw bytes as the deprecated tostring()
    w_raw = w_labels.tobytes()
    g_raw = g_labels.tobytes()
    im_raw = im.tobytes()
    
    example = tf.train.Example(features=tf.train.Features(feature={'image': _bytes_feature(im_raw),
                                                                  'weather_labels': _bytes_feature(w_raw),
                                                                  'ground_labels': _bytes_feature(g_raw)}))
    
    writer.write(example.SerializeToString())
    
writer.close()
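
Storing arrays as raw bytes throws away their shape and dtype, so both have to be supplied again when reading the records back. The NumPy-only sketch below shows the round trip for a random stand-in image, and what goes wrong when the bytes are decoded with a different dtype than they were written with.

```python
import numpy as np

IM_SIZE = 16

# A random stand-in with the same shape and dtype as a preprocessed image
image = np.random.rand(IM_SIZE, IM_SIZE, 3).astype(np.float32)

raw = image.tobytes()  # this is what ends up in the bytes feature
print(len(raw))        # 16 * 16 * 3 values * 4 bytes each = 3072

# Decoding needs the original dtype and shape
restored = np.frombuffer(raw, dtype=np.float32).reshape(IM_SIZE, IM_SIZE, 3)
print(np.array_equal(image, restored))  # True

# Decoding with the wrong dtype yields a different number of values
wrong = np.frombuffer(raw, dtype=np.float64)
print(wrong.size)  # 384
```

This is why the labels are cast to `float32` before writing: the parsing step has to decode them with exactly the same dtype.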

After dumping the images and the labels into a ```TFRecord``` file, we can come up with another generator using the ```tf.data``` API. The idea is to instantiate a ```TFRecordDataset``` from our file and tell it how to parse the serialized data using the ```map()``` operation.

In [None]:
from tensorflow import FixedLenFeature
featdef = {
           'image': FixedLenFeature(shape=[], dtype=tf.string),
           'weather_labels': FixedLenFeature(shape=[], dtype=tf.string),
           'ground_labels': FixedLenFeature(shape=[], dtype=tf.string)
          }

In [None]:
def _parse_record(example_proto):
    ex = tf.parse_single_example(example_proto, featdef)
    
    im = tf.decode_raw(ex['image'], tf.float32)
    im = tf.reshape(im, (IM_SIZE, IM_SIZE, 3))
    
    weather = tf.decode_raw(ex['weather_labels'], tf.float32)
    ground = tf.decode_raw(ex['ground_labels'], tf.float32)
    
    return im, (weather, ground)

# Construct a dataset iterator
batch_size = 32
ds_train = tf.data.TFRecordDataset('./KagglePlanetTFRecord_{}'.format(IM_SIZE)).map(_parse_record)
ds_train = ds_train.repeat().shuffle(1000).batch(batch_size)
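
Note that `shuffle(1000)` does not shuffle the whole dataset at once. It keeps a buffer of 1000 elements, repeatedly emits a random element from that buffer and refills the free slot with the next element from the input. The plain-Python generator below is a sketch of that idea under a hypothetical name, `buffered_shuffle`; it is not TensorFlow's implementation.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        pos = rng.randrange(buffer_size)
        yield buffer[pos]        # emit a random buffered element
        buffer[pos] = item       # and replace it with the new one
    rng.shuffle(buffer)          # drain what is left at the end
    yield from buffer

data = list(range(20))
small = list(buffered_shuffle(data, buffer_size=4, seed=0))
print(sorted(small) == data)  # True: nothing is lost or duplicated
print(small[0] < 4)           # True: the first output comes from the first 4 inputs

# A buffer of size 1 does not shuffle at all
print(list(buffered_shuffle(data, buffer_size=1)) == data)  # True
```

A buffer that is small compared to the dataset therefore only mixes neighbouring records, which matters if the file was written in a sorted order.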

```Dataset``` objects provide multiple methods to produce iterator objects to loop over the data. However, as of TensorFlow 1.9, we can simply pass our ```ds_train``` directly to ```model.fit()``` to train the model:

In [None]:
model = tf.keras.Model(inputs=image_input, outputs=[weather_output, ground_output])

model.compile(optimizer='adam',
              loss={'weather': 'categorical_crossentropy',
                    'ground': 'binary_crossentropy'})

history = model.fit(ds_train, 
                    steps_per_epoch=100, # let's just take some steps
                    epochs=1)

Works nicely. This way of working opens up ```tf.keras``` for people who are used to working with ```TFRecords```. If you want to use validation data, you can just instantiate another ```Dataset``` with validation data and pass that as well to ```model.fit()```.

# Serving the model

First of all, we want to export our model in a format that the server can handle. TensorFlow provides the ```SavedModel``` format as a universal format for exporting models. Under the hood, our ```tf.keras``` model is fully specified in terms of TensorFlow objects, so we can export it just fine using TensorFlow methods.

The main idea behind exporting a model is to specify an inference computation via a signature definition. A ```SignatureDef``` is fully specified in terms of input and output tensors and is eventually stored together with the model weights. However, TensorFlow provides a convenience function ```tf.saved_model.simple_save()``` which abstracts away some of these details and works fine for most use cases:

In [None]:
# The easy way: with simple_save(), the signature def is defined implicitly
# and stored under the default serving key.
# The export path contains the name and the version of the model
import shutil

tf.keras.backend.clear_session()
tf.keras.backend.set_learning_phase(0)
model = tf.keras.models.load_model('./model.h5')

export_path = './PlanetModel/1'

# The export directory must not exist yet, so remove any previous export
if os.path.exists(export_path):
    shutil.rmtree(export_path)

# Fetch the Keras session and save the model
with tf.keras.backend.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input_image': model.input},
        outputs={t.name:t for t in model.outputs})
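
The export path ends in a version number, and TensorFlow Serving looks for numbered subdirectories under the model directory. Instead of deleting version 1 on every export, a new version can be written next to it. The helper below, `next_export_path`, is a hypothetical sketch of how the next free version number could be picked; it is tried out in a temporary directory that stands in for `./PlanetModel`.

```python
import os
import tempfile

def next_export_path(model_dir):
    """Return model_dir/<n + 1>, where n is the highest numeric version found."""
    versions = []
    if os.path.isdir(model_dir):
        versions = [int(name) for name in os.listdir(model_dir) if name.isdigit()]
    return os.path.join(model_dir, str(max(versions, default=0) + 1))

# Temporary directory standing in for ./PlanetModel
base = os.path.join(tempfile.mkdtemp(), 'PlanetModel')

first = next_export_path(base)   # .../PlanetModel/1
os.makedirs(first)               # pretend that an export was written here
second = next_export_path(base)  # .../PlanetModel/2

print(os.path.basename(first), os.path.basename(second))  # 1 2
```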

These files can be used to serve the model, as described in the blogpost, by means of the [TensorFlow Serving library](https://www.tensorflow.org/serving/).
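
As a rough idea of what a client request could look like, the sketch below builds a JSON payload in the row ("instances") format of the TensorFlow Serving REST API. Several things are assumptions here: a server running locally with the REST API on port 8501, the model being served under the name `PlanetModel`, and a random stand-in for a preprocessed image. The request is only constructed, not sent.

```python
import json
import random

IM_SIZE = 16
MODEL_NAME = 'PlanetModel'  # assumed to match the name used when serving

# Assumed address of a locally running server with the REST API enabled
url = 'http://localhost:8501/v1/models/{}:predict'.format(MODEL_NAME)

# Random stand-in for one preprocessed image, as nested lists of floats
image = [[[random.random() for _ in range(3)]
          for _ in range(IM_SIZE)]
         for _ in range(IM_SIZE)]

# 'input_image' is the input key chosen in the simple_save() call
payload = json.dumps({'instances': [{'input_image': image}]})

print(url)
print(len(json.loads(payload)['instances']))  # 1
```

The payload could then be posted to `url` with any HTTP client; the response format depends on the output names stored in the signature.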