# Training and Export

In this notebook, I train and export a model to identify dog breeds from photos.

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub

import utils

## Data

Getting the data here is easy, since I did all of the hard work in the data processing script.

First, I load in the label vocabulary from a saved numpy array.

In [2]:
label_vocab = np.load('data/labelvocab.npy')
n_classes = np.shape(label_vocab)[0]
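
For readers who have not seen the data processing script, here is a minimal sketch of how a label vocabulary like this could be built, saved, and reloaded. The breed names and the use of `np.unique` are assumptions for illustration only; the real vocabulary comes from the processing script.

```python
import os
import tempfile

import numpy as np

# Hypothetical breed labels; the real ones come from the data processing script.
labels = ["beagle", "pug", "beagle", "corgi", "pug"]

# np.unique returns the sorted distinct labels, which can serve as a vocabulary.
vocab = np.unique(labels)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "labelvocab.npy")
    np.save(path, vocab)    # write the vocabulary to disk
    loaded = np.load(path)  # read it back, as the notebook does

# The number of classes is the length of the vocabulary array.
n_classes = np.shape(loaded)[0]
print(list(loaded), n_classes)
```

The index of each label in the array is what ties a class id back to a breed name, so the same file has to be used for training and serving.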

Then, I load the basis for the transfer learning model so I can get its input size. I'm using one of the pre-trained MobileNet V2 models from TensorFlow Hub because it works very well on limited resources, so I won't need anything fancy (or expensive) to serve the model.

In [3]:
image_col = hub.image_embedding_column("image", "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/2")

height, width = hub.get_expected_image_size(image_col.module_spec)
depth = hub.get_num_image_channels(image_col.module_spec)
size = (height, width, depth)

INFO:tensorflow:Using /tmp/tfhub_modules to cache modules.
INFO:tensorflow:Downloading TF-Hub Module 'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/2'.
INFO:tensorflow:Downloaded TF-Hub Module 'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/2'.


The input function here is pretty straightforward: it loads the TFRecords from the given filename, decodes them, shuffles them, repeats them, and batches them. The function returns a lambda so I can make versions for both training and validation data.

In [4]:
def make_input_fn(fname, repeat=1, batch_size=256):
    ds = (tf.data.TFRecordDataset(fname)
          .map(lambda im: 
               utils.decode_image_example(im, size))
          .shuffle(batch_size*2) # arbitrary
          .repeat(repeat)
          .batch(batch_size)
          .prefetch(2))
    
    return lambda: ds.make_one_shot_iterator().get_next()

train_input_fn = make_input_fn('data/dogs224_train.tfrecord', 3)
valid_input_fn = make_input_fn('data/dogs224_valid.tfrecord')
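
To make the order of operations concrete, here is a pure-Python sketch of the shuffle, repeat, and batch stages. It is only an approximation of what `tf.data` does, and the helper names (`shuffle_buffer`, `batches`, `pipeline`) are made up for this illustration. It shows two things: a small shuffle buffer only shuffles locally, and because the repeat comes before the batch, a batch can straddle two passes over the data.

```python
import random


def shuffle_buffer(items, buffer_size, rng):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Once the buffer is full, emit one element chosen at random.
            yield buffer.pop(rng.randrange(len(buffer)))
    # Drain whatever is left once the input is exhausted.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


def batches(items, batch_size):
    """Group items into lists of batch_size; the last one may be smaller."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def pipeline(records, repeat=1, batch_size=4, seed=0):
    """Shuffle each pass, chain the passes together, then batch the result."""
    rng = random.Random(seed)

    def passes():
        for _ in range(repeat):
            # Same buffer size rule as the notebook: twice the batch size.
            yield from shuffle_buffer(records, batch_size * 2, rng)

    return list(batches(passes(), batch_size))


result = pipeline(list(range(10)), repeat=3, batch_size=4)
print(len(result), [len(b) for b in result])
```

With 10 records repeated 3 times, the 30 elements do not divide evenly into batches of 4, so the third batch mixes the end of the first pass with the start of the second.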

## Model

Here's the fun (and slow) part: training the model. In keeping with my theme of simplicity, I train a canned linear classifier that consumes the output of MobileNet and outputs a prediction in terms of our labels.

In [5]:
est = tf.estimator.LinearClassifier(
    [image_col],
    n_classes=n_classes,
    label_vocabulary=list(label_vocab)
)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpwppe1d9r', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fe7515dc1d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
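
As a rough picture of what a linear classifier on top of image embeddings computes, here is a NumPy sketch. It is not the estimator's implementation: the weights are random stand-ins for trained parameters, the three-breed vocabulary is hypothetical, and 1280 is assumed as the width of the MobileNet V2 (1.0, 224) feature vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 1280 is the feature-vector width of MobileNet V2 (1.0, 224);
# the vocabulary below is a hypothetical stand-in for label_vocab.
feature_dim = 1280
vocab = np.array(["beagle", "corgi", "pug"])
n_classes = len(vocab)

# Random stand-ins for trained weights and for a batch of image embeddings.
weights = rng.normal(scale=0.01, size=(feature_dim, n_classes))
bias = np.zeros(n_classes)
features = rng.normal(size=(4, feature_dim))

# A linear classifier computes one logit per class, then applies softmax.
logits = features @ weights + bias
shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# The predicted label is the vocabulary entry with the highest probability.
predicted = vocab[probs.argmax(axis=1)]
print(predicted)
```

Only the weight matrix and bias are learned in this setup; the embedding network stays fixed, which is part of why training is cheap.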


I turn down log verbosity here because TF Hub modules produce a monumental amount of log spam when they first load. I train in five rounds and print evaluation metrics from the validation data after each one.

In [6]:
tf.logging.set_verbosity(tf.logging.WARN)
for _ in range(5):
    est.train(train_input_fn)
    print(est.evaluate(valid_input_fn))

{'accuracy': 0.8022352, 'average_loss': 1.9217477, 'loss': 439.43964, 'global_step': 96}
{'accuracy': 0.8070943, 'average_loss': 1.7060735, 'loss': 390.12216, 'global_step': 192}
{'accuracy': 0.81341106, 'average_loss': 1.6882304, 'loss': 386.04202, 'global_step': 288}
{'accuracy': 0.81341106, 'average_loss': 1.680952, 'loss': 384.3777, 'global_step': 384}
{'accuracy': 0.8158406, 'average_loss': 1.6717596, 'loss': 382.2757, 'global_step': 480}
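
A quick sanity check on these numbers can be done with plain arithmetic. The sketch below assumes that `loss` is the summed loss per evaluation batch (averaged over batches) and `average_loss` is the mean loss per example, which is my understanding of how TF 1.x canned estimators report them; under that assumption their ratio is the average evaluation batch size.

```python
# Metrics copied from the evaluation output above.
metrics = [
    {'accuracy': 0.8022352, 'average_loss': 1.9217477, 'loss': 439.43964, 'global_step': 96},
    {'accuracy': 0.8070943, 'average_loss': 1.7060735, 'loss': 390.12216, 'global_step': 192},
    {'accuracy': 0.81341106, 'average_loss': 1.6882304, 'loss': 386.04202, 'global_step': 288},
    {'accuracy': 0.81341106, 'average_loss': 1.680952, 'loss': 384.3777, 'global_step': 384},
    {'accuracy': 0.8158406, 'average_loss': 1.6717596, 'loss': 382.2757, 'global_step': 480},
]

# Training steps taken in each round (difference between global_step values).
steps = [m['global_step'] for m in metrics]
steps_per_round = [b - a for a, b in zip([0] + steps[:-1], steps)]

# Assumed interpretation: loss / average_loss = average examples per eval batch.
avg_batch = [m['loss'] / m['average_loss'] for m in metrics]

# Total accuracy change from the first round to the last.
gain = metrics[-1]['accuracy'] - metrics[0]['accuracy']

print(steps_per_round)
print([round(r, 2) for r in avg_batch])
print(round(gain, 4))
```

Each round runs the same number of steps, and most of the drop in loss happens between the first and second rounds.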


My serving input function takes in a vector (of unknown length) of strings, each a serialized tf.Example whose "image" feature holds an encoded image. The images are then preprocessed and resized in the same manner as the training data (with the same function) before being sent to the model for prediction.

In [7]:
def serving_input_fn():
    receiver = tf.placeholder(tf.string, shape=[None])  # 1-D batch of serialized examples
    examples = tf.parse_example(
        receiver,
        {
            "image": tf.FixedLenFeature((), tf.string),
        }
    )
    
    
    decode_and_prep = lambda image: utils.preprocess_image(image, size[:-1])
    
    images = tf.map_fn(decode_and_prep, examples["image"],
                       tf.float32)
    
    return tf.estimator.export.ServingInputReceiver(
        {"image": images},
        receiver,
    )
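
Because the receiver is run through `tf.parse_example` with a single `"image"` string feature, a client has to send serialized `tf.Example` protos rather than raw image bytes. Here is a minimal sketch of how such a payload could be built; the helper name `make_serialized_example` and the placeholder bytes are made up for illustration.

```python
import tensorflow as tf


def make_serialized_example(image_bytes):
    """Wrap encoded image bytes in a tf.Example under the "image" key."""
    example = tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
    }))
    return example.SerializeToString()


# Placeholder bytes; a real request would carry the contents of an image file.
fake_image = b"not-a-real-jpeg"
serialized = make_serialized_example(fake_image)

# Round-trip to confirm the feature layout matches what the parser expects.
parsed = tf.train.Example.FromString(serialized)
print(parsed.features.feature["image"].bytes_list.value[0])
```

A batch of these strings is what ends up in the `receiver` placeholder at prediction time.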

In [8]:
est.export_savedmodel("serving/model/", serving_input_fn)

b'serving/model/1530162989'