# Feline Neural Network

**Author(s):** bfoo@google.com, kozyr@google.com, ronbodkin@google.com

**Reviewer(s):** 

Let's export our trained model in a protobuf format that TensorFlow Serving can load. We will then be able to make inference calls for an image, or a set of images, over the web!

In [0]:
# Enter your username:
YOUR_GMAIL_ACCOUNT = '******' # Whatever is before @gmail.com in your email address

In [0]:
# Libraries for this section:
import os
import tensorflow as tf

# TensorFlow Serving Setup

To create a TF model for TensorFlow Serving, we will use the estimator.export_savedmodel() method.

During training, estimator checkpoints store the values of variables (e.g. weights and biases). These values can be restored as long as the estimator loading the checkpoint contains the same variable names. We could copy the estimator, model_fn, cnn, etc. from steps 8-9 into the cells below, and exporting the model would work correctly.
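
The name-based matching described above can be illustrated with a toy check in plain Python. This is only a sketch and not TensorFlow's restore logic: the helper `missing_or_mismatched` and the variable names and shapes in the dictionaries are hypothetical, and the shape comparison is an added assumption about what "the same variables" has to mean in practice.

```python
def missing_or_mismatched(checkpoint_vars, model_vars):
    """Return the variables the model needs that the checkpoint cannot supply.

    Both arguments map variable name -> shape (a tuple). This is a toy
    stand-in for name-based matching, not TensorFlow code.
    """
    problems = []
    for name, shape in sorted(model_vars.items()):
        if name not in checkpoint_vars:
            problems.append((name, 'missing'))
        elif checkpoint_vars[name] != shape:
            problems.append((name, 'shape mismatch'))
    return problems


# Hypothetical variable names and shapes, for illustration only.
checkpoint = {
    'cnn/conv2d/kernel': (3, 3, 3, 16),
    'cnn/conv2d/bias': (16,),
    'global_step': (),
}
same_net = {
    'cnn/conv2d/kernel': (3, 3, 3, 16),
    'cnn/conv2d/bias': (16,),
}
wider_net = {
    'cnn/conv2d/kernel': (3, 3, 3, 32),  # Filter count changed.
    'cnn/conv2d/bias': (32,),
}

print(missing_or_mismatched(checkpoint, same_net))   # []
print(missing_or_mismatched(checkpoint, wider_net))  # Both variables mismatch.
```

Extra entries in the checkpoint (such as the step counter here) do no harm in this toy model. Only variables that the network asks for must be present.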

However, we demonstrate below that the estimator need not be identical to the one used in training, and it probably should not be: sending prediction requests over the web requires not only flattening the input tensors but also placing them within a dictionary. Consequently, our model_fn will extract and reshape the tensors before running them through the cnn. As long as the neural net is identical to the one used to train the model, we can still restore the checkpoints correctly.
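
As a rough illustration of "flattening and placing in a dictionary", here is a minimal plain-Python sketch. The helper names `flatten_image` and `make_request_features` are hypothetical. The key 'flattened_image' is the one the model function in this notebook reads. How the dictionary is serialized for an actual request is not covered here.

```python
def flatten_image(image):
    """Flatten a height x width x channels nested list into one flat list."""
    return [value for row in image for pixel in row for value in pixel]


def make_request_features(image):
    """Wrap the flattened pixels in a dictionary keyed by the feature name."""
    return {'flattened_image': flatten_image(image)}


# A tiny 2 x 2 RGB image stands in for a real 128 x 128 one.
tiny_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]
features = make_request_features(tiny_image)
print(features['flattened_image'])
# [255, 0, 0, 0, 255, 0, 0, 0, 255, 10, 20, 30]
```

The values are emitted row by row, pixel by pixel, channel by channel, which is the order a row-major reshape on the serving side expects.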

In [0]:
MODEL_DIR = os.path.join('/home', YOUR_GMAIL_ACCOUNT, 'data/output_cnn_big/')  # Directory where we store our logging and models.
SERVING_DIR = os.path.join('/home', YOUR_GMAIL_ACCOUNT, 'data/serving_cnn_model') # Where we will export our model for serving

# TensorFlow Serving Setup:
NUM_CLASSES = 2  # Identical to training!

# Network parameters: must be identical to training
CNN_KERNEL_SIZE = 3  # Receptive field will be square window with this many pixels per side.
CNN_STRIDES = 2  # Distance between consecutive receptive fields.
CNN_FILTERS = 16  # Number of filters (new receptive fields to train, i.e. new channels) in first convolutional layer.
FC_HIDDEN_UNITS = 512  # Number of hidden units in the fully connected layer of the network.

Ensure that the pixel dimensions for the network are identical to those used in the training, validation, and testing phases.

In [0]:
pixel_dim = [128, 128]
pixels = pixel_dim[0] * pixel_dim[1]

# CNN architecture

**IMPORTANT**: The cnn() function defines the neural network structure and must be IDENTICAL to that used to train the weights and biases at every layer. This is because we must load the values of saved variables (weights and biases) from the checkpoint into the same components in the network.

However, we no longer need some of the parameters such as 'dropout' (not used), 'reuse' (set to AUTO_REUSE), or 'is_training' (always false), since we are merely loading the values once and running predictions with these values. Consequently, we have removed these arguments from cnn() and explicitly assigned them below to simplify the code. Look for **"NOTE:"** in the comments below for changes, and compare with the definition in step_8_to_9.ipynb.

Nevertheless, if you keep the network definition identical to the one used during training, everything will still work!

In [0]:
# CNN architecture
# NOTE: only images is required, all other arguments removed!
def cnn(images):
  """Defines the architecture of the neural network.
  
  Will be called within generate_model_fn() below.
  
  Args: 
    images: batch of images as a 4-d tensor [batch_size, height, width, channels],
            produced by reshaping the flattened input in _cnn_model_fn().
    
  Returns:
    2-d tensor: each image's [logit(1-p), logit(p)] where p=Pr(1),
                i.e. probability that class is 1 (cat in our case).
                Note: logit(p) = logodds(p) = log(p / (1-p))
  """

  # NOTE: Reuse the variables if they exist. They will be created once upon loading the checkpoint, then reused afterwards.
  with tf.variable_scope('cnn', reuse=tf.AUTO_REUSE):
    layer_1 = tf.layers.conv2d(
      inputs=images,
      kernel_size=CNN_KERNEL_SIZE,
      strides=CNN_STRIDES,
      filters=CNN_FILTERS,
      padding='SAME',
      activation=tf.nn.relu)
    
    layer_2 = tf.layers.conv2d(
      inputs=layer_1,
      kernel_size=CNN_KERNEL_SIZE,
      strides=CNN_STRIDES,
      filters=CNN_FILTERS * (2 ** 1),
      padding='SAME',
      activation=tf.nn.relu)
    
    layer_3 = tf.layers.conv2d(
      inputs=layer_2,
      kernel_size=CNN_KERNEL_SIZE,
      strides=CNN_STRIDES,
      filters=CNN_FILTERS * (2 ** 2),
      padding='SAME',
      activation=tf.nn.relu)
    
    layer_4 = tf.layers.conv2d(
      inputs=layer_3,
      kernel_size=CNN_KERNEL_SIZE,
      strides=CNN_STRIDES,
      filters=CNN_FILTERS * (2 ** 3),
      padding='SAME',
      activation=tf.nn.relu)
    
    layer_5 = tf.layers.conv2d(
      inputs=layer_4,
      kernel_size=CNN_KERNEL_SIZE,
      strides=CNN_STRIDES,
      filters=CNN_FILTERS * (2 ** 4),
      padding='SAME',
      activation=tf.nn.relu)
    
    layer_5_flat = tf.reshape(
      layer_5, 
      shape=[-1,
             CNN_FILTERS * (2 ** 4) *
             pixels // (CNN_STRIDES ** 5) // (CNN_STRIDES ** 5)])  # Integer division: shape entries must be ints.
    
    dense_layer = tf.layers.dense(
      inputs=layer_5_flat,
      units=FC_HIDDEN_UNITS,
      activation=tf.nn.relu)
    
    # NOTE: this dropout_layer here is no longer needed, but we keep it here for comparison with training.
    dropout_layer = tf.layers.dropout( 
      inputs=dense_layer, 
      rate=0.0,  # No dropouts
      training=False)  # Never training

    return tf.layers.dense(inputs=dropout_layer, units=NUM_CLASSES)  # 2-d tensor: [logit(1-p), logit(p)] for each image in batch. 
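
The flattened size used in the reshape before the dense layer can be checked by hand. The sketch below assumes that a convolution with padding='SAME' produces an output of size ceil(input / stride) along each spatial dimension. The constants are copied from this notebook, and the helper names are hypothetical.

```python
import math

PIXEL_DIM = [128, 128]
CNN_STRIDES = 2
CNN_FILTERS = 16
NUM_CONV_LAYERS = 5


def same_padding_output(size, stride):
    """Spatial size after one convolution with padding='SAME'."""
    return int(math.ceil(size / float(stride)))


def flattened_size(pixel_dim, stride, filters, num_layers):
    """Number of values per image after the last convolutional layer."""
    height, width = pixel_dim
    for _ in range(num_layers):
        height = same_padding_output(height, stride)
        width = same_padding_output(width, stride)
    # The filter count doubles at every layer after the first.
    channels = filters * (2 ** (num_layers - 1))
    return height * width * channels


print(flattened_size(PIXEL_DIM, CNN_STRIDES, CNN_FILTERS, NUM_CONV_LAYERS))  # 4096
```

Each side shrinks 128 → 64 → 32 → 16 → 8 → 4, so every image ends up as 4 * 4 * 256 = 4096 values. Changing the pixel dimensions, strides, or filter counts changes this number, and with it the shape of the dense layer's weights.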

# Model Function

We are no longer calling the *TRAIN* and *EVAL* paths, so we only need to handle the *PREDICT* mode.

For serving, the `features` argument is a dictionary containing a flattened array of pixel values for the image, rather than a 4-d tensor as in training. However, as long as the features are parsed out, reshaped appropriately, and passed into the cnn, the output is consistent with the original estimator.predict() in steps 8 and 9.
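
To make the "parse out and reshape" step concrete, here is a minimal plain-Python sketch of the same arithmetic, without TensorFlow. It assumes row-major ordering of the flattened pixels and 8-bit values in [0, 255]. The helper names `rescale_pixel` and `reshape_flat_image` are hypothetical.

```python
def rescale_pixel(value):
    """Map a pixel value in [0, 255] to a float in [-1, 1]."""
    return (value / 255.0) * 2 - 1


def reshape_flat_image(flat, height, width, channels=3):
    """Rebuild a height x width x channels nested list from a flat list."""
    expected = height * width * channels
    if len(flat) != expected:
        raise ValueError('expected %d values, got %d' % (expected, len(flat)))
    rows = []
    for r in range(height):
        row = []
        for c in range(width):
            start = (r * width + c) * channels
            row.append([rescale_pixel(v) for v in flat[start:start + channels]])
        rows.append(row)
    return rows


# A tiny 2 x 2 RGB image stands in for a real 128 x 128 one.
flat = [255, 0, 0, 0, 255, 0, 0, 0, 255, 10, 20, 30]
image = reshape_flat_image(flat, height=2, width=2)
print(image[0][0])  # [1.0, -1.0, -1.0]
```

The length check mirrors the fixed-length requirement on the served feature: a request whose pixel count does not match the network's input size cannot be reshaped.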

In [0]:
# Model function:
def generate_model_fn():
  """Return a function that determines how TF estimator operates.

  Only PREDICT mode will be called below. The returned function _cnn_model_fn
  below exports the predicted class and predicted probability to subsequent
  saved protobuf model.

  Args:
    None

  Returns:
    _cnn_model_fn: a function that returns specs for use with TF estimator
  """

  def _cnn_model_fn(features, labels, mode):
    """A function that determines specs for the TF estimator based on mode of operation.
    
    Args: 
      features: dictionary with a single input field 'flattened_image', holding the
                pixel values of one image flattened into a 1-d array of length
                height * width * 3; it is reshaped into a 4-d tensor below
      labels: ignored since we are not training or evaluating
      mode: TF object indicating whether we're in train, eval, or predict mode.
      
    Returns:
      estim_specs: collection of tensors required for prediction (the predicted
                   class and probability, plus export_outputs for serving)
    """

    # NOTE: For serving, the features fed in will be a dict holding a flattened image.
    image_flattened = tf.convert_to_tensor(features['flattened_image'], dtype=tf.int64)
    images = tf.reshape(image_flattened, [1, pixel_dim[0], pixel_dim[1], 3])
    images = tf.to_float(images) / 255
    # Rescale from (0,1) to (-1,1) so that the "center" of the image range is 0:
    images = (images * 2) - 1
    
    # Use the cnn() to compute logits:
    logits = cnn(images)
    # We'll be evaluating these later.

    # Transform logits into predictions:
    pred_classes = tf.argmax(logits, axis=1)  # Returns 0 or 1, whichever has larger logit.
    pred_prob = tf.nn.softmax(logits=logits)[:, 1]  # Softmax gives [Pr(0), Pr(1)] per image; keep Pr(1), the cat probability.
    # Note: Softmax[i] = exp(logit[i]) / sum(exp(logit[:]))

    # NOTE: For serving (prediction mode), we add a new field export_outputs, which returns
    # a JSON of values for predicted class (0 or 1), and predicted probability of being a cat.
    if mode == tf.estimator.ModeKeys.PREDICT:
      return tf.estimator.EstimatorSpec(mode,
                                        predictions=tf.stack([tf.cast(pred_classes, dtype=tf.float32)[0],
                                                              pred_prob[0],
                                                             ]),
                                        export_outputs={'predictions': tf.estimator.export.PredictOutput(outputs={
                                            'class': pred_classes,
                                            'prob': pred_prob,
                                        })})

    # NOTE: We will not be calling the estimator using train or eval, so this path is ignored.
    # There is no need to define the loss function, optimizer, or training operation.
    return tf.estimator.EstimatorSpec(
      mode=mode,
      predictions=pred_classes,
    )

  return _cnn_model_fn
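
The way logits become a predicted class and a cat probability can be reproduced in a few lines of plain Python. This is a sketch of the arithmetic only, not the TensorFlow ops themselves. The helper names are hypothetical, and the max-subtraction is a standard numerical-stability step added here.

```python
import math


def softmax(logits):
    """Softmax[i] = exp(logit[i]) / sum(exp(logit[:]))."""
    shift = max(logits)  # Subtracting the max avoids overflow; the result is unchanged.
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (predicted class, probability of class 1) for one image's logits."""
    probs = softmax(logits)
    pred_class = max(range(len(logits)), key=lambda i: logits[i])
    return pred_class, probs[1]


pred_class, pred_prob = predict([0.0, math.log(3.0)])
print(pred_class, round(pred_prob, 6))  # 1 0.75
```

With logits [0, log 3] the exponentials are in the ratio 1 : 3, so class 1 wins with probability 3 / 4.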

# Defining the Estimator

The estimator no longer needs configs, which are primarily used for saving checkpoints and outputting training and evaluation summaries.

In [0]:
# TF Estimator:
estimator = tf.estimator.Estimator(
  model_fn=generate_model_fn(),  # Call our generate_model_fn to create model function
  model_dir=MODEL_DIR,  # Where to look for model checkpoints
  # config not needed
)

In [0]:
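def serving_input_receiver_fn():
  """Builds a receiver that parses serialized tf.Example requests with a 'flattened_image' feature."""
  # One fixed-length int64 feature: height * width * 3 (RGB) pixel values.
  feature_spec = {'flattened_image': tf.FixedLenFeature(dtype=tf.int64, shape=[pixel_dim[0] * pixel_dim[1] * 3])}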
  return tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)()

In [0]:
estimator.export_savedmodel(export_dir_base=SERVING_DIR,
                            serving_input_receiver_fn=serving_input_receiver_fn)
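
Each export is written to its own timestamped subdirectory under the export base directory, so repeated exports accumulate there. A small helper such as the following can pick out the most recent one. This is a sketch: `latest_export_dir` is a hypothetical name, and the demonstration uses a throwaway directory with made-up timestamps rather than the real SERVING_DIR.

```python
import os
import tempfile


def latest_export_dir(export_dir_base):
    """Return the newest all-digit subdirectory of export_dir_base, or None."""
    candidates = [
        name for name in os.listdir(export_dir_base)
        if name.isdigit() and os.path.isdir(os.path.join(export_dir_base, name))
    ]
    if not candidates:
        return None
    return os.path.join(export_dir_base, max(candidates, key=int))


# Demonstration on a throwaway directory with made-up timestamps.
demo_base = tempfile.mkdtemp()
for stamp in ('1520000000', '1530000000', 'temp-123'):
    os.makedirs(os.path.join(demo_base, stamp))
print(os.path.basename(latest_export_dir(demo_base)))  # 1530000000
```

Names that are not purely numeric are skipped, which ignores any stray or temporary directories sitting next to finished exports.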