Introduction
You can serve your TensorFlow models using a system called TensorFlow Serving. Serving a model means exposing it through an API so that clients can request predictions from it. TensorFlow Serving supports TensorFlow models out of the box and is designed for production environments.

To serve a model, you can use the tensorflow_model_server binary that is included in the Amazon Deep Learning AMI. Before TensorFlow Serving can serve your model, you need to serialize it. TensorFlow includes the SavedModelBuilder class to simplify the process. You give the builder the signature of the prediction model, which describes the type and shape of the model's inputs and outputs. The builder serializes the model with Google's protocol buffer format and saves it to disk as a file with a .pb extension. TensorFlow Serving supports versioning, which lets you serve multiple versions of a model.
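
The versioning mentioned above relies on a directory convention: each version is exported to its own numerically named subdirectory of the model base path, and by default the server picks the highest number. The following is a minimal sketch of that convention using only the standard library. The helpers version_dir and latest_version are hypothetical names for illustration and are not part of TensorFlow or TensorFlow Serving.

```python
import os
import tempfile


def version_dir(base_path, version):
    """Return the export directory for one model version, e.g. <base>/1."""
    return os.path.join(base_path, str(int(version)))


def latest_version(base_path):
    """Return the highest numeric subdirectory name under base_path, or None."""
    versions = [int(name) for name in os.listdir(base_path)
                if name.isdecimal()
                and os.path.isdir(os.path.join(base_path, name))]
    return max(versions) if versions else None


# Hypothetical layout: two numeric versions plus one non-numeric directory.
base = tempfile.mkdtemp()
for name in ("1", "2", "v3"):
    os.makedirs(os.path.join(base, name))

print(version_dir(base, 2))
print(latest_version(base))  # the non-numeric "v3" directory is ignored
```

Exporting a retrained model to the next number leaves earlier versions in place, so you can roll back by removing the newest directory.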

The graph you have worked with until now needs to be modified to support serving. It used constant inputs and focused only on training. When you serve a model, the inputs must be placeholders so that a client can supply the values it wants predictions for. The code in this Lab Step marks the tensors computed only for training with a train_ prefix, such as train_y_predicted. The new graph has separate paths for training and for making predictions.
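
To make the idea of two paths that share one set of weights concrete, here is a plain-Python stand-in for the graph. It is not TensorFlow code, and the function names and the noise-free sample data are assumptions chosen for illustration. The training path sees inputs that already include the constant 1 for the bias, while the prediction path takes raw x values and adds the bias weight explicitly.

```python
# Plain-Python stand-in for the two graph paths (not TensorFlow code).
# weights[0] is the bias weight and weights[1] is the slope, as in the Lab.

def train_predict(train_x_with_bias, weights):
    """Training path: each input row is (1.0, x), so the bias is a column."""
    return [row[0] * weights[0] + row[1] * weights[1]
            for row in train_x_with_bias]


def predict(x, weights):
    """Prediction path: raw x values only; the bias is added explicitly."""
    return [value * weights[1] + weights[0] for value in x]


def train(train_x_with_bias, targets, weights, learning_rate, steps):
    """Gradient descent on the L2 loss sum(error**2) / 2."""
    w = list(weights)
    for _ in range(steps):
        predicted = train_predict(train_x_with_bias, w)
        errors = [p - t for p, t in zip(predicted, targets)]
        # d(loss)/d(w_j) = sum over examples of error_i * input_ij
        grads = [sum(e * row[j] for e, row in zip(errors, train_x_with_bias))
                 for j in range(len(w))]
        w = [wj - learning_rate * g for wj, g in zip(w, grads)]
    return w


# Hypothetical noise-free samples of y = 0.5*x + 2.5
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.5 * x + 2.5 for x in xs]
rows = [(1.0, x) for x in xs]

w = train(rows, ys, weights=[0.0, 0.0], learning_rate=0.01, steps=5000)
print(w)
print(predict([2.0], w), train_predict([(1.0, 2.0)], w))
```

Both paths return the same value for the same x because they read the same weights. That is why a model trained through one path can be served through the other.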

 

In [1]:
'''Export single neuron neural network model for TensorFlow Serving'''

from __future__ import print_function
import os
import numpy as np
import tensorflow as tf

In [2]:
tf.app.flags.DEFINE_integer('model_version', 1, 'Model version number')
tf.app.flags.DEFINE_string('export_dir', 'C:\\Users\\dbwab\\awslab\\tensorflows\\models', 'Export model directory')
#tf.app.flags.DEFINE_string('export_dir', '/tmp/nn', 'Export model directory')

FLAGS = tf.app.flags.FLAGS
print(FLAGS)


C:\Users\dbwab\.conda\envs\ztdl\lib\site-packages\ipykernel_launcher.py:
  --export_dir: Export model directory
    (default: 'C:\\Users\\dbwab\\awslab\\tensorflows\\models')
  --model_version: Model version number
    (default: '1')
    (an integer)

absl.flags:
  --flagfile: Insert flag definitions from the given file into the command line.
    (default: '')
  --undefok: comma-separated list of flag names that it is okay to specify on
    the command line even if the program does not define a flag with that name.
    IMPORTANT: flags in this list that have arguments MUST use the --flag=value
    format.
    (default: '')


In [3]:
# Set up sample points perturbed away from the ideal linear relationship
# y = 0.5*x + 2.5
num_examples = 60
points = np.array([np.linspace(-1, 5, num_examples),
                   np.linspace(2, 5, num_examples)])
points += np.random.randn(2, num_examples)
train_x, train_y = points
# Include a 1 to use as the bias input for neurons
train_x_with_bias = np.array([(1., d) for d in train_x]).astype(np.float32)

In [6]:
# Training parameters
training_steps = 100
learning_rate = 0.001

with tf.Session() as sess:
  # Set up all the tensors, variables, and operations.
  input = tf.constant(train_x_with_bias)
  target = tf.constant(np.transpose([train_y]).astype(np.float32))
  # Set up placeholder for making model predictions (separate from training)
  x = tf.placeholder('float', shape=[None, 1])
  # Initialize weights with small random values
  weights = tf.Variable(tf.random_normal([2, 1], 0, 0.1))

  tf.global_variables_initializer().run()

  # Calculate the current prediction error
  train_y_predicted = tf.matmul(input, weights)
  train_y_error = tf.subtract(train_y_predicted, target)

  # Define prediction operation
  y = tf.matmul(x, weights[1:]) + weights[0]

  # Compute the L2 loss function of the error
  loss = tf.nn.l2_loss(train_y_error)

  # Train the network using an optimizer that minimizes the loss function
  update_weights = tf.train.GradientDescentOptimizer(
    learning_rate).minimize(loss)

  for _ in range(training_steps):
    # Repeatedly run the operations, updating the TensorFlow variable.
    update_weights.run()

  ## Export the Model

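  # Create a SavedModelBuilder
  # The export path is hard-coded here; the FLAGS-based version is kept
  # below for reference.
  #export_path_base = FLAGS.export_dir
  #export_path = os.path.join(export_path_base, str(FLAGS.model_version))
  export_path_base = "C:\\Users\\dbwab\\awslab\\tensorflows\\models"
  # Note: TensorFlow Serving only loads version directories with numeric
  # names (for example "1"), so rename "v1" before serving this export.
  export_path = os.path.join(export_path_base, "v1")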
  print('Exporting trained model to', export_path)
  builder = tf.saved_model.builder.SavedModelBuilder(export_path)
  
  # Build signature inputs and outputs
  tensor_info_input = tf.saved_model.utils.build_tensor_info(x)
  tensor_info_output = tf.saved_model.utils.build_tensor_info(y)

  # Create the prediction signature
  prediction_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
      inputs={'input': tensor_info_input},
      outputs={'output': tensor_info_output},
      method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

  # Provide legacy initialization op for compatibility with older version of tf
  legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

  # Build the model
  builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
      'prediction': prediction_signature,
    },
    legacy_init_op=legacy_init_op)

  # Save the model
  builder.save()

print('Complete')

Exporting trained model to C:\Users\dbwab\awslab\tensorflows\models\v1
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
Instructions for updating:
Pass your op to the equivalent parameter main_op instead.
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: C:\Users\dbwab\awslab\tensorflows\models\v1\saved_model.pb
Complete


The code that builds and saves the model is at the bottom of the cell, under the ## Export the Model comment. In the graph itself, notice the tf.placeholder x that serves as the input for making predictions. The prediction operation y = tf.matmul(x, weights[1:]) + weights[0] consumes the placeholder: weights[1:] is the weight applied to x, and weights[0] is the bias weight that the training path multiplies by the constant 1 input. The training operations stay on a separate path, which you can trace through the train_ tensors (train_y_predicted and train_y_error).
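
The placeholder x is declared with shape=[None, 1], and that shape becomes part of the exported signature. The None dimension means a client may send any number of rows, while each row must hold exactly one value. The sketch below mimics that rule for nested Python lists. matches_shape is a hypothetical helper written for this illustration; it is not how TensorFlow validates inputs internally.

```python
def matches_shape(value, shape):
    """Check a nested list against a shape where None means 'any size'."""
    if not shape:
        # No dimensions left: value must be a scalar, not a list.
        return not isinstance(value, list)
    if not isinstance(value, list):
        return False
    if shape[0] is not None and len(value) != shape[0]:
        return False
    return all(matches_shape(item, shape[1:]) for item in value)


placeholder_shape = [None, 1]  # same shape as x in the Lab code

print(matches_shape([[0.5]], placeholder_shape))                # one row
print(matches_shape([[0.5], [1.5], [2.5]], placeholder_shape))  # three rows
print(matches_shape([0.5, 1.5], placeholder_shape))             # flat list
print(matches_shape([[0.5, 1.5]], placeholder_shape))           # two values per row
```

In practice this means a client wraps every x value in its own one-element list, even when it requests a single prediction.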

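**Command to serve the model:**
    tensorflow_model_server --port=9000 --model_name=nn --model_base_path=/tmp/nn

The --model_base_path value must be the directory that holds the version subdirectories. The /tmp/nn path shown here matches the commented-out default for export_dir in the flags cell, so adjust it if you exported the model somewhere else.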
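
The command above opens a gRPC port. If the server is also started with a REST port through the --rest_api_port flag (not part of the original command; port 8501 is an assumption here), a client can request predictions with a JSON POST. The sketch below only builds the URL and the request body and sends nothing. build_predict_request is a hypothetical helper name, while the signature name 'prediction' and the input key 'input' come from the export code in this Lab Step.

```python
import json


def build_predict_request(x_values, host="localhost", rest_port=8501,
                          model_name="nn", signature_name="prediction"):
    """Build the URL and JSON body for a TensorFlow Serving REST predict call.

    rest_port is an assumption: REST is only available when the server is
    started with --rest_api_port.
    """
    url = "http://{}:{}/v1/models/{}:predict".format(
        host, rest_port, model_name)
    body = json.dumps({
        "signature_name": signature_name,
        # Columnar format: one entry per named input, shaped [batch, 1].
        "inputs": {"input": [[float(v)] for v in x_values]},
    })
    return url, body


url, body = build_predict_request([0.0, 2.0, 4.0])
print(url)
print(body)
```

You could then POST the body to the URL with any HTTP client. The response should carry one predicted y value per input row.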

Summary
In this Lab Step, you saw how to modify the neural network's graph to create separate operation paths for training and predicting. You also saw how to use the SavedModelBuilder class to save a serialized model to disk. Lastly, you used tensorflow_model_server to serve the model, which makes the trained model accessible to clients that want predictions.