# Integrating TensorFlow Distributed Image Serving with the TensorFlow Object Detection API
By Tyler LaBonte, July 2018

This Notebook is a sequel to [Serving Image-Based Deep Learning Models with TensorFlow-Serving's RESTful API](https://github.com/tmlabonte/tendies/blob/master/minimum_working_example/tendies-basic-tutorial.ipynb). Please be sure to read that article to understand the basics of TensorFlow-Serving and the TensorFlow Distributed Image Serving (Tendies) library. I highly recommend cloning [the Tendies repository](https://github.com/tmlabonte/tendies) to follow along with this tutorial, as I will be focusing on important excerpts of code rather than entire files.

Here, we will extend the functionality of the basic Tendies classes to integrate a [Faster R-CNN](https://arxiv.org/abs/1506.01497) deep neural network, which uses the [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection). This will allow us to serve the Faster R-CNN for REST-compliant remote inference, much like CycleGAN in the previous article.

CycleGAN was rather simple because it accepted an image and output an image; Faster R-CNN, however, accepts an image and outputs a dictionary of tensors. Furthermore, the Object Detection API forces us to build our model from a pipeline.config _and_ redefine the inference function, which makes serving Faster R-CNN a more difficult task. The steps for integrating a new model with Tendies are as follows:
1. Define pre- and post-processing functions in LayerInjector.
2. Create or import a model inference function in ServerBuilder.
3. Create or import a client.

While I will be demonstrating with Faster R-CNN, these steps remain the same for any arbitrary model, so feel free to follow along with your specific use case.

<img src="images/pipeline_diagram.PNG" width=700px>

# Layer Injection

Each preprocessing function must take as arguments an image bitstring, channels, and \*args (where \*args can be used to represent any number of custom positional arguments), then return the model input as a tensor. Conversely, each postprocessing function must take as arguments the model output and \*args, then return the list of output node names and a boolean indicating whether the output should be transmitted as an image. These return values will then be used in ServerBuilder when exporting the model.
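
To make this contract concrete, here is a minimal sketch in plain Python, with no TensorFlow involved. The functions `preprocess`, `inference`, `postprocess`, and `chain` are hypothetical stand-ins that I made up for illustration; they only mirror the argument and return conventions described above, not the actual Tendies implementation.

```python
# Hypothetical stand-ins that mirror the pre-/post-processing contract.
# None of these names come from Tendies itself.

def preprocess(input_bytes, channels, *args):
    # A real preprocessor would decode the bitstring into a tensor;
    # here we just wrap the raw byte values in a dictionary
    return {"pixels": list(input_bytes), "channels": channels}

def inference(model_input):
    # A real model would run its forward pass here
    return {"score": sum(model_input["pixels"]) / 255.0}

def postprocess(model_output, *args):
    # Returns the output node names and whether the output is an image
    output_node_names = sorted(model_output)
    output_as_image = False
    return output_node_names, output_as_image

def chain(input_bytes, channels):
    # Feeds the preprocessor output to the model, then to the postprocessor
    model_input = preprocess(input_bytes, channels)
    model_output = inference(model_input)
    return postprocess(model_output)

names, as_image = chain(b"\x10\x20", channels=3)
print(names, as_image)
```

The point of the sketch is only the hand-off: whatever the preprocessor returns must be something the model accepts, and whatever the model returns must be something the postprocessor accepts.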

First, we'll define our preprocessing function, which will transform an image bitstring to a uint8 tensor suitable for inference.

In [2]:
import tensorflow as tf
def bitstring_to_uint8_tensor(self, input_bytes, channels, *args):
    input_bytes = tf.reshape(input_bytes, [])

    # Transforms bitstring to uint8 tensor
    input_tensor = tf.image.decode_png(input_bytes, channels=channels)

    # Expands the single tensor into a batch of 1
    input_tensor = tf.expand_dims(input_tensor, 0)
    return input_tensor

Models compliant with the Object Detection API return a dictionary of useful tensors, such as num_detections, detection_boxes, and so on. In our postprocessing function, we'll iterate through these tensors and assign names to them, so we can extract them in ServerBuilder. We also have to account for the 1-indexing of the detection_classes tensor. Finally, we return a list of output node names and set output_as_image to False, since we will be sending the output tensors (not the visualized image) back to the client through JSON.

In [None]:
def object_detection_dict_to_tensor_dict(self, object_detection_tensor_dict, *args):
    # Sets output to a non-image
    OUTPUT_AS_IMAGE = False
    # Class labels are 1-indexed
    LABEL_ID_OFFSET = 1

    # Assigns names to tensors and adds them to output list
    output_node_names = []
    for name, tensor in object_detection_tensor_dict.items():
        if name == "detection_classes":
            tensor += LABEL_ID_OFFSET
        tensor = tf.identity(tensor, name)
        output_node_names.append(name)

    # Returns output list and image boolean
    return output_node_names, OUTPUT_AS_IMAGE
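
To see why the offset matters, consider this small plain-Python sketch. It assumes the model emits zero-indexed class ids while the label map starts at id 1; the category index and the raw class ids below are made-up values for illustration.

```python
LABEL_ID_OFFSET = 1  # same offset as in the postprocessing function

# Hypothetical category index, keyed by 1-indexed label id
category_index = {1: {"id": 1, "name": "example_class"}}

# Hypothetical raw model output: zero-indexed class ids
raw_detection_classes = [0, 0]

# Shifts the class ids so they line up with the label map
detection_classes = [c + LABEL_ID_OFFSET for c in raw_detection_classes]

# Looks up the display name for each shifted class id
names = [category_index[c]["name"] for c in detection_classes]
print(detection_classes, names)
```

Without the offset, the lookup for class id 0 would fail in this sketch, because the category index has no entry for 0.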

If you are following along with your own model, feel free to utilize \*args to accept as many parameters as you need for processing. Tendies is rather picky about tensor shape and typing, so make sure that the tensor protos (shape and dtype) of your preprocessor's output and your postprocessor's input match the input and output of your model, respectively.

# Inference Function

Next, we have to build the Faster R-CNN from pipeline.config and define our inference function. The code for this is in ServerBuilder.py under example_usage(), which is where our model is exported. By reading the config file into an Object Detection API model_builder, we can instantiate a Faster R-CNN without ever actually seeing the model code. The next few cells are understood to be in the scope of example_usage().

In [None]:
from object_detection.protos import pipeline_pb2
from object_detection.builders import model_builder
from google.protobuf import text_format

# Builds object detection model from config file
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.gfile.GFile(config_file_path, 'r') as config:
    text_format.Merge(config.read(), pipeline_config)

detection_model = model_builder.build(pipeline_config.model, is_training=False)

Since build_server_from_tf() expects a single inference function, but the Object Detection API has its own pre- and post-processing to do, we have to combine them ourselves. This is a great place to use a closure, because we want to preserve the scope where we instantiated the Faster R-CNN as we pass around the inference function. [Closures are the best](https://i.imgflip.com/2en7d1.jpg).

In [None]:
# Creates inference function, encapsulating object detection requirements
def object_detection_inference(input_tensors):
    # Converts uint8 inputs to float tensors
    inputs = tf.to_float(input_tensors)

    # Object detection preprocessing
    preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
    # Object detection inference
    output_tensors = detection_model.predict(preprocessed_inputs, true_image_shapes)
    # Object detection postprocessing
    postprocessed_tensors = detection_model.postprocess(output_tensors, true_image_shapes)
    return postprocessed_tensors
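
If closures are new to you, the pattern can be shown without TensorFlow. In this sketch, `make_inference_function` and `hypothetical_model` are names I invented for illustration; the inner function keeps access to `model` even after the outer function has returned, just as object_detection_inference keeps access to detection_model.

```python
def make_inference_function(model):
    # `model` lives in the enclosing scope and is captured by `inference`
    def inference(inputs):
        return model(inputs)
    return inference

def hypothetical_model(inputs):
    # Stand-in for a real model: doubles every input value
    return [value * 2 for value in inputs]

# The returned function can be passed around on its own,
# yet it still remembers which model it was built with
inference = make_inference_function(hypothetical_model)
result = inference([1, 2, 3])
print(result)
```

The caller only needs a function of one argument, so the model itself never has to be passed along separately.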

Lastly, we'll instantiate a ServerBuilder and LayerInjector, then export the model. Note that we pass our inference function, preprocessor, and postprocessor into build_server_from_tf().

In [None]:
# Instantiates a ServerBuilder
server_builder = ServerBuilder()

# Instantiates a LayerInjector
layer_injector = LayerInjector()

# Builds the server
server_builder.build_server_from_tf(
    inference_function=object_detection_inference,
    preprocess_function=layer_injector.bitstring_to_uint8_tensor,
    postprocess_function=layer_injector.object_detection_dict_to_tensor_dict,
    model_name=FLAGS.model_name,
    model_version=FLAGS.model_version,
    checkpoint_dir=FLAGS.checkpoint_dir,
    serve_dir=FLAGS.serve_dir,
    channels=FLAGS.channels)

# Client

The best way to create customized Tendies clients is to inherit from Client, which provides a framework for remote inference. In such a child class, one only has to create visualize() and associated helper functions, then call client.inference() to begin the evaluation process.

We'll need a couple of these helper functions. The first is almost exactly the same as our preprocessing function, except that it decodes a JPEG, does not add a batch dimension, and reshapes the result to the expected image size.

In [None]:
def bitstring_to_uint8_tensor(self, input_bytes):
    input_bytes = tf.reshape(input_bytes, [])

    # Transforms bitstring to uint8 tensor
    input_tensor = tf.image.decode_jpeg(input_bytes, channels=3)

    # Ensures tensor has correct shape
    input_tensor = tf.reshape(input_tensor, [self.image_size, self.image_size, 3])
    return input_tensor

Our second helper function uses the Object Detection API to build a category index dictionary from the provided label map. This specific implementation of Faster R-CNN has only one class, so it's simple:

In [None]:
from object_detection.utils import label_map_util
def get_category_index(self):
    # Loads label map
    label_map = label_map_util.load_labelmap(self.label_path)
    
    # Builds category index from label map
    categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=1, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return category_index

With the use of our helpers, our visualize function isn't too bad. We'll decode the JSON data and convert it to bounding boxes, then overlay them on our input image with the help of the Object Detection API visualization_utils. Note that our helper returns the input image as a tensor, so we have to .eval() it into a NumPy array before visualization.

In [None]:
import numpy as np
from object_detection.utils import visualization_utils
def visualize(self, input_image, response, i):
    # Processes response for visualization
    detection_boxes = response["detection_boxes"]
    detection_classes = response["detection_classes"]
    detection_scores = response["detection_scores"]
    image = self.bitstring_to_uint8_tensor(input_image)
    with tf.Session() as sess:
        image = image.eval()

    # Overlays bounding boxes and labels on image
    visualization_utils.visualize_boxes_and_labels_on_image_array(
        image,
        np.asarray(detection_boxes, dtype=np.float32),
        np.asarray(detection_classes, dtype=np.uint8),
        scores=np.asarray(detection_scores, dtype=np.float32),
        category_index=self.get_category_index(),
        instance_masks=None,
        use_normalized_coordinates=True,
        line_thickness=2)

    # Saves image
    output_file = self.output_dir + "/images/" + self.output_filename + str(i) + self.output_extension
    visualization_utils.save_image_array_as_png(image, output_file)
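
Since we pass use_normalized_coordinates=True, each box is given as [ymin, xmin, ymax, xmax] with values between 0 and 1. If you ever need pixel coordinates yourself, the conversion can be sketched in plain Python as below. The function `boxes_to_pixels`, the 0.5 score threshold, and the sample response are all assumptions for illustration rather than part of Tendies.

```python
def boxes_to_pixels(boxes, scores, image_height, image_width, min_score=0.5):
    # Converts normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates,
    # skipping detections whose score is below min_score
    pixel_boxes = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        pixel_boxes.append((
            int(round(ymin * image_height)),
            int(round(xmin * image_width)),
            int(round(ymax * image_height)),
            int(round(xmax * image_width)),
        ))
    return pixel_boxes

# Hypothetical response with two detections
response = {
    "detection_boxes": [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]],
    "detection_scores": [0.9, 0.3],
}

pixel_boxes = boxes_to_pixels(
    response["detection_boxes"],
    response["detection_scores"],
    image_height=200,
    image_width=100)
print(pixel_boxes)
```

Only the first detection survives the threshold here, since the second has a score of 0.3.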

# Using the Server

Now that we're done integrating Faster R-CNN with Tendies, let's run the server. First, we have to export our model:

`python serverbuilder.py --checkpoint_dir $(path)`

As of July 2018, Python 3 is not officially supported with TensorFlow Serving, but [someone made a solution](https://github.com/tensorflow/serving/issues/700). Install the Python 3 TensorFlow Serving API with:

`pip install tensorflow-serving-api-python3`

Now, we can run this TensorFlow Model Server from bash with the command:

`tensorflow_model_server --rest_api_port=8501 --model_name=saved_model --model_base_path=$(path)`

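In this command, $(path) is the path to the serve directory (in the export command above, it is the path to your checkpoint directory). In my case, the serve directory is /mnt/c/Users/Tyler/Desktop/tendies/full_functionality/serve.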

Finally, we can run remote inference by calling our client on a folder of input images:

`python objectdetectionclient.py`
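
For a rough idea of what travels over the wire, here is a sketch of how a request to the TensorFlow Serving REST predict endpoint could be assembled with only the standard library. It assumes the exported signature takes a single base64-encoded image, and it reuses the port and model name from the server command above; `build_predict_request` is a hypothetical helper, not the Tendies client code, and nothing is actually sent.

```python
import base64
import json

def build_predict_request(image_bytes, host="localhost", port=8501,
                          model_name="saved_model"):
    # REST predict endpoint for the named model
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)

    # Binary inputs are base64-encoded and wrapped in a {"b64": ...} object
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    payload = json.dumps({"instances": [{"b64": encoded}]})
    return url, payload

# Placeholder bytes; a real client would read an image file in binary mode
url, payload = build_predict_request(b"not a real image")
print(url)
```

You would then POST the payload to the URL with an HTTP library of your choice and decode the JSON in the response.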

# Conclusion

Thanks for following along with this tutorial; I hope it helped you out! This Notebook was built with my TensorFlow Distributed Image Serving library, which you can download [here](https://github.com/tmlabonte/tendies). For more blog posts and information about me, please visit [my website](https://tmlabonte.github.io).