# TensorRT Optimization for Mask R-CNN

## Overview

This notebook outlines the process and results of converting a TensorFlow saved model developed with Mask R-CNN architecture into an optimized TensorRT model. The primary objective of this conversion is to enhance the inference speed on both edge devices and cloud infrastructure, thereby facilitating real-time application requirements and scalable deployment scenarios.

## Background

The Mask R-CNN model, widely used for instance segmentation, was initially trained on a high-quality dataset to identify and segment objects within images. Although the model achieved high accuracy, its inference time on standard hardware was a considerable bottleneck, taking approximately 35 seconds per image.

## Objective

The goal is to significantly reduce the inference time of the two Mask R-CNN models without compromising their accuracy, so that they meet the latency requirements of real-time applications. The models should deliver prompt predictions on edge devices with limited computational resources as well as on cloud platforms.

The TensorFlow SavedModels were converted into TensorRT models using TensorFlow's TensorRT integration (`tf.experimental.tensorrt.Converter`).

**Note:** To run this Colab notebook, switch the runtime to a GPU. For best performance, also select the 'High-RAM' option, which is available under the 'Runtime' tab at the top of the Colab notebook interface. This configuration is needed to handle compute-intensive operations and large datasets without running into memory constraints.
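Before running the remaining cells, it can help to confirm that the runtime actually exposes a GPU. The snippet below is a minimal sketch, not part of the original notebook: the helper name `gpu_visible` is hypothetical, and it assumes the NVIDIA driver tools (`nvidia-smi`) are on the `PATH` whenever a GPU runtime is active, which is typically the case on Colab.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        # No NVIDIA driver tools on PATH, so assume no GPU runtime.
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # `-L` prints one line per detected GPU.
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected:", gpu_visible())
```

If this prints `False`, the TensorRT conversion further down is unlikely to work until the runtime type is changed.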

## Download required files & scripts.

In [None]:
# Download preprocessing script.
url = (
    "https://raw.githubusercontent.com/"
    "tensorflow/models/master/"
    "official/projects/waste_identification_ml/"
    "model_inference/preprocessing.py"
)

!wget -q {url}

In [None]:
# Download the script to pull instance segmentation model weights from the
# TF Model Garden repo.
url = (
    "https://raw.githubusercontent.com/"
    "tensorflow/models/master/"
    "official/projects/waste_identification_ml/"
    "model_inference/download_and_unzip_models.py"
)

!wget -q {url}

In [None]:
# Download the sample images from the CircularNet project.
url1 = (
    "https://raw.githubusercontent.com/tensorflow/models/master/official/"
    "projects/waste_identification_ml/pre_processing/config/sample_images/"
    "image_2.png"
)

url2 = (
    "https://raw.githubusercontent.com/tensorflow/models/master/official/"
    "projects/waste_identification_ml/pre_processing/config/sample_images/"
    "image_4.png"
)

!curl -O {url1}
!curl -O {url2}

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3303k  100 3303k    0     0  2120k      0  0:00:01  0:00:01 --:--:-- 2120k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1913k  100 1913k    0     0   942k      0  0:00:02  0:00:02 --:--:--  943k


## Import required packages.

In [None]:
!python3 -m pip install -q -U  tensorrt tf_keras

  Preparing metadata (setup.py) ... done
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 7.1 MB/s eta 0:00:00
  Building wheel for tensorrt (setup.py) ... done


In [None]:
import tensorrt
print(tensorrt.__version__)
assert tensorrt.Builder(tensorrt.Logger())

8.6.1


In [None]:
import logging
import os
from io import BytesIO
from typing import Any, Callable
from urllib.request import urlopen

from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

import preprocessing

# Silence log messages at WARNING level and below.
logging.disable(logging.WARNING)

%matplotlib inline

## Utils

In [None]:
def load_image_into_numpy_array(path: str) -> np.ndarray:
  """Loads an image from a file path or URL into a numpy array.

  The image is converted to RGB and returned with a leading batch dimension,
  i.e. with shape (1, height, width, 3), ready to feed into a TensorFlow graph.

  Args:
    path: The file path or http(s) URL of the image.

  Returns:
    A uint8 numpy array with shape (1, height, width, 3).
  """
  if path.startswith('http'):
    image_data = urlopen(path).read()
  else:
    with tf.io.gfile.GFile(path, 'rb') as f:
      image_data = f.read()

  # Force three channels so that RGBA or grayscale files do not break the shape.
  image = Image.open(BytesIO(image_data)).convert('RGB')
  return np.expand_dims(np.asarray(image, dtype=np.uint8), axis=0)


def load_model(model_handle: str) -> Callable:
    """Loads a TensorFlow SavedModel and returns a function that can be used to
    make predictions.

    Args:
      model_handle: A path to a TensorFlow SavedModel.

    Returns:
      A function that can be used to make predictions.
    """
    print('loading model...')
    print(model_handle)
    model = tf.saved_model.load(model_handle)
    print('model loaded!')
    detection_fn = model.signatures['serving_default']
    return detection_fn


def perform_detection(model: Callable, image: np.ndarray) -> dict[str, Any]:
  """Runs Mask R-CNN inference on an image using the specified model.

  Args:
    model: A function that can be used to make predictions.
    image: The preprocessed image batch to run detection on.

  Returns:
    A dictionary mapping each output name to its value as a NumPy array.
  """
  outputs = model(image)
  return {key: value.numpy() for key, value in outputs.items()}


def create_directory(path: str):
    """Create a directory at the specified path if it does not exist.

    Args:
        path (str): The path of the directory to create.
    """
    try:
        os.makedirs(path, exist_ok=True)
        print(f'Directory {path} created successfully')
    except Exception as e:
        print(f'Failed to create directory {path}: {e}')


def convert_to_tensorrt(
    saved_model_dir: str,
    output_saved_model_dir: str
    ) -> None:
    """Converts a TensorFlow SavedModel to TensorRT format.

    Args:
      saved_model_dir: The directory where the original TensorFlow SavedModel is
        stored.
      output_saved_model_dir: The directory where the TensorRT-converted model
        will be saved.
    """
    params = tf.experimental.tensorrt.ConversionParams(
        precision_mode='FP16',
        # Set this to a large enough number so it can cache all the engines.
        maximum_cached_engines=16
    )

    converter = tf.experimental.tensorrt.Converter(
        input_saved_model_dir=saved_model_dir, conversion_params=params
    )

    converter.convert()

    # Define a generator function that yields input data, and use it to execute
    # the graph to build TRT engines. `image1` is the module-level tensor
    # created in the "Preprocess an image" cell, so that cell must run first.
    def my_input_fn():
      yield image1

    converter.build(input_fn=my_input_fn)  # Generate corresponding TRT engines
    converter.save(output_saved_model_dir)  # Generated engines will be saved.


def process_image(image_path: str) -> tf.Tensor:
  """
  Processes an image from a given file path.

  This function reads an image from the specified path, resizes it, and applies
  normalization preprocessing.

  Args:
    image_path: The file path of the image to be processed.

  Returns:
    A TensorFlow Tensor representing the processed image.
  """
  image_np = load_image_into_numpy_array(image_path)
  image_np_cp = tf.image.resize(image_np[0], (512, 1024), method=tf.image.ResizeMethod.AREA)
  image_np_cp = tf.cast(image_np_cp, tf.uint8)
  image_np = preprocessing.normalize_image(image_np_cp)
  image_np = tf.expand_dims(image_np, axis=0)
  return image_np

## Import both Mask R-CNN saved models (material & material form) from the repo.

In [None]:
# 'material_model' outputs both the material and its subtype, e.g. Plastics_PET.
# 'material_form_model' outputs the form of an object e.g. can, bottle, etc.
MODEL_WEIGHTS = {
    'material_url': (
        'https://storage.googleapis.com/tf_model_garden/vision/'
        'waste_identification_ml/two_model_strategy/material/'
        'material_version_2.zip'
    ),
    'material_form_url': (
        'https://storage.googleapis.com/tf_model_garden/vision/'
        'waste_identification_ml/two_model_strategy/material_form/'
        'material_form_version_2.zip'
    ),
}


SAVED_MODEL_PATH = {
    'material_model': 'material/saved_model/',
    'material_form_model': 'material_form/saved_model/',
}

In [None]:
# Download the model weights from Google's repo.
url1 = MODEL_WEIGHTS['material_url']
url2 = MODEL_WEIGHTS['material_form_url']
!python3 download_and_unzip_models.py $url1 $url2
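After the download finishes, a quick check that each expected directory really holds a SavedModel can save a confusing error at load time. The sketch below relies only on the fact that a TensorFlow SavedModel directory contains a `saved_model.pb` file. The helper name `missing_saved_models` is hypothetical, and the dictionary simply mirrors the notebook's `SAVED_MODEL_PATH` so that the snippet is self-contained.

```python
from pathlib import Path


def missing_saved_models(paths: dict[str, str]) -> list[str]:
    """Return the names whose directory lacks a `saved_model.pb` file."""
    return [
        name
        for name, directory in paths.items()
        if not (Path(directory) / "saved_model.pb").is_file()
    ]


# Mirrors SAVED_MODEL_PATH from the cell above.
expected_paths = {
    "material_model": "material/saved_model/",
    "material_form_model": "material_form/saved_model/",
}

missing = missing_saved_models(expected_paths)
if missing:
    print("SavedModel not found for:", ", ".join(missing))
else:
    print("All SavedModel directories look complete.")
```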

## Preprocess an image.

In [None]:
image1 = process_image('image_2.png')
image2 = process_image('image_4.png')

## Load original SavedModel.

In [None]:
# Loading both models.
detection_fns = [
    load_model(model_path)
    for model_path in SAVED_MODEL_PATH.values()
]

loading model...
material/saved_model/
model loaded!
loading model...
material_form/saved_model/
model loaded!


## Convert to TensorRT model

In [None]:
TENSORRT_MODEL_PATH = {
    'material_model': 'tensorrt/material/saved_model/',
    'material_form_model': 'tensorrt/material_form/saved_model/',
}

In [None]:
# Create directories to store TensorRT models.
for value in TENSORRT_MODEL_PATH.values():
  create_directory(value)

Directory tensorrt/material/saved_model/ created successfully
Directory tensorrt/material_form/saved_model/ created successfully


In [None]:
# Convert TensorFlow saved models into TensorRT models.
for key in SAVED_MODEL_PATH.keys():
    value1 = SAVED_MODEL_PATH.get(key)
    value2 = TENSORRT_MODEL_PATH.get(key)
    print(value1, value2)
    convert_to_tensorrt(value1, value2)

material/saved_model/ tensorrt/material/saved_model/
material_form/saved_model/ tensorrt/material_form/saved_model/


## Load TensorRT models.

In [None]:
# Loading both models.
detection_fns_tensorrt = [
    load_model(model_path)
    for model_path in TENSORRT_MODEL_PATH.values()
]

loading model...
tensorrt/material/saved_model/
model loaded!
loading model...
tensorrt/material_form/saved_model/
model loaded!
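Since the objective is to speed up inference without compromising accuracy, it may be worth comparing the outputs of the original and converted models on the same image, because the FP16 conversion can change values slightly. The following is only a sketch of one possible check: the helper name `max_abs_difference` and the toy `scores` key are hypothetical, and it assumes each output has been flattened to a plain sequence of numbers (for NumPy arrays, for example, via `array.ravel().tolist()`).

```python
import math
from typing import Mapping, Sequence


def max_abs_difference(
    reference: Mapping[str, Sequence[float]],
    candidate: Mapping[str, Sequence[float]],
) -> dict[str, float]:
    """Largest absolute element-wise gap for each output key both models share."""
    gaps = {}
    for key in reference.keys() & candidate.keys():
        ref, cand = reference[key], candidate[key]
        if len(ref) != len(cand):
            # Different sizes cannot be compared element by element.
            gaps[key] = math.inf
            continue
        gaps[key] = max((abs(r - c) for r, c in zip(ref, cand)), default=0.0)
    return gaps


# Toy values standing in for flattened model outputs.
original_outputs = {"scores": [0.91, 0.40]}
tensorrt_outputs = {"scores": [0.90, 0.41]}
print(max_abs_difference(original_outputs, tensorrt_outputs))
```

Small gaps are expected with reduced precision; what counts as acceptable depends on the application and is not something this notebook measures.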


## Checking speed with SavedModel.

In [None]:
%%timeit
# Inference speed with first image.
results = list(
    map(
        lambda model: perform_detection(model, image1),
        detection_fns
    )
)

386 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
%%timeit
detection_fns[0](image2)

169 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
%%timeit
detection_fns[1](image2)

200 ms ± 1.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Create an inference engine for TensorRT by predicting over a single image.

In [None]:
%%timeit
# Inference speed with first image.
results = list(
    map(
        lambda model: perform_detection(model, image1),
        detection_fns_tensorrt
    )
)

210 ms ± 4.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
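If the first call to a converted model includes one-off work such as engine building, as this section's heading suggests, it is reasonable to keep that call out of the timing. The helper below is a minimal sketch of a benchmark with explicit warm-up calls, built only on the standard library. The name `benchmark_ms` and its parameters are hypothetical and not part of the notebook, and the workload at the end is a stand-in for a call such as `detection_fns_tensorrt[0](image2)`.

```python
import statistics
import time
from typing import Callable


def benchmark_ms(
    fn: Callable[[], object], warmup: int = 1, runs: int = 7
) -> tuple[float, float]:
    """Return (mean, standard deviation) of the latency of `fn` in milliseconds."""
    if runs < 2:
        raise ValueError("runs must be at least 2 to compute a deviation")
    for _ in range(warmup):
        fn()  # Untimed calls absorb one-off costs such as engine building.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


# Stand-in workload; in the notebook this would wrap a model call.
mean_ms, std_ms = benchmark_ms(lambda: sum(range(10_000)))
print(f"{mean_ms:.3f} ms ± {std_ms:.3f} ms")
```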


## Checking speed with TensorRT model.

In [None]:
%%timeit
detection_fns_tensorrt[0](image2)

83.8 ms ± 985 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
%%timeit
detection_fns_tensorrt[1](image2)

122 ms ± 1.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


## Conclusion

Average inference time of the 1st SavedModel over image 2     = **169 ms**\
Average inference time of the 1st TensorRT model over image 2 = **83.8 ms**


Average inference time of the 2nd SavedModel over image 2     = **200 ms**\
Average inference time of the 2nd TensorRT model over image 2 = **122 ms**

On image 2, the TensorRT conversion makes the first model about 2.0x faster and the second model about 1.6x faster.
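The speedups implied by these measurements can be worked out directly from the `%%timeit` outputs reported in the cells above. The snippet is a small arithmetic sketch; the helper name `speedup` and the dictionary labels are illustrative only.

```python
def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency (higher is better)."""
    return baseline_ms / optimized_ms


# (SavedModel ms, TensorRT ms), taken from the timing cell outputs above.
timings_ms = {
    "material model, image 2": (169.0, 83.8),
    "material form model, image 2": (200.0, 122.0),
    "both models, image 1": (386.0, 210.0),
}

for name, (baseline, optimized) in timings_ms.items():
    print(f"{name}: {speedup(baseline, optimized):.2f}x faster")
# material model, image 2: 2.02x faster
# material form model, image 2: 1.64x faster
# both models, image 1: 1.84x faster
```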