# Exercise: Benchmark Different Minimum Segment Sizes

In this notebook we discuss the `minimum_segment_size` conversion parameter and ask you to experiment with its value, observing how it affects throughput in optimized models.

Additionally, you will perform conversion for 2 additional models, VGG19 and InceptionV3.

## Objectives

By the end of this notebook you will be able to:

- Conduct experiments to understand the impact of the minimum segment size conversion parameter on a variety of models

## Imports

In [None]:
from tensorflow.python.compiler.tensorrt import trt_convert as trt

In [None]:
from lab_helpers import (
    get_images, batch_input, load_tf_saved_model,
    predict_and_benchmark_throughput_from_saved, display_prediction_info
)

## Minimum Segment Size Conversion Parameter

The success of a TF-TRT optimization also depends on the architecture of the model. The more TensorRT-supported layers a model contains, the more of its graph can be converted into TF-TRT ops and, consequently, the higher the performance that can be achieved.

The `minimum_segment_size` conversion parameter determines the minimum number of nodes required for a subgraph to be replaced by an optimized TF-TRT op. While its default value of 3 tends to offer the best performance for most models, adjusting this value can have varying impact on different models.

For more on the impact of this parameter, see the [TF-TRT User Guide](https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/#min-nodes).
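To make the threshold concrete, here is a toy sketch in plain Python. It is not how TF-TRT partitions a graph internally; it only assumes that the graph has already been split into candidate subgraphs of TensorRT-compatible nodes, and that a candidate is converted when its node count is at least `minimum_segment_size`. The function name and the candidate sizes are hypothetical.

```python
def select_segments(candidate_segment_sizes, minimum_segment_size=3):
    """Split candidate segments into those converted and those left in TensorFlow.

    candidate_segment_sizes: node counts of connected, TensorRT-compatible
    subgraphs (hypothetical input, for illustration only).
    """
    converted = [n for n in candidate_segment_sizes if n >= minimum_segment_size]
    left_in_tf = [n for n in candidate_segment_sizes if n < minimum_segment_size]
    return converted, left_in_tf


# Hypothetical graph with four TensorRT-compatible subgraphs of these sizes.
candidates = [40, 4, 2, 1]
for mss in (1, 3, 5):
    converted, left_in_tf = select_segments(candidates, mss)
    print(f"minimum_segment_size={mss}: {len(converted)} TRT op(s), "
          f"{sum(left_in_tf)} compatible node(s) left in TensorFlow")
```

A lower value converts more, smaller subgraphs; a higher value keeps only the large ones. Which setting gives the best throughput depends on the model, which is what the rest of this exercise measures.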

## VGG19 and InceptionV3 Models

In this notebook we will utilize 2 additional models: **VGG19** and **InceptionV3**. Execute the following cells to load them, and save them to file, so that they are in the format TF-TRT expects. Feel free to continue reading while the models save.

In [None]:
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.applications.inception_v3 import InceptionV3

In [None]:
vgg19_model = VGG19(weights='imagenet')
inception_v3_model = InceptionV3(weights='imagenet')

In [None]:
vgg19_model.save('vgg19_saved_model')
inception_v3_model.save('inception_v3_saved_model')
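If you want to confirm that saving has finished before moving on, a quick check like the following may help. It is a minimal sketch that assumes `model.save` wrote a TensorFlow SavedModel directory (containing `saved_model.pb` and a `variables` folder), as it does by default in the TensorFlow 2 releases this kind of notebook targets. The helper name `looks_like_saved_model` is hypothetical.

```python
import os


def looks_like_saved_model(path):
    """Return True if `path` holds the files a SavedModel directory normally contains."""
    return (
        os.path.isfile(os.path.join(path, "saved_model.pb"))
        and os.path.isdir(os.path.join(path, "variables"))
    )


for model_dir in ("vgg19_saved_model", "inception_v3_saved_model"):
    status = "ready" if looks_like_saved_model(model_dir) else "not found (yet)"
    print(f"{model_dir}: {status}")
```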

### Batch Input for Additional Models

Before we perform inference (and benchmark), we need to batch our input.

Our `batch_input` helper function performs model-specific image preprocessing. Therefore we create one set of batched images for each of the 2 additional models. If you're interested, check out `lab_helpers.py` for the source code.

In [None]:
number_of_images = 16
images = get_images(number_of_images)

In [None]:
vgg19_batched_input = batch_input(images, model="vgg19")

In [None]:
inception_v3_batched_input = batch_input(images, model="inception_v3")
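To illustrate why each model needs its own batch, the sketch below applies the preprocessing conventions of the Keras `preprocess_input` functions for these two models to a single pixel. Whether `batch_input` does exactly this is an assumption (its source is in `lab_helpers.py`, not shown here), and `preprocess_pixel` is a hypothetical name. The two models also differ in default input resolution.

```python
# ImageNet per-channel means in B, G, R order, as used by Keras "caffe"-style preprocessing.
IMAGENET_BGR_MEAN = (103.939, 116.779, 123.68)

# Default input resolutions of the two Keras models.
INPUT_SIZE = {"vgg19": (224, 224), "inception_v3": (299, 299)}


def preprocess_pixel(rgb, model):
    """Apply model-specific preprocessing to one (R, G, B) pixel with values in [0, 255]."""
    r, g, b = rgb
    if model == "vgg19":
        # Reorder RGB -> BGR, then subtract the ImageNet mean of each channel.
        return tuple(c - m for c, m in zip((b, g, r), IMAGENET_BGR_MEAN))
    if model == "inception_v3":
        # Scale each channel from [0, 255] to [-1, 1].
        return tuple(c / 127.5 - 1.0 for c in (r, g, b))
    raise ValueError(f"Unknown model: {model}")


pixel = (255, 0, 127.5)
print("vgg19:", preprocess_pixel(pixel, "vgg19"))
print("inception_v3:", preprocess_pixel(pixel, "inception_v3"))
```

Because the same source pixel maps to different values (and the images are resized differently), a batch prepared for one model should not be fed to the other.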

## Benchmark Different Minimum Segment Sizes

As the output of the following cell shows, the default value for `minimum_segment_size` is `3`.

In [None]:
trt.DEFAULT_TRT_CONVERSION_PARAMS

For this exercise you are asked to optimize the **vgg19** and **inception_v3** models, varying `minimum_segment_size` to maximize throughput (in FP16 mode only).

### Allow for Different Minimum Segment Size Values

As you can see, `convert_to_trt_graph_and_save` now accepts a `minimum_segment_size` argument, which can be used to control the minimum segment size during conversion to a TF-TRT optimized model. Read the comments to see pertinent changes to our helper function.

In [None]:
def convert_to_trt_graph_and_save(
    precision_mode='float16',
    input_saved_model_dir='vgg19_saved_model',
    max_batch_size=16,
    # Allow for control of minimum_segment_size value
    minimum_segment_size=3
):
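    # This exercise benchmarks FP16 only, so the precision mode is fixed here
    # regardless of the `precision_mode` argument passed in.
    precision_mode = trt.TrtPrecisionMode.FP16
    converted_save_suffix = '_TFTRT_FP16'
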
    if minimum_segment_size != 3:
        # Adjust filename for a given minimum segment size
        converted_save_suffix += '_MSS_{}'.format(str(minimum_segment_size))
        
    output_saved_model_dir = input_saved_model_dir + converted_save_suffix
    
    conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=precision_mode, 
        max_workspace_size_bytes=8000000000,
        max_batch_size=max_batch_size,
        # Pass in adjusted minimum segment size to conversion parameters
        minimum_segment_size=minimum_segment_size
    )

    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_saved_model_dir,
        conversion_params=conversion_params
    )

    print('Converting {} to TF-TRT graph precision mode {}...'.format(input_saved_model_dir, precision_mode))
    
    converter.convert()

    print('Saving converted model to {}...'.format(output_saved_model_dir))
    converter.save(output_saved_model_dir=output_saved_model_dir)
    print('Complete')
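The helper above builds its parameters with `_replace`, which is the standard way to derive a modified copy of a named tuple. The sketch below shows the same pattern with a stand-in named tuple from the standard library; `ConversionParams`, its reduced set of fields, and the values are illustrative assumptions, not the real TF-TRT object.

```python
from collections import namedtuple

# Hypothetical stand-in with a small subset of fields; the real object has more.
ConversionParams = namedtuple(
    "ConversionParams",
    ["precision_mode", "minimum_segment_size", "max_workspace_size_bytes"],
)
DEFAULT_PARAMS = ConversionParams(
    precision_mode="FP32",  # illustrative values only
    minimum_segment_size=3,
    max_workspace_size_bytes=1 << 30,
)

# _replace returns a new tuple; the defaults are left untouched.
custom = DEFAULT_PARAMS._replace(precision_mode="FP16", minimum_segment_size=5)

print(custom)
print("default minimum_segment_size is still", DEFAULT_PARAMS.minimum_segment_size)
```

Because the defaults are never mutated, each call to the conversion helper starts from the same baseline, so runs with different `minimum_segment_size` values do not interfere with one another.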

### Benchmarking Table

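As you perform the following operations, use this table to track your results.

| Model | Minimum segment size | Throughput |
|---|---|---|
| VGG19 | 1 | |
| VGG19 | 5 | |
| InceptionV3 | 1 | |
| InceptionV3 | 5 | |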

### Benchmark Different Minimum Segment Sizes for VGG19

Run the following cells, adjusting `minimum_segment_size` so that you can observe the impact of its value when using VGG19.

In [None]:
model_name = 'vgg19'
minimum_segment_size = 1 # TODO: Optimize for minimum segment sizes of 1 and 5

In [None]:
input_saved_model_dir = '{}_saved_model'.format(model_name) # See above for where we saved the model

convert_to_trt_graph_and_save(precision_mode='float16',
                              minimum_segment_size=minimum_segment_size, # Here we control minimum segment size for the conversion
                              input_saved_model_dir=input_saved_model_dir)

In [None]:
infer = load_tf_saved_model('{}_saved_model_TFTRT_FP16_MSS_{}'.format(model_name, str(minimum_segment_size)))
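Note that the path above always includes an `_MSS_` suffix, whereas `convert_to_trt_graph_and_save` omits that suffix when `minimum_segment_size` is `3`. If you also want to load a model converted with the default value, a small helper that mirrors the naming logic can avoid a "directory not found" error. `converted_model_dir` is a hypothetical name introduced for this sketch.

```python
def converted_model_dir(model_name, minimum_segment_size=3):
    """Mirror the directory naming used by convert_to_trt_graph_and_save."""
    path = '{}_saved_model_TFTRT_FP16'.format(model_name)
    if minimum_segment_size != 3:
        # The suffix is only appended for non-default segment sizes.
        path += '_MSS_{}'.format(minimum_segment_size)
    return path


print(converted_model_dir('vgg19', 1))
print(converted_model_dir('vgg19', 3))
```

You could then call `load_tf_saved_model(converted_model_dir(model_name, minimum_segment_size))` for any value, including the default.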

In [None]:
# We use batched input, and process predictions, specifically for VGG19
# Record Throughput in the table above
all_preds = predict_and_benchmark_throughput_from_saved(vgg19_batched_input, infer, N_run=150, N_warmup_run=50, model='vgg19')
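For intuition about what the `N_warmup_run` and `N_run` arguments mean, here is a minimal sketch of a throughput benchmark. It is an assumption about the general technique, not the source of `predict_and_benchmark_throughput_from_saved`: warm-up runs are executed and discarded, then the timed runs are converted to images per second. `benchmark_throughput` and `fake_infer` are hypothetical names, and the stand-in model runs on the CPU.

```python
import time


def benchmark_throughput(infer_fn, batch, batch_size, n_warmup_run=50, n_run=150):
    """Return images per second for `infer_fn(batch)` after discarding warm-up runs."""
    # Warm-up runs absorb one-time costs such as initialization and memory allocation.
    for _ in range(n_warmup_run):
        infer_fn(batch)

    start = time.perf_counter()
    for _ in range(n_run):
        infer_fn(batch)
    elapsed = time.perf_counter() - start

    return n_run * batch_size / elapsed


# Stand-in for a real model: any callable that takes a batch works.
def fake_infer(batch):
    return [sum(image) for image in batch]


fake_batch = [[0.0] * 1000 for _ in range(16)]
throughput = benchmark_throughput(fake_infer, fake_batch, batch_size=16)
print(f"Throughput: {throughput:.0f} images/s")
```

When timing a real GPU model, the results of each run also need to be materialized (for example, converted to NumPy arrays) before the timer stops, since GPU work can be queued asynchronously.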

Optionally, display prediction info for this model.

In [None]:
last_run_preds = all_preds[0]
display_prediction_info(last_run_preds, images, model='vgg19')

### Benchmark Different Minimum Segment Sizes for InceptionV3

Run the following cells, adjusting `minimum_segment_size` so that you can observe the impact of its value when using InceptionV3.

In [None]:
model_name = 'inception_v3'
minimum_segment_size = 1 # TODO: Optimize for minimum segment sizes of 1 and 5

In [None]:
input_saved_model_dir = '{}_saved_model'.format(model_name) # See above for where we saved the model

convert_to_trt_graph_and_save(precision_mode='float16',
                              minimum_segment_size=minimum_segment_size, # Here we control minimum segment size for the conversion
                              input_saved_model_dir=input_saved_model_dir)

In [None]:
infer = load_tf_saved_model('{}_saved_model_TFTRT_FP16_MSS_{}'.format(model_name, str(minimum_segment_size)))

In [None]:
# We use batched input, and process predictions, specifically for InceptionV3
# Record Throughput in the table above
all_preds = predict_and_benchmark_throughput_from_saved(inception_v3_batched_input, infer, N_run=150, N_warmup_run=50, model='inception_v3')

Optionally, display prediction info for this model.

In [None]:
last_run_preds = all_preds[0]
display_prediction_info(last_run_preds, images, model='inception_v3')

## Restart Kernel

Please execute the cell below to restart the kernel and clear GPU memory.

In [None]:
import IPython
IPython.Application.instance().kernel.do_shutdown(True)