# OpenVINO benchmarking with 2D U-Net
In this tutorial, we use the Intel® Distribution of OpenVINO™ Toolkit to benchmark inference with a 2D U-Net model.

This tutorial assumes that you have already downloaded and installed [Intel® OpenVINO™](https://software.intel.com/en-us/openvino-toolkit/choose-download) on your computer.

Using Intel® OpenVINO™ takes a few steps:

1. Convert the Keras model to the TensorFlow SavedModel format.
1. Freeze the TensorFlow SavedModel.
1. Use the OpenVINO Model Optimizer to convert the frozen model to the OpenVINO Intermediate Representation (IR) format.
1. Benchmark using the OpenVINO benchmark tool: `/opt/intel/openvino/deployment_tools/tools/benchmark_tool/benchmark_app.py`
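
As a rough map of the steps above, the sketch below lists the artifact that each step is expected to produce. The directory layout mirrors the paths used later in this tutorial; the `pipeline_artifacts` helper is hypothetical and only builds path strings.

```python
import posixpath


def pipeline_artifacts(base_dir="/home/ubuntu/models/unet"):
    """Return the file or directory each step is expected to produce.

    Hypothetical helper for illustration: it only builds path strings
    and does not touch the file system.
    """
    ir_dir = posixpath.join(base_dir, "IR_models", "FP32")
    return {
        # Steps 1 and 2: SavedModel directory, then the frozen graph
        "saved_model_dir": posixpath.join(base_dir, "output"),
        "frozen_graph": posixpath.join(base_dir, "frozen_model", "saved_model_frozen.pb"),
        # Step 3: the IR is a pair of files (.xml topology, .bin weights)
        "ir_xml": posixpath.join(ir_dir, "saved_model.xml"),
        "ir_bin": posixpath.join(ir_dir, "saved_model.bin"),
    }


artifacts = pipeline_artifacts()
for step, path in artifacts.items():
    print(step, "->", path)
```

Step 4 then takes the `ir_xml` path as the model argument of the benchmark tool.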


In [2]:
import os
import shutil
import sys

import numpy as np
import tensorflow as tf
import keras
import keras as K  # alias used below for K.models and K.losses

Using TensorFlow backend.


In [3]:
def dice_coef(y_true, y_pred, axis=(1, 2), smooth=1):
    r"""
    Sørensen (soft) Dice coefficient
    \frac{2 \left| T \cap P \right|}{\left| T \right| + \left| P \right|}
    where T is the ground truth mask and P is the prediction mask
    """
    intersection = tf.reduce_sum(y_true * y_pred, axis=axis)
    # Note: "union" here is |T| + |P| (the sum of both masks), not a set union
    union = tf.reduce_sum(y_true + y_pred, axis=axis)
    numerator = tf.constant(2.) * intersection + smooth
    denominator = union + smooth
    coef = numerator / denominator

    return tf.reduce_mean(coef)

def soft_dice_coef(target, prediction, axis=(1, 2), smooth=0.01):
    r"""
    Sørensen (soft) Dice coefficient. The predictions are not rounded.
    \frac{2 \left| T \cap P \right|}{\left| T \right| + \left| P \right|}
    where T is the ground truth mask and P is the prediction mask
    """

    intersection = tf.reduce_sum(target * prediction, axis=axis)
    # Note: "union" here is |T| + |P| (the sum of both masks), not a set union
    union = tf.reduce_sum(target + prediction, axis=axis)
    numerator = tf.constant(2.) * intersection + smooth
    denominator = union + smooth
    coef = numerator / denominator

    return tf.reduce_mean(coef)

def dice_coef_loss(target, prediction, axis=(1, 2), smooth=1.):
    """
    Sørensen (soft) Dice loss
    Uses -log(Dice) as the loss since it is better behaved.
    Writing the log of the ratio as a difference of logs also avoids
    the explicit division, which can help prevent underflow when the
    numbers are very small.
    """
    intersection = tf.reduce_sum(prediction * target, axis=axis)
    p = tf.reduce_sum(prediction, axis=axis)
    t = tf.reduce_sum(target, axis=axis)
    numerator = tf.reduce_mean(intersection + smooth)
    denominator = tf.reduce_mean(t + p + smooth)
    dice_loss = -tf.log(2.*numerator) + tf.log(denominator)

    return dice_loss


def combined_dice_ce_loss(y_true, y_pred, axis=(1, 2), smooth=1.,
                          weight=0.9):
    """
    Combined Dice and Binary Cross Entropy Loss
    """
    return weight*dice_coef_loss(y_true, y_pred, axis, smooth) + \
        (1-weight)*K.losses.binary_crossentropy(y_true, y_pred)
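
To make the Dice arithmetic concrete, here is a minimal NumPy sketch that mirrors `dice_coef` above on a tiny hand-made mask. The function name `dice_coef_np` and the example arrays are illustrative assumptions, not part of the original model code.

```python
import numpy as np


def dice_coef_np(y_true, y_pred, axis=(1, 2), smooth=1.0):
    """NumPy mirror of dice_coef: (2*|T n P| + smooth) / (|T| + |P| + smooth)."""
    intersection = np.sum(y_true * y_pred, axis=axis)
    total = np.sum(y_true + y_pred, axis=axis)  # |T| + |P|, not a set union
    return np.mean((2.0 * intersection + smooth) / (total + smooth))


# One 2x2 image (batch of 1, single channel): the ground truth has two
# foreground pixels and the prediction finds one of them.
y_true = np.array([[[[1.0], [1.0]], [[0.0], [0.0]]]])  # shape (1, 2, 2, 1)
y_pred = np.array([[[[1.0], [0.0]], [[0.0], [0.0]]]])

dice = dice_coef_np(y_true, y_pred)
print(dice)           # (2*1 + 1) / (3 + 1) = 0.75
print(-np.log(dice))  # the -log(Dice) form that the loss is built on
```

Note that `dice_coef_loss` above applies the smoothing term and the batch average slightly differently, so its value will not match `-log(dice_coef)` exactly.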


In [5]:
inference_filename = "unet_decathlon_4_8814_128x128_randomcrop-any-input.h5"
model_filename = os.path.join("/home/ubuntu/models/unet", inference_filename)

# Load model
print("Loading Model... ")
model = K.models.load_model(model_filename, custom_objects={
    "combined_dice_ce_loss": combined_dice_ce_loss,
    "dice_coef_loss": dice_coef_loss,
    "soft_dice_coef": soft_dice_coef,
    "dice_coef": dice_coef})
print("Model loaded successfully from: " + model_filename)

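# Reuse the session that Keras used while loading the weights.
# Do not run tf.global_variables_initializer() here: it would reset
# the loaded weights to their initial values.
sess = keras.backend.get_session()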

Loading Model... 
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Model loaded successfully from: /home/ubuntu/models/unet/unet_decathlon_4_8814_128x128_randomcrop-any-input.h5


In [6]:
# This cell exports a TensorFlow SavedModel; the graph itself is frozen in the next cell.
output_directory = "/home/ubuntu/models/unet/output"
print("Freezing the graph.")
# Set the Keras learning phase to 0 (inference)
keras.backend.set_learning_phase(0)

signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={'input': model.input}, outputs={'output': model.output})

# If the directory already exists, delete it so the builder can write a fresh SavedModel.
if os.path.isdir(output_directory):
    print(output_directory, "exists already. Deleting the folder")
    shutil.rmtree(output_directory)

builder = tf.saved_model.builder.SavedModelBuilder(output_directory)
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
    },
    saver=tf.train.Saver())
builder.save()
print("TensorFlow protobuf version of model is saved in:", output_directory)

print("Model input name = ", model.input.op.name)
print("Model input shape = ", model.input.shape)
print("Model output name = ", model.output.op.name)
print("Model output shape = ", model.output.shape)

Freezing the graph.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: /home/ubuntu/models/unet/output/saved_model.pb
TensorFlow protobuf version of model is saved in: /home/ubuntu/models/unet/output
Model input name =  MRImages
Model input shape =  (?, ?, ?, 4)
Model output name =  PredictionMask/Sigmoid
Model output shape =  (?, ?, ?, 1)


In [7]:
output_frozen_model_dir = "/home/ubuntu/models/unet/frozen_model"
output_frozen_graph = os.path.join(output_frozen_model_dir, 'saved_model_frozen.pb')

# Start from an empty output directory.
if os.path.isdir(output_frozen_model_dir):
    print('Directory', output_frozen_model_dir, 'already exists. Deleting it and re-creating it')
    shutil.rmtree(output_frozen_model_dir)
os.mkdir(output_frozen_model_dir)

from tensorflow.python.tools.freeze_graph import freeze_graph

# Load the SavedModel written in the previous cell, convert its variables
# to constants, and keep the subgraph needed to compute the output node.
_ = freeze_graph(input_graph="",
                 input_saver="",
                 input_binary=False,
                 input_checkpoint="",
                 restore_op_name="save/restore_all",
                 filename_tensor_name="save/Const:0",
                 clear_devices=True,
                 initializer_nodes="",
                 input_saved_model_dir=output_directory,
                 output_node_names=model.output.op.name,
                 output_graph=output_frozen_graph)

print("TensorFlow frozen model is saved in:", output_frozen_graph)

Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
INFO:tensorflow:Restoring parameters from /home/ubuntu/models/unet/output/variables/variables
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
INFO:tensorflow:Froze 38 variables.
INFO:tensorflow:Converted 38 variables to const ops.
TensorFlow frozen model is saved in: /home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb


In [None]:
output_frozen_model_dir = "/home/ubuntu/models/unet/frozen_model"
output_frozen_graph = output_frozen_model_dir+'/saved_model_frozen.pb'

if not os.path.exists(output_frozen_graph):
    print(output_frozen_graph + " doesn't exist. Please make sure the Keras model has been converted to a frozen TensorFlow model.")

!mo_tf.py \
      --input_model '/home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb' \
      --input_shape=[1,160,160,4] \
      --data_type FP32  \
      --output_dir /home/ubuntu/models/unet/IR_models/FP32  \
      --model_name saved_model

#### Alternatively, run the same command in a terminal
```
mo_tf.py \
      --input_model '/home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb' \
      --input_shape=[1,160,160,4] \
      --data_type FP32  \
      --output_dir /home/ubuntu/models/unet/IR_models/FP32  \
      --model_name saved_model
```
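
If you prefer to launch the Model Optimizer from a Python script rather than a notebook `!` command, the argument list can be assembled as in the sketch below. The `build_mo_command` helper is hypothetical; it only reuses the flags shown in the command above and does not execute anything.

```python
import shlex


def build_mo_command(input_model, input_shape, output_dir,
                     data_type="FP32", model_name="saved_model"):
    """Build the mo_tf.py argument list (hypothetical helper; nothing is run)."""
    shape = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "mo_tf.py",
        "--input_model", input_model,
        "--input_shape=" + shape,
        "--data_type", data_type,
        "--output_dir", output_dir,
        "--model_name", model_name,
    ]


cmd = build_mo_command(
    input_model="/home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb",
    input_shape=(1, 160, 160, 4),
    output_dir="/home/ubuntu/models/unet/IR_models/FP32",
)
# Print a copy-pasteable shell command line.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the conversion, assuming `mo_tf.py` is on the `PATH` as it is in the terminal session shown here.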



#### Sample Output
```
(tensorflow_p36) ubuntu@ip-172-31-46-30:~$ mo_tf.py \
>       --input_model '/home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb' \
>       --input_shape=[1,160,160,4] \
>       --data_type FP32  \
>       --output_dir /home/ubuntu/models/unet/IR_models/FP32  \
>       --model_name saved_model
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/ubuntu/models/unet/frozen_model/saved_model_frozen.pb
        - Path for generated IR:        /home/ubuntu/models/unet/IR_models/FP32
        - IR output name:       saved_model
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         [1,160,160,4]
        - Mean values:  Not specified
        - Scale values:         Not specified
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       False
TensorFlow specific parameters:
        - Input model in text protobuf format:  False
        - Path to model dump for TensorBoard:   None
        - List of shared libraries with TensorFlow custom layers implementation:        None
        - Update the configuration file with input/output node names:   None
        - Use configuration file used to generate the model with Object Detection API:  None
        - Operations to offload:        None
        - Patterns to offload:  None
        - Use the config file:  None
Model Optimizer version:        2020.1.0-61-gd349c3ba4a

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/ubuntu/models/unet/IR_models/FP32/saved_model.xml
[ SUCCESS ] BIN file: /home/ubuntu/models/unet/IR_models/FP32/saved_model.bin
[ SUCCESS ] Total execution time: 6.41 seconds.
[ SUCCESS ] Memory consumed: 443 MB.
```
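
The `--input_shape` above is given in TensorFlow's NHWC order (batch, height, width, channels), while the benchmark log later in this tutorial reports the network input as NCHW `1 4 160 160`. If you prepare input arrays yourself, the reordering can be sketched with NumPy as follows; the random array is only a stand-in for real MR images.

```python
import numpy as np

# Stand-in for one 160x160 image with 4 channels, in NHWC order as used by TensorFlow.
nhwc = np.random.rand(1, 160, 160, 4).astype(np.float32)

# Move the channel axis from the last position to the second: NHWC -> NCHW.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

print(nhwc.shape)  # (1, 160, 160, 4)
print(nchw.shape)  # (1, 4, 160, 160)
```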

## Benchmark

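Benchmark the FP32 IR model with the following command. Here `-m` points to the IR `.xml` file, `-nireq 1` uses a single inference request, and `-nstreams 1` uses a single CPU stream: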
```
python3 /opt/intel/openvino/deployment_tools/tools/benchmark_tool/benchmark_app.py \
-m /home/ubuntu/models/unet/IR_models/FP32/saved_model.xml \
-nireq 1 -nstreams 1
```
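
To compare several settings, the same command can be generated for different values of `-nireq` and `-nstreams`. The sketch below only builds the command lines; the `benchmark_commands` helper and the values tried are illustrative assumptions, not recommended settings.

```python
import itertools

BENCHMARK_APP = "/opt/intel/openvino/deployment_tools/tools/benchmark_tool/benchmark_app.py"
MODEL_XML = "/home/ubuntu/models/unet/IR_models/FP32/saved_model.xml"


def benchmark_commands(nireq_values, nstreams_values):
    """Yield one benchmark_app argument list per (nireq, nstreams) pair."""
    for nireq, nstreams in itertools.product(nireq_values, nstreams_values):
        yield [
            "python3", BENCHMARK_APP,
            "-m", MODEL_XML,
            "-nireq", str(nireq),
            "-nstreams", str(nstreams),
        ]


commands = list(benchmark_commands(nireq_values=[1, 2], nstreams_values=[1, 2]))
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to run that configuration.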

#### Sample Output
```
[Step 1/11] Parsing and validating input arguments
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version............. 2.1.37988
[ INFO ] Device info
         CPU
         MKLDNNPlugin............ version 2.1
         Build................... 37988

[Step 3/11] Reading the Intermediate Representation network
[Step 4/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1, precision: MIXED
[Step 5/11] Configuring input of the model
[Step 6/11] Setting device configuration
[Step 7/11] Loading the model to the device
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'MRImages' precision FP32, dimensions (NCHW): 1 4 160 160
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'MRImages' with random values (some binary data is expected)
[Step 10/11] Measuring performance (Start inference asyncronously, 1 inference requests using 1 streams for CPU, limits: 60000 ms duration)
[Step 11/11] Dumping statistics report
Count:      11079 iterations
Duration:   60014.36 ms
Latency:    5.11 ms
Throughput: 184.61 FPS
```
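
The reported throughput can be cross-checked from the iteration count and the duration. The sketch below parses the four summary lines of the sample output and recomputes frames per second, assuming the report keeps this `Name: value unit` layout and a batch size of 1; the `parse_report` helper is hypothetical.

```python
import re

SAMPLE_REPORT = """\
Count:      11079 iterations
Duration:   60014.36 ms
Latency:    5.11 ms
Throughput: 184.61 FPS
"""


def parse_report(text):
    """Extract the numeric value of each 'Name: value unit' summary line."""
    pattern = re.compile(r"^(Count|Duration|Latency|Throughput):\s+([\d.]+)", re.MULTILINE)
    return {name.lower(): float(value) for name, value in pattern.findall(text)}


report = parse_report(SAMPLE_REPORT)

# Throughput in FPS = iterations / duration in seconds (batch size is 1 here).
recomputed_fps = report["count"] / (report["duration"] / 1000.0)

print(report)
print(round(recomputed_fps, 2))  # 184.61
```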