# Evaluate YOLO v3 on Inferentia
## Note: this tutorial runs on tensorflow-neuron 1.x only

## Introduction
This tutorial walks through compiling and evaluating YOLO v3 model on Inferentia using the AWS Neuron SDK.


In this tutorial we provide two main sections:

1. Download Dataset and Generate Pretrained SavedModel

2. Compile the YOLO v3 model.

3. Deploy the same compiled model.

Before running the following verify this Jupyter notebook is running “conda_aws_neuron_tensorflow_p36” kernel. You can select the Kernel from the “Kernel -> Change Kernel” option on the top of this Jupyter notebook page.

Instructions of how to setup Neuron Tensorflow environment and run the tutorial as a Jupyter notebook are available in the Tutorial main page [Tensorflow-YOLO_v3 Tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/tensorflow-neuron/tutorials/yolo_v3_demo/yolo_v3_demo.html)

## Prerequisites


This demo requires the following pip packages:

`pillow matplotlib pycocotools`


In [2]:

import sys
!{sys.executable} -m pip install pillow matplotlib pycocotools==2.0.2 --force --extra-index-url=https://pip.repos.neuron.amazonaws.com
    

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com, https://pip.repos.neuron.amazonaws.com
Collecting pillow
  Downloading Pillow-8.3.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
[K     |████████████████████████████████| 3.0 MB 6.2 MB/s eta 0:00:01
[?25hCollecting matplotlib
  Downloading matplotlib-3.3.4-cp36-cp36m-manylinux1_x86_64.whl (11.5 MB)
[K     |████████████████████████████████| 11.5 MB 144.1 MB/s eta 0:00:01
[?25hCollecting pycocotools==2.0.2
  Downloading pycocotools-2.0.2.tar.gz (23 kB)
Collecting setuptools>=18.0
  Downloading setuptools-58.1.0-py3-none-any.whl (816 kB)
[K     |████████████████████████████████| 816 kB 113.3 MB/s eta 0:00:01
[?25hCollecting cython>=0.27.3
  Downloading Cython-0.29.24-cp36-cp36m-manylinux1_x86_64.whl (2.0 MB)
[K     |████████████████████████████████| 2.0 MB 137.8 MB/s eta 0:00:01
[?25hCollecting python-dateutil>=2.1
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247

## Part 1:  Download Dataset and Generate Pretrained SavedModel
### Download COCO 2017 validation dataset

We start by downloading the COCO validation dataset, which we will use to validate our model. The COCO 2017 dataset is widely used for object-detection, segmentation and image captioning.

In [None]:
!curl -LO http://images.cocodataset.org/zips/val2017.zip
!curl -LO http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!unzip -q val2017.zip
!unzip annotations_trainval2017.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  777M  100  777M    0     0  22.1M      0  0:00:35  0:00:35 --:--:-- 22.7M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  241M  100  241M    0     0  21.7M      0  0:00:11  0:00:11 --:--:-- 22.8M
replace val2017/000000212226.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: 

In [None]:
!ls


## Generate YOLO v3 tensorflow SavedModel (pretrained on COCO 2017 dataset)

Script yolo_v3_coco_saved_model.py will generate a tensorflow SavedModel using pretrained weights from https://github.com/YunYang1994/tensorflow-yolov3/releases/download/v1.0/yolov3_coco.tar.gz.

In [None]:
%run yolo_v3_coco_saved_model.py ./yolo_v3_coco_saved_model

This tensorflow SavedModel can be loaded as a tensorflow predictor. When a JPEG format image is provided as input, the output result of the tensorflow predictor contains information for drawing bounding boxes and classification results.

In [None]:
import json
import tensorflow as tf
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# launch predictor and run inference on an arbitrary image in the validation dataset
yolo_pred_cpu = tf.contrib.predictor.from_saved_model('./yolo_v3_coco_saved_model')
image_path = './val2017/000000581781.jpg'
with open(image_path, 'rb') as f:
    feeds = {'image': [f.read()]}
results = yolo_pred_cpu(feeds)

# load annotations to decode classification result
with open('./annotations/instances_val2017.json') as f:
    annotate_json = json.load(f)
label_info = {idx+1: cat['name'] for idx, cat in enumerate(annotate_json['categories'])}

# draw picture and bounding boxes
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(Image.open(image_path).convert('RGB'))
wanted = results['scores'][0] > 0.1
for xyxy, label_no_bg in zip(results['boxes'][0][wanted], results['classes'][0][wanted]):
    xywh = xyxy[0], xyxy[1], xyxy[2] - xyxy[0], xyxy[3] - xyxy[1]
    rect = patches.Rectangle((xywh[0], xywh[1]), xywh[2], xywh[3], linewidth=1, edgecolor='g', facecolor='none')
    ax.add_patch(rect)
    rx, ry = rect.get_xy()
    rx = rx + rect.get_width() / 2.0
    ax.annotate(label_info[label_no_bg + 1], (rx, ry), color='w', backgroundcolor='g', fontsize=10,
                ha='center', va='center', bbox=dict(boxstyle='square,pad=0.01', fc='g', ec='none', alpha=0.5))
plt.show()

## Part 2:  Compile the Pretrained SavedModel for Neuron

We make use of the Python compilation API `tfn.saved_model.compile` that is available in `tensorflow-neuron<2`. For the purpose of reducing Neuron runtime overhead, it is necessary to make use of arguments `no_fuse_ops` and `minimum_segment_size`.
Compiled model is saved in ./yolo_v3_coco_saved_model_neuron.

In [10]:
import shutil
import tensorflow as tf
import tensorflow.neuron as tfn
import os

model_type = 'yolo_v3_coco'

def no_fuse_condition(op):
    return op.name.startswith('Preprocessor') or op.name.startswith('Postprocessor')

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], './yolo_v3_coco_saved_model')
    no_fuse_ops = [op.name for op in sess.graph.get_operations() if no_fuse_condition(op)]
def compile_inf1_model(saved_model_dir, inf1_model_dir, batch_size=1, num_cores=1, use_static_weights=False):
    
    compiled_model_dir = f'{model_type}_batch_{batch_size}_inf1_cores_{num_cores}'
    inf1_compiled_model_dir = os.path.join(inf1_model_dir, compiled_model_dir)
    shutil.rmtree(inf1_compiled_model_dir, ignore_errors=True)
    
    compiler_args = ['--verbose','1', '--neuroncore-pipeline-cores', str(num_cores)]
    
    result = tfn.saved_model.compile(
        './yolo_v3_coco_saved_model', './yolo_v3_coco_inf1_saved_models',
        # to enforce trivial compilable subgraphs to run on CPU
    #     no_fuse_ops=no_fuse_ops,
        minimum_segment_size=100,
        batch_size=batch_size,
        dynamic_batch_size=True,
        compiler_args = compiler_args
    )
    print(result)

INFO:tensorflow:Restoring parameters from ./yolo_v3_coco_saved_model/variables/variables


In [11]:
inf1_model_dir = f'{model_type}_inf1_saved_models'
saved_model_dir = f'{model_type}_saved_model'


# testing batch size
batch_list = [1,2,4,8,16,32,64]
num_of_cores = [1,2,3,4]
for batch in batch_list:
    for core in num_of_cores:
        print('batch size:', batch,'core nums', core,'compile start')
        compile_inf1_model(saved_model_dir, inf1_model_dir, batch_size=batch, num_cores=core)

batch size: 1 core nums 1 compile start
INFO:tensorflow:Restoring parameters from ./yolo_v3_coco_saved_model/variables/variables
INFO:tensorflow:Froze 366 variables.
INFO:tensorflow:Converted 366 variables to const ops.
INFO:tensorflow:fusing subgraph {subgraph neuron_op_d458f099f41f205c with input tensors ["<tf.Tensor 'Preprocessor/map/TensorArrayStack/TensorArrayGatherV30/_0:0' shape=(1, 416, 416, 3) dtype=float16>"], output tensors ["<tf.Tensor 'conv_lbbox/BiasAdd:0' shape=(1, 13, 13, 255) dtype=float16>", "<tf.Tensor 'conv_mbbox/BiasAdd:0' shape=(1, 26, 26, 255) dtype=float16>", "<tf.Tensor 'conv_sbbox/BiasAdd:0' shape=(1, 52, 52, 255) dtype=float16>", "<tf.Tensor 'Postprocessor/map/strided_slice:0' shape=() dtype=int32>", "<tf.Tensor 'Postprocessor/map/TensorArrayUnstack/range:0' shape=(1,) dtype=int32>", "<tf.Tensor 'Postprocessor/map/TensorArrayUnstack_1/range:0' shape=(1,) dtype=int32>", "<tf.Tensor 'Postprocessor/map/TensorArrayUnstack_2/range:0' shape=(1,) dtype=int32>"]} wit



INFO:tensorflow:Number of operations in TensorFlow session: 4962
INFO:tensorflow:Number of operations after tf.neuron optimizations: 3010
INFO:tensorflow:Number of operations placed on Neuron runtime: 1920
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: ./yolo_v3_coco_inf1_saved_models/saved_model.pb
INFO:tensorflow:Successfully converted ./yolo_v3_coco_saved_model to ./yolo_v3_coco_inf1_saved_models
{'OnNeuronRatio': 0.6378737541528239}
batch size: 1 core nums 2 compile start


AssertionError: Export directory already exists. Please specify a different export directory: ./yolo_v3_coco_inf1_saved_models