# Human Pose Inference using ONNX Runtime

In this example notebook, we describe how to run inference with a pre-trained human pose estimation model using the ***ONNX Runtime*** interface.
   - The user can choose the model (see the section titled *Choosing a Pre-Compiled Model*).
   - The models used in this example were trained on the ***COCO*** dataset, which is widely used for training and benchmarking human pose estimation models.
   - We perform inference on a few sample images, which can be chosen from a drop-down list.
   - We also describe the input preprocessing and output postprocessing steps, and demonstrate how to collect benchmarking statistics and visualize the data.

## Choosing a Pre-Compiled Model
We provide a set of pre-compiled artifacts to use with this notebook. They appear as a drop-down list once the second code cell is executed.

<img src=docs/images/drop_down.PNG width="400">
 
    
## ONNX Runtime based workflow

The diagram below describes the steps of the ONNX Runtime based workflow.

Note:
- The user needs to compile models (sub-graph creation and quantization) on a PC to generate model artifacts.
    - For this notebook, we use pre-compiled model artifacts.
- The generated artifacts can then be used to run inference on the target.
- Users can run this notebook as-is; the only actions required are to select a model and a sample image for inference.

<img src=docs/images/onnx_work_flow_2.png width="400">

In [None]:
import os
import cv2
import sys
import numpy as np
import onnxruntime as rt
import ipywidgets as widgets
from scripts.utils import get_eval_configs

last_artifacts_id = selected_model_id.value if "selected_model_id" in locals() else None
prebuilt_configs, selected_model_id = get_eval_configs('keypoint_detection','onnxrt', num_quant_bits = 8, last_artifacts_id = last_artifacts_id)
display(selected_model_id)

In [None]:
print(f'Selected Model: {selected_model_id.label}')
config = prebuilt_configs[selected_model_id.value]
config['session'].set_param('model_id', selected_model_id.value)
config['session'].start()

## Define utility function to preprocess input images
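Below, we define a utility function to preprocess images for human pose estimation. This function takes an image path as input, loads the image, and preprocesses it for generic ***ONNX*** inference. The steps are as follows:

 1. load the image
 2. convert the BGR image to RGB
 3. scale the image, preserving its aspect ratio, so that the longer edge matches the model input size (512 pixels for most models), then pad it to the full input size with the constant `pad_color` (default 114)
 4. apply per-channel mean subtraction and pixel scaling
 5. reverse the channel order again if `reverse_channels` is set
 6. add a batch dimension and transpose the result to NCHW layout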
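
The resize-and-pad arithmetic from the list above can be checked by hand before running the full function. The sketch below mirrors that arithmetic without touching any image data; the helper name `letterbox_geometry` is hypothetical and is not part of `scripts.utils`.

```python
def letterbox_geometry(height, width, target=512, pad_type="center"):
    """Return the resize ratio, the resized (h, w), and the
    (top, bottom, left, right) padding for a square target size.

    Hypothetical helper for illustration only.
    """
    # scale so that the longer edge equals the target size
    ratio = float(target) / max(height, width)
    new_h, new_w = int(height * ratio), int(width * ratio)

    # remaining pixels that must be filled with the pad color
    delta_h, delta_w = target - new_h, target - new_w

    if pad_type == "corner":
        # image stays in the top-left corner
        top, left = 0, 0
        bottom, right = delta_h, delta_w
    else:
        # image is centered; odd remainders go to bottom/right
        top, bottom = delta_h // 2, delta_h - delta_h // 2
        left, right = delta_w // 2, delta_w - delta_w // 2

    return ratio, (new_h, new_w), (top, bottom, left, right)


# a 480x640 (height x width) image letterboxed to 512x512
ratio, new_size, pads = letterbox_geometry(480, 640)
print(ratio, new_size, pads)
```

For a 480x640 image, the resized shape is `(384, 512)` and the 128 missing rows are split into 64 rows of padding above and 64 below. The `top`, `left`, and `ratio` values are the ones needed later to map keypoints back to the original image.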


In [None]:
def preprocess_for_onnx_pose_estimation(image_path, size, mean, scale, layout, reverse_channels, pad_color=114, pad_type="center"):
    # Step 1
    # read the image using OpenCV
    img = cv2.imread(image_path)
    
    # Step 2
    # convert to RGB
    img = img[:,:,::-1]
    
    # Step 3    
    # Most of the onnx models are trained using
    # 512x512 images. The general rule of thumb
    # is to scale the input image while preserving
    # the original aspect ratio so that the
    # longer edge is 512 pixels, and then
    # pad the scaled image to 512x512
    
    size = (size,size) if not isinstance(size, (list,tuple)) else size
    desired_size = size[-1]
    old_size = img.shape[:2] # old_size is in (height, width) format

    ratio = float(desired_size)/max(old_size)
    new_size = tuple([int(x*ratio) for x in old_size])

    # new_size should be in (width, height) format
    img = cv2.resize(img, (new_size[1], new_size[0]))

    delta_w = size[1] - new_size[1]
    delta_h = size[0] - new_size[0]

    if pad_type=="corner":
        # keep the image in the top-left corner; pad only bottom and right
        top, left = 0, 0
        bottom, right = delta_h, delta_w
    else:
        # center the image; split the padding between both sides
        top, bottom = delta_h//2, delta_h-(delta_h//2)
        left, right = delta_w//2, delta_w-(delta_w//2)


    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT,
        value=pad_color)
    
    # Step 4
    # Apply scaling and mean subtraction.
    # if your model is built with an input
    # normalization layer, then you might
    # need to skip this
    if mean is not None and scale is not None:
        img = img.astype('float32')
        # use distinct loop names so the mean/scale arguments are not shadowed
        for ch, ch_mean, ch_scale in zip(range(img.shape[2]), mean, scale):
            img[:,:,ch] = (img[:,:,ch] - ch_mean) * ch_scale
            
    # Step 5
    if reverse_channels:
        img = img[:,:,::-1]
    
    # Step 6
    img = np.expand_dims(img,axis=0)
    img = np.transpose(img, (0, 3, 1, 2))
    
    return img, top, left, ratio

In [None]:
from scripts.utils import get_preproc_props

size, mean, scale, layout, reverse_channels = get_preproc_props(config)
print(f'Image size: {size}')

## Create the model using the stored artifacts
<div class="alert alert-block alert-warning">
<b>Warning:</b> It is recommended to use the ONNX Runtime APIs in the cells below without any modifications.
</div>

In [None]:
import onnxruntime as rt

onnx_model_path = config['session'].get_param('model_file')
delegate_options = {}
so = rt.SessionOptions()
delegate_options['artifacts_folder'] = config['session'].get_param('artifacts_folder')

EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
sess = rt.InferenceSession(onnx_model_path ,providers=EP_list, provider_options=[delegate_options, {}], sess_options=so)

input_details = sess.get_inputs()
output_details = sess.get_outputs()

print('onnx_model_path:',onnx_model_path)
print('artifacts_folder:',config['session'].get_param('artifacts_folder'))

## Run the model for inference

### Preprocessing and Inference

   - You can use any of the images provided in the `sample-images/pose` directory to evaluate pose estimation inference. In the cell below, we use a drop-down list to select the image, preprocess it, and provide it as the input to the network.

### Postprocessing and Visualization

 - Once the inference results are available, we postprocess the results and visualize the pose estimation for the input image.
 - Pose estimation models return the results as a list of `numpy.ndarray` containing one element, an array with `shape` = `(34,128,128)` and `dtype` = `'float32'`. The first 17 channels are the heatmaps for the keypoints, and the last 17 channels are the tag values for each keypoint. The results are postprocessed using the `single_img_visualise` function from `scripts.utils` to get the output image with the pose drawn on it.
 - Then, in this notebook, we use *matplotlib* to plot the final image.
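
To make the output layout described above more concrete, here is a minimal sketch of a naive single-person decode: it takes the peak of each of the 17 heatmaps and maps it back to original image coordinates using the `top`, `left`, and `ratio` values returned by the preprocessing function. This is not the method used by `single_img_visualise`; the function name `decode_single_person` is hypothetical, the tag channels are ignored, and the sketch assumes the heatmap stride is the input size divided by the heatmap width.

```python
import numpy as np


def decode_single_person(output, input_size, top, left, ratio):
    """Naive argmax decode of a (34, H, W) pose output.

    Hypothetical helper for illustration only: returns a (17, 3) array of
    (x, y, score) in original image coordinates, assuming one person.
    """
    heatmaps = output[:17]            # first 17 channels: keypoint heatmaps
    num_kpts, h, w = heatmaps.shape   # remaining channels (tags) are unused here

    # locate the strongest response in each heatmap
    flat = heatmaps.reshape(num_kpts, -1)
    idx = flat.argmax(axis=1)
    scores = flat.max(axis=1)
    ys, xs = np.unravel_index(idx, (h, w))

    # heatmap cell -> padded network input pixel (assumed uniform stride)
    stride = input_size / w
    x_in, y_in = xs * stride, ys * stride

    # undo the padding offset, then undo the resize
    x_orig = (x_in - left) / ratio
    y_orig = (y_in - top) / ratio
    return np.stack([x_orig, y_orig, scores], axis=1)


# synthetic output: every heatmap peaks at row 40, column 60
fake_output = np.zeros((34, 128, 128), dtype=np.float32)
fake_output[:17, 40, 60] = 1.0
keypoints = decode_single_person(fake_output, 512, top=64, left=0, ratio=0.8)
print(keypoints[0])
```

If the array returned by the session carries a leading batch dimension, it would need to be dropped before calling a helper like this. Multi-person decoding additionally needs the tag channels to group keypoints into individuals, which this sketch does not attempt.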

In [None]:
# choose the image to be used for inference from the drop down list
img_name = widgets.Dropdown(
    options=[('horseman.jpg'), ('ski.jpg'), ('ski_jump.jpg'),('street_walk_2.jpg'),('street_walk_3.jpg')],
    value='ski_jump.jpg',
    description='Inference_Image:',
)
display(img_name)

In [None]:
from scripts.utils import get_preproc_props
# read, preprocess, forward pass on a single chosen image  
label=selected_model_id.label
pad_color = 128 if 'ae' in label and 'yolo' not in label else 114
pad_type = "corner" if 'yolox' in label else "center"
image_name = os.path.join('sample-images','pose',img_name.value)
processed_image, top, left, ratio = preprocess_for_onnx_pose_estimation(image_name, size, mean, scale, layout, reverse_channels, pad_color, pad_type)
#size, mean, scale, layout, reverse_channels = get_preproc_props(config)    
#print(f'Image size: {size}')

#processed_image = preprocess(image_name , size, mean, scale, layout, reverse_channels)

if not input_details[0].type == 'tensor(float)':
    processed_image = np.uint8(processed_image)
image_size = processed_image.shape[3]    
out_file=None
output=None
output = list(sess.run(None, {input_details[0].name : processed_image}))

In [None]:
#postprocessing on a single image
from scripts.utils import single_img_visualise
import matplotlib.pyplot as plt
%matplotlib inline
output_image = single_img_visualise(output,image_size,image_name,out_file, top, left, ratio, udp=True, thickness=2, radius=5,label=label)
# plot the output using matplotlib
plt.rcParams["figure.figsize"]=20,20
plt.rcParams['figure.dpi'] = 200 # 200 e.g. is really fine, but slower
plt.imshow(output_image)
plt.show()

## Plot Inference benchmarking statistics

 - During model execution, several benchmarking statistics, such as timestamps at different checkpoints and DDR bandwidth, are collected and stored. `get_TI_benchmark_data()` can be used to collect these statistics. This function returns a dictionary of `annotations` and the corresponding markers.
 - We provide the utility function `plot_TI_performance_data` to visualize these benchmark KPIs.

<div class="alert alert-block alert-info">
<b>Note:</b> The values represented by <i>Inferences Per Second</i> and <i>Inference Time Per Image</i> use the total time taken by the inference, excluding the time taken for copying inputs and outputs. In a performance-oriented system, these copy operations can be bypassed by writing the data directly into shared memory and performing on-the-fly input / output normalization.
</div>
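
The relationship between the two figures in the note can be written out as simple arithmetic. The sketch below is an illustration only: `inference_kpis` is a hypothetical helper, not part of `scripts.utils`, and it assumes that the total run time and the copy times are already available in milliseconds.

```python
def inference_kpis(total_ms, copy_in_ms=0.0, copy_out_ms=0.0):
    """Return (inference time per image in ms, inferences per second).

    Hypothetical helper: all arguments are assumed to be in milliseconds.
    """
    # exclude the time spent copying inputs and outputs
    proc_ms = total_ms - copy_in_ms - copy_out_ms
    if proc_ms <= 0:
        raise ValueError("processing time must be positive")
    # one image per run, so fps is the reciprocal of the time in seconds
    return proc_ms, 1000.0 / proc_ms


# 12 ms end to end, of which 2 ms is spent copying data
proc_ms, fps = inference_kpis(12.0, copy_in_ms=1.5, copy_out_ms=0.5)
print(f"{proc_ms:.2f} ms per image, {fps:.2f} fps")
```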


In [None]:
sys.path.append("..") 
from scripts import utils
from scripts.utils import plot_TI_performance_data, plot_TI_DDRBW_data, get_benchmark_output
stats = sess.get_TI_benchmark_data()
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(10,5))
plot_TI_performance_data(stats, axis=ax)
plt.show()

tt, st, rb, wb = get_benchmark_output(stats)
print(f'Statistics : \n Inferences Per Second   : {1000.0/tt :7.2f} fps')
print(f' Inference Time Per Image : {tt :7.2f} ms  \n DDR BW Per Image        : {rb+ wb : 7.2f} MB')