# DPU example: Yolo_v3

This notebook shows how to run a YOLO-based object detection application. The application, as well as the DPU IP, is pulled from the official [Vitis AI Github Repository](https://github.com/Xilinx/Vitis-AI).
For more information, please refer to the [Xilinx Vitis AI page](https://www.xilinx.com/products/design-tools/vitis/vitis-ai.html).

In this notebook we will be using the DNNDK **Python API** to run the DPU tasks.

## 1. Prepare the overlay
We will download the overlay onto the board. Then we will load the 
corresponding DPU model.

In [1]:
from pynq_dpu import DpuOverlay
overlay = DpuOverlay("dpu.bit")
overlay.load_model("dpu_tf_yolov3.elf")

## 2. Constants and helper functions 

You can view all of the helper functions in [DNNDK yolo example](https://github.com/Xilinx/Vitis-AI/blob/v1.1/mpsoc/vitis_ai_dnndk_samples/tf_yolov3_voc_py/tf_yolov3_voc.py). 
The helper functions released along with Vitis AI cover pre-processing of 
the images, so they can be normalized and resized to be compatible with 
the DPU model. These functions are included in our `pynq_dpu` package.

In [2]:
import numpy as np
import random
import cv2
import colorsys
from PIL import Image
from IPython import display
from matplotlib import pyplot as plt
import time
%matplotlib inline
from pynq_dpu.edge.dnndk.tf_yolov3_voc_py.tf_yolov3_voc import *


### Constants

YOLO v2 and v3 predict box offsets relative to a predetermined set of boxes 
with particular height-width ratios; these predetermined boxes are called 
anchor boxes. We will use the predefined [anchors](https://github.com/Xilinx/Vitis-AI/blob/v1.1/mpsoc/vitis_ai_dnndk_samples/tf_yolov3_voc_py/model_data/yolo_anchors.txt).

In [3]:
anchor_list = [10,13,16,30,33,23,30,61,62,45,59,119,116,90,156,198,373,326]
anchor_float = [float(x) for x in anchor_list]
anchors = np.array(anchor_float).reshape(-1, 2)
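
To make the reshaping concrete, the following sketch pairs the flat list into (width, height) tuples using plain Python and groups them by detection scale. The grouping uses the same index sets that appear as `anchor_mask` in the `evaluate` function later in this notebook; the names `anchor_pairs` and `anchors_per_scale` are introduced here only for illustration.

```python
anchor_list = [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119,
               116, 90, 156, 198, 373, 326]

# Pair consecutive values into (width, height) tuples, mirroring reshape(-1, 2).
anchor_pairs = list(zip(anchor_list[0::2], anchor_list[1::2]))

# Index sets copied from anchor_mask in evaluate(): the first output
# is matched with the three largest anchors, the last with the smallest.
anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
anchors_per_scale = [[anchor_pairs[i] for i in mask] for mask in anchor_mask]

for scale, group in enumerate(anchors_per_scale):
    print(scale, group)
```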

We will use the `get_class()` function in the `tf_yolov3_voc` module to
get the class names from the predefined [class names](https://github.com/Xilinx/Vitis-AI/blob/v1.1/mpsoc/vitis_ai_dnndk_samples/tf_yolov3_voc_py/image/voc_classes.txt) file.

In [4]:
classes_path = "voc_classes.txt"
class_names = get_class(classes_path)
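
A class-name file of this kind lists one label per line. The sketch below shows one plausible way such a file could be read; `read_class_names` is a hypothetical stand-in written for illustration and is not claimed to match the implementation of `get_class()`. The three labels written to the temporary file are a small sample, not the full class list.

```python
import os
import tempfile

def read_class_names(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Write a tiny sample file so the example runs without voc_classes.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("aeroplane\nbicycle\nbird\n")
    sample_path = tmp.name

sample_names = read_class_names(sample_path)
os.remove(sample_path)
print(sample_names)
```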

Depending on the number of classes, we will define a unique color for each
class.

In [5]:
num_classes = len(class_names)
hsv_tuples = [(1.0 * x / num_classes, 1., 1.) for x in range(num_classes)]
colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
colors = list(map(lambda x: 
                  (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), 
                  colors))
random.seed(0)
random.shuffle(colors)
random.seed(None)

We can define some DPU-related parameters, such as DPU kernel name and
input/output node names.

In [6]:
KERNEL_CONV="tf_yolov3"
CONV_INPUT_NODE="conv2d_1_convolution"
CONV_OUTPUT_NODE1="conv2d_59_convolution"
CONV_OUTPUT_NODE2="conv2d_67_convolution"
CONV_OUTPUT_NODE3="conv2d_75_convolution"
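
The three output nodes correspond to three detection scales. As a rough check of the tensor shapes used when the outputs are read back later in this notebook, the sketch below derives them assuming a 416x416 input, strides of 32, 16 and 8, three anchors per scale, and 20 classes. The strides and the per-anchor layout (4 box values, 1 objectness score, then one score per class) are the usual YOLOv3 conventions and are stated here as assumptions rather than read from the model file.

```python
INPUT_SIZE = 416          # network input height and width used in this notebook
STRIDES = (32, 16, 8)     # assumed downsampling factor of each output scale
ANCHORS_PER_SCALE = 3     # three anchors are assigned to each scale
NUM_CLASSES = 20          # number of Pascal VOC classes

# Each anchor predicts 4 box values, 1 objectness score and one score per class.
channels = ANCHORS_PER_SCALE * (4 + 1 + NUM_CLASSES)

output_shapes = [(1, INPUT_SIZE // s, INPUT_SIZE // s, channels) for s in STRIDES]
for shape in output_shapes:
    print(shape)
```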

### Drawing bounding boxes
We now define a custom function that draws the bounding boxes around 
the identified objects after we have the classification results.

In [7]:
def draw_boxes(image, boxes, scores, classes):
    font = cv2.FONT_HERSHEY_SIMPLEX
    font_scale = 3
    thickness = 10
    color = (0, 255, 0)
    # Only these classes are drawn, and only above a 0.6 confidence score.
    wanted = {"person", "car", "bicycle", "bus", "dog", "motorbike", "cat"}
    for bbox, score, class_index in zip(boxes, scores, classes):
        name = class_names[class_index]
        if score > .6 and name in wanted:
            # OpenCV drawing functions expect integer pixel coordinates.
            top, left, bottom, right = (int(v) for v in bbox)
            label = '{}: {:.4f}'.format(name, score)
            cv2.rectangle(image, (left, top), (right, bottom), color, thickness)
            cv2.putText(image, label, (left, top - 5), font, font_scale,
                        color, thickness, cv2.LINE_AA)
    return image

### Predicting classes
We need to define a function that evaluates the scores and makes predictions
based on the provided class names.

In [8]:
def evaluate(yolo_outputs, image_shape, class_names, anchors):
    score_thresh = 0.2
    anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    boxes = []
    box_scores = []
    input_shape = np.shape(yolo_outputs[0])[1 : 3]
    input_shape = np.array(input_shape)*32

    for i in range(len(yolo_outputs)):
        _boxes, _box_scores = boxes_and_scores(
            yolo_outputs[i], anchors[anchor_mask[i]], len(class_names), 
            input_shape, image_shape)
        boxes.append(_boxes)
        box_scores.append(_box_scores)
    boxes = np.concatenate(boxes, axis = 0)
    box_scores = np.concatenate(box_scores, axis = 0)

    mask = box_scores >= score_thresh
    boxes_ = []
    scores_ = []
    classes_ = []
    for c in range(len(class_names)):
        class_boxes_np = boxes[mask[:, c]]
        class_box_scores_np = box_scores[:, c]
        class_box_scores_np = class_box_scores_np[mask[:, c]]
        nms_index_np = nms_boxes(class_boxes_np, class_box_scores_np) 
        class_boxes_np = class_boxes_np[nms_index_np]
        class_box_scores_np = class_box_scores_np[nms_index_np]
        classes_np = np.ones_like(class_box_scores_np, dtype = np.int32) * c
        boxes_.append(class_boxes_np)
        scores_.append(class_box_scores_np)
        classes_.append(classes_np)
    boxes_ = np.concatenate(boxes_, axis = 0)
    scores_ = np.concatenate(scores_, axis = 0)
    classes_ = np.concatenate(classes_, axis = 0)

    return boxes_, scores_, classes_
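
`nms_boxes` comes from the helper module and performs non-maximum suppression, which keeps the highest-scoring box and discards others that overlap it heavily. The following is a minimal pure-Python sketch of the greedy form of that idea, not the helper's actual code; the function names `iou` and `greedy_nms` and the 0.5 overlap threshold are assumptions chosen for illustration. Boxes are given as `[top, left, bottom, right]`, matching the order unpacked in `draw_boxes`.

```python
def iou(a, b):
    """Intersection over union of two [top, left, bottom, right] boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)
```

In `evaluate`, suppression is applied separately for each class, so overlapping boxes of different classes do not suppress each other.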


## 3. Run application

We create the DPU kernel and task.

In [9]:
n2cube.dpuOpen()
kernel = n2cube.dpuLoadKernel(KERNEL_CONV)
task = n2cube.dpuCreateTask(kernel, 0)
input_len = n2cube.dpuGetInputTensorSize(task, CONV_INPUT_NODE)

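Now we execute the DPU task in a loop. Each iteration captures a still image from a Ricoh Theta camera over its HTTP API, runs object detection on it, and appends the annotated frame to a video file.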

In [10]:
from IPython import display
import requests 
import json
import shutil

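# NOTE: `cap` is not defined anywhere in this notebook. It is assumed to be an
# already opened cv2.VideoCapture whose frame size matches the camera images.
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 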
   
size = (frame_width, frame_height) 

result = cv2.VideoWriter('myvideo.avi',  
                         cv2.VideoWriter_fourcc(*'MJPG'), 
                         30, size) 
while True:
    try:
        # Ricoh Theta API Take, Read and Delete Image
        API_ENDPOINT = "http://192.168.1.1:80/osc/commands/execute"
        data = {"name":"camera.takePicture"} 
        headers={"Content-Type":"application/json"}

        # sending post request and saving response as response object 
        r = requests.post(url = API_ENDPOINT, data = json.dumps(data),headers=headers) 

        # extracting response text 
        pastebin_url = r.text 

        print("Take Picture Complete") 

        idp=json.loads(r.text)["id"]

        print("Checking Url") 

        API_ENDPOINT = "http://192.168.1.1:80/osc/commands/status"

        data = {"id":idp} 

        headers={"Content-Type":"application/json"}

        # sending post request and saving response as response object 
        r = requests.post(url = API_ENDPOINT, data = json.dumps(data),headers=headers) 

        # extracting response text 
        pastebin_url = json.loads(r.text) 

        while (json.loads(r.text)["state"]!= "done"):
            r = requests.post(url = API_ENDPOINT, data = json.dumps(data),headers=headers)
            pastebin_url = json.loads(r.text)
            print(".", end = '')
        print("")
        
        print("Check Url Complete") 

        img_url = pastebin_url["results"]["fileUrl"]

        with open('image.png', 'wb') as output_file,\
            requests.get(img_url, stream=True) as response:
            shutil.copyfileobj(response.raw, output_file)


        print("Save Image Complete") 

        API_ENDPOINT = "http://192.168.1.1:80/osc/commands/execute"

        # delete the captured image from the camera's internal storage 
        data = { 
            "name": "camera.delete", 
            "parameters": 
            {
                "fileUrls":
                [img_url]

            }
        }

        headers={"Content-Type":"application/json"}

        # sending post request and saving response as response object 
        r = requests.post(url = API_ENDPOINT, data = json.dumps(data),headers=headers) 

        # extracting response text 
        pastebin_url = r.text 

        print("Delete Image From Internal Storage Complete") 
        
        frame = cv2.imread("image.png") 

        # Start Time to Check FPS
        start_time = time.time()
        
        image = frame
        image_size = image.shape[:2]
        image_data = np.array(pre_process(image, (416, 416)), dtype=np.float32)

        n2cube.dpuSetInputTensorInHWCFP32(
            task, CONV_INPUT_NODE, image_data, input_len)

        n2cube.dpuRunTask(task)

        conv_sbbox_size = n2cube.dpuGetOutputTensorSize(task, CONV_OUTPUT_NODE1)
        conv_out1 = n2cube.dpuGetOutputTensorInHWCFP32(task, CONV_OUTPUT_NODE1, 
                                                       conv_sbbox_size)
        conv_out1 = np.reshape(conv_out1, (1, 13, 13, 75))

        conv_mbbox_size = n2cube.dpuGetOutputTensorSize(task, CONV_OUTPUT_NODE2)
        conv_out2 = n2cube.dpuGetOutputTensorInHWCFP32(task, CONV_OUTPUT_NODE2, 
                                                       conv_mbbox_size)
        conv_out2 = np.reshape(conv_out2, (1, 26, 26, 75))

        conv_lbbox_size = n2cube.dpuGetOutputTensorSize(task, CONV_OUTPUT_NODE3)
        conv_out3 = n2cube.dpuGetOutputTensorInHWCFP32(task, CONV_OUTPUT_NODE3, 
                                                       conv_lbbox_size)
        conv_out3 = np.reshape(conv_out3, (1, 52, 52, 75))

        yolo_outputs = [conv_out1, conv_out2, conv_out3] 

        boxes, scores, classes = evaluate(yolo_outputs, image_size, 
                                      class_names, anchors)
        print("FPS: ", 1.0 / (time.time() - start_time))
        image = draw_boxes(image, boxes, scores, classes)

        result.write(image)

        print(".", end = '')
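    except KeyboardInterrupt:
        # Stop cleanly when the kernel is interrupted by the user.
        print("Interrupted")
        break
    except Exception as e:
        # Report the error instead of silently swallowing it.
        print("Stopping:", e)
        break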
    
result.release() 
cv2.destroyAllWindows()
n2cube.dpuDestroyTask(task)
n2cube.dpuDestroyKernel(kernel)
print("ok")
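
The capture sequence inside the loop (take a picture, poll until the command finishes, then read the file URL) can be factored into a small helper. The sketch below is one possible arrangement rather than the notebook's own code: `capture_file_url`, its `post` argument, and the `poll_interval` and `timeout` parameters are hypothetical names, and the pause and timeout between polls are additions that the loop above does not have. `post` stands for any callable that sends a JSON payload to a URL and returns the decoded reply, which keeps the example runnable without a camera; the endpoints and field names are those used in the cell above.

```python
import time

BASE_URL = "http://192.168.1.1:80/osc/commands"

def capture_file_url(post, poll_interval=0.5, timeout=30.0):
    """Trigger a capture and return the file URL once the command is done."""
    command_id = post(BASE_URL + "/execute", {"name": "camera.takePicture"})["id"]
    deadline = time.monotonic() + timeout
    while True:
        status = post(BASE_URL + "/status", {"id": command_id})
        if status["state"] == "done":
            return status["results"]["fileUrl"]
        if time.monotonic() > deadline:
            raise TimeoutError("camera did not finish the capture in time")
        time.sleep(poll_interval)  # pause between polls instead of busy-waiting

# Stand-in for the camera with made-up replies: one execute reply,
# then a status reply that is still in progress, then one that is done.
replies = iter([
    {"id": "1"},
    {"id": "1", "state": "inProgress"},
    {"id": "1", "state": "done",
     "results": {"fileUrl": "http://192.168.1.1/files/example.JPG"}},
])
calls = []

def fake_post(url, payload):
    calls.append(url)
    return next(replies)

file_url = capture_file_url(fake_post, poll_interval=0.0)
print(file_url)
```

With the `requests` library, `post` could be `lambda url, payload: requests.post(url, json=payload).json()`.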
