# Object detection on PYNQ

This notebook demonstrates the typical workflow of our solution. A shared library, `libssd.so`, is first built from C/C++ sources and NN models with the help of [DNNDK](http://www.deephi.com/dnndk.html). `libssd.so` exports the handlers needed to initialize, operate, and terminate the object detection IP (DPU) running on the fabric. The Python notebook then loads the shared library and calls its exported handlers to interact with the fabric. Most of the scheduling work is done inside the shared library; for details, refer to the C++ sources.

Whether you start from scratch (compiling the sources and models into the final binary) or just try our solution with the prebuilt binaries, you need the following:
- A PYNQ board
- OpenCV (C++ version)
- Deep Neural Network Development Kit (DNNDK) from DeePhi

If you have problems configuring DNNDK, please contact [DeePhi](http://www.deephi.com).

Below we illustrate our solution.

## Initialization

Here we import the necessary packages and set up the environment. The shared library is accessed from Python using [cffi](https://cffi.readthedocs.io/); its simplest mode (ABI, in-line) is sufficient for our needs. First, the exported interfaces are declared again in Python with `ffi.cdef`; then the shared library is opened with `ffi.dlopen`. With `ffi.new` we can allocate C arguments and pass them to the functions in the shared library.

In [None]:
import sys
import math
import os
import time
from datetime import datetime
from pynq import Overlay
from preprocessing import *
from iou import *
from cffi import FFI

team = 'TGIIF'
agent = Agent(team)
print("Team created")

ffi = FFI()
ffi.cdef('''
typedef struct {
    int label;
    int xmin;
    int xmax;
    int ymin;
    int ymax;
    float confidence;
} result_t;
''')

ffi.cdef('''
void dpu_initialize(char *lib_path);
result_t dpu_detect_single(char *path);
void dpu_detect_list(char *, unsigned);
void dpu_clear(void);
void dpu_destroy(void);
result_t *dpu_get_results(void);
''')

lib_path = os.path.join(os.getcwd(), "libraries/libssd.so")
dpu_lib = ffi.dlopen(lib_path)

c_lib_path = ffi.new("char []", lib_path.encode())
print("Lib opened:", lib_path)

c_results = ffi.new("result_t *")
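
The `ffi.new` pattern used above can be tried without the board or `libssd.so`. The following is a minimal sketch, not part of the original flow: it redeclares `result_t`, allocates a small array of it, and builds a C string the same way `c_lib_path` is built. The field values, the path string, and the helper `to_record` are made up for illustration.

```python
from cffi import FFI

ffi = FFI()
# Same struct layout as declared for libssd.so above.
ffi.cdef('''
typedef struct {
    int label;
    int xmin;
    int xmax;
    int ymin;
    int ymax;
    float confidence;
} result_t;
''')

# Zero-initialized array of two result_t structs, owned by Python.
results = ffi.new("result_t[]", 2)
results[0].label = 3                       # illustrative values only
results[0].xmin, results[0].xmax = 10, 50
results[0].ymin, results[0].ymax = 20, 80
results[0].confidence = 0.5

# A Python str must be encoded to bytes before it becomes a char array.
c_path = ffi.new("char []", "some/illustrative/path".encode())

def to_record(r):
    """Copy one result_t into a Python list (hypothetical helper)."""
    return [r.xmin, r.xmax, r.ymin, r.ymax, r.confidence, r.label]

records = [to_record(results[j]) for j in range(len(results))]
print(records)
print(ffi.string(c_path).decode())
```

Memory returned by `ffi.new` is released when the Python object is garbage collected, so a reference such as `c_lib_path` has to stay alive for as long as the C side may read it.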

## Overlay loading

The bitstream file is loaded onto the programmable logic (PL) side of the PYNQ board.

In [None]:
OVERLAY_PATH = os.path.join(OVERLAY_DIR, "TGIIF/pynq_dpu_142m.bit")
overlay = Overlay(OVERLAY_PATH)
print("Overlay loaded: {}".format(OVERLAY_PATH))

## Image processing

In this step we process the images using the functions declared above. `dpu_initialize` must be called first to get the DPU ready. For best performance, images are processed in batches with `dpu_detect_list`, which takes the name of a text file and the number of images to process. The first line of that file is the image directory, followed by one image name per line. The results are then fetched as an array with `dpu_get_results`. To detect images one by one, you can use `dpu_detect_single`.

In [None]:
interval_time = 0
total_time = 0
total_num_img = len(agent.img_list)
result = list()
agent.reset_batch_count()

# Initialize DPU
dpu_lib.dpu_initialize(c_lib_path)
print("DPU initialized")

# Start processing
result_records = []
for i in range(math.ceil(total_num_img/BATCH_SIZE)):
    # get a batch from agent
    batch = agent.send(interval_time, agent.img_batch)
    
    # timer starts
    start = time.time()
    with open(agent.coord_team + "/imgs.txt", 'w') as fimg:
        fimg.write(IMG_DIR+"\n")
        for img in batch:
            fimg.write(img+'\n')
    print("Image list created")
    
    c_imgs_file = ffi.new("char []", (agent.coord_team+"/imgs.txt").encode())
    # Pass the actual batch length: the last batch may hold fewer than BATCH_SIZE images
    num_imgs = len(batch)
    dpu_lib.dpu_detect_list(c_imgs_file, num_imgs)
    c_results = dpu_lib.dpu_get_results()
    print("Current batch processed")
    
    # Copy each result_t into a Python list: [xmin, xmax, ymin, ymax, confidence, label]
    result_records += [[c_results[j].xmin, c_results[j].xmax, c_results[j].ymin, c_results[j].ymax,
                        c_results[j].confidence, c_results[j].label] for j in range(num_imgs)]
        
    # timer stop after PS has received image
    end = time.time()
    t = end - start
    print('Processing time: {} seconds.'.format(t))
    total_time += t
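
The batching and list-file steps can be sketched on their own, without the `agent` object or the DPU. In the notebook the batches come from `agent.send`; here `split_batches` and `write_image_list` are hypothetical helpers, and the file names and directory are invented. The sketch shows that the final batch can hold fewer than `BATCH_SIZE` images, which matters for the count passed to `dpu_detect_list`.

```python
import math
import os
import tempfile

def split_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def write_image_list(path, img_dir, batch):
    """Write the image directory on the first line, then one image name per line."""
    with open(path, "w") as f:
        f.write(img_dir + "\n")
        for name in batch:
            f.write(name + "\n")

imgs = ["{}.jpg".format(i) for i in range(7)]   # hypothetical file names
batch_size = 3
batches = list(split_batches(imgs, batch_size))

# Same batch count as the loop above computes with math.ceil.
assert len(batches) == math.ceil(len(imgs) / batch_size)

with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, "imgs.txt")
    write_image_list(list_path, "/hypothetical/images", batches[-1])
    with open(list_path) as f:
        lines = f.read().splitlines()

print([len(b) for b in batches])
print(lines)
```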

## Results storing

Detection results are written to XML files.

In [None]:
# Write misc info
agent.write(total_time, total_num_img, team)

# Write detection results into xml files
agent.save_results_xml(result_records)
print("XML results written successfully.")
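
`agent.save_results_xml` is defined outside this notebook, and its exact output schema is not shown here. As a rough sketch of how one record in the `[xmin, xmax, ymin, ymax, confidence, label]` order could be turned into XML, the example below uses the standard `xml.etree.ElementTree` module. The function `record_to_xml` and all element names are assumptions for illustration, not the format the agent actually writes.

```python
import xml.etree.ElementTree as ET

def record_to_xml(filename, record):
    """Build an XML element from [xmin, xmax, ymin, ymax, confidence, label]."""
    xmin, xmax, ymin, ymax, confidence, label = record
    root = ET.Element("annotation")            # element names are assumed
    ET.SubElement(root, "filename").text = filename
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "label").text = str(label)
    ET.SubElement(obj, "confidence").text = str(confidence)
    box = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("xmax", xmax),
                       ("ymin", ymin), ("ymax", ymax)):
        ET.SubElement(box, tag).text = str(value)
    return root

# Illustrative record, not real detector output.
elem = record_to_xml("0.jpg", [10, 50, 20, 80, 0.5, 3])
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```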

## Cleaning up

`dpu_destroy` is called to release system resources and return the DPU to idle. To start the DPU again, call `dpu_initialize`.

In [None]:
dpu_lib.dpu_destroy()
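
If a cell fails between `dpu_initialize` and `dpu_destroy`, the DPU stays initialized. One way to guard against that, sketched here under the assumption that the two calls can be paired freely, is a context manager that always calls `dpu_destroy` on exit. `dpu_session` and `FakeDpuLib` are hypothetical names; the fake class stands in for `dpu_lib` so the sketch runs without the board.

```python
from contextlib import contextmanager

@contextmanager
def dpu_session(lib, c_lib_path):
    """Initialize the DPU on entry and always destroy it on exit."""
    lib.dpu_initialize(c_lib_path)
    try:
        yield lib
    finally:
        lib.dpu_destroy()

class FakeDpuLib:
    """Stand-in for dpu_lib that only records which calls were made."""
    def __init__(self):
        self.calls = []

    def dpu_initialize(self, path):
        self.calls.append("initialize")

    def dpu_destroy(self):
        self.calls.append("destroy")

fake = FakeDpuLib()
try:
    with dpu_session(fake, b"libssd.so"):
        raise RuntimeError("simulated failure during detection")
except RuntimeError:
    pass

# dpu_destroy was still called despite the exception.
print(fake.calls)
```

With the real library, the processing loop would go inside `with dpu_session(dpu_lib, c_lib_path):`.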