Custom YOLOv4 (You Only Look Once) model for apple recognition (clean/damaged) on an Alveo U280 accelerator card, using the Vitis AI framework.
A deep-learning model is characterized by two distinct computation-intensive processes: training and inference. During the training step, the model is taught to perform a specific task. Inference, on the other hand, is the deployment of the trained model to perform on new data. Real-time inference of deep neural network (DNN) models is a major challenge for industry, given the growth of latency-constrained applications. For this reason, accelerating inference has become more critical than speeding up training. While training is most often carried out on GPUs, thanks to their high throughput, massive parallelism, simple control flow, and energy efficiency, FPGAs (Field Programmable Gate Arrays) are better suited to AI inference, providing better performance per watt than GPUs thanks to their flexible hardware configuration.
An important axis of research is the deployment of AI models on embedded platforms. To achieve this, along with smaller neural network architectures, techniques such as quantization and pruning make it possible to reduce the size of existing architectures without losing much accuracy, minimizing the hardware footprint and energy consumption of the target board. These techniques perform especially well on FPGAs compared to GPUs.
One significant obstacle to combining AI inference with hardware acceleration is the expertise required in both domains, especially for low-level development on accelerator cards. Fortunately, some frameworks make the hardware more accessible to software engineers and data scientists. With Xilinx's Vitis AI toolset, we can quite easily deploy models from Keras-TensorFlow straight onto FPGAs.
Vitis™ is a unified software platform for developing embedded software and accelerated applications on Xilinx® hardware platforms, whether for Edge, Cloud, or hybrid computing. The application code can be developed in high-level programming languages such as C++ and Python.
Vitis™ AI is a development environment whose purpose is to accelerate AI inference. Thanks to optimized IP cores and tools, it supports pre-compiled as well as custom AI models, and provides libraries to accelerate the application by interacting with the processing unit of the target platform. With Vitis AI, users can develop deep-learning inference applications without strong FPGA knowledge.
We chose to use the Vitis AI TensorFlow framework. For more information on Vitis AI, please refer to the official user guide.
In our case, the hardware platform is an Alveo™ Data Center Accelerator Card, a Cloud FPGA device designed to accelerate the computing workloads of deep-learning inference algorithms. Its processing unit is called a Deep-Learning Processor Unit (DPU): a group of parameterizable IP cores pre-implemented on the hardware, optimized for deep neural networks and compatible with the Vitis AI specialized instruction set. Different versions exist, offering different levels of throughput, latency, scalability, and power. The Alveo U280 Data Center Accelerator Card supports the Xilinx DPUCAHX8H DPU, optimized for high-throughput applications involving Convolutional Neural Networks (CNNs). It is composed of a high-performance scheduler module, a hybrid computing array module, an instruction fetch unit module, and a global memory pool module.
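To give an idea of how a model is targeted at this DPU: the Vitis AI compiler selects the DPU through an arch file shipped in the Docker image. Below is a minimal sketch, assuming the Vitis AI 1.3 image layout; the input/output file names are placeholders, not the exact ones used by our scripts.

```bash
# Sketch: compile a quantized TensorFlow graph for the U280's DPUCAHX8H.
# File names are placeholders; the arch.json path follows the Vitis AI 1.3 image layout.
vai_c_tensorflow \
    --frozen_pb ./quantize_results/quantize_eval_model.pb \
    --arch /opt/vitis_ai/compiler/arch/DPUCAHX8H/U280/arch.json \
    --output_dir ./compile_results \
    --net_name yolov4_apples
```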
A YOLOv4 model is able to detect objects in images through bounding boxes, classify the objects among a predefined list of classes, and attribute a confidence score to each prediction. Please read this article and this one too to better understand the concept.
The original Darknet model was built by following this tutorial. To implement your own custom model, make your changes according to the section "Create your custom config file and upload it to your drive".
Our model was trained to detect apples in images and determine whether they are clean or damaged. The classes are written in this file and the anchors here.
To build the dataset, we used this scraper.
To annotate the samples, we used this GitHub project by developer0hye. The annotations follow the template "image_name class_label x_top_left y_top_left width height", one object per line; for example, a line such as "apple_001.jpg 0 154 76 210 245" (values invented for illustration) describes one bounding box.
To make the model fit the accelerator card, we had to change the MaxPool kernel sizes and convert the mish activations to leaky ReLU. Our changes are based on this tutorial. The '.cfg' file can be found here and the '.weights' can be downloaded here.
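For reference, the activation change can be expressed as a one-line text substitution on the Darknet config. This is a sketch assuming the file name below, not the exact command we ran; the MaxPool sizes were edited by hand in the '.cfg' file.

```bash
# Sketch: replace every mish activation with leaky ReLU in the Darknet config.
# The .cfg path is a placeholder for the actual model file.
sed -i 's/activation=mish/activation=leaky/g' ./model/darknet/yolov4.cfg
```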
Before running the project, check the requirements from Vitis AI and make sure to complete the following steps:
Weights file:
🠊 Please download the weights of the trained YOLOv4 model here. Place the file in the /model/darknet folder, alongside the '.cfg' Darknet model.
Dataset folder:
🠊 Please unzip the dataset folder.
Versions:
- Docker: 20.10.6
- Docker Vitis AI image: 1.3.598
- Vitis AI: 1.3.2
- TensorFlow: 1.15.2
- Python: 3.6.12
- Anaconda: 4.9.2
Hardware:
- Alveo U280 Data Center Accelerator Card
This section explains how to run the project.
Open a terminal and make sure to be located in the workspace directory.
This project is executed through a succession of bash scripts, located in the /workflow/ folder.
You may first need to make the scripts executable:
cd ./docker_ws/workflow/
chmod +x *.sh
cd ..
chmod +x *.sh
You can either run the scripts from the /workflow/ folder step by step, or run the two main scripts.
The first script to run launches the Vitis AI image in a Docker container.
Indeed, the Vitis™ AI software is available through Docker Hub: the image contains tools such as the Vitis AI quantizer, AI compiler, and AI runtime for cloud DPUs. We chose to use the Vitis AI Docker image for host CPU.
cd docker_ws
source ./workflow/0_run_docker_cpu.sh
See this guide.
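For reference, here is a minimal sketch of what such a script typically does, assuming the standard docker_run.sh helper from the Vitis AI repository (the actual contents of 0_run_docker_cpu.sh may differ):

```bash
# Sketch: pull the CPU image matching our tested version and start the container
# with the docker_run.sh helper from the Vitis AI repository.
docker pull xilinx/vitis-ai-cpu:1.3.598
./docker_run.sh xilinx/vitis-ai-cpu:1.3.598
```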
source ./run_demo.sh
We used this model and dataset to quickly test our application code before deploying our own model.
Run the following script to execute the whole process.
source ./run_all.sh
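As a rough map of what run_all.sh chains together (freezing, quantization, compilation, then inference), here is a hedged sketch of the central quantization call; the node names, input shape, and paths are illustrative assumptions, so refer to the workflow scripts for the exact flags:

```bash
# Sketch: post-training quantization of the frozen TensorFlow graph to INT8.
# Node names, input shape, and paths are illustrative, not the project's exact values.
vai_q_tensorflow quantize \
    --input_frozen_graph ./freeze/frozen_graph.pb \
    --input_nodes image_input \
    --input_shapes ?,416,416,3 \
    --output_nodes "conv2d_93/BiasAdd,conv2d_101/BiasAdd,conv2d_109/BiasAdd" \
    --input_fn input_fn.calib_input \
    --calib_iter 100
# The quantized graph is then compiled for the DPU (see the compile sketch above).
```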
This project is based on the workflow from the Vitis AI tutorials, using the Anaconda environment for TensorFlow.
For more details, please consult this guide.
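Inside the container, the TensorFlow toolchain is exposed through a Conda environment, which is activated as follows (environment name as shipped in the Vitis AI 1.3 image):

```bash
# Activate the TensorFlow 1.15 Conda environment provided by the Vitis AI image.
conda activate vitis-ai-tensorflow
```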
Here are some results after running the model on the FPGA:
Let's evaluate the mAP score of the model running on the accelerator card. We set the confidence threshold to 0.6 and the IoU threshold to 0.5.
Model | Original | Intermediate graph | App (on Alveo U280)
---|---|---|---
mAP @ IoU50 score | 75.0 % | x | 91.0 % (on the training set)
FPS | x | x | 12
- Find a way to set the input shape through a variable when compiling the model;
- Create and annotate a new test set;
- Increase the FPS;
- Modify the AlexeyAB application that runs the Darknet model on the host machine to measure the execution time of the inference;
- Modify the AlexeyAB application to process the whole test set at once;
- Evaluate the mAP score for the AlexeyAB application after changing the output data to fit the annotations;
- Modify the code that runs the frozen/quantized TensorFlow graph to normalize the data, so that its score can be evaluated;
- Modify the code that runs the frozen/quantized TensorFlow graph to draw boxes when running the graph;
- Improve the labels display in the application code;
- Run the Vitis AI Profiler.
To deploy your own YOLOv4 or YOLOv3 model on the accelerator card, replace the '.cfg' and '.weights' files in this folder. Then, change the environment variables that define the model specifications in the script "1_set_env.sh". Set the input shape in the script that compiles the model, and don't forget to update the names of the input and output tensors as well as the shape of the input tensor. Finally, replace the current dataset with your own in this folder.
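For illustration, the variables to adapt could look like the sketch below. The variable names and values are assumptions for a generic YOLO model, not the actual contents of "1_set_env.sh":

```bash
# Hypothetical excerpt of workflow/1_set_env.sh: adapt these to your own model.
export NET_NAME=yolov4_custom             # name given to the compiled model
export INPUT_NODE=image_input             # name of the graph's input tensor
export OUTPUT_NODES="conv2d_93/BiasAdd,conv2d_101/BiasAdd,conv2d_109/BiasAdd"
export INPUT_SHAPE="?,416,416,3"          # update to match your model's input shape
export CLASSES=2                          # e.g. clean / damaged
```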
The projects mentioned below were used for this project as tools or sources of inspiration: