Custom YOLOv4 for apple recognition (clean/damaged) on Alveo U280 accelerator card using Vitis AI framework.

Pomiculture/YOLOv4-Vitis-AI

YOLOv4-Vitis-AI

Custom YOLOv4 (You Only Look Once) for apple recognition (clean/damaged) on Alveo U280 accelerator card using Vitis AI framework.

Table of contents

  1. Context
  2. Vitis AI
  3. YOLOv4
  4. Requirements
  5. User Guide
  6. Results
  7. Axes of improvement
  8. References

1) Context

A deep-learning model is characterized by two distinct computation-intensive processes: training and inference. During training, the model is taught to perform a specific task. Inference is the deployment of the trained model to make predictions on new data. Real-time inference of deep neural network (DNN) models is a major challenge for the industry as latency-constrained applications grow. For this reason, accelerating inference has become more critical than speeding up training. Training is most often carried out on GPUs because of their high throughput, massive parallelism, simple control flow, and energy efficiency, whereas FPGA (Field Programmable Gate Array) devices are better suited to AI inference: thanks to their flexible hardware configuration, they provide better performance per watt than GPUs.

An important line of research is the deployment of AI models on embedded platforms. To achieve this, along with smaller neural network architectures, techniques such as quantization and pruning reduce the size of existing architectures without losing much accuracy. This minimizes the hardware footprint and energy consumption on the target board. These techniques work particularly well on FPGAs, more so than on GPUs.
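As a rough illustration of the quantization idea mentioned above, the following minimal sketch maps floating-point weights to signed 8-bit integers with a single shared scale. It is a generic symmetric scheme written for this example, not the algorithm used by the Vitis AI quantizer, and the sample weights are made up.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to signed 8-bit codes using one shared scale (symmetric scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when every value is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code keeps each error within half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a small, bounded rounding error, which is why accuracy usually degrades only slightly.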

One significant obstacle to combining AI inference with hardware acceleration is the expertise required in both domains, especially for low-level development on accelerator cards. Fortunately, some frameworks make hardware more accessible to software engineers and data scientists. With Xilinx's Vitis AI toolset, we can quite easily deploy models from Keras-TensorFlow straight onto FPGAs.


2) Vitis AI

Vitis™ is a unified software platform for embedded software and accelerated applications development on Xilinx® hardware platforms, with Edge, Cloud or Hybrid computing. The application code can be developed using high-level programming languages such as C++ and Python.

Vitis™ AI is a development environment whose purpose is to accelerate AI inference. Thanks to optimized IP cores and tools, it lets the user implement pre-compiled or custom AI models and provides libraries that accelerate the application by interacting with the processor unit of the target platform. With Vitis AI, the user can easily develop deep-learning inference applications without strong FPGA knowledge.

VART (Vitis AI Runtime) stack

We chose to use the Vitis AI TensorFlow framework. For more information on Vitis AI, please refer to the official user guide.

Vitis AI workflow

In our case, the hardware platform is an Alveo™ Data Center Accelerator Card. This FPGA (Field Programmable Gate Array) card is a Cloud device that accelerates the computing workloads of deep-learning inference algorithms. Its processor unit is called a Deep-Learning Processor Unit (DPU), a group of parameterizable IP cores pre-implemented on the hardware, optimized for deep neural networks, and compatible with the Vitis AI specialized instruction set. Different versions exist, offering different levels of throughput, latency, scalability, and power. The Alveo U280 Data Center Accelerator Card supports the Xilinx DPUCAHX8H DPU, which is optimized for high-throughput applications involving Convolutional Neural Networks (CNNs). It is composed of a high-performance scheduler module, a hybrid computing array module, an instruction fetch unit module, and a global memory pool module.

DPUCAHX8H Top-Level Block Diagram


3) YOLOv4

A YOLOv4 model detects objects in images with bounding boxes, classifies them among a predefined list of classes, and assigns a confidence score to each prediction. Please read this article and this one too to better understand the concept.

YOLOv4 model

The original Darknet model was made from this tutorial. To implement your custom model, make your changes according to the section "Create your custom config file and upload it to your drive".

Our model was trained to detect apples in images and determine whether they are clean or damaged. The classes are written in this file and the anchors here.

To build the dataset, we used this scraper.

To annotate the samples, we used this GitHub project by developer0hye. The annotations follow the template "image_name class_label x_top_left y_top_left width height".
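The annotation template above can be read with a few lines of Python. This is a minimal sketch that assumes the fields are whitespace-separated and that the coordinates are in pixels; the `Annotation` class, the `parse_annotation` helper, and the sample line are hypothetical and not part of the project.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Annotation:
    image_name: str
    class_label: str
    x_top_left: float
    y_top_left: float
    width: float
    height: float

    def corners(self) -> Tuple[float, float, float, float]:
        """Return the box as (x_min, y_min, x_max, y_max)."""
        return (
            self.x_top_left,
            self.y_top_left,
            self.x_top_left + self.width,
            self.y_top_left + self.height,
        )


def parse_annotation(line: str) -> Annotation:
    """Parse one 'image_name class_label x_top_left y_top_left width height' line."""
    # Split from the right so an image name containing spaces stays intact.
    parts = line.strip().rsplit(maxsplit=5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    name, label, x, y, w, h = parts
    return Annotation(name, label, float(x), float(y), float(w), float(h))


# Made-up sample line following the template.
example = parse_annotation("apple_001.jpg damaged 40 60 120 100")
print(example.corners())
```

Converting to corner coordinates is convenient when boxes are later compared by overlap.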

To make the model fit the accelerator card, we had to change the MaxPool size and convert the Mish activations to leaky ReLU. Our changes are based on this tutorial. The '.cfg' file can be found here and the '.weights' can be downloaded here.
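The activation change can be applied to a Darknet '.cfg' file mechanically. The sketch below assumes the file uses the usual `activation=mish` and `activation=leaky` entries; the `mish_to_leaky` helper and the sample text are hypothetical. It does not cover the MaxPool change, since the sizes used are not given here.

```python
import re


def mish_to_leaky(cfg_text: str) -> str:
    """Replace every 'activation=mish' line in Darknet cfg text with 'activation=leaky'."""
    # [ \t]* instead of \s* so the pattern never swallows a line break.
    pattern = re.compile(r"^([ \t]*activation[ \t]*=[ \t]*)mish[ \t]*$", flags=re.MULTILINE)
    return pattern.sub(r"\1leaky", cfg_text)


# Made-up cfg fragment for illustration.
sample_cfg = """[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=mish

[convolutional]
filters=64
activation=linear
"""

converted = mish_to_leaky(sample_cfg)
print(converted)
```

Only lines whose value is exactly `mish` are touched, so other activations such as `linear` are left unchanged.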


4) Requirements

Before running the project, check the requirements from Vitis AI and make sure to complete the following steps :

Weights file :

🠊 Please download the weights of the YOLOv4 trained model here. Place the file in the folder /model/darknet, alongside the '.cfg' Darknet model.

Dataset folder :

🠊 Please unzip the dataset folder.

Versions :

  • Docker : 20.10.6
  • Docker Vitis AI image : 1.3.598
  • Vitis AI : 1.3.2
  • TensorFlow : 1.15.2
  • Python : 3.6.12
  • Anaconda : 4.9.2

Hardware :


5) User Guide

This section explains how to run the project.
Open a terminal and make sure you are in the workspace directory.
The project is executed through a succession of bash scripts located in the /workflow/ folder.
You may first need to set the permissions for the bash scripts :

cd ./docker_ws/workflow/
chmod +x *.sh
cd ..
chmod +x *.sh

You can either run the scripts from the /workflow/ folder step by step, or run the two main scripts.
The first script to run opens the Vitis AI image in a Docker container.
The Vitis™ AI software is available through Docker Hub. The image contains tools such as the Vitis AI quantizer, AI compiler, and AI runtime for cloud DPU. We chose to use the Vitis AI Docker image for host CPU.

cd docker_ws
source ./workflow/0_run_docker_cpu.sh

Vitis AI workflow

5.1 Demo

See this guide.

source ./run_demo.sh

We used this model and dataset to quickly test our application code before deploying our own model.

5.2 Application

Run the following script to execute the whole process.

source ./run_all.sh

This project is based on the workflow from Vitis AI tutorials using the Anaconda environment for TensorFlow.

For more details, please consult this guide.


6) Results

Here are some results after running the model on the FPGA :

[Images: eight sample results from running the model on the FPGA]

Let's evaluate the mAP score of the model running on the accelerator card. We set the confidence threshold to 0.6 and the IoU threshold to 0.5.

| Model | Original | Intermediate graph | App (on Alveo U280) |
| --- | --- | --- | --- |
| mAP @ IOU50 score | 75.0 % | x | 91.0 % on the training set |
| FPS | x | x | 12 |
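To make the two thresholds concrete, here is a minimal sketch of how a confidence threshold of 0.6 and an IoU threshold of 0.5 decide which detections count as correct for a single class. It is not the project's evaluation script: a full mAP computation also averages precision over recall levels and over classes. The function names and sample boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    detections: List[Tuple[Box, float]],
    ground_truth: List[Box],
    conf_threshold: float = 0.6,
    iou_threshold: float = 0.5,
) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one class."""
    # Discard low-confidence detections, then process the most confident first.
    kept = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    unmatched = list(ground_truth)
    true_pos = 0
    for box, _ in kept:
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched at most once
            true_pos += 1
    return true_pos, len(kept) - true_pos, len(unmatched)


# Made-up boxes for illustration.
truth = [(0, 0, 100, 100), (200, 200, 300, 300)]
preds = [((10, 10, 110, 110), 0.9), ((200, 200, 300, 300), 0.5), ((400, 400, 450, 450), 0.8)]
result = match_detections(preds, truth)
print(result)
```

In this example the second prediction overlaps its target perfectly but is dropped by the confidence threshold, so the corresponding ground-truth box counts as missed.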

7) Axes of improvement

To deploy your own YOLOv4 or YOLOv3 model on the accelerator card, replace the '.cfg' and '.weights' files in this folder. Then change the environment variables that define the model specifications in the script "1_set_env.sh". Set the input shape in the script that compiles the model. Don't forget to update the names of the input and output tensors, and the shape of the input tensor. Finally, replace the current dataset with your own in this folder.


8) References

The projects mentioned below were used for this project as tools or sources of inspiration :
