Get the dataset


Source files

My source files for this project are kept here on GitHub.

The linked GitHub repo contains 4 sub-directories:

code: all source code, including the scripts for training, quantizing, and compiling, as well as the host application that runs on the ARM core.

output: generated model files.

Sign-Language-Digits-Dataset: the dataset should be put here.

target_zcu102_zcu104_kv260: files prepared to be copied to the target board.


Steps on host machine

Setup Docker environment

Follow the Vitis-AI page to set up Docker. Quantizing and compiling should be done inside the Vitis-AI Docker container, in its conda environment.

Train/quantize/compile model

The steps below are run on the host machine, inside the Vitis-AI Docker container and the vitis-ai-tensorflow2 conda environment.

Train the model

Run "python3".

This Python script creates and trains a custom CNN model.

For this simple CNN, an accuracy of 0.7682 is obtained on the validation dataset at the 10th epoch.

(vitis-ai-tensorflow2) Vitis-AI /workspace/chao-proj/handsigndigits_end2end/code > python3

48/48 [==============================] - 11s 226ms/step - loss: 2.2957 - acc: 0.1128 - val_loss: 2.1856 - val_acc: 0.2292

Epoch 2/10

48/48 [==============================] - 4s 84ms/step - loss: 1.7664 - acc: 0.3633 - val_loss: 1.8149 - val_acc: 0.3568

Epoch 3/10

48/48 [==============================] - 4s 83ms/step - loss: 1.1517 - acc: 0.5921 - val_loss: 1.2129 - val_acc: 0.5833

Epoch 4/10

48/48 [==============================] - 4s 85ms/step - loss: 0.8129 - acc: 0.6979 - val_loss: 1.1744 - val_acc: 0.5729

Epoch 5/10

48/48 [==============================] - 4s 84ms/step - loss: 0.6780 - acc: 0.7639 - val_loss: 1.1073 - val_acc: 0.6589

Epoch 6/10

48/48 [==============================] - 4s 80ms/step - loss: 0.5517 - acc: 0.8131 - val_loss: 1.1636 - val_acc: 0.6562

Epoch 7/10

48/48 [==============================] - 4s 84ms/step - loss: 0.4344 - acc: 0.8472 - val_loss: 0.7979 - val_acc: 0.7344

Epoch 8/10

48/48 [==============================] - 4s 84ms/step - loss: 0.3883 - acc: 0.8675 - val_loss: 0.9528 - val_acc: 0.7005

Epoch 9/10

48/48 [==============================] - 4s 82ms/step - loss: 0.3665 - acc: 0.8839 - val_loss: 0.7645 - val_acc: 0.7370

Epoch 10/10

48/48 [==============================] - 4s 84ms/step - loss: 0.3273 - acc: 0.8872 - val_loss: 0.7131 - val_acc: 0.7682

The learning curve is shown below.
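The numbers behind the learning curve can also be pulled straight out of the training log. The snippet below is a minimal sketch, not part of the project scripts: it copies the metric portion of the ten summary lines above and extracts them with a regular expression. The names `LOG`, `parse_epochs`, and `history` are made up for this illustration.

```python
import re

# Metric portion of the ten per-epoch summary lines from the log above.
LOG = """\
loss: 2.2957 - acc: 0.1128 - val_loss: 2.1856 - val_acc: 0.2292
loss: 1.7664 - acc: 0.3633 - val_loss: 1.8149 - val_acc: 0.3568
loss: 1.1517 - acc: 0.5921 - val_loss: 1.2129 - val_acc: 0.5833
loss: 0.8129 - acc: 0.6979 - val_loss: 1.1744 - val_acc: 0.5729
loss: 0.6780 - acc: 0.7639 - val_loss: 1.1073 - val_acc: 0.6589
loss: 0.5517 - acc: 0.8131 - val_loss: 1.1636 - val_acc: 0.6562
loss: 0.4344 - acc: 0.8472 - val_loss: 0.7979 - val_acc: 0.7344
loss: 0.3883 - acc: 0.8675 - val_loss: 0.9528 - val_acc: 0.7005
loss: 0.3665 - acc: 0.8839 - val_loss: 0.7645 - val_acc: 0.7370
loss: 0.3273 - acc: 0.8872 - val_loss: 0.7131 - val_acc: 0.7682
"""

# Matches "name: number" pairs such as "val_acc: 0.7682".
METRIC = re.compile(r"(\w+): ([0-9.]+)")


def parse_epochs(log_text):
    """Return one dict of metrics per epoch summary line."""
    epochs = []
    for line in log_text.splitlines():
        if "val_acc" not in line:
            continue  # skip anything that is not an epoch summary
        epochs.append({name: float(value) for name, value in METRIC.findall(line)})
    return epochs


history = parse_epochs(LOG)

# Index of the epoch with the highest validation accuracy.
best = max(range(len(history)), key=lambda i: history[i]["val_acc"])

# Gap between training and validation accuracy at the last epoch.
final = history[-1]
gap = final["acc"] - final["val_acc"]

print(f"best val_acc {history[best]['val_acc']:.4f} at epoch {best + 1}")
print(f"final train/val accuracy gap: {gap:.4f}")
```

The per-epoch lists in `history` can be handed to any plotting library to redraw the curve. The gap of roughly 0.12 between training and validation accuracy at the last epoch suggests some overfitting on this small dataset.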


Quantize the model

Run "python3".

This step generates "quantized_model.h5".

(vitis-ai-tensorflow2) Vitis-AI /workspace/chao-proj/handsigndigits_end2end/code > python3

Load float model..

model input size: 100 100

Load validation dataset for quantization..

Found 2062 images belonging to 10 classes.

Run quantization..

[VAI INFO] Start CrossLayerEqualization...

10/10 [==============================] - 1s 64ms/step

[VAI INFO] CrossLayerEqualization Done.

[VAI INFO] Start Quantize Calibration...

65/65 [==============================] - 11s 169ms/step

[VAI INFO] Quantize Calibration Done.

[VAI INFO] Start Post-Quantize Adjustment...

[VAI INFO] Post-Quantize Adjustment Done.

[VAI INFO] Quantization Finished.

Saved quantized model as ../output/quantized_model.h5
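To give a feel for what the calibration step in the log above is doing, here is a simplified illustration of 8-bit fixed-point quantization with a power-of-two scale. It is only a sketch of the general idea and not the Vitis-AI quantizer's actual code; the function names and the sample values are made up for this example.

```python
import math


def choose_fraction_bits(values, bit_width=8):
    """Pick the largest power-of-two scale 2**f that keeps every value in range."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bit_width - 1) - 1  # 127 for 8 bits
    if max_abs == 0:
        return bit_width - 1
    return math.floor(math.log2(q_max / max_abs))


def quantize(values, frac_bits, bit_width=8):
    """Scale, round to the nearest integer, and clamp to the signed range."""
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    scale = 2 ** frac_bits
    return [min(q_max, max(q_min, round(v * scale))) for v in values]


def dequantize(q_values, frac_bits):
    """Map the integers back to approximate floating-point values."""
    return [q / 2 ** frac_bits for q in q_values]


# Hypothetical weights, standing in for values observed during calibration.
weights = [0.5, -1.25, 0.03, 0.9]

frac_bits = choose_fraction_bits(weights)
q_weights = quantize(weights, frac_bits)
restored = dequantize(q_weights, frac_bits)

# Worst-case rounding error, bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("fraction bits:", frac_bits)
print("int8 values:  ", q_weights)
print("max error:    ", max_error)
```

Calibration exists to find a scale like this from real data: the validation images are run through the float model so that the value ranges can be observed before the weights and activations are converted.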

Evaluate the quantized model

The quantized model can be evaluated from a Python script.

Run "python3".

You might ask why a higher accuracy (0.797) is obtained after quantizing. That is because I did not augment the dataset for this evaluation, so my guess is that the validation data is a little "easier" this time.

In the training stage I augmented the dataset, since it is relatively small and the model would otherwise be likely to overfit.

Compile the model

Run "bash -x" to compile the quantized model.

An xmodel file will be generated.

(vitis-ai-tensorflow2) Vitis-AI /workspace/chao-proj/handsigndigits_end2end/code > bash -x

+ ARCH=/opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json

+ OUTDIR=../output

+ NET_NAME=customcnn

+ MODEL=../output/quantized_model.h5

+ echo -----------------------------------------




+ echo -----------------------------------------


+ compile

+ tee compile.log

+ vai_c_tensorflow2 --model ../output/quantized_model.h5 --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json --output_dir ../output --net_name customcnn

[INFO] Namespace(batchsize=1, inputs_shape=None, layout='NHWC', model_files=['../output/quantized_model.h5'], model_type='tensorflow2', named_inputs_shape=None, out_filename='/tmp/customcnn_org.xmodel', proto=None)

[INFO] tensorflow2 model: /workspace/chao-proj/handsigndigits_end2end/output/quantized_model.h5

[INFO] keras version: 2.4.0

[INFO] Tensorflow Keras model type: functional

[INFO] parse raw model     :100%|██████████| 19/19 [00:00<00:00, 8027.78it/s]

[INFO] infer shape (NHWC)  :100%|██████████| 33/33 [00:00<00:00, 731.94it/s]

[INFO] perform level-0 opt :100%|██████████| 2/2 [00:00<00:00, 212.70it/s]

[INFO] perform level-1 opt :100%|██████████| 2/2 [00:00<00:00, 962.22it/s]

[INFO] generate xmodel     :100%|██████████| 33/33 [00:00<00:00, 450.63it/s]

[INFO] dump xmodel: /tmp/customcnn_org.xmodel

[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2

[UNILOG][INFO] Compile mode: dpu

[UNILOG][INFO] Debug mode: function

[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2

[UNILOG][INFO] Graph name: customcnn_model, with op num: 53

[UNILOG][INFO] Begin to compile...

[UNILOG][INFO] Total device subgraph number 3, DPU subgraph number 1

[UNILOG][INFO] Compile done.

[UNILOG][INFO] The meta json is saved to "/workspace/chao-proj/handsigndigits_end2end/code/../output/meta.json"

[UNILOG][INFO] The compiled xmodel is saved to "/workspace/chao-proj/handsigndigits_end2end/code/../output/customcnn.xmodel"

[UNILOG][INFO] The compiled xmodel's md5sum is 93022a6b0243ac1251f7acc60c145a3d, and has been saved to "/workspace/chao-proj/handsigndigits_end2end/code/../output/md5sum.txt"


* VITIS_AI Compilation - Xilinx Inc.


+ echo -----------------------------------------




+ echo -----------------------------------------
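The trace above shows the compile script setting four variables (ARCH, OUTDIR, NET_NAME, MODEL) and passing them to vai_c_tensorflow2. For anyone who prefers to drive this from Python, the sketch below rebuilds the same command line. The flags and values are taken from the trace; the helper name `build_compile_command` is made up for this example, and the command is only printed here, not executed.

```python
import shlex


def build_compile_command(arch_json, model, output_dir, net_name):
    """Assemble the vai_c_tensorflow2 call as an argument list."""
    return [
        "vai_c_tensorflow2",
        "--model", model,
        "--arch", arch_json,
        "--output_dir", output_dir,
        "--net_name", net_name,
    ]


cmd = build_compile_command(
    arch_json="/opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json",
    model="../output/quantized_model.h5",
    output_dir="../output",
    net_name="customcnn",
)

# Print the command exactly as a shell would receive it.
print(shlex.join(cmd))
```

Inside the Vitis-AI Docker container the list could be passed to `subprocess.run(cmd, check=True)`. Keeping the arch.json path as a parameter makes it easy to point the compiler at a different board's architecture file.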


After all steps on the host machine have finished, copy the files below to the target repository.


If you are using the official Xilinx Vitis-AI image for the zcu102, zcu104, or KV260, the DPU configuration is the same on all 3 boards, so these files can be deployed on any of them.
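The compiler log reports the md5sum of the compiled xmodel (93022a6b0243ac1251f7acc60c145a3d) and saves it to md5sum.txt. After copying, that checksum can be used to confirm the file arrived intact. Below is a minimal sketch using only the Python standard library; the helper names and the file path in the usage comment are assumptions for this example.

```python
import hashlib

# Checksum reported by the compiler log for customcnn.xmodel.
EXPECTED_MD5 = "93022a6b0243ac1251f7acc60c145a3d"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Usage (the path is an assumption; adjust it to where the file was copied):
# print(matches_expected("customcnn.xmodel", EXPECTED_MD5))
```

A mismatch would usually point to an interrupted copy, or to a model that was recompiled after the checksum was noted.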

Steps on target board

Copy the target repository to the target board.

Run it with the command below:

root@xilinx-zcu102-2021_1:/home/petalinux/Target_zcu102_HandSignDigit# python3 -d Examples/ -m customcnn.xmodel

Command line options:

 --image_dir :  Examples/

 --threads   :  1

 --model     :  customcnn.xmodel

Pre-processing 10 images...

Starting 1 threads...

Throughput=1111.96 fps, total frames = 10, time=0.0090 seconds

Correct:10, Wrong:0, Accuracy:1.0000

Luckily, all 10 images were classified correctly.
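The reported figures can be cross-checked with a little arithmetic. The sketch below recomputes throughput and accuracy from the numbers printed above; the function names are made up for this example.

```python
def throughput_fps(frames, seconds):
    """Frames processed per second."""
    return frames / seconds


def accuracy(correct, wrong):
    """Fraction of correctly classified images."""
    return correct / (correct + wrong)


# Values printed by the run above.
frames = 10
printed_time = 0.0090
printed_fps = 1111.96

recomputed_fps = throughput_fps(frames, printed_time)

# Elapsed time implied by the printed throughput.
implied_time = frames / printed_fps

print(f"recomputed throughput: {recomputed_fps:.2f} fps")
print(f"implied elapsed time:  {implied_time:.6f} s")
print(f"accuracy:              {accuracy(10, 0):.4f}")
```

The small difference between the recomputed and the printed throughput comes from the elapsed time being printed with only four decimal places. With just 10 frames and about 9 ms of run time, both the throughput and the accuracy are rough indications; a larger image set would give more trustworthy numbers.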


