False results when deploying Yolov5 on ZCU102 through using Vitis Ai 3.0 #1319

sleipnier opened this issue Aug 22, 2023 · 9 comments

@sleipnier

Over the past few days I have run into a number of difficult and intractable problems while trying to quantize a YOLOv5 model and deploy it on a ZCU102 board. I am wondering how to solve them and get correct results.

To begin with, let me briefly introduce some background information:

  • OS: Ubuntu 20.04.6 Linux
  • Vitis AI Docker image: xilinx/vitis-ai-pytorch-cpu:ubuntu2004-3.0.0.106
  • Model: YOLOv5 nano (yolov5n_nndct.yaml)
  • The training and quantization code are the official yolov5 downloads

After training the yolov5n model on the COCO and coco128 datasets, I tried several approaches to quantize it with Vitis AI 3.0 and then deployed it on the ZCU102 board. The quantization results themselves are satisfactory, with an mAP of around 0.3. However, without exception, every deployment failed: sometimes the anchors (bounding boxes) were wrong, and sometimes the targets were not detected at all. For example, sample image 1 is from the coco128 dataset. The result produced by the board is shown below; it is so imprecise that it cannot be used.

sample1
sample1_wrong2

Below are the approaches I have tried for quantization and deployment.

Quantizing the official float model

I tried quantizing yolov5n_float.pt directly. The commands I executed in the terminal are:

 python val.py --data ./data/coco128.yaml --weights ./float/yolov5n_float.pt --batch-size 4 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode calib --nndct_quant
 python val.py --data ./data/coco128.yaml --weights ./float/yolov5n_float.pt --batch-size 1 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode test --nndct_quant
 python val.py --data ./data/coco128.yaml --weights ./float/yolov5n_float.pt --batch-size 1 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode test --nndct_quant --dump_xmodel
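
For reference, my understanding is that the --quant_mode and --nndct_quant flags drive the standard pytorch_nndct flow underneath, roughly like the minimal sketch below (this is not the actual val.py code; load_float_model, evaluate, and val_loader are placeholders):

 # Minimal sketch of the pytorch_nndct calib/test flow (not the actual val.py code).
 import torch
 from pytorch_nndct.apis import torch_quantizer

 model = load_float_model("./float/yolov5n_float.pt")   # placeholder for the model loader
 dummy_input = torch.randn(1, 3, 640, 640)

 quant_mode = "calib"                                   # "calib" first, then "test" (optionally dumping the xmodel)
 quantizer = torch_quantizer(quant_mode, model, (dummy_input,), output_dir="quantize_result")
 quant_model = quantizer.quant_model

 evaluate(quant_model, val_loader)                      # placeholder: run the evaluation loop

 if quant_mode == "calib":
     quantizer.export_quant_config()                    # writes the calibration results
 else:
     quantizer.export_xmodel(deploy_check=False)        # dumps Model_0_int.xmodel (batch size 1)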

The results of the calib and test runs are normal: the mAP is around 0.3 and the model detects some of the objects in the image. However, when I compiled the xmodel with the command below and deployed it on the ZCU102, the results are abnormal: the predicted boxes are smaller than the original labels, just like the result picture above.

vai_c_xir -x ./Model_0_int.xmodel -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json -o ./ -n model
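
One thing worth checking on the compiled model is how many subgraphs it was split into and which device each runs on, since any operator the compiler cannot map falls back to the CPU. A small sketch using the xir Python bindings that ship with Vitis AI (assuming the standard API):

 # Sketch: list the compiled xmodel's subgraphs and the device each runs on,
 # using the xir Python bindings shipped with Vitis AI.
 import xir

 graph = xir.Graph.deserialize("model.xmodel")
 for sg in graph.get_root_subgraph().toposort_child_subgraph():
     device = sg.get_attr("device") if sg.has_attr("device") else "unknown"
     print(sg.get_name(), "->", device)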

Quantizing a model produced by train.py and export.py

I also tried running train.py to obtain a best.pt file and then used export.py in the yolov5 folder to produce the xmodel. The commands I used in the terminal are:

python train.py --weights ./float/yolov5n_float.pt --data ./data/coco128.yaml --batch-size 4 --cfg ./models/yolov5n_nndct.yaml --epochs 10 --nndct_quant --hyp ./data/hyps/hyp.scratch-med.yaml --device cpu
python export.py --weights ./runs/train/exp7/weights/best.pt --data ./data/coco128.yaml --batch-size 1 --conf-thres 0.5 --iou-thres 0.65 --include nndct

The same compile command was used. The results are just as unsatisfactory as with the previous approach.

Quantizing a model produced by train.py only

Running train.py also produces a best.pt file. I then retried the first approach, quantizing this .pt file instead. The conditions and commands are the same: the mAP is again around 0.3 and the calib and test results are normal. But as soon as I deployed the xmodel on the board, the detections became a mess.

Quantizing the official QAT model (yolov5n_qat.pt)

There is also a file named yolov5n_qat.pt in the yolov5/qat folder, which is said to be the officially trained result. I tried the same steps as above; it still did not work.

So far I have tried almost every possible way to solve the problem, but it remains out of control. I suspect something goes wrong during compilation or deployment, since the quantization results themselves are fine. I hope someone who has hit the same problem can help me fix it. Thanks a lot.

Sleipnir
2260376317@qq.com

@lishixlnx

What is your prototxt file for this model?
Can you paste it here?

@sleipnier
Author

sleipnier commented Aug 24, 2023 via email

@sleipnier
Author

The prototxt is:

model {
  kernel {
     mean: 0.0
     mean: 0.0
     mean: 0.0
     scale: 0.00392156
     scale: 0.00392156
     scale: 0.00392156
  }
  model_type : YOLOv3
  yolo_v3_param {
    num_classes: 80
    anchorCnt: 3
    layer_name: "ip_fix"
    layer_name: "ip_2_fix"
    layer_name: "ip_1_fix"
    conf_threshold: 0.5
    nms_threshold: 0.65
    biases: 10
    biases: 13
    biases: 16
    biases: 30
    biases: 33 
    biases: 23
    biases: 30
    biases: 61
    biases: 62
    biases: 45 
    biases: 59
    biases: 119
    biases: 116
    biases: 90
    biases: 156
    biases: 198
    biases: 373 
    biases: 326
    test_mAP: false
    type: YOLOV5
  }
  is_tf: true 
}
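
For reference, the biases above pair up into the usual YOLOv5 anchors ((10,13), (16,30), (33,23), ..., (373,326)), and my understanding of the YOLOv5-style decode that "type: YOLOV5" is supposed to select looks roughly like the sketch below (this is the standard YOLOv5 formula, not code taken from the Vitis AI Library):

 # Sketch of the standard YOLOv5 box decode for one output layer
 # (my reading of the YOLOv5 formula, not code from the Vitis AI Library).
 import numpy as np

 def sigmoid(x):
     return 1.0 / (1.0 + np.exp(-x))

 def decode_layer(pred, anchors, stride):
     # pred: raw output of shape (ny, nx, 3, 85); anchors: three (w, h) pairs in pixels
     ny, nx = pred.shape[:2]
     gy, gx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
     grid = np.stack((gx, gy), axis=-1)[:, :, None, :]    # (ny, nx, 1, 2)
     p = sigmoid(pred)
     xy = (p[..., 0:2] * 2.0 - 0.5 + grid) * stride       # box centers in input pixels
     wh = (p[..., 2:4] * 2.0) ** 2 * np.asarray(anchors)  # box sizes in input pixels
     conf = p[..., 4:5] * p[..., 5:]                       # objectness * class scores
     return xy, wh, conf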

@lishixlnx

Can you please try again after modifying your prototxt to "is_tf: false"?

@sleipnier
Author


Thanks a lot. I am going to try this again.

@raziqi

raziqi commented Nov 3, 2023

I want to run YOLOv5 on the device via PYNQ DPU.

I have quantized and compiled the model, and am now trying to load the xmodel on the device and run detection.
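
I am roughly following the standard DPU-PYNQ flow, something like the sketch below (the file names are placeholders and the pre/post-processing is omitted):

 # Rough sketch of the standard DPU-PYNQ flow; "dpu.bit" and "yolov5.xmodel"
 # are placeholders, and pre/post-processing is omitted.
 import numpy as np
 from pynq_dpu import DpuOverlay

 overlay = DpuOverlay("dpu.bit")
 overlay.load_model("yolov5.xmodel")
 dpu = overlay.runner

 in_tensor = dpu.get_input_tensors()[0]
 out_tensors = dpu.get_output_tensors()

 img = np.zeros(tuple(in_tensor.dims), dtype=np.float32)   # preprocessed image goes here
 outs = [np.empty(tuple(t.dims), dtype=np.float32) for t in out_tensors]

 job = dpu.execute_async([img], outs)
 dpu.wait(job)
 # outs now hold the YOLO output heads to be decoded on the CPU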

@IkrameBeggar

IkrameBeggar commented Nov 13, 2023


Hello @sleipnier, I am having the same issue. Did you manage to solve it, please?

@DeepKnowledge1

I have solved this issue, please have a look here:

https://support.xilinx.com/s/question/0D54U00007SLaSYSA1/quantizing-ultralytics-yolov5-vitis-ai-v35-modifying-forward?language=en_US

@IkrameBeggar

I took a look at the link you sent, but I did not understand the approach exactly. Can you give me more details, please?
