
Extend performance test models #24298

Merged on Oct 4, 2023 (7 commits)


Member

@WanliZhong WanliZhong commented Sep 19, 2023

Merged with opencv/opencv_extra#1095

This PR aims to extend the performance tests.

  • YOLOv5 for object detection
  • YOLOv8 for object detection
  • EfficientNet for classification

Models from OpenCV Zoo:

  • YOLOX for object detection
  • YuNet for face detection
  • SFace for face recognition
  • MPPalm for palm detection
  • MPHand for hand landmark
  • MPPose for pose estimation
  • ViTTrack for object tracking
  • PPOCRv3 for text detection
  • CRNN for text recognition
  • PPHumanSeg for human segmentation

If other models should be added, please leave a comment. Thanks!

Build OpenCV with the following CMake options:

-DBUILD_opencv_python2=OFF
-DBUILD_opencv_python3=OFF
-DBUILD_opencv_gapi=OFF
-DINSTALL_PYTHON_EXAMPLES=OFF
-DINSTALL_C_EXAMPLES=OFF
-DBUILD_DOCS=OFF
-DBUILD_EXAMPLES=OFF
-DBUILD_ZLIB=OFF
-DWITH_FFMPEG=OFF

Performance Test on Apple M2 CPU

macOS 14.0
8 threads

1 thread:

| Name of Test | 4.5.5-1th | 4.6.0-1th | 4.7.0-1th | 4.8.0-1th | 4.8.1-1th |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN | 76.244 | 76.611 | 62.534 | 57.678 | 57.238 |
| EfficientNet | --- | --- | 109.224 | 130.753 | 109.076 |
| MPHand | --- | --- | 19.289 | 22.727 | 27.593 |
| MPPalm | 47.150 | 47.061 | 41.064 | 65.598 | 40.109 |
| MPPose | --- | --- | 26.592 | 32.022 | 26.956 |
| PPHumanSeg | 41.672 | 41.790 | 27.819 | 27.212 | 30.461 |
| PPOCRv3 | --- | --- | 140.371 | 187.922 | 170.026 |
| SFace | 43.830 | 43.834 | 27.575 | 30.653 | 26.387 |
| ViTTrack | --- | --- | --- | 14.617 | 15.028 |
| YOLOX | 1060.507 | 1061.361 | 495.816 | 533.309 | 549.713 |
| YOLOv5 | --- | --- | --- | 191.350 | 193.261 |
| YOLOv8 | --- | --- | 198.893 | 218.733 | 223.142 |
| YuNet | 27.084 | 27.095 | 26.238 | 30.512 | 34.439 |
| MobileNet_SSD_Caffe | 44.742 | 44.565 | 33.005 | 29.421 | 29.286 |
| MobileNet_SSD_v1_TensorFlow | 49.352 | 49.274 | 35.163 | 32.134 | 31.904 |
| MobileNet_SSD_v2_TensorFlow | 83.537 | 83.379 | 56.403 | 42.947 | 42.148 |
| ResNet_50 | 148.872 | 148.817 | 77.331 | 67.682 | 67.760 |

n threads:

| Name of Test | 4.5.5-nth | 4.6.0-nth | 4.7.0-nth | 4.8.0-nth | 4.8.1-nth |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN | 44.262 | 44.408 | 41.540 | 40.731 | 41.151 |
| EfficientNet | --- | --- | 28.683 | 42.676 | 38.204 |
| MPHand | --- | --- | 6.738 | 13.126 | 8.155 |
| MPPalm | 16.613 | 16.588 | 12.477 | 31.370 | 17.048 |
| MPPose | --- | --- | 12.985 | 19.700 | 16.537 |
| PPHumanSeg | 14.993 | 15.133 | 13.438 | 15.269 | 15.252 |
| PPOCRv3 | --- | --- | 63.752 | 85.469 | 76.190 |
| SFace | 10.685 | 10.822 | 8.127 | 8.318 | 7.934 |
| ViTTrack | --- | --- | --- | 10.079 | 9.579 |
| YOLOX | 417.358 | 422.977 | 230.036 | 234.662 | 228.555 |
| YOLOv5 | --- | --- | --- | 74.249 | 75.480 |
| YOLOv8 | --- | --- | 63.762 | 88.770 | 70.927 |
| YuNet | 8.589 | 8.731 | 11.269 | 16.466 | 14.513 |
| MobileNet_SSD_Caffe | 12.575 | 12.636 | 11.529 | 12.114 | 12.236 |
| MobileNet_SSD_v1_TensorFlow | 13.922 | 14.160 | 13.078 | 12.124 | 13.298 |
| MobileNet_SSD_v2_TensorFlow | 25.096 | 24.836 | 22.823 | 20.238 | 20.319 |
| ResNet_50 | 41.561 | 41.296 | 29.092 | 30.412 | 29.339 |

Performance Test on Intel Core i7-12700K

Ubuntu 22.04.2 LTS
8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz)
4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz)
20 threads

1 thread:

| Name of Test | 4.5.5-1th | 4.6.0-1th | 4.7.0-1th | 4.8.0-1th | 4.8.1-1th |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN | 16.752 | 16.851 | 16.840 | 16.625 | 16.663 |
| EfficientNet | --- | --- | 61.107 | 76.037 | 53.890 |
| MPHand | --- | --- | 8.906 | 9.969 | 8.403 |
| MPPalm | 24.243 | 24.638 | 18.104 | 35.140 | 18.387 |
| MPPose | --- | --- | 12.322 | 16.515 | 12.355 |
| PPHumanSeg | 15.249 | 15.303 | 10.203 | 10.298 | 10.353 |
| PPOCRv3 | --- | --- | 87.788 | 144.253 | 90.648 |
| SFace | 15.583 | 15.884 | 13.957 | 13.298 | 13.284 |
| ViTTrack | --- | --- | --- | 11.760 | 11.710 |
| YOLOX | 324.927 | 325.173 | 235.986 | 253.653 | 254.472 |
| YOLOv5 | --- | --- | --- | 102.163 | 102.621 |
| YOLOv8 | --- | --- | 87.013 | 103.182 | 103.146 |
| YuNet | 12.806 | 12.645 | 10.515 | 12.647 | 12.711 |
| MobileNet_SSD_Caffe | 23.556 | 23.768 | 24.304 | 22.569 | 22.602 |
| MobileNet_SSD_v1_TensorFlow | 26.136 | 26.276 | 26.854 | 24.828 | 24.961 |
| MobileNet_SSD_v2_TensorFlow | 43.521 | 43.614 | 46.892 | 44.044 | 44.682 |
| ResNet_50 | 73.588 | 73.501 | 75.191 | 66.893 | 65.144 |

n threads:

| Name of Test | 4.5.5-nth | 4.6.0-nth | 4.7.0-nth | 4.8.0-nth | 4.8.1-nth |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN | 8.665 | 8.827 | 10.643 | 7.703 | 7.743 |
| EfficientNet | --- | --- | 16.591 | 12.715 | 9.022 |
| MPHand | --- | --- | 2.678 | 2.785 | 1.680 |
| MPPalm | 5.309 | 5.319 | 3.822 | 10.568 | 4.467 |
| MPPose | --- | --- | 3.644 | 6.088 | 4.608 |
| PPHumanSeg | 4.756 | 4.865 | 5.084 | 5.179 | 5.148 |
| PPOCRv3 | --- | --- | 32.023 | 50.591 | 32.414 |
| SFace | 3.838 | 3.980 | 4.629 | 3.145 | 3.155 |
| ViTTrack | --- | --- | --- | 10.335 | 10.357 |
| YOLOX | 68.314 | 68.081 | 82.801 | 74.219 | 73.970 |
| YOLOv5 | --- | --- | --- | 47.150 | 47.523 |
| YOLOv8 | --- | --- | 32.195 | 30.359 | 30.267 |
| YuNet | 2.604 | 2.644 | 2.622 | 3.278 | 3.349 |
| MobileNet_SSD_Caffe | 13.005 | 5.935 | 8.586 | 4.629 | 4.713 |
| MobileNet_SSD_v1_TensorFlow | 7.002 | 7.129 | 9.314 | 5.271 | 5.213 |
| MobileNet_SSD_v2_TensorFlow | 11.939 | 12.111 | 22.688 | 12.038 | 12.086 |
| ResNet_50 | 18.227 | 18.600 | 26.150 | 15.584 | 15.706 |
force_builders=Linux32,Win32,Win64 OpenCL
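
To read the tables above as release-to-release changes, a small helper like the one below can be used. It is only a sketch: `relativeChangePercent`, `Row`, `sampleRows` and `grewFrom470To480` are hypothetical names, not part of OpenCV or its perf scripts, and it assumes the figures are timings where lower is better, which the tables do not state explicitly. The three sample rows are copied from the Intel Core i7-12700K n-thread table.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: percentage change from `before` to `after`.
// For timings, a positive result means the newer release is slower.
double relativeChangePercent(double before, double after)
{
    return (after - before) / before * 100.0;
}

// One table row restricted to the 4.7.0, 4.8.0 and 4.8.1 columns.
struct Row
{
    std::string model;
    double v470;
    double v480;
    double v481;
};

// Values copied from the Intel Core i7-12700K n-thread table.
std::vector<Row> sampleRows()
{
    return {
        {"MPPalm", 3.822, 10.568, 4.467},
        {"PPOCRv3", 32.023, 50.591, 32.414},
        {"EfficientNet", 16.591, 12.715, 9.022},
    };
}

// Names of models whose figure grew by more than `thresholdPercent`
// between the 4.7.0 and 4.8.0 columns.
std::vector<std::string> grewFrom470To480(const std::vector<Row>& rows,
                                          double thresholdPercent)
{
    std::vector<std::string> names;
    for (const Row& row : rows)
    {
        if (relativeChangePercent(row.v470, row.v480) > thresholdPercent)
            names.push_back(row.model);
    }
    return names;
}
```

Calling `grewFrom470To480(sampleRows(), 10.0)` flags the rows whose 4.8.0 figure is more than 10% above the 4.7.0 one; the threshold is an arbitrary choice for illustration.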

@WanliZhong
Member Author

@dkurt Hi, I want to introduce more models for the performance tests, but I ran into two problems.

  1. The ViTTrack model has 2 inputs.
    [image]

  2. Some models need more than one outputLayer to measure their performance.
    [image]

How should I modify the processNet method? Could I just add more overloads?

@dkurt
Member

dkurt commented Sep 21, 2023

@WanliZhong, to make the model produce all outputs, you can use getUnconnectedOutLayersNames:

std::vector<String> outNames = net.getUnconnectedOutLayersNames();
std::vector<Mat> outs;
net.forward(outs, outNames);

For multiple inputs, I think it makes sense to add one more overload that takes a map of named Mat inputs:

void processNet(std::string weights, std::string proto, std::string halide_scheduler,
                const Mat& input, const std::string& outputLayer = "")
{
    processNet(weights, proto, halide_scheduler, {{"", input}}, outputLayer);
}

void processNet(std::string weights, std::string proto, std::string halide_scheduler,
                const std::map<std::string, Mat>& inputs, const std::string& outputLayer = "")
{
    // use setInput for multiple inputs
}
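
The delegation idea can be tried out without OpenCV. The sketch below is not the code that was merged: `Tensor` and `FakeNet` are hypothetical stand-ins for `cv::Mat` and the network, the parameter list is shortened, and the input names used in the usage note are made up for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for cv::Mat so the sketch compiles without OpenCV.
struct Tensor
{
    std::vector<float> data;
};

// Hypothetical stand-in for the network: it only records which input names were bound.
struct FakeNet
{
    std::vector<std::string> boundInputs;

    void setInput(const Tensor& /*blob*/, const std::string& name)
    {
        boundInputs.push_back(name);
    }
};

// Multi-input overload: bind every named input to the network.
void processNet(FakeNet& net, const std::map<std::string, Tensor>& inputs)
{
    for (const auto& entry : inputs)
        net.setInput(entry.second, entry.first);
}

// Single-input overload: wrap the input under an empty name and delegate.
void processNet(FakeNet& net, const Tensor& input)
{
    const std::map<std::string, Tensor> inputs{{"", input}};
    processNet(net, inputs);
}
```

A two-input model would call the map overload directly, for example `processNet(net, {{"template", a}, {"search", b}})`, where both names are placeholders. Building the map explicitly in the single-input overload keeps overload resolution unambiguous.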

@WanliZhong
Member Author

@dkurt Hi, I have added more models to the performance tests. Do you have any other models to recommend, such as a lightweight transformer model?

@dkurt
Member

dkurt commented Sep 22, 2023

> @dkurt Hi, I have added more models to the performance tests. Do you have any other models to recommend, such as a lightweight transformer model?

I don't have other models in mind right now, so the proposed set is fine.

@WanliZhong WanliZhong marked this pull request as ready for review September 25, 2023 10:04
@WanliZhong
Member Author

I'll add a comparison of performance tests from previous releases later.

@asmorkalov asmorkalov added this to the 4.9.0 milestone Sep 27, 2023
@asmorkalov
Contributor

@WanliZhong Please rebase and fix conflicts.

@vpisarev
Contributor

@WanliZhong, thank you very much, this is really useful information! Can you please do the following?

  1. Remove single-thread tables, they are much less important, don't waste time on it.
  2. Add Resnet-50 and MobileNetSSD to the list of models.
  3. Please, provide more details on "Intel CPU" - CPU model, number of cores, OS and compiler used for tests.

@vpisarev vpisarev self-requested a review September 29, 2023 07:25
Member

@dkurt dkurt left a comment


@opencv-alalek
Contributor

> Remove single-thread tables, they are much less important, don't waste time on it.

Disagree.
Single-thread performance is an important part of investigating performance degradations.

Using a single thread makes it possible to:

  • reduce the complexity of the tested code
  • improve the stability of results, since the performance of a multi-threaded system has its own issues
  • see issues with SIMD optimizations clearly

BTW, there are too many places with misused stripes of cv::parallel_for_() in OpenCV DNN. Search for getNumThreads().
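
The comment does not say which misuse is meant. One possible reading, sketched here purely as an assumption, is work being cut into exactly one equal chunk per thread, so that the stripe count follows the thread count instead of the amount of work. `stripeBounds` and `emptyStripes` are hypothetical helpers written for this illustration and are not OpenCV functions.

```cpp
#include <algorithm>
#include <utility>

// Half-open [begin, end) range of work items given to one stripe when `total`
// items are cut into `nstripes` equal chunks (the last chunks may be short or empty).
std::pair<int, int> stripeBounds(int total, int nstripes, int stripeIndex)
{
    const int stripeSize = (total + nstripes - 1) / nstripes; // ceiling division
    const int begin = std::min(stripeIndex * stripeSize, total);
    const int end = std::min(begin + stripeSize, total);
    return {begin, end};
}

// Number of stripes that receive no work at all for a given split.
int emptyStripes(int total, int nstripes)
{
    int empty = 0;
    for (int i = 0; i < nstripes; ++i)
    {
        const std::pair<int, int> bounds = stripeBounds(total, nstripes, i);
        if (bounds.first == bounds.second)
            ++empty;
    }
    return empty;
}
```

With 10 work items, 4 stripes all get work, while 8 stripes leave 3 of them empty, so the same layer behaves differently depending only on the thread count. Results of this kind vary with the machine, which is one reason single-thread figures are easier to compare.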

@WanliZhong
Member Author

I have resolved the conflicts and added test results for the additional models. I have kept the single-thread test results.

@asmorkalov asmorkalov merged commit 62b5470 into opencv:4.x Oct 4, 2023
17 of 24 checks passed
hanliutong pushed a commit to hanliutong/opencv that referenced this pull request Oct 7, 2023
Extend performance test models opencv#24298

@asmorkalov asmorkalov mentioned this pull request Oct 17, 2023
@fengyuentau
Member

> BTW, there are too many places with misused stripes of cv::parallel_for_() in OpenCV DNN. Search for getNumThreads().

It is worth fixing all of these layers.

thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
Extend performance test models opencv#24298

thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
Extend performance test models opencv#24298
