Benchmark

Lianmin Zheng edited this page Oct 24, 2018 · 22 revisions

About

This page contains benchmark results for several popular image classification models. We auto-tune every listed model on each target platform and benchmark inference performance, reported as the time cost per image.
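To make the metric concrete, here is a minimal sketch of how a "time cost per image" figure could be measured at batch_size = 1. It is not the script used to produce the numbers on this page. The names `time_per_image_ms` and `fake_inference` and the warm-up and repeat counts are assumptions chosen for illustration, and `fake_inference` merely stands in for a real forward pass.

```python
import time
from statistics import mean


def time_per_image_ms(run_once, warmup=3, repeat=10):
    """Return the mean wall-clock time of run_once() in milliseconds.

    run_once is assumed to perform exactly one batch_size = 1 inference.
    """
    # Warm-up runs are discarded so one-time setup costs do not skew the mean.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def fake_inference():
    # Hypothetical stand-in workload; replace with a real model call.
    sum(i * i for i in range(10_000))


latency_ms = time_per_image_ms(fake_inference)
print(f"{latency_ms:.2f} ms per image")
```

Averaging over several repeats after a warm-up is a common way to reduce noise; the actual benchmark scripts may use different counts or a different timer.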

ARM CPU

Note: If a board has a big.LITTLE architecture, we use only the big cores (all of them). Otherwise, we use all cores. The device specifications below list only the cores that are used.

Devices

  • Firefly-RK3399: 2 x Cortex-A72 1.8 GHz
  • Raspberry Pi 3B: 4 x Cortex-A53 1.2 GHz
  • Huawei P20 Pro / Mate 10 Pro (SoC: HiSilicon Kirin 970): 4 x Cortex-A73 2.36 GHz
  • Google Pixel 2 (SoC: Qualcomm Snapdragon 835): 4 x Kryo 2.35 GHz
  • PYNQ: 2 x Cortex-A9 650 MHz

Results

  • dtype = float32, batch_size = 1 (unit: ms)

| Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Raspberry Pi 3B | 610.2 | 2074.2 | 121.8 | 104.8 | 320.0 | 726.0 | 185.1 | 94.0 | 1772.0 | 2119.8 |
| Firefly-RK3399 | 336.8 | 1304.4 | 77.9 | 64.8 | 158.6 | 403.2 | 94.3 | 48.2 | 903.5 | 1086.0 |
| Huawei P20 Pro | 179.7 | 444.7 | 41.3 | 33.4 | 77.4 | 232.5 | 51.4 | 26.0 | 486.3 | 729.4 |
| Google Pixel 2 | 161.0 | 434.8 | 39.6 | 29.3 | 66.0 | 181.1 | 47.3 | 23.0 | 397.1 | 485.0 |
| Xilinx PYNQ | 2887.0 | 9691.7 | 721.4 | 513.3 | 1231.7 | 3585.5 | 913.0 | 478.3 | -1.0 | -1.0 |

Mobile GPU

Devices

  • Mali-T860 MP4: on Firefly-RK3399, with its frequency locked to 800 MHz.

Results

  • dtype = float32, batch_size = 1 (unit: ms)
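
| Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mali-T860 | 410.6 | 784.7 | 79.5 | 77.7 | 127.3 | 354.7 | 111.0 | 62.5 | 673.2 | 792.1 |
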
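  • dtype = float16, batch_size = 1 (unit: ms)

| Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mali-T860 | 295.4 | 464.9 | 52.9 | 60.7 | 84.3 | 221.0 | 77.3 | 46.7 | 405.6 | 472.8 |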
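
The two Mali-T860 result sets can be compared directly. The sketch below computes the per-model speedup of float16 over float32 as the ratio of the two latencies; the values are copied from the results above, and the variable names are illustrative only.

```python
models = [
    "densenet-121", "inception-v3", "mobilenet", "mobilenet-v2", "resnet-18",
    "resnet-50", "squeezenet-v1.0", "squeezenet-v1.1", "vgg-16", "vgg-19",
]
# Mali-T860 latencies in ms at batch_size = 1, copied from the results above.
fp32_ms = [410.6, 784.7, 79.5, 77.7, 127.3, 354.7, 111.0, 62.5, 673.2, 792.1]
fp16_ms = [295.4, 464.9, 52.9, 60.7, 84.3, 221.0, 77.3, 46.7, 405.6, 472.8]

# Speedup > 1 means float16 is faster than float32 for that model.
speedup = {m: a / b for m, a, b in zip(models, fp32_ms, fp16_ms)}

for model, ratio in sorted(speedup.items(), key=lambda kv: kv[1]):
    print(f"{model:16s} {ratio:.2f}x")
```

On these numbers float16 is faster for every model, by roughly 1.3x (mobilenet-v2) to 1.7x (inception-v3).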

NVIDIA GPU

Devices

  • Jetson TX2: in Max-N mode, 1.3 GHz
  • GTX 1080 Ti, GTX TITAN X

Results

  • dtype = float32, batch_size = 1 (unit: ms)
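
| Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GTX 1080 Ti | 3.6 | 5.8 | 0.7 | 1.0 | 1.1 | 2.8 | 4.2 | 4.8 |
| GTX TITAN X | 5.8 | 9.9 | 1.0 | 1.6 | 1.6 | 4.3 | 6.3 | 7.4 |
| Jetson TX2 | 26.8 | 45.7 | 5.2 | 8.8 | 9.6 | 26.2 | 58.2 | 68.8 |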
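
Because every result here is a per-image latency at batch_size = 1, the equivalent single-stream throughput is just its reciprocal. The sketch below converts the resnet-50 latencies above to images per second. The helper `images_per_second` is a hypothetical name, and the conversion says nothing about throughput at larger batch sizes, which this page does not measure.

```python
def images_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        # A non-positive value (for example -1.0) is not a valid latency.
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# resnet-50 latencies in ms (float32, batch_size = 1), copied from the results above.
resnet50_ms = {"GTX 1080 Ti": 2.8, "GTX TITAN X": 4.3, "Jetson TX2": 26.2}

throughput = {dev: images_per_second(ms) for dev, ms in resnet50_ms.items()}
for device, ips in throughput.items():
    print(f"{device}: {ips:.1f} images/s")
```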

AMD GPU

  • dtype = float32, batch_size = 1 (unit: ms)

| Device | densenet-121 | inception-v3 | mobilenet | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Vega FE | 5.8 | 8.9 | 1.0 | 1.6 | 4.5 | 6.3 | 7.2 |

Reproduce

See the README at https://github.com/dmlc/tvm/tree/master/apps/benchmark for instructions on reproducing these numbers.
