Paddle Benchmark

Inference benchmark of deep learning models implemented by paddlepaddle.

Environment

  • Xiaomi Mi 5, Android 7.0, Snapdragon 820 @ 1.8GHz
  • android-ndk-r13b
    • gcc version 4.9.x 20150123 (prerelease) (GCC)
    • Android clang version 3.8.256229 (based on LLVM 3.8.256229)

MobileNet

Benchmark of MobileNet inference (input image 3x224x224).

Currently, on the Mi 5, single-threaded inference takes 122.607 ms and uses 48 MB of system memory.

| version | time (ms) | mem (MB) | size (KB) | optimization (speedup) |
|---------|-----------|----------|-----------|------------------------|
| d2258a4 | 321.682 | - | - | base |
| d2258a4 | 225.044 | - | - | merge bn (30%) |
| b45d020 | 148.201 | - | - | depthwise convolution (34.1%) |
| 0146e8b | 127.032 | - | - | clang compile (14.3%) |
| d59295f | 122.607 | 48 | 4306 -> 1431 | neon::relu (3.5%) |
  • The convolution layers in the base version are implemented the im2col + GEMM way (see the im2col sketch after this list).
  • The merge bn optimization folds the parameters of each batch normalization layer into the parameters of the preceding convolution layer (see the folding sketch below).
  • The depthwise convolution optimization is a depthwise convolution implementation based on ARM NEON intrinsics (see the sketch below).
  • The clang compile row reflects that the clang-built binary runs faster than the gcc-built one.
  • mem (MB) is measured by running the paddle inference program and using the free command to observe the change in system memory usage.
  • In the size (KB) column, the first value is the size of the paddle inference .so, and the second is its size after zip compression.
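For reference, this is roughly what the im2col step of the base path does; a minimal C++ sketch (not the actual paddle code), assuming NCHW float32 input and zero padding. After this transform, the whole convolution reduces to a single GEMM between the filter matrix and the column buffer.

```cpp
// im2col: unpack each (channel, ky, kx) filter tap into one row of `col`,
// one column per output position, so conv becomes
//   output[out_ch][out_h*out_w] = W[out_ch][channels*kh*kw] * col.
void im2col(const float* img, int channels, int height, int width,
            int kh, int kw, int stride, int pad, float* col) {
    const int out_h = (height + 2 * pad - kh) / stride + 1;
    const int out_w = (width + 2 * pad - kw) / stride + 1;
    for (int c = 0; c < channels; ++c) {
        for (int ky = 0; ky < kh; ++ky) {
            for (int kx = 0; kx < kw; ++kx) {
                const int row = (c * kh + ky) * kw + kx;
                for (int oy = 0; oy < out_h; ++oy) {
                    for (int ox = 0; ox < out_w; ++ox) {
                        const int iy = oy * stride - pad + ky;
                        const int ix = ox * stride - pad + kx;
                        col[(row * out_h + oy) * out_w + ox] =
                            (iy >= 0 && iy < height && ix >= 0 && ix < width)
                                ? img[(c * height + iy) * width + ix]
                                : 0.0f;  // zero padding
                    }
                }
            }
        }
    }
}
```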
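The merge bn transform is standard batch-norm folding; a minimal sketch of the arithmetic (not paddle's implementation), assuming per-output-channel gamma/beta/mean/var taken from the trained model:

```cpp
// At inference time, BN applied after a convolution is
//   y = gamma * (conv(x, W) + b - mean) / sqrt(var + eps) + beta,
// which is itself a plain convolution with scaled weights and shifted bias:
//   W' = scale * W,  b' = scale * (b - mean) + beta,  scale = gamma / sqrt(var + eps).
#include <cmath>
#include <vector>

void fold_batch_norm(std::vector<float>& weights,  // [out_ch * in_ch * kh * kw]
                     std::vector<float>& bias,     // [out_ch]
                     const std::vector<float>& gamma,
                     const std::vector<float>& beta,
                     const std::vector<float>& mean,
                     const std::vector<float>& var,
                     float eps = 1e-5f) {
    const size_t out_ch = bias.size();
    const size_t per_ch = weights.size() / out_ch;
    for (size_t oc = 0; oc < out_ch; ++oc) {
        const float scale = gamma[oc] / std::sqrt(var[oc] + eps);
        for (size_t i = 0; i < per_ch; ++i) {
            weights[oc * per_ch + i] *= scale;               // W' = scale * W
        }
        bias[oc] = scale * (bias[oc] - mean[oc]) + beta[oc]; // b' = scale*(b-mean)+beta
    }
}
```

Since the BN layer disappears entirely, this saves one full pass over every feature map, which is where the 30% speedup comes from.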
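A minimal NEON sketch of the depthwise idea (again, not the actual paddle kernel), assuming 3x3 filters, stride 1, and no padding. Each channel is convolved with only its own single filter, so there is no im2col buffer and no large GEMM:

```cpp
#include <arm_neon.h>

// Depthwise 3x3 convolution, vectorized over output width (4 outputs per
// NEON register); leftover columns fall back to scalar code.
void depthwise_conv3x3(const float* in, int channels, int h, int w,
                       const float* filters,  // [channels * 9]
                       float* out) {          // [channels * (h-2) * (w-2)]
    const int out_h = h - 2, out_w = w - 2;
    for (int c = 0; c < channels; ++c) {
        const float* src = in + c * h * w;
        const float* f = filters + c * 9;
        float* dst = out + c * out_h * out_w;
        for (int y = 0; y < out_h; ++y) {
            int x = 0;
            for (; x + 4 <= out_w; x += 4) {
                float32x4_t acc = vdupq_n_f32(0.0f);
                for (int ky = 0; ky < 3; ++ky) {
                    const float* row = src + (y + ky) * w + x;
                    acc = vmlaq_n_f32(acc, vld1q_f32(row),     f[ky * 3 + 0]);
                    acc = vmlaq_n_f32(acc, vld1q_f32(row + 1), f[ky * 3 + 1]);
                    acc = vmlaq_n_f32(acc, vld1q_f32(row + 2), f[ky * 3 + 2]);
                }
                vst1q_f32(dst + y * out_w + x, acc);
            }
            for (; x < out_w; ++x) {  // scalar tail
                float acc = 0.0f;
                for (int ky = 0; ky < 3; ++ky)
                    for (int kx = 0; kx < 3; ++kx)
                        acc += src[(y + ky) * w + x + kx] * f[ky * 3 + kx];
                dst[y * out_w + x] = acc;
            }
        }
    }
}
```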
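The neon::relu row in the table corresponds to vectorizing the activation; a minimal sketch with NEON intrinsics, assuming an in-place float32 buffer whose length need not be a multiple of 4:

```cpp
#include <arm_neon.h>
#include <algorithm>

// relu(x) = max(x, 0), four elements at a time; scalar tail for leftovers.
void neon_relu(float* data, int n) {
    const float32x4_t zero = vdupq_n_f32(0.0f);
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(data + i);
        vst1q_f32(data + i, vmaxq_f32(v, zero));
    }
    for (; i < n; ++i) {
        data[i] = std::max(data[i], 0.0f);
    }
}
```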
