Some TensorFlow Test Results

Overview

Unfortunately, TF does not support runtime code dispatch based on CPU capabilities. Hence, recompiling TF with SIMD instructions enabled is expected to bring a performance boost.

The exception is when TF is built against MKL: MKL can dynamically select code paths according to the CPU.
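To illustrate what runtime dispatch means in general, here is a minimal Python sketch that reads the SIMD flags the CPU advertises and picks a code path accordingly. It is not how TF or MKL is implemented. The names `PREFERRED`, `cpu_flags`, and `select_code_path` are hypothetical, and reading `/proc/cpuinfo` assumes a Linux system.

```python
# Sketch: choose a code path at runtime from the CPU's advertised SIMD flags.
# Assumption: Linux, where the flags are listed in /proc/cpuinfo.

# Hypothetical preference order, widest SIMD extension first.
PREFERRED = ["avx512f", "avx2", "avx", "sse4_2"]


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def select_code_path(flags):
    """Return the best supported SIMD level, or 'generic' if none match."""
    for level in PREFERRED:
        if level in flags:
            return level
    return "generic"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or file unreadable: fall back to generic
    print(select_code_path(cpu_flags(text)))
```

A binary built without such a check only ever runs the instruction set chosen at compile time, which is why the generic build cannot use AVX2 even on a CPU that supports it.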

[amd64/CPU] generic code / native code, inception5h

Source, Data, Basic Environment

Benchmark Procedure

  1. Compile the benchmark program with sh debian/tests/tf_tool_benchmark_model.sh
  2. Download the pretrained model and unzip it into the parent directory.
  3. Run ./tf_benchmark_model --graph=../tensorflow_inception_graph.pb

Results

| CPU | Compiler | Average Timing (Generic -O2) (µs) | Average Timing (Opt march=native) (µs) | Boost |
|---|---|---|---|---|
| I5-7440HQ (4C4T @ 2.8~3.8GHz, AVX2) | GCC (Debian 8.2.0-6) | 66953.4 ± 3017 | 24661.1 ± 1769 | 2.71x |
| I5-7440HQ (4C4T @ 2.8~3.8GHz, AVX2) | Clang 7.0.0-+rc2-1~exp3 (tags/RELEASE_700/rc2) | 49932.8 ± 2550 | 26324.7 ± 1743 | 1.90x |
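The Boost column is the generic timing divided by the native timing. A minimal sketch that recomputes it from the averages above (the `boost` helper and the `results` layout are illustrative, not part of the benchmark tooling):

```python
def boost(generic_us, native_us):
    """Speedup of the native build over the generic build."""
    return generic_us / native_us


# (compiler, generic average in µs, native average in µs), from the table
results = [
    ("GCC", 66953.4, 24661.1),
    ("Clang", 49932.8, 26324.7),
]

for compiler, generic_us, native_us in results:
    print(f"{compiler}: {boost(generic_us, native_us):.2f}x")
# prints:
# GCC: 2.71x
# Clang: 1.90x
```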

See Also

  1. https://github.com/soumith/convnet-benchmarks
  2. CPU details: https://ark.intel.com/

Copyright

lumin AT debian.org, CC-BY-SA 4.0