Latest commit 32582e7 Dec 1, 2016 @samjabrahams italics fix

Inception-v3 speed: Raspberry Pi 3

Latest update: December 1, 2016; TensorFlow 0.11.0


This file contains some very basic run-time statistics for TensorFlow's pre-trained Inception-v3 model running on a Raspberry Pi 3 Model B. For comparison, the same test is run on an Early 2013 15-inch Retina MacBook Pro with an Intel i7-3740QM CPU, and on a desktop rig running Ubuntu 14.04 with a Titan X (Maxwell) GPU and an Intel i7-5820K CPU.

To run this benchmark, I use a modified version of the example script. I made minor modifications to collect and print out run-time information after processing. The modified file is available here:


  • warmup_runs is the number of model calls made before benchmarking starts, in order to "warm up" the model. TensorFlow makes adjustments on the fly, so the first few runs of the model are slower than subsequent runs.
  • A run is the time between the start of a model call and the moment it returns. We list the best, worst, and average time (averaged over 25 runs).
  • Build is the amount of time spent constructing the Inception model from the protobuf file.
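
The three measurements above can be sketched as a small timing harness. This is a minimal sketch and not the modified script itself: `benchmark`, `build_fn`, and `run_fn` are hypothetical names, and the stand-in workload at the bottom replaces the real Inception graph.

```python
import time


def benchmark(build_fn, run_fn, num_runs=25, warmup_runs=10):
    """Time build_fn once, then time num_runs calls of run_fn after warmup.

    build_fn and run_fn are hypothetical stand-ins for constructing the
    model and for a single inference call.
    """
    start = time.perf_counter()
    model = build_fn()
    build_time = time.perf_counter() - start

    # Warmup calls are executed but their times are not recorded.
    for _ in range(warmup_runs):
        run_fn(model)

    times = []
    for _ in range(num_runs):
        start = time.perf_counter()
        run_fn(model)
        times.append(time.perf_counter() - start)

    return {
        "build": build_time,
        "best": min(times),
        "worst": max(times),
        "average": sum(times) / len(times),
    }


if __name__ == "__main__":
    # Stand-in workload: "build" a list, "run" by summing it.
    stats = benchmark(lambda: list(range(100000)), sum)
    for name, seconds in stats.items():
        print("%s: %.4f sec" % (name, seconds))
```

With `warmup_runs=0`, the slow first calls land inside the timed loop, which is why the worst-run column is the one that changes most between the two configurations.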
TensorFlow version 0.11.0

| warmup_runs | Model | Best run (sec) | Worst run (sec) | Average run (sec) | Build time (sec) |
|---|---|---|---|---|---|
| 10 | Raspberry Pi 3 | 1.8646 | 2.1782 | 1.9805 | 4.8962 |
| 10 | Intel i7-3740QM (Early 2013 MacBook Pro) | 0.2146 | 0.2425 | 0.2272 | 1.3104 |
| 10 | Intel i7-5820K (Ubuntu 14.04) | 0.1397 | 0.1730 | 0.1567 | 0.7064 |
| 10 | NVIDIA Titan X (Maxwell), Intel i7-5820K (Ubuntu 14.04) | 0.0240 | 0.0290 | 0.0259 | 0.9566 |
| 0 | Raspberry Pi 3 | 1.8541 | 6.3338 | 2.0656 | 4.9755 |
| 0 | Intel i7-3740QM (Early 2013 Retina MacBook Pro) | 0.2174 | 1.3151 | 0.2662 | 1.2761 |
| 0 | Intel i7-5820K (Ubuntu 14.04) | 0.1435 | 0.7027 | 0.1750 | 0.7103 |
| 0 | NVIDIA Titan X (Maxwell), Intel i7-5820K (Ubuntu 14.04) | 0.0232 | 1.5800 | 0.0871 | 0.7659 |
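
To read the table in relative terms, the ratios can be computed directly from the reported numbers. The values below are copied from the table; the dictionary names and the shortened machine labels are just for illustration.

```python
# Average run times (sec) with warmup_runs=10, copied from the table.
average_sec = {
    "Raspberry Pi 3": 1.9805,
    "Intel i7-3740QM": 0.2272,
    "Intel i7-5820K": 0.1567,
    "Titan X + i7-5820K": 0.0259,
}

# Worst run times (sec) with warmup_runs=10 and with warmup_runs=0.
worst_warm = {
    "Raspberry Pi 3": 2.1782,
    "Intel i7-3740QM": 0.2425,
    "Intel i7-5820K": 0.1730,
    "Titan X + i7-5820K": 0.0290,
}
worst_cold = {
    "Raspberry Pi 3": 6.3338,
    "Intel i7-3740QM": 1.3151,
    "Intel i7-5820K": 0.7027,
    "Titan X + i7-5820K": 1.5800,
}

# How many times faster each machine is than the Pi, on average.
pi = average_sec["Raspberry Pi 3"]
speedup = {name: pi / sec for name, sec in average_sec.items()}

# How much slower the worst run gets when warmup is skipped.
cold_penalty = {name: worst_cold[name] / worst_warm[name] for name in worst_cold}

for name in average_sec:
    print("%-20s %6.1fx vs. Pi, worst run %5.1fx slower without warmup"
          % (name, speedup[name], cold_penalty[name]))
```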


  • Test performance has improved significantly over the past several TensorFlow releases, though running Inception on a Raspberry Pi still takes longer than a second per run when using Python.
  • Warming up your Session is crucial. Many issues have been opened in this repo asking how to improve performance, so here's the number one thing to start with: keep your Session persistent to take advantage of automatic optimization tweaks.
  • Along the same lines: do not simply call your Python script from bash every time you want to classify an image. It takes multiple seconds to rebuild the Inception graph from scratch, which can make each classification several times slower (and this test doesn't include the time it takes to import tensorflow, which is another thing to benchmark...). This goes for pretty much any TensorFlow model you use: keep some sort of rudimentary server running that can respond to requests and use a live TensorFlow Session.
  • Running the TensorFlow benchmark tool shows sub-second (~500-600ms) average run times for the Raspberry Pi (I'll need to do another write-up with more details). Since this benchmark is run entirely in C++, we'd expect it to run faster than through Python. The question is whether the ~1.5 seconds of difference between these tests is entirely due to the communication layer between Python and the C++ core.
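
The "rudimentary server" advice above could look something like the following. This is only a sketch using Python 3's standard library: `load_model`, `CLASSIFY`, `ClassifyHandler`, and `serve` are hypothetical names, and the placeholder `classify` function stands in for whatever code builds the graph and runs an image through a live Session.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_model():
    """Hypothetical stand-in for the slow step (build graph, open Session)."""
    def classify(image_path):
        # A real implementation would run the image through the live Session.
        return {"image": image_path, "label": "placeholder"}
    return classify


# Built once at startup and reused by every request.
CLASSIFY = load_model()


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request body is assumed to be a plain-text image path.
        length = int(self.headers.get("Content-Length", 0))
        image_path = self.rfile.read(length).decode("utf-8").strip()
        body = json.dumps(CLASSIFY(image_path)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Block forever, answering requests with the already-loaded model."""
    HTTPServer((host, port), ClassifyHandler).serve_forever()
```

Calling `serve()` starts the loop. The point of the design is that the build cost and the slow first runs are paid once when the process starts, not on every classification.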


I added two flags to the script that let users easily change the number of test runs (runs that collect timing information) and the number of "warmup" runs. Simply pass a number to --num_runs or --warmup_runs when calling the script:

# Use a sample size of 100 runs
$ python --num_runs=100

# Don't include any warmup runs
$ python --warmup_runs=0
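
For reference, two flags like these could be declared as follows. This sketch uses Python's standard argparse module, which is an assumption; the modified script may define its flags differently. The default of 25 test runs matches the sample size quoted above, while the default of 10 warmup runs is only a guess taken from the first benchmark configuration.

```python
import argparse


def parse_args(argv=None):
    """Parse the two benchmark flags (defaults are assumptions, see above)."""
    parser = argparse.ArgumentParser(description="Benchmark flags (sketch).")
    parser.add_argument("--num_runs", type=int, default=25,
                        help="number of timed test runs")
    parser.add_argument("--warmup_runs", type=int, default=10,
                        help="number of untimed warmup runs")
    return parser.parse_args(argv)


# Passing an explicit list mimics the command lines shown above.
args = parse_args(["--num_runs=100"])
print(args.num_runs, args.warmup_runs)
```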