Reference implementations of inference benchmarks
christ1ne Merge pull request #273 from ClarkChin08/moudle_change
[green] change preprocess module from PIL to opencv
Latest commit 939d5f9 Jul 17, 2019

MLPerf Inference Reference Implementations

This is a repository of reference implementations for the MLPerf inference benchmark suite.

Reference implementations are valid only as starting points for benchmark implementations. They are not fully optimized, and they are not intended to be used for "real" performance measurements of software frameworks or hardware platforms. The objective is for hardware and software vendors to take these reference implementations, optimize them, and submit them as optimized solutions to the MLPerf Inference call for submissions.

Preliminary release (v0.5)

The MLPerf inference release is very much an "alpha" release -- it could be improved in many ways. The benchmark suite is still being developed and refined. Please see the suggestions section below to learn how to contribute. We anticipate a significant round of updates after the v0.5 call for submissions finishes. Much of the input will be taken into account for v0.6.


We provide reference implementations for each of the 5 benchmarks in the MLPerf inference suite:

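| Area     | Task                 | Model             | Dataset            | Quality Target | Latency Constraint |
|----------|----------------------|-------------------|--------------------|----------------|--------------------|
| Vision   | Image classification | Resnet50-v1.5     | ImageNet (224x224) | TBD            | TBD                |
| Vision   | Image classification | MobileNets-v1 224 | ImageNet (224x224) | TBD            | TBD                |
| Vision   | Object detection     | SSD-ResNet34      | COCO (1200x1200)   | TBD            | TBD                |
| Vision   | Object detection     | SSD-MobileNets-v1 | COCO (300x300)     | TBD            | TBD                |
| Language | Machine translation  | GNMT              | WMT16              | TBD            | TBD                |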
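
For scripting against the suite, the benchmark list above could be mirrored in a small lookup structure. The following is only an illustrative sketch: the `Benchmark` tuple, the `BENCHMARKS` list, and the `models_for` helper are hypothetical names that do not exist in this repository, and the targets that are still TBD are represented as `None`.

```python
from collections import namedtuple

# Hypothetical record type; the field names mirror the table columns above.
Benchmark = namedtuple(
    "Benchmark",
    "area task model dataset quality_target latency_constraint",
)

# Quality targets and latency constraints are TBD in v0.5, so they are None here.
BENCHMARKS = [
    Benchmark("Vision", "Image classification", "Resnet50-v1.5", "ImageNet (224x224)", None, None),
    Benchmark("Vision", "Image classification", "MobileNets-v1 224", "ImageNet (224x224)", None, None),
    Benchmark("Vision", "Object detection", "SSD-ResNet34", "COCO (1200x1200)", None, None),
    Benchmark("Vision", "Object detection", "SSD-MobileNets-v1", "COCO (300x300)", None, None),
    Benchmark("Language", "Machine translation", "GNMT", "WMT16", None, None),
]


def models_for(task):
    """Return the model names listed for a task (case-insensitive match)."""
    wanted = task.strip().lower()
    return [b.model for b in BENCHMARKS if b.task.lower() == wanted]


if __name__ == "__main__":
    for bench in BENCHMARKS:
        print("%-8s %-22s %-18s %s" % (bench.area, bench.task, bench.model, bench.dataset))
```

Calling `models_for("object detection")` returns the two SSD models from the table.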

Each reference implementation provides the following:

  • Code that implements the model in at least one framework.
  • A Dockerfile which can be used to run the benchmark in a container. The exception is GNMT.
  • A script which downloads the appropriate dataset.
  • A script which runs and times the model inference.
  • Documentation on the dataset, model, and machine setup.

Running Benchmarks

You will need to install LoadGen, then download and install the datasets and models.

Load Generator

Please refer to the README under the /loadgen directory to install LoadGen. Also see useful presentation material here.



Please refer to README at


Please refer to README at

Test Setup

These benchmarks have been tested on the following machine configuration:

  • Tested SW configurations:

    • Python 3
    • CUDA 10.0, TensorFlow 1.14
  • Tested HW configurations:

    • Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz with 12 CPUs and one Titan XP
    • 64 GB DRAM
    • 200 GB disk (recommended)

So far, all models/scenarios except for SSD-ResNet34 have been tested under the above hardware configuration.
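
As a rough aid, a local machine can be compared against the tested configuration with a short script. This is a minimal sketch using only the Python standard library, not part of the reference implementations: the `check_machine` function and its thresholds are hypothetical, and treating the 200 GB disk recommendation as free space on the current path is an assumption. It does not check CUDA, TensorFlow, GPU, or DRAM.

```python
import os
import shutil
import sys

# Values taken from the tested configuration above.
TESTED_CPUS = 12
# Assumption: the 200 GB recommendation is interpreted as free disk space.
RECOMMENDED_DISK_GB = 200


def check_machine(path=".", cpus=None, free_disk_gb=None, python_major=None):
    """Return a list of warnings; an empty list means no differences were found.

    Arguments left as None are detected from the running machine.
    """
    if cpus is None:
        cpus = os.cpu_count()  # may itself be None if undetermined
    if free_disk_gb is None:
        free_disk_gb = shutil.disk_usage(path).free / 1e9
    if python_major is None:
        python_major = sys.version_info.major

    warnings = []
    if python_major != 3:
        warnings.append("Tested with Python 3; found Python %d" % python_major)
    if cpus is not None and cpus < TESTED_CPUS:
        warnings.append("Tested with %d CPUs; found %d" % (TESTED_CPUS, cpus))
    if free_disk_gb < RECOMMENDED_DISK_GB:
        warnings.append(
            "%d GB disk recommended; %.0f GB free" % (RECOMMENDED_DISK_GB, free_disk_gb)
        )
    return warnings


if __name__ == "__main__":
    found = check_machine()
    for line in found or ["No differences from the tested configuration detected."]:
        print(line)
```

A warning here only means the machine differs from the one the benchmarks were tested on; it does not mean the benchmarks cannot run.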


We are still in the early stages of developing MLPerf, and we are looking for areas to improve, partners, and contributors. If you have recommendations for new benchmarks, or otherwise would like to be involved in the process, please reach out to For technical bugs or support, email


Please search!forum/mlperf-inference-submitters for frequently asked questions. There is no single link or resource there, because the questions are spread across the mailing list, so please search for the question you have in mind. If you cannot find an answer, or the answer is unclear, please start a new thread by sending an email to
