
Introduction of Accuracy Calibration Tool for 8 Bit Inference

Haihao Shen edited this page May 13, 2019 · 13 revisions

8-bit (INT8) inference, also known as low-precision inference, speeds up inference with only a small accuracy loss. It offers higher throughput and lower memory requirements than FP32. Since Intel Caffe supports INT8 inference starting from the 1.1.0 release, we released an accuracy tool (Calibrator) at the same time. The tool first generates initial quantization parameters, then adjusts them to meet the accuracy requirement, and finally generates a quantized prototxt.
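To give a feel for what an "initial quantization parameter" is, here is a minimal sketch of symmetric INT8 quantization using a max-abs scale. This is illustrative only; the calibrator's actual sampling and per-layer strategy are more involved, and the function names here are hypothetical.

```python
import numpy as np

def quantize_int8(activations):
    """Sketch: compute a symmetric INT8 scale from the max absolute value.

    The idea of an initial quantization parameter: map the observed
    FP32 range onto the signed 8-bit range [-127, 127].
    """
    scale = 127.0 / np.abs(activations).max()
    q = np.clip(np.round(activations * scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an FP32 approximation of the original tensor.
    return q.astype(np.float32) / scale
```

The accuracy loss of INT8 inference comes from the rounding step above; the calibrator's job is to pick scales so that this loss stays within the requested tolerance.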


This tool is intended to run on Intel Skylake platforms, which support 8-bit inference.


1. Tool's Usage

  1. Build Caffe first; please disable the MLSL option by adding -DUSE_MLSL=0.
  2. cd /path/to/caffe/scripts
  3. Run OMP_NUM_THREADS=%CORE_NUMBER% python calibrator.py -r ../build/ -m /path/to/model_prototxt -w /path/to/weights -i iterations -n accuracy-top1 -l 0.01 -d 0. Note that you need to set OMP_NUM_THREADS to the physical core number before running.
  4. The tool generates a quantized prototxt with the "_quantized" postfix. This prototxt is used for INT8 inference.
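The steps above can be sketched as a small shell wrapper. This assumes the calibration script is `calibrator.py` in the scripts folder; the model paths below are placeholders and the command is only assembled and printed, not executed:

```shell
#!/bin/sh
# Sketch: assemble the calibration command from step 3 above.
# Assumptions: the script is calibrator.py; paths are placeholders.
CORES=$(nproc)                # prefer the *physical* core count here
MODEL=/path/to/model_prototxt
WEIGHTS=/path/to/weights
CMD="OMP_NUM_THREADS=${CORES} python calibrator.py -r ../build/ \
-m ${MODEL} -w ${WEIGHTS} -i 1000 -n accuracy-top1 -l 0.01 -d 0"
echo "$CMD"
```

Note that `nproc` reports logical cores; on a machine with hyper-threading enabled you would halve it (or read `lscpu`) to get the physical core count.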

2. Tool's Parameters

Currently, the tool supports eight parameters; the first five of them are mandatory.

  • -r/--root: the build folder path of caffe;
  • -w/--weights: the pre-trained FP32 weights of the model;
  • -m/--model: the prototxt of the model;
  • -i/--iterations: same as the test iteration definition; usually its value equals the dataset size divided by the batch size;
  • -n/--blob_name: the blob name used for the expected accuracy or detection output value. E.g., below is a typical accuracy layer description; in this case, you need to specify the top blob name accuracy-top1 as the input.

```
layer {
  name: "accuracy/top-1"
  type: "Accuracy"
  bottom: "fc1000"
  bottom: "label"
  top: "accuracy-top1"
  include {
    phase: TEST
  }
}
```
  • -l/--accuracy_loss: tolerance for accuracy loss; the default value is 0.01 (which means 1%). This is not a mandatory option.
  • -d/--detection: inference type, detection or classification; the default value is 0 (classification) while 1 stands for detection, e.g., SSD. This is not a mandatory option.
  • -fu/--first_conv_force_u8: enable the inference optimization that runs the first convolution layer in INT8. Note that a new caffemodel will be generated.
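As a rough illustration of how the -l/--accuracy_loss tolerance is applied, the acceptance test the tool performs can be thought of as follows. The helper name is hypothetical; the real calibrator's adjustment loop is more involved:

```python
def meets_tolerance(fp32_accuracy, int8_accuracy, accuracy_loss=0.01):
    """Return True if the INT8 accuracy is within the allowed relative loss.

    With the default accuracy_loss of 0.01 (1%), an FP32 top-1 of 0.75
    accepts any INT8 top-1 of at least 0.7425.
    """
    return int8_accuracy >= fp32_accuracy * (1.0 - accuracy_loss)
```

The calibrator keeps adjusting quantization parameters until a check like this passes, then writes out the "_quantized" prototxt.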