
Post-Training Quantization of SSD PyTorch Model

This example demonstrates how to use the Post-Training Quantization API from the Neural Network Compression Framework (NNCF) to quantize PyTorch models, using SSD300_VGG16 from the torchvision library as an example.

The example includes the following steps:

  • Loading the COCO128 dataset (~7 MB).
  • Loading SSD300_VGG16 from torchvision, pretrained on the full COCO dataset.
  • Patching some internal methods with the no_nncf_trace context so that NNCF traces the model graph properly.
  • Quantizing the model using the NNCF Post-Training Quantization algorithm (see the sketch after this list).
  • Reporting the following characteristics of the quantized model:
    • Accuracy drop of the quantized model (INT8) relative to the pre-trained model (FP32).
    • Compression rate of the quantized model file size relative to the pre-trained model file size.
    • Performance speed-up of the quantized model (INT8).
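For orientation, below is a minimal sketch of what the quantization call can look like with the NNCF API. It is not the code from main.py: it substitutes random tensors for the COCO128 calibration images, omits the no_nncf_trace patching mentioned above, and the weights identifier and calibration input format are assumptions that should be checked against your torchvision and NNCF versions.

```python
import nncf
import torch
import torchvision

# Load SSD300_VGG16 pretrained on COCO; the weights identifier may differ
# across torchvision versions (assumption: torchvision >= 0.13).
model = torchvision.models.detection.ssd300_vgg16(weights="COCO_V1")
model.eval()

# Stand-in calibration data: random CHW tensors instead of the COCO128 images
# that main.py actually loads.
calibration_images = [torch.rand(3, 300, 300) for _ in range(32)]

def transform_fn(image):
    # Assumption: the SSD model takes a list of CHW image tensors as input.
    return [image]

# Wrap the data source so NNCF can draw calibration samples from it.
calibration_dataset = nncf.Dataset(calibration_images, transform_fn)

# Run NNCF Post-Training Quantization (INT8 by default).
quantized_model = nncf.quantize(model, calibration_dataset)
```

main.py goes further and compares the quantized model against the FP32 baseline in terms of accuracy, file size, and inference speed, as listed above.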

Install requirements

At this point it is assumed that you have already installed NNCF. You can find information on installing NNCF here.

To work with the example, install the corresponding Python package dependencies:

pip install -r requirements.txt

Run Example

The example does not require any additional preparation; just run:

python main.py