Post-Training Quantization of PyTorch models with NNCF

This tutorial demonstrates how to use 8-bit quantization from NNCF (Neural Network Compression Framework) in post-training mode, that is, without a fine-tuning pipeline, to optimize a PyTorch model for high-speed inference with the OpenVINO Toolkit. For more advanced NNCF usage, refer to the examples in the NNCF repository.
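To make the idea of 8-bit post-training quantization concrete, here is a minimal pure-Python sketch of asymmetric (affine) quantization: a calibration pass observes the value range, derives a scale and zero-point, and values are then mapped to integers in [0, 255]. This is an illustration only and not NNCF's implementation; the function names `calibrate`, `quantize`, and `dequantize` are hypothetical, and NNCF's actual algorithms (for example, per-channel ranges and different schemes for weights and activations) are more elaborate.

```python
def calibrate(values, num_bits=8):
    """Derive (scale, zero_point) from the observed min/max of calibration data."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero range
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to unsigned integers, clamping to the representable range."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical calibration samples standing in for real activations.
calibration_samples = [-1.0, -0.25, 0.0, 0.5, 3.0]
scale, zero_point = calibrate(calibration_samples)
q = quantize(calibration_samples, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(calibration_samples, restored))
```

No training step is involved: only a small set of representative inputs is needed to estimate the ranges, which is why post-training mode does not require a fine-tuning pipeline. For values inside the calibrated range, the round-trip error in this sketch stays within half of one quantization step (`scale / 2`).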

To speed up download and validation, this tutorial uses a pre-trained ResNet-50 model with the Tiny ImageNet dataset.

Notebook contents

The tutorial consists of the following steps:

  • Evaluating the original model.
  • Transforming the original FP32 model to INT8.
  • Exporting the optimized and original models to ONNX and then to OpenVINO IR.
  • Comparing the performance of the resulting FP32 and INT8 models.
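The final comparison step can be sketched with a small timing helper. This is a generic illustration under stated assumptions, not the notebook's own benchmarking code: `measure_throughput` is a hypothetical helper, and the two stub functions are placeholders for real FP32 and INT8 inference calls.

```python
import time


def measure_throughput(infer, sample, warmup=5, iterations=50):
    """Return inferences per second for a callable, after a short warm-up."""
    for _ in range(warmup):
        infer(sample)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        infer(sample)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against division by zero
    return iterations / elapsed


# Placeholder workloads; in practice these would call the compiled models.
def fp32_stub(x):
    return sum(v * 0.5 for v in x)


def int8_stub(x):
    return sum(x) // 2


sample = list(range(1000))
fp32_fps = measure_throughput(fp32_stub, sample)
int8_fps = measure_throughput(int8_stub, sample)
speedup = int8_fps / fp32_fps
```

Measuring both models with the same input, warm-up count, and iteration count keeps the comparison fair. Accuracy should be checked alongside throughput, since the first step of the tutorial evaluates the original model as a baseline.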

Installation Instructions

This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.