Image Classification Sample

This sample demonstrates deep learning (DL) model compression for an image classification task. It covers the basic steps: model initialization, dataset preparation, the training loop over epochs, and the training and validation steps. The sample takes a configuration file that defines the training schedule, hyperparameters, and compression settings.

Features

  • Torchvision models (ResNets, VGG, Inception, etc.) and datasets (ImageNet, CIFAR 10, CIFAR 100) support
  • Custom models support
  • Configuration file examples for sparsity, quantization, filter pruning and quantization with sparsity
  • Export to ONNX that is supported by the OpenVINO™ toolkit
  • DataParallel and DistributedDataParallel modes
  • Tensorboard-compatible output

Installation

At this point it is assumed that you have already installed nncf. You can find information on downloading nncf here.

To work with the sample you should install the corresponding Python package dependencies:

pip install -r examples/torch/requirements.txt

Quantize FP32 Pretrained Model

This scenario demonstrates quantization with fine-tuning of MobileNet v2 on the ImageNet dataset.

Dataset Preparation

To prepare the ImageNet dataset, refer to the following tutorial.

Run Classification Sample

  • If you did not install the package, add the repository root folder to the PYTHONPATH environment variable.
  • Go to the examples/torch/classification folder.

Test Pretrained Model

Before compressing a model, it is highly recommended to check the accuracy of the pretrained model. All models supported in the sample have pretrained weights for ImageNet.

To load pretrained weights into a model and then evaluate its accuracy, make sure that "pretrained": true is set in the configuration file and use the following command:

python main.py \
--mode=test \
--config=configs/quantization/mobilenet_v2_imagenet_int8.json \
--data=<path_to_imagenet_dataset> \
--disable-compression 
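Before launching the test run, it can help to confirm that the config actually requests pretrained weights. The following is a minimal sketch of such a check, assuming the config is plain JSON (no comments) with a top-level "pretrained" key as described in this README. The helper name `wants_pretrained` and the demo config contents are hypothetical and not part of the sample.

```python
import json
import tempfile
from pathlib import Path


def wants_pretrained(config_path):
    """Return True if the JSON config sets a top-level "pretrained": true."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Assumption for this sketch: a missing key means "not requested".
    return config.get("pretrained", False) is True


# Hypothetical stand-in for a sample config; only the "pretrained" key
# is taken from this README, the rest is illustrative.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_config.json"
    demo_path.write_text(json.dumps({"model": "mobilenet_v2", "pretrained": True}))
    demo_result = wants_pretrained(demo_path)

print(demo_result)
```

Pointing the helper at configs/quantization/mobilenet_v2_imagenet_int8.json instead of the demo file would perform the same check on the real config, provided that file parses as plain JSON.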

Compress Pretrained Model

  • Run the following command to start compression with fine-tuning on GPUs:
    python main.py -m train --config configs/quantization/mobilenet_v2_imagenet_int8.json --data /data/imagenet/ --log-dir=../../results/quantization/mobilenet_v2_int8/
    
    It may take a few epochs to get the baseline accuracy results.
  • Use the --multiprocessing-distributed flag to run in the distributed mode.
  • Use the --resume flag with the path to a previously saved model to resume training.
  • For Torchvision-supported image classification models, set "pretrained": true inside the NNCF config JSON file supplied via --config to initialize the model to be compressed with Torchvision-supplied pretrained weights, or, alternatively:
  • Use the --weights flag with the path to a compatible PyTorch checkpoint in order to load all matching weights from the checkpoint into the model - useful if you need to start compression-aware training from a previously trained uncompressed (FP32) checkpoint instead of performing compression-aware training from scratch.
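The options listed above can be combined in one invocation. The sketch below assembles such a command line from Python. It uses only the flags named in this section. The function name `build_train_command` and the checkpoint path `fp32_checkpoint.pth` are hypothetical.

```python
import shlex


def build_train_command(config, data, log_dir=None, distributed=False,
                        resume=None, weights=None):
    """Assemble the argv list for a compression fine-tuning run of main.py."""
    cmd = ["python", "main.py", "-m", "train", "--config", config, "--data", data]
    if log_dir:
        cmd.append(f"--log-dir={log_dir}")
    if distributed:
        # Run in the distributed mode.
        cmd.append("--multiprocessing-distributed")
    if resume:
        # Continue training from a previously saved model.
        cmd += ["--resume", resume]
    if weights:
        # Load matching weights from an uncompressed (FP32) checkpoint.
        cmd += ["--weights", weights]
    return cmd


cmd = build_train_command(
    config="configs/quantization/mobilenet_v2_imagenet_int8.json",
    data="/data/imagenet/",
    log_dir="../../results/quantization/mobilenet_v2_int8/",
    distributed=True,
    weights="fp32_checkpoint.pth",  # hypothetical checkpoint path
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run from the examples/torch/classification folder, or the printed string can be pasted into a shell.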

Validate Your Model Checkpoint

To estimate the test scores of your trained model checkpoint, use the following command:

python main.py -m test --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume <path_to_trained_model_checkpoint>

WARNING: The samples use torch.load for checkpoint loading, which by default relies on pickle. Pickle is known to be vulnerable to arbitrary code execution attacks. Only load data you trust.
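The risk named in the warning can be illustrated with the standard library alone. The sketch below shows a pickle that runs code while it is being loaded, and a restricted unpickler that rejects it. It demonstrates the general pickle mechanism, not the sample's own loading code. The names `Payload`, `NoGlobalsUnpickler`, and `restricted_loads` are hypothetical.

```python
import io
import pickle


class Payload:
    """Stand-in for a hostile object hidden inside a checkpoint file."""

    def __reduce__(self):
        # pickle will call eval("1 + 1") while loading. The expression is
        # harmless here, but it could be any code the attacker chooses.
        return (eval, ("1 + 1",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so a pickle cannot name a callable."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Load a pickle made only of plain containers, numbers, and strings."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())
executed = pickle.loads(blob)  # the payload's code runs during loading

try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data needs no global lookups, so it still loads.
plain = restricted_loads(pickle.dumps({"epoch": 1, "acc": 71.35}))
```

For real checkpoints, recent PyTorch releases also accept torch.load(..., weights_only=True), which restricts unpickling in a similar spirit. Whether the sample's checkpoints load under that mode has not been verified here.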

Export Compressed Model

To export a trained model to the ONNX format, use the following command:

python main.py -m export --config=configs/quantization/mobilenet_v2_imagenet_int8.json --resume=../../results/quantization/mobilenet_v2_int8/6/checkpoints/epoch_1.pth --to-onnx=../../results/mobilenet_v2_int8.onnx

Export to OpenVINO™ Intermediate Representation (IR)

To export a model to the OpenVINO IR and run it using the Intel® Deep Learning Deployment Toolkit, refer to this tutorial.

Results for quantization

| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch checkpoint |
|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json | - |
| ResNet-50 | INT8 | ImageNet | 76.42 (-0.26) | resnet50_imagenet_int8.json | Link |
| ResNet-50 | INT8 (per-tensor only) | ImageNet | 76.37 (-0.21) | resnet50_imagenet_int8_per_tensor.json | Link |
| ResNet-50 | Mixed, 43.12% INT8 / 56.88% INT4 | ImageNet | 75.8 (-0.35) | resnet50_imagenet_mixed_int_hawq.json | Link |
| ResNet-50 | INT8 + Sparsity 61% (RB) | ImageNet | 75.43 (0.73) | resnet50_imagenet_rb_sparsity_int8.json | Link |
| ResNet-50 | INT8 + Sparsity 50% (RB) | ImageNet | 75.55 (0.61) | resnet50_imagenet_rb_sparsity50_int8.json | Link |
| Inception V3 | None | ImageNet | 77.34 | inception_v3_imagenet.json | - |
| Inception V3 | INT8 | ImageNet | 78.25 (-0.91) | inception_v3_imagenet_int8.json | Link |
| Inception V3 | INT8 + Sparsity 61% (RB) | ImageNet | 77.58 (-0.24) | inception_v3_imagenet_rb_sparsity_int8.json | Link |
| MobileNet V2 | None | ImageNet | 71.87 | mobilenet_v2_imagenet.json | - |
| MobileNet V2 | INT8 | ImageNet | 71.35 (0.58) | mobilenet_v2_imagenet_int8.json | Link |
| MobileNet V2 | INT8 (per-tensor only) | ImageNet | 71.3 (0.63) | mobilenet_v2_imagenet_int8_per_tensor.json | Link |
| MobileNet V2 | Mixed, 41.12% INT8 / 58.88% INT4 | ImageNet | 70.89 (-0.94) | mobilenet_v2_imagenet_mixed_int_hawq.json | Link |
| MobileNet V2 | INT8 + Sparsity 52% (RB) | ImageNet | 71.11 (0.82) | mobilenet_v2_imagenet_rb_sparsity_int8.json | Link |
| MobileNet V3 small | None | ImageNet | 67.67 | mobilenet_v3_small_imagenet.json | - |
| MobileNet V3 small | INT8 | ImageNet | 66.94 (0.73) | mobilenet_v3_small_imagenet_int8.json | Link |
| SqueezeNet V1.1 | None | ImageNet | 58.24 | squeezenet1_1_imagenet.json | - |
| SqueezeNet V1.1 | INT8 | ImageNet | 58.28 (-0.04) | squeezenet1_1_imagenet_int8.json | Link |
| SqueezeNet V1.1 | INT8 (per-tensor only) | ImageNet | 58.26 (-0.02) | squeezenet1_1_imagenet_int8_per_tensor.json | Link |
| SqueezeNet V1.1 | Mixed, 52.83% INT8 / 47.17% INT4 | ImageNet | 57.61 (0.63) | squeezenet1_1_imagenet_mixed_int_hawq.json | Link |
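The value in parentheses in the accuracy column can be read as the uncompressed accuracy minus the compressed accuracy, so a negative drop means the compressed model scored higher. This reading is an assumption that matches the ResNet-50 INT8 and INT8 + Sparsity 61% rows used in the sketch below; it is not a statement of how the figures were produced. The helper name `accuracy_drop` is hypothetical.

```python
def accuracy_drop(baseline, compressed):
    """Drop in percentage points; negative means the compressed model scored higher."""
    return round(baseline - compressed, 2)


resnet50_fp32 = 76.16  # ResNet-50, compression algorithm "None"

int8_drop = accuracy_drop(resnet50_fp32, 76.42)         # ResNet-50 INT8 row
sparse_int8_drop = accuracy_drop(resnet50_fp32, 75.43)  # INT8 + Sparsity 61% (RB) row

print(int8_drop, sparse_int8_drop)
```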

Binarization

As an example of NNCF convolution binarization capabilities, you can use the configs in examples/torch/classification/configs/binarization to binarize ResNet-18. Follow the same steps and command line parameters as for quantization, substituting the path to the binarization config (for best results, specify --pretrained).

Results for binarization

| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file | PyTorch Checkpoint |
|---|---|---|---|---|---|
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json | - |
| ResNet-18 | XNOR (weights), scale/threshold (activations) | ImageNet | 61.63 (8.17) | resnet18_imagenet_binarization_xnor.json | Link |
| ResNet-18 | DoReFa (weights), scale/threshold (activations) | ImageNet | 61.61 (8.19) | resnet18_imagenet_binarization_dorefa.json | Link |

Results for filter pruning

| Model | Compression algorithm | Dataset | Accuracy (Drop) % | GFLOPS | MParams | NNCF config file | PyTorch Checkpoint |
|---|---|---|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | 8.18 (100%) | 25.50 (100%) | Link | - |
| ResNet-50 | Filter pruning, 40%, geometric median criterion | ImageNet | 75.62 (0.54) | 4.58 (56.00%) | 16.06 (62.98%) | Link | Link |
| ResNet-18 | None | ImageNet | 69.8 | 3.63 (100%) | 11.68 (100%) | Link | - |
| ResNet-18 | Filter pruning, 40%, magnitude criterion | ImageNet | 69.26 (0.54) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-18 | Filter pruning, 40%, geometric median criterion | ImageNet | 69.32 (0.48) | 2.75 (75.75%) | 9.23 (79.02%) | Link | Link |
| ResNet-34 | None | ImageNet | 73.26 | 7.33 (100%) | 21.78 (100%) | Link | - |
| ResNet-34 | Filter pruning, 50%, geometric median criterion + KD | ImageNet | 73.11 (0.15) | 4.32 (58.96%) | 13.56 (62.25%) | Link | Link |
| GoogLeNet | None | ImageNet | 69.72 | 2.99 (100%) | 6.61 (100%) | Link | - |
| GoogLeNet | Filter pruning, 40%, geometric median criterion | ImageNet | 68.89 (0.83) | 1.36 (45.48%) | 3.47 (52.50%) | Link | Link |
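The percentages next to the GFLOPS and MParams values appear to give the share of the unpruned model that remains after pruning, not the amount removed. The sketch below recomputes them for the pruned ResNet-50 row under that assumption. Because the table values are rounded, the recomputed figures can differ from the printed ones by a few hundredths of a point. The helper name `remaining_percent` is hypothetical.

```python
def remaining_percent(pruned, baseline):
    """Share of the unpruned model's cost that the pruned model retains, in percent."""
    return 100.0 * pruned / baseline


# ResNet-50, filter pruning 40%, geometric median criterion
gflops_left = remaining_percent(4.58, 8.18)    # table lists 56.00%
params_left = remaining_percent(16.06, 25.50)  # table lists 62.98%

print(f"{gflops_left:.2f}% of GFLOPS, {params_left:.2f}% of parameters remain")
```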

Results for accuracy-aware compressed training

| Model | Compression algorithm | Dataset | Accuracy (Drop) % | NNCF config file |
|---|---|---|---|---|
| ResNet-50 | None | ImageNet | 76.16 | resnet50_imagenet.json |
| ResNet-50 | Filter pruning, 52.5%, geometric median criterion | ImageNet | 75.23 (0.93) | resnet50_imagenet_accuracy_aware.json |
| ResNet-18 | None | ImageNet | 69.8 | resnet18_imagenet.json |
| ResNet-18 | Filter pruning, 60%, geometric median criterion | ImageNet | 69.2 (-0.6) | resnet18_imagenet_accuracy_aware.json |