ZeroQ: A Novel Zero Shot Quantization Framework
This repository contains the PyTorch implementation for the paper ZeroQ: A Novel Zero-Shot Quantization Framework.

    # Code is based on PyTorch 1.2 (CUDA 10). Other dependencies can be installed as follows:
    pip install -r requirements.txt --user
    # Set a symbolic link to the ImageNet validation data (used only to evaluate the model)
    mkdir data
    ln -s /path/to/imagenet/ data/

The folder structure should be as follows:

    zeroq
    ├── utils
    ├── data
    │   ├── imagenet
    │   │   ├── val

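Before running the evaluation, it can help to confirm that this layout is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_dirs` is hypothetical, and it assumes the directories shown in the tree are relative to the repository root.

```python
from pathlib import Path

# Directories taken from the folder structure in this README,
# relative to the repository root.
EXPECTED_DIRS = ("utils", "data/imagenet/val")


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`.

    Hypothetical helper for illustration; it is not part of ZeroQ.
    """
    root = Path(root)
    # Path.is_dir() follows symbolic links, so a linked ImageNet folder counts.
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Folder structure looks complete.")
```

Because the check follows symbolic links, it should also work when `data/imagenet` is a link created with `ln -s`.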
Afterwards you can test Zero Shot quantization with W8A8 by running:
Below are the results that you should get for 8-bit quantization (W8A8 refers to quantizing the model to 8-bit weights and 8-bit activations).

| Models | Single Precision Top-1 | W8A8 Top-1 |
|---|---|---|
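
To make the W8A8 notation concrete, the sketch below applies 8-bit uniform quantization to a small list of values and maps them back to floats. It is a generic min/max uniform quantizer written in plain Python for illustration only: the function names are hypothetical, and the sketch is not claimed to match the exact scheme used in this repository. The same idea applies to activations, which is what the "A8" part refers to.

```python
def quantize_uniform(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1] using a min/max range."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Guard against a constant input, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_uniform(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, offset = quantize_uniform(weights, num_bits=8)
restored = dequantize_uniform(codes, scale, offset)

# The round-trip error of each value is at most half a quantization step
# (a tiny tolerance absorbs floating-point rounding).
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error <= scale / 2 + 1e-12)
```

With 8 bits there are 256 levels, so the smallest input maps to code 0 and the largest to code 255.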
- You can test a single model using the following command:

      export CUDA_VISIBLE_DEVICES=0
      python uniform_test.py [--dataset] [--model] [--batch_size] [--test_batch_size]

      optional arguments:
      --dataset                   type of dataset (default: imagenet)
      --model                     model to be quantized (default: resnet18)
      --batch-size                batch size of distilled data (default: 64)
      --test-batch-size           batch size of test data (default: 512)

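
When evaluating several models in a row, the command can be assembled programmatically. The following sketch builds the argument list for `uniform_test.py` from the options listed in this README. The helper `build_command` is hypothetical, and it assumes the underscore spellings (`--batch_size`, `--test_batch_size`) from the usage line; the option list spells them with hyphens, so the spelling should be checked against the script itself.

```python
import sys


def build_command(model="resnet18", dataset="imagenet",
                  batch_size=64, test_batch_size=512):
    """Return the argv list for one uniform_test.py run.

    Defaults mirror the ones documented in this README. The underscore
    flag spellings are an assumption taken from the usage line.
    """
    return [
        sys.executable, "uniform_test.py",
        "--dataset", dataset,
        "--model", model,
        "--batch_size", str(batch_size),
        "--test_batch_size", str(test_batch_size),
    ]


cmd = build_command(model="resnet18")
print(" ".join(cmd[1:]))
# To launch the run from the repository root, pass the list to subprocess:
#   import os, subprocess
#   subprocess.run(cmd, check=True,
#                  env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"})
```

Building the command as a list avoids shell quoting issues and makes it easy to loop over model names.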
ZeroQ was developed as part of the following paper. If you find the implementation useful for your work, please cite it:
Y. Cai, Z. Yao, Z. Dong, A. Gholami, M. W. Mahoney, K. Keutzer. ZeroQ: A Novel Zero Shot Quantization Framework, under review [PDF].