## LAPQ
This notebook demonstrates an implementation of the paper [Loss Aware Post-training Quantization](https://arxiv.org/abs/1911.07190).

### Steps to quantize the pretrained model
- Load the dataset and create dataloader. A subset of training data is used for calibration.
- Load the pretrained full precision model.
- Load the configurations from the YAML file.
- Create a `LAPQ` object and pass the full precision model, dataloaders and configurations.
- Quantize the model by calling the `compress_model` method.

In [1]:
import sys

sys.path.append("../../../")

import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

In [2]:
import yaml
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from trailmet.datasets.classification import DatasetFactory
from trailmet.models import ModelsFactory
from trailmet.algorithms import quantize



## Datasets

### Augmentations

In [3]:
stats = ((0.5071, 0.4867, 0.4408), (0.2675, 0.2565, 0.2761))

train_transform = transforms.Compose(
    [
        transforms.RandomCrop(32, padding=4, padding_mode="reflect"),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(*stats, inplace=True),
    ]
)
val_transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize(*stats)]
)
test_transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize(*stats)]
)

input_transforms = {
    "train": train_transform,
    "val": val_transform,
    "test": test_transform,
}

target_transforms = {"train": None, "val": None, "test": None}
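
For reference, `transforms.Normalize(mean, std)` applies `(x - mean) / std` independently to each channel. The sketch below reproduces that arithmetic in plain Python for a single RGB pixel, using the `stats` tuple from the cell above. `normalize_pixel` is a hypothetical helper introduced only for illustration; it is not part of torchvision.

```python
# CIFAR-100 channel statistics, copied from the cell above.
stats = ((0.5071, 0.4867, 0.4408), (0.2675, 0.2565, 0.2761))


def normalize_pixel(pixel, mean, std):
    """Apply (x - mean) / std per channel, mirroring transforms.Normalize."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


mean, std = stats
# A pixel equal to the dataset mean maps to zero in every channel.
centered = normalize_pixel(mean, mean, std)
# A pixel one standard deviation above the mean maps to roughly 1.0 per channel.
one_sigma = normalize_pixel(tuple(m + s for m, s in zip(mean, std)), mean, std)
print(centered)
print(one_sigma)
```

The same statistics are used for the train, val and test transforms, so all three splits reach the model on the same scale.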

### Load Datasets

In [4]:
cifar100_dataset = DatasetFactory.create_dataset(
    name="CIFAR100",
    root="./data",
    split_types=["train", "val", "test"],
    val_fraction=0.2,
    transform=input_transforms,
    target_transform=target_transforms,
)

# getting the size of the different splits
print("Train samples: ", cifar100_dataset["info"]["train_size"])
print("Val samples: ", cifar100_dataset["info"]["val_size"])
print("Test samples: ", cifar100_dataset["info"]["test_size"])

Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
Train samples:  40000
Val samples:  10000
Test samples:  10000


### Define Dataloaders

In [5]:
train_loader = DataLoader(
    cifar100_dataset["train"],
    batch_size=128,
    sampler=cifar100_dataset["train_sampler"],
    num_workers=2,
)
val_loader = DataLoader(
    cifar100_dataset["val"],
    batch_size=128,
    sampler=cifar100_dataset["val_sampler"],
    num_workers=2,
)
test_loader = DataLoader(
    cifar100_dataset["test"],
    batch_size=128,
    sampler=cifar100_dataset["test_sampler"],
    num_workers=2,
)

dataloaders = {"train": train_loader, "val": val_loader, "test": test_loader}

print("No. of training batches: ", len(dataloaders["train"]))
print("No. of validation batches: ", len(dataloaders["val"]))
print("No. of test batches: ", len(dataloaders["test"]))

No. of training batches:  313
No. of validation batches:  79
No. of test batches:  79
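
The batch counts printed above follow from the split sizes and `batch_size=128`. With `drop_last=False` (the `DataLoader` default), the final partial batch is kept, so each count is the ceiling of `samples / batch_size`. A quick check of that arithmetic, using the split sizes reported earlier:

```python
import math

batch_size = 128
split_sizes = {"train": 40000, "val": 10000, "test": 10000}

# With drop_last=False the final, smaller batch still counts as a batch.
num_batches = {name: math.ceil(n / batch_size) for name, n in split_sizes.items()}
print(num_batches)
```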


### Load Pretrained Model

In [6]:
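# NOTE: pretrained=False builds the network without loading pretrained weights,
# which is consistent with the chance-level (~1% top-1) accuracy reported below.
res50_model = ModelsFactory.create_model(
    name="resnet50", num_classes=100, pretrained=False, insize=32
)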

### Load Method Config

In [7]:
with open("./lapq_config.yaml", "r") as f:
    config = yaml.safe_load(f)
    kwargs = config["GENERAL"]

kwargs

{'GPU_ID': 0,
 'SEED': 42,
 'W_BITS': 4,
 'A_BITS': 8,
 'ACT_QUANT': True,
 'CALIB_BATCHES': 4,
 'MAX_ITER': 1000,
 'MAX_FEV': 1000,
 'VERBOSE': True}
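
`W_BITS` and `A_BITS` set the bit widths for weights and activations, so this configuration (W4A8) allows at most 2^4 = 16 distinct values per weight and 2^8 = 256 per activation. To make that concrete, the sketch below implements a generic symmetric uniform quantizer with a clipping threshold. It is a simplified illustration and not trailmet's implementation; `quantize_uniform` and its `clip` parameter are hypothetical names.

```python
def quantize_uniform(values, num_bits, clip):
    """Symmetric uniform quantization of `values` onto a grid spanning [-clip, clip].

    Hypothetical helper for illustration; not taken from trailmet.
    """
    # Symmetric signed grid: e.g. 4 bits -> integers in [-7, 7] (one code unused).
    q_max = 2 ** (num_bits - 1) - 1
    scale = clip / q_max  # step size between adjacent levels
    out = []
    for v in values:
        q = round(v / scale)           # nearest grid index
        q = max(-q_max, min(q_max, q))  # saturate values outside the clip range
        out.append(q * scale)          # map back to real values
    return out


weights = [-1.3, -0.42, 0.0, 0.07, 0.55, 2.0]
w4 = quantize_uniform(weights, num_bits=4, clip=1.0)
w8 = quantize_uniform(weights, num_bits=8, clip=1.0)
print("4-bit:", w4)
print("8-bit:", w8)
```

Values outside `[-clip, clip]` saturate at the boundary, while values inside are rounded to the nearest level. A smaller `clip` gives finer steps at the cost of more saturation, which is the trade-off a clipping-based quantizer has to balance.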

### Quantization Method: LAPQ

In [9]:
quantizer = quantize.lapq.LAPQ(res50_model, dataloaders, **kwargs)

print("testing pretrained model before quantization")
_, acc1, acc5 = quantizer.test(
    model=res50_model,
    dataloader=dataloaders["test"],
    loss_fn=torch.nn.CrossEntropyLoss(),
)
print(f"top-1 acc: {acc1:.2f}%, top-5 acc: {acc5:.2f}%")

qmodel = quantizer.compress_model()

==> Using seed: 42 and device: cuda:0


wandb: Currently logged in as: animesh-007. Use `wandb login --relogin` to force relogin


testing pretrained model before quantization


Validating network (79 / 79 Steps) (batch time=0.01544s) (loss=9.17796) (top1=0.00000) (top5=0.00000): 100%|| 79/79 [00:04<00:00, 17.72it/s] 


 * acc@1 1.040 acc@5 5.190
top-1 acc: 1.04%, top-5 acc: 5.19%


Validating network (79 / 79 Steps) (batch time=0.04852s) (loss=11.44887) (top1=0.00000) (top5=0.00000): 100%|| 79/79 [00:04<00:00, 17.18it/s] 


 * acc@1 1.010 acc@5 5.040
==> Quantization (W4A8) accuracy before LAPQ: 1.0100 | 5.0400


100%|██████████| 10/10 [00:07<00:00,  1.30it/s, loss=9.16, p_val=4]  


==> using p intr : 4.09


Validating network (79 / 79 Steps) (batch time=0.04860s) (loss=9.68071) (top1=0.00000) (top5=0.00000): 100%|| 79/79 [00:04<00:00, 16.46it/s] 


 * acc@1 0.960 acc@5 5.070
==> Quantization (W4A8) accuracy before Optimization: 0.9600 | 5.0700
==> Loss after LpNormQuantization: 9.4259
==> Starting Powell Optimization


100%|██████████| 1000/1000 [03:28<00:00,  4.80it/s, curr_loss=4.62, min_loss=4.61]


==> Layer-wise Scales :
 [-2.71854830e+00 -3.08051988e-01  9.95627994e-01  3.81617464e+00
 -9.40671566e-01 -3.61076604e+00 -1.81663796e+00  1.21657309e+00
  1.91811575e+00  2.52682898e+00 -1.42890691e+00 -1.70284810e+00
  3.04076019e+00 -1.71109412e+00 -4.41954680e-01 -2.75285367e-01
  4.77568090e+00 -8.70588564e-01 -1.46963710e+01 -6.32294023e-01
 -1.39139490e+00  4.28594499e+00 -1.21911043e+02 -1.34789206e+00
  1.03977837e+00  1.19332061e+01 -1.46906333e+01 -4.85501959e-01
  1.23643283e+00  1.34744476e+01 -7.74482854e-01 -1.09169621e+00
  3.03794852e-01  1.49893110e+01 -1.44991452e+00 -6.44897627e+00
 -9.93479876e-01  1.35904461e-01  2.40335316e+01  2.86178112e-01
  9.66371353e-02  1.44008197e-01  2.28464613e+01  2.84444329e-01
  9.55632382e-02  1.41487494e-01  3.59558318e+01  2.82226657e-01
  9.38017642e-02  1.43612936e-01  5.25450935e+01  2.79624492e-01
  9.47439224e-02  1.43324569e-01  1.15084450e+02  2.01696590e-01
  6.74792528e-02  1.00485601e-01  9.49488449e+01  1.02076188e-01


Validating network (79 / 79 Steps) (batch time=0.04774s) (loss=4.62941) (top1=0.00000) (top5=0.00000): 100%|| 79/79 [00:04<00:00, 16.54it/s]


 * acc@1 1.090 acc@5 4.920
==> Full quantization (W4A8) accuracy: 1.0899999141693115
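
The log above shows two stages: an initial choice of clipping values by Lp-norm minimization (here with p ≈ 4.09), followed by Powell optimization of the layer-wise scales. As a rough illustration of the first stage only, the sketch below grid-searches a clipping threshold that minimizes the Lp norm of the quantization error on a toy tensor. It is not trailmet's code: the function names, the grid search and the Gaussian toy data are all assumptions made for the example.

```python
import random


def quantize(values, num_bits, clip):
    """Symmetric uniform quantizer (hypothetical helper, not trailmet's)."""
    q_max = 2 ** (num_bits - 1) - 1
    scale = clip / q_max
    return [max(-q_max, min(q_max, round(v / scale))) * scale for v in values]


def lp_error(values, num_bits, clip, p):
    """Lp norm of the quantization error for a given clipping threshold."""
    quantized = quantize(values, num_bits, clip)
    return sum(abs(v - q) ** p for v, q in zip(values, quantized)) ** (1.0 / p)


def best_clip(values, num_bits, p, num_candidates=100):
    """Grid-search the clip value in (0, max|v|] that minimizes the Lp error."""
    max_abs = max(abs(v) for v in values)
    candidates = [max_abs * (i + 1) / num_candidates for i in range(num_candidates)]
    return min(candidates, key=lambda c: lp_error(values, num_bits, c, p))


random.seed(42)
weights = [random.gauss(0.0, 1.0) for _ in range(2000)]  # toy stand-in for a layer
max_abs = max(abs(w) for w in weights)

clip_p2 = best_clip(weights, num_bits=4, p=2.0)
clip_p4 = best_clip(weights, num_bits=4, p=4.0)
print(f"max |w| = {max_abs:.3f}")
print(f"best clip for p=2: {clip_p2:.3f}, for p=4: {clip_p4:.3f}")
```

A larger `p` penalizes large clipping errors more heavily, which tends to favor a wider clipping range. The second stage in the log then refines all layer scales jointly against the task loss, which this sketch does not attempt.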


In [10]:
print("testing quantized model")
_, acc1, acc5 = quantizer.test(
    model=qmodel, dataloader=dataloaders["test"], loss_fn=torch.nn.CrossEntropyLoss()
)
print(f"top-1 acc: {acc1:.2f}%, top-5 acc: {acc5:.2f}%")

testing quantized model


Validating network (79 / 79 Steps) (batch time=0.05487s) (loss=4.62941) (top1=0.00000) (top5=0.00000): 100%|| 79/79 [00:04<00:00, 18.34it/s]

 * acc@1 1.090 acc@5 4.920
top-1 acc: 1.09%, top-5 acc: 4.92%



