# OpenVINO

In this notebook, we show how to use the OpenVINO toolkit to deploy a deep learning model on edge devices and how to quantize the model to reduce its size and inference latency. We train a simple CNN on the MNIST dataset, convert it to the OpenVINO IR format, and quantize it to INT8 precision. We then compare the size and accuracy of the quantized model with those of the original FP32 model.

## Set Up OpenVINO

First, install OpenVINO, NNCF, and PyTorch.

In [1]:
%pip install -q "openvino>=2023.1.0" torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
%pip install -q "nncf>=2.6.0"


In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import pathlib
import numpy as np
import openvino as ov
import nncf

INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, openvino


## Train the Model

Next, define a simple CNN and train it on the MNIST dataset.

In [3]:
transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
        ])

train_dataset = datasets.MNIST('./data', train=True, download=True,transform=transform)
test_dataset = datasets.MNIST('./data', train=False,transform=transform)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=12, kernel_size=3)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc = nn.Linear(12 * 13 * 13, 10)

    def forward(self, x):
        x = x.view(-1, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        output = F.log_softmax(x, dim=1)
        return output


train_loader = torch.utils.data.DataLoader(train_dataset, 32)
test_loader = torch.utils.data.DataLoader(test_dataset, 32)

device = "cpu"

epochs = 1

model = Net().to(device)
optimizer = optim.Adam(model.parameters())

model.train()

for epoch in range(1, epochs+1):
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
            epoch, batch_idx * len(data), len(train_loader.dataset),
            100. * batch_idx / len(train_loader), loss.item()))

MODEL_DIR = pathlib.Path("./models")
MODEL_DIR.mkdir(exist_ok=True)
torch.save(model.state_dict(), MODEL_DIR / "original_model.p")

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9.91M/9.91M [00:00<00:00, 15.9MB/s]


Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28.9k/28.9k [00:00<00:00, 499kB/s]


Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1.65M/1.65M [00:00<00:00, 4.35MB/s]


Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4.54k/4.54k [00:00<00:00, 3.51MB/s]


Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw
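
The input size of the linear layer in `Net`, `12 * 13 * 13`, follows from the shapes produced by `conv1` and `pool`. As a quick check, the sketch below recomputes it with the standard output-size formula for convolution and pooling windows. The helper name `conv2d_out` is introduced here for illustration only and is not part of PyTorch.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel_size) // stride + 1


side = 28                                          # MNIST images are 28 x 28
side = conv2d_out(side, kernel_size=3)             # conv1: 3x3 kernel, stride 1, no padding
side = conv2d_out(side, kernel_size=2, stride=2)   # pool: MaxPool2d(kernel_size=2, stride=2)

out_channels = 12                                  # out_channels of conv1
flattened = out_channels * side * side             # features fed into self.fc

print(side, flattened)  # 13 2028
```

If the kernel size or the number of channels in `conv1` changes, the first argument of `nn.Linear` has to change accordingly.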



## Convert to OpenVINO IR

Then convert the trained model to the OpenVINO IR format.

In [4]:
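# Trace the PyTorch model with a sample batch and convert it to an OpenVINO model
example_input = next(iter(test_loader))[0]
ov_model = ov.convert_model(model, example_input=example_input)
# Writes openvino_ir.xml (topology) and openvino_ir.bin (weights)
ov.save_model(ov_model, MODEL_DIR / "openvino_ir.xml")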

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'


## Quantization

To quantize the model with NNCF, first define a transform function that converts torch tensors to NumPy arrays, then pass it together with the PyTorch data loader to NNCF's `Dataset` class to build a calibration dataset. Next, quantize the model with NNCF's `quantize` function. Finally, compile the quantized model, run a sample inference, and save the model in the OpenVINO IR format.

In [5]:
def transform_fn(data_item):
    images, _ = data_item
    return images.numpy()

calibration_dataset = nncf.Dataset(train_loader, transform_fn)
quantized_model = nncf.quantize(ov_model, calibration_dataset)
model_int8 = ov.compile_model(quantized_model)
input_fp32 = next(iter(test_loader))[0][0:1]
res = model_int8(input_fp32)
ov.save_model(quantized_model, MODEL_DIR / "quant_openvino_ir.xml")

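
To give an intuition for what INT8 precision means, here is a deliberately simplified sketch of symmetric per-tensor quantization in plain Python. It is not NNCF's algorithm, which uses the calibration dataset to choose quantization parameters and is considerably more involved. The function names and the example weights are made up for illustration.

```python
def quantize_symmetric_int8(values):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [-0.5, -0.1, 0.0, 0.25, 1.0]   # illustrative values, not taken from the model
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the per-value error by half a quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step.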

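## Check Size

Compare the on-disk size of the original and INT8 models. Note that `ov.save_model` compresses weights to FP16 by default, so the `.bin` file of the original IR is already about half the size of the FP32 PyTorch checkpoint.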

In [6]:
%ls -lh {MODEL_DIR}

total 176K
-rw-r--r-- 1 root root 40K Jan  3 15:22 openvino_ir.bin
-rw-r--r-- 1 root root 11K Jan  3 15:22 openvino_ir.xml
-rw-r--r-- 1 root root 82K Jan  3 15:22 original_model.p
-rw-r--r-- 1 root root 21K Jan  3 15:22 quant_openvino_ir.bin
-rw-r--r-- 1 root root 16K Jan  3 15:22 quant_openvino_ir.xml
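
In the listing above, the weights file shrinks from 40K to 21K, while the `.xml` file grows from 11K to 16K, most likely because quantization adds operations to the graph. To get the same comparison programmatically, a small helper along the following lines could be used. The function names `ir_size_bytes` and `compare_ir_sizes` are hypothetical, and the sketch assumes each `.xml` file has a `.bin` file with the same stem next to it.

```python
import pathlib


def ir_size_bytes(xml_path):
    """Total on-disk size of an IR model: the .xml topology plus the .bin weights."""
    xml_path = pathlib.Path(xml_path)
    bin_path = xml_path.with_suffix(".bin")
    return xml_path.stat().st_size + bin_path.stat().st_size


def compare_ir_sizes(original_xml, quantized_xml):
    """Return (original bytes, quantized bytes, original / quantized ratio)."""
    original = ir_size_bytes(original_xml)
    quantized = ir_size_bytes(quantized_xml)
    return original, quantized, original / quantized
```

In this notebook it would be called as `compare_ir_sizes(MODEL_DIR / "openvino_ir.xml", MODEL_DIR / "quant_openvino_ir.xml")`.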


## Check Accuracy

Evaluate the accuracy of the INT8 model and compare it with that of the FP32 model.

In [7]:
def test_ov(model, data_loader):
    compiled_model = ov.compile_model(model)
    test_loss = 0
    correct = 0
    for data, target in data_loader:
        output = torch.tensor(compiled_model(data)[0])
        test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
        pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
        correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(data_loader.dataset)

    return 100. * correct / len(data_loader.dataset)

acc = test_ov(ov_model, test_loader)
print(f"Accuracy of original model: {acc}")

qacc = test_ov(quantized_model, test_loader)
print(f"Accuracy of quantized model: {qacc}")

Accuracy of original model: 96.55
Accuracy of quantized model: 96.65
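
The notebook compares size and accuracy but does not measure inference latency. A rough way to do so is to time repeated calls on a single input, as in the sketch below. The helper `median_latency_ms` and its `warmup` and `runs` parameters are assumptions made for this illustration, and the numbers it produces depend heavily on the hardware.

```python
import statistics
import time


def median_latency_ms(infer, sample, warmup=10, runs=100):
    """Median wall-clock time of infer(sample) in milliseconds."""
    # Warm-up calls are excluded from timing so one-off setup costs do not skew the result
    for _ in range(warmup):
        infer(sample)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

With the objects defined earlier, the two models could be compared with `median_latency_ms(ov.compile_model(ov_model), input_fp32)` and `median_latency_ms(model_int8, input_fp32)`. For a model this small, any difference may be within measurement noise.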
