
RAYKALI/simple-int8-pytorch-implement


A simple test of NVIDIA's INT8 quantization, implemented in fp32 (not real int8) using PyTorch. This experiment is devoted to the quantization principle of int8, but uses fp32 to implement the process; a real int8 implementation requires cuDNN or cuBLAS kernels based on DP4A. The results are credible because int32 and float32 accumulation have similar accuracy.
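The fp32 simulation works by "fake quantization": rounding a tensor onto the int8 grid and immediately dequantizing it, so all subsequent fp32 math sees exactly int8 precision. A minimal sketch (the names and the symmetric per-tensor scheme are assumptions for illustration, not this repo's code):

```python
import torch

def fake_quantize(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Simulate int8 quantization in fp32: map x onto the 255-level
    symmetric int8 grid and back, so later fp32 math sees int8 precision."""
    q = torch.clamp(torch.round(x * scale), -127, 127)  # "int8" values, stored as fp32
    return q / scale                                    # dequantize back to fp32

# Example: quantize a conv weight with a per-tensor scale derived from its max.
w = torch.randn(16, 3, 3, 3)
scale = 127.0 / w.abs().max()
w_q = fake_quantize(w, scale)
print((w - w_q).abs().max())  # quantization error, bounded by ~0.5 / scale
```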

Create the ./dataset/test, ./dataset/train, and ./dataset/q folders first, as shown below.
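For example, a one-off setup snippet (not part of the repo):

```python
import os

# Create the train/test folders and the calibration folder ./dataset/q.
for d in ("./dataset/test", "./dataset/train", "./dataset/q"):
    os.makedirs(d, exist_ok=True)
```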

dataset.py------converts the PyTorch CIFAR-10 dataset into JPG files and provides a PyTorch dataloader

vgg16.py------VGG16-mini model

train_fp32.py------simple implementation of training CIFAR-10 with VGG-mini in fp32

net_inference.py------uses the quantization table to do int8 (actually fp32-simulated) inference on VGG-mini; see the sketch after this file list

int8_weight_save.py------converts fp32 conv weights to int8 and saves them to reduce storage

int8_quantize.py------generates the int8 calibration table (vgg_cifar10.table/log); see the sketch after this file list

int8_test.py------runs int8 inference (via net_inference.py) and compares int8 accuracy with fp32 accuracy

vgg_cifar10.table/log------generated by int8_quantize.py
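To make the pipeline concrete, here is a minimal sketch of activation calibration plus fp32-simulated int8 convolution. It is an illustration, not this repo's code: it uses simple max calibration, whereas NVIDIA's TensorRT calibrator picks activation thresholds by minimizing KL divergence, and all function names here are hypothetical.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def calibrate_activation_scale(model_upto_layer, calib_loader):
    """Collect the max |activation| seen over the calibration set and
    derive a symmetric int8 scale. (Max calibration for brevity; the
    real NVIDIA table is built with KL-divergence threshold search.)"""
    amax = 0.0
    for images, _ in calib_loader:
        act = model_upto_layer(images)      # fp32 activations feeding the layer
        amax = max(amax, act.abs().max().item())
    return 127.0 / amax                     # activation scale s_a

@torch.no_grad()
def int8_conv_simulated(x, weight, s_a, s_w, stride=1, padding=1):
    """Simulated int8 conv: quantize input and weight to integer-valued
    fp32 tensors, convolve (standing in for the DP4A int8*int8->int32
    kernel), then dequantize with the combined scale."""
    x_q = torch.clamp(torch.round(x * s_a), -127, 127)
    w_q = torch.clamp(torch.round(weight * s_w), -127, 127)
    acc = F.conv2d(x_q, w_q, stride=stride, padding=padding)  # "int32" accumulator in fp32
    return acc / (s_a * s_w)                # back to real-valued fp32 output

# Weight scale, e.g. per-tensor: s_w = 127.0 / weight.abs().max()
```

In this scheme, the calibration table amounts to one such scale (or threshold) per layer for activations, plus per-layer or per-channel weight scales.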

The calibration set ./dataset/q is built by selecting the top N pictures of each category from the test set (see the sketch after the results for one way to build it).

result (N = 100 pictures per category):

before int8: 79.66 acc

after int8: 79.68 acc

result (N = 50 pictures per category):

before int8: 79.66 acc

after int8: 79.63 acc
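One possible way to build such a calibration subset (a hypothetical helper, not the repo's dataset.py; it assumes the JPG layout dataset/test/<class>/<name>.jpg):

```python
import os
import shutil

def build_calib_set(test_dir="./dataset/test", q_dir="./dataset/q", per_class=100):
    """Copy the first `per_class` images of every class folder from the
    test set into the calibration folder ./dataset/q."""
    for cls in sorted(os.listdir(test_dir)):
        src = os.path.join(test_dir, cls)
        dst = os.path.join(q_dir, cls)
        os.makedirs(dst, exist_ok=True)
        for name in sorted(os.listdir(src))[:per_class]:
            shutil.copy(os.path.join(src, name), os.path.join(dst, name))
```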

About

int8 calibration implementation and inference (to help you understand int8 quantization)
