
MQBench results are not bit-exact with SNPE DSP results #109

Closed

changewOw opened this issue Jun 8, 2022 · 7 comments

Labels: good first issue (Good for newcomers), Stale

Comments

changewOw commented Jun 8, 2022

MQBench is a very interesting project.

Environment
pytorch: 1.8.1
MQBench: branch main, e217520
SNPE: snpe-1.61.0.3358

Problem:
I ran a simple test with a model that has only two conv layers, comparing the result of the MQBench-quantized model with the SNPE DSP result, and found they are not bit-exact. Is this expected, or have I done something wrong?

Reproduction

  • MQBench quantization
import os
import random

import numpy as np
import torch
import torch.nn as nn

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization
from mqbench.convert_deploy import convert_deploy


def seed_torchv2(seed: int = 42) -> None:
    # seed every RNG in play so the run is reproducible
    np.random.seed(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv2d(3, 128, 1, 1, bias=True)
        self.conv2 = nn.Conv2d(128, 20, 1, 1, bias=True)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten(1)

    def forward(self, x):  # input: (1, 3, 20, 20)
        x = self.avg_pool(x)
        x = self.conv(x)
        x = self.conv2(x)
        x = self.flat(x)
        return x

    
SIZE = 20
backend = BackendType.SNPE

np.set_printoptions(suppress=True, precision=6)
torch.set_printoptions(6)
seed_torchv2(42)


def gen_input_data(length=100):
    data = []
    for _ in range(length):
        data.append(np.ones((1,3,SIZE,SIZE), dtype=np.float32) * 0.1 * np.random.randint(0, 10))
    return np.stack(data, axis=0)


model = Net()          # the simple two-conv test model defined above
model.eval()

train_data = gen_input_data(100)
dummy_input = np.zeros((1,3,SIZE,SIZE), dtype=np.float32) + 0.5


print("pytorch fp32 result")
print(model(torch.from_numpy(dummy_input.copy())).float())


# quant
model = prepare_by_platform(model, backend)

enable_calibration(model)

for i, d in enumerate(train_data):
    _ = model(torch.from_numpy(d).float())

enable_quantization(model)


print("quant sim result")
print(model(torch.from_numpy(dummy_input.copy())).float())


input_shape = {"image":[1,3,SIZE,SIZE]}
convert_deploy(model, backend, input_shape)

# save dummy input and test it on DSP
image = dummy_input.copy()
assert image.shape == (1,3,SIZE,SIZE)
assert image.dtype == np.float32
image.tofile("./tmp.raw")
print("#" * 50)
pytorch fp32 result
tensor([[-0.347889, -0.289117, -0.083191, -0.222827,  0.124699,  0.235278,
          0.434433, -0.302174, -0.047763,  0.229472, -0.037784,  0.082496,
         -0.150852, -0.170281,  0.130777,  0.146441, -0.494992, -0.182881,
          0.600709, -0.063706]], grad_fn=<ViewBackward>)

quant sim result
tensor([[-0.344930, -0.290467, -0.081694, -0.222389,  0.131618,  0.231466,
          0.435701, -0.299544, -0.049924,  0.226927, -0.036308,  0.081694,
         -0.149772, -0.172465,  0.131618,  0.149772, -0.494702, -0.181542,
          0.599088, -0.063540]], grad_fn=<ViewBackward>)
  • DLC conversion
    ./snpe-onnx-to-dlc --input_network mqbench_qmodel_deploy_model.onnx --output_path tmp.dlc --quantization_overrides mqbench_qmodel_clip_ranges.json
    ./snpe-dlc-quantize --input_dlc tmp.dlc --input_list tmp_file.txt --output_dlc tmp_quat_mq.dlc --override_params --bias_bitwidth 32
    tmp_file.txt and tmp_file_android.txt each contain a single entry, tmp.raw; tmp.raw is the 3x20x20 float file saved by the Python script above (see the sketch after this list)

  • SNPE DSP run
    ./snpe-net-run --container /sdcard/tmp_quat_mq.dlc --input_list /sdcard/tmp_file_android.txt --use_dsp
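
For reference, a plausible layout for the two input lists (the device-side path is an assumption, based on where the dlc is pushed above): tmp_file.txt, used on the host by snpe-dlc-quantize, would contain the single line

    tmp.raw

and tmp_file_android.txt, used on the device by snpe-net-run, the single line

    /sdcard/tmp.raw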

##################################################
74.raw
(20,)
[-0.34493 -0.285929 -0.081694 -0.222389 0.127079 0.236005 0.435701
-0.299544 -0.049924 0.226927 -0.036308 0.081694 -0.149772 -0.172465
0.131618 0.149772 -0.490163 -0.177003 0.599088 -0.068078]

Comparing the quant sim result with the DSP result, you can see the two disagree at several positions (the 2nd, 5th, 6th, 17th, 18th, and 20th values).

Tracin added the good first issue (Good for newcomers) label on Jun 8, 2022
Tracin (Contributor) commented Jun 8, 2022

Your pipeline is fine. In practice it is almost impossible to bit-align backend hardware from within PyTorch; there are too many unknown details of the backend's arithmetic.
MQBench aims to match the backend computation as closely as possible in terms of quantization scheme and quantization node placement.
We usually measure the discrepancy between the two with cosine similarity; 0.99+ can be taken as a guarantee that accuracy survives deployment.
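
A minimal sketch of that cosine check, assuming the DSP output is read back from the raw file written by snpe-net-run (the output path below is hypothetical; it depends on your run directory):

import numpy as np

def cosine(a, b):
    # flatten and compute in float64 to avoid precision noise in the metric
    a, b = a.ravel().astype(np.float64), b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# quantsim output from the prepared model, DSP output pulled back from the device
quantsim = model(torch.from_numpy(dummy_input.copy())).detach().numpy()
dsp = np.fromfile("output/Result_0/74.raw", dtype=np.float32)  # hypothetical path
print(cosine(quantsim, dsp))  # 0.99+ is considered acceptable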

changewOw (Author)

Thanks!

changewOw (Author)

@Tracin I ran a further test with mobilenetv3-small (num_classes=2). For some samples the quantsim result is [-0.6173174 -0.05346843]
while the DSP result is [-0.792305 -0.490937]; their cosine similarity is 0.8923229.

Could you suggest what can be improved to bring the quantsim and DSP results closer together? Or is mobilenetv3 simply not well suited to quantization?
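
For reference, the reported cosine can be reproduced directly from those two vectors:

import numpy as np

a = np.array([-0.6173174, -0.05346843])  # quantsim output
b = np.array([-0.792305, -0.490937])     # DSP output
print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # ≈ 0.8923229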

Tracin reopened this on Jun 9, 2022
Tracin (Contributor) commented Jun 9, 2022

You can start by checking whether the quantization parameters are correct:

  • Inspect the ONNX model containing the quantization nodes and verify that the nodes are inserted in the right places (a sketch of this check follows the list)
  • snpe-dlc-quantize prints the overridden tensor clip values; check whether anything is missing from snpe_encodings
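
A minimal sketch of the first check, assuming the exported ONNX uses standard quantization ops (the op-name filter below is an assumption; adjust it to whatever ops the MQBench export actually emits):

import onnx

onnx_model = onnx.load("mqbench_qmodel_deploy_model.onnx")
for node in onnx_model.graph.node:
    # "Quant" matches QuantizeLinear/DequantizeLinear and similar ops
    if "Quant" in node.op_type:
        print(node.op_type, list(node.input), "->", list(node.output))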

changewOw (Author) commented Jun 9, 2022

My full model is a UNet-style network with three outputs:
encoder: mobilenetv3-small
decoder: upNearest2d -> upNearest2d -> upNearest2d

    def forward(self, x):
        feature_8x, feature_16x, feature_32x = self.model(x)

        logits_cls = self.head_cls(feature_32x)

        accu_radius, heatmaps_uv = self.decode_model(feature_8x, feature_16x, feature_32x)

        return logits_cls, accu_radius, heatmaps_uv

1. I checked the ONNX model with the quantization nodes; it looks fine.
2. This part looks a bit off:

        "1572": [
            {
                "bitwidth": 8,
                "min": -17.530067443847656,
                "max": 12.27104663848877
            }
        ],
        "1583": [
            {
                "bitwidth": 8,
                "min": -0.3967282474040985,
                "max": 12.248984336853027
            }
        ],
        "1591": [
            {
                "bitwidth": 8,
                "min": 0.0,
                "max": 5.7605085372924805
            }
        ],

The output of snpe-dlc-quantize is:

[INFO] InitializeStderr: DebugLog initialized.
[INFO] Writing intermediate model
[INFO] Setting activation for layer: image and buffer: image
[INFO] bw: 8, min: 0.000000, max: 1.000000, delta: 0.003922, offset: 0.000000
[INFO] Setting activation for layer: Conv_6 and buffer: 1572
[INFO] bw: 8, min: -17.530067, max: 12.271047, delta: 0.116867, offset: -150.000000
[INFO] Setting activation for layer: Add_11_Hswish and buffer: 1583
[INFO] bw: 8, min: -0.359582, max: 17.979096, delta: 0.071916, offset: -5.000000
[INFO] Setting activation for layer: Conv_24 and buffer: 1590
[INFO] bw: 8, min: -18.765767, max: 4.577016, delta: 0.091540, offset: -205.000000
[INFO] Setting activation for layer: Relu_25 and buffer: 1591
[INFO] bw: 8, min: 0.000000, max: 5.760509, delta: 0.022590, offset: 0.000000
[INFO] Setting activation for layer: GlobalAveragePool_29 and buffer: 1595
[INFO] bw: 8, min: 0.000000, max: 2.837885, delta: 0.011129, offset: 0.000000

I noticed a few issues:
a) For buffer 1583, the json gives -0.39 and 12.24, while the dlc reports -0.35 and 17.97.
b) For buffer 1583, the json provides no delta and offset; does the dlc compute these by itself?
c) Buffer 1590 does not appear in the json file at all.
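
A quick way to spot such gaps, assuming the clip-ranges json keys activation encodings by buffer name as in the excerpt above (the buffer set below is copied from the dlc log):

import json

with open("mqbench_qmodel_clip_ranges.json") as f:
    encodings = json.load(f)
# assumption: activation entries sit either at the top level or
# under an "activation_encodings" key, indexed by buffer name
acts = encodings.get("activation_encodings", encodings)
dlc_buffers = {"image", "1572", "1583", "1590", "1591", "1595"}  # from the log above
print("in dlc but missing from json:", sorted(dlc_buffers - set(acts)))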

I have put the onnx, the json, and the dlc log in the zip file below; I would appreciate it if you could take a look. Thanks!

detnet_center_unet.zip

Tracin (Contributor) commented Jun 10, 2022

Things to try:

  • Why was 1583 not overridden? From the log it looks like its range was computed from the input_list data instead
  • Why is 1590 missing from the json? That means MQBench did not insert a quantization node at that position
  • Are there any extra entries in the json?
  • num_classes is a bit small; consider increasing it and averaging the metric over more samples

github-actions bot commented Oct 9, 2022

This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!
